DPO training fails with AttributeError: 'CustomDPOTrainer' object has no attribute '_peft_has_been_casted_to_bf16' #2164
Closed
Labels: solved
Reminder
Reproduction
Running tokenizer on dataset: 100%|██████████| 20000/20000 [00:10<00:00, 1970.54 examples/s]
/root/.local/lib/python3.10/site-packages/transformers/training_args.py:1751: FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.
  warnings.warn(
[WARNING|trainer.py:1520] 2024-01-12 00:07:22,322 >> No valid checkpoint found in output directory (/mnt/workspace/TuningFactory/Outputs/20240111/test_dpo_best_20240111_235439), training from scratch.
0%| | 0/936 [00:00<?, ?it/s]
[WARNING|logging.py:314] 2024-01-12 00:07:32,879 >> You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Traceback (most recent call last):
  File "/checkpoint/binary/train_package/src/train_bash.py", line 24, in <module>
    main()
  File "/checkpoint/binary/train_package/src/train_bash.py", line 7, in main
    run_exp()
  File "/checkpoint/binary/train_package/src/llmtuner/train/tuner.py", line 36, in run_exp
    run_dpo(model_args, data_args, training_args, finetuning_args, callbacks)
  File "/checkpoint/binary/train_package/src/llmtuner/train/dpo/workflow.py", line 64, in run_dpo
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/root/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1561, in train
    return inner_training_loop(
  File "/root/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1878, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/root/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2773, in training_step
    loss = self.compute_loss(model, inputs)
  File "/root/.local/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1052, in compute_loss
    compute_loss_context_manager = torch.cuda.amp.autocast if self._peft_has_been_casted_to_bf16 else nullcontext
AttributeError: 'CustomDPOTrainer' object has no attribute '_peft_has_been_casted_to_bf16'
0%| | 0/936 [00:00<?, ?it/s]
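The traceback shows `compute_loss` in `trl/trainer/dpo_trainer.py` reading `self._peft_has_been_casted_to_bf16`, which the `CustomDPOTrainer` instance never received. One plausible cause, not confirmed in this thread, is that the subclass defines its own `__init__` without running the parent initializer that sets the flag, so an installed `trl` version that reads the flag breaks it. The sketch below reproduces that failure mode with hypothetical stand-in classes (`UpstreamTrainer`, `BrokenCustomTrainer`, `PatchedCustomTrainer`); none of them are the real `trl` or `llmtuner` classes.

```python
from contextlib import nullcontext


class UpstreamTrainer:
    """Hypothetical stand-in for a library trainer (not the real trl class)."""

    def __init__(self):
        # Assumption: the library sets this flag inside its own __init__.
        self._peft_has_been_casted_to_bf16 = False

    def compute_loss(self):
        # Reads the flag later, like the failing line in the traceback.
        use_autocast = self._peft_has_been_casted_to_bf16
        with nullcontext():
            return "autocast" if use_autocast else "no autocast"


class BrokenCustomTrainer(UpstreamTrainer):
    def __init__(self):
        # Skips UpstreamTrainer.__init__, so the flag is never created.
        pass


class PatchedCustomTrainer(UpstreamTrainer):
    def __init__(self):
        # Workaround sketch: define the attribute the parent method expects.
        self._peft_has_been_casted_to_bf16 = False


if __name__ == "__main__":
    try:
        BrokenCustomTrainer().compute_loss()
    except AttributeError as err:
        print("broken subclass:", err)
    print("patched subclass:", PatchedCustomTrainer().compute_loss())
```

If this is indeed the cause, the two usual remedies are to set the missing attribute in the custom trainer's `__init__`, or to install a `trl` version that matches the one the training code was written against.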
Expected behavior
No response
System Info
No response
Others
No response