Description
Reminder
- I have read the README and searched the existing issues.
System Info
- llamafactory version: 0.8.3.dev0
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- PyTorch version: 2.3.0+cu121 (GPU)
- Transformers version: 4.41.2
- Datasets version: 2.20.0
- Accelerate version: 0.31.0
- PEFT version: 0.11.1
- TRL version: 0.9.4
- GPU type: NVIDIA A100-SXM4-40GB
Reproduction
Run on Colab:
!llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml # completes normally
!llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml # this step raises the error below
Traceback (most recent call last):
File "/usr/local/bin/llamafactory-cli", line 8, in <module>
sys.exit(main())
File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/cli.py", line 110, in main
run_exp()
File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/tuner.py", line 54, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 58, in run_ppo
ppo_trainer = CustomPPOTrainer(
File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 118, in __init__
PPOTrainer.__init__(
File "/usr/local/lib/python3.10/dist-packages/trl/trainer/ppo_trainer.py", line 227, in __init__
self.accelerator.init_trackers(
File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 685, in _inner
return PartialState().on_main_process(function)(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 2586, in init_trackers
tracker.store_init_configuration(config)
File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 79, in execute_on_main_process
return PartialState().on_main_process(function)(self, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 211, in store_init_configuration
self.writer.add_hparams(values, metric_dict={})
File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/writer.py", line 341, in add_hparams
exp, ssi, sei = hparams(hparam_dict, metric_dict, hparam_domain_discrete)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/summary.py", line 316, in hparams
raise ValueError(
ValueError: value should be one of int, float, str, bool, or torch.Tensor
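The traceback shows the failure happening when the tracker configuration is passed to TensorBoard's `add_hparams`, which only accepts `int`, `float`, `str`, `bool`, or `torch.Tensor` values. This suggests that at least one entry in the PPO configuration is something else (for example `None`, a list, or a dict). The following is a minimal sketch of how one might locate and stringify such entries before they reach the tracker. The helper names `find_unsupported_hparams` and `sanitize_hparams`, and the example keys, are hypothetical and not part of LLaMA-Factory, TRL, or Accelerate. Whether this matches the eventual fix is an assumption.

```python
from typing import Any, Dict, List

# Scalar types that TensorBoard's hparams() accepts directly.
# torch.Tensor is also accepted there, but it is left out of this sketch
# to keep it dependency-free (a tensor would be stringified below).
_ALLOWED_TYPES = (int, float, str, bool)


def find_unsupported_hparams(config: Dict[str, Any]) -> List[str]:
    """Return the keys whose values would be rejected by add_hparams."""
    return [key for key, value in config.items()
            if not isinstance(value, _ALLOWED_TYPES)]


def sanitize_hparams(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config in which unsupported values are stringified."""
    clean: Dict[str, Any] = {}
    for key, value in config.items():
        if isinstance(value, _ALLOWED_TYPES):
            clean[key] = value
        else:
            # None, lists, dicts, enums, etc. become their string form.
            clean[key] = str(value)
    return clean


if __name__ == "__main__":
    # Hypothetical configuration values, for illustration only.
    example = {
        "learning_rate": 1e-5,
        "log_with": None,
        "target_modules": ["q_proj", "v_proj"],
    }
    print(find_unsupported_hparams(example))
    print(sanitize_hparams(example))
```

Running `find_unsupported_hparams` on the configuration that is handed to `init_trackers` should show which keys trigger the `ValueError`. Stringifying them keeps the values visible in TensorBoard without changing the training arguments themselves.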
Expected behavior
No response
Others
No response