
Running the PPO example fails with: value should be one of int, float, str, bool, or torch.Tensor #4458

Closed
@xudong2019

Description

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.8.3.dev0
  • Platform: Linux-6.1.85+-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version: 2.3.0+cu121 (GPU)
  • Transformers version: 4.41.2
  • Datasets version: 2.20.0
  • Accelerate version: 0.31.0
  • PEFT version: 0.11.1
  • TRL version: 0.9.4
  • GPU type: NVIDIA A100-SXM4-40GB

Reproduction

Run on Colab:
!llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml # completes normally
!llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml # this step raises the error


Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/cli.py", line 110, in main
    run_exp()
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/tuner.py", line 54, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 58, in run_ppo
    ppo_trainer = CustomPPOTrainer(
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 118, in __init__
    PPOTrainer.__init__(
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/ppo_trainer.py", line 227, in __init__
    self.accelerator.init_trackers(
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 685, in _inner
    return PartialState().on_main_process(function)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 2586, in init_trackers
    tracker.store_init_configuration(config)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 79, in execute_on_main_process
    return PartialState().on_main_process(function)(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 211, in store_init_configuration
    self.writer.add_hparams(values, metric_dict={})
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/writer.py", line 341, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict, hparam_domain_discrete)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/summary.py", line 316, in hparams
    raise ValueError(
ValueError: value should be one of int, float, str, bool, or torch.Tensor
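
The traceback ends in the TensorBoard hparams check, which suggests that at least one value in the config handed to the tracker (for example a list or None) is not an int, float, str, bool, or torch.Tensor. One possible workaround, shown as a minimal sketch only, is to stringify such values before they reach the tracker. The helper name sanitize_hparams is hypothetical and not part of LLaMA-Factory, TRL, or Accelerate, and this is not necessarily how the issue was resolved upstream.

```python
from typing import Any, Dict

# torch is optional here so the sketch also runs without it installed.
try:
    import torch
    _ALLOWED_TYPES = (int, float, str, bool, torch.Tensor)
except ImportError:
    _ALLOWED_TYPES = (int, float, str, bool)


def sanitize_hparams(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: keep values of the types named in the error
    message and convert everything else (lists, None, dicts, ...) to str."""
    return {
        key: value if isinstance(value, _ALLOWED_TYPES) else str(value)
        for key, value in config.items()
    }


if __name__ == "__main__":
    # Made-up config values for illustration, not taken from the example YAML.
    raw = {"learning_rate": 1e-5, "lora_target": ["q_proj", "v_proj"], "reward_model": None}
    print(sanitize_hparams(raw))
```

Where such a helper would be applied (for instance, to the dict passed as the tracker config) depends on the installed TRL and LLaMA-Factory versions, so treat the placement as an assumption to verify locally.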

Expected behavior

No response

Others

No response

Labels

solved: This problem has already been solved
