Description
Reminder
- I have read the README and searched the existing issues.
System Info
- llamafactory version: 0.8.1.dev0
- Platform: Linux-5.4.0-155-generic-x86_64-with-glibc2.31
- Python version: 3.10.14
- PyTorch version: 2.1.2+cu121 (GPU)
- Transformers version: 4.41.2
- Datasets version: 2.18.0
- Accelerate version: 0.31.0
- PEFT version: 0.11.1
- TRL version: 0.9.4
- GPU type: NVIDIA A800-SXM4-80GB
- DeepSpeed version: 0.14.0
- Bitsandbytes version: 0.43.0
- vLLM version: 0.4.0.post1
Reproduction
Command:
```bash
llamafactory-cli train \
    --stage ppo \
    --do_train True \
    --model_name_or_path saves/LLaMA3-8B/full/train_2024-06-10-00-24-00 \
    --preprocessing_num_workers 16 \
    --finetuning_type full \
    --template default \
    --flash_attn auto \
    --dataset_dir data \
    --dataset Taiyi_test \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/LLaMA3-8B/full/train_2024-06-10-16-43-48 \
    --fp16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --reward_model saves/LLaMA3-8B/full/train_2024-06-10-00-24-00 \
    --reward_model_type full \
    --deepspeed cache/ds_z3_offload_config.json \
    --top_k 0 \
    --top_p 0.6
```
Error log:
```text
06/10/2024 13:46:22 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
Traceback (most recent call last):
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 40, in run_ppo
    reward_model = create_reward_model(model, model_args, finetuning_args)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/utils.py", line 151, in create_reward_model
    reward_model = load_model(
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/model/loader.py", line 116, in load_model
    patch_config(config, tokenizer, model_args, init_kwargs, is_trainable)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/model/patcher.py", line 82, in patch_config
    if init_kwargs["device_map"] == "auto":
KeyError: 'device_map'
```
Expected behavior
Running PPO training fails with `KeyError: 'device_map'` at LLaMA-Factory/src/llamafactory/model/patcher.py, line 82. A membership check probably needs to be added:

```python
if "device_map" in init_kwargs and init_kwargs["device_map"] == "auto":
    init_kwargs["offload_folder"] = model_args.offload_folder
```

If this part is supposed to have special logic, please adjust the fix accordingly.
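As a standalone sketch of the guard proposed above (using a plain dict and a string in place of the loader's real `init_kwargs` and `model_args.offload_folder`), the membership check makes the missing-key case a no-op instead of a crash:

```python
# Stand-in for the patch_config logic at patcher.py line 82; the names below
# are illustrative, not the actual function in LLaMA-Factory.
def patch_device_map(init_kwargs: dict, offload_folder: str) -> dict:
    # dict.get returns None when "device_map" is absent, so this single
    # condition covers both clauses of the proposed
    # `"device_map" in init_kwargs and init_kwargs["device_map"] == "auto"`.
    # A bare init_kwargs["device_map"] lookup is what raises KeyError in the
    # PPO reward-model path, where device_map is never populated.
    if init_kwargs.get("device_map") == "auto":
        init_kwargs["offload_folder"] = offload_folder
    return init_kwargs

# PPO reward-model case: no "device_map" key -> no KeyError, dict unchanged.
print(patch_device_map({}, "offload"))
# Normal case: device_map == "auto" -> offload_folder gets set.
print(patch_device_map({"device_map": "auto"}, "offload"))
```

`dict.get` is an equivalent, slightly tighter alternative to the two-clause `in` check; either form avoids the KeyError.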
Others
No response