Description
System Info
Windows 11
Python 3.12.6
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
I downloaded the meta-llama/Llama-3.2-3B-Instruct model from https://www.llama.com/ and the script convert_llama_weights_to_hf.py from https://github.com/huggingface/transformers, then ran the following command, which raised an error:
python convert_llama_weights_to_hf.py --input_dir llama-v3p2-3b-base --model_size 3B --output_dir llama-v3p2-3b-base-huggingface
Expected behavior
Converting the tokenizer.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in #24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Saving a LlamaTokenizerFast to llama-v3p2-3b-base-huggingface.
Converting the model.
Fetching all parameters from the checkpoint at llama-v3p2-3b-base.
Loading the checkpoint in a Llama model.
Loading checkpoint shards: 72%|███████████████████████████████████████▊ | 21/29 [00:02<00:00, 9.50it/s]
Traceback (most recent call last):
File "C:\WORK\AITools\convert_llama_weights_to_hf.py", line 601, in
main()
File "C:\WORK\AITools\convert_llama_weights_to_hf.py", line 587, in main
write_model(
File "C:\WORK\AITools\convert_llama_weights_to_hf.py", line 417, in write_model
model = LlamaForCausalLM.from_pretrained(tmp_model_path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 262, in _wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 4319, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 4897, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 896, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "C:\ProgramData\anaconda3\Lib\site-packages\accelerate\utils\modeling.py", line 287, in set_module_tensor_to_device
raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([128256, 3072]) in "weight" (which has shape torch.Size([128003, 3072])), this looks incorrect.
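For context (my own reading of the error, not from the converter's documentation): torch.Size([128256, 3072]) matches the 128256-token vocabulary of the Llama 3.x checkpoint shards, while the model being loaded expects only 128003 embedding rows, so the config the script writes into the temporary model directory appears to carry a different vocab_size than the checkpoint. A minimal sketch of the shape guard that fires, with a hypothetical helper name check_embedding_shapes standing in for accelerate's set_module_tensor_to_device check:

```python
# Hypothetical illustration of the failing check. 128256 is the row count
# of the checkpoint's embedding weight; 128003 is the row count the
# freshly-built model expects (from the vocab_size in its config.json).
CHECKPOINT_ROWS = 128256
CONFIG_ROWS = 128003
HIDDEN_SIZE = 3072


def check_embedding_shapes(ckpt_rows: int, cfg_rows: int) -> bool:
    """Mimic the shape guard: refuse to copy a tensor whose shape does not
    match the destination parameter's shape."""
    if ckpt_rows != cfg_rows:
        raise ValueError(
            f"Trying to set a tensor of shape torch.Size([{ckpt_rows}, {HIDDEN_SIZE}]) "
            f'in "weight" (which has shape torch.Size([{cfg_rows}, {HIDDEN_SIZE}])), '
            "this looks incorrect."
        )
    return True
```

Under this reading, the fix would be to make the vocab size the converter writes agree with the checkpoint's actual embedding shape (or to pass the matching --llama_version / tokenizer for this model), rather than anything in the loading code itself.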