
ValueError: Trying to set a tensor of shape torch.Size([128256, 3072]) in "weight" (which has shape torch.Size([128003, 3072])), this looks incorrect #36350

Open
@yourtiger

Description

System Info

Windows 11
Python 3.12.6

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I downloaded the meta-llama/Llama-3.2-3B-Instruct model from https://www.llama.com/, then downloaded convert_llama_weights_to_hf.py from https://github.com/huggingface/transformers. Running the conversion produced an error:
python convert_llama_weights_to_hf.py --input_dir llama-v3p2-3b-base --model_size 3B --output_dir llama-v3p2-3b-base-huggingface
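
A likely cause (my assumption, not confirmed here): convert_llama_weights_to_hf.py defaults to --llama_version 1, so the target model's vocab size gets derived from the tokenizer (128003) instead of the 128256 rows the Llama 3.2 embedding actually has. If so, passing the version explicitly should make the shapes line up (flag names taken from the script's argparse setup; double-check them against your copy of the script):

python convert_llama_weights_to_hf.py --input_dir llama-v3p2-3b-base --model_size 3B --output_dir llama-v3p2-3b-base-huggingface --llama_version 3.2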

Expected behavior

Converting the tokenizer.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in #24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Saving a LlamaTokenizerFast to llama-v3p2-3b-base-huggingface.
Converting the model.
Fetching all parameters from the checkpoint at llama-v3p2-3b-base.
Loading the checkpoint in a Llama model.
Loading checkpoint shards: 72%|███████████████████████████████████████▊ | 21/29 [00:02<00:00, 9.50it/s]
Traceback (most recent call last):
  File "C:\WORK\AITools\convert_llama_weights_to_hf.py", line 601, in <module>
    main()
  File "C:\WORK\AITools\convert_llama_weights_to_hf.py", line 587, in main
    write_model(
  File "C:\WORK\AITools\convert_llama_weights_to_hf.py", line 417, in write_model
    model = LlamaForCausalLM.from_pretrained(tmp_model_path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 262, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 4319, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 4897, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\Lib\site-packages\transformers\modeling_utils.py", line 896, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "C:\ProgramData\anaconda3\Lib\site-packages\accelerate\utils\modeling.py", line 287, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([128256, 3072]) in "weight" (which has shape torch.Size([128003, 3072])), this looks incorrect.
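
To confirm which side is wrong, a minimal diagnostic sketch. The tmp/config.json location and the consolidated.00.pth / tok_embeddings.weight names are assumptions based on how the converter and Meta's checkpoint layout usually look, so adjust the paths to your setup:

import json
import torch

# Vocab size the HF model was built with (the 128003 side of the error).
# Assumed path: the converter writes an intermediate config under <output_dir>/tmp.
with open("llama-v3p2-3b-base-huggingface/tmp/config.json") as f:
    print("config vocab_size:", json.load(f)["vocab_size"])

# Embedding shape actually stored in the original Meta checkpoint
# (the 128256 side of the error). Key name assumed from Meta's format.
state_dict = torch.load(
    "llama-v3p2-3b-base/consolidated.00.pth", map_location="cpu", weights_only=True
)
print("checkpoint tok_embeddings:", tuple(state_dict["tok_embeddings.weight"].shape))

If the config reports 128003 while the checkpoint tensor is (128256, 3072), the converter built the model with the wrong vocab size, which points at the version flag suggested above.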
