AttributeError: 'AutoModelForCausalLMWithValueHead' object has no attribute 'get_input_embeddings' #1831
Reproduction
python src/train_bash.py \
    --stage rm \
    --model_name_or_path /home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/ \
    --do_train True \
    --finetuning_type lora \
    --quantization_bit 8 \
    --template chatglm3 \
    --flash_attn False \
    --shift_attn False \
    --dataset_dir data \
    --dataset comparison_gpt4_en \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 20.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 3 \
    --gradient_accumulation_steps 3 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --neftune_noise_alpha 0 \
    --train_on_prompt False \
    --upcast_layernorm True \
    --lora_rank 8 \
    --lora_dropout 0.1 \
    --lora_target query_key_value \
    --resume_lora_training True \
    --output_dir /home/bihai/datas/LLM/train_2023-12-13-16-37-53 \
    --fp16 True \
    --plot_loss True
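For context, the class named in the error appears to be trl's AutoModelForCausalLMWithValueHead, which the rm stage uses to put a value head on top of the base model. Below is a minimal sketch (not part of the original report) that reproduces the same attribute lookup outside train_bash.py. It assumes trl and transformers are installed, reuses the checkpoint path from the command above, and needs enough memory to load ChatGLM3-6B:

```python
# Sketch: probe the attribute access that fails during the rm stage.
from transformers import AutoModelForCausalLM
from trl import AutoModelForCausalLMWithValueHead

path = "/home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/"
base = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLMWithValueHead.from_pretrained(base)

# The wrapper keeps the original model as `pretrained_model`. Whether the two
# checks below return True depends on the trl version and on whether the
# checkpoint's modeling_chatglm.py defines get_input_embeddings at all.
print(hasattr(model, "get_input_embeddings"))
print(hasattr(model.pretrained_model, "get_input_embeddings"))
```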
Expected behavior
No response
System Info
AttributeError: 'AutoModelForCausalLMWithValueHead' object has no attribute 'get_input_embeddings'
Others
The log shows that the model weights were loaded successfully, as shown below:
12/13/2023 17:15:04 - INFO - llmtuner.data.loader - Loading dataset comparison_gpt4_data_en.json...
Using custom data configuration default-9a44b34ac295f56e
Loading Dataset Infos from /home/bihai/anaconda3/envs/llm-fine-tune/lib/python3.10/site-packages/datasets/packaged_modules/json
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /home/bihai/.cache/huggingface/datasets/json/default-9a44b34ac295f56e/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96
Found cached dataset json (/home/bihai/.cache/huggingface/datasets/json/default-9a44b34ac295f56e/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96)
Loading Dataset info from /home/bihai/.cache/huggingface/datasets/json/default-9a44b34ac295f56e/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96
[INFO|tokenization_utils_base.py:2024] 2023-12-13 17:15:05,756 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2024] 2023-12-13 17:15:05,756 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2024] 2023-12-13 17:15:05,756 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2024] 2023-12-13 17:15:05,756 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2024] 2023-12-13 17:15:05,756 >> loading file tokenizer.json
[INFO|configuration_utils.py:737] 2023-12-13 17:15:05,861 >> loading configuration file /home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/config.json
[INFO|configuration_utils.py:737] 2023-12-13 17:15:05,862 >> loading configuration file /home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/config.json
[INFO|configuration_utils.py:802] 2023-12-13 17:15:05,862 >> Model config ChatGLMConfig {
"_name_or_path": "/home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/",
"add_bias_linear": false,
"add_qkv_bias": true,
"apply_query_key_layer_scaling": true,
"apply_residual_connection_post_layernorm": false,
"architectures": [
"ChatGLMModel"
],
"attention_dropout": 0.0,
"attention_softmax_in_fp32": true,
"auto_map": {
"AutoConfig": "configuration_chatglm.ChatGLMConfig",
"AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
},
"bias_dropout_fusion": true,
"classifier_dropout": null,
"eos_token_id": 2,
"ffn_hidden_size": 13696,
"fp32_residual_connection": false,
"hidden_dropout": 0.0,
"hidden_size": 4096,
"kv_channels": 128,
"layernorm_epsilon": 1e-05,
"model_type": "chatglm",
"multi_query_attention": true,
"multi_query_group_num": 2,
"num_attention_heads": 32,
"num_layers": 28,
"original_rope": true,
"pad_token_id": 0,
"padded_vocab_size": 65024,
"post_layer_norm": true,
"pre_seq_len": null,
"prefix_projection": false,
"quantization_bit": 0,
"rmsnorm": true,
"seq_length": 8192,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.36.0",
"use_cache": true,
"vocab_size": 65024
}
12/13/2023 17:15:05 - INFO - llmtuner.model.loader - Quantizing model to 8 bit.
[INFO|modeling_utils.py:3329] 2023-12-13 17:15:05,893 >> loading weights file /home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/pytorch_model.bin.index.json
[INFO|modeling_utils.py:1341] 2023-12-13 17:15:05,893 >> Instantiating ChatGLMForConditionalGeneration model under default dtype torch.float16.
[INFO|configuration_utils.py:826] 2023-12-13 17:15:05,894 >> Generate config GenerationConfig {
"eos_token_id": 2,
"pad_token_id": 0
}
[INFO|modeling_utils.py:3469] 2023-12-13 17:15:06,342 >> Detected 8-bit loading: activating 8-bit loading for this model
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:12<00:00, 1.72s/it]
[INFO|modeling_utils.py:4173] 2023-12-13 17:15:20,012 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[INFO|modeling_utils.py:4181] 2023-12-13 17:15:20,012 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /home/bihai/.cache/modelscope/hub/ZhipuAI/chatglm3-6b/.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|modeling_utils.py:3739] 2023-12-13 17:15:20,016 >> Generation config file not found, using a generation config created from the model config.
12/13/2023 17:15:20 - WARNING - llmtuner.model.utils - Current model does not support resizing token embeddings.
12/13/2023 17:15:20 - INFO - llmtuner.model.utils - Upcasting weights in layernorm in float32.
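Both the "does not support resizing token embeddings" warning above and the AttributeError in the report involve access to the model's input embeddings. As a stop-gap sketch only, offered as an assumption rather than the project's actual fix, one could delegate the missing accessors from the value-head wrapper to the wrapped model; this only helps if the checkpoint's modeling_chatglm.py implements them:

```python
# Stop-gap sketch (assumption, not the project's fix): forward the embedding
# accessors from the trl wrapper to the wrapped ChatGLM model, if present.
# `model` is the AutoModelForCausalLMWithValueHead instance built for --stage rm.
def forward_embedding_accessors(model):
    inner = model.pretrained_model  # the wrapped ChatGLMForConditionalGeneration
    for name in ("get_input_embeddings", "set_input_embeddings"):
        if not hasattr(model, name) and hasattr(inner, name):
            # Bind the inner model's method onto the wrapper so downstream
            # code calling model.get_input_embeddings() keeps working.
            setattr(model, name, getattr(inner, name))
```

Updating trl or the ChatGLM modeling files so the accessor is available natively would be the cleaner route, if such versions are available.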