Support internlm2 #1392
Conversation
@PaulX1029 Hi, have you rebuilt and reinstalled tensorrt-llm? You can find the installation location by
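The exact command suggested above is truncated in this thread. One generic way to find where pip installed a package is to query its module spec; this is a sketch using the standard library, not the command from the comment (`package_location` is a hypothetical helper name):

```python
import importlib.util

def package_location(name):
    """Return the path of a package's entry file, or None if not installed."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

# For the thread's case you would call package_location("tensorrt_llm");
# here we demonstrate with a stdlib module that is always present.
print(package_location("json"))  # a path ending in json/__init__.py
```

Alternatively, `pip show tensorrt_llm` prints the installed version and location.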
@RunningLeon May I ask which build method you used? I installed trtllm from pip, and I'd like to align with your build method and rebuild.
@RunningLeon Thanks a lot for your work! I'd like to ask: does the internlm2-20b network in internvl-1.5 differ from the plain internlm2-20b? After converting with the script in this PR, the outputs are all garbled.
@RunningLeon Hi, we fine-tuned the internlm2 model with LoRA. We can now convert the base model to the llama format, but not the LoRA part. We tried modifying the code in InternLM/tools/convert2llame.py to transfer the LoRA weights to the llama style, but it did not work. Is there any other tool that works for LoRA?
@nv-guomingz Hi, sorry to bother you, but when will this PR be merged? Do I need to fix the conflicts?
Review thread on cpp/tensorrt_llm/kernels/decoderMaskedMultiheadAttentionUtils.h (outdated, resolved)
@nv-guomingz Hi, the conflicts with the main branch are resolved. Looking forward to your review comments. Thanks.
@RunningLeon May I ask why internlm2 needs its own convert_checkpoint.py instead of reusing llama's convert_checkpoint.py? internlm uses llama's convert_checkpoint.py directly.
Hi, internlm2's W_qkv is fused into a single tensor, and some of its parameter names are not aligned with llama's. That is why llama's convert_checkpoint.py cannot be used directly.
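To illustrate the fused-W_qkv point: a converter has to split the single tensor into separate Q/K/V weights before writing llama-style checkpoints. The sketch below assumes a grouped layout (for each KV head: `q_per_kv` query heads, then one K head, then one V head); the layout and the function name are illustrative assumptions, not the PR's actual code:

```python
import numpy as np

def split_fused_wqkv(w_qkv, num_heads, num_kv_heads, head_dim):
    """Split a fused W_qkv of shape [(num_heads + 2*num_kv_heads)*head_dim, hidden]
    into separate q, k, v weight matrices.

    Assumed layout (illustrative): rows are grouped per KV head as
    [q_per_kv query heads, 1 key head, 1 value head].
    """
    hidden = w_qkv.shape[1]
    q_per_kv = num_heads // num_kv_heads
    grouped = w_qkv.reshape(num_kv_heads, q_per_kv + 2, head_dim, hidden)
    w_q = grouped[:, :q_per_kv].reshape(num_heads * head_dim, hidden)
    w_k = grouped[:, -2].reshape(num_kv_heads * head_dim, hidden)
    w_v = grouped[:, -1].reshape(num_kv_heads * head_dim, hidden)
    return w_q, w_k, w_v

# Example: 8 query heads sharing 2 KV heads, head_dim 4, hidden 16.
w = np.arange((8 + 2 * 2) * 4 * 16, dtype=np.float32).reshape(-1, 16)
q, k, v = split_fused_wqkv(w, num_heads=8, num_kv_heads=2, head_dim=4)
print(q.shape, k.shape, v.shape)
```

A llama-style checkpoint keeps these as three separate tensors (and under different parameter names), which is the second mismatch the comment mentions.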
Thank you for this explanation!
Hi @RunningLeon, sorry for the late response due to internal task priorities.
@nv-guomingz Done. Hope merging with main is OK.
Thanks @RunningLeon. Could you please rebase your commits into a single commit? That would make further integration easier.
Force-pushed from 5a8ee31 to 94f57cc
Hi @RunningLeon, I've managed to file the merge request in our internal repo and testing is ongoing.
@RunningLeon Internlm2 has been added in today's update.
This PR supports the conversion of internlm2 from hf to trt-llm checkpoints with: