
kv-int8 output wrong result #133

Closed
sleepwalker2017 opened this issue Oct 26, 2023 · 6 comments
Labels
triaged Issue has been triaged by maintainers

sleepwalker2017 commented Oct 26, 2023

Using fp16, the result is:

Input: "<s>Resolving the Israel @-@ Palestine Conflict : University of British Columbia , Vancouver , Jan 21 , 2009 ."
Output: "

The conflict between Israel and Palestine has been a longstanding issue that has been a source of tension and violence in the Middle East for decades. The situation has been further complicated by the involvement of other countries and international organizations, making it a complex and multifaceted issue.

One of the main challenges in resolving the conflict is the deep-seated animosity and mistrust between the two sides. Both Israelis"

Using int8, the result is:

Input: "<s>Resolving the Israel @-@ Palestine Conflict : University of British Columbia , Vancouver , Jan 21 , 2009 ."
Output: "
I g ishaz,
IQ on the
IQ about this ish hashts to
I,1 on the
I,1 Blog; weighed" data
Reg inter-
Reg hobser,11111111 and the
In
In
In
In
In
In
The sne (and#8:
The sne gives two websites"

Here is my convert script:

python build.py --model_dir /data/vicuna-13b/vicuna-13b-v1.5/ \
                --dtype float16 \
                --use_gpt_attention_plugin float16 \
                --use_gemm_plugin float16 \
                --output_dir ./tmp/llama/13B-kv-int8/trt_engines/fp16/2-gpu/ \
                --enable_context_fmha_fp32_acc \
                --world_size 2 \
                --tp_size 2 \
                --max_batch_size 32 \
                --int8_kv_cache

How I run it:

mpirun  -n 2 --allow-run-as-root python3 run.py --max_output_len=96 \
               --tokenizer_dir /data/models/llama-7b-hf \
               --engine_dir=./tmp/llama/13B-kv-int8/trt_engines/fp16/2-gpu/

byshiue commented Oct 26, 2023

How do you get the scales for the kv cache? int8 is very sensitive to them.
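That sensitivity can be illustrated with a toy example. This is not TensorRT-LLM's actual implementation, just a minimal sketch of symmetric per-tensor int8 quantization applied to a fake KV tensor, comparing a scale calibrated from the data's observed range against a naive default scale:

```python
import numpy as np

def quantize_int8(x, scale):
    """Symmetric int8 quantization: round(x / scale), clamped to [-127, 127]."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q, scale):
    """Map int8 values back to float by multiplying with the scale."""
    return q.astype(np.float32) * scale

# Fake "KV cache" activations with a typical small spread of magnitudes.
rng = np.random.default_rng(0)
kv = rng.normal(0.0, 0.5, size=4096).astype(np.float32)

# A calibrated scale matches the observed dynamic range of the data...
good_scale = np.abs(kv).max() / 127.0
# ...while a naive scale of 1.0 collapses most values to -1, 0, or +1.
bad_scale = 1.0

err_good = np.abs(dequantize(quantize_int8(kv, good_scale), good_scale) - kv).mean()
err_bad = np.abs(dequantize(quantize_int8(kv, bad_scale), bad_scale) - kv).mean()
print(err_good < err_bad)  # prints: True
```

With a badly chosen scale, almost the entire int8 range goes unused and the dequantized attention keys/values bear little resemblance to the originals, which is consistent with the meaningless output above.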

byshiue self-assigned this Oct 26, 2023
byshiue added the triaged (Issue has been triaged by maintainers) label Oct 26, 2023
sleepwalker2017 (Author) commented:

It seems I didn't do anything extra to get the scales.
Are there any documents about this?


byshiue commented Oct 26, 2023

juney-nvidia (Collaborator) commented:

@sleepwalker2017 Hi, is there anything new after following @byshiue's suggestion? Or can we close this issue?


sleepwalker2017 commented Oct 29, 2023 via email

jdemouth-nvidia (Collaborator) commented:

Thanks. I'm closing the issue.
