kv-int8 output wrong result #133
Labels
triaged
Issue has been triaged by maintainers
Comments
sleepwalker2017
changed the title
kv-int8 output result seems error
kv-int8 output result seems meaningless
Oct 26, 2023
sleepwalker2017
changed the title
kv-int8 output result seems meaningless
kv-int8 output wrong result
Oct 26, 2023
How do you get the scales for the KV cache? INT8 is sensitive to them.
It seems I didn't do anything extra to get the scales.
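To illustrate why the scales matter, here is a minimal sketch assuming plain symmetric per-tensor INT8 quantization. It is not TensorRT-LLM's actual KV cache code, and the thread does not confirm that an uncalibrated scale was the cause here. The helper names, the sample values, and the "uncalibrated" scale of 1/127 are all hypothetical.

```python
def quantize_int8(values, scale):
    """Symmetric INT8 quantization: q = clamp(round(x / scale), -128, 127)."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def dequantize_int8(q_values, scale):
    """Map INT8 codes back to floats: x ~= q * scale."""
    return [q * scale for q in q_values]


def max_abs_error(values, scale):
    """Largest round-trip error after quantizing and dequantizing."""
    restored = dequantize_int8(quantize_int8(values, scale), scale)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical stand-in for a few KV cache activations.
kv = [0.03, -1.7, 4.2, -6.35, 0.9, 2.5]

# Calibrated scale: derived from the observed value range.
calibrated_scale = max(abs(v) for v in kv) / 127.0

# Hypothetical uncalibrated scale: assumes values lie in [-1, 1],
# so anything outside that range is clamped.
uncalibrated_scale = 1.0 / 127.0

err_calibrated = max_abs_error(kv, calibrated_scale)
err_uncalibrated = max_abs_error(kv, uncalibrated_scale)

print(f"calibrated scale:   max error = {err_calibrated:.4f}")
print(f"uncalibrated scale: max error = {err_uncalibrated:.4f}")
```

With the calibrated scale the round-trip error stays within half a quantization step. With the mismatched scale the large values saturate at the INT8 limits, which is the kind of distortion that could turn generated text into nonsense.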
@sleepwalker2017 Hi, is there anything new after following @byshiue's suggestion? Or can we close this issue?
Hello, it's done. The output is normal now. Thank you.
Thanks. I'm closing the issue.
using fp16, the result is:
using int8, the result is:
Here is my convert script:
how to run: