bug: GPU is using not for all models #4352

Open · 1 of 3 tasks
Iowerth opened this issue Dec 29, 2024 · 4 comments

Labels: category: hardware, type: bug (Something isn't working)

Comments

Iowerth commented Dec 29, 2024

Jan version

0.5.11

Describe the Bug

Windows 10 x64 / RTX 3080 / GPU Acceleration enabled

I have two models installed:

  • Openchat-3.5 7B Q4
  • Qwen2.5 Coder 14B Instruct Q4

If I use Openchat, GPU acceleration works; if I use Qwen, it does not. Why is that?

I have the following NVIDIA driver and CUDA versions installed:

Sun Dec 29 23:13:47 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 566.36                 Driver Version: 566.36         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080      WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   48C    P5             38W /  349W |    7233MiB /  10240MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Oct_30_01:18:48_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0

settings.json file

{
  "notify": true,
  "run_mode": "gpu",
  "nvidia_driver": {
    "exist": true,
    "version": "566.36"
  },
  "cuda": {
    "exist": true,
    "version": "12"
  },
  "gpus": [
    {
      "id": "0",
      "vram": "10240",
      "name": "NVIDIA GeForce RTX 3080",
      "arch": "ampere"
    }
  ],
  "gpu_highest_vram": "0",
  "gpus_in_use": [
    "0"
  ],
  "is_initial": false,
  "vulkan": false
}
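
To double-check whether a given model is actually being offloaded, one option is to watch GPU utilization and memory while each model generates. Below is a minimal sketch, assuming nothing beyond Python and a working nvidia-smi on PATH; it is only a debugging aid, not anything Jan itself provides:

import subprocess
import time

# Poll nvidia-smi once per second; run this in a separate terminal, then start
# a chat with each model in Jan and compare the readings. If the 14B model is
# not offloaded to the GPU, utilization and memory use should stay roughly
# flat while it generates.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

try:
    while True:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        for line in out.stdout.strip().splitlines():  # one line per GPU
            util, used, total = (v.strip() for v in line.split(","))
            print(f"GPU util: {util}%  VRAM: {used}/{total} MiB")
        time.sleep(1.0)
except KeyboardInterrupt:
    pass

For rough scale, a Q4 quantization of a 14B model is on the order of 8-9 GB on disk, so full offload plus context may not fit in 10 GB of VRAM and could fall back to partial or CPU-only execution; that is only a guess, not a confirmed diagnosis.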

Steps to Reproduce

No response

Screenshots / Logs

No response

What is your OS?

  • MacOS
  • Windows (selected)
  • Linux
Iowerth added the type: bug (Something isn't working) label Dec 29, 2024
github-project-automation bot moved this to Investigating in Jan & Cortex Dec 29, 2024
imtuyethan (Contributor) commented

cc @TC117 to help reproduce

imtuyethan changed the title from "GPU is using not for all models" to "bug: GPU is using not for all models" Dec 30, 2024
Iowerth (Author) commented Dec 30, 2024

In the video I recorded, you can see the situation I described.

1.mp4

imtuyethan (Contributor) commented Jan 2, 2025

@Iowerth Can I get your detailed parameter settings for each model? Especially the NGL number in the engine settings (in the right sidebar).
The recording itself looks right, since the token speed is higher for the smaller 7B model than for the 14B model.
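
For reference, here is a hypothetical way to dump the NGL value configured for each installed model; the Jan data-folder path and the settings.ngl field name below are assumptions about the model.json layout, not something confirmed in this thread:

import json
from pathlib import Path

# Assumption: Jan's data folder lives at ~/jan and each model has a model.json
# with a "settings" object containing "ngl". Adjust MODELS_DIR if your data
# folder is elsewhere.
MODELS_DIR = Path.home() / "jan" / "models"

for model_json in MODELS_DIR.glob("*/model.json"):
    try:
        cfg = json.loads(model_json.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        continue
    # "settings.ngl" is the assumed location of the GPU layer count.
    ngl = cfg.get("settings", {}).get("ngl")
    print(f"{model_json.parent.name}: ngl={ngl}")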

imtuyethan assigned louis-jan and unassigned TC117 Jan 2, 2025
Iowerth (Author) commented Jan 2, 2025

@imtuyethan The settings are the defaults. If you need them anyway, I can only provide them after my vacation, around January 12th.

Projects: Jan & Cortex (Status: Investigating)
Development: No branches or pull requests
4 participants