bug: GPU is using not for all models #4352

Open · 1 of 3 tasks
Iowerth opened this issue Dec 29, 2024 · 4 comments

Labels: category: hardware, type: bug (Something isn't working)

Comments

Iowerth commented Dec 29, 2024

Jan version

0.5.11

Describe the Bug

Windows 10 x64 / RTX 3080 / GPU Acceleration enabled

I have two models installed:

  • Openchat-3.5 7B Q4
  • Qwen2.5 Coder 14B Instruct Q4

If I use Openchat, GPU acceleration works; if I use Qwen, it does not. Why is that?

I have the following NVIDIA driver and CUDA versions installed:

Sun Dec 29 23:13:47 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 566.36                 Driver Version: 566.36         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080      WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   48C    P5             38W /  349W |    7233MiB /  10240MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Oct_30_01:18:48_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0

settings.json file

{
  "notify": true,
  "run_mode": "gpu",
  "nvidia_driver": {
    "exist": true,
    "version": "566.36"
  },
  "cuda": {
    "exist": true,
    "version": "12"
  },
  "gpus": [
    {
      "id": "0",
      "vram": "10240",
      "name": "NVIDIA GeForce RTX 3080",
      "arch": "ampere"
    }
  ],
  "gpu_highest_vram": "0",
  "gpus_in_use": [
    "0"
  ],
  "is_initial": false,
  "vulkan": false
}
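
To double-check whether a given model is actually being offloaded, one option is to watch GPU utilization and memory while each model generates. Below is a minimal sketch, assuming nothing beyond Python and a working nvidia-smi on PATH; it is only a debugging aid, not anything Jan itself provides:

import subprocess
import time

# Poll nvidia-smi once per second; run this in a separate terminal, then start
# a chat with each model in Jan and compare the readings. If the 14B model is
# not offloaded to the GPU, utilization and memory use should stay roughly
# flat while it generates.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

try:
    while True:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        for line in out.stdout.strip().splitlines():  # one line per GPU
            util, used, total = (v.strip() for v in line.split(","))
            print(f"GPU util: {util}%  VRAM: {used}/{total} MiB")
        time.sleep(1.0)
except KeyboardInterrupt:
    pass

For rough scale, a Q4 quantization of a 14B model is on the order of 8-9 GB on disk, so full offload plus context may not fit in 10 GB of VRAM and could fall back to partial or CPU-only execution; that is only a guess, not a confirmed diagnosis.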

Steps to Reproduce

No response

Screenshots / Logs

No response

What is your OS?

  • MacOS
  • Windows (selected)
  • Linux
Iowerth added the type: bug (Something isn't working) label Dec 29, 2024
github-project-automation bot moved this to Investigating in Jan & Cortex Dec 29, 2024
imtuyethan (Contributor) commented

cc @TC117 to help reproduce

imtuyethan changed the title from "GPU is using not for all models" to "bug: GPU is using not for all models" Dec 30, 2024
Iowerth (Author) commented Dec 30, 2024

In the video I recorded, you can see the situation I described.

1.mp4

imtuyethan (Contributor) commented Jan 2, 2025

@Iowerth Can I get your detailed parameter settings for each model? Especially the NGL number in the engine settings (in the right sidebar).
The recording itself looks right, since the token speed is higher for the smaller 7B model than for the 14B model.
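
For reference, here is a hypothetical way to dump the NGL value configured for each installed model; the Jan data-folder path and the settings.ngl field name below are assumptions about the model.json layout, not something confirmed in this thread:

import json
from pathlib import Path

# Assumption: Jan's data folder lives at ~/jan and each model has a model.json
# with a "settings" object containing "ngl". Adjust MODELS_DIR if your data
# folder is elsewhere.
MODELS_DIR = Path.home() / "jan" / "models"

for model_json in MODELS_DIR.glob("*/model.json"):
    try:
        cfg = json.loads(model_json.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        continue
    # "settings.ngl" is the assumed location of the GPU layer count.
    ngl = cfg.get("settings", {}).get("ngl")
    print(f"{model_json.parent.name}: ngl={ngl}")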

imtuyethan assigned louis-jan and unassigned TC117 Jan 2, 2025
Iowerth (Author) commented Jan 2, 2025

@imtuyethan The settings are the defaults. If you need them anyway, I can only provide them after my vacation, around January 12th.

Projects: Jan & Cortex (Status: Investigating)
Development: No branches or pull requests
4 participants