
Improve Dockerize support #2849

Merged: 2 commits merged into hiyouga:main from S3Studio:DockerizeSupport on Mar 15, 2024
Conversation

S3Studio (Contributor)

Here are some improvements based on the discussions in PR #2743.
While addressing a compatibility issue, I made some slight alterations to the existing code. Feel free to discuss them with me.

What does this PR do?

Improves on PR #2743.

Before submitting

- Modified the installation method for the extra Python libraries.
- Utilized the host machine's shared memory to improve training performance (see the example `docker run` invocation after this list).

Note that the flash-attn library is installed in this image and the Qwen model will use it automatically. However, if the host machine's GPU is not compatible with the library, an exception will be raised during training:

`FlashAttention only supports Ampere GPUs or newer.`

So if the `--flash_attn` flag is not set, an additional patch to the Qwen model's config is necessary to change the default value of `use_flash_attn` from "auto" to `False` (a sketch of such a patch follows below).
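For reference, a minimal sketch of how the host's shared memory can be exposed to a container; the image tag `llamafactory:latest` is only a placeholder, and both flags shown are standard Docker options:

```bash
# Raise the container's /dev/shm limit (Docker's 64 MB default is too small
# for PyTorch DataLoader workers, which exchange tensors through shm).
docker run --gpus all --shm-size 16G -it llamafactory:latest

# Alternatively, reuse the host's IPC namespace, which shares /dev/shm outright.
docker run --gpus all --ipc host -it llamafactory:latest
```

And a minimal sketch of the config patch described above, assuming the stock Hugging Face Qwen checkpoint, whose remote-code config exposes the `use_flash_attn` field (the model name here is only an example):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Override use_flash_attn before the model is instantiated, so a
# pre-Ampere GPU never goes down the flash-attn code path.
config = AutoConfig.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
config.use_flash_attn = False  # the remote-code default is "auto"

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B", config=config, trust_remote_code=True
)
```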
hiyouga self-requested a review on Mar 15, 2024, 04:24
hiyouga (Owner) left a comment:


Please see the comment

Dockerfile (review thread resolved)
hiyouga added the `pending` (This problem is yet to be addressed) label on Mar 15, 2024
hiyouga merged commit 113cc04 into hiyouga:main on Mar 15, 2024
1 check passed
hiyouga added the `solved` (This problem has been already solved) label and removed the `pending` label on Mar 15, 2024
S3Studio deleted the DockerizeSupport branch on Mar 16, 2024
hiyouga (Owner) commented Mar 28, 2024

Hello @S3Studio, how do we specify the device ID in dockerized training, e.g. CUDA_VISIBLE_DEVICES=0? I think this should be documented in readme.md, since we check the device count before training:

```python
if not from_preview and get_device_count() > 1:
    return ALERTS["err_device_count"][lang]
```
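For context, a sketch of how a device ID is commonly passed into a container; the image tag is a placeholder, while `-e` and `--gpus` are standard Docker flags:

```bash
# Export CUDA_VISIBLE_DEVICES into the container so only GPU 0 is used.
docker run --gpus all -e CUDA_VISIBLE_DEVICES=0 -it llamafactory:latest

# Or let Docker itself expose only the chosen device to the container.
docker run --gpus '"device=0"' -it llamafactory:latest
```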

hiyouga (Owner) commented Mar 28, 2024

You can also review this commit: c1fe6ce

S3Studio (Contributor, Author) commented Apr 2, 2024

> You can also review this commit: c1fe6ce

I believe this commit solves the issue.

hiyouga (Owner) commented Apr 2, 2024

thanks!
