ENH: Display model name in process #1891
Merged
Conversation
frostyplanet force-pushed the model_load_opt branch 8 times, most recently from f7b1127 to e57454e on July 20, 2024 19:40
frostyplanet force-pushed the model_load_opt branch from e57454e to 2136fdd on July 29, 2024 02:06
frostyplanet changed the title from ENH: Model loading optimization to ENH: Display model name in process on Jul 29, 2024
frostyplanet force-pushed the model_load_opt branch 5 times, most recently from b5bb0b6 to 530ede9 on September 5, 2024 10:01
Add the requirements to requirements.txt and requirements_cpu.txt under https://github.com/xorbitsai/inference/tree/main/xinference/deploy/docker
frostyplanet force-pushed the model_load_opt branch from 530ede9 to a23c57d on September 9, 2024 14:59
@qinxuye done
frostyplanet force-pushed the model_load_opt branch from a23c57d to d1de7d4 on September 21, 2024 09:45
There is a conflict again, please resolve it. I think this PR is helpful when we want to see which model is running.
frostyplanet force-pushed the model_load_opt branch from d1de7d4 to 2af5229 on November 5, 2024 05:19
frostyplanet force-pushed the model_load_opt branch from 172455f to d528b87 on November 5, 2024 06:03
The error might be encountered on a freshly launched AWS instance:

  File "/opt/inference/xinference/core/model.py", line 239, in load
    self._model.load()
  File "/opt/inference/xinference/model/image/stable_diffusion/core.py", line 61, in load
    self._model = move_model_to_available_device(self._model)
  File "/opt/inference/xinference/device_utils.py", line 56, in move_model_to_available_device
    return model.to(device)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 418, in to
    module.to(device, dtype)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: [address=172.31.25.185:39061, pid=44] CUDA error: CUDA-capable device(s) is/are busy or unavailable
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
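For illustration, here is a minimal sketch of retrying model loading when CUDA reports the device as busy, as this PR does. The function name `load_with_retry`, the retry count, and the delay are assumptions for the sketch, not the actual xinference implementation:

```python
import logging
import time

logger = logging.getLogger(__name__)

def load_with_retry(load_model, retries: int = 3, delay: float = 5.0):
    """Sketch: retry `load_model` (a zero-arg callable, hypothetical here)
    when CUDA raises the transient "device busy" error."""
    for attempt in range(1, retries + 1):
        try:
            return load_model()
        except RuntimeError as e:
            # Only retry the transient "device busy" error; re-raise the rest.
            if "CUDA-capable device(s) is/are busy" not in str(e):
                raise
            if attempt == retries:
                raise
            logger.warning("CUDA busy (attempt %d/%d), retrying in %.0fs",
                           attempt, retries, delay)
            time.sleep(delay)
```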
`ps auxf` will show "Model: XXX" instead of "python -c from multiprocessing.forkserver import main". ModelActor gains a replica_model_uid arg because the attributes on _model are not uniform across model types.
It also makes the DEBUG log_async output of different models easy to distinguish. A chat() log will look like:

2024-07-14 10:18:23,974 xinference.core.model 2168589 DEBUG Enter wrapped_func, args: (ModelActor(qwen1.5-chat:1_8), 'aaaa', None,
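As a rough sketch (the real xinference log_async helper differs), a decorator like this shows why the actor's __repr__ matters: for a bound method, the first positional argument is `self`, i.e. the ModelActor, so its repr names the replica in every DEBUG line:

```python
import functools
import logging

logger = logging.getLogger("xinference.core.model")

# Minimal sketch, not the actual xinference decorator: log entry into an
# async method, relying on repr() of each argument.
def log_async(func):
    @functools.wraps(func)
    async def wrapped_func(*args, **kwargs):
        logger.debug("Enter wrapped_func, args: %s, kwargs: %s", args, kwargs)
        return await func(*args, **kwargs)
    return wrapped_func
```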
frostyplanet force-pushed the model_load_opt branch from d528b87 to ecf8cfa on November 5, 2024 08:36
qinxuye approved these changes on Nov 5, 2024
LGTM
Display the model name in the process name, to make management and debugging easier. Also includes some logging optimizations.
Retry up to 3 times to load the model on a CUDA busy error; this error can be encountered on a freshly launched AWS instance.
ModelActor.__repr__: add replica_model_uid so the generated wrapper-function log identifies the model.
Previously the Worker used forkserver to spawn model processes, so ps auxf shows "python -c from multiprocessing.forkserver import main" and we cannot distinguish models by process name. (Sometimes we need to debug the network connections or resource usage of a particular model.)
Add the optional dependency setproctitle to rename the process to Model: {replica_model_uid}.
Because not every model type has self._model.model_uid and self._model.repl_id, a replica_model_uid argument was added to ModelActor (a sketch combining both changes follows below).
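A minimal sketch of how the two pieces fit together, assuming a heavily simplified ModelActor (the real class in xinference/core/model.py has far more responsibilities):

```python
# Optional dependency: if setproctitle is missing, skip the rename.
try:
    from setproctitle import setproctitle
except ImportError:
    setproctitle = None

class ModelActor:
    def __init__(self, model, replica_model_uid: str):
        self._model = model
        # Passed in explicitly because not every wrapped model exposes
        # model_uid / repl_id attributes.
        self._replica_model_uid = replica_model_uid
        if setproctitle is not None:
            # Rename the model subprocess so `ps auxf` shows
            # "Model: <replica_model_uid>" instead of the forkserver command.
            setproctitle(f"Model: {replica_model_uid}")

    def __repr__(self) -> str:
        # Used by logging wrappers to tell replicas apart in DEBUG output.
        return f"ModelActor({self._replica_model_uid})"
```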