Merge branch 'main' of https://github.com/xorbitsai/inference
* 'main' of https://github.com/xorbitsai/inference:

  - FEAT: support qwen2.5-coder-instruct and qwen2.5 sglang (xorbitsai#2332)
  - DOC: update models for doc and readme (xorbitsai#2330)
  - BUG: fix stable diffusion from dify tool (xorbitsai#2336)
  - BUG: support old register llm format (xorbitsai#2335)
  - FEAT: Support Qwen 2.5 (xorbitsai#2325)
  - BUG: Fix CosyVoice missing output (xorbitsai#2320)
  - BUG: [UI] Fix registration page bug. (xorbitsai#2315)
  - Bug: modify vllm image version (xorbitsai#2312)
  - BUG: modify vllm image version (xorbitsai#2311)
  - FEAT: qwen2 audio (xorbitsai#2271)
  - BUG: fix sampler_name for img2img (xorbitsai#2301)
  - FEAT: Support yi-coder-chat (xorbitsai#2302)
  - FEAT: support flux.1 image2image and inpainting (xorbitsai#2296)
  - FEAT: support sdapi/img2img (xorbitsai#2293)
  - ENH: Support fish speech 1.4 (xorbitsai#2295)
  - FEAT: Update Qwen2-VL-Model to support flash_attention_2 implementation (xorbitsai#2289)
  - FEAT: support deepseek-v2 and 2.5 (xorbitsai#2292)

  # Conflicts:
  #   xinference/model/audio/cosyvoice.py
Showing 84 changed files with 4,671 additions and 1,864 deletions.
.. _models_builtin_fishspeech-1.4:

==============
FishSpeech-1.4
==============

- **Model Name:** FishSpeech-1.4
- **Model Family:** FishAudio
- **Abilities:** text-to-audio
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** fishaudio/fish-speech-1.4

Execute the following command to launch the model::

   xinference launch --model-name FishSpeech-1.4 --model-type audio
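When scripting deployments, the launch line above can be assembled as a subprocess-ready argv list. A minimal sketch (the ``build_launch_cmd`` helper is hypothetical, not part of Xinference; only the CLI flags come from the command above):

```python
import shlex

def build_launch_cmd(model_name: str, model_type: str) -> list[str]:
    """Assemble the `xinference launch` argv for an audio model."""
    # shlex.quote guards odd characters; shlex.split yields an argv list
    # suitable for subprocess.run without invoking a shell.
    return shlex.split(
        f"xinference launch --model-name {shlex.quote(model_name)} "
        f"--model-type {shlex.quote(model_type)}"
    )

cmd = build_launch_cmd("FishSpeech-1.4", "audio")
# e.g. subprocess.run(cmd, check=True) once an Xinference server is running.
```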
.. _models_llm_deepseek-v2-chat-0628:

========================================
deepseek-v2-chat-0628
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2-chat-0628
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** DeepSeek-V2-Chat-0628 is an improved version of DeepSeek-V2-Chat.

Specifications
^^^^^^^^^^^^^^

Model Spec 1 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available for quantization ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2-Chat-0628
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Chat-0628>`__

Execute the following command to launch the model. Remember to replace ``${engine}`` and ``${quantization}`` with your chosen engine and quantization method from the options listed above::

   xinference launch --model-engine ${engine} --model-name deepseek-v2-chat-0628 --size-in-billions 236 --model-format pytorch --quantization ${quantization}
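The ``${engine}`` and ``${quantization}`` placeholders use shell-style syntax, so the same substitution can be expressed with Python's ``string.Template``. A hedged sketch (the chosen values are illustrative picks from the options listed above, not recommendations):

```python
from string import Template

LAUNCH = Template(
    "xinference launch --model-engine ${engine} "
    "--model-name deepseek-v2-chat-0628 --size-in-billions 236 "
    "--model-format pytorch --quantization ${quantization}"
)

# Per the spec above, vLLM and SGLang require quantization "none",
# so a quantized load is paired with the Transformers engine here.
cmd = LAUNCH.substitute(engine="Transformers", quantization="4-bit")
```

``Template.substitute`` raises ``KeyError`` if a placeholder is left unfilled, which catches a forgotten ``${quantization}`` before the command is run.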
.. _models_llm_deepseek-v2-chat:

========================================
deepseek-v2-chat
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2-chat
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

Specifications
^^^^^^^^^^^^^^

Model Spec 1 (pytorch, 16 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 16
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available for quantization ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2-Lite-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite-Chat>`__

Execute the following command to launch the model. Remember to replace ``${engine}`` and ``${quantization}`` with your chosen engine and quantization method from the options listed above::

   xinference launch --model-engine ${engine} --model-name deepseek-v2-chat --size-in-billions 16 --model-format pytorch --quantization ${quantization}

Model Spec 2 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available for quantization ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Chat>`__

Execute the following command to launch the model. Remember to replace ``${engine}`` and ``${quantization}`` with your chosen engine and quantization method from the options listed above::

   xinference launch --model-engine ${engine} --model-name deepseek-v2-chat --size-in-billions 236 --model-format pytorch --quantization ${quantization}
.. _models_llm_deepseek-v2.5:

========================================
deepseek-v2.5
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2.5
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.

Specifications
^^^^^^^^^^^^^^

Model Spec 1 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available for quantization ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2.5
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2.5>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2.5>`__

Execute the following command to launch the model. Remember to replace ``${engine}`` and ``${quantization}`` with your chosen engine and quantization method from the options listed above::

   xinference launch --model-engine ${engine} --model-name deepseek-v2.5 --size-in-billions 236 --model-format pytorch --quantization ${quantization}
.. _models_llm_deepseek-v2:

========================================
deepseek-v2
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2
- **Languages:** en, zh
- **Abilities:** generate
- **Description:** DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

Specifications
^^^^^^^^^^^^^^

Model Spec 1 (pytorch, 16 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 16
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: Transformers
- **Model ID:** deepseek-ai/DeepSeek-V2-Lite
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite>`__

Execute the following command to launch the model. Remember to replace ``${engine}`` and ``${quantization}`` with your chosen engine and quantization method from the options listed above::

   xinference launch --model-engine ${engine} --model-name deepseek-v2 --size-in-billions 16 --model-format pytorch --quantization ${quantization}

Model Spec 2 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: Transformers
- **Model ID:** deepseek-ai/DeepSeek-V2
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2>`__

Execute the following command to launch the model. Remember to replace ``${engine}`` and ``${quantization}`` with your chosen engine and quantization method from the options listed above::

   xinference launch --model-engine ${engine} --model-name deepseek-v2 --size-in-billions 236 --model-format pytorch --quantization ${quantization}