Commit
* 'main' of https://github.com/xorbitsai/inference:
  FEAT: support qwen2.5-coder-instruct and qwen2.5 sglang (xorbitsai#2332)
  DOC: update models for doc and readme (xorbitsai#2330)
  BUG: fix stable diffusion from dify tool (xorbitsai#2336)
  BUG: support old register llm format (xorbitsai#2335)
  FEAT: Support Qwen 2.5 (xorbitsai#2325)
  BUG: Fix CosyVoice missing output (xorbitsai#2320)
  BUG: [UI] Fix registration page bug. (xorbitsai#2315)
  BUG: modify vllm image version (xorbitsai#2312)
  BUG: modify vllm image version (xorbitsai#2311)
  FEAT: qwen2 audio (xorbitsai#2271)
  BUG: fix sampler_name for img2img (xorbitsai#2301)
  FEAT: Support yi-coder-chat (xorbitsai#2302)
  FEAT: support flux.1 image2image and inpainting (xorbitsai#2296)
  FEAT: support sdapi/img2img (xorbitsai#2293)
  ENH: Support fish speech 1.4 (xorbitsai#2295)
  FEAT: Update Qwen2-VL-Model to support flash_attention_2 implementation (xorbitsai#2289)
  FEAT: support deepseek-v2 and 2.5 (xorbitsai#2292)

# Conflicts:
#	xinference/model/audio/cosyvoice.py
Vanocore committed Sep 22, 2024
2 parents 1c5e6f2 + 5de46e9 commit e3df693
Showing 84 changed files with 4,671 additions and 1,864 deletions.
1 change: 1 addition & 0 deletions .github/workflows/python.yaml
@@ -171,6 +171,7 @@ jobs:
${{ env.SELF_HOST_PYTHON }} -m pip install -U "loguru"
${{ env.SELF_HOST_PYTHON }} -m pip install -U "natsort"
${{ env.SELF_HOST_PYTHON }} -m pip install -U "loralib"
${{ env.SELF_HOST_PYTHON }} -m pip install -U "ormsgpack"
${{ env.SELF_HOST_PYTHON }} -m pip uninstall -y opencc
${{ env.SELF_HOST_PYTHON }} -m pip uninstall -y "faster_whisper"
${{ env.SELF_HOST_PYTHON }} -m pytest --timeout=1500 \
8 changes: 4 additions & 4 deletions README.md
@@ -34,14 +34,14 @@ potential of cutting-edge AI models.
- Support speech recognition model: [#929](https://github.com/xorbitsai/inference/pull/929)
- Metrics support: [#906](https://github.com/xorbitsai/inference/pull/906)
### New Models
- Built-in support for [Qwen 2.5 Series](https://qwenlm.github.io/blog/qwen2.5/): [#2325](https://github.com/xorbitsai/inference/pull/2325)
- Built-in support for [Fish Speech V1.4](https://huggingface.co/fishaudio/fish-speech-1.4): [#2295](https://github.com/xorbitsai/inference/pull/2295)
- Built-in support for [DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5): [#2292](https://github.com/xorbitsai/inference/pull/2292)
- Built-in support for [Qwen2-Audio](https://github.com/QwenLM/Qwen2-Audio): [#2271](https://github.com/xorbitsai/inference/pull/2271)
- Built-in support for [Qwen2-vl-instruct](https://github.com/QwenLM/Qwen2-VL): [#2205](https://github.com/xorbitsai/inference/pull/2205)
- Built-in support for [MiniCPM3-4B](https://huggingface.co/openbmb/MiniCPM3-4B): [#2263](https://github.com/xorbitsai/inference/pull/2263)
- Built-in support for [CogVideoX](https://github.com/THUDM/CogVideo): [#2049](https://github.com/xorbitsai/inference/pull/2049)
- Built-in support for [flux.1-schnell & flux.1-dev](https://www.basedlabs.ai/tools/flux1): [#2007](https://github.com/xorbitsai/inference/pull/2007)
- Built-in support for [MiniCPM-V 2.6](https://github.com/OpenBMB/MiniCPM-V): [#2031](https://github.com/xorbitsai/inference/pull/2031)
- Built-in support for [Kolors](https://huggingface.co/Kwai-Kolors/Kolors): [#2028](https://github.com/xorbitsai/inference/pull/2028)
- Built-in support for [SenseVoice](https://github.com/FunAudioLLM/SenseVoice): [#2008](https://github.com/xorbitsai/inference/pull/2008)
- Built-in support for [Mistral Large 2](https://mistral.ai/news/mistral-large-2407/): [#1944](https://github.com/xorbitsai/inference/pull/1944)
### Integrations
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on the LLM, offers out-of-the-box data processing and model invocation capabilities, allows for workflow orchestration through Flow visualization.
8 changes: 4 additions & 4 deletions README_zh_CN.md
@@ -31,14 +31,14 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
- 支持语音识别模型: [#929](https://github.com/xorbitsai/inference/pull/929)
- 增加 Metrics 统计信息: [#906](https://github.com/xorbitsai/inference/pull/906)
### 新模型
- 内置 [Qwen 2.5 Series](https://qwenlm.github.io/blog/qwen2.5/): [#2325](https://github.com/xorbitsai/inference/pull/2325)
- 内置 [Fish Speech V1.4](https://huggingface.co/fishaudio/fish-speech-1.4): [#2295](https://github.com/xorbitsai/inference/pull/2295)
- 内置 [DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5): [#2292](https://github.com/xorbitsai/inference/pull/2292)
- 内置 [Qwen2-Audio](https://github.com/QwenLM/Qwen2-Audio): [#2271](https://github.com/xorbitsai/inference/pull/2271)
- 内置 [Qwen2-vl-instruct](https://github.com/QwenLM/Qwen2-VL): [#2205](https://github.com/xorbitsai/inference/pull/2205)
- 内置 [MiniCPM3-4B](https://huggingface.co/openbmb/MiniCPM3-4B): [#2263](https://github.com/xorbitsai/inference/pull/2263)
- 内置 [CogVideoX](https://github.com/THUDM/CogVideo): [#2049](https://github.com/xorbitsai/inference/pull/2049)
- 内置 [flux.1-schnell & flux.1-dev](https://www.basedlabs.ai/tools/flux1): [#2007](https://github.com/xorbitsai/inference/pull/2007)
- 内置 [MiniCPM-V 2.6](https://github.com/OpenBMB/MiniCPM-V): [#2031](https://github.com/xorbitsai/inference/pull/2031)
- 内置 [Kolors](https://huggingface.co/Kwai-Kolors/Kolors): [#2028](https://github.com/xorbitsai/inference/pull/2028)
- 内置 [SenseVoice](https://github.com/FunAudioLLM/SenseVoice): [#2008](https://github.com/xorbitsai/inference/pull/2008)
- 内置 [Mistral Large 2](https://mistral.ai/news/mistral-large-2407/): [#1944](https://github.com/xorbitsai/inference/pull/1944)
### 集成
- [FastGPT](https://doc.fastai.site/docs/development/custom-models/xinference/):一个基于 LLM 大模型的开源 AI 知识库构建平台。提供了开箱即用的数据处理、模型调用、RAG 检索、可视化 AI 工作流编排等能力,帮助您轻松实现复杂的问答场景。
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。
4 changes: 3 additions & 1 deletion doc/source/getting_started/installation.rst
@@ -44,7 +44,8 @@ Currently, supported models include:
- ``codestral-v0.1``
- ``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``
- ``code-llama``, ``code-llama-python``, ``code-llama-instruct``
- ``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-instruct``
- ``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-instruct``, ``deepseek-v2-chat``, ``deepseek-v2-chat-0628``, ``deepseek-v2.5``
- ``yi-coder``, ``yi-coder-chat``
- ``codeqwen1.5``, ``codeqwen1.5-chat``
- ``baichuan-2-chat``
- ``internlm2-chat``
@@ -56,6 +57,7 @@ Currently, supported models include:
- ``codegeex4``
- ``qwen1.5-chat``, ``qwen1.5-moe-chat``
- ``qwen2-instruct``, ``qwen2-moe-instruct``
- ``qwen2.5-instruct``
- ``gemma-it``, ``gemma-2-it``
- ``orion-chat``, ``orion-chat-rag``
- ``c4ai-command-r-v01``
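Once a server is running, any of the supported models above can also be launched programmatically. The sketch below is a hedged illustration: the ``RESTfulClient`` class and the ``launch_model`` keyword names mirror Xinference's Python client and CLI flags, but may differ across versions, so the network call is left commented out.

```python
# Hedged sketch: launch one of the supported models through Xinference's
# Python client. The client class and keyword names (RESTfulClient,
# launch_model, model_engine, ...) are assumptions based on Xinference's
# RESTful API; verify them against your installed version.

def build_launch_kwargs(model_name, engine, size_in_billions, quantization):
    """Collect launch parameters mirroring the CLI flags
    (--model-engine, --model-name, --size-in-billions, --quantization)."""
    return {
        "model_name": model_name,
        "model_engine": engine,
        "model_size_in_billions": size_in_billions,
        "quantization": quantization,
    }

kwargs = build_launch_kwargs("qwen2.5-instruct", "vllm", 7, "none")

# Requires a running Xinference server (e.g. started via `xinference-local`):
# from xinference.client import RESTfulClient
# client = RESTfulClient("http://127.0.0.1:9997")
# model_uid = client.launch_model(**kwargs)
```

The helper keeps the Python call in lockstep with the ``xinference launch`` command line, so the same four values work in either interface.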
19 changes: 0 additions & 19 deletions doc/source/models/builtin/audio/fishspeech-1.2-sft.rst

This file was deleted.

19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/fishspeech-1.4.rst
@@ -0,0 +1,19 @@
.. _models_builtin_fishspeech-1.4:

==============
FishSpeech-1.4
==============

- **Model Name:** FishSpeech-1.4
- **Model Family:** FishAudio
- **Abilities:** text-to-audio
- **Multilingual:** True

Specifications
^^^^^^^^^^^^^^

- **Model ID:** fishaudio/fish-speech-1.4

Execute the following command to launch the model::

xinference launch --model-name FishSpeech-1.4 --model-type audio
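After launching, the model can be driven from Python. This is a hedged sketch: the ``speech()`` method name on the audio handle is an assumption based on Xinference's other text-to-audio models, so the server-dependent calls are commented out and only the request payload is built.

```python
# Hedged sketch: call the launched FishSpeech-1.4 model from Python.
# The speech() method name on the audio model handle is an assumption
# based on Xinference's other audio models; check your client version.

def build_speech_request(model_uid, text):
    """Payload in the shape of an OpenAI-style /v1/audio/speech request."""
    return {"model": model_uid, "input": text}

payload = build_speech_request("FishSpeech-1.4", "Hello from Xinference")

# Requires a running server with the model launched as shown above:
# from xinference.client import RESTfulClient
# client = RESTfulClient("http://127.0.0.1:9997")
# model = client.get_model("FishSpeech-1.4")
# audio_bytes = model.speech(payload["input"])
# open("out.wav", "wb").write(audio_bytes)
```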
2 changes: 1 addition & 1 deletion doc/source/models/builtin/audio/index.rst
@@ -25,7 +25,7 @@ The following is a list of built-in audio models in Xinference:

cosyvoice-300m-sft

fishspeech-1.2-sft
fishspeech-1.4

sensevoicesmall

2 changes: 1 addition & 1 deletion doc/source/models/builtin/image/flux.1-dev.rst
@@ -6,7 +6,7 @@ FLUX.1-dev

- **Model Name:** FLUX.1-dev
- **Model Family:** stable_diffusion
- **Abilities:** text2image
- **Abilities:** text2image, image2image, inpainting
- **Available ControlNet:** None

Specifications
2 changes: 1 addition & 1 deletion doc/source/models/builtin/image/flux.1-schnell.rst
@@ -6,7 +6,7 @@ FLUX.1-schnell

- **Model Name:** FLUX.1-schnell
- **Model Family:** stable_diffusion
- **Abilities:** text2image
- **Abilities:** text2image, image2image, inpainting
- **Available ControlNet:** None

Specifications
31 changes: 31 additions & 0 deletions doc/source/models/builtin/llm/deepseek-v2-chat-0628.rst
@@ -0,0 +1,31 @@
.. _models_llm_deepseek-v2-chat-0628:

========================================
deepseek-v2-chat-0628
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2-chat-0628
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** DeepSeek-V2-Chat-0628 is an improved version of DeepSeek-V2-Chat.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available when quantization is ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2-Chat-0628
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Chat-0628>`__

Execute the following command to launch the model. Replace ``${engine}`` with your
chosen engine and ``${quantization}`` with your chosen quantization method from the
options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-v2-chat-0628 --size-in-billions 236 --model-format pytorch --quantization ${quantization}

47 changes: 47 additions & 0 deletions doc/source/models/builtin/llm/deepseek-v2-chat.rst
@@ -0,0 +1,47 @@
.. _models_llm_deepseek-v2-chat:

========================================
deepseek-v2-chat
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2-chat
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 16 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 16
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available when quantization is ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2-Lite-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite-Chat>`__

Execute the following command to launch the model. Replace ``${engine}`` with your
chosen engine and ``${quantization}`` with your chosen quantization method from the
options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-v2-chat --size-in-billions 16 --model-format pytorch --quantization ${quantization}


Model Spec 2 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available when quantization is ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Chat>`__

Execute the following command to launch the model. Replace ``${engine}`` with your
chosen engine and ``${quantization}`` with your chosen quantization method from the
options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-v2-chat --size-in-billions 236 --model-format pytorch --quantization ${quantization}

31 changes: 31 additions & 0 deletions doc/source/models/builtin/llm/deepseek-v2.5.rst
@@ -0,0 +1,31 @@
.. _models_llm_deepseek-v2.5:

========================================
deepseek-v2.5
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2.5
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: vLLM, Transformers, SGLang (vLLM and SGLang are only available when quantization is ``none``)
- **Model ID:** deepseek-ai/DeepSeek-V2.5
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2.5>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2.5>`__

Execute the following command to launch the model. Replace ``${engine}`` with your
chosen engine and ``${quantization}`` with your chosen quantization method from the
options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-v2.5 --size-in-billions 236 --model-format pytorch --quantization ${quantization}
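Once launched, a chat model such as deepseek-v2.5 can be queried through Xinference's OpenAI-compatible endpoint. This is a hedged sketch: the base URL, port, and the use of the ``openai`` client are assumptions about the deployment, so the network call is commented out and only the request body is assembled.

```python
# Hedged sketch: chat with the launched deepseek-v2.5 model through
# Xinference's OpenAI-compatible endpoint. The base URL and the use of
# the openai client are assumptions; adapt them to your deployment.

def build_chat_request(model_uid, user_message):
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": model_uid,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }

req = build_chat_request("deepseek-v2.5", "Write a Python hello world.")

# Requires a running server with the model launched as shown above:
# from openai import OpenAI
# client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-needed")
# resp = client.chat.completions.create(**req)
# print(resp.choices[0].message.content)
```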

47 changes: 47 additions & 0 deletions doc/source/models/builtin/llm/deepseek-v2.rst
@@ -0,0 +1,47 @@
.. _models_llm_deepseek-v2:

========================================
deepseek-v2
========================================

- **Context Length:** 128000
- **Model Name:** deepseek-v2
- **Languages:** en, zh
- **Abilities:** generate
- **Description:** DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 16 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 16
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: Transformers
- **Model ID:** deepseek-ai/DeepSeek-V2-Lite
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite>`__

Execute the following command to launch the model. Replace ``${engine}`` with your
chosen engine and ``${quantization}`` with your chosen quantization method from the
options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-v2 --size-in-billions 16 --model-format pytorch --quantization ${quantization}


Model Spec 2 (pytorch, 236 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 236
- **Quantizations:** 4-bit, 8-bit, none
- **Engines**: Transformers
- **Model ID:** deepseek-ai/DeepSeek-V2
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-V2>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-V2>`__

Execute the following command to launch the model. Replace ``${engine}`` with your
chosen engine and ``${quantization}`` with your chosen quantization method from the
options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-v2 --size-in-billions 236 --model-format pytorch --quantization ${quantization}
