English Version
New Features
- Support for using swift app to launch a visual inference creative space, see here
- Support for RM and PPO training of large models, see here
- Support for BNB/GPTQ quantization of SequenceClassification models (including BERT), see here
- Support for inference, deployment, and BNB/GPTQ quantization of reward models
New Models
- ZhipuAI/cogagent-9b-20241220
- Reward Models: Shanghai_AI_Laboratory/internlm2-1_8b-reward series, Qwen/Qwen2-Math-RM-72B series, AI-ModelScope/Skywork-Reward-Llama-3.1-8B series, AI-ModelScope/GRM_Llama3.1_8B_rewardmodel-ft series
- AIDC-AI/Ovis1.6-Gemma2-27B, AIDC-AI/Ovis1.6-Llama3.2-3B
- PowerInfer/SmallThinker-3B-Preview
New Datasets
- PowerInfer/LONGCOT-Refine-500K, PowerInfer/QWQ-LONGCOT-500K
What's Changed
- Fix app-ui dropdown by @tastelikefeet in #2787
- fix multi-lora by @Jintao-Huang in #2790
- fix stream infer by @Jintao-Huang in #2793
- fix some web-ui bugs by @tastelikefeet in #2794
- support swift app by @Jintao-Huang in #2792
- fix pt batch infer by @Jintao-Huang in #2800
- fix world_size by @Jintao-Huang in #2801
- update base_model deploy example by @Jintao-Huang in #2803
- fix glm4v by @Jintao-Huang in #2806
- fix swift deploy log error (repeat log) by @Jintao-Huang in #2808
- support ZhipuAI/cogagent-9b-20241220 by @Jintao-Huang in #2810
- fix citest by @Jintao-Huang in #2812
- fix enable_cache by @Jintao-Huang in #2813
- update docs (specific model arguments) by @Jintao-Huang in #2822
- add 'right' option for 'truncation_strategy' by @zsxm1998 in #2754
- Fix glm4v suffix by @Jintao-Huang in #2829
- Update padding side by @Jintao-Huang in #2832
- Update base_to_chat shell by @Jintao-Huang in #2833
- Fix bugs by @Jintao-Huang in #2838
- Fix some bugs by @tastelikefeet in #2848
- support reward_model by @Jintao-Huang in #2849
- Move optimizer to create_optimizer by @tastelikefeet in #2851
- fix post_init by @Jintao-Huang in #2855
- fix cache_name_file by @Jintao-Huang in #2856
- fix telechat template by @Jintao-Huang in #2857
- Update more models by @Jintao-Huang in #2852
- Support quant bert reward by @Jintao-Huang in #2859
- fix jsonl writer by @Jintao-Huang in #2860
- support reward model train by @Jintao-Huang in #2862
- fix vllm video by @Jintao-Huang in #2864
- support mps by @Jintao-Huang in #2866
- Update agent demo by @Jintao-Huang in #2867
- fix bugs by @Jintao-Huang in #2869
- Support ppo by @Jintao-Huang in #2783
- update citest by @Jintao-Huang in #2873
- fix dataset cache bugs by @Jintao-Huang in #2876
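
As a rough illustration of the 'right' option for 'truncation_strategy' added in #2754, the sketch below shows what the three strategies could mean for an over-length token sequence. It is a generic sketch, not the library's implementation: the function name `truncate` and its signature are hypothetical, and it assumes 'left' drops tokens from the start, 'right' drops tokens from the end, and 'delete' discards the sample entirely.

```python
from typing import List, Optional


def truncate(token_ids: List[int], max_length: int,
             strategy: str = "delete") -> Optional[List[int]]:
    """Hypothetical helper illustrating truncation strategies.

    Returns the (possibly shortened) token list, or None when the
    sample is discarded under the 'delete' strategy.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if len(token_ids) <= max_length:
        # Sequence already fits; nothing to do.
        return token_ids
    if strategy == "left":
        # Assumed meaning: cut from the left, keep the last max_length tokens.
        return token_ids[-max_length:]
    if strategy == "right":
        # Assumed meaning: cut from the right, keep the first max_length tokens.
        return token_ids[:max_length]
    if strategy == "delete":
        # Assumed meaning: drop the whole over-length sample.
        return None
    raise ValueError(f"unknown truncation strategy: {strategy!r}")


if __name__ == "__main__":
    ids = [1, 2, 3, 4, 5]
    for name in ("left", "right", "delete"):
        print(name, truncate(ids, 3, name))
```

Under these assumptions, 'right' suits prompts whose important context sits at the beginning, while 'left' keeps the most recent tokens.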
New Contributors
Full Changelog: v3.0.1...v3.0.2