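Supported Models and Datasets
Models
The tables below list the models integrated into ms-swift. The columns are:
Model ID: the ModelScope model ID
HF Model ID: the Hugging Face model ID
Model Type: the model type
Default Template: the default chat template
Requires: extra dependencies needed to use the model
Tags: the model's tags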
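The Requires column uses pip-style version constraints such as `transformers>=4.45, qwen_vl_utils>=0.0.6, decord` or `transformers>=4.54,<4.56`. As a minimal sketch of how such a cell can be read and checked against an installed version, the following uses only the Python standard library. The helper names (`parse_version`, `parse_requires`, `satisfies`) are hypothetical and not part of ms-swift, and the version comparison is a simplification that looks only at the leading numeric components (so `4.55.0.dev0` is treated as `4.55.0`). A real project would more likely rely on a dedicated version-parsing library.

```python
import re

# Comparison operators that appear in the Requires column.
_OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}
_NAME = re.compile(r"[A-Za-z0-9_.\-]+")
_SPEC = re.compile(r"(>=|<=|==|>|<)\s*([0-9][0-9A-Za-z.]*)")


def parse_version(text):
    """Keep the leading numeric components: '4.55.0.dev0' -> (4, 55, 0)."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def parse_requires(cell):
    """Turn one Requires cell into {package: [(operator, version), ...]}."""
    requirements = {}
    current = None
    for piece in cell.split(","):
        piece = piece.strip()
        if piece in ("", "-"):  # "-" means no extra dependency
            continue
        name = _NAME.match(piece)
        if name:  # a bare "<4.56" continues the previous package
            current = name.group(0)
            requirements.setdefault(current, [])
        if current is not None:
            requirements[current].extend(_SPEC.findall(piece))
    return requirements


def satisfies(installed, specs):
    """Check an installed version string against every (operator, version)."""
    have = parse_version(installed)
    for op, wanted in specs:
        want = parse_version(wanted)
        width = max(len(have), len(want))
        a = have + (0,) * (width - len(have))  # pad so 4.37 == 4.37.0
        b = want + (0,) * (width - len(want))
        if not _OPS[op](a, b):
            return False
    return True


if __name__ == "__main__":
    cell = "transformers>=4.54,<4.56"
    for package, specs in parse_requires(cell).items():
        print(package, specs, satisfies("4.55.2", specs))
```

Splitting on commas and attaching operator-only pieces to the preceding package is what lets a cell like `transformers>=4.54,<4.56` be read as one package with two bounds, while `decord` (no constraint) maps to an empty list.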
Large Language Models
| Model ID | Model Type | Default Template | Requires | Support Megatron | Tags | HF Model ID |
|---|---|---|---|---|---|---|
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | - | |
| | qwen | qwen | - | ✘ | financial | |
| | qwen | qwen | - | ✘ | financial | - |
| | qwen | qwen | - | ✘ | financial | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | coding | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | coding | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | coding | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | math | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | math | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | math | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | math | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | math | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | math | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2 | qwen | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | |
| | qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | |
| | qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | |
| | qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | |
| | qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | |
| | qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | |
| | qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | |
| | qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | |
| | qwen2_moe | qwen | transformers>=4.40 | ✔ | - | |
| | qwen2_moe | qwen | transformers>=4.40 | ✔ | - | |
| | qwen2_moe | qwen | transformers>=4.40 | ✘ | - | |
| | qwen2_moe | qwen | transformers>=4.40 | ✔ | - | |
| | qwen2_moe | qwen | transformers>=4.40 | ✔ | - | |
| | qwen2_moe | qwen | transformers>=4.40 | ✘ | - | |
| | qwq_preview | qwq_preview | transformers>=4.37 | ✔ | - | |
| | qwq | qwq | transformers>=4.37 | ✔ | - | |
| | qwq | qwq | transformers>=4.37 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3 | qwen3 | transformers>=4.51 | ✘ | - | - |
| | qwen3_guard | qwen3_guard | transformers>=4.51 | ✘ | - | |
| | qwen3_guard | qwen3_guard | transformers>=4.51 | ✘ | - | |
| | qwen3_guard | qwen3_guard | transformers>=4.51 | ✘ | - | |
| | qwen3_thinking | qwen3_thinking | transformers>=4.51 | ✔ | - | |
| | qwen3_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✔ | - | |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✔ | - | |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | - |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✔ | - | |
| | qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | |
| | qwen3_coder | qwen3_coder | transformers>=4.51 | ✔ | coding | |
| | qwen3_coder | qwen3_coder | transformers>=4.51 | ✘ | coding | |
| | qwen3_coder | qwen3_coder | transformers>=4.51 | ✔ | coding | |
| | qwen3_coder | qwen3_coder | transformers>=4.51 | ✘ | coding | |
| | qwen3_coder | qwen3_coder | transformers>=4.51 | ✘ | coding | - |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | |
| | qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | |
| | qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✔ | - | |
| | qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | |
| | qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✔ | - | |
| | qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | |
| | qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | - |
| | qwen3_next | qwen3_nothinking | transformers>=4.57 | ✔ | - | - |
| | qwen3_next | qwen3_nothinking | transformers>=4.57 | ✘ | - | - |
| | qwen3_next_thinking | qwen3_thinking | transformers>=4.57 | ✔ | - | - |
| | qwen3_next_thinking | qwen3_thinking | transformers>=4.57 | ✘ | - | - |
| | qwen3_emb | qwen3_emb | - | ✘ | - | |
| | qwen3_emb | qwen3_emb | - | ✘ | - | |
| | qwen3_emb | qwen3_emb | - | ✘ | - | |
| | qwen3_reranker | qwen3_reranker | - | ✘ | - | |
| | qwen3_reranker | qwen3_reranker | - | ✘ | - | |
| | qwen3_reranker | qwen3_reranker | - | ✘ | - | |
| | qwen2_gte | dummy | - | ✘ | - | |
| | qwen2_gte | dummy | - | ✘ | - | |
| | bge_reranker | bge_reranker | - | ✘ | - | |
| | bge_reranker | bge_reranker | - | ✘ | - | |
| | bge_reranker | bge_reranker | - | ✘ | - | |
| | codefuse_qwen | codefuse | - | ✘ | coding | |
| | modelscope_agent | modelscope_agent | - | ✘ | - | - |
| | modelscope_agent | modelscope_agent | - | ✘ | - | - |
| | marco_o1 | marco_o1 | transformers>=4.37 | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | - | ✔ | - | |
| | llama | llama | transformers>=4.38, aqlm, torch>=2.2.0 | ✘ | - | |
| | llama3 | llama3 | - | ✔ | - | |
| | llama3 | llama3 | - | ✔ | - | |
| | llama3 | llama3 | - | ✔ | - | |
| | llama3 | llama3 | - | ✔ | - | |
| | llama3 | llama3 | - | ✘ | - | |
| | llama3 | llama3 | - | ✘ | - | |
| | llama3 | llama3 | - | ✘ | - | |
| | llama3 | llama3 | - | ✘ | - | |
| | llama3 | llama3 | - | ✘ | - | |
| | llama3 | llama3 | - | ✘ | - | |
| | llama3 | llama3 | - | ✔ | - | |
| | llama3 | llama3 | - | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | |
| | llama3_2 | llama3_2 | transformers>=4.43 | ✘ | - | |
| | reflection | reflection | transformers>=4.43 | ✔ | - | |
| | megrez | megrez | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✔ | - | |
| | yi | chatml | - | ✘ | - | |
| | yi | chatml | - | ✘ | - | |
| | yi | chatml | - | ✘ | - | |
| | yi | chatml | - | ✘ | - | |
| | yi | chatml | - | ✘ | - | |
| | yi | chatml | - | ✘ | - | |
| | yi_coder | yi_coder | - | ✔ | coding | |
| | yi_coder | yi_coder | - | ✔ | coding | |
| | yi_coder | yi_coder | - | ✔ | coding | |
| | yi_coder | yi_coder | - | ✔ | coding | |
| | sus | sus | - | ✔ | - | |
| | gpt_oss | gpt_oss | transformers>=4.55 | ✘ | - | |
| | gpt_oss | gpt_oss | transformers>=4.55 | ✘ | - | |
| | seed_oss | seed_oss | transformers>=4.56 | ✘ | - | |
| | seed_oss | seed_oss | transformers>=4.56 | ✘ | - | |
| | seed_oss | seed_oss | transformers>=4.56 | ✘ | - | |
| | codefuse_codellama | codefuse_codellama | - | ✔ | coding | |
| | mengzi3 | mengzi | - | ✔ | - | |
| | ziya | ziya | - | ✔ | - | |
| | ziya | ziya | - | ✔ | - | |
| | numina | numina | - | ✔ | math | |
| | atom | atom | - | ✘ | - | |
| | atom | atom | - | ✘ | - | |
| | chatglm2 | chatglm2 | transformers<4.42 | ✘ | - | |
| | chatglm2 | chatglm2 | transformers<4.42 | ✘ | - | |
| | chatglm2 | chatglm2 | transformers<4.34 | ✘ | coding | |
| | chatglm3 | glm4 | transformers<4.42 | ✘ | - | |
| | chatglm3 | glm4 | transformers<4.42 | ✘ | - | |
| | chatglm3 | glm4 | transformers<4.42 | ✘ | - | |
| | chatglm3 | glm4 | transformers<4.42 | ✘ | - | |
| | glm4 | glm4 | transformers>=4.42 | ✘ | - | |
| | glm4 | glm4 | transformers>=4.42 | ✘ | - | |
| | glm4 | glm4 | transformers>=4.42 | ✘ | - | |
| | glm4 | glm4 | transformers>=4.42 | ✘ | - | |
| | glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | |
| | glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | |
| | glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | |
| | glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | |
| | glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✘ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✘ | - | |
| | glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | |
| | glm4_z1_rumination | glm4_z1_rumination | transformers>4.51 | ✘ | - | |
| | glm_edge | glm4 | transformers>=4.46 | ✘ | - | |
| | glm_edge | glm4 | transformers>=4.46 | ✘ | - | |
| | codefuse_codegeex2 | codefuse | transformers<4.34 | ✘ | coding | |
| | codegeex4 | codegeex4 | transformers<4.42 | ✘ | coding | |
| | longwriter_llama3_1 | longwriter_llama | transformers>=4.43 | ✔ | - | |
| | internlm | internlm | - | ✘ | - | |
| | internlm | internlm | - | ✘ | - | |
| | internlm | internlm | - | ✘ | - | - |
| | internlm | internlm | - | ✘ | - | |
| | internlm | internlm | - | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | math | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | math | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | math | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | math | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm2 | internlm2 | transformers>=4.38 | ✘ | - | |
| | internlm3 | internlm2 | transformers>=4.48 | ✔ | - | |
| | deepseek | deepseek | - | ✔ | - | |
| | deepseek | deepseek | - | ✔ | - | |
| | deepseek | deepseek | - | ✔ | - | |
| | deepseek | deepseek | - | ✔ | - | |
| | deepseek | deepseek | - | ✔ | math | |
| | deepseek | deepseek | - | ✔ | math | |
| | deepseek | deepseek | - | ✔ | math | |
| | deepseek | deepseek | - | ✔ | coding | |
| | deepseek | deepseek | - | ✔ | coding | |
| | deepseek | deepseek | - | ✔ | coding | |
| | deepseek | deepseek | - | ✔ | coding | |
| | deepseek | deepseek | - | ✔ | coding | |
| | deepseek | deepseek | - | ✔ | coding | |
| | deepseek_moe | deepseek | - | ✔ | - | |
| | deepseek_moe | deepseek | - | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✘ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✘ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✘ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✘ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | - | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | - | ✔ | - | |
| | deepseek_r1_distill | deepseek_r1 | - | ✔ | - | |
| | deepseek_v3_1 | deepseek_v3_1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v3_1 | deepseek_v3_1 | transformers>=4.39.3 | ✔ | - | |
| | deepseek_v3_1 | deepseek_v3_1 | transformers>=4.39.3 | ✔ | - | |
| | openbuddy_llama | openbuddy | - | ✔ | - | |
| | openbuddy_llama | openbuddy | - | ✔ | - | |
| | openbuddy_llama | openbuddy | - | ✔ | - | |
| | openbuddy_llama | openbuddy | - | ✔ | - | |
| | openbuddy_llama3 | openbuddy2 | - | ✔ | - | |
| | openbuddy_llama3 | openbuddy2 | - | ✔ | - | |
| | openbuddy_llama3 | openbuddy2 | - | ✔ | - | |
| | openbuddy_llama3 | openbuddy2 | transformers>=4.43 | ✔ | - | |
| | openbuddy_llama3 | openbuddy2 | transformers>=4.43 | ✔ | - | |
| | openbuddy_llama3 | openbuddy2 | transformers>=4.45 | ✔ | - | |
| | openbuddy_mistral | openbuddy | transformers>=4.34 | ✘ | - | |
| | openbuddy_mistral | openbuddy | transformers>=4.34 | ✘ | - | |
| | openbuddy_mixtral | openbuddy | transformers>=4.36 | ✘ | - | |
| | baichuan | baichuan | transformers<4.34 | ✘ | - | |
| | baichuan | baichuan | transformers<4.34 | ✘ | - | |
| | baichuan | baichuan | transformers<4.34 | ✘ | - | |
| | baichuan2 | baichuan | - | ✘ | - | |
| | baichuan2 | baichuan | - | ✘ | - | |
| | baichuan2 | baichuan | - | ✘ | - | |
| | baichuan2 | baichuan | - | ✘ | - | |
| | baichuan2 | baichuan | bitsandbytes<0.41.2, accelerate<0.26 | ✘ | - | |
| | baichuan2 | baichuan | bitsandbytes<0.41.2, accelerate<0.26 | ✘ | - | |
| | baichuan_m1 | baichuan_m1 | transformers>=4.48 | ✘ | - | |
| | minicpm | minicpm | transformers>=4.36.0 | ✘ | - | |
| | minicpm | minicpm | transformers>=4.36.0 | ✘ | - | |
| | minicpm | minicpm | transformers>=4.36.0 | ✘ | - | |
| | minicpm_chatml | chatml | transformers>=4.36 | ✘ | - | |
| | minicpm_chatml | chatml | transformers>=4.36 | ✘ | - | |
| | minicpm_chatml | chatml | transformers>=4.36 | ✘ | - | |
| | minicpm3 | chatml | transformers>=4.36 | ✘ | - | |
| | minicpm_moe | minicpm | transformers>=4.36 | ✘ | - | |
| | telechat | telechat | - | ✘ | - | |
| | telechat | telechat | - | ✘ | - | |
| | telechat | telechat | - | ✘ | - | |
| | telechat | telechat | - | ✘ | - | |
| | telechat | telechat | - | ✘ | - | - |
| | telechat | telechat | - | ✘ | - | |
| | telechat | telechat | - | ✘ | - | |
| | telechat2 | telechat2 | - | ✘ | - | |
| | telechat2 | telechat2 | - | ✘ | - | |
| | telechat2 | telechat2 | - | ✘ | - | |
| | telechat2 | telechat2 | - | ✘ | - | |
| | mistral | llama | transformers>=4.34 | ✘ | - | |
| | mistral | llama | transformers>=4.34 | ✘ | - | |
| | mistral | llama | transformers>=4.34 | ✘ | - | |
| | mistral | llama | transformers>=4.34 | ✘ | - | |
| | mistral | llama | transformers>=4.34 | ✘ | - | |
| | mistral | llama | transformers>=4.34 | ✘ | - | |
| | devstral | devstral | transformers>=4.43, mistral-common>=1.5.5 | ✘ | - | |
| | zephyr | zephyr | transformers>=4.34 | ✘ | - | |
| | mixtral | llama | transformers>=4.36 | ✘ | - | |
| | mixtral | llama | transformers>=4.36 | ✘ | - | |
| | mixtral | llama | transformers>=4.36 | ✘ | - | |
| | mixtral | llama | transformers>=4.38, aqlm, torch>=2.2.0 | ✘ | - | |
| | mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | |
| | mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | |
| | mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | |
| | mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | |
| | mistral_nemo | mistral_nemo | transformers>=4.46 | ✘ | - | |
| | mistral_2501 | mistral_2501 | - | ✘ | - | |
| | mistral_2501 | mistral_2501 | - | ✘ | - | |
| | wizardlm2 | wizardlm2 | transformers>=4.34 | ✘ | - | |
| | wizardlm2_moe | wizardlm2_moe | transformers>=4.36 | ✘ | - | |
| | phi2 | default | - | ✘ | - | |
| | phi3_small | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3_small | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3 | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3 | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3 | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3 | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3 | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3 | phi3 | transformers>=4.36 | ✘ | - | |
| | phi3_moe | phi3 | transformers>=4.36 | ✘ | - | |
| | phi4 | phi4 | transformers>=4.36 | ✘ | - | |
| | minimax | minimax | - | ✘ | - | |
| | minimax_m1 | minimax_m1 | - | ✘ | - | |
| | minimax_m1 | minimax_m1 | - | ✘ | - | |
| | gemma | gemma | transformers>=4.38 | ✘ | - | |
| | gemma | gemma | transformers>=4.38 | ✘ | - | |
| | gemma | gemma | transformers>=4.38 | ✘ | - | |
| | gemma | gemma | transformers>=4.38 | ✘ | - | |
| | gemma2 | gemma | transformers>=4.42 | ✘ | - | |
| | gemma2 | gemma | transformers>=4.42 | ✘ | - | |
| | gemma2 | gemma | transformers>=4.42 | ✘ | - | |
| | gemma2 | gemma | transformers>=4.42 | ✘ | - | |
| | gemma2 | gemma | transformers>=4.42 | ✘ | - | |
| | gemma2 | gemma | transformers>=4.42 | ✘ | - | |
| | gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | |
| | gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | |
| | gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | |
| | gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | |
| | skywork | skywork | - | ✘ | - | |
| | skywork | skywork | - | ✘ | - | - |
| | skywork_o1 | skywork_o1 | transformers>=4.43 | ✔ | - | |
| | ling | ling | - | ✘ | - | |
| | ling | ling | - | ✘ | - | |
| | ling | ling | - | ✘ | - | |
| | ling | ling | - | ✘ | - | |
| | ling2 | ling2 | - | ✘ | - | |
| | ling2 | ling2 | - | ✘ | - | |
| | ring2 | ring2 | - | ✘ | - | |
| | yuan2 | yuan | - | ✘ | - | |
| | yuan2 | yuan | - | ✘ | - | |
| | yuan2 | yuan | - | ✘ | - | |
| | yuan2 | yuan | - | ✘ | - | |
| | yuan2 | yuan | - | ✘ | - | |
| | orion | orion | - | ✘ | - | |
| | orion | orion | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse | xverse | - | ✘ | - | |
| | xverse_moe | xverse | - | ✘ | - | |
| | seggpt | default | - | ✘ | - | |
| | bluelm | bluelm | - | ✘ | - | |
| | bluelm | bluelm | - | ✘ | - | |
| | bluelm | bluelm | - | ✘ | - | |
| | bluelm | bluelm | - | ✘ | - | |
| | c4ai | c4ai | transformers>=4.39 | ✘ | - | |
| | c4ai | c4ai | transformers>=4.39 | ✘ | - | |
| | dbrx | dbrx | transformers>=4.36 | ✘ | - | |
| | dbrx | dbrx | transformers>=4.36 | ✘ | - | |
| | grok | default | - | ✘ | - | |
| | mamba | default | transformers>=4.39.0 | ✘ | - | |
| | mamba | default | transformers>=4.39.0 | ✘ | - | |
| | mamba | default | transformers>=4.39.0 | ✘ | - | |
| | mamba | default | transformers>=4.39.0 | ✘ | - | |
| | mamba | default | transformers>=4.39.0 | ✘ | - | |
| | mamba | default | transformers>=4.39.0 | ✘ | - | |
| | polylm | default | - | ✘ | - | |
| | aya | aya | transformers>=4.44.0 | ✘ | - | |
| | aya | aya | transformers>=4.44.0 | ✘ | - | |
| | moonlight | moonlight | transformers<4.49 | ✔ | - | |
| | moonlight | moonlight | transformers<4.49 | ✔ | - | |
| | moonlight | moonlight | transformers<4.49 | ✔ | - | |
| | moonlight | moonlight | transformers<4.49 | ✔ | - | |
| | moonlight | moonlight | transformers<4.49 | ✔ | - | |
| | mimo | qwen | transformers>=4.37 | ✔ | - | |
| | mimo | qwen | transformers>=4.37 | ✔ | - | |
| | mimo | qwen | transformers>=4.37 | ✔ | - | |
| | mimo | qwen | transformers>=4.37 | ✔ | - | |
| | mimo_rl | mimo_rl | transformers>=4.37 | ✔ | - | |
| | dots1 | dots1 | transformers>=4.53 | ✔ | - | |
| | dots1 | dots1 | transformers>=4.53 | ✔ | - | |
| | hunyuan_moe | hunyuan_moe | - | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | |
| | ernie | ernie | - | ✔ | - | |
| | ernie | ernie | - | ✔ | - | |
| | ernie | ernie | - | ✔ | - | |
| | ernie | ernie | - | ✔ | - | |
| | ernie | ernie | - | ✔ | - | |
| | ernie | ernie | - | ✔ | - | |
| | gemma_emb | dummy | - | ✘ | - | |
| | ernie_thinking | ernie_thinking | - | ✔ | - | |
| | longchat | longchat | transformers>=4.54,<4.56 | ✘ | - | |
| | longchat | longchat | transformers>=4.54,<4.56 | ✘ | - | |
| | modern_bert | dummy | transformers>=4.48 | ✘ | bert | |
| | modern_bert | dummy | transformers>=4.48 | ✘ | bert | |
| | modern_bert_gte | dummy | transformers>=4.48 | ✘ | bert, embedding | |
| | modern_bert_gte_reranker | bert | transformers>=4.48 | ✘ | bert, reranker | |
| | bert | dummy | - | ✘ | bert | - |
| | internlm2_reward | internlm2_reward | transformers>=4.38 | ✘ | - | |
| | internlm2_reward | internlm2_reward | transformers>=4.38 | ✘ | - | |
| | internlm2_reward | internlm2_reward | transformers>=4.38 | ✘ | - | |
| | qwen2_reward | qwen | transformers>=4.37 | ✘ | - | |
| | qwen2_5_prm | qwen2_5_math_prm | transformers>=4.37 | ✘ | - | |
| | qwen2_5_prm | qwen2_5_math_prm | transformers>=4.37 | ✘ | - | |
| | qwen2_5_prm | qwen2_5_math_prm | transformers>=4.37 | ✘ | - | |
| | qwen2_5_math_reward | qwen2_5_math | transformers>=4.37 | ✘ | - | |
| | llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | |
| | llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | |
| | gemma_reward | gemma | transformers>=4.42 | ✘ | - | |
| | gemma_reward | gemma | transformers>=4.42 | ✘ | - | |
Multimodal Large Models
Model ID |
Model Type |
Default Template |
Requires |
Support Megatron |
Tags |
HF Model ID |
|---|---|---|---|---|---|---|
qwen_vl |
qwen_vl |
- |
✘ |
vision |
||
qwen_vl |
qwen_vl |
- |
✘ |
vision |
||
qwen_vl |
qwen_vl |
- |
✘ |
vision |
||
qwen_audio |
qwen_audio |
- |
✘ |
audio |
||
qwen_audio |
qwen_audio |
- |
✘ |
audio |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_vl |
qwen2_vl |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✔ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_5_vl |
qwen2_5_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_5_omni |
qwen2_5_omni |
transformers>=4.50, soundfile, qwen_omni_utils, decord |
✔ |
vision, video, audio |
||
qwen2_5_omni |
qwen2_5_omni |
transformers>=4.50, soundfile, qwen_omni_utils, decord |
✔ |
vision, video, audio |
||
qwen3_omni |
qwen3_omni |
transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils |
✔ |
vision, video, audio |
||
qwen3_omni |
qwen3_omni |
transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils |
✔ |
vision, video, audio |
||
qwen3_omni |
qwen3_omni |
transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils |
✔ |
vision, video, audio |
||
qwen2_audio |
qwen2_audio |
transformers>=4.45,<4.49, librosa |
✘ |
audio |
||
qwen2_audio |
qwen2_audio |
transformers>=4.45,<4.49, librosa |
✘ |
audio |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✔ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qwen3_moe_vl |
qwen3_vl |
transformers>=4.57, qwen_vl_utils>=0.0.14, decord |
✘ |
vision, video |
||
qvq |
qvq |
transformers>=4.45, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
qwen2_gme |
qwen2_gme |
- |
✘ |
vision |
||
qwen2_gme |
qwen2_gme |
- |
✘ |
vision |
||
ovis1_6 |
ovis1_6 |
transformers>=4.42 |
✘ |
vision |
||
ovis1_6 |
ovis1_6 |
transformers>=4.42 |
✘ |
vision |
||
ovis1_6 |
ovis1_6 |
transformers>=4.42 |
✘ |
vision |
||
ovis1_6_llama3 |
ovis1_6_llama3 |
- |
✘ |
vision |
||
ovis2 |
ovis2 |
transformers>=4.46.2, moviepy<2 |
✘ |
vision |
||
ovis2 |
ovis2 |
transformers>=4.46.2, moviepy<2 |
✘ |
vision |
||
ovis2 |
ovis2 |
transformers>=4.46.2, moviepy<2 |
✘ |
vision |
||
ovis2 |
ovis2 |
transformers>=4.46.2, moviepy<2 |
✘ |
vision |
||
ovis2 |
ovis2 |
transformers>=4.46.2, moviepy<2 |
✘ |
vision |
||
ovis2 |
ovis2 |
transformers>=4.46.2, moviepy<2 |
✘ |
vision |
||
ovis2_5 |
ovis2_5 |
transformers>=4.46.2, moviepy<2 |
✔ |
vision |
||
ovis2_5 |
ovis2_5 |
transformers>=4.46.2, moviepy<2 |
✔ |
vision |
||
mimo_vl |
mimo_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
mimo_vl |
mimo_vl |
transformers>=4.49, qwen_vl_utils>=0.0.6, decord |
✘ |
vision, video |
||
midashenglm |
midashenglm |
transformers>=4.52, soundfile |
✘ |
audio |
||
glm4v |
glm4v |
transformers>=4.42,<4.45 |
✘ |
- |
||
glm4v |
glm4v |
transformers>=4.42 |
✘ |
- |
||
glm4_1v |
glm4_1v |
transformers>=4.53 |
✘ |
- |
||
glm4_1v |
glm4_1v |
transformers>=4.53 |
✘ |
- |
||
glm4_5v |
glm4_5v |
transformers>=4.56 |
✔ |
- |
||
glm4_5v |
glm4_5v |
transformers>=4.56 |
✘ |
- |
||
glm_edge_v |
glm_edge_v |
transformers>=4.46 |
✘ |
vision |
||
glm_edge_v |
glm_edge_v |
transformers>=4.46 |
✘ |
vision |
||
cogvlm |
cogvlm |
transformers<4.42 |
✘ |
- |
||
cogagent_vqa |
cogagent_vqa |
transformers<4.42 |
✘ |
- |
||
cogagent_chat |
cogagent_chat |
transformers<4.42, timm |
✘ |
- |
||
cogvlm2 |
cogvlm2 |
transformers<4.42 |
✘ |
- |
||
cogvlm2 |
cogvlm2 |
transformers<4.42 |
✘ |
- |
||
cogvlm2_video |
cogvlm2_video |
decord, pytorchvideo, transformers>=4.42 |
✘ |
video |
||
internvl |
internvl |
transformers>=4.35, timm |
✘ |
vision |
||
internvl |
internvl |
transformers>=4.35, timm |
✘ |
vision |
||
internvl |
internvl |
transformers>=4.35, timm |
✘ |
vision |
||
internvl_phi3 |
internvl_phi3 |
transformers>=4.35,<4.42, timm |
✘ |
vision |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
||
OpenGVLab/InternVL2-Pretrain-Models:InternVL2-Llama3-76B-Pretrain |
internvl2 |
internvl2 |
transformers>=4.36, timm |
✘ |
vision, video |
OpenGVLab/InternVL2-Pretrain-Models:InternVL2-Llama3-76B-Pretrain |
internvl2_phi3 |
internvl2_phi3 |
transformers>=4.36,<4.42, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl2_5 |
internvl2_5 |
transformers>=4.36, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl3 |
internvl2_5 |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl_hf |
internvl_hf |
transformers>=4.52.1, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5 |
internvl3_5 |
transformers>=4.37.2, timm |
✔ |
vision, video |
||
internvl3_5_gpt |
internvl3_5_gpt |
transformers>=4.37.2, timm |
✘ |
vision, video |
||
internvl_gpt_hf |
internvl_hf |
transformers>=4.55.0, timm |
✘ |
vision, video |
||
interns1 |
interns1 |
transformers>=4.55.2,<4.56 |
✘ |
vision, video |
||
interns1 |
interns1 |
transformers>=4.55.2,<4.56 |
✘ |
vision, video |
||
interns1 |
interns1 |
transformers>=4.55.2,<4.56 |
✘ |
vision, video |
||
interns1 |
interns1 |
transformers>=4.55.2,<4.56 |
✘ |
vision, video |
||
xcomposer2 |
ixcomposer2 |
- |
✘ |
vision |
||
xcomposer2_4khd |
ixcomposer2 |
- |
✘ |
vision |
||
xcomposer2_5 |
xcomposer2_5 |
decord |
✘ |
vision |
||
xcomposer2_5 |
xcomposer2_5 |
decord |
✘ |
vision |
||
xcomposer2_5_ol_audio |
qwen2_audio |
transformers>=4.45 |
✘ |
audio |
||
llama3_2_vision |
llama3_2_vision |
transformers>=4.45 |
✘ |
vision |
||
llama3_2_vision |
llama3_2_vision |
transformers>=4.45 |
✘ |
vision |
||
llama3_2_vision |
llama3_2_vision |
transformers>=4.45 |
✘ |
vision |
||
llama3_2_vision |
llama3_2_vision |
transformers>=4.45 |
✘ |
vision |
||
llama4 |
llama4 |
transformers>=4.51 |
✘ |
vision |
||
llama4 |
llama4 |
transformers>=4.51 |
✘ |
vision |
||
llama4 |
llama4 |
transformers>=4.51 |
✘ |
vision |
||
llama4 |
llama4 |
transformers>=4.51 |
✘ |
vision |
||
llama4 |
llama4 |
transformers>=4.51 |
✘ |
vision |
||
llama3_1_omni |
llama3_1_omni |
openai-whisper |
✘ |
audio |
||
llava1_5_hf |
llava1_5_hf |
transformers>=4.36 |
✘ |
vision |
||
llava1_5_hf |
llava1_5_hf |
transformers>=4.36 |
✘ |
vision |
||
llava1_6_mistral_hf |
llava1_6_mistral_hf |
transformers>=4.39 |
✘ |
vision |
||
llava1_6_vicuna_hf |
llava1_6_vicuna_hf |
transformers>=4.39 |
✘ |
vision |
||
llava1_6_vicuna_hf |
llava1_6_vicuna_hf |
transformers>=4.39 |
✘ |
vision |
||
llava1_6_yi_hf |
llava1_6_yi_hf |
transformers>=4.39 |
✘ |
vision |
||
llama3_llava_next_hf |
llama3_llava_next_hf |
transformers>=4.39 |
✘ |
vision |
||
llava_next_qwen_hf |
llava_next_qwen_hf |
transformers>=4.39 |
✘ |
vision |
||
llava_next_qwen_hf |
llava_next_qwen_hf |
transformers>=4.39 |
✘ |
vision |
||
llava_next_video_hf |
llava_next_video_hf |
transformers>=4.42, av |
✘ |
video |
||
llava_next_video_hf |
llava_next_video_hf |
transformers>=4.42, av |
✘ |
video |
||
llava_next_video_hf |
llava_next_video_hf |
transformers>=4.42, av |
✘ |
video |
||
llava_next_video_yi_hf |
llava_next_video_hf |
transformers>=4.42, av |
✘ |
video |
||
llava_onevision_hf |
llava_onevision_hf |
transformers>=4.45 |
✘ |
vision, video |
||
llava_onevision_hf |
llava_onevision_hf |
transformers>=4.45 |
✘ |
vision, video |
||
llava_onevision_hf |
llava_onevision_hf |
transformers>=4.45 |
✘ |
vision, video |
||
yi_vl |
yi_vl |
transformers>=4.34 |
✘ |
vision |
||
yi_vl |
yi_vl |
transformers>=4.34 |
✘ |
vision |
||
llava_llama3_1_hf |
llava_llama3_1_hf |
transformers>=4.41 |
✘ |
vision |
- |
|
llava_llama3_hf |
llava_llama3_hf |
transformers>=4.36 |
✘ |
vision |
||
llava1_6_mistral |
llava1_6_mistral |
transformers>=4.34 |
✘ |
vision |
||
llava1_6_yi |
llava1_6_yi |
transformers>=4.34 |
✘ |
vision |
||
llava_next_qwen |
llava_next_qwen |
transformers>=4.42, av |
✘ |
vision |
||
llava_next_qwen |
llava_next_qwen |
transformers>=4.42, av |
✘ |
vision |
||
llama3_llava_next |
llama3_llava_next |
transformers>=4.42, av |
✘ |
vision |
||
deepseek_vl |
deepseek_vl |
- |
✘ |
vision |
||
deepseek_vl |
deepseek_vl |
- |
✘ |
vision |
||
deepseek_vl2 |
deepseek_vl2 |
transformers<4.42 |
✘ |
vision |
||
deepseek_vl2 |
deepseek_vl2 |
transformers<4.42 |
✘ |
vision |
||
deepseek_vl2 |
deepseek_vl2 |
transformers<4.42 |
✘ |
vision |
||
deepseek_janus |
deepseek_janus |
- |
✘ |
vision |
||
deepseek_janus_pro |
deepseek_janus_pro |
- |
✘ |
vision |
||
deepseek_janus_pro |
deepseek_janus_pro |
- |
✘ |
vision |
||
minicpmv |
minicpmv |
timm, transformers<4.42 |
✘ |
vision |
||
minicpmv |
minicpmv |
timm, transformers<4.42 |
✘ |
vision |
||
minicpmv2_5 |
minicpmv2_5 |
timm, transformers>=4.36 |
✘ |
vision |
||
minicpmv2_6 |
minicpmv2_6 |
timm, transformers>=4.36, decord |
✘ |
vision, video |
||
minicpmo2_6 |
minicpmo2_6 |
timm, transformers>=4.36, decord, soundfile |
✘ |
vision, video, omni, audio |
||
minicpmv4 |
minicpmv4 |
timm, transformers>=4.36, decord |
✘ |
vision, video |
||
minicpmv4_5 |
minicpmv4_5 |
timm, transformers>=4.36, decord |
✘ |
vision, video |
||
minimax_vl |
minimax_vl |
- |
✘ |
vision |
||
mplug_owl2 |
mplug_owl2 |
transformers<4.35, icecream |
✘ |
vision |
||
mplug_owl2_1 |
mplug_owl2 |
transformers<4.35, icecream |
✘ |
vision |
||
mplug_owl3 |
mplug_owl3 |
transformers>=4.36, icecream, decord |
✘ |
vision, video |
||
mplug_owl3 |
mplug_owl3 |
transformers>=4.36, icecream, decord |
✘ |
vision, video |
||
mplug_owl3 |
mplug_owl3 |
transformers>=4.36, icecream, decord |
✘ |
vision, video |
||
mplug_owl3_241101 |
mplug_owl3_241101 |
transformers>=4.36, icecream |
✘ |
vision, video |
||
doc_owl2 |
doc_owl2 |
transformers>=4.36, icecream |
✘ |
vision |
||
emu3_gen |
emu3_gen |
- |
✘ |
t2i |
||
emu3_chat |
emu3_chat |
transformers>=4.44.0 |
✘ |
vision |
||
got_ocr2 |
got_ocr2 |
- |
✘ |
vision |
||
got_ocr2_hf |
got_ocr2_hf |
- |
✘ |
vision |
||
step_audio |
step_audio |
funasr, sox, conformer, openai-whisper, librosa |
✘ |
audio |
||
step_audio2_mini |
step_audio2_mini |
transformers==4.53.3, torchaudio, librosa |
✘ |
audio |
||
kimi_vl |
kimi_vl |
transformers<4.49 |
✔ |
- |
||
kimi_vl |
kimi_vl |
transformers<4.49 |
✔ |
- |
||
kimi_vl |
kimi_vl |
transformers<4.49 |
✔ |
- |
||
keye_vl |
keye_vl |
keye_vl_utils |
✘ |
vision |
||
keye_vl_1_5 |
keye_vl_1_5 |
keye_vl_utils>=1.5.2 |
✘ |
vision |
||
dots_ocr |
dots_ocr |
transformers>=4.51.0 |
✘ |
- |
||
sail_vl2 |
sail_vl2 |
transformers<=4.51.3 |
✘ |
vision |
||
sail_vl2 |
sail_vl2 |
transformers<=4.51.3 |
✘ |
vision |
||
sail_vl2 |
sail_vl2 |
transformers<=4.51.3 |
✘ |
vision |
||
sail_vl2 |
sail_vl2 |
transformers<=4.51.3 |
✘ |
vision |
||
phi3_vision |
phi3_vision |
transformers>=4.36 |
✘ |
vision |
||
phi3_vision |
phi3_vision |
transformers>=4.36 |
✘ |
vision |
||
phi4_multimodal |
phi4_multimodal |
transformers>=4.36,<4.49, backoff, soundfile |
✘ |
vision, audio |
||
florence |
florence |
- |
✘ |
vision |
||
florence |
florence |
- |
✘ |
vision |
||
florence |
florence |
- |
✘ |
vision |
||
florence |
florence |
- |
✘ |
vision |
||
idefics3 |
idefics3 |
transformers>=4.45 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
paligemma |
paligemma |
transformers>=4.41 |
✘ |
vision |
||
molmo |
molmo |
transformers>=4.45 |
✘ |
vision |
||
molmo |
molmo |
transformers>=4.45 |
✘ |
vision |
||
molmo |
molmo |
transformers>=4.45 |
✘ |
vision |
||
molmoe |
molmo |
transformers>=4.45 |
✘ |
vision |
||
pixtral |
pixtral |
transformers>=4.45 |
✘ |
vision |
||
megrez_omni |
megrez_omni |
- |
✘ |
vision, audio |
||
valley |
valley |
transformers>=4.42, av |
✘ |
vision |
- |
|
gemma3_vision |
gemma3_vision |
transformers>=4.49 |
✘ |
- |
||
gemma3_vision |
gemma3_vision |
transformers>=4.49 |
✘ |
- |
||
gemma3_vision |
gemma3_vision |
transformers>=4.49 |
✘ |
- |
||
gemma3_vision |
gemma3_vision |
transformers>=4.49 |
✘ |
- |
||
gemma3_vision |
gemma3_vision |
transformers>=4.49 |
✘ |
- |
||
gemma3_vision |
gemma3_vision |
transformers>=4.49 |
✘ |
- |
||
gemma3n |
gemma3n |
transformers>=4.53.1 |
✘ |
- |
||
gemma3n |
gemma3n |
transformers>=4.53.1 |
✘ |
- |
||
gemma3n |
gemma3n |
transformers>=4.53.1 |
✘ |
- |
||
gemma3n |
gemma3n |
transformers>=4.53.1 |
✘ |
- |
||
mistral_2503 |
mistral_2503 |
transformers>=4.49 |
✘ |
- |
||
mistral_2503 |
mistral_2503 |
transformers>=4.49 |
✘ |
- |
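Datasets
The table below lists the datasets integrated into ms-swift:
Dataset ID: ModelScope dataset ID
HF Dataset ID: HuggingFace dataset ID
Subset Name: name of the subset
Dataset Size: size of the dataset
Statistic: token-count statistics for the dataset, which are helpful when tuning the max_length hyperparameter. The datasets are tokenized with the qwen2.5 tokenizer. Statistics differ between tokenizers; to obtain token statistics for another model's tokenizer, you can compute them yourself with a script.
Tags: tags of the dataset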
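As a rough illustration of how a Statistic cell such as `116.1±80.8, min=31, max=5064` could be computed for a different tokenizer, here is a minimal Python sketch. It is not the script ms-swift uses. The function name `token_stats` is hypothetical, the use of the population standard deviation is an assumption, and the sketch tokenizes raw text only, so its numbers may differ from the table, which may count tokens after a chat template is applied.

```python
import statistics
from typing import Callable, Iterable, Sequence


def token_stats(texts: Iterable[str], tokenize: Callable[[str], Sequence]) -> str:
    """Format token-length statistics as 'mean±std, min=..., max=...'.

    `tokenize` is any callable that maps a string to a sequence of tokens.
    Assumption: the population standard deviation is used; the source does
    not say which variant the table reports.
    """
    lengths = [len(tokenize(text)) for text in texts]
    if not lengths:
        raise ValueError("at least one sample is required")
    mean = statistics.fmean(lengths)
    std = statistics.pstdev(lengths)
    return f"{mean:.1f}±{std:.1f}, min={min(lengths)}, max={max(lengths)}"


# Stand-in tokenizer for demonstration: whitespace splitting.
# With Hugging Face transformers installed, a real tokenizer could be used
# instead (the model ID here is only an example):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
#   token_stats(samples, lambda text: tok(text)["input_ids"])
samples = ["a b c", "a b c d e", "a"]
print(token_stats(samples, str.split))
```

The maximum and the upper tail of the length distribution are the figures most relevant when choosing max_length, since samples longer than that limit are typically truncated or dropped.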
| Dataset ID | Subset Name | Dataset Size | Statistic (token) | Tags | HF Dataset ID |
|---|---|---|---|---|---|
| AI-MO/NuminaMath-1.5 | default | 896215 | 116.1±80.8, min=31, max=5064 | grpo, math | AI-MO/NuminaMath-1.5 |
| AI-MO/NuminaMath-CoT | default | 859494 | 113.1±60.2, min=35, max=2120 | grpo, math | AI-MO/NuminaMath-CoT |
| AI-MO/NuminaMath-TIR | default | 72441 | 100.9±52.2, min=36, max=1683 | grpo, math, 🔥 | AI-MO/NuminaMath-TIR |
| AI-ModelScope/COIG-CQIA | chinese_traditional, coig_pc, exam, finance, douban, human_value, logi_qa, ruozhiba, segmentfault, wiki, wikihow, xhs, zhihu | 44694 | 331.2±693.8, min=34, max=19288 | general, 🔥 | - |
| AI-ModelScope/CodeAlpaca-20k | default | 20022 | 99.3±57.6, min=30, max=857 | code, en | HuggingFaceH4/CodeAlpaca_20K |
| AI-ModelScope/DISC-Law-SFT | default | 166758 | 1799.0±474.9, min=769, max=3151 | chat, law, 🔥 | ShengbinYue/DISC-Law-SFT |
| AI-ModelScope/DISC-Med-SFT | default | 464885 | 426.5±178.7, min=110, max=1383 | chat, medical, 🔥 | Flmc/DISC-Med-SFT |
| AI-ModelScope/Duet-v0.5 | default | 5000 | 1157.4±189.3, min=657, max=2344 | CoT, en | G-reen/Duet-v0.5 |
| AI-ModelScope/GuanacoDataset | default | 31563 | 250.3±70.6, min=95, max=987 | chat, zh | JosephusCheung/GuanacoDataset |
| AI-ModelScope/LLaVA-Instruct-150K | default | 623302 | 630.7±143.0, min=301, max=1166 | chat, multi-modal, vision | - |
| AI-ModelScope/LLaVA-Pretrain | default | huge dataset | - | chat, multi-modal, quality | liuhaotian/LLaVA-Pretrain |
| AI-ModelScope/LaTeX_OCR | default, human_handwrite, human_handwrite_print, synthetic_handwrite, small | 162149 | 117.6±44.9, min=41, max=312 | chat, ocr, multi-modal, vision | linxy/LaTeX_OCR |
| AI-ModelScope/LongAlpaca-12k | default | 11998 | 9941.8±3417.1, min=4695, max=25826 | long-sequence, QA | Yukang/LongAlpaca-12k |
| AI-ModelScope/M3IT | coco, vqa-v2, shapes, shapes-rephrased, coco-goi-rephrased, snli-ve, snli-ve-rephrased, okvqa, a-okvqa, viquae, textcap, docvqa, science-qa, imagenet, imagenet-open-ended, imagenet-rephrased, coco-goi, clevr, clevr-rephrased, nlvr, coco-itm, coco-itm-rephrased, vsr, vsr-rephrased, mocheg, mocheg-rephrased, coco-text, fm-iqa, activitynet-qa, msrvtt, ss, coco-cn, refcoco, refcoco-rephrased, multi30k, image-paragraph-captioning, visual-dialog, visual-dialog-rephrased, iqa, vcr, visual-mrc, ivqa, msrvtt-qa, msvd-qa, gqa, text-vqa, ocr-vqa, st-vqa, flickr8k-cn | huge dataset | - | chat, multi-modal, vision | - |
| AI-ModelScope/MATH-lighteval | default | 7500 | 104.4±92.8, min=36, max=1683 | grpo, math | DigitalLearningGmbH/MATH-lighteval |
| AI-ModelScope/Magpie-Qwen2-Pro-200K-Chinese | default | 200000 | 448.4±223.5, min=87, max=4098 | chat, sft, 🔥, zh | Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese |
| AI-ModelScope/Magpie-Qwen2-Pro-200K-English | default | 200000 | 609.9±277.1, min=257, max=4098 | chat, sft, 🔥, en | Magpie-Align/Magpie-Qwen2-Pro-200K-English |
| AI-ModelScope/Magpie-Qwen2-Pro-300K-Filtered | default | 300000 | 556.6±288.6, min=175, max=4098 | chat, sft, 🔥 | Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered |
| AI-ModelScope/MathInstruct | default | 262040 | 253.3±177.4, min=42, max=2193 | math, cot, en, quality | TIGER-Lab/MathInstruct |
| AI-ModelScope/MovieChat-1K-test | default | 162 | 39.7±2.0, min=32, max=43 | chat, multi-modal, video | Enxin/MovieChat-1K-test |
| AI-ModelScope/Open-Platypus | default | 24926 | 389.0±256.4, min=55, max=3153 | chat, math, quality | garage-bAInd/Open-Platypus |
| AI-ModelScope/OpenO1-SFT | default | 125894 | 1080.7±622.9, min=145, max=11637 | chat, general, o1 | O1-OPEN/OpenO1-SFT |
| AI-ModelScope/OpenOrca | default, 3_5M | huge dataset | - | chat, multilingual, general | - |
| AI-ModelScope/OpenOrca-Chinese | default | huge dataset | - | QA, zh, general, quality | yys/OpenOrca-Chinese |
| AI-ModelScope/SFT-Nectar | default | 131201 | 441.9±307.0, min=45, max=3136 | cot, en, quality | AstraMindAI/SFT-Nectar |
| AI-ModelScope/ShareGPT-4o | image_caption | 57289 | 599.8±140.4, min=214, max=1932 | vqa, multi-modal | OpenGVLab/ShareGPT-4o |
| AI-ModelScope/ShareGPT4V | ShareGPT4V, ShareGPT4V-PT | huge dataset | - | chat, multi-modal, vision | - |
| AI-ModelScope/SkyPile-150B | default | huge dataset | - | pretrain, quality, zh | Skywork/SkyPile-150B |
| AI-ModelScope/WizardLM_evol_instruct_V2_196k | default | 109184 | 483.3±338.4, min=27, max=3735 | chat, en | WizardLM/WizardLM_evol_instruct_V2_196k |
| AI-ModelScope/alpaca-cleaned | default | 51760 | 170.1±122.9, min=29, max=1028 | chat, general, bench, quality | yahma/alpaca-cleaned |
| AI-ModelScope/alpaca-gpt4-data-en | default | 52002 | 167.6±123.9, min=29, max=607 | chat, general, 🔥 | vicgalle/alpaca-gpt4 |
| AI-ModelScope/alpaca-gpt4-data-zh | default | 48818 | 157.2±93.2, min=27, max=544 | chat, general, 🔥 | llm-wizard/alpaca-gpt4-data-zh |
| AI-ModelScope/blossom-math-v2 | default | 10000 | 175.4±59.1, min=35, max=563 | chat, math, 🔥 | Azure99/blossom-math-v2 |
| AI-ModelScope/captcha-images | default | 8000 | 47.0±0.0, min=47, max=47 | chat, multi-modal, vision | - |
| AI-ModelScope/chartqa_digit_r1v_format | default | 11399 | 48.3±5.1, min=37, max=82 | grpo | zyang39/chartqa_digit_r1v_format |
| AI-ModelScope/clevr_cogen_a_train | default | 70000 | 67.0±0.0, min=67, max=67 | qa, math, vision, grpo | leonardPKU/clevr_cogen_a_train |
| AI-ModelScope/coco | default | huge dataset | - | multi-modal, en, vqa, quality | detection-datasets/coco |
| AI-ModelScope/databricks-dolly-15k | default | 15011 | 199.0±268.8, min=26, max=5987 | multi-task, en, quality | databricks/databricks-dolly-15k |
| AI-ModelScope/deepctrl-sft-data | default en | huge dataset | - | chat, general, sft, multi-round | - |
| AI-ModelScope/egoschema | default cls | 101 | 191.6±80.7, min=96, max=435 | chat, multi-modal, video | lmms-lab/egoschema |
| AI-ModelScope/firefly-train-1.1M | default | 1649399 | 204.3±365.3, min=28, max=9306 | chat, general | YeungNLP/firefly-train-1.1M |
| AI-ModelScope/function-calling-chatml | default | 112958 | 465.3±320.1, min=36, max=6106 | agent, en, sft, 🔥 | Locutusque/function-calling-chatml |
| AI-ModelScope/generated_chat_0.4M | default | 396004 | 272.7±51.1, min=78, max=579 | chat, character-dialogue | BelleGroup/generated_chat_0.4M |
| AI-ModelScope/guanaco_belle_merge_v1.0 | default | 693987 | 133.8±93.5, min=30, max=1872 | QA, zh | Chinese-Vicuna/guanaco_belle_merge_v1.0 |
| AI-ModelScope/hh-rlhf | helpful-base helpful-online helpful-rejection-sampled | huge dataset | - | rlhf, dpo | - |
| AI-ModelScope/hh_rlhf_cn | hh_rlhf harmless_base_cn harmless_base_en helpful_base_cn helpful_base_en | 362909 | 142.3±107.5, min=25, max=1571 | rlhf, dpo, 🔥 | - |
| AI-ModelScope/lawyer_llama_data | default | 21476 | 224.4±83.9, min=69, max=832 | chat, law | Skepsun/lawyer_llama_data |
| AI-ModelScope/leetcode-solutions-python | default | 2359 | 723.8±233.5, min=259, max=2117 | chat, coding, 🔥 | - |
| AI-ModelScope/lmsys-chat-1m | default | 166211 | 545.8±3272.8, min=22, max=219116 | chat, em | lmsys/lmsys-chat-1m |
| AI-ModelScope/math-trn-format | default | 11500 | 102.2±88.9, min=36, max=1683 | math | - |
| AI-ModelScope/ms_agent_for_agentfabric | default addition | 30000 | 615.7±198.7, min=251, max=2055 | chat, agent, multi-round, 🔥 | - |
| AI-ModelScope/orpo-dpo-mix-40k | default | 43666 | 938.1±694.2, min=36, max=8483 | dpo, orpo, en, quality | mlabonne/orpo-dpo-mix-40k |
| AI-ModelScope/pile | default | huge dataset | - | pretrain | EleutherAI/pile |
| AI-ModelScope/ruozhiba | post-annual title-good title-norm | 85658 | 40.0±18.3, min=22, max=559 | pretrain, 🔥 | - |
| AI-ModelScope/school_math_0.25M | default | 248481 | 158.8±73.4, min=39, max=980 | chat, math, quality | BelleGroup/school_math_0.25M |
| AI-ModelScope/sharegpt_gpt4 | default V3_format zh_38K_format | 103329 | 3476.6±5959.0, min=33, max=115132 | chat, multilingual, general, multi-round, gpt4, 🔥 | - |
| AI-ModelScope/sql-create-context | default | 78577 | 82.7±31.5, min=36, max=282 | chat, sql, 🔥 | b-mc2/sql-create-context |
| AI-ModelScope/stack-exchange-paired | default | huge dataset | - | hfrl, dpo, pairwise | lvwerra/stack-exchange-paired |
| AI-ModelScope/starcoderdata | default | huge dataset | - | pretrain, quality | bigcode/starcoderdata |
| AI-ModelScope/synthetic_text_to_sql | default | 100000 | 221.8±69.9, min=64, max=616 | nl2sql, en | gretelai/synthetic_text_to_sql |
| AI-ModelScope/texttosqlv2_25000_v2 | default | 25000 | 277.3±328.3, min=40, max=1971 | chat, sql | Clinton/texttosqlv2_25000_v2 |
| AI-ModelScope/the-stack | default | huge dataset | - | pretrain, quality | bigcode/the-stack |
| AI-ModelScope/tigerbot-law-plugin | default | 55895 | 104.9±51.0, min=43, max=1087 | text-generation, law, pretrained | TigerResearch/tigerbot-law-plugin |
| AI-ModelScope/train_0.5M_CN | default | 519255 | 128.4±87.4, min=31, max=936 | common, zh, quality | BelleGroup/train_0.5M_CN |
| AI-ModelScope/train_1M_CN | default | huge dataset | - | common, zh, quality | BelleGroup/train_1M_CN |
| AI-ModelScope/train_2M_CN | default | huge dataset | - | common, zh, quality | BelleGroup/train_2M_CN |
| AI-ModelScope/tulu-v2-sft-mixture | default | 326154 | 523.3±439.3, min=68, max=2549 | chat, multilingual, general, multi-round | allenai/tulu-v2-sft-mixture |
| AI-ModelScope/ultrafeedback-binarized-preferences-cleaned-kto | default | 230720 | 471.5±274.3, min=27, max=2232 | rlhf, kto | - |
| AI-ModelScope/webnovel_cn | default | 50000 | 1455.2±12489.4, min=524, max=490480 | chat, novel | zxbsmk/webnovel_cn |
| AI-ModelScope/wikipedia-cn-20230720-filtered | default | huge dataset | - | pretrain, quality | pleisto/wikipedia-cn-20230720-filtered |
| AI-ModelScope/zhihu_rlhf_3k | default | 3460 | 594.5±365.9, min=31, max=1716 | rlhf, dpo, zh | liyucheng/zhihu_rlhf_3k |
| DAMO_NLP/jd | default cls | 45012 | 66.9±87.0, min=41, max=1699 | text-generation, classification, 🔥 | - |
| FreedomIntelligence/medical-o1-reasoning-SFT | en zh | 50143 | 98.0±53.6, min=36, max=1508 | medical, o1, 🔥 | FreedomIntelligence/medical-o1-reasoning-SFT |
| - | default | huge dataset | - | pretrain, quality | HuggingFaceFW/fineweb |
| - | auto_math_text khanacademy openstax stanford stories web_samples_v1 web_samples_v2 wikihow | huge dataset | - | multi-domain, en, qa | HuggingFaceTB/cosmopedia |
| HumanLLMs/Human-Like-DPO-Dataset | default | 10884 | 47.5±7.9, min=32, max=85 | rlhf, dpo | HumanLLMs/Human-Like-DPO-Dataset |
| LLM-Research/xlam-function-calling-60k | default grpo | 120000 | 453.7±219.5, min=164, max=2779 | agent, grpo, 🔥 | Salesforce/xlam-function-calling-60k |
| MTEB/scidocs-reranking | default | 39193 | 41.9±5.8, min=31, max=107 | rerank, 🔥 | mteb/scidocs-reranking |
| MTEB/stackoverflowdupquestions-reranking | default | 26485 | 39.9±4.6, min=31, max=77 | rerank, 🔥 | mteb/stackoverflowdupquestions-reranking |
| OmniData/Zhihu-KOL | default | huge dataset | - | zhihu, qa | wangrui6/Zhihu-KOL |
| OmniData/Zhihu-KOL-More-Than-100-Upvotes | default | 271261 | 1003.4±1826.1, min=28, max=52541 | zhihu, qa | bzb2023/Zhihu-KOL-More-Than-100-Upvotes |
| PowerInfer/LONGCOT-Refine-500K | default | 521921 | 296.5±158.4, min=39, max=4634 | chat, sft, 🔥, cot | PowerInfer/LONGCOT-Refine-500K |
| PowerInfer/QWQ-LONGCOT-500K | default | 498082 | 310.7±303.1, min=35, max=22941 | chat, sft, 🔥, cot | PowerInfer/QWQ-LONGCOT-500K |
| ServiceNow-AI/R1-Distill-SFT | v0 v1 | 1850809 | 164.2±438.0, min=30, max=32469 | chat, sft, cot, r1 | ServiceNow-AI/R1-Distill-SFT |
| TIGER-Lab/MATH-plus | train | 893929 | 301.4±196.7, min=50, max=1162 | qa, math, en, quality | TIGER-Lab/MATH-plus |
| Tongyi-DataEngine/SA1B-Dense-Caption | default | huge dataset | - | zh, multi-modal, vqa | - |
| Tongyi-DataEngine/SA1B-Paired-Captions-Images | default | 7736284 | 106.4±18.5, min=48, max=193 | zh, multi-modal, vqa | - |
| YorickHe/CoT | default | 74771 | 141.6±45.5, min=58, max=410 | chat, general | - |
| YorickHe/CoT_zh | default | 74771 | 129.1±53.2, min=51, max=401 | chat, general | - |
| ZhipuAI/LongWriter-6k | default | 6000 | 5009.0±2932.8, min=117, max=30354 | long, chat, sft, 🔥 | zai-org/LongWriter-6k |
| - | default | huge dataset | - | pretrain, quality | allenai/c4 |
| bespokelabs/Bespoke-Stratos-17k | default | 16710 | 480.7±236.1, min=266, max=3556 | chat, sft, cot, r1 | bespokelabs/Bespoke-Stratos-17k |
| - | default | huge dataset | - | pretrain, quality | cerebras/SlimPajama-627B |
| codefuse-ai/CodeExercise-Python-27k | default | 27224 | 337.3±154.2, min=90, max=2826 | chat, coding, 🔥 | - |
| codefuse-ai/Evol-instruction-66k | default | 66862 | 440.1±208.4, min=46, max=2661 | chat, coding, 🔥 | - |
| damo/MSAgent-Bench | default mini | 638149 | 859.2±460.1, min=38, max=3479 | chat, agent, multi-round | - |
| damo/nlp_polylm_multialpaca_sft | ar de es fr id ja ko pt ru th vi | 131867 | 101.6±42.5, min=30, max=1029 | chat, general, multilingual | - |
| damo/zh_cls_fudan-news | default | 4959 | 3234.4±2547.5, min=91, max=19548 | chat, classification | - |
| damo/zh_ner-JAVE | default | 1266 | 118.3±45.5, min=44, max=223 | chat, ner | - |
| hjh0119/shareAI-Llama3-DPO-zh-en-emoji | default | 2449 | 334.0±162.8, min=36, max=1801 | rlhf, dpo | shareAI/DPO-zh-en-emoji |
| huangjintao/AgentInstruct_copy | alfworld db kg mind2web os webshop | 1866 | 1144.3±635.5, min=206, max=6412 | chat, agent, multi-round | - |
| iic/100PoisonMpts | default | 906 | 150.6±80.8, min=39, max=656 | poison-management, zh | - |
| iic/DocQA-RL-1.6K | default | 1591 | 8307.3±7748.9, min=202, max=32563 | docqa, rl, long-sequence | Tongyi-Zhiwen/DocQA-RL-1.6K |
| iic/MSAgent-MultiRole | default | 543 | 413.0±79.7, min=70, max=936 | chat, agent, multi-round, role-play, multi-agent | - |
| iic/MSAgent-Pro | default | 21910 | 1978.1±747.9, min=339, max=8064 | chat, agent, multi-round, 🔥 | - |
| iic/ms_agent | default | 30000 | 645.8±218.0, min=199, max=2070 | chat, agent, multi-round, 🔥 | - |
| iic/ms_bench | default | 316820 | 353.4±424.5, min=29, max=2924 | chat, general, multi-round, 🔥 | - |
| liucong/Chinese-DeepSeek-R1-Distill-data-110k-SFT | default | 110000 | 72.1±60.9, min=29, max=2315 | chat, sft, cot, r1, 🔥 | Congliu/Chinese-DeepSeek-R1-Distill-data-110k-SFT |
| - | default | huge dataset | - | multi-modal, en, vqa, quality | lmms-lab/GQA |
| - | 0_30_s_academic_v0_1 0_30_s_youtube_v0_1 1_2_m_academic_v0_1 1_2_m_youtube_v0_1 2_3_m_academic_v0_1 2_3_m_youtube_v0_1 30_60_s_academic_v0_1 30_60_s_youtube_v0_1 | 1335486 | 273.7±78.8, min=107, max=638 | chat, multi-modal, video | lmms-lab/LLaVA-Video-178K |
| lmms-lab/multimodal-open-r1-8k-verified | default | 7689 | 74.0±24.8, min=41, max=214 | grpo, vision, 🔥 | lmms-lab/multimodal-open-r1-8k-verified |
| lvjianjin/AdvertiseGen | default | 97484 | 130.9±21.9, min=73, max=232 | text-generation, 🔥 | shibing624/AdvertiseGen |
| mapjack/openwebtext_dataset | default | huge dataset | - | pretrain, zh, quality | - |
| modelscope/DuReader_robust-QG | default | 17899 | 242.0±143.1, min=75, max=1416 | text-generation, 🔥 | - |
| modelscope/MathR | default clean | 6089 | 188.7±75.3, min=64, max=3341 | qa, math | - |
| modelscope/MathR-32B-Distill | data | 25921 | 209.4±63.1, min=121, max=3407 | qa, math | - |
| modelscope/chinese-poetry-collection | default | 1710 | 58.1±8.1, min=31, max=71 | text-generation, poetry | - |
| modelscope/clue | cmnli | 391783 | 81.6±16.0, min=54, max=157 | text-generation, classification | clue |
| modelscope/coco_2014_caption | train validation | 454617 | 389.6±68.4, min=70, max=587 | chat, multi-modal, vision, 🔥 | - |
| modelscope/gsm8k | main | 7473 | 88.6±21.6, min=41, max=241 | qa, math | - |
| open-r1/verifiable-coding-problems-python | default | 35735 | 559.0±255.2, min=74, max=6191 | grpo, code | open-r1/verifiable-coding-problems-python |
| open-r1/verifiable-coding-problems-python-10k | default | 1800 | 581.6±233.4, min=136, max=2022 | grpo, code | open-r1/verifiable-coding-problems-python-10k |
| open-r1/verifiable-coding-problems-python-10k_decontaminated | default | 1574 | 575.7±234.3, min=136, max=2022 | grpo, code | open-r1/verifiable-coding-problems-python-10k_decontaminated |
| open-r1/verifiable-coding-problems-python_decontaminated | default | 27839 | 561.9±252.2, min=74, max=6191 | grpo, code | open-r1/verifiable-coding-problems-python_decontaminated |
| open-thoughts/OpenThoughts-114k | default | 113957 | 413.2±186.9, min=265, max=13868 | chat, sft, cot, r1 | open-thoughts/OpenThoughts-114k |
| swift/self-cognition | default qwen3 empty_think | 108 | 58.9±20.3, min=32, max=131 | chat, self-cognition, 🔥 | modelscope/self-cognition |
| sentence-transformers/stsb | default positive generate reg | 5748 | 21.0±0.0, min=21, max=21 | similarity, 🔥 | sentence-transformers/stsb |
| shenweizhou/alpha-umi-toolbench-processed-v2 | backbone caller planner summarizer | huge dataset | - | chat, agent, 🔥 | - |
| simpleai/HC3 | finance finance_cls medicine medicine_cls | 11021 | 296.0±153.3, min=65, max=2267 | text-generation, classification, 🔥 | Hello-SimpleAI/HC3 |
| simpleai/HC3-Chinese | baike baike_cls open_qa open_qa_cls nlpcc_dbqa nlpcc_dbqa_cls finance finance_cls medicine medicine_cls law law_cls psychology psychology_cls | 39781 | 179.9±70.2, min=90, max=1070 | text-generation, classification, 🔥 | Hello-SimpleAI/HC3-Chinese |
| speech_asr/speech_asr_aishell1_trainsets | train validation test | 141600 | 40.8±3.3, min=33, max=53 | chat, multi-modal, audio | - |
| swift/A-OKVQA | default | 18201 | 43.5±7.9, min=27, max=94 | multi-modal, en, vqa, quality | HuggingFaceM4/A-OKVQA |
| swift/ChartQA | default | 28299 | 36.8±6.5, min=26, max=74 | en, vqa, quality | HuggingFaceM4/ChartQA |
| swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT | default | 110000 | 72.1±60.9, min=29, max=2315 | 🔥, distill, sft | - |
| swift/Chinese-Qwen3-235B-Thinking-2507-Distill-data-110k-SFT | default | 110000 | 72.1±60.9, min=29, max=2315 | 🔥, distill, sft, cot, r1, thinking | - |
| swift/GRIT | caption grounding vqa | huge dataset | - | multi-modal, en, caption-grounding, vqa, quality | zzliang/GRIT |
| swift/GenQA | default | huge dataset | - | qa, quality, multi-task | tomg-group-umd/GenQA |
| swift/Infinity-Instruct | 3M 7M 0625 Gen 7M_domains | huge dataset | - | qa, quality, multi-task | BAAI/Infinity-Instruct |
| swift/Mantis-Instruct | birds-to-words chartqa coinstruct contrastive_caption docvqa dreamsim dvqa iconqa imagecode llava_665k_multi lrv_multi multi_vqa nextqa nlvr2 spot-the-diff star visual_story_telling | 988115 | 619.9±156.6, min=243, max=1926 | chat, multi-modal, vision | - |
| swift/MideficsDataset | default | 3800 | 201.3±70.2, min=60, max=454 | medical, en, vqa | WinterSchool/MideficsDataset |
| swift/Multimodal-Mind2Web | default | 1009 | 293855.4±331149.5, min=11301, max=3577519 | agent, multi-modal | osunlp/Multimodal-Mind2Web |
| swift/OCR-VQA | default | 186753 | 32.3±5.8, min=27, max=80 | multi-modal, en, ocr-vqa | howard-hou/OCR-VQA |
| swift/OK-VQA_train | default | 9009 | 31.7±3.4, min=25, max=56 | multi-modal, en, vqa, quality | Multimodal-Fatima/OK-VQA_train |
| swift/OpenHermes-2.5 | default | huge dataset | - | cot, en, quality | teknium/OpenHermes-2.5 |
| swift/RLAIF-V-Dataset | default | 83132 | 99.6±54.8, min=30, max=362 | rlhf, dpo, multi-modal, en | openbmb/RLAIF-V-Dataset |
| swift/RedPajama-Data-1T | default | huge dataset | - | pretrain, quality | togethercomputer/RedPajama-Data-1T |
| swift/RedPajama-Data-V2 | default | huge dataset | - | pretrain, quality | togethercomputer/RedPajama-Data-V2 |
| swift/ScienceQA | default | 16967 | 101.7±55.8, min=32, max=620 | multi-modal, science, vqa, quality | derek-thomas/ScienceQA |
| swift/SlimOrca | default | 517982 | 405.5±442.1, min=47, max=8312 | quality, en | Open-Orca/SlimOrca |
| swift/TextCaps | default emb | huge dataset | - | multi-modal, en, caption, quality | HuggingFaceM4/TextCaps |
| swift/ToolBench | default | 124345 | 2251.7±1039.8, min=641, max=9451 | chat, agent, multi-round | - |
| swift/VQAv2 | default | huge dataset | - | en, vqa, quality | HuggingFaceM4/VQAv2 |
| swift/VideoChatGPT | Generic Temporal Consistency | 3206 | 87.4±48.3, min=31, max=398 | chat, multi-modal, video, 🔥 | lmms-lab/VideoChatGPT |
| swift/WebInstructSub | default | huge dataset | - | qa, en, math, quality, multi-domain, science | TIGER-Lab/WebInstructSub |
| swift/aya_collection | aya_dataset | 202364 | 474.6±1539.1, min=25, max=71312 | multi-lingual, qa | CohereForAI/aya_collection |
| swift/chinese-c4 | default | huge dataset | - | pretrain, zh, quality | shjwudp/chinese-c4 |
| swift/cinepile | default | huge dataset | - | vqa, en, youtube, video | tomg-group-umd/cinepile |
| swift/classical_chinese_translate | default | 6655 | 349.3±77.1, min=61, max=815 | chat, play-ground | - |
| swift/cosmopedia-100k | default | 100000 | 1037.0±254.8, min=339, max=2818 | multi-domain, en, qa | HuggingFaceTB/cosmopedia-100k |
| swift/dolma | v1_7 | huge dataset | - | pretrain, quality | allenai/dolma |
| swift/dolphin | flan1m-alpaca-uncensored flan5m-alpaca-uncensored | huge dataset | - | en | cognitivecomputations/dolphin |
| swift/github-code | default | huge dataset | - | pretrain, quality | codeparrot/github-code |
| swift/gpt4v-dataset | default | huge dataset | - | en, caption, multi-modal, quality | laion/gpt4v-dataset |
| swift/llava-data | llava_instruct | 624255 | 369.7±143.0, min=40, max=905 | sft, multi-modal, quality | TIGER-Lab/llava-data |
| swift/llava-instruct-mix-vsft | default | 13640 | 178.8±119.8, min=34, max=951 | multi-modal, en, vqa, quality | HuggingFaceH4/llava-instruct-mix-vsft |
| swift/llava-med-zh-instruct-60k | default | 56649 | 207.9±67.7, min=42, max=594 | zh, medical, vqa, multi-modal | BUAADreamer/llava-med-zh-instruct-60k |
| swift/lnqa | default | huge dataset | - | multi-modal, en, ocr-vqa, quality | vikhyatk/lnqa |
| swift/longwriter-6k-filtered | default | 666 | 4108.9±2636.9, min=1190, max=17050 | long, chat, sft, 🔥 | - |
| swift/medical_zh | en zh | 2068589 | 256.4±87.3, min=39, max=1167 | chat, medical | - |
| swift/moondream2-coyo-5M-captions | default | huge dataset | - | caption, pretrain, quality | isidentical/moondream2-coyo-5M-captions |
| swift/no_robots | default | 9485 | 300.0±246.2, min=40, max=6739 | multi-task, quality, human-annotated | HuggingFaceH4/no_robots |
| swift/orca_dpo_pairs | default | 12859 | 364.9±248.2, min=36, max=2010 | rlhf, quality | Intel/orca_dpo_pairs |
| swift/path-vqa | default | 19654 | 34.2±6.8, min=28, max=85 | multi-modal, vqa, medical | flaviagiammarino/path-vqa |
| swift/pile-val-backup | default | 214661 | 1831.4±11087.5, min=21, max=516620 | text-generation, awq | mit-han-lab/pile-val-backup |
| swift/pixelprose | default | huge dataset | - | caption, multi-modal, vision | tomg-group-umd/pixelprose |
| swift/refcoco | caption grounding | 92430 | 45.4±3.0, min=37, max=63 | multi-modal, en, grounding | jxu124/refcoco |
| swift/refcocog | caption grounding | 89598 | 50.3±4.6, min=39, max=91 | multi-modal, en, grounding | jxu124/refcocog |
| swift/sharegpt | common-zh unknow-zh common-en | 194063 | 820.5±366.1, min=25, max=2221 | chat, general, multi-round | - |
| swift/swift-sft-mixture | sharegpt firefly codefuse metamathqa | huge dataset | - | chat, sft, general, 🔥 | - |
| swift/tagengo-gpt4 | default | 76437 | 468.1±276.8, min=28, max=1726 | chat, multi-lingual, quality | lightblue/tagengo-gpt4 |
| swift/train_3.5M_CN | default | huge dataset | - | common, zh, quality | BelleGroup/train_3.5M_CN |
| swift/ultrachat_200k | default | 207843 | 1188.0±571.1, min=170, max=4068 | chat, en, quality | HuggingFaceH4/ultrachat_200k |
| swift/wikipedia | default | huge dataset | - | pretrain, quality | wikipedia |
| tany0699/garbage265 | default | 132673 | 39.0±0.0, min=39, max=39 | cls, 🔥, multi-modal | - |
| tastelikefeet/competition_math | default | 12000 | 101.9±87.3, min=36, max=1683 | qa, math | - |
| - | default | huge dataset | - | pretrain, quality | tiiuae/falcon-refinedweb |
| wyj123456/GPT4all | default | 806199 | 97.3±20.9, min=62, max=414 | chat, general | - |
| wyj123456/code_alpaca_en | default | 20022 | 99.3±57.6, min=30, max=857 | chat, coding | sahil2801/CodeAlpaca-20k |
| wyj123456/finance_en | default | 68912 | 264.5±207.1, min=30, max=2268 | chat, financial | ssbuild/alpaca_finance_en |
| wyj123456/instinwild | default subset | 103695 | 125.1±43.7, min=35, max=801 | chat, general | - |
| wyj123456/instruct | default | 888970 | 271.0±333.6, min=34, max=3967 | chat, general | - |
| zouxuhong/Countdown-Tasks-3to4 | default | 490364 | 126.6±2.0, min=122, max=130 | math | - |
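
The dataset ID and subset name listed in each row are typically combined into a single string when a dataset is passed to training. The sketch below assumes the `dataset_id[:subset][#sample_count]` form described in the ms-swift command-line documentation; the helper name `dataset_spec` is hypothetical and not part of ms-swift, so verify the exact syntax against the version you have installed.

```python
from typing import Optional


def dataset_spec(dataset_id: str,
                 subset: Optional[str] = None,
                 sample_count: Optional[int] = None) -> str:
    """Compose a dataset string of the assumed form id[:subset][#count].

    Hypothetical helper for illustration; not an ms-swift API.
    """
    # Rows whose first column is "-" have no ModelScope ID to reference.
    if not dataset_id or dataset_id == "-":
        raise ValueError("row has no ModelScope dataset ID")
    spec = dataset_id
    if subset is not None:
        spec += f":{subset}"
    if sample_count is not None:
        if sample_count <= 0:
            raise ValueError("sample_count must be positive")
        spec += f"#{sample_count}"
    return spec


# Example: the qwen3 subset of swift/self-cognition, sampling 100 rows.
print(dataset_spec("swift/self-cognition", subset="qwen3", sample_count=100))
```

Omitting `subset` leaves the choice to the dataset's registered default, and omitting `sample_count` requests the full dataset.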