Advanced Research
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning
Group Sequence Policy Optimization