Step 3.5 Flash

Frontier open-source MoE model built for OpenClaw agents

Our Take

Zac Zuo and the StepFun team just dropped Step 3.5 Flash, an open-source frontier model designed for agents first and chatbots second. While much of the field races to make AI that sounds smarter in conversation, StepFun's pitch is simpler: make AI that actually gets things done. Step 3.5 Flash is a sparse Mixture of Experts (MoE) model with 196 billion total parameters, but it activates only 11 billion of them per token, roughly 5.6% of the model. That keeps per-token compute closer to that of an 11B dense model while retaining the capacity of a much larger one.
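To make the sparsity concrete, here is a minimal sketch of the arithmetic, using only the two parameter counts quoted above. The function and constant names are illustrative and not part of any StepFun tooling.

```python
# Parameter counts quoted in the text, in billions.
TOTAL_PARAMS_B = 196   # total parameters
ACTIVE_PARAMS_B = 11   # parameters activated per token


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used to process a single token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 5.6%"
```

Put differently, each token touches about one parameter in eighteen, which is where the efficiency comes from.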

The benchmarks tell the story: an 81.0 average score, ahead of DeepSeek V3.2 and Kimi K2.5 and level with Gemini 3.0 Pro. It hits 74.4% on SWE-bench Verified and 51.0% on Terminal-Bench 2.0, numbers that suggest it can write code and carry out real tasks, not just chat. The 3-way Multi-Token Prediction (MTP-3) architecture delivers 100 to 300 tokens per second in typical usage, peaking at 350 tok/s on coding tasks. That is fast enough for real-time agent loops. It handles a 256K context window with a 3:1 sliding-window attention scheme that keeps long-context costs from eating your GPU budget. Open source just got a serious contender, and it's built for the agentic future everyone keeps talking about.
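A rough sketch of why a sliding window helps at long context. It rests on two assumptions that are not stated in the text: that "3:1" means three sliding-window layers for every full-attention layer, and a hypothetical window of 4,096 tokens. It counts how many earlier positions a new token attends to across a group of four layers, which is a proxy for attention cost and not a measurement of the real model.

```python
CONTEXT_TOKENS = 256 * 1024  # 256K context window, from the text
WINDOW_TOKENS = 4096         # hypothetical window size (assumption, not from the source)
SWA_PER_FULL = 3             # assumed reading of "3:1": three sliding-window layers per full layer


def attended_positions(context: int, window: int, swa_layers: int, full_layers: int) -> int:
    """Positions one new token attends to, summed over a group of layers."""
    swa_cost = swa_layers * min(window, context)  # sliding-window layers see at most `window` tokens
    full_cost = full_layers * context             # full-attention layers see the whole context
    return swa_cost + full_cost


# One group of four layers: 3 sliding-window + 1 full, versus 4 full-attention layers.
hybrid = attended_positions(CONTEXT_TOKENS, WINDOW_TOKENS, SWA_PER_FULL, 1)
all_full = attended_positions(CONTEXT_TOKENS, WINDOW_TOKENS, 0, SWA_PER_FULL + 1)
ratio = hybrid / all_full
print(f"Hybrid group attends to {ratio:.1%} of what an all-full group would")
```

Under these assumptions the hybrid group attends to about 26% of the positions at a full 256K context. The exact figure depends on the real window size, which the text does not give.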

stepfun.com blog → GitHub →