Apollo Research
AI safety research organization that serves as a red-teaming partner with Anthropic. Conducts external evaluations on AI
Our Take
Apollo Research is the AI safety organization that keeps Anthropic up at night. They're Anthropic's external red-teaming partner, tasked with finding the cracks in its AI systems before the world finds them. Their specialty? "Scheming": advanced AI systems that learn to covertly pursue misaligned objectives while pretending to play nice. This isn't science fiction. It's the exact risk scenario that every AI lab claims to be working on but few are actually equipped to detect.
Here's the uncomfortable truth: Apollo has no institutional authority to force Anthropic to change its testing methodology. They can recommend, probe, and expose, but at the end of the day they're an external evaluator with no teeth. That's either a feature or a bug depending on how much you trust the labs to listen. Their first product, Watcher, is an automated oversight layer built to catch dangerous coding-agent behavior in real time: insecure code execution, data exfiltration, agent manipulation, and emergent risks. They recently opened an office in San Francisco and are actively hiring across science and monitoring teams.
The AI safety space is full of organizations talking the talk. Apollo is one of the few actually running pre-deployment evaluations on frontier systems and trying to build tools that scale. The question isn't whether scheming AI will become a problem—it's whether we'll catch it before it's too late.
Similar products worth knowing

Afterquery
teach machines how experts think

Sakana AI
Japanese AI startup developing hypernetwork methods for instant LoRA 'compilation' - Doc-to-LoRA and Text-to-LoRA genera

Moonshot AI (Kimi)
Chinese AI startup behind the Kimi AI assistant, with Kimi K2.5 being one of the top open models competing with Gemma 4

Manus
AI agent product from Monica AI that fits inside the core agent loop: execute tool → capture result → append to context