METR
Model Evaluation & Threat Research

Our Take
METR (née ARC Evals) is out here doing the unglamorous work of actually measuring what AI agents can and can't do, which is honestly more useful than the endless stream of "we crushed the benchmark" press releases flooding my inbox. They're not building models; they're the people making sure the rest of us can tell what's real and what's just a model tuned to the benchmark. The evaluation space is getting crowded, but METR has been at this long enough to have real credibility.
Evaluates frontier AI models to help companies and wider society understand AI capabilities and the risks they pose
Key Facts
The people behind METR
METR Team
AI Company
AI evaluation organization. METR (formerly ARC Evals) evaluates frontier AI models and autonomous AI agents.