Recognition
Socrates starts the way the philosopher did — by asking. You can't hand him a perfect spec up front (no one can), so he doesn't demand one. He asks sharp questions, shows you examples to react to, and reads what you actually care about from how you respond — recognition, not recall. Then he reflects it back as a short story: "here's what I think you're really trying to do." You confirm it, or you correct it. Why it matters: almost every expensive failure is a confident answer to the wrong question. You can't verify your way out of a misunderstood problem — so the very first move is making sure you're both solving the right one.
Scout the field
Before building anything, Socrates looks at how the field already solves this kind of problem and locks onto the right reference to work from. Why it matters: the most common mistake is reaching for the first tool that comes to mind — almost always the familiar one, rarely the right one. Most "new" problems have a known, better answer already out there; finding it first is the cheapest advantage you can get.
The right ruler
Socrates chooses the way of verifying the work that the field itself trusts, and stress-tests that choice. The obvious "ruler" is examined first, and it usually turns out to measure the wrong thing. Why it matters: this is the highest-stakes decision in the whole process. Pick the wrong ruler and every check afterward will faithfully confirm the wrong thing. A green result built on a bad measure is the most expensive kind of wrong, because it looks completely fine.
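To make the "wrong ruler" concrete, here is a minimal sketch in plain Python. The data and the always-negative "model" are hypothetical, invented purely for illustration. It shows a familiar measure (accuracy) reporting a near-perfect score for work that finds none of the cases that matter, while a better-suited measure (recall) exposes it.

```python
# Hypothetical data: 1000 cases, only 10 of which are the ones we care about.
labels = [1] * 10 + [0] * 990

# A useless "model" that always predicts the negative class.
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of the true positives that were actually found."""
    found = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for p in found) / len(found)


print(f"accuracy: {accuracy(labels, predictions):.2f}")  # 0.99, looks green
print(f"recall:   {recall(labels, predictions):.2f}")    # 0.00, nothing found
```

Both numbers are computed correctly. Only one of them measures what the work is for, which is why the choice of ruler comes before any checking.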
Weigh the shapes
Instead of running with the obvious design, Socrates forms a few genuinely different candidates and weighs them against each other before choosing one. Why it matters: the first shape that occurs to you is rarely the best one. Holding real alternatives side by side is how you find the design that actually fits — and how you catch the moment the "obvious" choice was a trap.
Send in the red team
This is the core move, and it's not one reviewer nodding along. Socrates spins up a fleet of independent red-team agents, running in parallel through Claude's new Workflows orchestration, and every one of them tries to break each "proven" claim by building a real counterexample, not just re-reading it. A tempting, already-green "pass" is offered as a shortcut, and the swarm attacks it along with everything else: the hollow green shatters, while the true work takes the hits, holds, and comes out harder. Why it matters: a single confident pass can be hollow, and one reviewer shares your blind spots. An army of adversaries, each attacking from a different angle, is what turns "looks proven" into "survived everything we threw at it."
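The idea of several independent attackers hunting for counterexamples can be sketched in a few lines. This is only an illustration under stated assumptions: it uses ordinary Python threads rather than Claude's Workflows, and every name in it (`hollow_clamp`, `sound_clamp`, the `attack_*` functions, `red_team`) is hypothetical. The claim under attack is "clamp always returns a value between lo and hi".

```python
import random
from concurrent.futures import ThreadPoolExecutor


def hollow_clamp(x, lo, hi):
    """Passes the obvious check (x above hi) but forgets the lower bound."""
    return min(x, hi)


def sound_clamp(x, lo, hi):
    """Enforces both bounds."""
    return max(lo, min(x, hi))


def first_counterexample(clamp, cases):
    """Return the first input that breaks the claim, or None if none does."""
    for x, lo, hi in cases:
        if not (lo <= clamp(x, lo, hi) <= hi):
            return (x, lo, hi)
    return None


# Each attacker probes the claim from a different angle.
def attack_above(clamp, seed):
    rng = random.Random(seed)
    cases = [(rng.randint(11, 10**6), 0, 10) for _ in range(200)]
    return first_counterexample(clamp, cases)


def attack_below(clamp, seed):
    rng = random.Random(seed)
    cases = [(rng.randint(-10**6, -1), 0, 10) for _ in range(200)]
    return first_counterexample(clamp, cases)


def attack_boundaries(clamp, seed):
    cases = [(-1, 0, 10), (0, 0, 10), (10, 0, 10), (11, 0, 10)]
    return first_counterexample(clamp, cases)


ATTACKERS = [attack_above, attack_below, attack_boundaries]


def red_team(clamp):
    """Run every attacker in parallel and collect the counterexamples found."""
    with ThreadPoolExecutor(max_workers=len(ATTACKERS)) as pool:
        futures = [
            pool.submit(attack, clamp, seed)
            for seed, attack in enumerate(ATTACKERS)
        ]
        findings = [f.result() for f in futures]
    return [c for c in findings if c is not None]


print("hollow:", "broken" if red_team(hollow_clamp) else "survived")
print("sound: ", "broken" if red_team(sound_clamp) else "survived")
```

Note that `attack_above` alone would have passed the hollow version. It only breaks because other attackers come at it from below and at the boundaries.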
Sealed, honestly
What comes out is never one green light. It's an honest split: proven — a real check fired and survived the adversary; your call — judgment that's yours, never faked for you; and the gap — whatever couldn't be verified, named out loud instead of hidden. Why it matters: a tool you can actually trust is one that tells you exactly where its knowledge ends. Proven, or named — never assumed. That honesty is the entire point.
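One way to picture the three-way split is as a small data structure. The sketch below is an assumption about how such a report could be modeled, not a description of Socrates' actual output format. The `Claim` fields, the `Status` names, and the example claims are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROVEN = "proven"        # a real check fired and survived the adversary
    YOUR_CALL = "your call"  # a judgment left to the person, never faked
    GAP = "gap"              # could not be verified, so it is named


@dataclass(frozen=True)
class Claim:
    text: str
    check_fired: bool = False
    survived_attack: bool = False
    needs_judgment: bool = False


def classify(claim):
    """Anything that is not judgment and not fully proven is reported as a gap."""
    if claim.needs_judgment:
        return Status.YOUR_CALL
    if claim.check_fired and claim.survived_attack:
        return Status.PROVEN
    return Status.GAP


claims = [
    Claim("parser round-trips every fixture", check_fired=True, survived_attack=True),
    Claim("error messages read clearly", needs_judgment=True),
    Claim("behaves well under production load"),
    Claim("cache never serves stale data", check_fired=True),  # checked, never attacked
]

for claim in claims:
    print(f"{classify(claim).value:>9}: {claim.text}")
```

The design choice worth noticing is the default: a claim that was checked but never attacked falls into the gap rather than being rounded up to proven.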
A green from Socrates means a real check fired and someone tried to break it — and anything uncertain is flagged, not dressed up. That's the promise: you move fast, and you trust what you ship.