The PRD review checklist a senior PM actually runs
A copyable, 11-point PRD review checklist — what each check catches and the standard it maps to. The pass a spec should survive before it reaches engineering.
A PRD review checklist is the set of checks you run before a spec reaches your team: is the problem framed before the solution, are the success metrics measurable, are edge cases and non-functional requirements covered, can an engineer estimate it without a follow-up meeting. Most checklists stop at "is it clear?" The useful ones name the failure each check is there to catch.
Here is the one we run. Eleven checks, copyable, with the failure mode beside each so you know why it earns a line.
The PRD review checklist
- Problem before solution. The problem, the user, and why-now are stated before any feature appears. (Catches: a solution looking for a problem.)
- Every required section is present. Goals, users, scope, success metrics, risks — complete, partial, or missing, marked honestly. (Catches: the gap you can't proofread because it isn't written.)
- It's worth doing now. Right problem, right solution, right time, and nothing about it is reckless. (Catches: a well-written spec for the wrong bet.)
- Success metrics are measurable. Each metric has a baseline, a target, and a timeframe. (Catches: "increase engagement.")
- It's ready for engineering. Functional and non-functional requirements are specific enough to estimate. (Catches: the three-review back-and-forth.)
- Every user story maps to a flow. No story without a path through the product, no flow without its unhappy branches. (Catches: the screen nobody designed.)
- Edge cases have acceptance criteria. Empty, error, permission, and offline states are written as testable conditions. (Catches: the bug filed in week one.)
- AI behavior is specified, if there's AI. Model, cost ceiling, expected behavior, and how you'll evaluate it. (Catches: "the AI will handle it.")
- Billing and finance impact is named, if money moves. Pricing, billing-system changes, and finance dependencies. (Catches: the launch blocked by an invoice.)
- Domain assumptions are surfaced. The thing only someone in your domain would know is written down, not left implicit. (Catches: the assumption stated as a fact.)
- There's a verdict. A score, the top risks, and a single call: ship, revise, or rethink. (Catches: a review that ends in twenty bullets and no decision.)
Copy it. Run it on your next draft before anyone else opens the doc. The rest of this piece is why each check is on the list, and where the two that matter most usually get skipped.
What a PRD review checks that a copy-edit doesn't
A copy-edit reads what you wrote. A review reads what you should have written and didn't. That's the whole difference, and it's why "this reads well" is worthless as feedback. The expensive problems in a spec aren't the wrong sentences. They're the sentences that aren't there — the success metric you never defined, the permission state you skipped.
So the checklist isn't sorted by what's easy to see. It's sorted by what costs the most when it slips through. Four of the eleven do most of the work. Here they are.
Success metrics: measurable, or it isn't a metric
"Increase engagement" is not a success metric. It names a direction and hides the three things a reviewer needs: which action, for which cohort, by when.
A measurable metric reads like a claim you could be wrong about. Lift the WAU-to-MAU ratio from 0.31 to 0.40 within two quarters. Now the spec has a baseline (0.31), a target (0.40), and a clock (two quarters). If the feature ships and the number doesn't move, you'll know. That's the test: can this metric fail in public? If not, rewrite it until it can.
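One way to make the "can this metric fail?" test concrete is to write the metric down as data. The following is a minimal sketch, not a prescribed format: the `SuccessMetric` class and its field names are illustrative assumptions, and only the WAU-to-MAU numbers come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessMetric:
    """Hypothetical record of a metric: baseline, target, and a deadline."""
    name: str
    baseline: float
    target: float
    timeframe_quarters: int

    def is_measurable(self) -> bool:
        # A target equal to the baseline claims nothing, and a metric
        # without a deadline can never be declared missed.
        return self.target != self.baseline and self.timeframe_quarters > 0

    def met(self, observed: float) -> bool:
        # Handles metrics that should rise as well as ones that should fall.
        if self.target >= self.baseline:
            return observed >= self.target
        return observed <= self.target


# The example from the text: 0.31 to 0.40 within two quarters.
stickiness = SuccessMetric("WAU-to-MAU ratio", baseline=0.31, target=0.40,
                           timeframe_quarters=2)
```

If "increase engagement" can't be filled into those four fields, it isn't a metric yet. Once it can, `stickiness.met(observed)` has a way to come back false, which is the point.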
Engineering readiness: write the non-functional half
Most PRDs cover what the feature does and go quiet on how well it has to do it. The non-functional half is where estimates blow up.
The standard worth borrowing here is ISO/IEC 25010, the software quality model engineers already think in. Its characteristics include performance efficiency, reliability, security, compatibility, and maintainability. You don't need to cite it on the page. You need to answer it. "Fast" is not a requirement. "p95 response time under 500 ms at 1,000 concurrent users" is. One is a vibe an engineer will quietly reinterpret. The other is a number they can build against and you can hold them to.
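A requirement written as a number can be checked mechanically. As a rough sketch, assuming latency samples collected in milliseconds from a load test at the stated concurrency, the p95 check might look like this. The function names are illustrative, and the nearest-rank percentile is one common definition among several.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of latency samples, in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # 1-based nearest-rank position: ceil(0.95 * n), in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_budget(samples_ms: list[float], budget_ms: float = 500.0) -> bool:
    """True if the p95 latency is strictly under the budget."""
    return p95(samples_ms) < budget_ms


# Illustrative data: 94 fast responses, one at 480 ms, five slow outliers.
samples = [120.0] * 94 + [480.0] + [900.0] * 5
```

With these made-up samples the p95 is 480 ms, so the requirement passes even though five requests took 900 ms. That's a reminder to pick the percentile deliberately: a p99 target would fail the same data.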
Edge cases as acceptance criteria, not prose
The happy path writes itself. The unhappy paths are where the week-one bugs live, and the reason they get missed is that prose hides them. "Handle errors gracefully" feels like coverage. It isn't.
Write edge cases the way QA will test them — as conditions. The Gherkin Given-When-Then form is the cleanest I've found:
Given a user with a revoked invite
When they open the shared link
Then they see a request-access screen, not a 404.
Three lines, and there's nothing left to interpret. Run that pattern across empty states, expired sessions, partial permissions, and the offline case, and you've written the test plan as a side effect of the review.
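The coverage half of this check can be sketched in a few lines. The snippet below assumes scenarios are kept as plain text keyed by the state they cover. The state names and helper functions are hypothetical, and the check is deliberately strict: exactly one Given, one When, and one Then line, with no `And` or `But` steps, which real Gherkin would allow.

```python
REQUIRED_STATES = ("empty", "expired session", "partial permission", "offline")
KEYWORDS = ["Given", "When", "Then"]


def is_testable(scenario: str) -> bool:
    """True if the scenario is exactly a Given, a When, and a Then line, in order."""
    lines = [line for line in scenario.strip().splitlines() if line.strip()]
    return [line.split()[0] for line in lines] == KEYWORDS


def uncovered_states(scenarios: dict[str, str]) -> list[str]:
    """States with no scenario, or with one that isn't in Given-When-Then form."""
    return [s for s in REQUIRED_STATES if not is_testable(scenarios.get(s, ""))]


scenarios = {
    "partial permission": (
        "Given a user with a revoked invite\n"
        "When they open the shared link\n"
        "Then they see a request-access screen, not a 404"
    ),
    # Prose, not a condition, so it doesn't count as coverage.
    "offline": "Handle errors gracefully",
}
```

Here `uncovered_states(scenarios)` returns the empty, expired-session, and offline states: two were never written, and one was written as prose.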
The verdict: a decision, not a digest
A review that ends in twenty equally weighted findings isn't a review. It's homework, and you'll skim it.
The last check is the one most people skip entirely: end with a call. The top risks, ranked, and one of three verdicts (ship, revise, or rethink), plus the single blocking gap to fix first. Severity has to be assigned the same way every time, not by whoever's reviewing and how their afternoon is going. Consistency is what lets you trust the sort order and fix what matters first. If you want the longer argument for why specificity beats thoroughness, it's in the principles behind a rigorous critique.
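"Assigned the same way every time" amounts to a fixed rule rather than a mood. Here is a minimal sketch of one such rule, using the blocker, major, and minor labels. The thresholds in `verdict` are an assumption made for illustration, not a standard: any blocker or major finding means revise, and three or more blockers means rethink.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"blocker": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class Finding:
    summary: str
    severity: str  # one of the keys in SEVERITY_ORDER


def rank(findings: list[Finding]) -> list[Finding]:
    # sorted() is stable, so findings of equal severity keep their input order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


def verdict(findings: list[Finding], rethink_at: int = 3) -> str:
    # ASSUMED rule for illustration; tune the threshold to your own team.
    blockers = sum(f.severity == "blocker" for f in findings)
    if blockers >= rethink_at:
        return "rethink"
    if blockers or any(f.severity == "major" for f in findings):
        return "revise"
    return "ship"


findings = [
    Finding("Typo in the goals section", "minor"),
    Finding("Success metric has no baseline", "blocker"),
    Finding("No offline state specified", "major"),
]
```

The first item of `rank(findings)` is the single blocking gap to fix first, and `verdict(findings)` gives the same call no matter who runs the review.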
The two checks reviews skip most
Two failures show up in almost every review that goes wrong, and neither is on a typical checklist.
The first is naming the missing thing. Reviewers comment on what's on the page because it's there to comment on. Nicer phrasing for goal two, a note that the metrics section "could be stronger." All true, none of it changes what ships. Gap detection is harder and worth more: you have to read for the section that should exist and doesn't. You cannot proofread a paragraph that was never written, which is exactly why those gaps survive all the way to the review meeting.
The second is conflict. A review that only agrees with you is noise wearing the costume of feedback. The useful read is the one willing to say the problem statement and the proposed solution don't match, or that this metric won't move even if the feature works, and to be wrong about that in writing. You don't need a second opinion that flatters the draft. You need the one that conflicts with yours, so the conflict happens at your desk on Tuesday instead of in the room on Friday. Most AI "review my doc" tools fail this test by design: built to be helpful, they stay agreeable, handing you a page of polite suggestions and no ranked verdict.
Running the eleven checks automatically
You can run this list by hand. It takes a real reviewer twenty minutes of honest attention, and the hardest part is staying honest about the sections you skimmed because you wrote them.
That manual pass is what Thinkr's Critique automates: it runs the full 11-pass review on your draft and lands each finding as a comment on the line it concerns, classified blocker, major, or minor, with a suggested rewrite you can accept inline. Same eleven checks, same severity discipline, before your team opens the doc. If you want to see it work on a real spec rather than take my word for it, there's a full teardown of one PRD, start to finish. And if you're still at the writing stage, the checks start upstream in how the draft gets generated.
FAQ
What is a PRD review?
A PRD review is a structured read of a product spec before it ships to design or engineering, looking for what's missing or weak (unframed problems, unmeasurable metrics, uncovered edge cases, readiness gaps) rather than polishing what's already written. The output is findings you can act on, sorted by severity.
Can AI review a PRD?
Yes, and it's a different job than writing one. An AI reviewer reads your draft against a fixed rubric and returns flagged gaps with suggested fixes. It catches the mechanical failures consistently — missing acceptance criteria, vague metrics, absent error states. What it can't supply is the domain judgment about whether you're solving the right problem; that check still needs you. Used as a first pass, it clears the obvious gaps so your human reviewers spend their time on the strategic ones.
How is a PRD reviewer different from a PRD generator?
A generator writes a draft from a prompt. A reviewer reads the draft you already have and tells you where it breaks. They're opposite motions: one produces text, the other pressure-tests it. The generators (the writing tools, the chat assistants) all start from a blank page. The review is the step almost none of them actually run.