The Cost of LGTM

You have stamped a pull request you didn't read. Green checks, a deep queue, someone waiting on the merge: you skimmed the diff, typed the four letters, and moved on. Most of us do it weekly. What changed in 2026 is the volume: the same reflex that used to wave through one risky change a month now waves through a flood, because the machine on the other side writes code faster than any human can review.

Reading 600 lines properly takes time you don't feel you have, and "probably fine" is cheaper. The uncomfortable part is that most of the time it really is fine, which is exactly why the habit survives. The bill only arrives on the days it isn't fine, and by then it's filed under something else: an incident, a new hire losing two days to a function nobody can explain, a small change that turns into a rewrite.

The job changed under us

Sonar's 2026 State of Code survey of 1,100+ developers found AI now writes around 42% of committed code. But the flood didn't make teams faster; it moved the bottleneck from writing to reviewing. METR's 2025 controlled study put experienced developers on real tasks and found they felt about 20% faster with AI while actually finishing 19% slower. Zoom out to the organization and it compounds: Faros AI's latest analysis found high-AI-adoption teams merged 98% more pull requests, yet review time climbed 91%, average PR size grew 154%, and delivery metrics didn't move. And LinearB's 2026 benchmark found AI-generated changes wait roughly 4.6× longer before anyone even opens them.

Writing was never the real constraint. Verifying is. Your CI already decides whether the code compiles and the tests pass, and it doesn't get tired at review number five. What no tool can decide is whether the code should exist at all: does the abstraction earn its weight, and will anyone understand it in six months? That judgment is the whole job now, and it costs more attention per PR at the precise moment the PRs are multiplying. It's the gap I described in Speed Is a System, made concrete: clearing your own step and speeding up the system are not the same thing.

“Your pipeline already checks whether the code works. You're there to decide whether it should exist.”

Fast isn't the sin

This is where most writing about LGTM goes wrong. It treats every quick approval as a moral failure, which is naive about how real teams ship. At a hundred PRs a week, a thirty-minute deep read of each one isn't diligence: it's fifty hours of review every week, a backlog that only grows, and a team that starts merging without you. Fast approval of a trusted change in familiar code is correct. The failure is undifferentiated approval: handing the risky ten percent the same forty-second glance as the safe ninety. SmartBear's peer code review study found reviewers catch the most defects when a review covers 200 to 400 lines, and detection falls off a cliff past that. So the skill was never reviewing everything deeply. It's knowing which changes deserve the depth. Slow down when the change:

Introduces a pattern your codebase doesn't already have.
Touches a boundary: an API, a schema, a shared component half the system leans on.
Is too large, and the author can't explain it in two lines.
Was generated, reads beautifully, and passes every check. That's exactly when the subtle problems hide.
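
The four signals above can be sketched as a small triage function. This is only an illustration under stated assumptions: `ChangeSummary`, its fields, and the 400-line cutoff are hypothetical names and values chosen for the example, not part of any real tool, and "reads beautifully" is a human judgment the sketch doesn't try to encode.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical cutoff, taken from the upper end of the 200-400 line range above.
LARGE_CHANGE_LINES = 400


@dataclass
class ChangeSummary:
    """Illustrative description of a PR; every field here is an assumption."""

    lines_changed: int
    introduces_new_pattern: bool
    touches_boundary: bool          # API, schema, or widely shared component
    explained_in_two_lines: bool    # could the author summarize it briefly?
    generated: bool                 # written mostly by a model
    checks_green: bool


def reasons_to_slow_down(change: ChangeSummary) -> list[str]:
    """Return the signals from the list above that apply to this change."""
    reasons = []
    if change.introduces_new_pattern:
        reasons.append("new pattern")
    if change.touches_boundary:
        reasons.append("touches a boundary")
    if change.lines_changed > LARGE_CHANGE_LINES and not change.explained_in_two_lines:
        reasons.append("large and unexplained")
    if change.generated and change.checks_green:
        reasons.append("generated and clean-looking")
    return reasons


def review_depth(change: ChangeSummary) -> str:
    """'fast' when no signal fires, 'deep' otherwise."""
    return "deep" if reasons_to_slow_down(change) else "fast"
```

The point of returning the reasons, not just a verdict, is that the reviewer sees why a change was pulled out of the fast lane. The judgment itself stays with the human.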

Approve is not a courtesy

When you approve, you co-sign. Whatever ships has your name on it now, and "the author" is cold comfort during the incident. That's the part the four-letter reflex quietly skips: it looks like participation while declining the responsibility. It's the same retreat I described in Ownership Is the Real Performance Multiplier: "that's the other team's component," "nothing changed on our side." And the volume makes it easier to hide. When fifty PRs are waiting, declining to truly own any single one of them feels reasonable. It isn't.

“Approval is co-authorship of the outcome. "LGTM" just hides the signature.”

How review keeps up

None of this argues for slower teams. Quite the opposite. The ones keeping their pipelines healthy aren't reviewing harder; they changed what review optimizes for.

Keep changes small: one analysis of 50,000+ PRs found sub-200-line PRs get approved around three times faster, and defect detection falls from 87% for the smallest PRs to just 28% past a thousand lines. A size limit does more than any lecture will.

Let the machine take the first pass: it clears surface noise (typos, style, obvious logic errors) before a human ever opens the diff. Some teams using this pattern report cutting human review time 30–50%, though that figure comes from GitHub's internal tooling and hasn't been independently verified. Martian's March 2026 benchmark still puts current AI reviewers at 50–60% effective, so treat one as a filter, never a stand-in for judgment.

Route each PR to whoever has to live with that code (not just whoever's available) and agree on a turnaround the team actually honors; when nobody is named, nobody owns the review, and depth quietly stops happening.
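
A size limit is the easiest of these to automate. Here is a minimal sketch, assuming the check runs on the text produced by `git diff --numstat` (tab-separated added count, deleted count, and path, with `-` in place of the counts for binary files). The function names and the 200-line default are hypothetical, picked to match the figure quoted above.

```python
# Hypothetical limit, chosen to match the sub-200-line figure cited above.
DEFAULT_LINE_LIMIT = 200


def changed_lines(numstat_output: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` text."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # blank or malformed line
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file: numstat reports no line counts
        total += int(added) + int(deleted)
    return total


def within_limit(numstat_output: str, limit: int = DEFAULT_LINE_LIMIT) -> bool:
    """True when the change is small enough to review in one sitting."""
    return changed_lines(numstat_output) <= limit
```

In CI this might be fed the output of `git diff --numstat main...HEAD`, with `main` standing in for whatever the base branch is called. Whether an oversized PR fails the build or only gets a warning label is a team decision the sketch leaves open.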

Before you hit approve

Skip whether it works: your tests own that. Ask the three questions that are yours: does this fit how the system already works, will someone understand it without you, and are you willing to own the outcome too? If the last one gives you pause, you're not done reading; that holds whether a person or a model wrote it.

Now the machine can stamp too

Every era of thin review has had its excuse. The newest one is the smoothest yet: the model wrote it, I only approved it. When you didn't type the code yourself, waving it through feels even more defensible, but ownership doesn't transfer with the keystroke. The outcome is still yours. The green check was never the finish line. It's a signature. Don't sign things you wouldn't defend.