AI made your developers faster. Why hasn't software delivery caught up?
The data on individual throughput, system throughput, and the three structural changes that close the gap
Two different answers to the same question
Ask a developer if AI coding tools have made them more productive, and you will almost certainly get a yes. Ask their engineering director if sprint velocity has improved, and the answer gets murkier.
This is not a morale problem, and it is not a measurement problem. It is a systems problem. The data from the past 12 months is specific enough to explain exactly where the gap comes from.
What the AI developer productivity data actually shows
A Faros.ai study tracking 5,000+ developers across enterprise engineering teams found that high-adoption cohorts (teams where over 70% of developers had active AI assistant sessions daily) completed 21% more tasks per sprint and merged pull requests at nearly twice the rate of control groups. These numbers are controlled for team size and project type. They replicate across organisations.
Developer satisfaction data lines up with this. Engineers using AI coding assistants consistently report finishing features faster, spending less time on boilerplate, and feeling less blocked on routine implementation. The individual experience is genuinely better.
So the tools are working. The question is where the output goes.
Where the time went: Amdahl's Law meets the review queue
The same Faros.ai dataset showed that pull request review time went up by 91% on high-AI-adoption teams.
When code is produced faster, it arrives in the review queue faster. The review queue did not scale with the new throughput. Code review, a manageable constraint in the pre-AI pipeline, became the obvious bottleneck once the generation phase sped up while review itself stayed exactly as it was.
Gene Amdahl's 1967 formulation for parallel computing says that the speedup of a system is limited by the fraction that cannot be parallelised. The software delivery version: if coding accounts for 20-30% of end-to-end cycle time, making coding twice as fast only shortens total cycle time by 10-15%. Review, QA, staging, deployment, and stakeholder sign-off make up the remaining 70-80%. Speeding up the first fraction does not shrink the second; it just makes the second the dominant share of cycle time.
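The arithmetic is easy to sanity-check. A minimal sketch, plugging the fractions from the paragraph above into Amdahl's formula (the values are illustrative, not from the Faros.ai dataset):

```python
# Amdahl's Law applied to a delivery pipeline: if coding is a fraction p of
# end-to-end cycle time and becomes s times faster, the overall speedup is
#   1 / ((1 - p) + p / s)

def pipeline_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of cycle time gets s times faster."""
    return 1.0 / ((1.0 - p) + p / s)

for coding_fraction in (0.20, 0.30):
    speedup = pipeline_speedup(coding_fraction, s=2.0)  # coding twice as fast
    reduction = 1.0 - 1.0 / speedup                     # cut in total cycle time
    print(f"coding = {coding_fraction:.0%} of cycle time -> "
          f"total cycle time falls by {reduction:.0%}")

# coding = 20% of cycle time -> total cycle time falls by 10%
# coding = 30% of cycle time -> total cycle time falls by 15%
```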
Sprint velocity has been sticky not because AI tools are underperforming, but because the constraint moved from the generation phase to the review phase. The individual gains and the flat delivery numbers are both real; they are simultaneously true.
| Metric | Observed change | What it points to |
|---|---|---|
| Individual task completion rate | +21% on high-adoption teams | AI genuinely accelerates individual coding |
| PR merge rate | Nearly doubled (+98%) | More code produced per developer per day |
| PR review time | +91% on the same teams | Downstream bottleneck exposed, not eliminated |
| Verification and rework overhead | ~40% of raw AI gains consumed | Checking AI output is real cognitive work |
| Org-level delivery cycle time | Mixed; often flat | Coding was not the bottleneck on delivery |
| CEO-reported business impact (PwC, 2026) | 56% report no gains | Faster code did not reach faster outcomes |
The verification tax
There is a second phenomenon layered on top of the review bottleneck. A January 2026 Workday analysis found that nearly 40% of AI-generated productivity gains were being consumed by verification and rework — the time developers spent checking, correcting, and second-guessing AI output before submitting it for review.
AI coding assistants produce plausible code reliably. They produce correct code probabilistically. The cognitive cost of reviewing your own AI-generated output — deciding what to keep, what to rewrite, where the model got something subtly wrong — is real and does not show up in task-completion counts.
A developer who merges twice as many PRs in a week may have spent a similar number of focused hours in front of code. Merged output is up; verification overhead is a new category of work that did not exist before. The gross productivity gain is real. The net gain, after accounting for that overhead, is smaller and harder to attribute cleanly to the tooling.
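To put rough numbers on gross versus net, here is the back-of-the-envelope arithmetic combining the two figures cited in this piece. The combination is illustrative; the Faros.ai and Workday studies measured different populations, so treat the result as an order-of-magnitude estimate, not a finding:

```python
# Gross vs net productivity gain, combining two figures cited above.
# Illustrative arithmetic only; the studies measured different populations.

gross_gain = 0.21        # Faros.ai: +21% tasks completed per sprint
verification_tax = 0.40  # Workday: ~40% of AI gains consumed by checking/rework

net_gain = gross_gain * (1 - verification_tax)
print(f"gross: +{gross_gain:.0%}, net after verification tax: +{net_gain:.1%}")
# gross: +21%, net after verification tax: +12.6%
```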
What the macro data shows, and what it doesn't
At the organisational and economic level, the gap widens further.
The PwC 2026 Global CEO Survey of 4,454 CEOs across 95 countries found that 56% said they had gotten nothing from their AI investments. Only 12% reported AI had both grown revenues and reduced costs. Goldman Sachs economists found no measurable AI contribution to US GDP growth through 2025.
These figures sit awkwardly alongside the developer-level data. If developers are merging PRs at twice the rate, where is the business outcome?
The most coherent explanation is that coding throughput was not the constraint on software delivery, and software delivery was not the constraint on business outcomes for most organisations. Speeding up the coding phase faster than the rest of the system can absorb it does not produce faster products. It produces a longer backlog of code waiting to be reviewed, tested, and shipped.
It is worth noting that BLS benchmark revisions put US productivity growth at approximately 2.7% in 2025, nearly double the prior decade's average. That is what a genuine productivity acceleration looks like in the aggregate statistics. But it is accruing unevenly, and much of it is being absorbed by process friction before it reaches business metrics.
“Coding throughput was not the constraint on software delivery. Speeding up a step that was not the bottleneck does not move the whole system.”
The three places gains actually compound
Based on what the data shows about where cycle time actually goes, there are three structural changes that shift AI coding gains from individual throughput to delivery throughput.
Smaller PRs, not larger ones
This is counterintuitive. AI tools generate more code per session, so the natural response is to let PRs grow. The teams that have captured the most organisational benefit from AI coding tools have done the opposite: they moved to smaller, more frequent PRs.
A 150-line PR reviewed in 25 minutes and merged the same day beats a 600-line PR that waits three days for a reviewer and another two days for a deployment slot. When AI tools generate more code per session, the unit of review needs to shrink, not grow. The throughput advantage compounds over weeks, not within a single sprint.
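A minimal sketch of that arithmetic, using the illustrative numbers from the example above:

```python
# Lead time per slice of code under the two strategies described above.
# Numbers are the illustrative ones from the text, not measured data.

# Strategy A: four 150-line PRs, each reviewed in ~25 minutes and merged the
# same day, landing on days 1 through 4.
small_pr_landing_days = [1, 2, 3, 4]
avg_small = sum(small_pr_landing_days) / len(small_pr_landing_days)

# Strategy B: one 600-line PR, 3 days waiting for a reviewer + 2 for a deploy slot.
large_pr_landing_day = 3 + 2

print(f"small PRs: first slice live on day 1, average slice live in {avg_small} days")
print(f"large PR:  nothing live until day {large_pr_landing_day}")
```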
Tests that authorise deployment, not humans
If AI-generated code needs a human to validate it before tests run, or if test coverage is sparse enough that a reviewer has to read the code carefully to feel confident — you have not automated the bottleneck. You have moved it inside the developer's session.
Teams getting the most from AI coding have invested in test coverage and test quality, such that AI-generated code that passes the suite earns a degree of automated confidence before any human reviews it. That investment in test infrastructure has to precede the AI tooling payoff; teams that skipped it are collecting the gross gains while the verification tax eats the net.
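What this looks like in practice is a review-routing policy where the test suite, not a human, grants the first level of confidence. A hypothetical sketch; the names and thresholds here are our assumptions, not from any tool or study cited in this piece:

```python
# Hypothetical merge-gate policy: automated signal decides how much human
# review a PR needs. Names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CheckResult:
    tests_passed: bool       # full suite is green
    diff_coverage: float     # fraction of changed lines exercised by tests
    lines_changed: int

def review_tier(check: CheckResult) -> str:
    """Route a PR to a review tier based on automated signal alone."""
    if not check.tests_passed:
        return "blocked"              # humans never review a red build
    if check.diff_coverage >= 0.90 and check.lines_changed <= 200:
        return "lightweight review"   # the suite carries most of the confidence
    return "full review"              # large or under-tested diffs get a human

print(review_tier(CheckResult(tests_passed=True, diff_coverage=0.95, lines_changed=140)))
# -> lightweight review
```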
CI/CD maturity that scales with the new throughput
Deployment, environment provisioning, QA triage: if these require human scheduling and approvals, faster code generation produces a longer queue of code waiting to be deployed rather than faster time to production. The teams seeing delivery-level improvements from AI coding are typically those that had invested in CI/CD maturity before or alongside their AI tooling rollout.
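Queueing theory makes the same point quantitatively. In a toy single-server model of a deploy pipeline with fixed capacity, wait time grows nonlinearly as code arrivals approach that capacity. The numbers below are illustrative, not from any cited dataset:

```python
# Toy M/M/1 queue for a deploy pipeline with fixed capacity.
# Mean time in system: W = 1 / (mu - lambda). Illustrative numbers only.

def avg_days_to_production(prs_per_day: float, deploys_per_day: float) -> float:
    """Mean wait + deploy time, in days, for an M/M/1 queue."""
    if prs_per_day >= deploys_per_day:
        return float("inf")  # arrivals outpace capacity: the queue grows forever
    return 1.0 / (deploys_per_day - prs_per_day)

capacity = 10.0  # deploys per day, fixed by tooling and approvals

for rate in (5.0, 8.0, 9.5):  # code generation speeds up; capacity does not
    print(f"{rate:>4} PRs/day -> {avg_days_to_production(rate, capacity):.2f} days to production")

#  5.0 PRs/day -> 0.20 days to production
#  8.0 PRs/day -> 0.50 days to production
#  9.5 PRs/day -> 2.00 days to production
```

Doubling generation speed against a fixed pipeline does not merely double the wait; past a point, it blows the queue up. That is what "a longer queue of code waiting to be deployed" looks like in a model.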
None of this is surprising in retrospect. Every time a step in the engineering pipeline speeds up significantly, the next step becomes the visible constraint. The thing that is different now is the scale of the speed-up in the generation phase, and how quickly it revealed what had been invisible constraints downstream.
The AI coding tools are not underperforming. The code review workflows, the staging environments, the approval processes, and the deployment pipelines are performing exactly as they always have. That used to be adequate. Now it is the bottleneck.