The AI wrapper debate, three years in: what the survivors built
Most AI companies launched in 2022-2023 are gone. A few survived. Here is what they built that the others didn't.
The theory sounded solid in early 2023. Incumbents were slow, GPT-4 was fast, and a small team could ship a product in weeks that a large software company would take two years to plan. Thousands of companies built interfaces on top of OpenAI APIs and called themselves AI companies. The pejorative came later: AI wrappers.
Three years on, the AI wrapper debate has a body of evidence. Many of those companies are gone. A handful survived and grew into something durable. The split was not random, and the pattern is specific enough to be useful.
The theory that launched a thousand wrappers
The bet made sense given the state of enterprise software in 2022. Large companies had accumulated years of technical debt, slow release cycles, and procurement processes designed for an earlier era. A three-person team using GPT-4 could ship a legal contract reviewer in a month that would take a legal-tech incumbent three years to plan and another year to ship. That speed gap was real.
Demand confirmed it. Jasper hit $80M ARR in 2022 writing marketing copy on top of GPT-3. Numerous document analysis startups got to $5-10M ARR in months on the strength of GPT-4 reading PDFs. Specialised legal AI tools, HR document processors, and sales prospecting products all showed genuine early traction.
Investors read the growth rates and concluded that first-mover advantage would compound: the company with the most users would collect the most feedback, ship the fastest improvements, and build a data flywheel the model provider could not replicate. It was a coherent theory. The flaw was not in the premise. It was in the assumption that the model would stay where it was.
What the foundation models absorbed
The erosion happened in stages, which made it difficult to see clearly until it was already complete.
GPT-4 Turbo arrived in late 2023, better at precisely the tasks that GPT-3-era wrapper companies had built their positioning around. GPT-4o was cheaper and faster, compressing the margins that made the wrapper economics work. By the time GPT-5 shipped in 2025, it was closing capability gaps that 2023 AI companies had relied on as differentiation.
The mechanism is direct: when the foundation model improves faster than the moat built around it, the wrapper gets squeezed from both sides. The model gets better at the core task. The interface premium shrinks. The reason a user would pay for the wrapper instead of accessing the model directly quietly disappears.
The categories most affected illustrate the pattern: general writing and editing tools, basic document summarisation with Q&A, generic SEO content tools, and coding suggestion products without deep codebase integration. All real markets in 2022. By 2025, each had effectively become a feature in a foundation model product.
The three patterns that did not get absorbed
Among the companies with durable revenue in 2026 (not just those that raised money or are still operating, but those with growing enterprise customers and improving unit economics), three patterns separate the survivors from the companies that didn't make it.
Proprietary data and workflow integration
Harvey AI built a legal AI platform not by writing better prompts than a solicitor using ChatGPT, but by integrating deeply with matter management systems, drafting tools, and internal knowledge bases at law firms. The platform maintains context across long-running cases, surfaces relevant precedent from the firm's own work, and runs on models trained with legal datasets the company spent years acquiring.
A solicitor cannot replicate that workflow with a browser tab. The data and the system integrations are the product. The same pattern holds across vertical AI companies that have survived: the models are a component; the workflow integration and proprietary data are what the customer is actually purchasing.
Agentic infrastructure rather than interface
LangChain's arc is instructive. As a library for chaining LLM calls, it was commoditised quickly by competing frameworks and by model providers offering similar primitives. The library wasn't defensible because any competent engineering team could build what it did in a week.
LangGraph changed the picture. As a stateful, graph-based orchestration framework for multi-step AI agents — with persistent state, human-in-the-loop pause points, and support for complex branching workflows — it moved from library to infrastructure. Engineering teams that built production agents on top of it were not going to rewrite them for a competing framework without a compelling reason. Infrastructure has switching costs that a library does not.
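The shape of that shift from library to infrastructure is easy to see in miniature. The sketch below is illustrative plain Python, not LangGraph's actual API (the `Graph`, `run`, and `approve` names and the in-memory checkpoint dict are all hypothetical stand-ins): a graph of nodes with per-thread persisted state and an interrupt point that pauses execution until a human approves, then resumes from the checkpoint.

```python
END = "__end__"

class Graph:
    """Toy stateful agent graph: nodes, routed edges, checkpoints, pause points."""

    def __init__(self, interrupt_before=()):
        self.nodes = {}                    # node name -> fn(state) -> new state
        self.edges = {}                    # node name -> fn(state) -> next node name
        self.entry = None                  # first node added is the entry point
        self.interrupt_before = set(interrupt_before)
        self.checkpoints = {}              # thread_id -> (next_node, state); stands in for a database

    def add_node(self, name, fn, route):
        self.nodes[name] = fn
        self.edges[name] = route
        if self.entry is None:
            self.entry = name

    def approve(self, thread_id):
        # A human signs off; mark the persisted state so the run can proceed.
        node, state = self.checkpoints[thread_id]
        self.checkpoints[thread_id] = (node, {**state, "approved": True})

    def run(self, thread_id, state=None):
        # Resume from this thread's last checkpoint if one exists.
        node, state = self.checkpoints.get(thread_id, (self.entry, state or {}))
        while node != END:
            if node in self.interrupt_before and not state.get("approved"):
                # Pause for human review; the checkpoint survives until resume.
                self.checkpoints[thread_id] = (node, state)
                return {"paused_at": node, **state}
            state = self.nodes[node](state)
            node = self.edges[node](state)
            self.checkpoints[thread_id] = (node, state)
        return state

# Wire up a two-step agent: draft a reply, then pause for approval before sending.
g = Graph(interrupt_before=["send"])
g.add_node("draft", lambda s: {**s, "draft": "Reply to: " + s["query"]},
           route=lambda s: "send")
g.add_node("send", lambda s: {**s, "sent": True},
           route=lambda s: END)

paused = g.run("thread-1", {"query": "refund request"})  # stops before "send"
g.approve("thread-1")                                    # human signs off
done = g.run("thread-1")                                 # resumes from the checkpoint
```

The switching cost lives in exactly this machinery: once production agents depend on persisted checkpoints and pause-and-resume semantics, moving to another framework means re-implementing the state model, not just swapping a library call.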
Deep workflow integration with real switching costs
Cursor (Anysphere) reported $2B ARR by February 2026, one of the fastest trajectories to that number in B2B software. The reason has little to do with code generation quality — every major IDE has competitive AI code generation — and a great deal to do with what the product actually is.
Cursor is an IDE that understands the full codebase as context, runs multi-file edits in a Composer view, and builds a working model of the team's patterns and conventions over time. That context accumulates. Switching to a different editor means giving it up. A better foundation model makes Cursor's outputs better; it does not make Cursor less useful.
| Company | Pattern | 2026 status | Core reason |
|---|---|---|---|
| Cursor (Anysphere) | Full-codebase context + multi-file workflow | $2B ARR, reported Feb 2026 | The editor is the context layer; switching means losing it |
| Perplexity | Proprietary search index + publisher partnership network | Profitable, defended position vs ChatGPT Search | Built a different content economy, not a better chat interface |
| Harvey AI | Legal domain data + matter management integration | Enterprise contracts, major law firms | Datasets and integrations are not replicable with generic models |
| LangChain / LangGraph | Agentic orchestration infrastructure | Platform role in most enterprise AI agent stacks | Moved from library to infrastructure before commoditisation |
| Jasper | General writing interface, no proprietary data moat | Substantially reduced scale, pivoting to enterprise workflows | Core use case absorbed by native ChatGPT and Gemini interfaces |
The one question that sorts wrappers from products
“When the underlying model gets ten times better for free, does your value proposition get stronger or weaker?”
For Cursor, a better foundation model means Cursor gets better. The editor, the context layer, the multi-file workflow, and the team habits accumulated over months of use all remain intact. The improvement accrues to the product.
For a generic writing assistant with no proprietary data or workflow integration, a better model means the user can get equivalent output by opening ChatGPT. The improvement accrues to the model provider.
This test is the one worth tracking when evaluating AI product defensibility. ARR trajectory and growth rate both matter. But neither tells you whether the business can survive the next model release.
What this means if you're building on top of an LLM today
Building on top of foundation models is not the failure mode. The failure mode is making access to the model the product.
The companies that survived three years of improving models share one characteristic: the LLM is an input to their product, not the core of it. What they built around it — proprietary data, deep integrations, accumulated user context, agentic workflows that maintain state across time — creates something the model cannot absorb, because the model does not have access to it.
The same logic applies to B2B SaaS products adding AI features now. Features built on top of existing data, internal workflows, and customer relationships get stronger as models improve. Features built primarily as 'chat with your documents' face the same squeeze the 2022 wrappers encountered: as the model improves, the feature becomes easier for any competitor to replicate without building anything differentiated at all.
The foundation model keeps getting better. Whether you are building something that gets better with it, or something it is getting better past, is a question worth answering now.