The take-home coding assignment is dead. Mostly.
AI didn't kill it. AI exposed what was always wrong with it.
Last year, a hiring analytics firm reported that 80% of candidates used a language model to complete a standard top-of-funnel code test, even when explicitly told not to. The company running the study had a choice: crack down on AI usage, or examine why a test solvable by a model in 20 minutes was still measuring anything useful.
Most engineering teams chose the crack-down. The more useful question is the second one.
What take-home coding assignments were actually measuring (and weren't)
The standard take-home format hasn't changed much in a decade: build a small API with a database, some endpoints, maybe authentication. Return it within 48 to 72 hours. Reviewed asynchronously by a senior engineer who spends 30 minutes on it.
What this measures:
- Can you scaffold a project? (Yes. So can any experienced developer. So can any current model, in about 15 minutes.)
- Can you write clean, idiomatic code when you have unlimited time to polish it? (Useful signal, but it's the signal for a very specific kind of work, not most engineering work.)
- Do you know the standard patterns for the tech stack named in the job listing? (Useful, but a weaker proxy than asking about the tradeoffs between those patterns.)
What take-homes don't measure:
- How you handle ambiguity when the spec is incomplete
- How you debug something you didn't write
- Whether you can explain a technical decision to a non-technical stakeholder
- How your work holds up under code review
- Whether you know when to stop building
Most engineering failures in the first six months aren't "can't write code." They're one of those five things. The take-home, even when done well, tells you almost nothing about them.
AI broke the proxy, not the goal
The proxy was: polish plus correctness plus apparent effort equals engineering skill.
That proxy was always approximate. Excellent engineers submit messy take-homes because they treat a 72-hour window as a 4-hour task, which is the correct allocation of effort for a job application. Poor hires submit perfect take-homes because they spend 40 hours on them. The signal was noisy before AI; AI made the noise deafening.
By the end of 2025, an estimated 35% of candidates were using AI assistance on coding assessments, up from 15% six months earlier. That trajectory doesn't reverse. Asking candidates not to use AI tools is both unverifiable and, increasingly, the wrong constraint. AI-augmented development is how the job works.
“The question isn't whether they used AI on the take-home. It's whether the take-home ever measured what you thought it measured.”
The goal is still valid: you want to know whether this person can ship reliable software to your codebase, collaborate well with your team, and handle greater scope over time. Take-homes never measured this well. They felt rigorous, which is a different thing.
Five evaluations that do predict on-the-job performance
These are ranked loosely by signal strength, not by how easy they are to run.
1. Code review on existing code
Give the candidate a pull request to review, ideally from your actual codebase with identifying information removed. Ask them to leave comments as they would in a real review.
This measures judgment about what matters, communication style, ability to critique without being unkind, and domain knowledge. None of these transfer cleanly to a model. You can prompt a model to spot bugs; you cannot prompt it to exercise taste. The comments a good engineer leaves on a PR reveal more about them than 200 lines of their own code.
One honest caveat: this format requires your team to curate examples and score them consistently. It costs more to set up than a take-home. It's worth it.
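To make "curate examples" concrete, here is a minimal, hypothetical sketch of the kind of excerpt worth handing over: code that runs, but that a thoughtful reviewer should push back on. The function and data are invented for illustration, not drawn from any real codebase.

```python
# Hypothetical PR excerpt to hand a candidate for review. It runs,
# but a good reviewer should flag: no validation of the discount,
# a swallowed exception, and mutation of shared module-level state.
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    total: float
    discount_percent: float


ORDERS = {1: Order(1, 100.0, 10.0), 2: Order(2, 250.0, 120.0)}


def apply_discounts(order_ids):
    updated = []
    for order_id in order_ids:
        order = ORDERS.get(order_id)
        try:
            # discount_percent is never validated; 120 drives the total negative.
            order.total = order.total * (1 - order.discount_percent / 100)
        except Exception:
            # Unknown IDs raise AttributeError here and are silently dropped.
            continue
        updated.append(order)
    return updated
```

What you score isn't whether the candidate finds every issue. It's which ones they prioritise, and how they phrase the comments.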
2. Debugging a broken system
Give the candidate a small implementation with a bug: something in the 100- to 200-line range, realistic, with a symptom description but no explanation. Ask them to find and fix it in 45 minutes, on a shared screen with one of your engineers present.
This is close to 70% of what most engineering roles actually involve day to day. You see how they read unfamiliar code, how they think out loud, how quickly they isolate a problem. The live format matters: the engineer present can ask "why" at any point, which cuts through any rehearsed surface.
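A real exercise should be 100 to 200 lines drawn from your own stack. Purely as a toy illustration of the shape, here is a hypothetical planted bug, with the kind of symptom description you'd hand over and nothing more.

```python
# Toy illustration of a planted bug (a real exercise would be longer).
# Symptom given to the candidate: "search results from one request
# sometimes include items from an earlier request."


def collect_matches(records, query, matches=[]):
    # The bug: Python evaluates the default list once, so every call
    # that omits `matches` shares, and keeps appending to, the same list.
    for record in records:
        if query in record:
            matches.append(record)
    return matches


print(collect_matches(["apple pie", "banana"], "apple"))  # ['apple pie']
print(collect_matches(["cherry tart"], "cherry"))         # ['apple pie', 'cherry tart']
```

The fix is a one-liner once the cause is found. The 45 minutes are spent watching how the candidate gets there.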
3. A technical discussion about a past project
Not "tell me about yourself." Specifically: "Pick a technical decision you made in the last 18 months that you'd make differently today. Walk me through it."
This tells you whether they understand the tradeoffs in decisions they've already made, whether they can hold the original decision and the better one in mind at the same time, and whether they're capable of self-evaluation. No model can fabricate this, because you're asking them to bring authentic history that you can then probe.
4. A pairing session on a real problem
Not a contrived puzzle. Ideally, something from your actual backlog — a small feature or a documented bug, with context already written down. One of your engineers works on it with the candidate for 60 minutes.
This is expensive in engineering time. It's also the highest-signal format for most roles. You see communication habits, how they handle uncertainty, how they accept and give suggestions. After 60 minutes of real work together, most experienced engineers can tell you clearly whether they want to work with this person.
5. A structured reference check
Most reference checks are perfunctory because they're run by recruiters as a box-check, not by the engineering manager who'll work with the person. Three targeted questions from the hiring manager return more signal than any take-home review: "What would you trust this person to own independently?" "What would you make sure they had support on?" "What changed about your engineering standards because of working with them?"
References can be coached, but patterns across three calls are hard to fake. And this costs the candidate zero time, which matters more than it used to.
What each format actually costs
| Format | Candidate time | Team time | AI-resistant | Primary signal |
|---|---|---|---|---|
| Take-home (current standard) | 4–15 hrs | 1–2 hrs review | No | Polish, effort |
| Code review exercise | 1–2 hrs | 2–3 hrs setup + scoring | Partly | Judgment, communication |
| Debugging session (live) | 45 min | 1 hr per session | Yes | Debugging, thinking out loud |
| Past project discussion | 30 min | 30 min | Yes | Self-awareness, decisions |
| Pairing session | 1 hr | 1 hr (live) | Yes | Collaboration, communication |
| Structured reference check | 0 | 20 min per reference | N/A | Reputation, growth arc |
The candidate time column matters more than it did in 2023. Senior engineers in 2026 often have competing offers before any assessment has run. A process that asks for 15 hours of unpaid work before any meaningful conversation creates selection pressure toward candidates with more available time, which isn't the filter you want.
The surviving use case for take-home assignments
Take-homes aren't dead in all contexts. They still work when:
- The role attracts very high applicant volume and you need an early filter. Design that filter for AI-augmented work rather than against it: if the job requires using AI tools, test that. Give the candidate the model access they'd have on the job and see what they build.
- The work genuinely requires solo, asynchronous output: writing, design, some kinds of data analysis. A writing prompt or a short analytical task is a legitimate format here. A coding scaffold is not.
- You're pairing the assignment with a required follow-up discussion. The assignment isn't the evaluation. The conversation about it is. "Walk me through the tradeoff you made here" is the actual test.
What doesn't survive: sending a coding problem that any model can solve in 20 minutes, prohibiting AI, and treating the result as a hiring signal. That's not rigour. It's a proxy for a proxy, and the second proxy collapsed.
The engineers adapting to this fastest aren't the ones building increasingly elaborate cheating-detection systems. They're the ones who stopped caring whether a candidate used a model, because they designed evaluations where using one doesn't tell you much either way.