AI in Interviews: What Are We Actually Trying to Measure?
Context
I’ve been thinking a lot about AI in technical interviews lately, mostly because it feels like one of those topics where everyone can sense the ground shifting a little, but not many people are saying the quiet part out loud. I’m not coming at this as someone who has designed hiring systems at a giant company, or someone with a grand theory of interviewing. I’m just an engineer who has spent a few years interviewing candidates in startup environments, where engineers often carry a big part of the hiring process. In places like that, there usually isn’t some huge recruiting machine behind the process. It’s a group of engineers trying to figure out whether this person seems like someone they’d trust to work with, build with, and learn from. That’s part of why AI in interviews feels tricky to me. It changes the environment around the interview, but it does not automatically make the underlying question easier to answer.
The Real Question
A lot of the conversation starts with whether candidates should be allowed to use AI. I understand why that is the immediate point of debate, but I think it skips a more important question. Before deciding whether AI should be allowed, it seems more useful to ask what the interview is actually supposed to measure. Once that gets fuzzy, everything else gets fuzzy with it. The policy gets harder to defend, the task gets harder to interpret, and the evaluation starts drifting away from the thing it was originally meant to capture.
The Ambiguity Problem
What I keep coming back to is that AI is not really just a policy problem. It is a measurement problem. If the goal of an interview is to measure unaided engineering fundamentals, that seems completely valid to me. There is still real value in understanding how a candidate thinks through a problem without outside assistance, how they reason from first principles, how they communicate under pressure, and whether they can write and explain code without leaning on a tool to carry the process for them. If the goal is to measure how someone works in a world where AI tools are increasingly part of normal engineering work, that also seems valid. But those are not the same interview, and I think a lot of teams are drifting into a blurry middle ground where they are no longer fully sure which one they are running.
That blurry middle ground is what feels most problematic to me. If you allow AI, but keep the same small, self-contained task that was originally meant to test unaided problem solving, I’m not sure you’re preserving the same signal. You may just be making it easier for the exercise to measure prompt-to-output translation instead of engineering judgment. At the same time, if you disallow AI but know that it is part of the broader reality candidates are operating in, the interview can start to feel like a strange half-truth where the written rule says one thing and the actual environment says another. That mismatch is what makes the process feel unstable. The problem is not just whether AI is present. The problem is that once the interview is no longer clear about what it wants to measure, the rest of the process starts losing coherence too.
Where the Bar Moves
One thing I’ve changed my mind on is the idea that allowing AI automatically lowers the bar. I don’t think that’s quite right. I think the bar can stay high, but the location of the signal changes. Without AI, a lot of the value in a coding interview comes from direct production. Can the candidate work through the task, keep the structure in their head, make reasonable tradeoffs, and finish something coherent? With AI, especially in remote settings, some of that production becomes much easier to outsource. If the task stays too basic, the interview risks becoming less about whether the candidate can actually engineer and more about whether they can get plausible output from a model. That does not mean the signal disappears. It means the signal moves into different places: whether they know when AI is useful, whether they can tell when the output is wrong, whether they challenge it instead of accepting it passively, whether they verify what it gives them, and whether they stay engaged with the actual problem instead of disappearing behind the tool. That still feels like real engineering signal to me, but only if the interview is designed to surface it.
If AI Is Allowed
That is why I don’t think “allow AI” can just be a small policy update on top of an otherwise unchanged process. If the task and rubric were originally built around unaided problem solving, then allowing AI without changing anything else probably weakens the interview more than it improves it. To make that kind of interview work, it seems like a team would have to be much more explicit about what kinds of AI usage are actually acceptable and what kinds are not. Maybe using AI for boilerplate, test generation, API lookups, or brainstorming edge cases is fine. Maybe silently leaning on it as a teleprompter is not. Maybe the expectation is that the candidate should say when they are using it, what they are asking it to do, and why. Whatever the exact rules are, the important part is that they should not be left implicit.
Beyond policy, the task itself would probably have to become more realistic and harder to brute-force through generation alone. Less “add this straightforward CRUD endpoint” and more “navigate this small codebase, deal with incomplete requirements, make tradeoffs, catch what’s off, and explain your choices.” For design interviews, that might mean moving away from the most familiar toy prompts and pushing more into prioritization, operational tradeoffs, migration concerns, observability, or failure modes. The point is not to make the interview larger for the sake of making it harder. The point is that once generation becomes cheaper, the interview has to lean more heavily on judgment, verification, and ownership. Otherwise I think you end up preserving the appearance of rigor while collecting less actual signal.
If AI Is Not Allowed
At the same time, I don’t think the answer is that every team should now allow AI and move on. There are still good reasons not to. If what you really want to know is whether someone has a solid grasp of fundamentals without tool assistance, then it makes sense to say that clearly and measure it directly. I don’t think there is anything outdated about wanting to evaluate core reasoning that way. What matters is not whether the policy is permissive or restrictive. What matters is whether the policy matches the skill being measured. Candidates should know what kind of interview they are walking into, and interviewers should know what they are actually trying to reward. Otherwise everyone is operating inside a vague social contract that gets harder to interpret the moment AI enters the room.
Why Surveillance Falls Short
I’m also skeptical that surveillance really fixes much. One reaction to all of this is to add more guardrails: more screen sharing, more environment checks, more ways of making sure the candidate is not getting outside help. Maybe that helps a little at the margins, but I don’t think it solves the actual problem. Remote interviews will always have too many ways around strict enforcement. There’s always another screen, another device, another off-camera assist, another workaround. Once you start chasing all of that, the process can get weirdly adversarial, and even then I’m not convinced it improves the most important hiring signal. The strongest signals I’ve seen have rarely been “this person produced a flawless answer under perfect observation.” They have usually been more human than that: whether the person asks good questions, thinks out loud, explains why they are taking a particular path, adjusts when something goes wrong, and seems to actually own the work in front of them. Those things matter whether AI is allowed or not, and a lot of suspicious interview performance already shows up there anyway. Long unexplained pauses, sudden jumps to fully formed answers with weak explanation, and code or design choices a candidate cannot really defend tend to tell me more than whether their desktop looks clean.
What I Still Care About
The thing I still care about most in an interview is whether the candidate feels intellectually present. I don’t mean perfect, polished, or linear. I mean present in the problem, present in the tradeoffs, and present in the conversation. Do they engage with ambiguity instead of trying to route around it? Do they show me how they think? When they make a mistake, do they adjust? When something looks off, do they notice? When something is unclear, do they ask? That matters a lot more to me than whether the final artifact looks especially clean. If anything, AI may just be making it harder to ignore that this was always a more valuable signal than polished output on its own.
Choosing Deliberately
For smaller companies, especially the ones where engineers are still carrying a lot of the hiring decision, I don’t think there is a perfect answer here. I do think there is a need to be more explicit than many teams have been. If a team wants to measure unaided fundamentals, it should do that and say so clearly. If it wants to measure AI-assisted engineering, it should allow that explicitly and redesign the task and rubric so the signal comes from judgment, verification, and tool use rather than raw output. What seems least healthy is the hybrid version where AI is unofficially present, vaguely disallowed, inconsistently detected, and never really accounted for in the evaluation. That seems like the version most likely to confuse candidates, frustrate interviewers, and quietly erode the value of the interview itself.
Closing Thought
I don’t think AI has made technical interviews useless, and I also don’t think it has handed teams an obvious new policy. What it seems to be doing instead is putting pressure on a part of the process that was already easy to leave vague. It is forcing a more honest answer to a basic question: what are we actually trying to measure here? For teams where engineers still make a large part of the hiring call, that probably matters more than whether the official rule ends up being “allowed” or “not allowed.” The harder part is making sure the interview, the rubric, and the expectations all point at the same thing.