AI in Testing vs AI in Cheating: What Roy Lee’s Interview Coder Reveals
The Incident
In March 2025, 21-year-old Columbia University student Chungin “Roy” Lee made headlines for an unusual reason: he had built an AI tool to assist candidates in live coding interviews. His Interview Coder became a controversial example of an AI cheating system. The tool provides real-time answers to coding questions during live interviews while remaining undetectable to screen-recording software. Roy claims he used it to secure internship offers from major companies like Amazon, Meta, and TikTok.
However, the issue escalated when he himself shared videos on social media showing how he had deceived these companies. The companies withdrew their offers, and Columbia University suspended Lee for one year, leading him to drop out to focus on his startup.
“Everyone programs nowadays with the help of AI. It doesn’t make sense to have an interview format that assumes you don’t have the use of AI.”
— Roy Lee
That is the challenge Roy is raising. He says he chose this path after getting tired of spending over 600 hours memorizing questions for platforms like LeetCode. Following Roy Lee’s challenge, companies including Google have implemented AI-resistant assessments and are returning to in-person interviews to prevent AI cheating.
However, Roy firmly insists:
“I don’t feel guilty at all for not catering to a company’s inability to adapt.”
— Roy Lee
Are you testing the candidate’s skill, or their cleverness in using a cheating AI tool?
The Tool, the Suspension, and the $20.3M That Followed
Roy’s old tool later turned into a new startup called Cluely, which raised a total of around $20.3 million from investors including Abstract Ventures, Susa Ventures, and Andreessen Horowitz. Cluely initially marketed itself as a platform to help users cheat on everything, including job interviews, sales calls, and even exams.
Columbia University’s initial allegation against Roy was that he advertised a link to his AI cheating tool, marketed specifically for technical job interviews, out of fear that it could be misused in classroom environments. The stricter action came after he leaked confidential documents related to his disciplinary proceedings without permission. Even before these changes, Lee had presented Interview Coder as generating substantial annual revenue; he later admitted that the large revenue figures were fabricated for viral marketing.
Caution: Wait A Moment
We can’t just dismiss Roy’s story as someone else’s fault. It’s time to examine our own systems for similar loopholes. Lee argues that flaws in current recruitment methods are what his tool exposes. That connects directly to broader discussions on AI in testing systems and how evaluation methods may fail under modern tools.
There is growing concern around AI cheating in schools and whether existing systems are prepared for it. Testers have long complained that “LeetCode”-style interviews test memorization more than engineering skill. But Lee did not solve that problem; he bypassed the hurdle entirely.
For example, if a login button does not work, we notice it quickly; the bug is easy to find and fix. But what happens if an AI hallucination causes a test case to pass all its assertions, yet the code fails on an edge case in production? We would assume everything is safe when it is not. This too is a kind of deception: not intentional, but one that leaves us unaware of what is really going on.
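As a minimal sketch of that failure mode (all names here are hypothetical, not from any real suite), consider a discount function with a subtle truncation bug. A generated test that only probes round numbers passes cleanly, so the pipeline stays green while production users on fractional discounts get wrong totals:

```python
def apply_discount(price, percent):
    # Buggy implementation: integer floor-division silently
    # truncates fractional discounts instead of rounding.
    return price - price * percent // 100

def test_apply_discount_happy_path():
    # A generated suite often sticks to "clean" inputs, so this passes:
    assert apply_discount(200, 10) == 180
    assert apply_discount(100, 50) == 50

test_apply_discount_happy_path()

# The edge case a human tester would probe is never asserted:
# apply_discount(99, 10) returns 90, while the correct result is 89.1.
print(apply_discount(99, 10))
```

The suite is green, yet the behavior a real user hits is wrong; the assertions describe the model’s assumptions, not the product’s requirements.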
Why This Matters for Testers
There is a strong similarity between AI in cheating systems and AI in testing workflows. It is not just a metaphor, but a similarity in how they function. AI tools used for cheating in interviews give answers that satisfy the interviewer, but the interviewer does not know how that answer was produced.
Our AI testing methods are similar. AI may pass coverage metrics, but it does not think about how real users will actually use the system. Most failures do not happen because of technical breakdowns. Rather, they happen because fundamental risks are overlooked. Don’t we often hear about AI cheating in colleges? It’s the same situation here. When we blindly trust automation instead of understanding what we’re doing, real knowledge is lost.
- Outcome over Process: Interviewers only check correct results. The same applies to CI/CD pipelines relying on AI in testing outputs.
- Undetectable Gaps: Tools built for AI cheat scenarios can bypass monitoring systems. Similarly, AI-generated test suites can miss hidden flows.
- Competitive Race: Security and monitoring systems evolve against cheating AI tools, just like QA systems evolve against technical debt and automation blind spots.
Is AI-Generated Testing Lazy or Just Efficient?
The founders of Cluely argue that tools once considered cheating, like calculators, spellcheck, and GPS, are now normal. This connects directly with debates around how to prevent AI cheating in schools, where tools shift from misuse to acceptance over time.
This also applies to test automation. Using AI to generate test cases quickly is not wrong. The real concern is what we lose as speed increases. When an AI writes test cases, the reasoning behind them is no longer yours but the model’s, and understanding is quietly replaced by output. It may miss edge cases such as what happens when a session token expires mid-way through a financial transaction. What it produces may look correct, yet still miss critical scenarios.
The real risk is not laziness. Rather, it is blindly trusting answers provided by cheating AI tools without even bothering to verify them.
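The token-expiry scenario above can be sketched as the kind of test a generated suite tends to omit. Everything here (the `Session` class, the `transfer` function) is a hypothetical toy, not a real API; the point is that the critical assertion is about timing, not about the happy path:

```python
import time

class Session:
    """Toy session holding a short-lived token (hypothetical API)."""
    def __init__(self, ttl_seconds):
        self.expires_at = time.monotonic() + ttl_seconds

    def is_valid(self):
        return time.monotonic() < self.expires_at

def transfer(session, amount):
    # A real system must re-check validity at commit time,
    # not only when the request first arrives.
    if not session.is_valid():
        raise PermissionError("session expired")
    return {"status": "committed", "amount": amount}

def test_transfer_rejects_expired_session():
    # The edge case: the token has already expired when we commit.
    session = Session(ttl_seconds=0)
    try:
        transfer(session, 100)
        assert False, "expired session should be rejected"
    except PermissionError:
        pass

test_transfer_rejects_expired_session()
```

A generated suite that only exercises `transfer` with a fresh session would pass forever without ever proving the expiry check exists; a human reviewer has to ask for this case by name.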
Read: Can You Trust an AI That Can’t Explain Its Decisions?
What Are We Actually Testing Here?
Roy Lee’s argument that coding interviews do not measure real engineering ability is reflected in his broader criticism of LeetCode-style interview preparation. This same question appears in discussions around AI cheating in exams and whether evaluation systems measure knowledge or just performance.
- Are your tests actually evaluating the real behavior of the software, or just its outputs?
- Is your pass rate an indicator of quality, or just of how well those test cases were written?
- If your test cases are generated by AI, have you ever paused and asked: “Is this really how our users use the system?”
Trusting only visible results without checking the underlying logic is a serious risk in both QA and AI cheating in education contexts.
Read more about this: Trusting AI Test Automation: Where to Draw the Line
Where Should We Draw the Boundary Line?
Roy Lee demonstrated that evaluation systems often test the wrong aspects and that AI cheat tools can still pass them. As recruitment shifts toward AI-resistant interview formats such as live architectural discussions or debugging sessions, the boundary is moving from “do you use AI?” to “how do you use it?”
This does not mean QA teams should avoid AI. It means we need to be careful about what we ask AI to test and why. The key difference between AI in cheating and legitimate automation lies in intent, review, and accountability.
What are your test suites actually proving?
And where do you think the line should be drawn?




