
AI in Testing vs AI in Cheating: What Roy Lee’s Interview Coder Reveals

Key Takeaways:
  • Tools like Interview Coder show how fragile evaluation systems are when they prioritize performance over understanding.
  • The same issue exists in AI cheating in exams, where success does not always reflect knowledge.
  • Whether in interviews or QA, results may look clean, but the underlying logic can still be wrong.
  • The goal is not to avoid AI, but to use it intelligently while addressing the risk of invisible skill gaps, the same risk raised in discussions of AI cheating in schools.
  • Teams that adopt automation without understanding their test logic reinforce the same blind trust that leads to critical production failures. As the recent Cluely revenue controversy showed, blind trust in AI-generated metrics often masks deep underlying inaccuracies.

The Incident

In March 2025, 21-year-old Columbia University student Chungin “Roy” Lee made headlines for an unusual reason: he had built an AI cheat tool to assist candidates in live coding interviews. His Interview Coder quickly became a controversial example of an AI cheating system. The tool provides real-time answers to coding questions during live interviews while remaining undetectable to screen-recording software. Roy claims he used it to secure internship offers from major companies including Amazon, Meta, and TikTok.

However, the issue escalated when he himself shared videos on social media showing how he had deceived these companies. The companies withdrew their offers, and Columbia University suspended Lee for one year, leading him to drop out to focus on his startup.

“Everyone programs nowadays with the help of AI. It doesn’t make sense to have an interview format that assumes you don’t have the use of AI.”

Roy Lee

That is the argument Roy is making. He says he chose this path after getting tired of spending over 600 hours memorizing questions on platforms like LeetCode. Following Roy Lee’s challenge, companies including Google have already implemented AI-resistant assessments and are returning to in-person interviews to prevent AI cheating.

However, Roy firmly insists:

“I don’t feel guilty at all for not catering to a company’s inability to adapt.”

Roy Lee

Are you testing the candidate’s skill, or their cleverness in using a cheating AI tool?

The Tool, the Suspension, and the $20.3M That Followed

Roy’s old tool later evolved into a new startup called Cluely, which raised around $20.3 million from investors including Abstract Ventures, Susa Ventures, and Andreessen Horowitz. The startup initially marketed itself as a platform to help users “cheat on everything,” including job interviews, sales calls, and even exams.

The initial allegation Columbia University raised was that Lee advertised a link to his AI cheating tool, built specifically for technical job interviews, out of fear that it could be misused in classroom environments. Stricter action followed because he leaked confidential documents from his disciplinary proceedings without permission. Even before these developments, Lee had presented Interview Coder as generating substantial revenue every year. He later admitted that the large revenue figures were lies meant for viral marketing.

Caution: Wait A Moment

We can’t just dismiss Roy’s story as someone else’s fault; it’s time to examine our own systems for similar loopholes. Lee argues that his tool exposes flaws in current recruitment methods. That connects directly to broader discussions of AI in testing systems and how evaluation methods can fail under modern tools.

There is growing concern around AI cheating in schools and whether existing systems are prepared for it. Engineers have long complained that “LeetCode-style” interviews test memorization more than engineering skill. But Lee did not solve that problem; he bypassed the hurdle entirely.

For example, if a login button does not work, we can quickly understand it; that bug is easy to find and fix. But what happens if an AI hallucination produces a test case that passes all assertions, yet the code fails on an edge case in production? We would assume everything is safe when, in reality, it is not. This too is a kind of deception: not intentional, but one that keeps us unaware of what is really going on.

Why This Matters for Testers

There is a strong similarity between AI in cheating systems and AI in testing workflows. It is not just a metaphor; they function in similar ways. AI tools used for cheating in interviews give answers that satisfy the interviewer, but the interviewer does not know how those answers were produced.

Our AI testing methods are similar. AI may satisfy coverage metrics, but it does not think about how real users will actually use the system. Most failures do not happen because of technical breakdowns; they happen because fundamental risks are overlooked. Don’t we often hear about AI cheating in colleges? It’s the same situation here: when we blindly trust automation instead of understanding what we’re doing, real knowledge is lost.

From a QA perspective, Roy Lee’s story highlights three matters:
  • Outcome over Process: Interviewers only check correct results. The same applies to CI/CD pipelines relying on AI in testing outputs.
  • Undetectable Gaps: Tools built for AI cheat scenarios can bypass monitoring systems. Similarly, AI-generated test suites can miss hidden flows.
  • Competitive Race: Security and monitoring systems evolve against cheating AI tools, just like QA systems evolve against technical debt and automation blind spots.

Is AI-Generated Testing Lazy or Just Efficient?

The founders of Cluely argue that tools once considered cheating, like calculators, spellcheck, and GPS, are now normal. This connects directly with debates around how to prevent AI cheating in schools, where tools shift from misuse to acceptance over time.

This also applies to test automation. Using AI to generate test cases quickly is not wrong. The real concern is what we lose as speed increases. When an AI writes test cases, the reasoning behind them is no longer yours but the model’s, and understanding gets replaced by output. It may miss edge cases, such as what happens when a session token expires in the middle of a financial transaction. What it produces may look correct, yet it still misses critical scenarios.
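The session-expiry scenario above can be sketched in a few lines. Everything here is an assumption for illustration (`Session`, `transfer`, and `SessionExpired` are invented, not part of any real framework); the point is that the human-written edge-case test at the bottom is exactly the kind a generated suite tends to omit.

```python
import time

# Hypothetical names throughout -- a minimal sketch, not a real API.
class SessionExpired(Exception):
    pass

class Session:
    def __init__(self, ttl_seconds: float):
        # Monotonic clock avoids surprises from system clock changes.
        self.expires_at = time.monotonic() + ttl_seconds

    def is_valid(self) -> bool:
        return time.monotonic() < self.expires_at

def transfer(session: Session, amount: float) -> str:
    # A real transfer must re-check the session right before committing,
    # not only at the start of the request.
    if not session.is_valid():
        raise SessionExpired("token expired mid-transaction")
    return f"transferred {amount:.2f}"

# The edge-case test an AI-generated suite tends to skip:
def test_transfer_with_expired_token():
    expired = Session(ttl_seconds=-1)  # already expired
    try:
        transfer(expired, 500.0)
        assert False, "expected SessionExpired"
    except SessionExpired:
        pass  # correct: the transfer was refused
```

Generated happy-path tests would exercise `transfer` with a fresh session and stop there; the expiry mid-transaction path only gets covered when someone thinks to ask for it.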

The real risk is not laziness. It is blindly trusting the answers AI tools provide without even bothering to verify them.

Read: Can You Trust an AI That Can’t Explain Its Decisions?

What are We Actually Testing Here?

Roy Lee’s argument that coding interviews do not measure real engineering ability is reflected in his broader criticism of LeetCode-style interview preparation. This same question appears in discussions around AI cheating in exams and whether evaluation systems measure knowledge or just performance.

This raises another concern often asked in education systems: How accurate are AI checkers when detecting generated or manipulated answers? This same question needs to be asked by every QA team:
  • Are your tests actually evaluating the real behavior of the software, or just its outputs?
  • Is your pass rate an indicator of quality, or just of how well those test cases were written?
  • If your test cases are generated by AI, have you ever paused and asked: “Is this really how our users use the system?”

Trusting only visible results without checking the underlying logic is a serious risk in both QA and AI cheating in education contexts.

Read more about this: Trusting AI Test Automation: Where to Draw the Line

Where Should We Draw the Boundary Line?

Roy Lee demonstrated that evaluation systems often test the wrong aspects, and that AI cheat tools can still pass them. As recruitment shifts toward AI-resistant interview formats, such as live architectural discussions or debugging sessions, the boundary is moving from “do you use AI” to “how do you use it.”

This does not mean QA teams should avoid AI. It means we need to be careful about what we ask AI to test and why. The key difference between AI in cheating and legitimate automation lies in intent, review, and accountability.

What are your test suites actually proving?

And where do you think the line should be drawn?
