How to Handle TDD with AI
If you’ve written application code, you know that unit testing is an important part of the job. But if you’ve really worked with code, you’ll also agree that writing tests for everything is tedious; you’d rather just get to the coding and be done with it. And you’ll probably agree, too, that nothing good ever came of cutting corners (think production bugs!).
That brings us to an interesting way of writing code: Test-Driven Development (TDD).
Use your requirements to lay out tests, then write code to make those tests pass. While it’s a beautiful, clean concept in theory, it isn’t so easy in practice; many even find it counterproductive. And as we’ve moved deeper into the Agile and AI era, everyone just wants quick fixes.
So then, is TDD dying? Is AI replacing it?
Let’s find out…
Let’s brush up on our understanding of TDD first.
What is TDD?
You’ve probably heard that “testing is important” in software development. But what if I told you that the secret to writing better code, with fewer bugs and a clearer plan, isn’t just about testing after you write the code? It’s about testing before you even start!
That’s the core idea behind Test-Driven Development (TDD). It’s a disciplined approach where you write automated tests for a piece of functionality before you write the actual code that implements it. Sounds a bit backward, right? Let’s break down why it’s incredibly powerful.
The TDD Approach: Red, Green, Refactor
TDD follows a simple, yet highly effective, three-step cycle:
- Red (Write a Failing Test): You start by writing a small, specific test for a feature you want to add or a bug you want to fix. At this point, the code for that feature doesn’t exist, so this test is guaranteed to fail. This “red” state tells you, “I have a new requirement, and the system isn’t meeting it yet.”
- Green (Write Just Enough Code to Pass): Next, you write the minimum amount of code needed to make that failing test pass. Don’t worry about perfect design or extra features at this stage; just get the test to turn “green.” This confirms that your code now meets that specific requirement.
- Refactor (Clean Up Your Code): Once your test is green, you know your code works. Now, you can confidently clean up and improve the code you just wrote. This might mean making it more readable, efficient, or removing any duplication. Because your tests are still running and turning green, you have a safety net, making sure you don’t accidentally break anything while making improvements.
And then? You repeat the cycle for the next small piece of functionality.
Why does this “backward” approach work?
- Clarity and Design: By writing the test first, you’re forced to think about what the code should do from an outside perspective. This often leads to simpler, cleaner designs because you’re designing for testability.
- Early Bug Detection: Catching bugs immediately, often within minutes of writing the code, is far cheaper and easier than finding them weeks or months later.
- Confidence to Change: A robust suite of tests acts as a safety net. When you need to add new features or refactor existing code, you can run your tests to quickly confirm that your changes haven’t introduced regressions (new bugs in old features).
- Executable Documentation: Your tests become a living, breathing set of examples of how your code is supposed to behave. Anyone looking at the test suite can understand the intended functionality.
Test Driven Development Example
Let’s imagine we’re building a simple “Calculator” application. We want to add a function that can add two numbers together.
Phase 1: Red (Write a Failing Test)
```python
def test_add_numbers():
    assert add(2, 3) == 5
```
We try to run this test. It fails! Why? Because there’s no add function yet. This is exactly what we want – the “red” light telling us to get to work.
Phase 2: Green (Write Just Enough Code)
```python
def add(a, b):
    return a + b
```
We run the test again. It passes! The light is green. We’ve successfully implemented the core “add” functionality.
Phase 3: Refactor (Clean Up)
For this simple example, there might not be much to refactor immediately, but imagine our add function was more complex, perhaps involving string parsing or error handling. In the refactor step, we’d look at our add function and ask:
- Can it be made more readable?
- Is there any duplicate code?
- Is it as efficient as it could be?
We might rename a variable or extract a small helper function. Crucially, after any refactoring, we’d run all our tests again to ensure we haven’t broken anything.
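Here’s a hypothetical sketch of what that refactor might look like if add had grown to accept numeric strings; the helper name _to_number is illustrative, not part of the original example:

```python
def _to_number(value):
    # Hypothetical helper extracted during refactoring: converts numeric
    # strings like "3" or "2.5" into numbers, and leaves numbers untouched.
    return float(value) if isinstance(value, str) else value

def add(a, b):
    # Behavior is unchanged; the parsing logic now lives in one place.
    return _to_number(a) + _to_number(b)
```

Because the behavior hasn’t changed, test_add_numbers should still pass after the refactor.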
Repeat!
Now that add is done, we’d move on to the next feature, perhaps “subtract two numbers,” and repeat the entire Red-Green-Refactor cycle.
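That next red test might look like this (a sketch; subtract doesn’t exist yet, so it fails by design):

```python
def test_subtract_numbers():
    # Red: this fails until subtract() is implemented.
    assert subtract(5, 3) == 2
```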
Related Reads:
- What is Test Driven Development? TDD vs. BDD vs. SDD
- TDD vs BDD – What’s the Difference Between TDD and BDD?
- ATDD and TDD: Key Dos and Don’ts for Successful Development
Intersection of TDD in Software Engineering and AI
Every tool and every IDE is making sure to ride the AI wave, and you can even find AI libraries to incorporate into your workflows. What do these AI add-ons do? Just about everything: help you write code, create test cases, record test cases, analyze test cases, prepare insightful reports, and more. Many of these AI assistants, like GitHub Copilot, are powered by Large Language Models (LLMs) that generate output based on user prompts.
It was only a matter of time before someone applied them to TDD. After all, TDD isn’t perfect. It has its own shortcomings: a steep learning curve, slower initial development, difficulty with legacy code, test maintenance overhead, and a rigidity that sits awkwardly in a world that wants the process to be more flexible.
Pitfalls of Relying on AI in the TDD Approach
Using AI, especially generative AI, for coding looks very helpful on the surface, but it isn’t very reliable. Here’s why.
AI’s Tendency to “Hallucinate” or Over-engineer
AI code assistants can generate plausible-looking test cases and code for you. But here’s the thing: you don’t know what material it learned those semantics and test cases from. So what the AI offers may still be incorrect, inefficient, or overly complex. It may hand you extra code that isn’t part of the requirement but that it “thinks” you should use. You might even receive tests and code that don’t work with one another, leaving you to correct both and make them fit your application.
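To make this concrete, here’s a hypothetical illustration (not real output from any particular assistant) of the over-engineering problem: you ask for one test of add, and you get back assertions nobody asked for:

```python
def test_add_numbers():
    assert add(2, 3) == 5        # the actual requirement
    assert add(-2, 3) == 1       # unrequested, though harmless
    assert add("2", "3") == 5    # invented: silently coerces strings
    assert add(None, 3) == 3     # invented: treats None as zero
```

The last two asserts quietly expand add’s contract; accept them uncritically and you’re now committed to behavior you never specified.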
You might like: What are AI Hallucinations? How to Test?
Working with AI’s Non-Deterministic and Blackbox Nature
Traditionally, we write both the tests and the code. Everything is predictable and straightforward: your tests, your code. All the control is in your hands, and that is the essence of TDD, steering your code through well-thought-out tests.

The moment you bring AI into the picture, you add a non-deterministic variable. It is hard to explain why the AI did what it did; it’s like pair programming with a partner who cannot give you reasons. You might hand it precise specifications and still end up with code or test functions you don’t need. That defeats the purpose of TDD, which depends on staying concise, and it piles on refactoring effort.

In classic TDD, if you get stuck, you undo back to the last green and move ahead from there. With AI, you can get stuck in a loop, because you don’t have complete control over the process and have to hunt for a test closer to where you are now. Instead of moving forward through the red-green-refactor-repeat cycle, you end up going back and forth between what the AI thinks and what you need.
You might like: Generative AI vs. Deterministic Testing: Why Predictability Matters.
Automation Bias
Developers might become overly reliant on AI-generated code and tests, leading to “automation bias”: trusting the AI’s output uncritically. And if the AI generates the tests as well, it might produce deceiving tests that pass even for buggy code.
You might like: AI Model Bias: How to Detect and Mitigate
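Here’s what a deceiving test can look like; in this hypothetical, tautological example, the expected value is computed by the very function under test, so it passes no matter how buggy add is:

```python
def test_add_numbers():
    # Tautological: the expectation comes from the code under test,
    # so this test can never fail, even if add() is completely wrong.
    assert add(2, 3) == add(2, 3)
```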
Slower Progress
This might come as a surprise, but even with ultra-fast AI at your fingertips, TDD can slow down, primarily because you lose control. Consider this example: you want code for the add functionality of a calculator. You write a prompt, and your AI assistant generates results. But wait: the function names aren’t right, the coding style isn’t to your liking, and there are extra asserts and functions; it’s basically not how you need it to be. You’re constantly struggling to make the AI do things your way, fighting for control, and that naturally slows you down. By contrast, if you did this the traditional way, you’d be slow to start, but then things get faster and smoother.
How to use TDD with AI?
You can still use AI to spruce up your TDD process, not just to get the job done quicker, but to overcome TDD’s shortcomings as well. But don’t just blindly ask AI to create TDD tests for you. That’s a bad idea. Try these ways to incorporate AI into TDD.
Be Crafty with Your AI Prompts, But Know When to DIY
A lot rides on how you converse with your AI assistant; don’t expect it to magically read your mind. You can take it a little easier with AI by your side, but you still need to be in control of the process. So think about your application: its layout, the parameters and their interactions with methods, their types, their scope, and so on. Going in with a clear mental picture of what you want is the best way to start. And decide where you want to work through prompts and where you want to take the code into your own hands; if you think some aspects are better done yourself, go for it.
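For example (an illustrative prompt, not a magic formula), compare:
- Vague: “Write a function to add numbers.”
- Specific: “Write a Python function add(a, b) that returns the sum of two numbers. No input validation, no logging, no extra helpers. It must pass: assert add(2, 3) == 5.”
The second version gives the assistant far less room to improvise.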
You might like: Prompt Engineering in QA and Software Testing
Create Tests Yourself, Use AI for Coding
The second D in TDD is the key to getting TDD right: your tests need to drive the development. Think about it, you move forward when a test passes (the green stage) and the code looks okay (the refactor stage). LLMs can generate code in a matter of seconds, on a whim, but that output only matters if the right proof, that is, your tests, validates that what was generated is what is needed. In that sense, tests are bigger assets than code. If you have the right tests describing how the system should behave, then refactoring and regenerating code become cheap and quick.
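In practice, that can be as simple as handing the assistant your test verbatim. A hypothetical workflow sketch, continuing the calculator example:

```python
# 1. You write the test yourself (the human-owned specification):
def test_multiply_numbers():
    assert multiply(4, 3) == 12
    assert multiply(4, 0) == 0

# 2. You prompt the AI: "Write multiply(a, b) so this test passes. Nothing else."
# 3. You run the suite. Green? Review and keep it. Red? Regenerate, or write it yourself.
```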
Maybe Use a Bit of AI for Test Creation
While writing tests should be your own mental work, you can pick the AI’s brain (or engine) too, just a little. Use AI to create code stubs from your requirements in any programming language; you’ll then have a base framework to build on. But implement the tests yourself.
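For instance, you might ask only for scaffolding (hypothetical output shown; the names mirror the calculator example) and fill in the assertions yourself:

```python
# AI-generated scaffolding: signatures and placeholders only.
def divide(a, b):
    raise NotImplementedError

def test_divide_numbers():
    # TODO: human-written assertions go here, including behavior for b == 0.
    raise NotImplementedError
```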
You might like: What are Vibe Coding and Vibe Testing?
Run Tests After Every New Piece of Code
If you’ve agreed to write the tests yourself and let AI help you with the code, then run your tests every time the AI brings something new to be added. At times, we tend to switch our brains off and rely blindly on AI; a fresh test run keeps you honest.
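One way to make that habit frictionless is to script the run. A minimal sketch, assuming pytest is installed:

```python
# run_checks.py: re-run the whole suite after each AI-assisted change.
import pytest

if __name__ == "__main__":
    # Equivalent to running `pytest -q`; exits non-zero on any failure.
    raise SystemExit(pytest.main(["-q"]))
```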
Use Feedback to Refactor
If TDD doesn’t feel demanding enough already, here’s one more step: take into account all the feedback you get from test runs, peer reviews, or even the AI itself. Use that information to make your tests and code better, and apply AI wherever it helps, but under your supervision.
Final Note
TDD isn’t just about writing tests; it’s a way of thinking about software development. You can strengthen your TDD practice by letting it evolve and bringing AI into the mix, but human oversight remains paramount. By writing the failing test first, you give the AI a precise target: the test acts as a clear specification and an immediate correctness check. The AI must generate code that passes your specific, human-defined test. And even the “refactor” step encourages human review and improvement of the code, regardless of how it was generated.
