Creating well-engineered and correct code is the goal of every developer. One methodology that stands out in achieving this is Test-Driven Development (TDD). While TDD does not design your code or write it for you, it gives you rapid feedback on your design decisions through an iterative development cycle.
Whether to write unit tests before coding, after coding, or not at all is ultimately the developer's choice. Never writing any tests, however, invites critical situations in production; if the developer is lucky, the errors are discovered during the manual testing stage instead.
Writing tests after completing the code is better than writing no tests at all. Most developers, ourselves included, take this code-first approach: write the implementation, then add the tests. The resulting code is not necessarily any better, though, because functional errors remain wherever a test case is forgotten. Writing a failing test for every piece of code first, and then adding just enough code to make that test pass by following the TDD cycle, is a much stronger approach.
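To make the cycle concrete, here is a minimal red-green illustration in Java with JUnit 5. The PriceCalculator class and its applyDiscount method are hypothetical examples, not part of any real project: the test is written first and fails, and the class below it contains just enough code to make it pass.

```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Red: this test is written first and fails because PriceCalculator does not exist yet.
class PriceCalculatorTest {

    @Test
    void appliesTenPercentDiscount() {
        PriceCalculator calculator = new PriceCalculator();
        assertEquals(90.0, calculator.applyDiscount(100.0, 0.10), 0.001);
    }
}

// Green: the simplest implementation that makes the test pass.
class PriceCalculator {
    double applyDiscount(double price, double rate) {
        return price * (1 - rate);
    }
}
```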
These days we are exploring how Generative AI can make our work more automated. In this post, we introduce a new experiment inspired by Allen Helton's blog: we write several test cases for the project and then ask ChatGPT to generate the code that makes them pass.
Generative AI for Test-First Development
Given the potential for human error in manual testing, the question arises: why not automate TDD using Generative AI? Generative AI can play a transformative role in TDD by automating the creation of unit tests and the corresponding code. Here's a proposed workflow:
- Tests Written by Developers: Developers begin by writing the initial unit tests, specifying the desired outcomes for their code.
- Code Generated by AI: Generative AI reads these unit tests and produces the code needed to make them pass. If the tests run successfully, the code is saved to disk. If not, an error report is generated and the AI attempts to produce corrected code, looping until the tests pass (a rough sketch of this loop follows below). By incorporating AI in this manner, developers can ensure that their code is thoroughly tested and validated, minimizing the risk of errors.
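As a rough sketch, the generate-run-retry loop in the second step could look like the code below. LlmClient, TestRunner, and TestReport are hypothetical placeholders for whatever model API and test harness you use; the point is the control flow, not a complete implementation.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the generate-run-retry loop. LlmClient, TestRunner, and TestReport
// are hypothetical placeholders, not types from a real library.
interface LlmClient {
    String generateCode(String unitTests, String previousErrorReport);
}

interface TestRunner {
    TestReport run(String candidateCode, String unitTests);
}

record TestReport(boolean passed, String errorReport) {}

class TddLoop {

    static boolean generateUntilGreen(LlmClient llm, TestRunner runner, String unitTests,
                                      Path outputFile, int maxAttempts) throws Exception {
        String errorReport = "";                        // empty on the first attempt
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String code = llm.generateCode(unitTests, errorReport);
            TestReport report = runner.run(code, unitTests);
            if (report.passed()) {
                Files.writeString(outputFile, code);    // tests pass: save the code to disk
                return true;
            }
            errorReport = report.errorReport();         // tests fail: feed the report back to the model
        }
        return false;                                   // gave up after maxAttempts
    }
}
```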
Benefits of Using Generative AI in TDD
Automating code generation and testing can significantly speed up development cycles and reduce the potential for human error in both the test cases and the code itself. At the same time, it helps ensure that all tests are covered comprehensively and consistently, and it lets the development process scale without increasing the manual workload.
Potential Challenges and Solutions
While integrating Generative AI in TDD offers numerous benefits, it is not without challenges. The initial setup of AI systems for TDD can be complex and time-consuming, and developers may need training to use and trust AI tools effectively. Even when developers are ready to handle it, AI may struggle with unique or edge cases that are not well represented in its training data and misinterpret the requested scenarios. Generative AI responds with decent code, but it is rarely the best possible code, because the raw generation step applies no feedback mechanism. To produce great code, we should give the AI feedback, much like the code review approach developers already use, and combine it with human oversight for complex scenarios, at least for now. :)
Applying Feedback to Generative AI in TDD
Feedback Mechanism
To further refine the process, a feedback mechanism can be introduced. Developers write the unit tests, and a developer agent generates the code. A new layer of scrutiny is added with a code reviewer agent, which reviews the code and provides feedback to the developer agent.
Iterative Improvement
The developer agent revises the code based on the reviewer agent's feedback and resubmits it for review. This iterative process continues until the code passes the review successfully.
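The exchange between the two agents can be written as a simple loop. DeveloperAgent, ReviewerAgent, and Review below are hypothetical abstractions over LLM calls, shown only to make the control flow explicit.

```java
// Sketch of the review-revise cycle. DeveloperAgent, ReviewerAgent, and Review
// are hypothetical abstractions over LLM calls.
interface DeveloperAgent {
    String writeCode(String unitTests);
    String reviseCode(String previousCode, String reviewFeedback);
}

interface ReviewerAgent {
    Review review(String code, String unitTests);
}

record Review(boolean approved, String feedback) {}

class ReviewLoop {

    static String produceApprovedCode(DeveloperAgent dev, ReviewerAgent reviewer, String unitTests) {
        String code = dev.writeCode(unitTests);
        Review review = reviewer.review(code, unitTests);
        while (!review.approved()) {                    // keep revising until the reviewer approves
            code = dev.reviseCode(code, review.feedback());
            review = reviewer.review(code, unitTests);
        }
        return code;                                    // approved code moves on to the unit tests
    }
}
```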
Final Testing
Once the code is approved by the reviewer agent, unit tests are run. If they pass, the code is saved. If they fail, an error report is generated, and the process loops back to code generation.
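On the JVM, this final step can be automated with the JUnit Platform Launcher API, which runs the tests programmatically and collects a failure report that can be fed back into the loop. The sketch below assumes the generated code and its tests have already been compiled and loaded; FinalTestStage is our own hypothetical wrapper class.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

import org.junit.platform.launcher.Launcher;
import org.junit.platform.launcher.LauncherDiscoveryRequest;
import org.junit.platform.launcher.core.LauncherDiscoveryRequestBuilder;
import org.junit.platform.launcher.core.LauncherFactory;
import org.junit.platform.launcher.listeners.SummaryGeneratingListener;
import org.junit.platform.launcher.listeners.TestExecutionSummary;

import static org.junit.platform.engine.discovery.DiscoverySelectors.selectClass;

class FinalTestStage {

    // Runs the given test class and returns null if every test passed,
    // or a failure report to feed back into the code-generation loop.
    static String runTests(Class<?> testClass) {
        LauncherDiscoveryRequest request = LauncherDiscoveryRequestBuilder.request()
                .selectors(selectClass(testClass))
                .build();
        SummaryGeneratingListener listener = new SummaryGeneratingListener();
        Launcher launcher = LauncherFactory.create();
        launcher.registerTestExecutionListeners(listener);
        launcher.execute(request);

        TestExecutionSummary summary = listener.getSummary();
        if (summary.getTotalFailureCount() == 0) {
            return null;                                  // all tests passed
        }
        StringWriter report = new StringWriter();
        summary.printFailuresTo(new PrintWriter(report)); // error report for the next iteration
        return report.toString();
    }
}
```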
In essence, the workflow would be as follows:
- The developer writes the unit tests.
- The AI reads these tests and generates the necessary code.
- If the tests run successfully, the code is saved. If not, an error report is generated, and the AI attempts to correct the code in a loop.
- This initial automation can produce code that is prone to errors, which is where the feedback mechanism comes into play.
- Developers write the unit tests, and the developer agent generates the required code and methods.
- A code reviewer agent reviews the code and sends feedback to the developer agent.
- The developer agent refines the code based on the feedback and resubmits it for review.
- This cycle continues until the code reviewer agent approves the code.
- Once approved, the unit tests are run. If successful, the code is saved; if not, an error report is generated, and the loop continues.
By incorporating a feedback mechanism, the quality of the generated code improves significantly, ensuring that it meets the necessary standards and passes all tests reliably.
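Putting the pieces together, the whole pipeline can be sketched as one outer loop around the review cycle and the final test run. The classes below refer to the hypothetical sketches earlier in this post, and compiling or loading the generated code is deliberately left out.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// End-to-end sketch combining the review loop and the final test stage.
// DeveloperAgent, ReviewerAgent, ReviewLoop, and FinalTestStage are the
// hypothetical pieces sketched above; compilation and class loading are omitted.
class FeedbackTddPipeline {

    static void run(DeveloperAgent dev, ReviewerAgent reviewer, String unitTests,
                    Class<?> testClass, Path outputFile) throws Exception {
        while (true) {
            // 1. Review loop: generate and revise code until the reviewer approves it.
            String code = ReviewLoop.produceApprovedCode(dev, reviewer, unitTests);

            // 2. Final testing: run the unit tests against the approved code.
            String errorReport = FinalTestStage.runTests(testClass);
            if (errorReport == null) {
                Files.writeString(outputFile, code);      // tests pass: save the code
                return;
            }
            // 3. Tests failed: append the error report to the prompt and repeat the cycle.
            unitTests = unitTests + "\n/* Failures from the previous attempt:\n" + errorReport + "\n*/";
        }
    }
}
```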
Conclusion
Integrating Generative AI in Test-Driven Development (TDD) has the potential to revolutionize software development. By automating the generation of unit tests and the corresponding code, developers can ensure higher code quality, reduce errors, and streamline their workflow. While there are challenges to overcome, the benefits far outweigh the initial hurdles, making this a promising approach for the future of coding. If you are interested in supporting programming with generative AI, you can check out our Coder AI project.
Future Directions
There are three directions in which we plan to build on this approach:
Expanding Language Support: While we have been focusing on Java, this methodology can be extended to other languages like Python, JavaScript, and more.
Automated Test Generation: Instead of relying on manually written tests, we can prompt Generative AI to write the tests themselves from high-level requirements provided by developers.
Extending Agentic AI Team: Beyond a simple code reviewer, we could introduce more specialized agents such as a QA agent for testing edge cases, a performance agent for optimizing code performance, and so on.
We recommend every software developer spend their time writing more robust and accurate test cases and then benefit from the power of Generative AI to produce high-quality code.