What are Flaky Tests in Software Testing? Causes, Impacts, and Solutions

Artem Golubev

Introduction

Automated tests are an integral part of the software development process: they check that the application meets its requirements, functions as expected, and stays free of new bugs and regressions. However, some test cases behave inconsistently, passing on one run and failing on the next without any change to the application or the automation code. These are called “flaky tests,” and they can be frustrating and time-consuming to fix. They often produce false positives and false negatives, which delays the detection and resolution of real bugs. Addressing flaky tests is therefore essential for keeping automated testing efficient and reliable: it prevents false bug reports, saves resources, and makes for a smoother development process. Test automation has come a long way, and there are now solid ways to make your tests more robust, which we will discuss shortly.

First, let’s consider a few examples of flaky tests.

Examples

In one automation project, a scenario created new accounts, and each account required a unique username. The team generated usernames in the format “user + current time”. The test ran smoothly until the suite started executing in parallel, at which point it began to fail occasionally. The reason: when two or more test cases started at the same moment, they generated the same username, and the later account creation failed because the name was already in use. To resolve this, the QA team switched to the format “user + test case id + current time”.
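The “user + test case id + current time” scheme can be sketched in Python as follows. This is a minimal illustration, not the team's actual code: the function name `make_username`, the underscore separators, and the random suffix are all assumptions added here.

```python
import time
import uuid


def make_username(test_case_id: str) -> str:
    """Build a username that stays unique across parallel test runs."""
    # Millisecond timestamp, as in the original "user + current time" scheme.
    timestamp_ms = int(time.time() * 1000)
    # Assumption: a random suffix is appended as extra insurance in case the
    # same test case is started twice within the same millisecond.
    suffix = uuid.uuid4().hex[:12]
    return f"user_{test_case_id}_{timestamp_ms}_{suffix}"
```

Including the test case id means that two different tests starting at the same instant can no longer collide; the suffix covers the rarer case of the same test being launched twice at once.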

In another scenario, a pop-up appeared during the execution of a test case. The test passed when executed alone but began to fail when executed alongside many other scenarios. Analysis showed that the pop-up took longer to appear when multiple test cases were running. The automation team had added a fixed 1-second wait before clicking the close button, so whenever the pop-up took longer than that, the close button was not yet present and the test failed. Part of the problem here lies with the framework, which lacked a dynamic wait that holds until the pop-up appears.
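A dynamic wait of the kind this scenario needed can be sketched as a small polling helper. The names `wait_until` and `FakePopup` are hypothetical and exist only for this illustration; `FakePopup` stands in for a real pop-up so the example runs without a browser. Browser automation libraries generally ship their own explicit-wait facilities built on the same idea.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakePopup:
    """Hypothetical stand-in for a pop-up that appears after a delay."""

    def __init__(self, appears_after: float) -> None:
        self._visible_at = time.monotonic() + appears_after

    def is_visible(self) -> bool:
        return time.monotonic() >= self._visible_at


if __name__ == "__main__":
    popup = FakePopup(appears_after=0.3)
    if wait_until(popup.is_visible, timeout=2.0):
        print("pop-up is present; safe to click close")
```

Unlike a fixed 1-second sleep, the helper returns as soon as the pop-up appears and gives up only after the full timeout, so a slow run does not fail and a fast run does not waste time.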

Causes

Flaky tests can occur for various reasons, most of which trace back to the framework, the automation tool, or the test scripts. Let’s explore some of these causes.

Timing Issues: Tests can become flaky when the test code assumes that certain events happen within a fixed time. For instance, if a test checks for a particular element on a webpage after a specific delay, network issues or CPU performance differences between test runs could cause timing inconsistencies, resulting in intermittent test failures.

Element Locator Issues: Many automation tools use XPath to locate webpage elements. However, XPath can be unstable as it is sensitive to changes in the page’s DOM. When developers modify an element’s properties or add similar elements to the page, the XPath initially used may no longer be valid, leading to false positives or negatives.
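The sensitivity of positional XPath expressions can be shown with Python's standard-library XML parser, which supports a limited XPath subset. The two page snippets and the helper `button_text` below are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup before a change.
PAGE_V1 = """
<form>
  <button id="cancel">Cancel</button>
  <button id="submit">Submit</button>
</form>
"""

# The same form after a developer adds a "Help" button in front.
PAGE_V2 = """
<form>
  <button id="help">Help</button>
  <button id="cancel">Cancel</button>
  <button id="submit">Submit</button>
</form>
"""

POSITIONAL = "./button[2]"               # brittle: depends on sibling order
BY_ATTRIBUTE = "./button[@id='submit']"  # tied to a stable attribute


def button_text(page: str, path: str) -> str:
    """Return the text of the first element matching `path` in `page`."""
    root = ET.fromstring(page.strip())
    element = root.find(path)
    return element.text if element is not None else "<not found>"


if __name__ == "__main__":
    for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
        print(name, button_text(page, POSITIONAL), button_text(page, BY_ATTRIBUTE))
```

On the first page both expressions find the "Submit" button. After the extra button is added, the positional expression silently resolves to "Cancel" while the attribute-based one still finds "Submit", which is how a script can end up clicking the wrong element without any error being raised.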

Test Code Issues: Poorly written or ambiguous test code can make tests flaky. If the test code is unclear about the expected application behavior, the test may fail or pass inconsistently. Similarly, complex test code or code relying on external dependencies may be more prone to failure.

Test Data Issues: Tests dependent on test data can be flaky if the data is inconsistent. For example, corrupted test data or different test runs using the same data can lead to inconsistent test results.
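One common way to keep tests from interfering through shared data is to hand each test its own copy of a baseline record. The sketch below assumes a hypothetical `BASE_ACCOUNT` record and a `fresh_account` helper; neither comes from a specific framework.

```python
import copy

# Hypothetical baseline record that several tests start from.
BASE_ACCOUNT = {"username": "demo", "roles": ["viewer"], "active": True}


def fresh_account() -> dict:
    """Return an independent deep copy so one test cannot alter another's data."""
    return copy.deepcopy(BASE_ACCOUNT)


if __name__ == "__main__":
    first = fresh_account()
    first["roles"].append("admin")    # one test mutates its own copy
    print(fresh_account()["roles"])   # the next test still sees ['viewer']
```

A deep copy matters here because the record contains a nested list; a shallow copy would still share `roles` between tests.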

Test Configuration Issues: Inconsistent test configuration between test runs can cause flaky tests. Incorrect test parameters or improperly configured test settings can lead to test failures.

Environment Issues: Flaky tests can also be due to the environment where the tests are run. Network connectivity issues, hardware differences between test runs, or differences in test environments can all introduce non-determinism into the testing process, leading to flaky tests.

Impact

Flaky tests significantly impact software testing because their non-deterministic, inconsistent results undermine the accuracy and reliability of the testing process. Here are some of the effects flaky tests can have on software development.

False Positives: Flaky tests can cause false positives, which occur when a test reports a failure, even though the application under test is functioning correctly. False positives can lead to wasted time and resources in diagnosing and addressing non-existent issues.

False Negatives: Conversely, flaky tests can also produce false negatives. This situation arises when a test reports a pass, even though there is a genuine issue with the application under test. False negatives can lead to undetected issues, potentially causing problems for end-users.

Reduced Trust in Test Results: Unreliable and inconsistent results from flaky tests can undermine trust in the testing process. When actual issues in the application under test are hard to identify, it can lead to delayed releases, lower-quality software, and increased costs. If stakeholders lose faith in the testing process, it could also lead to reduced confidence in the quality of the software being developed.

Debugging Effort: Debugging flaky tests can be challenging. Often, the problem doesn’t lie in the code being tested but in the automation tool or framework, making it difficult to identify the root cause of the issue. This difficulty can lead to wasted time and effort in identifying and fixing flaky tests.

Test Stability: Flaky tests can impact the stability of the test suite. If a test is flaky, it may fail once but pass the next time it’s run, making it hard to identify which tests are genuinely failing. This inconsistency can lead to tests being removed or disabled, which can reduce the overall effectiveness of the testing process.
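One simple way to tell a flaky test from a genuinely failing one is to rerun it several times and compare the outcomes. The helper `classify_test` below is a hypothetical sketch of that idea, not a feature of any particular test runner; it treats an `AssertionError` as a failed run.

```python
from typing import Callable


def classify_test(test: Callable[[], None], runs: int = 10) -> str:
    """Run `test` repeatedly and report 'pass', 'fail', or 'flaky'."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add(True)
        except AssertionError:
            outcomes.add(False)
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    return "flaky"  # both passing and failing runs were observed
```

A test that fails on every run points to a real defect or a broken script, while mixed outcomes on unchanged code mark the test as a candidate for the causes listed earlier. Note that a consistent pass over a handful of runs does not prove stability, since rare failures may simply not have shown up.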

Increased Maintenance Costs: Flaky tests can heighten the maintenance costs of a test suite. Intermittent failures may require more maintenance to keep the test up-to-date and reliable. Developers might also need to spend more time maintaining the test suite as a whole to address flakiness-related issues.

Decreased Test Coverage: Flaky tests can reduce the test coverage of the codebase. If developers lose trust in a test due to its flakiness, they may skip or remove it from the test suite altogether. This action can lead to gaps in test coverage, increasing the risk of introducing bugs into the codebase.

Solution

One primary cause of flaky tests is the lack of intelligence in the automation tool to manage failures and adapt as needed. When multiple members of the automation team work on similar functions, using legacy automation tools can lead to complexity and redundant scripts, thereby reducing overall efficiency. A viable solution to tackle flakiness is implementing a sophisticated tool with the intelligence to detect and preempt flaky issues. testRigor helps to avoid flaky tests in your automation with its integrated AI and ML algorithms.

testRigor

Let’s examine how we can use testRigor to avoid flaky tests.

testRigor is a cloud-based, no-code automation tool that allows anyone to create test scripts using plain English, eliminating the need for programming languages. By using English as the script language, testRigor simplifies the automation process, reducing the time and effort required to create and maintain scripts.

testRigor lets users identify elements by what is visible in the UI, such as an element’s name or its relative position, rather than by XPath or other locators. This approach enhances test stability, as it eliminates reliance on frequently changing and unstable expressions. Users can simply mention the element they want to interact with in plain English, such as click "Submit" button or click "cancel" on the left of the "Continue" button. This approach reduces the complexity of test scripts and minimizes the chances of errors.

testRigor also offers an inbuilt wait mechanism, which eliminates the need for testers to add waits to each test step. This feature ensures that the page has fully loaded before executing the next step, improving the stability and reliability of the tests. Additionally, testRigor provides various types of inbuilt validations that help testers ensure their tests’ accuracy and consistency.

The result is that testRigor tests are so stable that some customers even use them for monitoring!

Conclusion

Flaky tests can be a significant challenge for any testing team, causing delays, frustration, and inaccurate results. However, by using a tool like testRigor, teams can dramatically reduce the likelihood of flakiness in their automated tests. With its AI, codeless scripting in plain English, stable element selection options, inbuilt wait mechanism, and ability to execute scripts in mobile and desktop browsers with minimal configuration changes, testRigor offers a comprehensive solution to help teams achieve more reliable and efficient test automation. By adopting testRigor, testing teams can improve their testing outcomes and deliver higher-quality software with greater speed and confidence.
