AI Features Testing: A Comprehensive Guide to Automation

Pragya Yadav

To grow your product, you have probably already released, or will soon release, LLM-backed features for your customers in 2024. One reason is that knowing customers’ real-time sentiment is crucial to understanding their buying decisions and to expanding your business.

With Large Language Models (LLMs), you can gather valuable insights from customer actions in real time and use this information to increase customer satisfaction and grow your business. For example, with real-time sentiment analysis, you can reach out to a dissatisfied customer within seconds rather than making them wait for days.

LLMs can help your business with personalized customer experiences, real-time customer support, product and service development, emotional intelligence, community management, and social media.

Now comes the question: How do you test LLMs? Testing LLMs can be tricky, even with manual testing. The reason is simple: you need to know precisely what the input should be and what output to expect.

Let’s take it a step further and find out how to automate testing of advanced AI features such as LLMs.

How to Perform Automation Testing of LLMs

An LLM’s output often differs from one run to the next, even for the same input, which makes testing challenging. So, how do you automate those tests? Is there a way to ensure a small typo fix in a prompt doesn’t accidentally derail the whole feature?

The solution is to use another LLM’s intelligence to test LLM-based applications. Tools such as testRigor make this exceptionally easy. testRigor is a generative AI-based automation testing tool that lets you test complex scenarios using commands written in plain English or another natural language (Spanish, Portuguese, German, French).
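The idea of letting one model judge another model's output can be sketched in Python, independent of any particular tool. This is a minimal illustration and not testRigor's implementation: `Judge`, `check_with_ai`, and `keyword_judge` are hypothetical names, and the keyword-based judge is only a stand-in for a real LLM call so that the example runs offline.

```python
from typing import Callable

# A judge takes (text, criterion) and returns True if the text meets the criterion.
Judge = Callable[[str, str], bool]


def keyword_judge(text: str, criterion: str) -> bool:
    """Toy stand-in for an LLM judge (hypothetical, for illustration only).

    A real judge would send `text` and `criterion` to a model and parse a
    yes/no verdict. This one only understands the "positive" criterion.
    """
    positive_words = {"great", "love", "excellent", "thanks", "happy"}
    negative_words = {"terrible", "hate", "awful", "refund", "angry"}
    words = {w.strip(".,!?").lower() for w in text.split()}
    if "positive" in criterion.lower():
        return bool(words & positive_words) and not (words & negative_words)
    raise ValueError(f"unsupported criterion: {criterion!r}")


def check_with_ai(text: str, criterion: str, judge: Judge) -> None:
    """Fail the test if the judge says the text does not meet the criterion."""
    if not judge(text, criterion):
        raise AssertionError(f"{text!r} does not satisfy: {criterion}")


# Two differently worded replies, as a model might produce on separate runs.
replies = [
    "Thanks, the delivery was great!",
    "I love how fast the order arrived.",
]
for reply in replies:
    check_with_ai(reply, "contains a positive message", keyword_judge)
```

Because the assertion targets meaning rather than exact wording, both replies pass even though they are phrased differently. Swapping `keyword_judge` for a function that queries a model would leave the test itself unchanged.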

Testing AI and LLMs using testRigor

testRigor is an AI agent built on NLP, ML, and generative AI. It is designed around simplicity, ease of use, and the ability to test complex scenarios. Read more about AI in software testing.

Let’s see a few examples to understand how testRigor can be the best companion for testing your LLMs.

Example 1: Test positive user sentiment in a chat

You can verify that the customer chat contains a positive message using testRigor’s AI, as below:

check that "chat" "contains a positive message" using ai

Example 2: Test that the window is a chat and the restaurant has positive reviews

You can verify that the window is a chat window and that the last message means the restaurant has positive customer feedback online:

check that "chat" "is a chat and the last message is equivalent to 'this restaurant has a lot of positive reviews'" using ai

Example 3: Test true and false natural language statements

You can utilize testRigor’s Vision AI to validate a statement about the webpage as below:

check that statement is true "page contains testRigor logo"

Example 4: Test the positive growth of a graph

You can test that a graph’s image on the webpage shows positive growth. This is possible using testRigor’s Vision AI:

check that page "contains an image of graph of positively growing function" using ai

All of the above testRigor commands invoke AI to analyze the page or screen and perform complex validations that previously could only be done manually. testRigor makes testing advanced AI features such as LLMs easy and straightforward.

All of the above commands can be combined into a single testRigor test case written in plain English.

Read how to create end-to-end tests using testRigor in natural languages.

testRigor’s Capabilities

Email, Phone Call, and SMS Testing: Use simple English commands to test email, phone calls, and SMS. These commands help validate 2FA scenarios in which OTPs and authentication codes are sent via email, phone call, or text message.
Reusable Rules (Subroutines): You can easily create functions for the test steps that you use repeatedly. You can use the Reusable Rules to create such functions and call them in test cases by simply writing their names. See the example of Reusable Rules.
Global Variables and Data Sets: You can import data from external files or create your own global variables and data sets in testRigor to use them in data-driven testing.
2FA, QR Code, and Captcha Resolution: testRigor handles 2FA, QR code, and Captcha resolution through its simple English commands.
Table Handling: testRigor simplifies table handling and testing with its easy natural language commands. You don’t need to worry about the DOM anymore. Read: How to work with tables using testRigor?
File Upload/Download Testing: Execute test steps that involve file downloads or uploads without any third-party software. You can also validate the contents of the files using testRigor’s simple commands.
Database Testing: Execute database queries and validate the results fetched.

Read the documentation with examples to learn more about testRigor’s capabilities.

Why use testRigor for LLM and AI Testing?

Here are a few of the many benefits that testRigor provides:

Quick Test Creation: Create tests using testRigor’s generative AI feature; just provide the test case title or description, and testRigor’s generative AI engine will automatically generate most of the test steps. Tweak them a bit, and the automated test cases in plain English (or any other natural language) will be ready to run.
Eliminate Test Maintenance: There is no maintenance nightmare because there is no reliance on implementation details. This lack of XPath and CSS dependency ensures ultra-stable tests that are easy to maintain.
Import Existing Manual Test Cases: You can import and refine your manual test cases with reusable steps (subroutines). Import your existing manual test cases from test management tools such as TestRail, PractiTest, Zephyr, etc. Read: Import test cases from TestRail for execution.
Codeless Testing: testRigor eliminates the need for programming language knowledge by converting English test scripts into actual code internally using advanced Natural Language Processing (NLP). You can also use our test recorder to record UI actions and create the test cases easily in plain English, meaning learning to code is unnecessary.
Everyone on the Team Tests: Product managers can review test cases, while testers, business analysts, and sales and marketing teams can write and execute them using testRigor.
Shift Left Testing: Leverage the power and advantages of shift left testing with testRigor. Using Specification Driven Development (SDD), create test cases early, even before engineers start working on code.
Single Tool for Every Testing Need: testRigor enables you to test web, mobile (hybrid, native), API, and desktop apps with minimum effort and maintenance.

testRigor offers an ‘AI-driven Test Automation Engineer’ certification for free. Get your certificate today!

To try testRigor, just register, and you can start test automation immediately with no learning curve! Sign up here.
