
Why Selenium Sucks for End-To-End Testing in 2022

Let’s get to the basics, shall we?

What is an end-to-end test? We define it as a test that can potentially span multiple UIs and perform testing from an end user’s perspective.

Well, Selenium is not a good fit for cross-system testing or for emulating a user’s real-world experience.

Let me explain…

How does Selenium work?

Selenium was created in 2004, well before Single Page Apps came into favor, at a time when pages looked like this:

and HTML at the time looked like this:

Let’s compare this with the modern version of a similar part of Amazon’s page:

Notice how the complexity grew exponentially? Where it used to be just one simple table, there are now 10+ levels of nested div elements! But why?

Why does Selenium encourage the use of XPath?

Let’s look back at the roots of XPath. HTML was created in 1993 as HyperText Markup Language, with the goal of reflecting the structure of a document. XPath first appeared a few years later, in 1998, to describe a path within a structured document (similar to a URL for the web). Selenium’s design embraced this paradigm, and relying on XPaths made total sense at the time.

Unfortunately for Selenium though, a lot has changed since then. HTML is being used differently now – mainly to position elements on the screen, often with a large combination of nested divs.

Selenium WebDriver encourages its users to stick to XPath locators by design. This approach worked well a decade ago, when pages had a rather simple structure, but it doesn’t quite work anymore. Nowadays pages have insanely complex, barely human-readable structures, and on top of that, these structures are constantly changing. HTML was not designed to render the fancy UIs we build now. In an actively developed application, it is impossible to rely on technical information like XPaths to keep a reference to an element stable. And things like ids and data-test-ids do not really work for list and table elements. I’m not even talking about the lack of ids at all in React.

Let’s look at the XPath from the example above for an Amazon a-tag: /html/body/div[4]/div[2]/div/div[1]/div/div[2]/div/div[1]/div/div[1]/div[2]/div/div[2]/a

And this is the best the Google Chrome DevTools inspector could come up with:

//*[@id="zg_left_col1"]/div[1]/div[2]/div/div[2]/a

Even the fancy SelectorsHub extension could only come up with this:

//div[@id='8mNf9lO2-mC1H7sJJMcE_g']//a[@class='a-link-normal']

This is absolutely unreadable and would create a maintenance nightmare and a pile of technical debt!
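The fragility of positional paths is easy to reproduce without a browser. The following is a minimal sketch, using only Python's standard `xml.etree.ElementTree` module (which supports a limited XPath subset). The markup, the link text, and the helper names are hypothetical simplifications, not Amazon's real page and not Selenium's API. It shows a positional path that stops matching after one wrapper `div` is added, while a lookup by the text a user actually sees keeps working.

```python
import xml.etree.ElementTree as ET

# Hypothetical "before" markup: the link sits at a fixed depth.
BEFORE = """
<body>
  <div id="zg_left_col1">
    <div>
      <div>header</div>
      <div><a class="a-link-normal">Best Sellers in Books</a></div>
    </div>
  </div>
</body>
"""

# Same page after a developer wraps the link in one extra <div> for styling.
AFTER = """
<body>
  <div id="zg_left_col1">
    <div>
      <div>header</div>
      <div><div class="wrapper"><a class="a-link-normal">Best Sellers in Books</a></div></div>
    </div>
  </div>
</body>
"""

# Positional path relative to <body>, in the style of a copied XPath.
POSITIONAL_PATH = "./div/div[1]/div[2]/a"


def find_by_path(markup, path):
    """Return the first element matching a structural path, or None."""
    return ET.fromstring(markup.strip()).find(path)


def find_link_by_text(markup, text):
    """Return the first <a> whose visible text equals `text`, or None."""
    root = ET.fromstring(markup.strip())
    for link in root.iter("a"):
        if "".join(link.itertext()).strip() == text:
            return link
    return None


if __name__ == "__main__":
    print(find_by_path(BEFORE, POSITIONAL_PATH) is not None)   # path matches
    print(find_by_path(AFTER, POSITIONAL_PATH) is not None)    # path no longer matches
    print(find_link_by_text(AFTER, "Best Sellers in Books") is not None)
```

Nothing changed for the user between the two versions, yet only the text-based lookup survives the change.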

Debugging issues

Another issue with Selenium is the pain of debugging a test. Imagine yourself investigating why a certain test failed and finding out that it failed because an XPath could not be found. Your next step will likely be to copy this XPath into the browser, only to confirm that no such element exists on the page. Now you have to play a guessing game to figure out which element the XPath was meant to point to. What if the person who wrote the test is no longer with the company? How would you know?

The same applies not just to XPaths, but also to CSS Selectors, data-testids, ids, etc. As soon as the reference to an element is not from an end user’s point of view, it can break even though the application still works for users.

Basically, the current way of working with the page has the following issues:

  1. It is nearly impossible to understand which element is being referenced unless your Selenium code is heavily documented and that documentation is not out of sync with the code;
  2. Only developers can understand test failures since the error descriptions are cryptic;
  3. The structure was not designed to properly handle modern apps with forms and tables – it lacks a stable and reliable way to refer to elements.

The end result? Instead of creating new tests, you have to spend an increasing amount of time maintaining existing ones! In our experience, this is a very common issue among teams that have been developing tests for 1 or 2 years, once the number of tests reaches a certain size. They often have to spend up to 50% of their day on test maintenance rather than doing something more productive.

Now combine that with cross-system testing, where you don’t control the HTML of the system under test. No amount of BDD/Shift-left will reduce the maintenance required to constantly catch up with someone else’s changes in 3rd party apps (think Salesforce).

How should end-to-end testing work?

Think about it. What are end-to-end tests supposed to do? They are supposed to help you validate that your functionality works from the end-user’s perspective.

Therefore, you should refer to elements from the end user’s perspective. There should be an easy, stable way to work with forms and tables that emulates a user interacting with a browser or device.

The only things that matter to an actual user are finding the right input to type into, or locating the correct button.

Let’s talk about forms. Here’s another example from amazon.com:

with HTML:

Notice that both the id and the name of the element are clear and descriptive! Great! Problem solved then, but is it really?

The moment you change your UI framework to React, your fancy ids are gone! When you migrate to some rigid, back-end-hooked framework (or a new version of it), your name would probably have to change as well (think ASP.NET). And this is exactly when you want your end-to-end tests to work! Because you just migrated to a new framework!

Therefore, a proper end-to-end testing tool should never hook into the internals of your application, but rather rely on how it looks from the end user’s perspective! Look at the “City” input on the screenshot above. I’d argue that it will always have either a placeholder saying “City” or whatever an end user perceives as a “label”.

Again, based on our experience (don’t trust us, check for yourself), not everyone has a proper HTML structure like Amazon’s, with a label for structure in place. So, unfortunately, you can’t rely on that either.

Therefore, there should be a way to describe an input from an end user’s perspective, relying on what is perceived as a “label” or a placeholder.

And it should look something like this: enter "San Francisco" into "City"

Right?
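To make the idea concrete, here is a minimal sketch in Python (standard library only) of resolving an input by what the user sees. It is an illustration under stated assumptions, not how testRigor or any other tool is implemented: both markup variants, the id `address-city`, and the helpers `find_input` and `find_by_id` are hypothetical. The first variant has a `label for` structure with descriptive ids; the second imitates a page after a framework migration, where only a placeholder remains.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant 1: <label for="..."> plus a descriptive id and name.
CLASSIC = """
<form>
  <label for="address-city">City</label>
  <input id="address-city" name="city" type="text" />
</form>
"""

# Hypothetical variant 2: after a migration the id and name are gone,
# and only the placeholder tells the user what the field is.
MIGRATED = """
<form>
  <div class="css-1x2y3z"><input type="text" placeholder="City" /></div>
</form>
"""


def visible_text(element):
    """Collapse an element's text content the way a reader would see it."""
    return " ".join("".join(element.itertext()).split())


def find_by_id(markup, element_id):
    """Internal-detail lookup: find an input by its id attribute."""
    root = ET.fromstring(markup.strip())
    return root.find(f".//input[@id='{element_id}']")


def find_input(markup, visible_name):
    """User-perspective lookup: match a placeholder first, then a label."""
    root = ET.fromstring(markup.strip())
    inputs = list(root.iter("input"))
    for field in inputs:
        if field.get("placeholder", "").strip() == visible_name:
            return field
    for label in root.iter("label"):
        if visible_text(label) != visible_name:
            continue
        nested = label.find(".//input")  # label wrapping its own input
        if nested is not None:
            return nested
        target_id = label.get("for")
        if target_id is not None:
            for field in inputs:
                if field.get("id") == target_id:
                    return field
    return None


if __name__ == "__main__":
    print(find_by_id(CLASSIC, "address-city") is not None)   # id works here
    print(find_by_id(MIGRATED, "address-city") is not None)  # id lookup breaks
    print(find_input(CLASSIC, "City") is not None)
    print(find_input(MIGRATED, "City") is not None)
```

The id-based lookup only works on the first variant, while the lookup by visible name resolves the field in both.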

Next let’s talk about tables, shall we?

Here is one of the most widely used examples from Salesforce:

What matters from an end user’s point of view is that the row containing ProperUniqueCompany has a certain status, or that the down icon in the last column of that row can be clicked.

So, ideally, it should look something like this:

Validate that table at row containing "ProperUniqueCompany" and column "Lead Status" contains "Open - Not Contacted"

or

Click on the table at the row containing "ProperUniqueCompany" and the last column

which should work regardless of how the table is rendered, whether it’s an HTML <table> (like in the Salesforce example) or <div>-based rendering (like in the Amazon example).
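As a rough illustration of that rendering independence, the sketch below (Python, standard library only) reads a cell by "row containing X, column named Y" from both a `<table>` and a `<div>`-based grid. It is not testRigor's implementation. It assumes the first row holds the column headers and that the `div` version marks its rows with `role="row"`; the sample rows other than the ProperUniqueCompany one, and the helper names, are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data rendered as a classic HTML table.
HTML_TABLE = """
<table>
  <tr><th>Name</th><th>Company</th><th>Lead Status</th><th>Actions</th></tr>
  <tr><td>Jane Roe</td><td>OtherCompany</td><td>Working - Contacted</td><td><button>v</button></td></tr>
  <tr><td>John Doe</td><td>ProperUniqueCompany</td><td>Open - Not Contacted</td><td><button>v</button></td></tr>
</table>
"""

# The same data rendered with divs (assumes rows carry role="row").
DIV_GRID = """
<div role="table">
  <div role="row"><div role="columnheader">Name</div><div role="columnheader">Company</div><div role="columnheader">Lead Status</div><div role="columnheader">Actions</div></div>
  <div role="row"><div role="cell">Jane Roe</div><div role="cell">OtherCompany</div><div role="cell">Working - Contacted</div><div role="cell"><button>v</button></div></div>
  <div role="row"><div role="cell">John Doe</div><div role="cell">ProperUniqueCompany</div><div role="cell">Open - Not Contacted</div><div role="cell"><button>v</button></div></div>
</div>
"""


def visible_text(element):
    """Collapse an element's text content the way a reader would see it."""
    return " ".join("".join(element.itertext()).split())


def rows_of(root):
    """Return every row as a list of cell texts, for <tr> or role="row"."""
    rows = []
    for element in root.iter():
        if element.tag == "tr" or element.get("role") == "row":
            rows.append([visible_text(cell) for cell in element])
    return rows


def cell_text(markup, row_containing, column):
    """Text at the row containing `row_containing` and the column `column`."""
    rows = rows_of(ET.fromstring(markup.strip()))
    header, body = rows[0], rows[1:]  # assumption: first row is the header
    column_index = header.index(column)
    for row in body:
        if any(row_containing in cell for cell in row):
            return row[column_index]
    return None


if __name__ == "__main__":
    for markup in (HTML_TABLE, DIV_GRID):
        print(cell_text(markup, "ProperUniqueCompany", "Lead Status"))
```

The lookup is expressed only in terms of what is visible on screen (a row's content and a column's header), so the same call works for both renderings.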

What users certainly don’t care about are the ids, names, or data-test-ids of those elements. Worse, when those ids or names change, the test fails even though everything is perfectly fine from an end user’s perspective. And that hurts test stability!

Think about it: if you only needed to maintain your tests when the application actually changes, as opposed to whenever the HTML code changes, wouldn’t it be wonderful?

And, fortunately, there is a way now! The examples in this article are actually executable code from testRigor, which you can use for free now.