Benefits of Browser-based load testing
Estimated read time: 30 min
Technical level: Intermediate
What you’ll learn: Why and how to build a scalable browser farm for E2E web automation
Ideal profile(s): Load Tester, Automation Specialist
Author: Dorian Cransac (exense GmbH)
Introduction
For the last 25 years, in traditional IT environments, much of the load testing of web applications has been done at the HTTP level, while E2E functional testing has often been done manually using a real browser. More recently, however, Selenium has emerged as an alternative way to automate web workflows by spawning a real browser instance and interacting directly with the elements present in web pages. In this article, we’ll take a look at some of the pros and cons of this approach and what it means in terms of cost once all factors are taken into account.
The pitfalls of HTTP
Most of our readers here will have had experience with or at least heard of LoadRunner, JMeter, Gatling, Grinder or SoapUI / LoadUI.
For a long time, these highly serviceable tools were the only viable option for simulating user traffic at scale against web applications, the main use case being load testing. Thanks in large part to the advent of cheap commodity hardware and horizontally scalable, distributed computing platforms, this is no longer the case: it is now technically possible to build large automation clusters relatively cheaply.
So let’s take a look at some of the pitfalls of the old-school load testing tools.
The code base nightmare
The ever-increasing complexity of frontend web frameworks, and of the tasks handled by modern browsers, makes both writing and maintaining scripts increasingly difficult.
One of the problems is that the number of request-response pairs increases substantially, and requests become progressively harder to variabilize and replay in a way that makes sense to the server. With many frameworks producing cryptic request headers, parameters and contents, the resulting code becomes very difficult for a human to read and understand without investing a lot of time in tedious debugging sessions and comparisons of HTTP logs. On top of traditional HTTP scripts, newer technologies such as WebSockets and Server-Sent Events introduce new protocols as well as asynchronous workflows which are particularly difficult to script correctly.
Eventually the code base becomes too complex and cluttered and a lot of code gets thrown away. People don’t or can’t take the time to truly understand it, modularize it and build a large component-oriented codebase. Instead, they’ll just re-capture and re-variabilize their scripts from scratch at the beginning of each testing campaign. It’s just less time-consuming than maintaining the existing code.
For the sake of illustrating this point, let’s look at this simple one-liner in Selenium:
driver.get("http://www.bing.com");
It causes the browser to navigate to the home page of Microsoft’s search engine, Bing.com, and retrieve all of the data necessary to display the page, just as a normal browser would. Now let’s compare this with the HTTP code that a tool like Grinder would generate:
# Imports as emitted by Grinder's TCPProxy script generator
from net.grinder.script import Test
from net.grinder.plugin.http import HTTPPluginControl, HTTPRequest
from HTTPClient import NVPair

connectionDefaults = HTTPPluginControl.getConnectionDefaults()

def createRequest(test, url, headers=None):
    """Create an instrumented HTTPRequest."""
    request = HTTPRequest(url=url)
    if headers: request.headers = headers
    test.record(request, HTTPRequest.getHttpMethodFilter())
    return request
connectionDefaults.defaultHeaders = \
[ NVPair('User-Agent',
'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Firefox/50.0'),
NVPair('Accept-Encoding', 'gzip, deflate'),
NVPair('Accept-Language', 'fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3'), ]
headers0= \
[ NVPair('Accept',
'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8')]
headers1= \
[ NVPair('Accept', '*/*'),
NVPair('Referer', 'http://www.bing.com/'), ]
headers2= \
[ NVPair('Accept',
'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
NVPair('Referer', 'http://www.bing.com/'), ]
url0 = 'http://bing.com:80'
url1 = 'http://www.bing.com:80'
request101 = createRequest(Test(101, 'GET /'), url0, headers0)
request201 = createRequest(Test(201, 'GET /'), url1, headers0)
request202 = createRequest(Test(202, 'GET hpc18.png'), url1, headers1)
request203 = createRequest(Test(203, 'GET bing_p_rr_teal_min.ico'), url1, headers0)
request301 = createRequest(Test(301, 'GET l'), url1, headers1)
request401 = createRequest(Test(401, 'POST lsp.aspx'), url1, headers2)
request402 = createRequest(Test(402, 'GET 8665a969.js'), url1, headers1)
request403 = createRequest(Test(403, 'GET 3adc8d70.js'), url1, headers2)
request404 = createRequest(Test(404, 'GET f1d86b5a.js'), url1, headers2)
request405 = createRequest(Test(405, 'GET 89faaefc.js'), url1, headers2)
request406 = createRequest(Test(406, 'GET 015d6a32.js'), url1, headers2)
request407 = createRequest(Test(407, 'GET ResurrectionBay.jpg'), url1, headers1)
request408 = createRequest(Test(408, 'GET d0c1edfd.js'), url1, headers1)
request409 = createRequest(Test(409, 'GET HPImgVidViewer_c.js'), url1, headers1)
#[...] (you get the point...)
Not only do the size and complexity of the code differ by several orders of magnitude, but the Selenium version is so simple that no capture (and, in certain cases, not even any maintenance effort) is needed!
Generating realistic load is hard
Simulations can go wrong in many ways. Severe issues can slip through tests due to technical simplifications, but one can also end up generating false positives that only occur because of a slight technical difference between the simulation and the real-world scenario. Since no real-world user can ever trigger these problems, no one should care about them, but you won’t know that until you’ve fully analyzed them. This second category of problems is usually the worst, because it adds unnecessary workload and wasted time for many of the contributors involved with the project.
And again, with wider responsibilities on the browser’s side, it becomes continuously harder to manage technical accuracy. The following are examples of technical details which can prove to have a huge impact on performance or even functional correctness in production:
- static resource caches
- HTTP/TCP connection lifecycles
- request concurrency & timing
- async and event-driven scenarios (SSE, WebSockets, etc.)
- security settings (authentication type, encryption, etc.)
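To make the first point concrete, here is a minimal sketch (with invented resource names and counts) of how much a static-resource cache alone changes the load a server actually sees over repeated page loads — a detail an HTTP-level script must model explicitly, while a real browser handles it for free:

```python
def simulate_page_loads(resources, page_loads, use_cache):
    """Count requests actually sent to the server across repeated page loads."""
    cache = set()
    requests_sent = 0
    for _ in range(page_loads):
        for url in resources:
            if use_cache and url in cache:
                continue  # served from the browser cache, no server hit
            requests_sent += 1
            cache.add(url)
    return requests_sent

# Hypothetical page with one document and three static assets
resources = ["/", "/app.js", "/style.css", "/logo.png"]
print(simulate_page_loads(resources, page_loads=10, use_cache=False))  # 40
print(simulate_page_loads(resources, page_loads=10, use_cache=True))   # 4
```

A simulation that forgets caching sends ten times the traffic of one that models it — either figure can invalidate a test depending on what real users do.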
As an engineer, I can think of multiple cases in which I’ve detected issues in browser-based simulations which would have assuredly gone unnoticed if said simulation had been implemented at HTTP level instead. I can also think of numerous instances in which we ran into problems at HTTP level, which would have never existed when using a browser.
Let’s take a quick look at the sequence of requests on the bing.com web page, captured with Chrome’s DevTools.
As you can see, up to 3 requests are sent in parallel in this example, but many requests are also sent sequentially, with timings that may vary. Concurrency depends on many factors and can be very tricky at times.
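The effect of such a concurrency cap can be sketched with `asyncio` — here `asyncio.sleep()` stands in for real network I/O, and the cap of 3 simply mirrors the capture above (real browsers commonly allow around 6 connections per host for HTTP/1.1); all timings are illustrative:

```python
import asyncio

MAX_PARALLEL = 3  # cap observed in the capture above; purely illustrative

async def fetch(semaphore, name, duration):
    # asyncio.sleep() stands in for actual network time
    async with semaphore:
        await asyncio.sleep(duration)
        return name

async def load_page(requests):
    semaphore = asyncio.Semaphore(MAX_PARALLEL)
    loop = asyncio.get_running_loop()
    start = loop.time()
    await asyncio.gather(*(fetch(semaphore, n, d) for n, d in requests))
    return loop.time() - start

# Six 100 ms requests: with at most 3 in flight, they complete in two waves
# (~200 ms), rather than all at once (~100 ms) or sequentially (~600 ms).
elapsed = asyncio.run(load_page([("req%d" % i, 0.1) for i in range(6)]))
print(round(elapsed, 1))
```

An HTTP-level script has to reproduce this queueing behaviour by hand; a browser does it natively.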
User experience is tricky
Even when everything goes right with an HTTP simulation, it can be pretty hard to draw clear conclusions from the results regarding the quality of the experience, as perceived by the end-user. The response time of a request may or may not match the amount of time that a real user would be waiting for. In fact, in most cases, it doesn’t.
For instance, you may find that all of the individual requests are relatively fast, but once you add them up to reflect reality more accurately, the sequence as a whole becomes slow. Then again, if you don’t take concurrency into account, you may be exaggerating the seriousness of the performance issue. How can one set meaningful SLAs on such low-level components? And even if you could, how many people in your organization would be able to understand them?
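A small worked example makes the distortion obvious. The (start, duration) pairs below are invented for a hypothetical page, but the arithmetic is the point: the naive sum of response times and the wait the user actually perceives are two different numbers.

```python
# Invented (start, duration) pairs, in seconds, for four overlapping requests
requests = {
    "html":  (0.00, 0.30),
    "css":   (0.30, 0.20),
    "js":    (0.30, 0.25),
    "image": (0.30, 0.15),
}

# Naive view: add up every individual response time
naive_total = sum(duration for _, duration in requests.values())

# Perceived wait: from the first request start to the last response end
perceived = (max(start + duration for start, duration in requests.values())
             - min(start for start, _ in requests.values()))

print(round(naive_total, 2))  # 0.9  -> looks slow if you just add everything up
print(round(perceived, 2))    # 0.55 -> what the user actually waits for
```

An SLA set on either number alone would be misleading; only the browser-level transaction reflects what the user sees.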
Even in a best-case scenario, you’re still potentially way off, since you’re not yet accounting for JavaScript execution time, the order in which individual elements are loaded on the page, which elements are seen first by the user, or how long they take to render once the raw data has been received by the client.
Here’s another view from Chrome’s DevTools which illustrates the complexity of the browser’s own stack:
You can see that the time spent in the network layer accounts for only part of the overall page load time. There are big holes in the network view which have to be accounted for by other layers of code, such as JavaScript execution and rendering.
You might also spend a lot of time investigating slow requests which are completely secondary because they’re happening in the background and are not contributing much to the user’s experience.
In short, while a protocol-level simulation may be good enough for server-side testing, it can’t accurately reflect user experience.
The browser world
In this section, you’ll see how all of the problems originating from the HTTP simulation either disappear or become much easier to deal with in a browser-based simulation.
A cleaner API: WebDriver
As far as scripting goes, it simply is much easier to work with events and DOM elements than with HTTP requests and responses. Requests carry information which is often difficult for humans to read and manipulate and often have nothing to do with the state of the user’s interface.
The WebDriver API, which originally shipped with Selenium but has since expanded to a much broader context, exposes basic interactions with a browser, using locators such as XPath to navigate its tree of HTML elements (the DOM). Here’s what the API looks like (from Selenium’s documentation):
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Selenium2Example {
    public static void main(String[] args) {
        // Create a new instance of the Firefox driver
        WebDriver driver = new FirefoxDriver();

        // And now use this to visit Google
        driver.navigate().to("http://www.google.com");

        // Find the text input element by its name
        WebElement element = driver.findElement(By.name("q"));

        // Enter something to search for
        element.sendKeys("Cheese!");

        // Now submit the form. WebDriver will find the form for us from the element
        element.submit();

        // Wait for the page to load, timeout after 10 seconds
        (new WebDriverWait(driver, 10)).until(new ExpectedCondition<Boolean>() {
            public Boolean apply(WebDriver d) {
                return d.getTitle().toLowerCase().startsWith("cheese!");
            }
        });

        // Should see: "cheese! - Google Search"
        System.out.println("Page title is: " + driver.getTitle());

        // Close the browser
        driver.quit();
    }
}
Working directly with the browser also makes for much more concise code, as we’ve seen in the earlier sections of this article. While it is possible to record scripts with tools such as Selenium IDE, in most cases this isn’t even needed. An experienced automation engineer will simply be able to code their way through a test scenario after looking at the API a few times, which itself is extremely simple.
In the end, what really matters is one’s ability to maintain and reuse code over time, across releases, and through common components shared between projects. Recording is only the very tip of the iceberg, since the whole point in the long run is to avoid having to redo it every time something changes in the target application (not just re-recording the scenario itself, but also all of the work that comes with it, such as input variabilization).
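One common way to achieve this reuse is the page-object pattern: each page exposes business-level actions, so when a selector changes, the fix happens in exactly one place. The sketch below is illustrative only — `FakeDriver` is a hypothetical stand-in (its methods are not the real WebDriver API) so the example runs without a browser; with Selenium, the page objects would wrap a real driver instead.

```python
class FakeDriver:
    """Hypothetical stand-in for a WebDriver instance (illustration only)."""
    def __init__(self):
        self.actions = []
    def get(self, url):
        self.actions.append(("get", url))
    def find_element_and_type(self, locator, text):
        self.actions.append(("type", locator, text))
    def find_element_and_click(self, locator):
        self.actions.append(("click", locator))

class SearchPage:
    """Page object: locators live here, test scenarios never see them."""
    URL = "http://www.bing.com"
    SEARCH_BOX = "q"  # if the element is renamed, fix it once, here
    def __init__(self, driver):
        self.driver = driver
    def open(self):
        self.driver.get(self.URL)
        return self
    def search(self, query):
        self.driver.find_element_and_type(self.SEARCH_BOX, query)
        self.driver.find_element_and_click("search-button")
        return self

driver = FakeDriver()
SearchPage(driver).open().search("cheese")
print(len(driver.actions))  # 3 recorded interactions
```

Test scenarios written against such components survive application changes that would force a full re-capture of an HTTP-level script.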
The runtime is a black box
All of the low-level HTTP configuration and implementation details are now handled by the browser itself, making configuration easy. No need to worry about resource caching, request concurrency, or any other technical detail. If it works in your own browser, it will work in the simulation! It’s that simple. The only thing you’ll have to worry about is making sure to re-use your browser instance the way a normal user would, but that’s the same as managing the lifecycle of your HTTP client.
Another thing made a lot easier in the browser case is replay. Analyzing HTTP logs to find correlation issues can be tedious. Most HTTP tools either don’t support HTML visualization or don’t support it well. In any case, JavaScript code won’t be executed, so you’ll be deprived of many clues which would have helped you diagnose issues in your scripts. With browser emulation, you get instant replay with all of the necessary information. You can even capture screenshots and send them back as part of the outputs of your script.
Real user experience
In a browser-based simulation, direct interactions with the web page’s elements and events allow for the definition of meaningful transactions. Everyone can understand SLAs that are set on events visible to the user. In turn, this means that both technical (engineers, developers) and non-technical staff (business analysts, project stakeholders) can speak the same language when discussing NFRs.
Cross-browser testing
Using different browsers (and browser versions) may lead to different results for the end-user.
One of the advantages of using a library such as Selenium along with a full-blown browser instance is the ability to run the same collection of scripts and test scenarios across many different OS & browser stack combinations, while performing deep checks on the client.
Without JavaScript code execution, most presentation and navigation issues would go unnoticed.
Exceptions to the rule
Plain HTTP(S) automation is still required from time to time, for instance in integration testing, where people only want to run load against their server’s REST API and aren’t interested in simulating end-to-end scenarios.
Note: if you’re using a tool like step, you don’t have to choose between Selenium and HTTP libraries. In addition to full Selenium support, plugins leveraging the runtime libraries of certain legacy tools (for instance, JMeter or SoapUI) are provided. This way, people get the best of both worlds: a lightweight scripting environment they’re already used to, while benefiting from all of the superior services provided by a modern solution like step (workload distribution, central aggregation and archiving, collaborative workflows, etc.).
Breaking down the total cost of ownership (TCO)
If you’ve managed a technical testing team or large-scale automation projects, you know that costs are difficult to manage. In this section, we’ll look at some of the most critical factors that drive costs.
Test code maintenance
Automation projects bring the same challenges as regular software projects, in the sense that they require the delivery, execution and maintenance of distributed code.
Despite best efforts, applications will undergo last-minute changes, simulations will grow more complex, issues will be identified and additional tests designed. As explained in the last section, managing the costs of test code maintenance is key and all of these facts speak very much in favor of browser-based automation.
The costs of creating and maintaining test code vary greatly based on a number of factors, such as the frequency of change, the complexity of the application and the experience of the engineer. However, for a moderately complex application, our experience suggests that switching from protocol-based to browser-based emulation can cut these costs by a factor of at least 3, and up to a factor of 5 in particularly tricky scenarios.
Infrastructure
Of course, the costs of infrastructure which is required to power the automation will also have to be factored in and kept within reasonable boundaries.
To this day, hardware allocation can be a difficult topic in many corporate environments. Even with the advent of virtualization in past decades, VMs didn’t come cheap. This is one of the reasons, along with reluctance to change, why some legacy tools are still popular: with their lower client footprint, hardware is usually not the issue for them; test code maintenance is.
Hardware resources are, however, becoming cheaper every day, especially in the cloud. If your company’s in-house offering drags you down, connecting an external cloud-based test farm may not be as hard as it seems. We’ve had a lot of success connecting our own cloud-based solutions to our clients’ own infrastructure, resulting in a hybrid testing environment. Cloud providers contribute on a daily basis to the reduction of infrastructure costs, so it is more and more common to see clients entirely rebuild their testing platform.
Tooling
Last but not least, client code has to be distributed across many hosts, and results have to be aggregated for analysis and then archived. This is where a lot of tools come up short and where [step](https://step.exense.ch/) comes into play.
step is a language-agnostic automation platform designed from the ground up for scalability. Not only is it compatible with libraries such as Selenium, but it takes care, once and for all, of many responsibilities which would otherwise fall on the shoulders of the test engineers, such as central resource management, result aggregation and archiving or real-time monitoring & reporting capability.
Most importantly, elastic scalability with step is made very easy, as any number of agents can join and leave the automation grid at any point in time even while a test is running.
A concrete cost analysis
With these different sources of cost in mind, let’s take a look at an all-in-one solution to get an idea of what an average browser-based test campaign costs. The price discussed here covers the entire platform, including infrastructure as well as the software used to distribute the load in the cloud (step). The only cost not accounted for here is the cost of script development.
Since the TCO of an on-premise cluster is much more complex to price, we’ll use exense’s SaaS prices as a reference here. We will base our calculation on the hypothesis of a mid-size cluster featuring 100 concurrent browser instances. Assuming an average transaction duration of 800 ms, a cluster this size would support the simulation of around 7'500 interactions per minute (or 125 TpS), which is more than enough for many projects.
In our cloud, right now, activating such a cluster would cost $21 per hour. Assuming you need to perform extensive operations, such as complex load testing for four hours, it would cost you only $84. For any decently sized corporation, allocating these funds every day should be more than affordable. To satisfy data safety requirements, exense also guarantees that any data managed in the cloud physically stays within Switzerland.
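The arithmetic behind these figures is straightforward and worth sanity-checking (the inputs are exactly those stated above: 100 browsers, 800 ms per transaction, $21/hour):

```python
browsers = 100        # concurrent browser instances in the cluster
transaction_s = 0.8   # average transaction duration in seconds

tps = browsers / transaction_s   # each browser completes 1/0.8 transactions/s
per_minute = tps * 60            # interactions per minute

rate_per_hour = 21               # USD, SaaS price quoted above
campaign_cost = rate_per_hour * 4  # four-hour load test

print(tps, per_minute, campaign_cost)  # 125.0 7500.0 84
```

Scaling the cluster up or down simply moves these numbers linearly, which is what makes the per-hour pricing model easy to budget for.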
If you prefer to shoulder these tasks and run your own cluster on-premise, you definitely can (and many of our clients do, for different reasons), but this will likely happen at a higher TCO.
Conclusion
In this article, we believe we’ve demonstrated that the true TCO of a browser-based simulation can be orders of magnitude smaller than the TCO of a traditional HTTP simulation, when all of the key sources of cost are factored in.
Please note that we, exense, do not have a horse in the race here. Our flagship automation platform offers both capabilities and lets you decide which library to use and which test strategy to choose.
But we strongly believe that in most cases, browser automation should simply be favoured.