04/05/2017 Engineering Christopher Reichert

Use these 4 essential techniques to build reliable API and web service test automation for your QA pipeline.

Building automated API tests and QA pipelines for web services is difficult. Some of the most prominent problems with automated end-to-end and acceptance testing are:

  • unreliable and flaky tests (too many false positives)
  • errors that are hard to isolate and identify in large systems
  • slow turnaround time for bug fixes (compared to unit tests)
  • tests that are difficult to make fully hermetic: required setup and teardown steps may leave behind test data that can alter future tests or production systems

Our team has built a lot of test automation at all levels of the stack as we have expanded Assertible. During this process, we've developed some guidelines to help dramatically increase the productivity of automated testing systems and decrease bugs:

  1. Test the smallest unit possible

  2. Create reproducible and deterministic tests

  3. Reduce the number of end-to-end tests

  4. Use opportunistic automation

For the purposes of this blog post, we define the following set of testing concepts:

  • **End-to-End tests**: validate a system with a complete set of components (exactly what the user sees)
  • **Integration tests**: test a specific interface (UI, API, other micro-services via HTTP)
  • **Acceptance tests**: same as integration tests, but usually involve a "workflow"

1. Test the smallest unit possible

Avoid testing unnecessary interfaces and components in your system when running automated QA tests. Verify functionality at the lowest level possible in the system to minimize test dependencies.

For example, a basic modern web app infrastructure may have the following components:

  • API
  • User Dashboard
  • Website

In the context of this infrastructure, testing a user login can be broken down into several small and distinct test units:

  • multiple integration tests against individual components (e.g. API) to validate primary functionality and edge cases. Each test need not be concerned with other components in the infrastructure.

  • one end-to-end test for the primary login workflow to validate functionality from the user's perspective.

    For example, requesting the login page, submitting the login form, then checking some invariant in the response body or on the page, which the user will see. This can also be augmented with tools similar to Selenium and Selenium WebDriver.

Using this technique discourages having many complicated end-to-end tests. In practice, complicated multi-step tests are flaky, break for the wrong reasons, are slow, and have a maintenance burden. There's a more extensive and rigorous discussion about the downsides of complicated end-to-end tests on the Google Testing Blog.


  • test specific interfaces (e.g. API or micro-service, or UI)
  • avoid testing the UI when possible, instead prefer testing an API or isolated backing service
  • validate one feature or action per test
  • create small, isolated tests
  • allocate only the data needed, keep tests hermetic


  • easier to locate the source of errors
  • quicker turnaround time for getting fixes
  • improves reliability of tests
  • improves test performance
  • reduces maintenance burden
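
As a rough illustration of a small, isolated API test, here is a minimal sketch in Python. Everything specific in it is an assumption made for the example: the `/login` path, the JSON payload, the password, and the 200/401 status codes are hypothetical, and the in-process `StubLoginAPI` server only stands in for a real API so the code runs on its own. Each test sends one request and checks one outcome, without touching a dashboard or website.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Ignore any proxy settings so requests always reach the local stub.
NO_PROXY = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class StubLoginAPI(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the API component under test."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        ok = self.path == "/login" and body.get("password") == "s3cret"
        payload = json.dumps({"authenticated": ok}).encode()
        self.send_response(200 if ok else 401)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep test output quiet


def post_login(base_url, password):
    """Send one login request and return the HTTP status code."""
    request = urllib.request.Request(
        base_url + "/login",
        data=json.dumps({"user": "test-user", "password": password}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with NO_PROXY.open(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def run_login_tests():
    """Start the stub, run one assertion per test, then tear everything down."""
    server = HTTPServer(("127.0.0.1", 0), StubLoginAPI)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = "http://127.0.0.1:%d" % server.server_address[1]
    try:
        return {
            "valid credentials": post_login(base_url, "s3cret") == 200,
            "wrong password": post_login(base_url, "nope") == 401,
        }
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_login_tests()` returns a mapping from test name to pass or fail. Against a real service, the stub would be dropped and `base_url` would point at the API in the environment under test.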

2. Create reproducible and deterministic tests

Having reproducible, deterministic test automation is one of the most critical components to developing reliable end-to-end and integration test-suites for QA systems.

Google's Testing Blog describes the problems their testing team has encountered with flaky tests and their approach to fixing them. They report that about 1.5% of all test runs produce a flaky result:

across our entire corpus of tests, we see a continual rate of about 1.5% of all test runs reporting a "flaky" result. We define a "flaky" test result as a test that exhibits both a passing and a failing result with the same code.

Flaky tests have several negative side-effects for teams. They require time to debug and maintain, cause noise in alerting and monitoring notifications, and offer very little value due to these problems.
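
The definition quoted above (a test that both passes and fails on the same code) can be applied mechanically to a history of repeated runs. The sketch below assumes a hypothetical input format, a list of `(test_name, passed)` pairs collected against a single code revision; the test names are made up for the example.

```python
from collections import defaultdict


def classify_tests(runs):
    """Label each test from repeated runs against the same code revision.

    `runs` is an iterable of (test_name, passed) pairs.
    """
    outcomes = defaultdict(set)
    for name, passed in runs:
        outcomes[name].add(bool(passed))

    labels = {}
    for name, seen in outcomes.items():
        if seen == {True, False}:
            labels[name] = "flaky"  # both results observed on identical code
        elif seen == {True}:
            labels[name] = "pass"
        else:
            labels[name] = "fail"
    return labels


# Hypothetical history: three runs of each test on one revision.
history = [
    ("test_login", True), ("test_login", True), ("test_login", True),
    ("test_search", True), ("test_search", False), ("test_search", True),
    ("test_export", False), ("test_export", False), ("test_export", False),
]
labels = classify_tests(history)
```

Here `test_search` is the one labeled flaky, which makes it a candidate for refactoring or removal; `test_export` fails consistently and points at a real bug instead.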

The best way to handle flaky tests is to diligently refactor and remove them from your test-suites when necessary.

In addition to purging flaky tests, it's important to keep test data ephemeral, isolated, and consistent. This is often done by allocating the smallest amount of test data possible for each test. It's also encouraged to have "static" sets of test data that don't change between test runs. For example, having a test account or user ready to go is preferable to allocating test data each time tests are executed.


  • diligently remove or refactor flaky tests
  • keep test data isolated (don't reuse between tests)
  • allocate only the data needed, keep tests hermetic


  • improves maintainability
  • decreases false-positives
  • improves confidence and reliability
  • ability to run tests concurrently

3. Reduce the number of end-to-end tests

End-to-end tests should be used as a last line of defense against bugs in a QA pipeline. For the purposes of this post, we've defined end-to-end tests as tests which validate functionality of the entire system from a user or client's perspective. In contrast, integration tests validate the functionality of a specific interface and its immediate dependencies, such as a database.

Preferably, unit tests and other pre-deployment tests make up the majority of tests in your pipeline and are the first line of defense against bugs and regressions. Additional integration tests should be your second line of defense, used to test isolated interfaces in post-deployment automated testing and monitoring scenarios. A minimal number of end-to-end tests should be used to verify the functionality of primary workflows.

This idea is encapsulated nicely in the Testing Pyramid (as described by the Google Testing Team).

Google's Testing Pyramid - Simplified

As a good first guess, Google often suggests a 70/20/10 split: 70% unit tests, 20% integration tests, and 10% end-to-end tests.

-- Mike Wacker
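
To see how a suite compares with the 70/20/10 guideline, the shares can be computed directly from test counts. This is only a sketch; the counts below are hypothetical, and the split itself is a loose first guess rather than a rule.

```python
TARGET_SPLIT = {"unit": 0.70, "integration": 0.20, "end_to_end": 0.10}


def split_report(counts):
    """Return each level's share of the suite and its distance from the target."""
    total = sum(counts.values())
    report = {}
    for level, target in TARGET_SPLIT.items():
        share = counts.get(level, 0) / total
        report[level] = {
            "share": round(share, 2),
            "delta": round(share - target, 2),  # positive means over the target
        }
    return report


suite = {"unit": 300, "integration": 150, "end_to_end": 50}  # hypothetical counts
report = split_report(suite)
```

With these counts, unit tests make up 60% of the suite (10 points under the guideline) and integration tests 30% (10 points over), which suggests moving some integration coverage down into unit tests.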

Catching bugs lower in the pyramid can drastically save time and effort. Lacking a solid base of unit tests can cause problems further up the pyramid.

When QA tests fail, it's often difficult to find the source of the bug quickly. In general, more communication is required to permanently identify and resolve bugs uncovered using production or post-deployment test automation because more teams (or team members) may need to communicate the failures.

Relying on production data to identify issues means that fixing these issues is very time-sensitive.

-- Rouan Wilsenach

In general, each time a new bug is uncovered using end-to-end tests a new unit test should be added to identify and permanently prevent the bug in the future.

We use this exact process at Assertible each time a new bug is uncovered during our automated post-deployment tests, especially when the offending bug is uncovered in a non-production environment (e.g. staging).
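
A sketch of what moving a bug down the pyramid can look like. The scenario is invented for illustration: suppose an end-to-end login test failed because an email address was entered with mixed case and stray whitespace. Once the fix lands in a hypothetical `normalize_email` helper, a unit test pins the exact input that failed so the bug cannot quietly return.

```python
import unittest


def normalize_email(raw):
    """Hypothetical helper: the fix trims whitespace and lowercases the address."""
    return raw.strip().lower()


class NormalizeEmailRegression(unittest.TestCase):
    def test_mixed_case_and_whitespace(self):
        # Reproduces the input from the failed end-to-end run.
        self.assertEqual(normalize_email("  Jane@Example.COM "), "jane@example.com")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeEmailRegression)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The unit test runs in milliseconds before deployment, so the slower end-to-end test no longer has to be the one that catches this regression.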


  • rely on unit tests as the first line of defense against regressions
  • 70% unit tests, 20% integration tests, and 10% end-to-end tests (loose guideline)
  • continuously create new unit tests each time integration or end-to-end testing/monitoring identifies a bug


  • fewer end-to-end tests lower maintenance costs
  • quicker recovery time to identify and fix bugs
  • fewer bugs make it to production
  • future bugs prevented by moving tests down the pyramid

4. Use opportunistic automation

There are two primary times to run automated API tests:

  1. Immediately after a deployment. See our post about preventable deployment failures for further reading.

  2. On a schedule for continuous monitoring

Automated tests should also be run consistently against staging and other testing environments, not only production. Ideally, a continuous delivery or deployment system should handle deploying every release candidate to a staging or testing environment where automated test-suites and QA are executed.

A high-quality continuous integration and deployment pipeline makes it easier to avoid testing only your production systems, especially when using a consistent set of deployment and validation scripts on each and every environment.
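
Running one set of validation checks against every environment might be sketched as follows. The environment names, URLs, paths, and checks are all hypothetical, and `fake_fetch` is a canned stand-in for a real HTTP client so the example runs without a network.

```python
def check_health(fetch, base_url):
    status, body = fetch(base_url + "/health")
    return status == 200 and body.get("status") == "ok"


def check_login_page(fetch, base_url):
    status, _ = fetch(base_url + "/login")
    return status == 200


CHECKS = [check_health, check_login_page]


def run_suite(fetch, environments):
    """Run every check against every environment and return the failures."""
    failures = []
    for env, base_url in environments.items():
        for check in CHECKS:
            if not check(fetch, base_url):
                failures.append((env, check.__name__))
    return failures


def fake_fetch(url):
    """Stand-in transport returning canned (status, json_body) pairs."""
    canned = {
        "https://staging.example.test/health": (200, {"status": "ok"}),
        "https://staging.example.test/login": (500, {}),
        "https://www.example.test/health": (200, {"status": "ok"}),
        "https://www.example.test/login": (200, {}),
    }
    return canned.get(url, (404, {}))


ENVIRONMENTS = {
    "staging": "https://staging.example.test",
    "production": "https://www.example.test",
}
failures = run_suite(fake_fetch, ENVIRONMENTS)
```

In this canned data the login page is broken on staging only, so the failure is reported before the release reaches production. The same `run_suite` call could be triggered from a deployment hook and from a scheduler, which covers both of the run times listed above.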

Relying entirely on [...] testing in production for an understanding of the quality of your system is an Anti Pattern. [...] Finding the right balance of pre-production and production quality practices can help you gain a more realistic and holistic understanding of the quality of your system.

-- Rouan Wilsenach


  • run automated integration and end-to-end tests after every deployment
  • use scheduled monitoring to identify regressions between deployments
  • test production and staging environments with identical test-suites


  • identify bugs and regressions before they hit production
  • quicker iterations between features



I've highlighted some of the core concepts, problems, and best practices for developing and maintaining high quality automated API and web application testing systems. The primary take-away is to test the smallest unit possible and always have unit tests.

Let's talk more about API testing and QA! Send me a message on Twitter and let me know your thoughts.


:: Christopher Reichert

