4 All Pairs Testing - essenius/AcceptanceTesting GitHub Wiki
Background
When you have an increasing number of configurations to test, you will get hit by combinatorial explosion: there are so many possible test cases that it becomes infeasible to execute them all. For example, you may need to test your web application on 8 client operating systems, 6 browser versions, 4 server operating systems and 3 web server versions. Testing all possible combinations would lead to 8 * 6 * 4 * 3 = 576 different configurations.
Most defects tend to show up with a specific value of a single input variable, or with an interaction between a specific pair of values. A study by NIST reported (page 5):
Interaction Rule: Most failures are induced by single factor faults or by the joint combinatorial effect (interaction) of two factors, with progressively fewer failures induced by interactions between three or more factors.
This means that most defects are likely to be caught if you make sure to cover all pairs of these 'factors'. For example, when testing a web application, a defect may only show up on Windows 10, or it may only show up on Windows 11 with Google Chrome. It is much less likely that a defect shows up only in the three-way combination of Windows 11, Google Chrome and Apache.
Therefore, if you ensure that all possible pairs of parameter values are covered at least once in your test cases, you will have a high defect identification capability with substantially fewer test cases than would be required for all possible combinations. This is the idea behind All Pairs Testing (a.k.a. Pairwise Testing).
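To make the arithmetic concrete, here is a small Python sketch using the figures from the example above. It compares the number of full combinations with the number of distinct value pairs, and computes the theoretical lower bound on a pairwise suite (the product of the two largest parameter sizes, since those pairs can each only be covered by one test case):

```python
from itertools import combinations
from math import prod

# Parameter sizes from the example: client OS, browser version,
# server OS, web server version
sizes = [8, 6, 4, 3]

full_coverage = prod(sizes)  # every possible configuration
distinct_pairs = sum(a * b for a, b in combinations(sizes, 2))
# A pairwise suite can never be smaller than the largest
# two-parameter product:
lower_bound = max(a * b for a, b in combinations(sizes, 2))

print(full_coverage, distinct_pairs, lower_bound)  # 576 158 48
```

So instead of 576 configurations, a pairwise suite only needs to cover 158 distinct pairs, which can be done in a number of tests close to the lower bound of 48 because each test case covers several pairs at once.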
Example
We will use a small example to show the principle without requiring too large tables of test cases. Assume we have the following set of configurations we want to test:
- 3 client operating systems (OS): Windows, OSX and Linux
- 4 browsers: Firefox, Chrome, Safari and Edge
- 2 server OSes: Windows and Linux
- 2 web servers: Apache and Lighttpd
To achieve full coverage, we would need to create 3 x 4 x 2 x 2 = 48 configurations. Instead, we design a set of configurations that contains all pairs of Client OS: Browser, Client OS: Server OS, Client OS: Web Server, Browser: Server OS, Browser: Web Server, and Server OS: Web Server. For example, All Pairs of Server OS: Web Server would be Windows: Apache, Windows: Lighttpd, Linux: Apache and Linux: Lighttpd. The trick that All Pairs applies is covering more than one pair in each test case.
Generally, you can expect that All Pairs Testing will result in a number of test cases roughly equal to the number that would be required to obtain full coverage for the two parameters with the largest numbers of test values. In this case, we expect that we end up close to 3 x 4 = 12 test cases.
A set of tests that satisfies the all-pairs criterion is shown in Table 1.
Table 1 A Set of Tests Satisfying the All Pairs Criterion
Test Case | Client OS | Browser | Server OS | Web Server |
---|---|---|---|---|
1 | Windows | Chrome | Linux | Lighttpd |
2 | OSX | Firefox | Windows | Apache |
3 | Linux | Edge | Windows | Lighttpd |
4 | Windows | Safari | Windows | Apache |
5 | OSX | Edge | Linux | Apache |
6 | Linux | Firefox | Linux | Lighttpd |
7 | OSX | Chrome | Windows | Apache |
8 | OSX | Safari | Linux | Lighttpd |
9 | Linux | Safari | Linux | Apache |
10 | Windows | Firefox | Linux | Lighttpd |
11 | Windows | Edge | Windows | Lighttpd |
12 | Linux | Chrome | Windows | Apache |
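The hand validation below can also be done mechanically. Here is a small Python sketch (the tuples are transcribed from Table 1) that enumerates every required pair and reports any that are missing:

```python
from itertools import combinations, product

# Table 1 as (client OS, browser, server OS, web server) tuples
tests = [
    ("Windows", "Chrome",  "Linux",   "Lighttpd"),
    ("OSX",     "Firefox", "Windows", "Apache"),
    ("Linux",   "Edge",    "Windows", "Lighttpd"),
    ("Windows", "Safari",  "Windows", "Apache"),
    ("OSX",     "Edge",    "Linux",   "Apache"),
    ("Linux",   "Firefox", "Linux",   "Lighttpd"),
    ("OSX",     "Chrome",  "Windows", "Apache"),
    ("OSX",     "Safari",  "Linux",   "Lighttpd"),
    ("Linux",   "Safari",  "Linux",   "Apache"),
    ("Windows", "Firefox", "Linux",   "Lighttpd"),
    ("Windows", "Edge",    "Windows", "Lighttpd"),
    ("Linux",   "Chrome",  "Windows", "Apache"),
]

# The value set of each parameter, derived from the test cases themselves
values = [sorted({t[i] for t in tests}) for i in range(4)]

# Every (parameter pair, value pair) combination that no test case covers
missing = [
    (i, j, a, b)
    for i, j in combinations(range(4), 2)
    for a, b in product(values[i], values[j])
    if not any(t[i] == a and t[j] == b for t in tests)
]
print(missing)  # [] means every pair is covered
```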
You can validate by hand that this set indeed contains all pairs. For example, test cases 1, 4, 10 and 11 cover all browsers on Windows. Cases 1, 7 and 12 cover all client OSs with Chrome. At the same time, cases 1 and 7 cover all web servers with Chrome as well as all server OSs with Chrome.
All Pairs Testing on its own is no guarantee for high defect finding capability, as shown in the paper "Pairwise Testing: A Best Practice That Isn't" by James Bach & Patrick Schroeder. They state, among other things: "Pairwise testing can only protect against pairwise interactions of the input values you happen to select for testing". In other words, if your input values don't cover all equivalence classes, you are very likely to miss defects. All Pairs Testing can be a very useful tool, but it must be applied consciously.
All Pairs Testing Tooling
Jenny
Creating the All Pairs set by hand can be a tedious task. Fortunately, there are freely available tools that can help us, for example Jenny, which was used to help create Table 1. Jenny's command-line arguments are the numbers of values of the input parameters. To generate test cases for the above example, we use:
C:\Apps\jenny>jenny 3 4 2 2
1a 2b 3b 4b
1b 2a 3a 4a
1c 2d 3a 4b
1a 2c 3a 4a
1b 2d 3b 4a
1c 2a 3b 4b
1b 2b 3a 4a
1b 2c 3b 4b
1c 2c 3b 4a
1a 2a 3b 4b
1a 2d 3a 4b
1c 2b 3a 4a
Jenny uses a very simple method to display the values: a number for the input parameter and a letter for the value of the parameter. The only thing you need to do yourself is to make the mapping: 1 = Client OS, value a = Windows, b = OSX, c = Linux, etc.
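For larger runs, this substitution can be scripted. Here is a small Python helper (the `names` table encodes the mapping used for Table 1) that translates a line of Jenny output into readable values:

```python
# Mapping from Jenny's parameter numbers to value names,
# as used for Table 1: value a is the first entry, b the second, etc.
names = {
    "1": ["Windows", "OSX", "Linux"],              # Client OS
    "2": ["Firefox", "Chrome", "Safari", "Edge"],  # Browser
    "3": ["Windows", "Linux"],                     # Server OS
    "4": ["Apache", "Lighttpd"],                   # Web Server
}

def translate(line):
    """Turn a Jenny output line such as '1a 2b 3b 4b' into value names."""
    return [names[tok[0]]["abcdefgh".index(tok[1])] for tok in line.split()]

print(translate("1a 2b 3b 4b"))  # test case 1: Windows, Chrome, Linux, Lighttpd
```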
It is important to implement all test cases that an All Pairs algorithm provides. In practice, however, not all pairs may be possible. For example, assume that we also want to test on the Internet Information Services (IIS) web server. By default, Jenny would then generate test cases that include Linux: IIS, which won't work because IIS is only available for Windows. But if you simply skip the Linux: IIS tests, you are likely to lose other pairs as well. To solve that issue, Jenny allows you to specify pairs that you don't want to see in the test set. This is how you specify that the Linux: IIS pair (3b: 4c) should not occur:
C:\Apps\jenny>jenny 3 4 2 3 -w3b4c
1a 2b 3b 4b
1b 2c 3a 4c
1c 2d 3b 4a
1b 2a 3b 4a
1a 2b 3a 4a
1c 2a 3a 4b
1a 2c 3b 4b
1a 2d 3a 4c
1b 2d 3b 4b
1c 2b 3a 4c
1a 2a 3a 4c
1c 2c 3a 4a
1b 2b 3b 4a
Notice that indeed IIS (4c) is only paired with Server OS Windows (3a). Also, all combinations of Client OS (1a, 1b, 1c) and IIS as well as Browser (2a, 2b, 2c, 2d) and IIS are included.
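To illustrate what such a tool does under the hood, here is a toy greedy generator in Python. This is not Jenny's actual algorithm (Jenny's is more sophisticated); it is just a sketch of the idea: keep picking the full combination that covers the most still-uncovered pairs, while never selecting a combination that contains a forbidden pair:

```python
from itertools import combinations, product

def pairwise_suite(parameters, forbidden=()):
    """Greedy pairwise generation. Forbidden pairs are given as
    ((index, value), (index, value)) tuples with the lower index first."""
    forbidden = set(forbidden)

    def pairs_of(test):
        return {((i, test[i]), (j, test[j]))
                for i, j in combinations(range(len(test)), 2)}

    # Candidate test cases: all combinations without a forbidden pair
    candidates = [t for t in product(*parameters) if not pairs_of(t) & forbidden]
    # Pairs we have to cover: everything a feasible candidate can cover
    uncovered = set().union(*(pairs_of(t) for t in candidates))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda t: len(pairs_of(t) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite

client = ["Windows", "OSX", "Linux"]
browser = ["Firefox", "Chrome", "Safari", "Edge"]
server_os = ["Windows", "Linux"]
web_server = ["Apache", "Lighttpd", "IIS"]

# Exclude the impossible Linux server OS / IIS pair (Jenny's -w3b4c)
suite = pairwise_suite([client, browser, server_os, web_server],
                       forbidden={((2, "Linux"), (3, "IIS"))})
print(len(suite))
```

A greedy approach like this usually lands close to, but not always at, the minimum suite size, which is why different tools produce slightly different counts for the same model.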
In the same fashion, you can expand the tests to include browser testing on Internet Explorer (2e), while excluding Internet Explorer tests on OSX (1b) and Linux (1c):
C:\Apps\jenny>jenny 3 5 2 3 -w3b4c -w1b2e -w1c2e
1a 2e 3b 4b
1b 2c 3a 4c
1c 2d 3b 4a
1c 2a 3a 4b
1a 2b 3a 4a
1b 2a 3b 4a
1a 2d 3a 4c
1b 2b 3b 4b
1a 2c 3b 4b
1a 2e 3a 4c
1c 2b 3a 4c
1a 2a 3a 4c
1c 2c 3b 4a
1b 2d 3b 4b
1a 2e 3b 4a
Sometimes there is value in going beyond All Pairs and taking All Triplets. This will substantially increase the number of test cases, so you should only do that if there is a very good reason. Jenny supports All Triplets by using the -n parameter. Notice how the number of test cases triples:
C:\Apps\jenny>jenny 3 5 2 3 -w3b4c -w1b2e -w1c2e | find /c "1"
15
C:\Apps\jenny>jenny -n3 3 5 2 3 -w3b4c -w1b2e -w1c2e | find /c "1"
45
PICT
Another freely available tool is Microsoft's Pairwise Independent Combinatorial Testing tool, or PICT. With that, you need to do a bit more upfront work to create your specification as a model file. The benefit is that you get the parameter values immediately, without the need for substitution. The model file corresponding to the last Jenny example would be:
Client OS: Windows, OSX, Linux
Browser: Firefox, Chrome, Safari, Edge, Internet Explorer
Server OS: Windows, Linux
Web Server: Apache, Lighttpd, IIS
IF [Browser] = "Internet Explorer" THEN [Client OS] = "Windows";
IF [Web Server] = "IIS" THEN [Server OS] = "Windows";
Run it as follows:
C:\Apps\PICT>pict model.txt
The result will be a tab delimited text table:
Client OS Browser Server OS Web Server
OSX Edge Windows Lighttpd
Windows Safari Linux Lighttpd
Linux Chrome Windows Lighttpd
Windows Edge Windows IIS
Windows Internet Explorer Windows IIS
Linux Edge Linux Apache
OSX Safari Windows Apache
OSX Chrome Windows IIS
Linux Safari Windows IIS
OSX Firefox Linux Lighttpd
Windows Firefox Windows Apache
Windows Chrome Linux Apache
Windows Internet Explorer Linux Lighttpd
Linux Firefox Windows IIS
Windows Internet Explorer Linux Apache
Also, PICT allows you to create all-triplets:
C:\Apps\PICT>pict model.txt | find /c /v ""
16
C:\Apps\PICT>pict model.txt -o:3 | find /c /v ""
42
In this case, PICT seems to be a bit more efficient than Jenny: it requires only 41 cases for all triplets (we need to subtract the header from the line count), while Jenny needs 45.
Combining All Pairs Testing with Input Domain Coverage
As can be seen in Domain Coverage, just satisfying input domain coverage does not necessarily mean that your set of test cases is optimally effective and efficient in finding defects. The only thing that input domain coverage achieves is that all equivalence classes of all input values are covered at least once. And we saw in the previous section that All Pairs Testing is useful but no guarantee for success.
Combining All Pairs testing with input domain coverage can significantly improve the defect finding effectiveness and efficiency. What it comes down to is that you create your test cases such that all valid equivalence classes of all input values are paired at least once in your set of test cases. Let's call this All Pairs Input Domain Coverage.
With two variables, All Pairs means covering all combinations of valid equivalence classes, so the 5 valid equivalence classes for speed and the 3 valid classes for traffic light state that we saw in the Domain Coverage guidance lead to 15 pairs that need to be covered. For the speed parameter, we also include boundary value testing, which we can easily do since every speed class occurs in 3 pairs and has at most 2 boundary values. We simply alternate the values (using the pipe symbol: |).
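The alternation trick can be shown in a few lines of Python: cycling through the boundary values of each class while pairing it with the three traffic light states guarantees that every boundary value appears at least once. The classes and boundary values below are the ones from the Domain Coverage guidance:

```python
from itertools import cycle

lights = ["Red", "Yellow", "Green"]
# Valid equivalence classes for speed with their boundary values
classes = {
    "Reverse-Speeding": [-100, -51],
    "Reverse-Not Speeding": [-50, -1],
    "Standstill": [0],
    "Forward-Not Speeding": [1, 50],
    "Forward-Speeding": [51, 250],
}

# Alternate the boundary values of each class over its three light pairs
tests = [(name, speed, light)
         for name, bounds in classes.items()
         for light, speed in zip(lights, cycle(bounds))]

print(len(tests))  # 15 = 5 speed classes x 3 traffic light states
```

With at most 2 boundary values per class and 3 pairs per class, the cycle guarantees that both boundaries of every class occur.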
Here is the model in PICT:
Speed Class: Reverse-Speeding, Reverse-Not Speeding, Standstill, Forward-Not Speeding, Forward-Speeding
Speed: -100 | -51, -50 | -1, 0, 1 | 50, 51 | 250
Traffic Light: Red, Yellow, Green
IF [Speed Class]="Reverse-Speeding" THEN [Speed] in { -100, -51 };
IF [Speed Class]="Reverse-Not Speeding" THEN [Speed] in { -50, -1};
IF [Speed Class]="Standstill" THEN [Speed] = 0;
IF [Speed Class]="Forward-Not Speeding" THEN [Speed] in {1, 50};
IF [Speed Class]="Forward-Speeding" THEN [Speed] in {51, 250};
Table 2 shows the result; the Expected Result column was added manually. Of course, the All Pairs algorithm only provides the input, and it is the tester's job to provide the right expected results.
Table 2 All Pairs Input Domain Coverage with Boundary Values
Speed Class | Speed | Traffic Light | Expected result |
---|---|---|---|
Forward-Speeding | 51 | Yellow | Violation |
Reverse-Not Speeding | -50 | Red | No violation (not passing light) |
Standstill | 0 | Yellow | No violation |
Reverse-Not Speeding | -1 | Yellow | No violation |
Forward-Speeding | 250 | Green | Violation |
Reverse-Speeding | -100 | Red | Violation |
Forward-Speeding | 51 | Red | Violation |
Forward-Not Speeding | 1 | Red | Violation |
Standstill | 0 | Red | No violation |
Standstill | 0 | Green | No violation |
Forward-Not Speeding | 50 | Yellow | No violation |
Reverse-Not Speeding | -50 | Green | No violation |
Reverse-Speeding | -51 | Yellow | Violation |
Reverse-Speeding | -100 | Green | Violation |
Forward-Not Speeding | 1 | Green | No violation |
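As before, the coverage claim can be checked mechanically. This short Python sketch (rows transcribed from Table 2) verifies that all 15 class/light pairs occur and that both boundary values of each speed class are used:

```python
# (speed class, speed, traffic light) triples transcribed from Table 2
rows = [
    ("Forward-Speeding", 51, "Yellow"),
    ("Reverse-Not Speeding", -50, "Red"),
    ("Standstill", 0, "Yellow"),
    ("Reverse-Not Speeding", -1, "Yellow"),
    ("Forward-Speeding", 250, "Green"),
    ("Reverse-Speeding", -100, "Red"),
    ("Forward-Speeding", 51, "Red"),
    ("Forward-Not Speeding", 1, "Red"),
    ("Standstill", 0, "Red"),
    ("Standstill", 0, "Green"),
    ("Forward-Not Speeding", 50, "Yellow"),
    ("Reverse-Not Speeding", -50, "Green"),
    ("Reverse-Speeding", -51, "Yellow"),
    ("Reverse-Speeding", -100, "Green"),
    ("Forward-Not Speeding", 1, "Green"),
]

# All (speed class, traffic light) pairs that appear in the table
covered = {(cls, light) for cls, _, light in rows}
print(len(covered))  # 15: every class/light pair is present

# Which speed values each class uses; 2-value classes must show both
boundaries = {}
for cls, speed, _ in rows:
    boundaries.setdefault(cls, set()).add(speed)
print(boundaries["Reverse-Speeding"] == {-100, -51})  # True
```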
Add to this set the negative cases from Table 6 in Domain Coverage (speed -101 km/h and 251 km/h) and traffic light state Blue, and we are all set.
While this already shows benefits in terms of the potential for defect identification with a limited set of test cases, the benefit becomes very clear when you have more than two input variables, as applying All Pairs then results in substantially fewer required test cases compared to full coverage.