Sunday, November 24, 2013
Scripting, checking, testing - enemies or friends?
"If you don't script test cases, your testing is not structured" or "Testing is checking if the software works as designed" are statements made by people who like to call themselves "Structured testers".
I don't completely disagree with statements like those, but I would also call myself a structured tester, even though I suppose I see things in a slightly wider context...
Scripting, checking and testing can live seamlessly together, just as we can live seamlessly together in multicultural societies.
It is not because we don't do it very often that we effectively can't. A good tester knows when to check, test and/or script.
What do I check?
Checking is what I do when I have an exact understanding of what outcome a function or process should have.
This can be at a very low level, for example:
- a field should contain no more than 15 characters
- an email address should contain an @ sign and a "dot", in that order, with some characters in between
It can be the outcomes of a calculation engine that contains the rules for advising whether or not to grant an insurance policy, based on a defined set of conditions.
It can be a set of acceptance criteria or business rules.
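Checks at that low level lend themselves well to automation. Here is a minimal sketch in Python, assuming the 15-character limit and the basic "@ then dot" rule from the examples above; the function names and the regular expression are my own illustrative choices, not a formal specification:

```python
import re

def check_field_length(value: str, max_length: int = 15) -> bool:
    # Low-level check: the field must not exceed the agreed maximum length.
    return len(value) <= max_length

def check_email_shape(value: str) -> bool:
    # Low-level check: an @ sign followed later by a dot, with characters
    # in between and around them. Deliberately simplistic, not RFC-complete.
    return re.fullmatch(r"[^@]+@[^@.]+\.[^@]+", value) is not None

assert check_field_length("short value")
assert not check_field_length("value that is far too long for the field")
assert check_email_shape("tester@example.com")
assert not check_email_shape("tester@examplecom")   # no dot after the @
assert not check_email_shape("tester@.com")         # nothing between @ and dot
```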
Checking is verification and is a part of software testing. Whether or not these checks all have to be executed by a professional software tester is another question.
When there is a lot to check, the pairwise method can help, putting different parameters together to make checking more efficient.
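As a rough illustration of the idea, here is a minimal sketch in Python of a greedy pairwise suite builder; the insurance-style parameters and the greedy construction are my own illustrative assumptions, not a particular tool's algorithm:

```python
from itertools import combinations, product

def pairwise_suite(parameters):
    # Greedy construction of a suite that covers every pair of parameter values
    # at least once (good enough for small models, not guaranteed minimal).
    names = list(parameters)
    uncovered = set()
    for a, b in combinations(names, 2):
        for va, vb in product(parameters[a], parameters[b]):
            uncovered.add(((a, va), (b, vb)))

    suite = []
    while uncovered:
        best, best_gain = None, -1
        # Pick the full combination that covers the most still-uncovered pairs.
        for values in product(*(parameters[n] for n in names)):
            candidate = dict(zip(names, values))
            gain = sum(
                1
                for (a, va), (b, vb) in uncovered
                if candidate[a] == va and candidate[b] == vb
            )
            if gain > best_gain:
                best, best_gain = candidate, gain
        suite.append(best)
        uncovered = {
            ((a, va), (b, vb))
            for (a, va), (b, vb) in uncovered
            if not (best[a] == va and best[b] == vb)
        }
    return suite

# Purely illustrative parameters for an insurance-advice check.
params = {
    "age_band": ["18-25", "26-60", "60+"],
    "claim_history": ["none", "one", "multiple"],
    "region": ["urban", "rural"],
}
suite = pairwise_suite(params)
print(len(suite), "combinations instead of", 3 * 3 * 2)
for case in suite:
    print(case)
```

For three parameters with 3 x 3 x 2 values, this typically brings the full 18 combinations down to around 9 or 10 while still exercising every pair of values at least once.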
What do I test?
Testing is what I do when I don't have an exact understanding of how functions and processes should co-exist or work together.
The software parts I tend to test are the complex combinations of possibilities an application offers.
In this case, I don't look only at a field, a screen or a function (though sometimes I do), but at functions or flows that contain complex steps and possibilities. I try to emulate an end user, taking into account the business processes, the company history, the development method and the team's knowledge and skills... combined with my own knowledge and tools... and go searching for the steepest and widest pitfalls that developers, analysts or business may have overlooked.
Knowing what a function is supposed to do under certain circumstances, I try to find out whether it is consistent with itself under relevant conditions.
Questions I may ask myself:
- How will the system behave if the interface is not available? This could have a big impact on the business and is a case that happened a lot with the previous software package... and one of the reasons a new package was developed that was supposed to handle these issues.
- What will happen if I push the back button instead of going back using the described functionality? The previous application in use made use of the standard browser buttons. Many users are used to those, so I will check them all the time.
What do I script?
Scripting is what I do to make sure I remember what to check before I start checking, or to remember what I have tested. Scripts can also serve as evidence for bugs found during testing or as a working tool to follow up on everything that needs verification.
I script what I should not forget.
I could script all those things that need checking - but most of the time that already has been done.
I could script the test cases that resulted in issues or bugs.
I could script the test cases that will help me find regression issues later on. Those I would prefer to automate.
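For those regression cases I would prefer to automate, a minimal sketch of what such a scripted check could look like, assuming pytest and the hypothetical check_email_shape helper from the earlier checking example:

```python
import pytest
from checks import check_email_shape  # hypothetical module holding the low-level checks

@pytest.mark.parametrize("address, expected", [
    ("tester@example.com", True),    # happy path we never want to regress
    ("tester@examplecom", False),    # missing dot: a bug found during testing
    ("@example.com", False),         # nothing before the @ sign
])
def test_email_shape_regression(address, expected):
    # Re-runs the check on every build so a regression shows up immediately.
    assert check_email_shape(address) is expected
```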
Saturday, November 16, 2013
Agile reporting
How do we report and what do we report on?
We often try to fall back on "facts" in order to provide test reports.
Those facts are mostly represented in bars or curves, giving a view on the "reality" of the product quality and test progress.
I put reality in quotes because reality is different for every viewer.
I also put facts in quotes because facts show just a single dimension of a single aspect of a reality, as seen by one person.
It becomes clear that "facts" are less solid than they may seem when pronounced by experienced managers.
Traffic in bars
Imagine you need traffic information in order to find out whether it is worth setting off for your destination. You go to your favourite traffic website and suddenly you see something like this:
Would you be able to find out, based on this information, where the traffic jams are and whether you will be able to get home in time?
I would not.
The same goes for a test report.
A test report can give you a lot of valid information. But sometimes it lacks what is really important to understand, for example:
- the context and underlying risks of the bugs that are still open
- what has been tested and what has not, and what are the underlying risks?
- how testing was conducted and which issues were found during testing that prevent us from testing efficiently or from testing at all?
- what can be improved in the testing activities?
I got inspired by James Bach's website (https://www.developsense.org) and by the team I'm working with to find an alternative to the current reporting in bars... and when I started looking at the traffic information, I realized how silly we actually are, only making use of graphs and bars as "facts" to present "reality".
Let's have a look at an example from software development.
Building a Web Shop
This conceptual web shop has a product search, a shopping basket, a secure login and some cross- and up-selling features, plus a back end with an admin tool and a customer support tool, and it interfaces with a logistics application, a CRM application, a BI application and a payments application.
The web shop is built in bi-weekly iterations of development, and test results need to be reported.
The standard test report looks like this at the moment the go/no-go decision has to be made.
A standard report
Test cases were planned and prepared up front (460 of them).
- In Phase 1 of the project, the testers were motivated to find relevant bugs in the system and realized that they would not find them by only following the prepared test cases. They were not allowed to create extra test cases, so a lot of bugs were registered without a trace to a test.
- In Phase 2 of the project, the project manager extrapolated the number of test cases executed into the future and found that testing would delay his project if it continued like this. The test manager was instructed to speed up testing. A lot of unstructured testing had been going on and this was not budgeted for. The testers adapted and started failing every test case they could, to get back on track with test execution. A lot fewer defects were registered, as the testers were not really testing anymore but rather executing test cases. Project management was happy. Fewer defects got raised, so development was able to get back on track and testing was on track - the delivery date was safe again.
- In Phase 3, end-to-end tests started and business got involved. In doing so, they also started looking around in the application and registered a lot of extra bugs again. Signing off those E2E test cases went very slowly, as business did not want to sign off on a test case if they still saw other things not working well. It took project management up to three weeks to convince them to finalize the tests. In the meantime the testers were instructed to pass as many test cases as they could - because the release had to go through and now everything depended on testing.
In the end, 3 blocking bugs were not resolved. However, the overview of test progress and defect progress looked pretty good. We can also see that development always reacts fast to blocking and critical issues, so if we were to find a blocking issue, we could be sure it would be fixed within a day. However, that turnaround time was measured from when a developer stated the defect was resolved, not when the fix was confirmed. A lot of critical and blocking defects went through the mill for several iterations before they finally got resolved.
Clearly something went really wrong here. This report is unclear and seems fabricated. But it is only when we look at it with an eagle's eye that we're able to notice these oddities.
Based on these results it is impossible to make a well-founded decision on whether or not to release this software.
We stumbled into the pitfall known as Goodhart's law: "When a measure becomes a target, it ceases to be a good measure."
So what can be done to avoid these kinds of "facts" being presented as "reality"?
The answer to this is:
A visual, agile report.
So what would this report look like? Well, it would look very much like a traffic map.
We could, for example, use functional blocks in an architectural model.
Coming back to the web shop, we see the different functions with an indication of whether they were tested or not, whether bugs were found and, if so, how severe they are and how many blocking and critical issues we found in each module (visualized in the testing progress circles, showing first the number of blocking issues and then the number of criticals).
We also clearly indicate which of the existing functions are taken out of scope or come into scope of the test project, so that everyone has the same understanding of the scope we are really talking about.
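To make this concrete, here is a minimal sketch in Python of the data that could sit behind such a map, with a plain-text rendering of it; the module names, statuses and counts are purely illustrative assumptions, not the actual project figures:

```python
# Illustrative data behind the module map: per functional block, a test status
# plus the number of blocking and critical issues that are still open.
modules = {
    "Product search":      {"status": "tested",       "blocking": 0, "critical": 1},
    "Shopping basket":     {"status": "tested",       "blocking": 1, "critical": 2},
    "Transactions":        {"status": "tested",       "blocking": 1, "critical": 3},
    "Logistics interface": {"status": "not tested",   "blocking": 1, "critical": 0},
    "CRM interface":       {"status": "out of scope", "blocking": 0, "critical": 0},
}

def render_map(modules):
    # One line per functional block: status plus the blocking/critical counts
    # that the progress circles on the visual map would show.
    for name, info in modules.items():
        print(f"{name:<20} {info['status']:<13} "
              f"blocking={info['blocking']} critical={info['critical']}")

render_map(modules)
```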
We then take a look one level deeper, into the functions present in each functional block with blocking issues.
Take the shopping basket, for example. There we see that deleting items from the basket is not possible. This is an annoying one to go live with, but if it is solved one day after go-live, we can still agree to go live. Let's have a look at the transactions.
In the transactions section, we see that we cannot really pay with the money transfer option and there are still some relevant issues in the Visa payment area. We can go live with PayPal, have Visa and money transfer working within a day, and have the remaining Visa and money transfer issues fixed within a week. We're still going live...
In the interface with logistics, we see something more worrying. The stock order isn't working and in fact none of the functions behind it have ever been tested. In the standard report those tests were noted as failed. Now we see that we won't be able to go live with this software yet. This defect needs to be resolved and further testing is required.
In general we can conclude that this kind of reporting can maybe not completely replace classical reporting, but it can be a good addition. Reporting on test progress based on prepared tests, however, seems tempting as it may give a fake feeling of control. That control is not real, so relying on it is always a very bad idea.
The report is made to reflect and cope with agile development, but it can also be used for a classical phased delivery approach. Different colour indications can also be used, or different "maps" (e.g. business processes or functional flows might also do the job).
This dashboard view still has to be accompanied by the test approach and the efficiency of the testing that was conducted. That's gonna be for another post.
This post has been created with the help of some of my RealDolmen colleagues (Bert Jagers, Sarah Teugels, Stijn Van Cutsem, Danny Bollaert) and the inspiration of my wife. Thanks!
Labels: Agile, dashboard, report, statistics, test report