By Fabio Palomba

Test cases form the first line of defense against the introduction of faults. The entire team also relies on regression tests to merge pull requests, or even to deploy a version. But test code can have problems as well. Test smells to the rescue! For example, if a test exercises more than one method, comprehending the test becomes more difficult, and hence maintainability is at stake. Therefore, more automation of testing is needed.
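To make the "more than one method per test" example concrete, here is a minimal sketch in Python's `unittest`. The `Stack` class and all test names are hypothetical, invented for this illustration and not taken from the talk. The first test case shows the smell (often called an Eager Test in the test smell literature), the second shows one possible refactoring into focused tests.

```python
import unittest


class Stack:
    """Hypothetical class under test, invented for this illustration."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items


class SmellyStackTest(unittest.TestCase):
    def test_stack(self):
        # Smell: one test exercises is_empty, push and pop together.
        # When it fails, the test name does not say which behavior broke.
        stack = Stack()
        self.assertTrue(stack.is_empty())
        stack.push(1)
        self.assertFalse(stack.is_empty())
        self.assertEqual(stack.pop(), 1)
        self.assertTrue(stack.is_empty())


class FocusedStackTest(unittest.TestCase):
    """One possible refactoring: each test checks a single behavior."""

    def setUp(self):
        self.stack = Stack()

    def test_new_stack_is_empty(self):
        self.assertTrue(self.stack.is_empty())

    def test_push_makes_stack_non_empty(self):
        self.stack.push(1)
        self.assertFalse(self.stack.is_empty())

    def test_pop_returns_last_pushed_item(self):
        self.stack.push(1)
        self.stack.push(2)
        self.assertEqual(self.stack.pop(), 2)
```

Both versions pass on a correct `Stack`. The difference shows up when something breaks: the focused tests point at the failing behavior by name, while the smelly test only reports that "the stack" is broken.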

How are test smells perceived by developers? Through a survey, the researchers found that 18% perceived the presence of a test smell. At the same time, only 9% would have refactored the test.

When are test smells introduced? The researchers analyzed 152 projects with a test smell detector (88% precision, 100% recall), tracking when each test smell was first introduced and when it was removed. 97% of the smells are introduced at the creation of the test, and 80% of the smells are never removed.

But tests can have bugs as well. One type is flakiness: non-deterministic behavior, where a test can both fail and succeed with the same code. For example, if one tests multiple classes in the same test, race conditions can occur and sometimes make the test fail. So, does test smell refactoring remove flakiness? And what are the causes of test flakiness? Can we find them by studying test smells? The previous example is the cause of flakiness in 27% of the studied projects; other causes include test ordering and the network. 61% of the flaky tests were smelly as well. To test causality, the researchers further analyzed the co-occurrences. Furthermore, by manually applying the refactoring strategies proposed by Van Deursen et al., all of the flaky tests could be resolved.

To conclude: test smells also help in diagnosing the effectiveness of tests.

IPA Fall days 2017 – Not only maintainability: revisiting test smells as a measure of test code effectiveness