iDFlakies: Flaky Test Dataset

iDFlakies: A Framework for Detecting

and Partially Classifying Flaky Tests

The findings from this work are now integrated into the Illinois Dataset of Flaky Tests (IDoFT). Please use the information there for our most up-to-date findings.

Scripts and tools for setting up and running the iDFlakies framework: https://github.com/iDFlakies/iDFlakies

Paper: http://mir.cs.illinois.edu/winglam/publications/2019/LamETAL19iDFlakies.pdf

If you use any of this work, please cite our corresponding paper:

@inproceedings{LamETAL19iDFlakies,
author = "Wing Lam and Reed Oei and August Shi and Darko Marinov and Tao Xie",
title = "{iDF}lakies: {A} framework for detecting and partially classifying flaky tests",
booktitle = "ICST 2019: 12th IEEE International Conference on Software Testing, Verification and Validation",
month = "April",
year = "2019",
address = "Xi'an, China",
pages = "312--322"
}

Details about all of the subjects that we investigated for order-dependent and non-order-dependent tests:

https://drive.google.com/open?id=1B3AwdVDygGDJIPWi-mqmvGib4RXddVOL (ZIP file; 45 KB)

Details of the database generated from the framework:

https://github.com/idflakies/iDFlakies/tree/master/scripts

Docker images of projects with flaky tests:

https://drive.google.com/open?id=1NPFEEOwV6FmVSjyn-HiFedKWEduoUeGW

CSV input files for the framework:

https://drive.google.com/file/d/1OE25_epQ4xK6yl_27pIkkq3MoaE_Zahw (ZIP file; 311 KB)

Logs and dataset of all runs by the framework:

https://drive.google.com/open?id=18AibFq643eSx3UTlJ_g5kVGjkCE4v6-k (ZIP file; 6.4 GB compressed; 91 GB uncompressed)

JSON files summarizing the order-dependent tests found for each module and the orders in which they passed/failed:

https://drive.google.com/open?id=1FEpdS5W6wCj9q3W9m7M4JTMyR70EjPBx (ZIP file; 101 MB)

Google sheet link for the full table: https://docs.google.com/spreadsheets/d/1aeQr9Rlb1ohGTABZsd5sZLdjh0hedxAWFJxHpsCnJ1w/edit?usp=sharing

List of flaky tests

We thank Angello Astorga, Liia Butler, and Owolabi Legunsen for their discussions about flaky tests. This work was partially supported by National Science Foundation grants CCF1421503, CNS-1513939, CNS-1564274, CNS-1646305, CNS1740916, CCF-1763788, CCF-1816615, and OAC-1839010. We acknowledge support for research on flaky tests and test quality from Facebook, Google, Huawei, and Microsoft.

Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).