Conference Proceedings
Proceedings of The First IEEE International Conference On Artificial Intelligence Testing (AITest 2019)
Journal Articles
Marina Sokolova and Guy Lapalme, "A systematic analysis of performance measures for classification tasks", Information Processing and Management, 45 (2009), Elsevier, pp. 427–437.
Tom Fawcett, An introduction to ROC analysis, Pattern Recognition Letters 27 (2006), pp. 861–874.
José Daniel Pascual-Triana, David Charte, Marta Andrés Arroyo, Alberto Fernández and Francisco Herrera, Revisiting data complexity metrics based on morphology for overlap and imbalance: snapshot, new overlap number of balls metrics and singular problems prospect, Knowledge and Information Systems (2021) 63:1961–1989.
Tin Kam Ho and Mitra Basu, Complexity Measures of Supervised Classification Problems, IEEE Transactions On Pattern Analysis And Machine Intelligence, Vol. 24, No. 3, March 2002, pp. 289–300.
Xiaoyuan Xie, Joshua W.K. Ho, Christian Murphy, Gail Kaiser, Baowen Xu, Tsong Yueh Chen, Testing and validating machine learning classifiers by metamorphic testing, Journal of Systems and Software, Volume 84, Issue 4, 2011, Pages 544-558.
Conference Papers
Davide Dell’Anna, Fabiano Dalpiaz, Mehdi Dastani, "Validating Goal Models via Bayesian Networks", in Proc. of AIRE 2018.
Julián Iranzo P. and Rubio Manzano C. (2010). BOUSI~PROLOG - A Fuzzy Logic Programming Language for Modeling Vague Knowledge and Approximate Reasoning. In Proceedings of the International Conference on Fuzzy Computation and 2nd International Conference on Neural Computation - Volume 1: ICFC (IJCCI 2010), ISBN 978-989-8425-32-4, pages 93-98. DOI: 10.5220/0003079200930098
Shenao Yan, Guanhong Tao, Xuwei Liu, Juan Zhai, Shiqing Ma, Lei Xu, Xiangyu Zhang, Correlations between deep neural network model coverage criteria and model quality, in Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020), November 2020, Pages 775–787. https://doi.org/10.1145/3368089.3409671
Huang, J., & Ling, C. (2007). Constructing new and better evaluation measures for machine learning. In Proceedings of the 20th international joint conference on artificial intelligence (IJCAI’2007) (pp. 859–864).
A. Sharma and H. Wehrheim, "Testing Machine Learning Algorithms for Balanced Data Usage," in Proc. of 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), Xi'an, China, 2019, pp. 125–135.
Books
Japkowicz, N., & Shah, M. (2011). Evaluating Learning Algorithms: A Classification Perspective. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511921803
Online Resources
Liming Xu, Dave Towey, Andrew P. French, Steve Benford, Zhi Quan Zhou, Tsong Yueh Chen, Using Metamorphic Relations to Verify and Enhance Artcode Classification, arXiv:2108.02694v1, Aug. 2021.