Resources

Datasets

Hierarchical Classification

    • My extended version of Nick Holden's hierarchical protein function datasets, as used in my paper "Selecting different protein representations and classification algorithms in hierarchical protein function prediction", Intelligent Data Analysis Journal. Vol. 15, No. 6, pp. 979-999, 2011, is available here. [Download]

    • Nick Holden's hierarchical protein function datasets as used in my paper "A Global-Model Naive Bayes Approach to the Hierarchical Prediction of Protein Functions", Proc. of the IEEE International Conference on Data Mining (ICDM). Miami, FL, USA, pp. 992-997, 2009, are available here. [Download]

Note that the hierarchical classification datasets available on this page use the .harff (Hierarchical ARFF) format. These datasets cannot be used directly within Weka. At present, the only difference between a .harff file and a regular .arff file is that the class attribute contains the hierarchical structure. A class label is always R (denoting the root node) followed by a dot-separated sequence of classes, one per level of the hierarchy. For example, the class label "R.1.2" means that the instance has class 1 at the first level of the hierarchy and class 2 at the second level.
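
As an illustration of the label format described above, the following is a minimal Python sketch of how a label such as "R.1.2" could be split into its per-level classes. The function names (parse_hierarchical_label, label_path) are hypothetical and chosen only for this example; they are not part of the datasets, of Weka, or of any tool referenced on this page.

```python
from typing import List


def parse_hierarchical_label(label: str) -> List[str]:
    """Return the class at each level of the hierarchy, root excluded.

    Hypothetical helper: assumes the label is 'R' followed by
    dot-separated class names, as described for the .harff format.
    """
    parts = label.strip().split(".")
    if parts[0] != "R":
        raise ValueError(f"label must start at the root node 'R': {label!r}")
    return parts[1:]


def label_path(label: str) -> List[str]:
    """Return every prefix of the label, from the root down to the label itself."""
    parts = label.strip().split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    levels = parse_hierarchical_label("R.1.2")
    for depth, cls in enumerate(levels, start=1):
        print(f"level {depth}: class {cls}")
    print(label_path("R.1.2"))
```

For the label "R.1.2", the first function yields the classes "1" and "2" for levels one and two, and the second yields the chain of nodes from the root to the label, which can be useful when a prediction at a deeper level should also count for its ancestors.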

Data Stream Mining

    • Our data stream mining version of the Kyoto NIDS Dataset as used in our paper "An Investigation of the Hoeffding Adaptive Tree for the Problem of Network Intrusion Detection", Proc. of the International Joint Conference on Neural Networks (IJCNN). Anchorage, AK, USA, 2017, is available here. [Download]

Music Genre Classification

Software

jOthelloT

jOthelloT (Java Othello Tournament) is a Java-based open-source software package for undergraduate Artificial Intelligence classes. More information about jOthelloT can be found here.

YASD

YASD (Yet Another Sentence Detector) is an open-source software package for tagging sentence boundaries developed in Java. It currently supports documents in English and Portuguese. [Download]

Other Resources

Si, Mi, La

Si, Mi, La is an online database for the digital preservation of (mainly) sheet music related to Brazilian music.