My extended version of Nick Holden's hierarchical protein function datasets, as used in my paper "Selecting different protein representations and classification algorithms in hierarchical protein function prediction", Intelligent Data Analysis Journal. Vol. 15, No. 6, pp. 979-999, 2011, is available here. [Download]
Nick Holden's hierarchical protein function datasets as used in my paper "A Global-Model Naive Bayes Approach to the Hierarchical Prediction of Protein Functions", Proc. of the IEEE International Conference on Data Mining (ICDM). Miami, FL, USA, pp. 992-997, 2009, are available here. [Download]
Note that the hierarchical classification datasets available on this page use the .harff (Hierarchical Arff) format. These datasets cannot be used directly within Weka. At present, the only difference between them and a regular .arff file is that the class attribute contains the hierarchical structure. A class label is always R (denoting the root node) followed by a dot-separated sequence of classes, one per level of the hierarchy. For example, the class label "R.1.2" means that the instance has class 1 at the first level of the hierarchy and class 2 at the second level.
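As an illustration of the class label format described above, the following Python sketch splits a label such as "R.1.2" into its per-level classes. It only handles the label string itself and does not read .harff files; the function names parse_hierarchical_label and ancestors are hypothetical helpers written for this example and are not part of the datasets or of any tool listed on this page.

```python
def parse_hierarchical_label(label: str) -> list[str]:
    """Split a label such as "R.1.2" into its per-level classes, dropping the root."""
    parts = label.strip().split(".")
    # Every label is assumed to start at the root node "R".
    if parts[0] != "R":
        raise ValueError(f"label must start at the root 'R': {label!r}")
    return parts[1:]


def ancestors(label: str) -> list[str]:
    """Return each prefix of the label, from the first level down to the label itself."""
    levels = parse_hierarchical_label(label)
    return ["R." + ".".join(levels[: i + 1]) for i in range(len(levels))]


if __name__ == "__main__":
    print(parse_hierarchical_label("R.1.2"))  # ['1', '2']
    print(ancestors("R.1.2"))  # ['R.1', 'R.1.2']
```

The position of each element in the returned list corresponds to a level of the hierarchy, so the first element is the class at level one, the second the class at level two, and so on.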
Our data stream mining version of the Kyoto NIDS Dataset as used in our paper "An Investigation of the Hoeffding Adaptive Tree for the Problem of Network Intrusion Detection", Proc. of the International Joint Conference on Neural Networks (IJCNN). Anchorage, AK, USA, 2017, is available here. [Download]
YASD (Yet Another Sentence Detector) is an open-source software package, developed in Java, for tagging sentence boundaries. It currently supports documents in English and Portuguese. [Download]
A Novel STEAM-Based Approach to Teach Programming and Electronics Through the Construction of Low-Cost Digital Musical Instruments
The Kahoots used in the above-mentioned paper draft are available here.
Interdisciplinary PBL: An experience with Digital Games and Music Production students
Links to the publicly available games mentioned in the paper are available here.
From Code to Crop (FIE 2025 Paper)
Information about how to obtain the instructional materials is available here.
Si,Mi,La is an online database for the digital preservation of (mainly) sheet music related to Brazilian music.