My research focuses on Machine Learning. The main topics of interest are:
- Current research
- Semi-supervised classification
- Semi-supervised learning is a relatively new field in Machine Learning. In contrast with supervised learning, which uses only the labelled instances, semi-supervised learning learns from both the labelled and the unlabelled data available for a given problem. By using the unlabelled data, we aim to improve the supervised learning process. Within semi-supervised learning, we focus on the semi-supervised classification task, which consists of classifying unseen instances according to the knowledge acquired by a semi-supervised training algorithm. To address this task, we propose an ensemble-based technique in which each ensemble component increases its classification accuracy by using the knowledge retrieved from the other classifiers. With this approach, we intend to improve on the current results in the literature.
- Ensemble Learning
- Evolutionary Computation
- Meta-learning
- Clustering
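As a rough illustration of the ensemble-based idea described under Semi-supervised classification, the sketch below follows a tri-training-style scheme: each of three simple classifiers is retrained on its own labelled subset plus the unlabelled instances on which the other two classifiers agree. This is not the proposed technique itself, whose details are not given here. The nearest-centroid base classifier, the function names and the toy data are all assumptions made for the example.

```python
from collections import Counter


def fit_centroids(X, y):
    """Toy base classifier: store the mean feature vector of each class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)),
    )


def train_ensemble(X_lab, y_lab, X_unl, n_members=3, rounds=3):
    """Tri-training-style loop (an assumption, not the author's algorithm)."""
    labelled = list(zip(X_lab, y_lab))
    # Diversity: member i never sees the labelled instances with index % n == i.
    subsets = [
        [pair for k, pair in enumerate(labelled) if k % n_members != i]
        for i in range(n_members)
    ]
    models = [fit_centroids(*zip(*s)) for s in subsets]
    for _ in range(rounds):
        updated = []
        for i in range(n_members):
            others = [m for j, m in enumerate(models) if j != i]
            # Knowledge from the other members: unlabelled instances on
            # which they all agree become pseudo-labelled examples for i.
            pseudo = []
            for x in X_unl:
                votes = {predict(m, x) for m in others}
                if len(votes) == 1:
                    pseudo.append((x, votes.pop()))
            Xs, ys = zip(*(subsets[i] + pseudo))
            updated.append(fit_centroids(Xs, ys))
        models = updated
    return models


def ensemble_predict(models, x):
    """Classify x by majority vote over the ensemble members."""
    return Counter(predict(m, x) for m in models).most_common(1)[0][0]


# Invented toy data: two well-separated classes in two dimensions.
X_lab = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.8, 1.1), (0.9, 1.0)]
y_lab = [0, 0, 0, 1, 1, 1]
X_unl = [(0.1, 0.1), (0.05, 0.2), (0.95, 0.95), (1.1, 1.0), (0.3, 0.2), (0.7, 0.8)]

models = train_ensemble(X_lab, y_lab, X_unl)
print(ensemble_predict(models, (0.15, 0.1)), ensemble_predict(models, (0.9, 0.9)))
```

Requiring agreement among the other members before accepting a pseudo-label is one simple way to limit the noise that unlabelled data can introduce. Other confidence criteria would fit the same loop.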
- M.Phil
- Abstract of M.Phil Dissertation
- The amount of gene expression data has grown exponentially in recent years due to new Molecular Biology technologies that allow the expression of thousands of genes to be measured at once. The computational analysis of such data is of major importance in Biology and Medicine. It allows, for example, the discovery of new biologically and clinically significant cancer classes and the identification of new functions of genes. Unsupervised Machine Learning techniques are part of the experts' data analysis methodology. There is a variety of data clustering algorithms, each of which tends to cluster the data in a specific way. The choice of algorithm is fundamental to the clustering quality and, therefore, to the proper analysis of the results. We propose a meta-learning methodology for selecting clustering algorithms in the context of cancer cell gene expression. So far, meta-learning has been used only for supervised learning algorithms; we extend that concept to unsupervised learning. We used datasets from different cancer microarray experiments. We extracted relevant characteristics from each dataset and used them to train Neural Networks, k-Nearest Neighbors and Support Vector Machines as meta-learners. These methods were used as learning systems to predict the performance ranking of clustering algorithms, as well as to select the best algorithm, according to the dataset characteristics. We performed a set of experiments to validate the use of each meta-learner. In this context, we showed that, on average, the Support Vector Machines suggested rankings that are more correlated with the ideal ranking than the default ranking is. We thus propose a novel approach that can be extended to data from other contexts and can serve as a starting point for other work.
- Keywords: Meta-learning, Unsupervised Learning, Gene Expression.
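To make the ranking-prediction step of the abstract more concrete, here is a minimal sketch of a k-Nearest Neighbors meta-learner. It predicts a ranking of clustering algorithms for a new dataset from the rankings observed on the most similar known datasets, then compares that prediction and a default ranking against the ideal ranking using Spearman correlation. It is not the dissertation's implementation. The meta-feature values, the three unnamed algorithms, the rankings, and the definition of the default ranking as the mean ranking over all known datasets are assumptions invented for the example.

```python
def to_ranking(scores):
    """Turn per-algorithm scores into ranks 1..n, where the lowest score ranks 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranking = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranking[i] = position
    return ranking


def mean_ranking(rankings):
    """Rank the algorithms by their average rank over the given datasets."""
    n_alg = len(rankings[0])
    return to_ranking(
        [sum(r[j] for r in rankings) / len(rankings) for j in range(n_alg)]
    )


def knn_ranking(meta_features, rankings, query, k=2):
    """Average the rankings of the k datasets whose meta-features are closest."""
    by_distance = sorted(
        range(len(meta_features)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(meta_features[i], query)),
    )
    return mean_ranking([rankings[i] for i in by_distance[:k]])


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Invented toy meta-data: one row of meta-features per known dataset, and the
# observed ranking of three hypothetical clustering algorithms (1 = best).
meta_features = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
rankings = [[1, 2, 3], [1, 2, 3], [3, 2, 1], [3, 1, 2]]

# A new dataset with its (invented) meta-features and ideal ranking.
query, ideal = (0.15, 0.15), [1, 2, 3]

predicted = knn_ranking(meta_features, rankings, query, k=2)
default = mean_ranking(rankings)  # assumed definition of the default ranking
print(spearman(predicted, ideal), spearman(default, ideal))  # prints 1.0 0.5
```

In this toy setting the meta-learner benefits from the new dataset resembling two known ones. With real meta-features the gap over the default ranking would depend on how informative those features are.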