A combination of genetic algorithms and fuzzy rules for gene expression description
Arianna Consiglio, Istituto di Tecnologie Biomediche CNR - Bari
The advent of Next-Generation Sequencing technologies has revolutionized the study of DNA and RNA in biomedicine. The opportunity to investigate a whole genome (billions of nucleotides) or transcriptome (thousands of genes) with a single biotechnological assay has opened new experimental horizons and raised many computational issues. The study of the gene expression profiles (number of RNA copies active in cells) of biological samples has made it possible to understand many molecular and cellular functions in physiological and pathological conditions.
The analysis of gene expression data is a complex task, and many tools and pipelines are available to handle large sequencing datasets from case-control (two-class) studies. In some cases, such as pilot or exploratory studies, the researcher needs to compare more than two groups of samples, each consisting of only a few replicates. Both standard statistical bioinformatic pipelines and innovative deep learning models are poorly suited to extracting interpretable patterns and information from such datasets.
In this presentation, a combination of fuzzy rule systems and genetic algorithms is proposed for the analysis and description of gene expression variations in multiclass experiments. Genetic algorithms are used for the selection of genes and fuzzy rule-based systems for the classification task.
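The division of labour described above (a genetic algorithm searching over gene subsets, with a fuzzy rule-based classifier scoring each subset) can be illustrated with a minimal sketch. It is not the implementation used in [1]: the Gaussian membership functions, the single rule per class built from class means, the min t-norm, the leave-one-out fitness, the truncation selection, and all function names and parameter values are assumptions chosen to keep the example short, and the data are synthetic.

```python
import math
import random


def gaussian_membership(x, center, width):
    """Degree (0..1) to which x belongs to a fuzzy set centred on `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def build_rules(samples, labels, genes):
    """One rule per class (an assumption): IF every selected gene is near the
    class mean THEN the sample belongs to that class."""
    rules = {}
    for cls in sorted(set(labels)):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        rules[cls] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    return rules


def classify(sample, rules, genes, width=1.0):
    """Fire each rule with the min t-norm and return the strongest class."""
    def strength(centers):
        return min(gaussian_membership(sample[g], c, width)
                   for g, c in zip(genes, centers))
    return max(rules, key=lambda cls: strength(rules[cls]))


def fitness(genes, samples, labels):
    """Leave-one-out accuracy of the rule base restricted to `genes`."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        rules = build_rules(train_x, train_y, genes)
        hits += classify(samples[i], rules, genes) == labels[i]
    return hits / len(samples)


def select_genes(samples, labels, n_genes, subset_size=3,
                 pop_size=20, generations=30, seed=0):
    """Genetic algorithm over fixed-size gene subsets (hypothetical settings)."""
    rng = random.Random(seed)

    def random_subset():
        return tuple(sorted(rng.sample(range(n_genes), subset_size)))

    def crossover(a, b):
        # The child draws its genes from the union of both parents.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, subset_size)))

    def mutate(ind):
        # With probability 0.3, swap one gene for a gene not in the subset.
        genes = list(ind)
        if rng.random() < 0.3:
            candidates = [g for g in range(n_genes) if g not in genes]
            genes[rng.randrange(subset_size)] = rng.choice(candidates)
        return tuple(sorted(genes))

    def score(ind):
        return fitness(ind, samples, labels)

    population = [random_subset() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        elite = ranked[: pop_size // 2]  # truncation selection with elitism
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=score)


def make_toy_data(seed=0, n_classes=3, replicates=4, n_genes=8):
    """Synthetic expression matrix: only genes 0 and 1 depend on the class."""
    rng = random.Random(seed)
    samples, labels = [], []
    for cls in range(n_classes):
        for _ in range(replicates):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] += 6.0 * cls
            row[1] -= 6.0 * cls
            samples.append(row)
            labels.append(cls)
    return samples, labels


if __name__ == "__main__":
    X, y = make_toy_data()
    best = select_genes(X, y, n_genes=8)
    print("selected genes:", best, "LOO accuracy:", fitness(best, X, y))
```

Because each rule is a readable statement about a handful of genes, the selected subset and the class centres can be inspected directly, which is the interpretability argument made above. A real analysis would need normalized expression values and a fitness function that also accounts for model size.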
The method has been tested on an ovarian cancer dataset that contains the transcriptomes of 21 human ovarian tissue samples (12 cancer and 9 non-cancer), grouped into 6 diagnostic classes [1]. Because of the large number of classes and the small number of replicates in each class, this dataset is difficult to analyze with standard bioinformatic tools. After testing several parameter settings, the final model consists of 10 genes involved in the molecular pathways of cancer and 10 rules that correctly classify all samples.
[1] Consiglio, A., Casalino, G., Castellano, G., Grillo, G., Perlino, E., Vessio, G., Licciulli, F. (2021). Explaining Ovarian Cancer Gene Expression Profiles with Fuzzy Rules and Genetic Algorithms. Electronics, 10(4), 375.