Hi, I'm Islam Salah, Bioinformatician, Software Engineer, PMP

Bioinformatician experienced in biological data analysis and software development.

Experience in Python and R/Bioconductor (certified in functional genomics data analysis using Bioconductor).

Proficiency in Linux, Bash, and workflows (WDL, Nextflow).

Practical experience in RNA-Seq and DNA methylation analysis using NGS, ChIP-Seq, and microarrays.

Experience creating and using Docker images.

Experience with GCP and managing its services.

Experience using Git and GitHub.

Excellent management and communication skills, backed by PMP certification.

I hold a MicroMasters credential in Managing Technology & Innovation: How to Deal with Disruptive Change (RWTH Aachen University).

https://credentials.edx.org/credentials/ed1e50b72699417b9a5641988c1c2a52/


  • Structure, annotate, normalize, and interpret genome-scale assays.

  • Analyze and interpret genomic high-throughput technology data with R and Bioconductor.

  • Investigate data analysis approaches for experimental protocols in genomics.

  • Analyze multi-omic experiments, particularly in cancer (TCGA, ENCODE)

  • ATAC-seq and RNA-seq with CRISPR interference.



  • Apply quantitative methods to biological problems.

  • Write Python, MATLAB, and R code to analyze biological data.

  • Examine any protein structure in PyMOL.

  • Design and carry out genetic experiments through simulation tools

  • Biochemistry: Equilibrium and Kinetics and PyMOL

  • Visual Neuroscience using Machine Learning

  • Molecular Modeling with PyMOL

  • Using MATLAB to Analyze Neural Response Properties

  • Genomics Data Analysis and Sequence Analysis Using Python, Gene Expression Application Using Python

  • Statistical Data Analysis Using R

  • Single-Cell Gene Expression Using R


  • Understand a series of concepts, thought patterns, analysis paradigms, and computational and statistical tools that together support data science and reproducible research.

  • Fundamentals of reproducible science using case studies that illustrate various practices

  • Key elements for ensuring data provenance and reproducible experimental design

  • Statistical methods for reproducible data analysis

  • Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder), reproducible data (Data repositories/Dataverse) and reproducible dynamic report generation (Rmarkdown/R Notebook/Jupyter/Pandoc), and workflows.

  • How to develop new methods and tools for reproducible research and reporting

  • How to write your own reproducible paper.


  • RNA-seq data analysis.

  • FASTQ files and their quality control.

  • Aligning RNA-seq reads; visualizing alignments and analyzing RNA-seq at the gene level: counting reads in genes; exploratory data analysis and variance stabilization for counts; count-based differential expression; normalization and batch effects.

  • RNA-seq at the transcript level: inferring expression of transcripts (i.e., alternative isoforms); differential exon usage.

  • Analyzing DNA methylation data, including reading the raw data, normalization, and finding regions of differential methylation across multiple samples.

  • The basic steps for analyzing ChIP-seq datasets, from read alignment, to peak calling, and assessing differential binding patterns across multiple samples.


  • Methods that identified DNA as the genetic material

  • Structure of DNA and methods for packaging DNA into the cell

  • Impacts of packaging on DNA expression in higher organisms and passage of information with no change in DNA (epigenetics)

  • Location-specific DNA expression in the cell

  • Machinery for replicating DNA with an extremely low error rate

  • Place of origin and timing for DNA replication

  • Mechanisms for “preserving” the ends of linear DNA

  • Types of damage that affect DNA structure and how DNA moves around

  • Procedures to amplify DNA sequences and to determine base sequence

  • Enzymes to fragment DNA into specific segments that can be separated

  • Methods to recombine DNA segments from different sources

  • Ways to introduce recombined DNA into cells, including human cells



  • Identify the main characteristics of Linux and its use in biology

  • Describe the structure of a Linux file system

  • Use Linux commands to navigate the file system

  • Perform Linux commands to manipulate and interrogate biological data files

  • Prepare biological data files under Linux for exporting into other environments such as R

  • Write and execute simple shell scripts in order to automate the processing of data

  • Practice on biological data using different case scenarios


  • Assess DNA representations and protein sequences

  • Perform searches in primary databases (repositories) and retrieve gene/protein data

  • Interpret different repository submission formats

  • Investigate biological databases for research

  • Identify the putative function of proteins based on their conserved domains

  • Inferring function from sequence



  • Collect, access, and download whole bacterial genomes from public repositories

  • Investigate and navigate bacterial genomes and their annotation using Artemis

  • Identify genomic regions with low/high GC (guanine-cytosine) content, often associated with virulence

  • Perform simple comparative analyses between bacterial genomes

  • Multi-FASTA files; reference and draft bacterial genomes

  • Genome annotation

  • Genomic regions defined by GC (guanine-cytosine) content

  • Accessing and downloading whole-genome sequences

  • Pathogenicity islands
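As a small illustration of the GC-content analysis listed above, here is a minimal Python sketch that scans a sequence in sliding windows and reports the GC fraction of each one. The function names, window size, and step size are illustrative assumptions and are not taken from the course material or from Artemis.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 1000, step: int = 500):
    """Yield (start, gc_fraction) for each window along seq.

    A sequence shorter than the window yields a single, partial window.
    Window and step defaults are arbitrary illustrative values.
    """
    last_start = max(len(seq) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk)


if __name__ == "__main__":
    # Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
    toy = "GGGGCCCC" + "AAAATTTT"
    for start, gc in windowed_gc(toy, window=8, step=8):
        print(start, gc)
```

Windows whose GC fraction departs strongly from the genome-wide average are candidates for closer inspection, for example as possible pathogenicity islands.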




  • Explain the advantages of comparative genomics

  • Develop a hypothesis based on observed results

  • Introduction to comparative genomics

  • Identify pseudogenes in Mycobacterium leprae using ACT

  • Project: Comparative genomics on two clinically relevant plasmids from Shigella



  • Introduction to the relevant biology, explaining what we measure with high-throughput technologies and why.

  • Introduction to high-throughput technologies: next-generation sequencing and microarrays.

  • Preprocessing and Normalization

  • R programming language and the Bioconductor GenomicRanges utilities

  • Genomic Annotation.




  • The application of Python to Data Science.

  • How to define variables in Python.

  • Sets and conditional statements in Python.

  • The purpose of having functions in Python.

  • How to operate on files to read and write data in Python.

  • How to use pandas, a must-have package for anyone attempting data analysis in Python.





  • This course introduces you to the basic biology of modern genomics and the experimental tools that we use to measure it. We'll introduce the Central Dogma of Molecular Biology and cover how next-generation sequencing can be used to measure DNA, RNA, and epigenetic patterns. You'll also get an introduction to the key concepts in computing and data science that you'll need to understand how data from next-generation sequencing experiments are generated and analyzed.





  • Describe the R programming language and its programming environment

  • Explain the fundamental concepts associated with programming in R including functions, variables, data types, pipes, and vectors

  • Describe the options for generating visualizations in R

  • Demonstrate an understanding of basic formatting in R Markdown to create structure and emphasize content

  • Data Analysis, Data Visualization (DataViz), R Markdown, RStudio