Hi, I'm Islam Salah, Bioinformatician, Software Engineer, PMP

Bioinformatician experienced in biological data analysis and software development.

Experience in Python and R/Bioconductor (certified in data analysis for functional genomics using Bioconductor).

Proficiency in Linux, Bash, and workflow languages (WDL, Nextflow).

Practical experience in RNA-Seq and DNA methylation measurement analysis using NGS, ChIP-Seq, and microarrays.

Experience creating and using Docker images.

Experience with GCP and service handling.

Experience using Git and GitHub.

Excellent management and communication skills; PMP certified.

I hold a MicroMasters credential in Managing Technology & Innovation: How to Deal with Disruptive Change (RWTH Aachen University).

Structure, annotate, normalize, and interpret genome-scale assays.
Analyze and interpret genomic high-throughput technology data with R and Bioconductor.
Investigate data analysis approaches for experimental protocols in genomics.
Analyze multi-omic experiments, particularly in cancer (TCGA, ENCODE).
Analyze ATAC-seq and RNA-seq with CRISPR interference.

Apply quantitative methods to biological problems.
Write Python, MATLAB, and R code to analyze biological data.
Examine any protein structure in PyMOL.
Design and carry out genetic experiments through simulation tools
Biochemistry: Equilibrium and Kinetics and PyMOL
Visual Neuroscience using Machine Learning
Molecular Modeling with PyMOL
Using MATLAB to Analyze Neural Response Properties
Genomics Data Analysis and Sequence Analysis Using Python, Gene Expression Application Using Python
Statistical Data Analysis Using R
Single-Cell Gene Expression Using R

Understand a series of concepts, thought patterns, analysis paradigms, and computational and statistical tools that together support data science and reproducible research.
Fundamentals of reproducible science using case studies that illustrate various practices
Key elements for ensuring data provenance and reproducible experimental design
Statistical methods for reproducible data analysis
Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder), reproducible data (data repositories/Dataverse), reproducible dynamic report generation (R Markdown/R Notebook/Jupyter/Pandoc), and workflows.
How to develop new methods and tools for reproducible research and reporting
How to write your own reproducible paper.

RNA-seq data analysis: FASTQ files and quality control of FASTQ files.
Aligning RNA-seq reads; visualizing alignments and analyzing RNA-seq at the gene-level: counting reads in genes; Exploratory Data Analysis and variance stabilization for counts; count-based differential expression; normalization and batch effects.
RNA-seq at the transcript-level: inferring expression of transcripts (i.e. alternative isoforms); differential exon usage.
Analyzing DNA methylation data, including reading the raw data, normalization, and finding regions of differential methylation across multiple samples.
The basic steps for analyzing ChIP-seq datasets, from read alignment to peak calling and assessing differential binding patterns across multiple samples.

Methods that identified DNA as the genetic material
Structure of DNA and methods for packaging DNA into the cell
Impacts of packaging on DNA expression in higher organisms and passage of information with no change in DNA (epigenetics)
Location-specific DNA expression in the cell
Machinery for replicating DNA with an extremely low error rate
Place of origin and timing for DNA replication
Mechanisms for “preserving” the ends of linear DNA
Types of damage that affect DNA structure and how DNA moves around
Procedures to amplify DNA sequences and to determine base sequence
Enzymes to fragment DNA into specific segments that can be separated
Methods to recombine DNA segments from different sources
Ways to introduce recombined DNA into cells, including human cells

Identify the main characteristics of Linux and its use in biology
Describe the structure of a Linux file system
Use Linux commands to navigate the file system
Perform Linux commands to manipulate and interrogate biological data files
Prepare biological data files under Linux for exporting into other environments such as R
Write and execute simple shell scripts in order to automate the processing of data
Practice on biological data using different case scenarios

Assess DNA representations and protein sequences
Perform searches in primary databases (repositories) and retrieve gene/protein data
Interpret different repository submission formats
Investigate biological databases for research
Identify the putative function of proteins based on their conserved domains
Infer function from sequence

Collect, access, and download whole bacterial genomes from public repositories
Investigate and navigate bacterial genomes and their annotation using Artemis
Identify genomic regions with low/high GC (guanine-cytosine) content, often associated with virulence
Perform simple comparative analyses between bacterial genomes
Multi-FASTA files, reference and draft bacterial genomes
Genome annotation
Genomic regions defined by GC (guanine-cytosine) content
Accessing and downloading whole-genome sequences
Pathogenicity islands

Introduction to the relevant biology, explaining what we measure with high-throughput technologies and why.
Introduction to high-throughput technologies: next-generation sequencing and microarrays.
Preprocessing and Normalization
R programming language and the Bioconductor GenomicRanges utilities
Genomic Annotation.

The application of Python to Data Science.
How to define variables in Python.
Sets and conditional statements in Python.
The purpose of having functions in Python.
How to operate on files to read and write data in Python.
How to use pandas, a must-have package for anyone attempting data analysis in Python.

This course introduces you to the basic biology of modern genomics and the experimental tools that we use to measure it. We'll introduce the Central Dogma of Molecular Biology and cover how next-generation sequencing can be used to measure DNA, RNA, and epigenetic patterns. You'll also get an introduction to the key concepts in computing and data science that you'll need to understand how data from next-generation sequencing experiments are generated and analyzed.

Describe the R programming language and its programming environment
Explain the fundamental concepts associated with programming in R including functions, variables, data types, pipes, and vectors
Describe the options for generating visualizations in R
Demonstrate an understanding of basic formatting in R Markdown to create structure and emphasize content
Data Analysis, Data Visualization (DataViz), R Markdown, RStudio
