The advent of next-generation sequencing (NGS) technologies has revolutionized research in molecular biology and genetics. High-throughput analyses have emerged as promising tools to investigate biological problems in a genome-wide context. To take full advantage of these approaches, we introduce multiple examples of computational resources that are indispensable for gaining novel knowledge from ChIP-seq and RNA-seq processing pipelines.
PART I - Introduction
1. The UCSC genome browser
- Genome browsers, tracks, visualization
- UCSC tracks: RefSeq, phastCons
- UCSC sessions and custom tracks
- UCSC tools: BLAT, Table Browser
2. The Galaxy environment
- Basic file operations
- Galaxy interface
- Using genomic data
PART II - ChIP-seq analysis
3. Basic pipeline (I): mapping/peak calling
- Raw data (single-end): FASTQ format
- NCBI-GEO: Platform/Samples
- Mapping reads with Bowtie
- SAM format (single)
- UCSC tracks: bedGraph profiles
- UCSC tracks: BED peaks
4. Basic pipeline (II): genes and plots
- Match peaks and genes
- Gene Ontology: Enrichr/DAVID
- Distribution plots of reads
- ENCODE project: ChIP-seq
5. Characterization of peaks
- Catalogs of regulatory information
- Detection of regulatory binding sites
- Phylogenetic footprinting
- Motif discovery in sequences
PART III - RNA-seq analysis
6. Basic pipeline (I): mapping
- Raw data (paired-end): FASTQ
- Raw data (strand-specific): FASTQ
- NCBI-GEO: Platform/Samples
- Mapping with TopHat
- SAM format (paired)
- UCSC tracks: bedGraph profiles
7. Basic pipeline (II): quantification
- RPKM quantification with Cufflinks
- Differential gene expression
- Gene expression heat maps
- ENCODE project: RNA-seq
PART IV - Past, present and future
8. scRNA-seq: basic analysis
- Classes of scRNA-seq
- Problems and limitations
- Post-processing analysis of data
- Pseudotime ordering and trajectories
9. Microarrays: basic analysis
- Classes of arrays
- NetAffx/Agilent web tools
- NCBI-GEO: Platform/Samples
- Babelomics web platform