Comparative Plant Transcriptomics

Spring 2023 BIT CPT Course Instructors:

Course designer and Instructor: Dr. Carly Sjogren, B.A. Biology, Ph.D. Genetics, Genomics & Bioinformatics

Assistant Instructor: Dr. Emily Delorean, B.S. Crop Science, M.S. Plant Pathology, Ph.D. Genetics

Graduate Teaching Assistant: Edmaritz Hernandez Pagan

COURSE INVESTIGATION

The Spring 2023 semester of BIT CPT is analyzing brand-new RNA-seq data sets. Teams of undergraduate and graduate researchers will collaboratively analyze RNA-seq data from the model plant species Arabidopsis thaliana and the crop species Glycine max (soybean). Student researchers will compare soybean transcriptomes by aligning reads to several reference genomes, ranging from cultivar-specific, new HiFi-assembled genomes to ancient cultivars and related species in the Glycine genus.

Arabidopsis wildtype leaf and meristem samples.
Glycine max cultivar Lee wildtype leaf and meristem samples.

AS A LEARNING COMMUNITY

BIOLOGICAL QUESTION: What changes in gene expression exist between differentiated young leaves and undifferentiated stem cells from shoot meristems in different species?

IN RESEARCH TEAMS

BIOLOGICAL QUESTION: What changes in soybean gene expression do we uncover when our experimental design includes high numbers of biological replicates?

FROM INDIVIDUAL RESEARCH FOR FINAL PORTFOLIOS

BIOLOGICAL QUESTION: What changes in soybean gene expression do we uncover when aligning reads to different reference genomes?

PIPELINE OVERVIEW

You and/or your collaborators have completed the hard work of designing your experiment, collecting your biological materials, isolating RNA, generating sequencing libraries and getting your samples sequenced. Now you finally have your sequences to analyze. What do you do with these giant sequence files to get to the interesting stuff you want to know?!

We will use the descriptions provided here to guide you through the analysis of RNA sequence data via a bioinformatic pipeline on Henry2, NC State's High-Performance Computing (HPC) cluster:

1. Set up your working directory and QC your data
2. Build an indexed reference genome to align your sequences
3. Align your sequences to the genome
4. Quantify the aligned sequences into counts to be analyzed downstream
5. Off the HPC, you will explore your data outputs using a free graphical user interface, GALAXY

STAR is used to build an indexed Arabidopsis reference genome and align sequence reads to it.

SALMON is used to quantify and normalize the aligned reads.
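
To make "normalize" concrete: one of the normalized values Salmon reports is transcripts per million (TPM). The source does not say which normalized value the course relies on, so the sketch below assumes TPM. The function name `tpm` and the toy numbers are hypothetical and are not course data.

```python
def tpm(counts, effective_lengths):
    """Convert per-transcript read counts to transcripts per million (TPM).

    counts            -- reads assigned to each transcript
    effective_lengths -- effective length of each transcript, in bases
    """
    if len(counts) != len(effective_lengths):
        raise ValueError("counts and effective_lengths must have the same length")
    if any(length <= 0 for length in effective_lengths):
        raise ValueError("effective lengths must be positive")

    # Reads per base: corrects for longer transcripts collecting more reads.
    rates = [c / length for c, length in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]

    # Rescale so the values for one sample sum to one million.
    return [rate / total * 1_000_000 for rate in rates]


# Hypothetical toy values for three transcripts (not course data).
toy_counts = [100, 200, 300]
toy_lengths = [1000.0, 2000.0, 1000.0]
toy_tpm = tpm(toy_counts, toy_lengths)
```

In this toy case the first two transcripts have different read counts but the same reads-per-base rate, so they receive the same TPM. This is the length correction that raw counts lack.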

GALAXY is used to explore data outputs including differential gene expression.
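
The HPC steps above can be sketched as a set of command lines. This is a minimal sketch under stated assumptions, not the course's actual scripts: it only builds the commands and does not run them. FastQC as the QC tool, paired-end gzipped FASTQ input, the `IU` library type, and every file, directory, and function name are assumptions introduced for illustration.

```python
from pathlib import Path
import shlex


def build_pipeline_commands(sample, fastq_r1, fastq_r2, genome_fasta,
                            annotation_gtf, transcripts_fasta,
                            workdir="rnaseq_work", threads=8, lib_type="IU"):
    """Return the argument list for each pipeline step, keyed by step name.

    Nothing is executed here. The output directories are assumed to exist
    before the commands are run.
    """
    work = Path(workdir)
    index_dir = (work / "star_index").as_posix()
    prefix = (work / "aligned" / sample).as_posix() + "_"
    # STAR writes this file when --quantMode TranscriptomeSAM is requested.
    transcriptome_bam = prefix + "Aligned.toTranscriptome.out.bam"

    return {
        # Step 1: read-level QC (FastQC is an assumption; the source names no QC tool).
        "qc": ["fastqc", "--outdir", (work / "qc").as_posix(),
               fastq_r1, fastq_r2],
        # Step 2: build the indexed reference genome.
        "index": ["STAR", "--runMode", "genomeGenerate",
                  "--runThreadN", str(threads),
                  "--genomeDir", index_dir,
                  "--genomeFastaFiles", genome_fasta,
                  "--sjdbGTFfile", annotation_gtf],
        # Step 3: align reads and also emit transcriptome-space alignments.
        "align": ["STAR", "--runThreadN", str(threads),
                  "--genomeDir", index_dir,
                  "--readFilesIn", fastq_r1, fastq_r2,
                  "--readFilesCommand", "zcat",  # assumes gzipped FASTQ
                  "--outSAMtype", "BAM", "Unsorted",
                  "--quantMode", "TranscriptomeSAM",
                  "--outFileNamePrefix", prefix],
        # Step 4: quantify with Salmon in alignment-based mode.
        # lib_type is an assumption; check it against the library prep.
        "quantify": ["salmon", "quant",
                     "-t", transcripts_fasta,
                     "-l", lib_type,
                     "-a", transcriptome_bam,
                     "-o", (work / "salmon" / sample).as_posix()],
    }


# Hypothetical sample and file names, for illustration only.
commands = build_pipeline_commands(
    "meristem_rep1", "meristem_rep1_R1.fastq.gz", "meristem_rep1_R2.fastq.gz",
    "genome.fa", "annotation.gtf", "transcripts.fa")
for step, argv in commands.items():
    print(f"{step}: {shlex.join(argv)}")
```

Keeping each command as a list of arguments makes it straightforward to inspect, log, or wrap in a cluster job submission before anything runs.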

The pages listed below follow our pipeline so that the work done during this course can be reproduced. Note: we recorded ALL the work that we did, including missteps, so anyone reproducing it can and should skip the erroneous steps.

The Research Team Websites

Mature Leaf Team

- Noah
- Jacob
- John

Young Leaf Team

- Colin
- Monica
- Carlos

Meristem Team

- Haley
- James
- Jay
- Andrew
