Last updated: March 31, 2024
(Page under development)
Genetics is worth learning for anyone interested in AI applications in the field, for a few reasons:
Understanding the data: Genetics provides the foundation for understanding the data AI uses in this field. AI analyzes genetic information, and knowing what this information represents (genes, mutations, etc.) is crucial for interpreting the AI's results.
Identifying patterns: AI is effective at finding patterns in large datasets. Genetic knowledge helps researchers understand the biological significance of the patterns AI detects in genetic data.
Developing new applications: Understanding genetics allows researchers to develop new AI applications in the field. For example, designing gene therapies with AI and predicting disease risk from an individual's genetic makeup both require a strong foundation in genetics.
References:
GenAI (Gemini) and RS Vilhekar and A Rawekar, "Artificial Intelligence in Genetics," Cureus. 2024 Jan; 16(1): e52035. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10856672/
(Topics taken from https://onlinelearning.hms.harvard.edu/hmx/courses/hmx-genetics-2/ )
Introduction to Genetics and Human Genome.
The Central Dogma and Genetic Variation (The relationship between genotype and phenotype; Structure of a human gene and the effects of genetic variation). See https://www.youtube.com/watch?v=whV_CkKT7F0
Mendelian Genetics. A very good video introduction: https://youtu.be/NWqgZUnJdAY?si=xivJTA6T9blwUyv3
Mendelian Inheritance of Disease. (Meiotic segregation, Modes of inheritance, Pedigree analysis, Penetrance and expressivity)
Identifying Mendelian Disease Genes (Haplotypes and linkage studies, Determining causation of a variant, Targeted genetic testing)
Chromosomal Aberrations (DNA segregation machinery, Whole chromosome and structural aneuploidy, Diagnostic techniques for chromosomal disorders)
The Genetics of Cancer (Germline and somatic mutations, Tumor suppressors and oncogenes, Two hit hypothesis, Precision cancer treatments)
Common Complex Traits (Architecture of a complex trait, Genome-wide association studies, Heritability and missing heritability, Understanding risk in common complex traits)
Human Population Genetics (Emergence and history of human traits, Evolutionary forces and population dynamics, Ancestry testing and population-specific risk)
Beyond the Genome Sequence (Mitochondrial inheritance, Unstable repeats, Epigenetic inheritance and imprinting, Gene dosage and X-inactivation)
Genetics and Precision Medicine (Whole genome sequencing, Pharmacogenomics, Genome editing)
Ensembl (The Human Genome). Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function and collects disease data. Ensembl tools include BLAST, BLAT, BioMart and the Variant Effect Predictor (VEP) for all supported species.
GenBank/DDBJ/EMBL (Nucleotide sequence, Protein sequence, etc). The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information. https://ncbi.nlm.nih.gov/
SWISS-PROT or Expasy. Swiss Bioinformatics Resource Portal, https://www.expasy.org/
InterProScan (Protein domains), EMBL-EBI, Unleashing the potential of big data in biology, https://www.ebi.ac.uk/
GenomeNet. A Japanese network of database and computational services for genome research and related research areas in biomedical sciences, operated by the Kyoto University Bioinformatics Center. https://www.genome.jp/en/
BioPython
Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. https://biopython.org/
Bioinformatics with Biopython - Full Course | 1 hour Python for Bioinformatics tutorial, https://www.youtube.com/watch?v=ocA2IMe7dpA
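As a rough illustration of what Biopython offers, the sketch below uses its `Seq` class to transcribe a DNA coding strand into mRNA and translate it into a protein. It assumes Biopython is installed (`pip install biopython`). The function name `central_dogma` and the example sequence are illustrative choices, not part of the library, and the `try`/`except` guard is only there so the script does not crash when Biopython is missing.

```python
# Minimal sketch, assuming Biopython is installed (pip install biopython).
try:
    from Bio.Seq import Seq
except ImportError:  # let the script run (and report) without Biopython
    Seq = None


def central_dogma(dna_text):
    """Return mRNA, protein and reverse complement for a coding-strand DNA string.

    Hypothetical helper for illustration; returns None if Biopython is missing.
    """
    if Seq is None:
        return None
    coding_dna = Seq(dna_text)
    mrna = coding_dna.transcribe()            # coding strand -> mRNA (T becomes U)
    protein = mrna.translate(to_stop=True)    # standard codon table, stop at first stop codon
    return {
        "mrna": str(mrna),
        "protein": str(protein),
        "reverse_complement": str(coding_dna.reverse_complement()),
    }


# Arbitrary example sequence, chosen only for illustration.
result = central_dogma("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
if result is None:
    print("Biopython is not installed")
else:
    for name, value in result.items():
        print(f"{name}: {value}")
```

With `to_stop=True`, translation ends at the first stop codon, so the returned protein contains only the amino acids before it.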
PubChem
Quickly find chemical information from authoritative sources, https://pubchem.ncbi.nlm.nih.gov/
PubChemPy - PubChemPy provides a way to interact with PubChem in Python. It allows chemical searches by name, substructure and similarity, chemical standardization, conversion between chemical file formats, depiction, and retrieval of chemical properties. https://pypi.org/project/PubChemPy/
Awesome Bioinformatics
A curated list of awesome Bioinformatics software, resources, and libraries. Mostly command line based, and free or open-source. https://github.com/danielecook/Awesome-Bioinformatics
(partially generated by GenAI tools)
DNA: Deoxyribonucleic acid (DNA) is the genetic material that contains the instructions for building and maintaining an organism. Imagine it as the instruction manual.
RNA: Ribonucleic acid (abbreviated RNA) is a nucleic acid present in all living cells that has structural similarities to DNA. Unlike DNA, however, RNA is most often single-stranded. An RNA molecule has a backbone made of alternating phosphate groups and the sugar ribose, rather than the deoxyribose found in DNA.
mRNA: Messenger RNA (mRNA) is a single-stranded RNA copy of a gene that carries the instructions for making a protein from the DNA to the ribosome, where the protein is assembled.
Proteins: Proteins are the workhorses of the cell. Each protein is a chain of amino acids built from the instructions in a gene, and proteins carry out most of the functions in a cell. They are like machines built using the instructions in the manual.
Genes: Genes are specific sections of DNA, most of which code for proteins. Think of them as chapters in the instruction manual, each with instructions to build a specific part.
Genome: The genome is the entire set of DNA instructions found in an organism. This includes all the genes and the non-coding DNA, like the chapters and the binding in the instruction manual.
Genotype: The genetic makeup of an organism, encoded in its DNA, the hereditary material of the cell. For a single gene, the genotype is the pair of alleles (gene versions) an individual carries, written with letters such as TT or Tt.
Phenotype: The set of observable traits of an organism, which results from its genotype together with its environment.
See a more detailed list here: https://www.greeleyschools.org/cms/lib2/CO01001723/Centricity/Domain/5219/punnett%20sq%20cheat%20sheet.pdf
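The genotype and phenotype entries, and the Punnett square cheat sheet linked above, can be illustrated with a small sketch. It crosses two parents for a single gene and counts the offspring genotypes and phenotypes. The function names `punnett_square` and `phenotype` are hypothetical, and the code assumes complete dominance, with the uppercase letter marking the dominant allele.

```python
from collections import Counter
from itertools import product


def punnett_square(parent1, parent2):
    """Count offspring genotypes for a single-gene cross, e.g. 'Tt' x 'Tt'."""
    offspring = Counter()
    # Each parent passes on one of its two alleles; pair every combination.
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that 'tT' and 'Tt' count as the same genotype (uppercase first).
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += 1
    return offspring


def phenotype(genotype):
    """Assumption: complete dominance, so one uppercase allele is enough."""
    return "dominant" if any(allele.isupper() for allele in genotype) else "recessive"


genotypes = punnett_square("Tt", "Tt")
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))
print(dict(phenotypes))
```

For the Tt x Tt cross this gives the classic 1:2:1 genotype ratio (TT, Tt, tt) and a 3:1 ratio of dominant to recessive phenotypes. Traits with incomplete dominance or multiple genes would need a different `phenotype` rule.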