The repositories contain the datasets of most interest to us.
To run large datasets, it is better to use a remote server. Depending on the size of the dataset, Galaxy should be enough; it runs both R and Jupyter.
Includes scRNAseq tutorials for both R and Python. Choose your language first. They offer equivalent algorithms, so it really is a matter of taste. Galaxy accepts both.
One problem we ran into was the structure of the matrices that are made available, so there is a resource on matrices and matrix conversion.
https://www.parsebiosciences.com/blog/the-why-and-how-of-scrna-seq-a-guide-for-beginners/ (very basic but well illustrated)
https://www.10xgenomics.com/what-is-single-cell-rna-seq (also very basic, but in video)
https://www.sc-best-practices.org/introduction/scrna_seq.html
https://bioconductor.org/books/3.15/OSCA.basic/
A very popular interface for running R is RStudio: https://www.datacamp.com/tutorial/r-studio-tutorial
For Python, one of the favorites is Jupyter: https://www.dataquest.io/blog/jupyter-notebook-tutorial/
Using R Notebook to write code
#sites where scRNAseq datasets are kept#
#Delile et al Mouse Embryonic Spinal cord E9.5-E13.5#
doi:10.1242/dev.173807
https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-7320/
more files in https://github.com/juliendelile/MouseSpinalCordAtlas
#Mouse DRG E11.5-P42 Sharma et al#
doi.org/10.1038/s41586-019-1900-1
https://kleintools.hms.harvard.edu/tools/springViewer_1_6_dev.html?datasets/Sharma2019/all (clustered data in an OK platform)
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139088 (raw data)
#DRG E9.5-10.5 Faure et al#
doi.org/10.1038/s41467-020-17929-4
http://pklab.med.harvard.edu/nikolas/pagoda2/frontend/current/pagodaLocal/index.html (clustered data in an awful platform)
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE150150 (raw data)
#Whole Mouse Embryo E9.5-E13.5 Cao et al#
doi.org/10.1038/s41586-019-0969-x
https://oncoscape.v3.sttrcancer.org/atlas.gs.washington.edu.mouse.rna/downloads (clustered data in an OK platform)
https://shendure-web.gs.washington.edu/content/members/cao1025/public/mouse_embryo_atlas/ (pre-processed data)
#Curated repository#
A curated database reveals trends in single-cell Transcriptomics doi:10.1093/database/baaa073
www.nxn.se/single-cell-studies/data.tsv
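Once the curated table is downloaded, it can be filtered for studies of interest. A minimal sketch using only the Python standard library; the column names (`Shorthand`, `Organism`, `Tissue`, `Data location`) and the three sample rows are assumptions for illustration, so check the header of the real `data.tsv` first.

```python
import csv
import io

def filter_studies(tsv_text, organism, tissue_keyword):
    """Return rows whose organism and tissue fields match (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    matches = []
    for row in reader:
        # Column names are assumed; missing columns are treated as empty.
        org = (row.get("Organism") or "").lower()
        tissue = (row.get("Tissue") or "").lower()
        if organism.lower() in org and tissue_keyword.lower() in tissue:
            matches.append(row)
    return matches

# Hypothetical three-row sample standing in for the downloaded file.
SAMPLE_TSV = (
    "Shorthand\tOrganism\tTissue\tData location\n"
    "Study A\tMouse\tSpinal cord\tACC-0001\n"
    "Study B\tHuman\tBlood\tACC-0002\n"
    "Study C\tMouse\tDorsal root ganglion\tACC-0003\n"
)

hits = filter_studies(SAMPLE_TSV, "mouse", "spinal")
for row in hits:
    print(row["Shorthand"], row["Data location"])
```

For the real file, read it with `open("data.tsv", encoding="utf-8").read()` and pass the text to `filter_studies`.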
#disk space for users#
https://galaxyproject.org/main/#user-data-and-job-quotas
#how to use R in galaxy#
#how to use Python and Jupyter in Galaxy#
# R-Seurat>10x Genomics NextSeq500 PBMC #
https://satijalab.org/seurat/articles/pbmc3k_tutorial.html
#R-Bioconductor-Seurat>Smartseq from Tabula Muris> kidney Detailed quality control steps#
#R-Seurat> 10x Genomics Non-small Cell Lung Cancer Cells#
https://broadinstitute.github.io/2020_scWorkshop/data-wrangling-scrnaseq.html
#Exercises for Scientific Python#
https://www.oreilly.com/library/view/elegant-scipy/9781491922927/ch01.html
#Very clear step by step of big data tutorial analysis in Python#
https://www-users.york.ac.uk/~dj757/BIO00047I/PythonHandout.pdf (also here)
#Python > Smartseq from Tabula Muris> brain counts#
https://chanzuckerberg.github.io/scRNA-python-workshop/preprocessing/00-tabula-muris.html
#Python> 10x Genomics NextSeq500 PBMC data (same dataset as the Seurat tutorial, run in Python)#
https://scanpy-tutorials.readthedocs.io/en/latest/pbmc3k.html
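As an example of the R and Python tutorials using equivalent algorithms: to my understanding, Seurat's default `LogNormalize` and scanpy's `normalize_total` with `target_sum=1e4` followed by `log1p` both scale each cell to 10,000 total counts and then take log(1 + x). A plain-Python sketch of that idea on toy data; it is not the library implementation, and `log_normalize` is a hypothetical helper name.

```python
import math

def log_normalize(counts_per_cell, scale_factor=10_000):
    """Scale each cell to `scale_factor` total counts, then apply log(1 + x)."""
    normalized = []
    for cell in counts_per_cell:
        total = sum(cell)
        if total == 0:
            # An empty cell stays all zeros instead of dividing by zero.
            normalized.append([0.0 for _ in cell])
            continue
        normalized.append([math.log1p(c / total * scale_factor) for c in cell])
    return normalized

# Toy data: 2 cells x 3 genes (rows are cells, as in scanpy's AnnData).
toy_counts = [
    [1, 0, 3],
    [10, 0, 30],
]
result = log_normalize(toy_counts)
```

The two toy cells have the same gene proportions but different sequencing depth, so they end up with the same normalized values, which is the point of the per-cell scaling.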
E-book for beginners:
https://www.freecodecamp.org/news/the-python-code-example-handbook/
#General concepts (very clear) about RNAseq datasets (regardless of technology)#
https://bioconductor.org/books/3.13/OSCA.intro/getting-scrna-seq-datasets.html
https://hbctraining.github.io/scRNA-seq/lessons/readMM_loadData.html
#Concepts on how expression matrices are built (well illustrated)#
https://hbctraining.github.io/scRNA-seq/lessons/02_SC_generation_of_count_matrix
#More information about UMIs and their applications#
#How to construct an expression matrix from raw data#
https://scrnaseq-course.cog.sanger.ac.uk/website/construction-of-expression-matrix.html
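The core idea behind UMI-based counting can be sketched in a few lines: reads that share the same cell barcode, gene, and UMI are treated as PCR duplicates of one molecule, so each gene's count in a cell is the number of distinct UMIs. The reads below are made up for illustration, and `count_umis` is a hypothetical helper; real pipelines also correct sequencing errors in barcodes and UMIs, which this sketch does not attempt.

```python
from collections import defaultdict

def count_umis(reads):
    """Build {cell_barcode: {gene: n_unique_umis}} from (barcode, umi, gene) reads."""
    seen = defaultdict(set)  # (barcode, gene) -> set of UMIs observed
    for barcode, umi, gene in reads:
        seen[(barcode, gene)].add(umi)
    counts = defaultdict(dict)
    for (barcode, gene), umis in seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)

# Hypothetical aligned reads: (cell barcode, UMI, gene).
toy_reads = [
    ("AAAC", "TTGA", "Sox2"),
    ("AAAC", "TTGA", "Sox2"),  # PCR duplicate: same barcode, UMI, and gene
    ("AAAC", "CCGT", "Sox2"),
    ("AAAC", "GGAT", "Olig2"),
    ("GGTT", "TTGA", "Sox2"),
]
umi_counts = count_umis(toy_reads)
print(umi_counts)
```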
#What kind of data matrix does 10x Genomics generate#
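10x Genomics pipelines typically deliver the count matrix as three files: `matrix.mtx` (Matrix Market coordinate format), `barcodes.tsv` (cells), and `features.tsv` (called `genes.tsv` in older versions). The `.mtx` file has a header, a size line, and then one 1-based `row column value` triplet per non-zero count, with genes as rows and cells as columns. A minimal plain-Python sketch of reading that layout, on a toy file; `parse_matrix_market` is a hypothetical helper, and in practice a library reader such as `scipy.io.mmread` is the safer choice.

```python
def parse_matrix_market(text):
    """Parse Matrix Market coordinate text into (n_rows, n_cols, entries)."""
    # Lines starting with '%' are the header or comments; skip blank lines.
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("%")]
    n_rows, n_cols, n_entries = (int(x) for x in lines[0].split())
    entries = {}
    for ln in lines[1:]:
        row, col, value = ln.split()
        # The file is 1-based; convert to 0-based indices.
        entries[(int(row) - 1, int(col) - 1)] = int(value)
    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries

# Toy file: 3 genes (rows) x 2 cells (columns), 3 non-zero counts.
TOY_MTX = """%%MatrixMarket matrix coordinate integer general
%
3 2 3
1 1 5
3 1 2
2 2 7
"""

n_genes, n_cells, nonzero = parse_matrix_market(TOY_MTX)
print(n_genes, n_cells, nonzero)
```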
#Build a sparse matrix with R (takes up less space)#
https://slowkow.com/notes/sparse-matrix/
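The space saving comes from storing only the non-zero entries. The resource above is for R; the following is a plain-Python illustration of the same idea (coordinate, or triplet, storage) and of converting between the dense and sparse forms. The toy matrix and both function names are made up for this sketch.

```python
def dense_to_triplets(dense):
    """Keep only the non-zero entries as (row, col, value) triplets."""
    return [
        (i, j, value)
        for i, row in enumerate(dense)
        for j, value in enumerate(row)
        if value != 0
    ]

def triplets_to_dense(triplets, n_rows, n_cols):
    """Rebuild the full matrix, filling unlisted positions with zero."""
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, value in triplets:
        dense[i][j] = value
    return dense

# Toy count matrix: 4 genes x 5 cells, mostly zeros as in scRNA-seq data.
toy_dense = [
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]
toy_triplets = dense_to_triplets(toy_dense)
print(len(toy_triplets), "stored entries instead of", 4 * 5)
```

Note that the matrix dimensions must be kept alongside the triplets, since all-zero rows or columns leave no trace in the triplet list.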
#Convert gene codes to gene names#
https://pypi.org/project/mygene/
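A hedged sketch of how the `mygene` package could be used for this. It assumes the gene codes are Ensembl gene IDs (hence `scopes="ensembl.gene"`) and that the species is mouse; both are assumptions, as are the helper names. The network call is kept in its own function, and the response handling is shown on a mock result with placeholder IDs rather than real data.

```python
def symbols_from_hits(hits):
    """Map each queried ID to its gene symbol, or None when not found."""
    mapping = {}
    for hit in hits:
        # mygene marks unmatched queries with 'notfound': True.
        mapping[hit["query"]] = None if hit.get("notfound") else hit.get("symbol")
    return mapping

def lookup_symbols(gene_ids, species="mouse"):
    """Query MyGene.info (needs the mygene package and network access)."""
    import mygene  # imported here so the helper above works without it
    client = mygene.MyGeneInfo()
    hits = client.querymany(
        gene_ids, scopes="ensembl.gene", fields="symbol", species=species
    )
    return symbols_from_hits(hits)

# Mock response with placeholder IDs, shaped like a querymany result.
mock_hits = [
    {"query": "ID_1", "symbol": "GeneA"},
    {"query": "ID_2", "notfound": True},
]
id_to_symbol = symbols_from_hits(mock_hits)
print(id_to_symbol)
```

If one ID returns several hits, this sketch keeps only the last one; a real analysis should decide explicitly how to handle such duplicates.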
#Read different dataset formats#
https://bioconductor.org/books/3.13/OSCA.intro/getting-scrna-seq-datasets.html
https://satijalab.org/seurat/articles/visualization_vignette
https://satijalab.org/seurat/reference/dimplot