Resources

tRNA molecules contain abundant and dense modifications that affect tRNA structure, precise codon recognition, and tRNA fragment formation. tRNA modifications and related enzymes undergo dynamic changes under different conditions and are associated with a range of physiological and pathological processes. However, there is a lack of resources that can mine and analyze these dynamic tRNA modifications. In this study, we established tModBase to decipher the landscape of tRNA modifications and their dynamic changes from epitranscriptome data. tModBase accurately annotates the positions of 27 types of chemical modifications and the corresponding modifying enzymes on 523 tRNA isodecoders in human and mouse. In addition, we analyzed 25 datasets generated by second- and third-generation sequencing technologies specifically designed to detect and quantify tRNA molecules and their modifications, and illustrated the sequencing signal characteristics of different modification sites. Based on this, we systematically demonstrated the distribution of tRNA modification patterns in different tissues and summarized the characteristics of tRNA-associated human diseases. By integrating multi-omics sequencing data from 33 cancers, we developed novel tools to analyze the relationships among tRNA modifications and modification enzymes, the expression of 1442 tRNA-derived small RNAs (tsRNAs), and 654 DNA variation sites. Our database supports comprehensive analysis and will provide new insights into the mechanisms of dynamic changes in tRNA modifications and the biological pathways they participate in.

tRNA-derived small RNA (tsRNA), a novel type of regulatory small noncoding RNA, plays an important role in physiological and pathological processes. However, the understanding of the functional mechanisms of tsRNAs in cells and their role in the occurrence and development of diseases is limited. Here, we integrated multi-omics data such as transcriptome, epitranscriptome, and targetome data, and developed novel computational tools to establish tsRFun, a comprehensive platform to facilitate tsRNA research (http://rna.sysu.edu.cn/tsRFun/ or http://biomed.nscc-gz.cn/DB/tsRFun/). tsRFun evaluates tsRNA expression profiles and the prognostic value of tsRNAs across 32 types of cancer, identifies tsRNA target molecules using high-throughput CLASH/CLEAR or CLIP sequencing data, and constructs the interaction networks among tsRNAs, microRNAs, and mRNAs. In addition to its data presentation capabilities, tsRFun offers multiple real-time online tools for tsRNA identification, target prediction, and functional enrichment analysis. In summary, tsRFun provides a valuable data resource and multiple analysis tools for tsRNA investigation.

Single-cell RNA sequencing (scRNA-Seq) technology has revealed significant differences in gene expression levels between different cell groups. However, most studies focus on the analysis of mRNAs and ignore long non-coding RNAs (lncRNAs), which have been shown to be more abundant and to have significant cell specificity. In this study, we developed ColorCells, a platform for comparative analysis of the expression, classification and functions of lncRNAs and mRNAs in single-cell RNA-Seq data. We applied ColorCells to analyze 167,913 publicly available scRNA-Seq experiments from 5 species. Integrative annotation of lncRNAs reveals large numbers of cell-specific lncRNAs and their properties. ColorCells provides a series of novel tools and a friendly visual interface: it applies the PCA and t-SNE algorithms to cell clusters in 2D and 3D space, offers a tool to show various tissues and cell types in human and mouse, uses a statistical test based on the hypergeometric distribution for automatic assignment to cell clusters, performs estimation based on SNN and Pearson correlation analysis, and builds protein-lncRNA co-expression networks to predict lncRNAs from scRNA-Seq data. Our study emphasizes the need to uncover lncRNAs that occur in all types of cells, and we hope ColorCells will be a good resource for revealing the features, expression and functions of lncRNAs in single cells.
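The hypergeometric test mentioned above can be illustrated with a small sketch. The description does not state what ColorCells assigns to cell clusters or how the test is set up, so the example below assumes, purely for illustration, that each candidate label comes with a marker gene set and that a cluster is scored by the overlap between its genes and those markers. The function names, gene identifiers and label names are all hypothetical and are not taken from ColorCells.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable X.

    X counts the successes seen when `draws` items are taken without
    replacement from `population` items, `successes` of which are successes.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total


def best_label(cluster_genes, marker_sets, background_size):
    """Score every candidate label and return (best label, all p-values).

    Assumption: cluster genes and marker genes are all drawn from the same
    background of `background_size` genes.
    """
    p_values = {}
    for label, markers in marker_sets.items():
        overlap = len(cluster_genes & markers)
        p_values[label] = hypergeom_sf(
            overlap, background_size, len(markers), len(cluster_genes)
        )
    # The smallest p-value marks the most strongly enriched marker set.
    return min(p_values, key=p_values.get), p_values


# Hypothetical toy data: 20 background genes, one cluster, two marker sets.
cluster = {"g1", "g2", "g3", "g4"}
markers = {
    "label_A": {"g1", "g2", "g3", "g9", "g10"},
    "label_B": {"g4", "g11", "g12", "g13", "g14"},
}
label, p_values = best_label(cluster, markers, background_size=20)
print(label, p_values)
```

In practice the p-values would also need multiple-testing correction across labels and clusters, which this sketch leaves out.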

dreamBase (DNA Modification, RNA Regulation and Protein Binding on Expressed Pseudogenes in Human Health and Disease) is an integrated platform for analysing regulatory features of pseudogenes from multi-dimensional high-throughput sequencing data.

Based on ~5500 ChIP-seq and DNase-seq datasets, dreamBase provides genome-wide distribution patterns of Transcription Factors, Pol II, Histone Modifications and DNase Hypersensitivity sites around the transcription start sites of pseudogenes.

By integrating ~18,000 RNA-Seq expression data, we provide the Expression Profiles of pseudogenes across 32 Cancers and 31 Normal Tissues. We also analyse the Co-Expression patterns between pseudogenes and their parent genes.
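As a rough illustration of a co-expression analysis between a pseudogene and its parent gene, the sketch below computes a Pearson correlation coefficient across matched samples. The description does not state which correlation measure dreamBase uses, so Pearson correlation is an assumption here, and the function name and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for one pseudogene and its parent gene,
# measured in the same five samples.
pseudogene_expr = [1.2, 2.5, 3.1, 4.8, 5.0]
parent_expr = [10.1, 19.8, 27.5, 41.0, 44.2]
print(pearson(pseudogene_expr, parent_expr))
```

A coefficient near 1 or -1 would suggest that the pair is positively or negatively co-expressed across the samples; a real analysis would also report a significance estimate.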

By combining predicted binding sites of microRNAs with AGO CLIP-seq data, we study the interactions between microRNAs and the RNA molecules of pseudogenes at the post-transcriptional level. In addition, we combine these interaction data with the expression data of pseudogenes to construct ceRNA networks consisting of pseudogenes and other RNAs.

Based on CLIP-seq data, we provide thousands of binding sites of RNA Binding Proteins on pseudogenes. We also provide the transcriptome-wide profiling of RNA Modifications on expressed pseudogenes based on epitranscriptome sequencing technologies. In addition, we provide a powerful dreamBase genome browser to visualize the distributions of these high-throughput sequencing data on pseudogenes.


deepBase is a platform for annotating and discovering small ncRNAs (microRNAs, siRNAs, piRNAs, ...), long ncRNAs (lncRNAs) and circular RNAs (circRNAs) from next-generation sequencing data. deepBase v2.0 provides a set of useful tools to decode the evolution and expression patterns of diverse ncRNAs across 19 species from 5 clades and to infer their functions. We provide accurate annotations of lncRNAs from RNA-seq experiments. By combining expression profiles of protein-coding genes and functional genomic annotations, we predict the functions of lncRNAs from co-expression networks derived from RNA-Seq data. deepBase v2.0 also provides an integrative, interactive and versatile web graphical interface to display multidimensional data and to facilitate transcriptomic research and the discovery of novel ncRNAs.

MicroRNAs (miRNAs) are small regulatory RNAs that play important roles in animals, plants and viruses. Deep-sequencing technology has been widely adopted in miRNA investigations. However, nearly all sequencing data contain miRNA sequences from exogenous species, called exogenous miRNAs (exo-miRNAs). In this study, we developed a novel platform, exo-miRExplorer, for mining and identifying exo-miRNAs from high-throughput small RNA sequencing experiments that originated from tissues and cell lines of multiple organisms. Thousands of exo-miRNAs are characterized in exo-miRExplorer by their expression abundance, RNA families, source organisms and sequencing platforms. We then used exo-miRExplorer to perform further analyses: comparative analysis of exo-miRNAs "intra-study" and "inter-study" revealed significant correlation of exo-miRNAs between experiments in the same study; analysis of plant-derived exo-miRNAs provided robust evidence for a non-diet source of exo-miRNAs; and analysis of virus-derived exo-miRNAs showed that pathogen RNAs could transfer to host cells and be present in deep-sequencing results at abundant levels. In conclusion, exo-miRExplorer provides users with an integrative resource to facilitate the detection and analysis of exo-miRNAs.