dbNSFP

INTRODUCTION:

    dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Its current version is based on Gencode release 29 / Ensembl version 94 and includes a total of 84,013,490 nsSNVs and ssSNVs (splicing-site SNVs). It compiles prediction scores from 43 prediction algorithms (SIFT, SIFT4G, Polyphen2-HDIV, Polyphen2-HVAR, LRT, MutationTaster2, MutationAssessor, FATHMM, MetaSVM, MetaLR, MetaRNN, CADD, CADD_hg19, VEST4, PROVEAN, FATHMM-MKL coding, FATHMM-XF coding, fitCons x 4, LINSIGHT, DANN, GenoCanyon, Eigen, Eigen-PC, M-CAP, REVEL, MutPred, MVP, gMVP, MPC, PrimateAI, DEOGEN2, BayesDel_addAF, BayesDel_noAF, ClinPred, LIST-S2, VARITY, ESM1b, EVE, AlphaMissense, ALoFT), 9 conservation scores (PhyloP x 3, phastCons x 3, GERP++, SiPhy and bStatistic) and other related information, including allele frequencies observed in the 1000 Genomes Project phase 3 data, UK10K cohorts data, ExAC consortium data, gnomAD data and the NHLBI Exome Sequencing Project ESP6500 data, various gene IDs from different databases, functional descriptions of genes, gene expression and gene interaction information, etc.

    Some dbNSFP contents (which may not be up to date) can also be accessed through variant tools, ANNOVAR, KGGSeq, VarSome, UCSC Genome Browser's Variant Annotation Integrator, Ensembl Variant Effect Predictor, SnpSift and HGMD. Please cite our papers (see below) if you use dbNSFP contents through those tools.

    Please note that some component scores/contents of dbNSFP have specific requirements or licences for non-academic use. dbNSFP does not grant non-academic use of those scores/contents, so please contact the original score/content providers for that purpose.

    Since v4 we provide a web service at http://database.liulab.science/dbNSFP for querying a short list of SNPs.

    Please join our Email group for news and updates from dbNSFP. 

    For whole genome annotation, we recommend our whole genome annotation pipeline WGSA, in which dbNSFP is a component resource. 

    We thank Dr. CS (Jonathan) Liu from Softgenetics and Amazon AWS Open Data Sponsorship Program for providing hosting space.

 

    We welcome developers of functional prediction methods to provide their predictions and scores to the database. Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com). 


CITATION:

1. Liu X, Jian X, and Boerwinkle E. 2011. dbNSFP: a lightweight database of human non-synonymous SNPs and their functional predictions. Human Mutation. 32:894-899.

2. Liu X, Jian X, and Boerwinkle E. 2013. dbNSFP v2.0: A Database of Human Non-synonymous SNVs and Their Functional Predictions and Annotations. Human Mutation. 34:E2393-E2402. 

3. Liu X, Wu C, Li C, and Boerwinkle E. 2016. dbNSFP v3.0: A One-Stop Database of Functional Predictions and Annotations for Human Non-synonymous and Splice Site SNVs. Human Mutation. 37:235-241. 

4. Liu X, Li C, Mou C, Dong Y, and Tu Y. 2020. dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Medicine. 12:103. 

If you used dbNSFP v1.x, please cite our paper 1. If you used dbNSFP v2.x, please cite our papers 1 & 2. If you used dbNSFP v3.x, please cite our papers 1 & 3. If you used dbNSFP v4.x, please cite our papers 1 & 4.

If you used our ensemble scores MetaSVM and MetaLR, which are based on 10 component scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in the 1000 Genomes populations, please cite:

1. Dong C, Wei P, Jian X, Gibbs R, Boerwinkle E, Wang K* and Liu X*. (2015) Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Human Molecular Genetics 24(8):2125-2137. *corresponding authors

If you used our ensemble score MetaRNN, which is based on 16 component scores (SIFT, Polyphen2_HDIV, Polyphen2_HVAR, MutationAssessor, PROVEAN, VEST4, M-CAP, REVEL, MutPred, MVP, PrimateAI, DEOGEN2, CADD, fathmm-XF, Eigen and GenoCanyon), 8 conservation scores (GERP, phyloP100way_vertebrate, phyloP30way_mammalian, phyloP17way_primate, phastCons100way_vertebrate, phastCons30way_mammalian, phastCons17way_primate and SiPhy), and allele frequency information from the 1000 Genomes Project, ExAC, and gnomAD, please cite:

1. Li C, Zhi D, Wang K and Liu X. (2021) MetaRNN: Differentiating Rare Pathogenic and Rare Benign Missense SNVs and InDels Using Deep Learning. bioRxiv. https://doi.org/10.1101/2021.04.09.438706


CURRENT VERSION:

   UPDATE (March 13, 2024): AlphaMissense scores are now licensed under the Creative Commons Attribution 4.0 International License (CC-BY) and have been added to the dbNSFP v4.7 "c" branch. If you need to use those scores from the "c" branch, please re-download the dbNSFP zip file.    

   NEW VERSION (March 3, 2024): dbNSFP v4.7 is released. CADD has been updated to v1.7. Allele frequencies of gnomAD exomes and genomes have been updated to v4.0.0. A bug in v4.6 that caused eQTLGen eQTLs for some tissues to be missing has been fixed.

   Two branches of dbNSFP are provided: dbNSFP4.7a, suitable for academic use, which includes all the resources, and dbNSFP4.7c, suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, GenoCanyon, and AlphaMissense. dbNSFP4.7a can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.7c can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here.
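
   As an illustration of how a downloaded zip file might be checked against the published md5sum, here is a minimal Python sketch. The function names are hypothetical, and the published value is assumed to be either a bare hex digest or the usual "digest  filename" line produced by md5sum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare a local file's MD5 with a published value.

    Only the first whitespace-separated token of `published_md5` is used,
    so both a bare digest and a "digest  filename" line are accepted
    (an assumption about the format of the md5sum file).
    """
    expected = published_md5.split()[0].lower()
    return md5_of_file(path) == expected
```

   A call such as matches_published_md5("dbNSFP4.7a.zip", line_from_md5_file) would return True only for an intact download; the file name here is only an example.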

   Columns added (dbNSFP_variant): gnomAD_exomes_MID_AC, gnomAD_exomes_MID_AN, gnomAD_exomes_MID_AF, gnomAD_exomes_MID_nhomalt, gnomAD_exomes_non_ukb_AC, gnomAD_exomes_non_ukb_AN, gnomAD_exomes_non_ukb_AF, gnomAD_exomes_non_ukb_nhomalt, gnomAD_exomes_non_ukb_AFR_AC, gnomAD_exomes_non_ukb_AFR_AN, gnomAD_exomes_non_ukb_AFR_AF, gnomAD_exomes_non_ukb_AFR_nhomalt, gnomAD_exomes_non_ukb_AMR_AC, gnomAD_exomes_non_ukb_AMR_AN, gnomAD_exomes_non_ukb_AMR_AF, gnomAD_exomes_non_ukb_AMR_nhomalt, gnomAD_exomes_non_ukb_ASJ_AC, gnomAD_exomes_non_ukb_ASJ_AN, gnomAD_exomes_non_ukb_ASJ_AF, gnomAD_exomes_non_ukb_ASJ_nhomalt, gnomAD_exomes_non_ukb_EAS_AC, gnomAD_exomes_non_ukb_EAS_AN, gnomAD_exomes_non_ukb_EAS_AF, gnomAD_exomes_non_ukb_EAS_nhomalt, gnomAD_exomes_non_ukb_FIN_AC, gnomAD_exomes_non_ukb_FIN_AN, gnomAD_exomes_non_ukb_FIN_AF, gnomAD_exomes_non_ukb_FIN_nhomalt, gnomAD_exomes_non_ukb_MID_AC, gnomAD_exomes_non_ukb_MID_AN, gnomAD_exomes_non_ukb_MID_AF, gnomAD_exomes_non_ukb_MID_nhomalt, gnomAD_exomes_non_ukb_NFE_AC, gnomAD_exomes_non_ukb_NFE_AN, gnomAD_exomes_non_ukb_NFE_AF, gnomAD_exomes_non_ukb_NFE_nhomalt, gnomAD_exomes_non_ukb_SAS_AC, gnomAD_exomes_non_ukb_SAS_AN, gnomAD_exomes_non_ukb_SAS_AF, gnomAD_exomes_non_ukb_SAS_nhomalt

   Columns deleted (dbNSFP_variant): gnomAD_exomes_controls_AC, gnomAD_exomes_controls_AN, gnomAD_exomes_controls_AF, gnomAD_exomes_controls_nhomalt, gnomAD_exomes_non_neuro_AC, gnomAD_exomes_non_neuro_AN, gnomAD_exomes_non_neuro_AF, gnomAD_exomes_non_neuro_nhomalt, gnomAD_exomes_non_cancer_AC, gnomAD_exomes_non_cancer_AN, gnomAD_exomes_non_cancer_AF, gnomAD_exomes_non_cancer_nhomalt, gnomAD_exomes_non_topmed_AC, gnomAD_exomes_non_topmed_AN, gnomAD_exomes_non_topmed_AF, gnomAD_exomes_non_topmed_nhomalt, gnomAD_exomes_controls_AFR_AC, gnomAD_exomes_controls_AFR_AN, gnomAD_exomes_controls_AFR_AF, gnomAD_exomes_controls_AFR_nhomalt, gnomAD_exomes_controls_AMR_AC, gnomAD_exomes_controls_AMR_AN, gnomAD_exomes_controls_AMR_AF, gnomAD_exomes_controls_AMR_nhomalt, gnomAD_exomes_controls_ASJ_AC, gnomAD_exomes_controls_ASJ_AN, gnomAD_exomes_controls_ASJ_AF, gnomAD_exomes_controls_ASJ_nhomalt, gnomAD_exomes_controls_EAS_AC, gnomAD_exomes_controls_EAS_AN, gnomAD_exomes_controls_EAS_AF, gnomAD_exomes_controls_EAS_nhomalt, gnomAD_exomes_controls_FIN_AC, gnomAD_exomes_controls_FIN_AN, gnomAD_exomes_controls_FIN_AF, gnomAD_exomes_controls_FIN_nhomalt, gnomAD_exomes_controls_NFE_AC, gnomAD_exomes_controls_NFE_AN, gnomAD_exomes_controls_NFE_AF, gnomAD_exomes_controls_NFE_nhomalt, gnomAD_exomes_controls_SAS_AC, gnomAD_exomes_controls_SAS_AN, gnomAD_exomes_controls_SAS_AF, gnomAD_exomes_controls_SAS_nhomalt, gnomAD_exomes_controls_POPMAX_AC, gnomAD_exomes_controls_POPMAX_AN, gnomAD_exomes_controls_POPMAX_AF, gnomAD_exomes_controls_POPMAX_nhomalt, gnomAD_exomes_non_neuro_AFR_AC, gnomAD_exomes_non_neuro_AFR_AN, gnomAD_exomes_non_neuro_AFR_AF, gnomAD_exomes_non_neuro_AFR_nhomalt, gnomAD_exomes_non_neuro_AMR_AC, gnomAD_exomes_non_neuro_AMR_AN, gnomAD_exomes_non_neuro_AMR_AF, gnomAD_exomes_non_neuro_AMR_nhomalt, gnomAD_exomes_non_neuro_ASJ_AC, gnomAD_exomes_non_neuro_ASJ_AN, gnomAD_exomes_non_neuro_ASJ_AF, gnomAD_exomes_non_neuro_ASJ_nhomalt, gnomAD_exomes_non_neuro_EAS_AC, 
gnomAD_exomes_non_neuro_EAS_AN, gnomAD_exomes_non_neuro_EAS_AF, gnomAD_exomes_non_neuro_EAS_nhomalt, gnomAD_exomes_non_neuro_FIN_AC, gnomAD_exomes_non_neuro_FIN_AN, gnomAD_exomes_non_neuro_FIN_AF, gnomAD_exomes_non_neuro_FIN_nhomalt, gnomAD_exomes_non_neuro_NFE_AC, gnomAD_exomes_non_neuro_NFE_AN, gnomAD_exomes_non_neuro_NFE_AF, gnomAD_exomes_non_neuro_NFE_nhomalt, gnomAD_exomes_non_neuro_SAS_AC, gnomAD_exomes_non_neuro_SAS_AN, gnomAD_exomes_non_neuro_SAS_AF, gnomAD_exomes_non_neuro_SAS_nhomalt, gnomAD_exomes_non_neuro_POPMAX_AC, gnomAD_exomes_non_neuro_POPMAX_AN, gnomAD_exomes_non_neuro_POPMAX_AF, gnomAD_exomes_non_neuro_POPMAX_nhomalt, gnomAD_exomes_non_cancer_AFR_AC, gnomAD_exomes_non_cancer_AFR_AN, gnomAD_exomes_non_cancer_AFR_AF, gnomAD_exomes_non_cancer_AFR_nhomalt, gnomAD_exomes_non_cancer_AMR_AC, gnomAD_exomes_non_cancer_AMR_AN, gnomAD_exomes_non_cancer_AMR_AF, gnomAD_exomes_non_cancer_AMR_nhomalt, gnomAD_exomes_non_cancer_ASJ_AC, gnomAD_exomes_non_cancer_ASJ_AN, gnomAD_exomes_non_cancer_ASJ_AF, gnomAD_exomes_non_cancer_ASJ_nhomalt, gnomAD_exomes_non_cancer_EAS_AC, gnomAD_exomes_non_cancer_EAS_AN, gnomAD_exomes_non_cancer_EAS_AF, gnomAD_exomes_non_cancer_EAS_nhomalt, gnomAD_exomes_non_cancer_FIN_AC, gnomAD_exomes_non_cancer_FIN_AN, gnomAD_exomes_non_cancer_FIN_AF, gnomAD_exomes_non_cancer_FIN_nhomalt, gnomAD_exomes_non_cancer_NFE_AC, gnomAD_exomes_non_cancer_NFE_AN, gnomAD_exomes_non_cancer_NFE_AF, gnomAD_exomes_non_cancer_NFE_nhomalt, gnomAD_exomes_non_cancer_SAS_AC, gnomAD_exomes_non_cancer_SAS_AN, gnomAD_exomes_non_cancer_SAS_AF, gnomAD_exomes_non_cancer_SAS_nhomalt, gnomAD_exomes_non_cancer_POPMAX_AC, gnomAD_exomes_non_cancer_POPMAX_AN, gnomAD_exomes_non_cancer_POPMAX_AF, gnomAD_exomes_non_cancer_POPMAX_nhomalt, gnomAD_exomes_non_topmed_AFR_AC, gnomAD_exomes_non_topmed_AFR_AN, gnomAD_exomes_non_topmed_AFR_AF, gnomAD_exomes_non_topmed_AFR_nhomalt, gnomAD_exomes_non_topmed_AMR_AC, gnomAD_exomes_non_topmed_AMR_AN, gnomAD_exomes_non_topmed_AMR_AF, 
gnomAD_exomes_non_topmed_AMR_nhomalt, gnomAD_exomes_non_topmed_ASJ_AC, gnomAD_exomes_non_topmed_ASJ_AN, gnomAD_exomes_non_topmed_ASJ_AF, gnomAD_exomes_non_topmed_ASJ_nhomalt, gnomAD_exomes_non_topmed_EAS_AC, gnomAD_exomes_non_topmed_EAS_AN, gnomAD_exomes_non_topmed_EAS_AF, gnomAD_exomes_non_topmed_EAS_nhomalt, gnomAD_exomes_non_topmed_FIN_AC, gnomAD_exomes_non_topmed_FIN_AN, gnomAD_exomes_non_topmed_FIN_AF, gnomAD_exomes_non_topmed_FIN_nhomalt, gnomAD_exomes_non_topmed_NFE_AC, gnomAD_exomes_non_topmed_NFE_AN, gnomAD_exomes_non_topmed_NFE_AF, gnomAD_exomes_non_topmed_NFE_nhomalt, gnomAD_exomes_non_topmed_SAS_AC, gnomAD_exomes_non_topmed_SAS_AN, gnomAD_exomes_non_topmed_SAS_AF, gnomAD_exomes_non_topmed_SAS_nhomalt, gnomAD_exomes_non_topmed_POPMAX_AC, gnomAD_exomes_non_topmed_POPMAX_AN, gnomAD_exomes_non_topmed_POPMAX_AF, gnomAD_exomes_non_topmed_POPMAX_nhomalt, gnomAD_genomes_controls_and_biobanks_AC, gnomAD_genomes_controls_and_biobanks_AN, gnomAD_genomes_controls_and_biobanks_AF, gnomAD_genomes_controls_and_biobanks_nhomalt, gnomAD_genomes_non_neuro_AC, gnomAD_genomes_non_neuro_AN, gnomAD_genomes_non_neuro_AF, gnomAD_genomes_non_neuro_nhomalt, gnomAD_genomes_non_cancer_AC, gnomAD_genomes_non_cancer_AN, gnomAD_genomes_non_cancer_AF, gnomAD_genomes_non_cancer_nhomalt, gnomAD_genomes_non_topmed_AC, gnomAD_genomes_non_topmed_AN, gnomAD_genomes_non_topmed_AF, gnomAD_genomes_non_topmed_nhomalt, gnomAD_genomes_controls_and_biobanks_AFR_AC, gnomAD_genomes_controls_and_biobanks_AFR_AN, gnomAD_genomes_controls_and_biobanks_AFR_AF, gnomAD_genomes_controls_and_biobanks_AFR_nhomalt, gnomAD_genomes_controls_and_biobanks_AMI_AC, gnomAD_genomes_controls_and_biobanks_AMI_AN, gnomAD_genomes_controls_and_biobanks_AMI_AF, gnomAD_genomes_controls_and_biobanks_AMI_nhomalt, gnomAD_genomes_controls_and_biobanks_AMR_AC, gnomAD_genomes_controls_and_biobanks_AMR_AN, gnomAD_genomes_controls_and_biobanks_AMR_AF, gnomAD_genomes_controls_and_biobanks_AMR_nhomalt, 
gnomAD_genomes_controls_and_biobanks_ASJ_AC, gnomAD_genomes_controls_and_biobanks_ASJ_AN, gnomAD_genomes_controls_and_biobanks_ASJ_AF, gnomAD_genomes_controls_and_biobanks_ASJ_nhomalt, gnomAD_genomes_controls_and_biobanks_EAS_AC, gnomAD_genomes_controls_and_biobanks_EAS_AN, gnomAD_genomes_controls_and_biobanks_EAS_AF, gnomAD_genomes_controls_and_biobanks_EAS_nhomalt, gnomAD_genomes_controls_and_biobanks_FIN_AC, gnomAD_genomes_controls_and_biobanks_FIN_AN, gnomAD_genomes_controls_and_biobanks_FIN_AF, gnomAD_genomes_controls_and_biobanks_FIN_nhomalt, gnomAD_genomes_controls_and_biobanks_MID_AC, gnomAD_genomes_controls_and_biobanks_MID_AN, gnomAD_genomes_controls_and_biobanks_MID_AF, gnomAD_genomes_controls_and_biobanks_MID_nhomalt, gnomAD_genomes_controls_and_biobanks_NFE_AC, gnomAD_genomes_controls_and_biobanks_NFE_AN, gnomAD_genomes_controls_and_biobanks_NFE_AF, gnomAD_genomes_controls_and_biobanks_NFE_nhomalt, gnomAD_genomes_controls_and_biobanks_SAS_AC, gnomAD_genomes_controls_and_biobanks_SAS_AN, gnomAD_genomes_controls_and_biobanks_SAS_AF, gnomAD_genomes_controls_and_biobanks_SAS_nhomalt, gnomAD_genomes_non_neuro_AFR_AC, gnomAD_genomes_non_neuro_AFR_AN, gnomAD_genomes_non_neuro_AFR_AF, gnomAD_genomes_non_neuro_AFR_nhomalt, gnomAD_genomes_non_neuro_AMI_AC, gnomAD_genomes_non_neuro_AMI_AN, gnomAD_genomes_non_neuro_AMI_AF, gnomAD_genomes_non_neuro_AMI_nhomalt, gnomAD_genomes_non_neuro_AMR_AC, gnomAD_genomes_non_neuro_AMR_AN, gnomAD_genomes_non_neuro_AMR_AF, gnomAD_genomes_non_neuro_AMR_nhomalt, gnomAD_genomes_non_neuro_ASJ_AC, gnomAD_genomes_non_neuro_ASJ_AN, gnomAD_genomes_non_neuro_ASJ_AF, gnomAD_genomes_non_neuro_ASJ_nhomalt, gnomAD_genomes_non_neuro_EAS_AC, gnomAD_genomes_non_neuro_EAS_AN, gnomAD_genomes_non_neuro_EAS_AF, gnomAD_genomes_non_neuro_EAS_nhomalt, gnomAD_genomes_non_neuro_FIN_AC, gnomAD_genomes_non_neuro_FIN_AN, gnomAD_genomes_non_neuro_FIN_AF, gnomAD_genomes_non_neuro_FIN_nhomalt, gnomAD_genomes_non_neuro_MID_AC, gnomAD_genomes_non_neuro_MID_AN, 
gnomAD_genomes_non_neuro_MID_AF, gnomAD_genomes_non_neuro_MID_nhomalt, gnomAD_genomes_non_neuro_NFE_AC, gnomAD_genomes_non_neuro_NFE_AN, gnomAD_genomes_non_neuro_NFE_AF, gnomAD_genomes_non_neuro_NFE_nhomalt, gnomAD_genomes_non_neuro_SAS_AC, gnomAD_genomes_non_neuro_SAS_AN, gnomAD_genomes_non_neuro_SAS_AF, gnomAD_genomes_non_neuro_SAS_nhomalt, gnomAD_genomes_non_cancer_AFR_AC, gnomAD_genomes_non_cancer_AFR_AN, gnomAD_genomes_non_cancer_AFR_AF, gnomAD_genomes_non_cancer_AFR_nhomalt, gnomAD_genomes_non_cancer_AMI_AC, gnomAD_genomes_non_cancer_AMI_AN, gnomAD_genomes_non_cancer_AMI_AF, gnomAD_genomes_non_cancer_AMI_nhomalt, gnomAD_genomes_non_cancer_AMR_AC, gnomAD_genomes_non_cancer_AMR_AN, gnomAD_genomes_non_cancer_AMR_AF, gnomAD_genomes_non_cancer_AMR_nhomalt, gnomAD_genomes_non_cancer_ASJ_AC, gnomAD_genomes_non_cancer_ASJ_AN, gnomAD_genomes_non_cancer_ASJ_AF, gnomAD_genomes_non_cancer_ASJ_nhomalt, gnomAD_genomes_non_cancer_EAS_AC, gnomAD_genomes_non_cancer_EAS_AN, gnomAD_genomes_non_cancer_EAS_AF, gnomAD_genomes_non_cancer_EAS_nhomalt, gnomAD_genomes_non_cancer_FIN_AC, gnomAD_genomes_non_cancer_FIN_AN, gnomAD_genomes_non_cancer_FIN_AF, gnomAD_genomes_non_cancer_FIN_nhomalt, gnomAD_genomes_non_cancer_MID_AC, gnomAD_genomes_non_cancer_MID_AN, gnomAD_genomes_non_cancer_MID_AF, gnomAD_genomes_non_cancer_MID_nhomalt, gnomAD_genomes_non_cancer_NFE_AC, gnomAD_genomes_non_cancer_NFE_AN, gnomAD_genomes_non_cancer_NFE_AF, gnomAD_genomes_non_cancer_NFE_nhomalt, gnomAD_genomes_non_cancer_SAS_AC, gnomAD_genomes_non_cancer_SAS_AN, gnomAD_genomes_non_cancer_SAS_AF, gnomAD_genomes_non_cancer_SAS_nhomalt, gnomAD_genomes_non_topmed_AFR_AC, gnomAD_genomes_non_topmed_AFR_AN, gnomAD_genomes_non_topmed_AFR_AF, gnomAD_genomes_non_topmed_AFR_nhomalt, gnomAD_genomes_non_topmed_AMI_AC, gnomAD_genomes_non_topmed_AMI_AN, gnomAD_genomes_non_topmed_AMI_AF, gnomAD_genomes_non_topmed_AMI_nhomalt, gnomAD_genomes_non_topmed_AMR_AC, gnomAD_genomes_non_topmed_AMR_AN, gnomAD_genomes_non_topmed_AMR_AF, 
gnomAD_genomes_non_topmed_AMR_nhomalt, gnomAD_genomes_non_topmed_ASJ_AC, gnomAD_genomes_non_topmed_ASJ_AN, gnomAD_genomes_non_topmed_ASJ_AF, gnomAD_genomes_non_topmed_ASJ_nhomalt, gnomAD_genomes_non_topmed_EAS_AC, gnomAD_genomes_non_topmed_EAS_AN, gnomAD_genomes_non_topmed_EAS_AF, gnomAD_genomes_non_topmed_EAS_nhomalt, gnomAD_genomes_non_topmed_FIN_AC, gnomAD_genomes_non_topmed_FIN_AN, gnomAD_genomes_non_topmed_FIN_AF, gnomAD_genomes_non_topmed_FIN_nhomalt, gnomAD_genomes_non_topmed_MID_AC, gnomAD_genomes_non_topmed_MID_AN, gnomAD_genomes_non_topmed_MID_AF, gnomAD_genomes_non_topmed_MID_nhomalt, gnomAD_genomes_non_topmed_NFE_AC, gnomAD_genomes_non_topmed_NFE_AN, gnomAD_genomes_non_topmed_NFE_AF, gnomAD_genomes_non_topmed_NFE_nhomalt, gnomAD_genomes_non_topmed_SAS_AC, gnomAD_genomes_non_topmed_SAS_AN, gnomAD_genomes_non_topmed_SAS_AF, gnomAD_genomes_non_topmed_SAS_nhomalt
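
   Pipelines that select columns by name can detect additions and deletions like those listed above by comparing the header lines of two releases. The following is a minimal Python sketch; the function name is hypothetical, and the two short headers are illustrative only (gnomAD_exomes_AC stands in for an unchanged column and is an assumption, the other names are taken from the lists above).

```python
def diff_columns(old_header, new_header):
    """Return (added, deleted) column names, preserving header order."""
    old_set, new_set = set(old_header), set(new_header)
    added = [name for name in new_header if name not in old_set]
    deleted = [name for name in old_header if name not in new_set]
    return added, deleted


# Illustrative headers, not the real (much longer) dbNSFP headers.
old = ["gnomAD_exomes_AC", "gnomAD_exomes_controls_AC", "gnomAD_exomes_non_neuro_AC"]
new = ["gnomAD_exomes_AC", "gnomAD_exomes_MID_AC", "gnomAD_exomes_non_ukb_AC"]
added, deleted = diff_columns(old, new)
```

   Running such a check before an upgrade shows which column selections in existing scripts would fail against the new release.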

   NEW VERSION (February 18, 2024): dbNSFP v4.6 is released. ClinVar has been updated to 20240215. GTEx V8 splicing QTLs (sQTLs) have been added. eQTLs from eQTLGen phase I have been added. There was a bug in v4.5 causing a large proportion of ESM1b scores to be misaligned. It has been fixed. We thank Dr. In-Hee Lee for reporting this bug.

   Two branches of dbNSFP are provided: dbNSFP4.6a, suitable for academic use, which includes all the resources, and dbNSFP4.6c, suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, GenoCanyon, and AlphaMissense. dbNSFP4.6a can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.6c can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here.

   Columns added (dbNSFP_variant): GTEx_V8_sQTL_gene, GTEx_V8_sQTL_tissue, eQTLGen_snp_id, eQTLGen_gene_id, eQTLGen_gene_symbol, eQTLGen_cis_or_trans

   Column name changes (dbNSFP_variant): GTEx_V8_gene changed to GTEx_V8_eQTL_gene, GTEx_V8_tissue changed to GTEx_V8_eQTL_tissue
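
   Scripts written against v4.5 column names can be adapted with a small rename table. This is a minimal Python sketch with hypothetical names; it assumes the second pre-v4.6 name was GTEx_V8_tissue, by analogy with GTEx_V8_gene.

```python
# Old name -> new name for the v4.6 GTEx eQTL columns.
V45_TO_V46 = {
    "GTEx_V8_gene": "GTEx_V8_eQTL_gene",
    "GTEx_V8_tissue": "GTEx_V8_eQTL_tissue",  # assumed old name
}


def upgrade_column_names(columns, renames=V45_TO_V46):
    """Map old column names to new ones; names not in the table pass through."""
    return [renames.get(name, name) for name in columns]
```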

   NEW VERSION (November 2, 2023): dbNSFP v4.5 is released. ClinVar has been updated to 20231028. ESM1b, EVE and AlphaMissense scores have been added. AlphaMissense scores are for non-commercial research use only: "AlphaMissense Database Copyright (2023) DeepMind Technologies Limited. All predictions are provided for non-commercial research use only under CC BY-NC-SA license." The derived AlphaMissense_score, AlphaMissense_rankscore, and AlphaMissense_pred distributed in dbNSFP are also under the CC BY-NC-SA license and are only included in the "a" branch of dbNSFP. A copy of the CC BY-NC-SA license can be found at https://creativecommons.org/licenses/by-nc-sa/4.0/.

   Two branches of dbNSFP are provided: dbNSFP4.5a, suitable for academic use, which includes all the resources, and dbNSFP4.5c, suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, GenoCanyon, and AlphaMissense. dbNSFP4.5a can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.5c can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here.

    Columns added (dbNSFP_variant): ESM1b_score, ESM1b_rankscore, ESM1b_pred, EVE_score, EVE_rankscore, EVE_Class10_pred, EVE_Class20_pred, EVE_Class25_pred, EVE_Class30_pred, EVE_Class40_pred, EVE_Class50_pred, EVE_Class60_pred, EVE_Class70_pred, EVE_Class75_pred, EVE_Class80_pred, EVE_Class90_pred, AlphaMissense_score, AlphaMissense_rankscore, AlphaMissense_pred

   NEW VERSION (May 6, 2023): dbNSFP v4.4 is released. gMVP and VARITY scores have been added. Allele frequencies of ALFA (Allele Frequency Aggregator) have been added. dbSNP has been updated to b156. ClinVar has been updated to 20230430. phyloP30way_mammalian has been replaced by phyloP470way_mammalian. phastCons30way_mammalian has been replaced by phastCons470way_mammalian. A bug in MutPred scores (not all SNVs causing the same AA change had scores) has been fixed. We thank Mária Šurinová for reporting this bug.

    Two branches of dbNSFP are provided: dbNSFP4.4a, suitable for academic use, which includes all the resources, and dbNSFP4.4c, suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, and GenoCanyon. dbNSFP4.4a can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.4c can be downloaded from Amazon, Box, Google Drive or the Softgenetics FTP. The md5sum of the zip file can be found here. A README file is here.

  Columns added (dbNSFP_variant):  gMVP_score, gMVP_rankscore, VARITY_R_score, VARITY_R_rankscore, VARITY_ER_score, VARITY_ER_rankscore, VARITY_R_LOO_score, VARITY_R_LOO_rankscore, VARITY_ER_LOO_score, VARITY_ER_LOO_rankscore, ALFA_European_AC, ALFA_European_AN, ALFA_European_AF, ALFA_African_Others_AC, ALFA_African_Others_AN, ALFA_African_Others_AF, ALFA_East_Asian_AC, ALFA_East_Asian_AN, ALFA_East_Asian_AF, ALFA_African_American_AC, ALFA_African_American_AN, ALFA_African_American_AF, ALFA_Latin_American_1_AC, ALFA_Latin_American_1_AN, ALFA_Latin_American_1_AF, ALFA_Latin_American_2_AC, ALFA_Latin_American_2_AN, ALFA_Latin_American_2_AF, ALFA_Other_Asian_AC, ALFA_Other_Asian_AN, ALFA_Other_Asian_AF, ALFA_South_Asian_AC, ALFA_South_Asian_AN, ALFA_South_Asian_AF, ALFA_Other_AC, ALFA_Other_AN, ALFA_Other_AF, ALFA_African_AC, ALFA_African_AN, ALFA_African_AF, ALFA_Asian_AC, ALFA_Asian_AN, ALFA_Asian_AF, ALFA_Total_AC, ALFA_Total_AN, ALFA_Total_AF
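
  The ALFA columns come in AC / AN / AF triples (allele count, allele number, allele frequency), where the frequency is the count divided by the number. As a hedged illustration of recomputing or sanity-checking an AF value from its AC and AN fields, here is a minimal Python sketch. The function name is hypothetical, and it assumes fields are read as text with "." marking a missing value.

```python
def allele_frequency(ac, an, missing="."):
    """Return AC/AN as a float, or None if a field is missing or AN is zero."""
    if ac == missing or an == missing:
        return None
    ac_value, an_value = int(ac), int(an)
    if an_value == 0:
        return None
    return ac_value / an_value
```

  Returning None rather than 0.0 keeps "not observed in this population" distinct from "no data for this population".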

  Column name changes (dbNSFP_variant): phyloP30way_mammalian changed to phyloP470way_mammalian, phyloP30way_mammalian_rankscore changed to phyloP470way_mammalian_rankscore, phastCons30way_mammalian changed to phastCons470way_mammalian, phastCons30way_mammalian_rankscore changed to phastCons470way_mammalian_rankscore

   UPDATE (January 23, 2023): There is a bug in dbNSFP4.3_gene.gz affecting the columns Interactions(IntAct), Interactions(BioGRID) and Interactions(ConsensusPathDB): when the interacting genes in dbNSFP4.3_gene.complete.gz were counted, a missing interaction (".") was counted as one interaction. We thank 山田 涼太 for reporting this bug. If you need to use those interaction counts, please re-download the whole dbNSFP zip file or use this file to replace the old one.
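
   The counting issue described above can be illustrated with a minimal Python sketch that treats the missing marker "." as zero interactions. The function name is hypothetical, and the ";" separator between interacting genes is an assumption made for illustration, not a statement about the actual file format.

```python
def count_interactions(field, separator=";", missing="."):
    """Count interacting genes in one cell; the missing marker counts as zero."""
    entries = [item.strip() for item in field.split(separator)]
    return sum(1 for item in entries if item and item != missing)
```

   A naive len(field.split(separator)) would report 1 for a cell containing only ".", which is the kind of off-by-one described in this update.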

  NEW VERSION (February 18, 2022): dbNSFP v4.3 is released. REVEL scores have been updated with transcript IDs, i.e., the scores are now transcript-specific. Genotypes of Chagyrskaya Neandertals have been added. dbSNP has been updated to b155. ClinVar has been updated to 20220122. dbNSFP4.3a can be downloaded from Amazon, Box or Google Drive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.3c can be downloaded from Amazon, Box or Google Drive. The md5sum of the zip file can be found here. A README file is here.

  Columns added (dbNSFP_variant): ChagyrskayaNeandertal

  UPDATE (July 27, 2021): Allele frequencies of gnomAD v3.1 mtDNA have been added. Users who want to use those annotations can download the corresponding files for v4.2a (Amazon, Box or Google Drive) and v4.2c (Amazon, Box or Google Drive) and the updated README files for v4.2a and v4.2c.

   UPDATE (July 14, 2021):  There is a bug in the search program when adding columns of gene annotations. We thank Eun Gyo Kim for reporting it. Please re-download the search programs for  v4.2a and v4.2c and unzip the files to the dbNSFP folder. 

   UPDATE (April 14, 2021): We added support for searching SpliceAI to the command-line-only version of the search programs for v4.2a and v4.2c. If you want to use those programs, please download the corresponding zip files, unzip them into the dbNSFP folder and run them according to the updated search_dbNSFP readme file.

    NEW VERSION (April 6, 2021): dbNSFP v4.2 is released. MetaRNN scores (https://doi.org/10.1101/2021.04.09.438706) have been added. MetaRNN is a deep-learning-based ensemble pathogenicity prediction score for nsSNVs and non-frameshift indels. MetaRNN uses a recurrent neural network (RNN) to integrate information from 16 high-level pathogenicity prediction scores, 8 conservation scores, and allele frequency information from the 1000 Genomes Project (1000GP), ExAC, and gnomAD. Allele frequencies of gnomAD exomes have been updated to r2.1.1. Allele frequencies of gnomAD genomes have been updated to v3.1. dbSNP has been updated to b154. ClinVar has been updated to 20210131. dbNSFP4.2a can be downloaded from Amazon, Box or Google Drive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.2c can be downloaded from Amazon, Box or Google Drive. The md5sum of the zip file can be found here. A README file is here.

    Columns added (dbNSFP_variant): MetaRNN_score, MetaRNN_rankscore, MetaRNN_pred, gnomAD_exomes_non_neuro_AN, gnomAD_exomes_non_neuro_AF, gnomAD_exomes_non_neuro_nhomalt, gnomAD_exomes_non_cancer_AC, gnomAD_exomes_non_cancer_AN, gnomAD_exomes_non_cancer_AF, gnomAD_exomes_non_cancer_nhomalt, gnomAD_exomes_non_topmed_AC, gnomAD_exomes_non_topmed_AN, gnomAD_exomes_non_topmed_AF, gnomAD_exomes_non_topmed_nhomalt, gnomAD_exomes_non_neuro_AFR_AC, gnomAD_exomes_non_neuro_AFR_AN, gnomAD_exomes_non_neuro_AFR_AF, gnomAD_exomes_non_neuro_AFR_nhomalt, gnomAD_exomes_non_neuro_AMR_AC, gnomAD_exomes_non_neuro_AMR_AN, gnomAD_exomes_non_neuro_AMR_AF, gnomAD_exomes_non_neuro_AMR_nhomalt, gnomAD_exomes_non_neuro_ASJ_AC, gnomAD_exomes_non_neuro_ASJ_AN, gnomAD_exomes_non_neuro_ASJ_AF, gnomAD_exomes_non_neuro_ASJ_nhomalt, gnomAD_exomes_non_neuro_EAS_AC, gnomAD_exomes_non_neuro_EAS_AN, gnomAD_exomes_non_neuro_EAS_AF, gnomAD_exomes_non_neuro_EAS_nhomalt, gnomAD_exomes_non_neuro_FIN_AC, gnomAD_exomes_non_neuro_FIN_AN, gnomAD_exomes_non_neuro_FIN_AF, gnomAD_exomes_non_neuro_FIN_nhomalt, gnomAD_exomes_non_neuro_NFE_AC, gnomAD_exomes_non_neuro_NFE_AN, gnomAD_exomes_non_neuro_NFE_AF, gnomAD_exomes_non_neuro_NFE_nhomalt, gnomAD_exomes_non_neuro_SAS_AC, gnomAD_exomes_non_neuro_SAS_AN, gnomAD_exomes_non_neuro_SAS_AF, gnomAD_exomes_non_neuro_SAS_nhomalt, gnomAD_exomes_non_neuro_POPMAX_AC, gnomAD_exomes_non_neuro_POPMAX_AN, gnomAD_exomes_non_neuro_POPMAX_AF, gnomAD_exomes_non_neuro_POPMAX_nhomalt, gnomAD_exomes_non_cancer_AFR_AC, gnomAD_exomes_non_cancer_AFR_AN, gnomAD_exomes_non_cancer_AFR_AF, gnomAD_exomes_non_cancer_AFR_nhomalt, gnomAD_exomes_non_cancer_AMR_AC, gnomAD_exomes_non_cancer_AMR_AN, gnomAD_exomes_non_cancer_AMR_AF, gnomAD_exomes_non_cancer_AMR_nhomalt, gnomAD_exomes_non_cancer_ASJ_AC, gnomAD_exomes_non_cancer_ASJ_AN, gnomAD_exomes_non_cancer_ASJ_AF, gnomAD_exomes_non_cancer_ASJ_nhomalt, gnomAD_exomes_non_cancer_EAS_AC, gnomAD_exomes_non_cancer_EAS_AN, 
gnomAD_exomes_non_cancer_EAS_AF, gnomAD_exomes_non_cancer_EAS_nhomalt, gnomAD_exomes_non_cancer_FIN_AC, gnomAD_exomes_non_cancer_FIN_AN, gnomAD_exomes_non_cancer_FIN_AF, gnomAD_exomes_non_cancer_FIN_nhomalt, gnomAD_exomes_non_cancer_NFE_AC, gnomAD_exomes_non_cancer_NFE_AN, gnomAD_exomes_non_cancer_NFE_AF, gnomAD_exomes_non_cancer_NFE_nhomalt, gnomAD_exomes_non_cancer_SAS_AC, gnomAD_exomes_non_cancer_SAS_AN, gnomAD_exomes_non_cancer_SAS_AF, gnomAD_exomes_non_cancer_SAS_nhomalt, gnomAD_exomes_non_cancer_POPMAX_AC, gnomAD_exomes_non_cancer_POPMAX_AN, gnomAD_exomes_non_cancer_POPMAX_AF, gnomAD_exomes_non_cancer_POPMAX_nhomalt, gnomAD_exomes_non_topmed_AFR_AC, gnomAD_exomes_non_topmed_AFR_AN, gnomAD_exomes_non_topmed_AFR_AF, gnomAD_exomes_non_topmed_AFR_nhomalt, gnomAD_exomes_non_topmed_AMR_AC, gnomAD_exomes_non_topmed_AMR_AN, gnomAD_exomes_non_topmed_AMR_AF, gnomAD_exomes_non_topmed_AMR_nhomalt, gnomAD_exomes_non_topmed_ASJ_AC, gnomAD_exomes_non_topmed_ASJ_AN, gnomAD_exomes_non_topmed_ASJ_AF, gnomAD_exomes_non_topmed_ASJ_nhomalt, gnomAD_exomes_non_topmed_EAS_AC, gnomAD_exomes_non_topmed_EAS_AN, gnomAD_exomes_non_topmed_EAS_AF, gnomAD_exomes_non_topmed_EAS_nhomalt, gnomAD_exomes_non_topmed_FIN_AC, gnomAD_exomes_non_topmed_FIN_AN, gnomAD_exomes_non_topmed_FIN_AF, gnomAD_exomes_non_topmed_FIN_nhomalt, gnomAD_exomes_non_topmed_NFE_AC, gnomAD_exomes_non_topmed_NFE_AN, gnomAD_exomes_non_topmed_NFE_AF, gnomAD_exomes_non_topmed_NFE_nhomalt, gnomAD_exomes_non_topmed_SAS_AC, gnomAD_exomes_non_topmed_SAS_AN, gnomAD_exomes_non_topmed_SAS_AF, gnomAD_exomes_non_topmed_SAS_nhomalt, gnomAD_exomes_non_topmed_POPMAX_AC, gnomAD_exomes_non_topmed_POPMAX_AN, gnomAD_exomes_non_topmed_POPMAX_AF, gnomAD_exomes_non_topmed_POPMAX_nhomalt, gnomAD_genomes_MID_AC, gnomAD_genomes_MID_AN, gnomAD_genomes_MID_AF, gnomAD_genomes_MID_nhomalt, gnomAD_genomes_controls_and_biobanks_AC, gnomAD_genomes_controls_and_biobanks_AN, gnomAD_genomes_controls_and_biobanks_AF, 
gnomAD_genomes_controls_and_biobanks_nhomalt, gnomAD_genomes_non_neuro_AC, gnomAD_genomes_non_neuro_AN, gnomAD_genomes_non_neuro_AF, gnomAD_genomes_non_neuro_nhomalt, gnomAD_genomes_non_cancer_AC, gnomAD_genomes_non_cancer_AN, gnomAD_genomes_non_cancer_AF, gnomAD_genomes_non_cancer_nhomalt, gnomAD_genomes_non_topmed_AC, gnomAD_genomes_non_topmed_AN, gnomAD_genomes_non_topmed_AF, gnomAD_genomes_non_topmed_nhomalt, gnomAD_genomes_controls_and_biobanks_AFR_AC, gnomAD_genomes_controls_and_biobanks_AFR_AN, gnomAD_genomes_controls_and_biobanks_AFR_AF, gnomAD_genomes_controls_and_biobanks_AFR_nhomalt, gnomAD_genomes_controls_and_biobanks_AMI_AC, gnomAD_genomes_controls_and_biobanks_AMI_AN, gnomAD_genomes_controls_and_biobanks_AMI_AF, gnomAD_genomes_controls_and_biobanks_AMI_nhomalt, gnomAD_genomes_controls_and_biobanks_AMR_AC, gnomAD_genomes_controls_and_biobanks_AMR_AN, gnomAD_genomes_controls_and_biobanks_AMR_AF, gnomAD_genomes_controls_and_biobanks_AMR_nhomalt, gnomAD_genomes_controls_and_biobanks_ASJ_AC, gnomAD_genomes_controls_and_biobanks_ASJ_AN, gnomAD_genomes_controls_and_biobanks_ASJ_AF, gnomAD_genomes_controls_and_biobanks_ASJ_nhomalt, gnomAD_genomes_controls_and_biobanks_EAS_AC, gnomAD_genomes_controls_and_biobanks_EAS_AN, gnomAD_genomes_controls_and_biobanks_EAS_AF, gnomAD_genomes_controls_and_biobanks_EAS_nhomalt, gnomAD_genomes_controls_and_biobanks_FIN_AC, gnomAD_genomes_controls_and_biobanks_FIN_AN, gnomAD_genomes_controls_and_biobanks_FIN_AF, gnomAD_genomes_controls_and_biobanks_FIN_nhomalt, gnomAD_genomes_controls_and_biobanks_MID_AC, gnomAD_genomes_controls_and_biobanks_MID_AN, gnomAD_genomes_controls_and_biobanks_MID_AF, gnomAD_genomes_controls_and_biobanks_MID_nhomalt, gnomAD_genomes_controls_and_biobanks_NFE_AC, gnomAD_genomes_controls_and_biobanks_NFE_AN, gnomAD_genomes_controls_and_biobanks_NFE_AF, gnomAD_genomes_controls_and_biobanks_NFE_nhomalt, gnomAD_genomes_controls_and_biobanks_SAS_AC, gnomAD_genomes_controls_and_biobanks_SAS_AN, 
gnomAD_genomes_controls_and_biobanks_SAS_AF, gnomAD_genomes_controls_and_biobanks_SAS_nhomalt, gnomAD_genomes_non_neuro_AFR_AC, gnomAD_genomes_non_neuro_AFR_AN, gnomAD_genomes_non_neuro_AFR_AF, gnomAD_genomes_non_neuro_AFR_nhomalt, gnomAD_genomes_non_neuro_AMI_AC, gnomAD_genomes_non_neuro_AMI_AN, gnomAD_genomes_non_neuro_AMI_AF, gnomAD_genomes_non_neuro_AMI_nhomalt, gnomAD_genomes_non_neuro_AMR_AC, gnomAD_genomes_non_neuro_AMR_AN, gnomAD_genomes_non_neuro_AMR_AF, gnomAD_genomes_non_neuro_AMR_nhomalt, gnomAD_genomes_non_neuro_ASJ_AC, gnomAD_genomes_non_neuro_ASJ_AN, gnomAD_genomes_non_neuro_ASJ_AF, gnomAD_genomes_non_neuro_ASJ_nhomalt, gnomAD_genomes_non_neuro_EAS_AC, gnomAD_genomes_non_neuro_EAS_AN, gnomAD_genomes_non_neuro_EAS_AF, gnomAD_genomes_non_neuro_EAS_nhomalt, gnomAD_genomes_non_neuro_FIN_AC, gnomAD_genomes_non_neuro_FIN_AN, gnomAD_genomes_non_neuro_FIN_AF, gnomAD_genomes_non_neuro_FIN_nhomalt, gnomAD_genomes_non_neuro_MID_AC, gnomAD_genomes_non_neuro_MID_AN, gnomAD_genomes_non_neuro_MID_AF, gnomAD_genomes_non_neuro_MID_nhomalt, gnomAD_genomes_non_neuro_NFE_AC, gnomAD_genomes_non_neuro_NFE_AN, gnomAD_genomes_non_neuro_NFE_AF, gnomAD_genomes_non_neuro_NFE_nhomalt, gnomAD_genomes_non_neuro_SAS_AC, gnomAD_genomes_non_neuro_SAS_AN, gnomAD_genomes_non_neuro_SAS_AF, gnomAD_genomes_non_neuro_SAS_nhomalt, gnomAD_genomes_non_cancer_AFR_AC, gnomAD_genomes_non_cancer_AFR_AN, gnomAD_genomes_non_cancer_AFR_AF, gnomAD_genomes_non_cancer_AFR_nhomalt, gnomAD_genomes_non_cancer_AMI_AC, gnomAD_genomes_non_cancer_AMI_AN, gnomAD_genomes_non_cancer_AMI_AF, gnomAD_genomes_non_cancer_AMI_nhomalt, gnomAD_genomes_non_cancer_AMR_AC, gnomAD_genomes_non_cancer_AMR_AN, gnomAD_genomes_non_cancer_AMR_AF, gnomAD_genomes_non_cancer_AMR_nhomalt, gnomAD_genomes_non_cancer_ASJ_AC, gnomAD_genomes_non_cancer_ASJ_AN, gnomAD_genomes_non_cancer_ASJ_AF, gnomAD_genomes_non_cancer_ASJ_nhomalt, gnomAD_genomes_non_cancer_EAS_AC, gnomAD_genomes_non_cancer_EAS_AN, gnomAD_genomes_non_cancer_EAS_AF, 
gnomAD_genomes_non_cancer_EAS_nhomalt, gnomAD_genomes_non_cancer_FIN_AC, gnomAD_genomes_non_cancer_FIN_AN, gnomAD_genomes_non_cancer_FIN_AF, gnomAD_genomes_non_cancer_FIN_nhomalt, gnomAD_genomes_non_cancer_MID_AC, gnomAD_genomes_non_cancer_MID_AN, gnomAD_genomes_non_cancer_MID_AF, gnomAD_genomes_non_cancer_MID_nhomalt, gnomAD_genomes_non_cancer_NFE_AC, gnomAD_genomes_non_cancer_NFE_AN, gnomAD_genomes_non_cancer_NFE_AF, gnomAD_genomes_non_cancer_NFE_nhomalt, gnomAD_genomes_non_cancer_SAS_AC, gnomAD_genomes_non_cancer_SAS_AN, gnomAD_genomes_non_cancer_SAS_AF, gnomAD_genomes_non_cancer_SAS_nhomalt, gnomAD_genomes_non_topmed_AFR_AC, gnomAD_genomes_non_topmed_AFR_AN, gnomAD_genomes_non_topmed_AFR_AF, gnomAD_genomes_non_topmed_AFR_nhomalt, gnomAD_genomes_non_topmed_AMI_AC, gnomAD_genomes_non_topmed_AMI_AN, gnomAD_genomes_non_topmed_AMI_AF, gnomAD_genomes_non_topmed_AMI_nhomalt, gnomAD_genomes_non_topmed_AMR_AC, gnomAD_genomes_non_topmed_AMR_AN, gnomAD_genomes_non_topmed_AMR_AF, gnomAD_genomes_non_topmed_AMR_nhomalt, gnomAD_genomes_non_topmed_ASJ_AC, gnomAD_genomes_non_topmed_ASJ_AN, gnomAD_genomes_non_topmed_ASJ_AF, gnomAD_genomes_non_topmed_ASJ_nhomalt, gnomAD_genomes_non_topmed_EAS_AC, gnomAD_genomes_non_topmed_EAS_AN, gnomAD_genomes_non_topmed_EAS_AF, gnomAD_genomes_non_topmed_EAS_nhomalt, gnomAD_genomes_non_topmed_FIN_AC, gnomAD_genomes_non_topmed_FIN_AN, gnomAD_genomes_non_topmed_FIN_AF, gnomAD_genomes_non_topmed_FIN_nhomalt, gnomAD_genomes_non_topmed_MID_AC, gnomAD_genomes_non_topmed_MID_AN, gnomAD_genomes_non_topmed_MID_AF, gnomAD_genomes_non_topmed_MID_nhomalt, gnomAD_genomes_non_topmed_NFE_AC, gnomAD_genomes_non_topmed_NFE_AN, gnomAD_genomes_non_topmed_NFE_AF, gnomAD_genomes_non_topmed_NFE_nhomalt, gnomAD_genomes_non_topmed_SAS_AC, gnomAD_genomes_non_topmed_SAS_AN, gnomAD_genomes_non_topmed_SAS_AF, gnomAD_genomes_non_topmed_SAS_nhomalt

    Column name changes (dbNSFP_variant): rs_dbSNP151 changed to rs_dbSNP

    UPDATE (March 12, 2021): A bug has been fixed: some ALoFT scores/information were missing in dbNSFP. We thank Dr. Shuwei Li for reporting this bug. If you want to use ALoFT scores, please re-download the updated version. dbNSFP4.1a can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.1c can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (Feb 10, 2021): A bug has been fixed. The gnomAD_pLI, gnomAD_pRec and gnomAD_pNull scores in dbNSFP4.1_gene.gz and dbNSFP4.1_gene.complete.gz did not always correspond to the canonical transcripts of the genes. We thank Dr. Raphaël Helaers for reporting this bug. If you want to use those scores, please download the updated version of dbNSFP4.1_gene.gz and dbNSFP4.1_gene.complete.gz to replace the old files.

    UPDATE (Jan 27, 2021): Because the search programs provided in v4.1 cannot run in a command-line environment without X11 support, here we add back the command-line only version of the search programs for v4.1a and v4.1c. If you want to use those programs, please download the corresponding zip files. Then unzip the files to the dbNSFP folder and run them according to the updated search_dbNSFP readme file. 

    UPDATE (June 16, 2020): dbNSFP v4.1 is released. BayesDel (https://doi.org/10.1002/humu.23158), ClinPred (https://doi.org/10.1016/j.ajhg.2018.08.005) and LIST-S2 (https://doi.org/10.1093/nar/gkaa288) scores have been added. CADD has been updated to v1.6, and a CADD score based on the hg19 model has been added. gnomAD genomes have been updated to r3.0: populations AMI (Amish) and SAS (South Asian) have been added; controls have been removed. ClinVar and GTEx have been updated. HPO terms have been added to dbNSFP_gene. The search_dbNSFP programs now support searching SpliceAI (https://doi.org/10.1016/j.cell.2018.12.015) as an attached database; please refer to the readme files of the search_dbNSFP programs for details.

    Two branches of dbNSFP are provided: dbNSFP4.1a suitable for academic use, which includes all the resources,  and dbNSFP4.1c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, and GenoCanyon.  

    dbNSFP4.1a can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.1c can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here. A web service is provided at http://database.liulab.science/dbNSFP for querying a short list of SNPs.
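
    Before unzipping a downloaded archive, it can help to confirm that its MD5 checksum matches the published md5sum. The following is a minimal Python sketch of that check. The function names are hypothetical, and the file name and expected checksum must be supplied by the user (for example, copied from the md5sum file for the release).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large zip files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected_md5):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

    If matches_published_md5 returns False, the download is probably incomplete or corrupted and should be repeated.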

    Columns added (dbNSFP_variant): BayesDel_addAF_score, BayesDel_addAF_rankscore, BayesDel_addAF_pred, BayesDel_noAF_score, BayesDel_noAF_rankscore, BayesDel_noAF_pred, LIST-S2_score, LIST-S2_rankscore, LIST-S2_pred, CADD_raw_hg19, CADD_raw_rankscore_hg19, CADD_phred_hg19, gnomAD_genomes_AMI_AC, gnomAD_genomes_AMI_AN, gnomAD_genomes_AMI_AF, gnomAD_genomes_AMI_nhomalt, gnomAD_genomes_SAS_AC, gnomAD_genomes_SAS_AN, gnomAD_genomes_SAS_AF, gnomAD_genomes_SAS_nhomalt

    Column name changes (dbNSFP_variant): GTEx_V7_gene changed to GTEx_V8_gene, GTEx_V7_tissue changed to GTEx_V8_tissue

    Columns deleted (dbNSFP_variant): gnomAD_genomes_controls_AC, gnomAD_genomes_controls_AN, gnomAD_genomes_controls_AF, gnomAD_genomes_controls_nhomalt, gnomAD_genomes_controls_AFR_AC, gnomAD_genomes_controls_AFR_AN, gnomAD_genomes_controls_AFR_AF, gnomAD_genomes_controls_AFR_nhomalt, gnomAD_genomes_controls_AMR_AC, gnomAD_genomes_controls_AMR_AN, gnomAD_genomes_controls_AMR_AF, gnomAD_genomes_controls_AMR_nhomalt, gnomAD_genomes_controls_ASJ_AC, gnomAD_genomes_controls_ASJ_AN, gnomAD_genomes_controls_ASJ_AF, gnomAD_genomes_controls_ASJ_nhomalt, gnomAD_genomes_controls_EAS_AC, gnomAD_genomes_controls_EAS_AN, gnomAD_genomes_controls_EAS_AF, gnomAD_genomes_controls_EAS_nhomalt, gnomAD_genomes_controls_FIN_AC, gnomAD_genomes_controls_FIN_AN, gnomAD_genomes_controls_FIN_AF, gnomAD_genomes_controls_FIN_nhomalt, gnomAD_genomes_controls_NFE_AC, gnomAD_genomes_controls_NFE_AN, gnomAD_genomes_controls_NFE_AF, gnomAD_genomes_controls_NFE_nhomalt, gnomAD_genomes_controls_POPMAX_AC, gnomAD_genomes_controls_POPMAX_AN, gnomAD_genomes_controls_POPMAX_AF, gnomAD_genomes_controls_POPMAX_nhomalt

    Columns added (dbNSFP_gene): HPO_id, HPO_name

    UPDATE (May 15, 2020): A minor bug has been fixed in dbNSFP v4.0. In the previous release, the column PrimateAI_pred was not 100% correct. We thank Alex Kouris for reporting this issue. If you want to use PrimateAI_pred, please download the database again.

    Two branches of dbNSFP are provided: dbNSFP4.0a suitable for academic use, which includes all the resources,  and dbNSFP4.0c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon.  

    dbNSFP4.0a can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0c can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here

    UPDATE (December 5, 2019): A minor bug has been fixed in dbNSFP v4.0. In the previous release, the content of the following columns was compressed, i.e. if the annotations for all transcripts were identical, only one annotation was presented: genename, cds_strand, refcodon, codonpos, codon_degeneracy, FATHMM_score, FATHMM_pred, Interpro_domain. In this new release, those columns are decompressed, i.e. they have the same number of annotations as the number of transcripts. A Java-based graphical user interface (GUI) search program (search_dbNSFP40a.jar or search_dbNSFP40c.jar) has been added. Users can double-click the jar file to launch the GUI (it also supports the command line; please check the search_dbNSFP readme pdf for details).
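
    To illustrate the difference between the compressed and decompressed layouts, the Python sketch below expands a compressed field so that it carries one annotation per transcript. It assumes that multiple annotations within a field are separated by ";" (an assumption of this sketch; please check the README file of your release), and the function name is hypothetical.

```python
def decompress_field(value, n_transcripts, sep=";"):
    """Expand a compressed per-transcript field to one entry per transcript.

    A compressed field holds a single annotation when all transcripts share
    it; the decompressed form repeats that annotation for every transcript.
    The ';' separator is an assumption of this sketch.
    """
    parts = value.split(sep)
    if len(parts) == n_transcripts:
        return value  # already one annotation per transcript
    if len(parts) == 1:
        return sep.join(parts * n_transcripts)
    raise ValueError(
        f"expected 1 or {n_transcripts} annotations, got {len(parts)}"
    )
```

    With the decompressed layout, the i-th annotation in such a column can be paired directly with the i-th transcript, which is what scripts that pick a transcript-specific value typically rely on.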

    Two branches of dbNSFP are provided: dbNSFP4.0a suitable for academic use, which includes all the resources,  and dbNSFP4.0c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon.  

    dbNSFP4.0a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here

    NEW VERSION (May 3, 2019): dbNSFP v4.0 is formally released. HGVS c. and p. presentations from ANNOVAR, SnpEff and VEP have been added. search_dbNSFP now supports search based on HGVS c. and p. presentations. Please refer to search_dbNSFP40a.readme.pdf or search_dbNSFP40c.readme.pdf for details. MedGen ID, OMIM ID and Orphanet ID from clinvar have been added. 

    Two branches of dbNSFP are provided: dbNSFP4.0a suitable for academic use, which includes all the resources,  and dbNSFP4.0c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon. 

    dbNSFP4.0a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here

    UPDATE (February 20, 2019): dbNSFP v4.0b2 is released for beta testing. Uniprot sprot_varsplic was included in the mapping from Uniprot to Ensembl. Fixed column title inconsistency between the README file and data file. (We thank Kevin Xin and Julius Jacobsen for pointing out the inconsistency.) dbMTS was added as an attached database. search_dbNSFP added support for searching dbMTS with option '-m'. 

    Two branches of dbNSFP are provided: dbNSFP4.0b2a suitable for academic use, which includes all the resources,  and dbNSFP4.0b2c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon.  Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com) for commercial usage of dbNSFP. 

    dbNSFP4.0b2a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0b2c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here

    UPDATE (December 30, 2018): A bug that caused ID mapping issues from Uniprot to Ensembl, which in turn increased the missing rates of Polyphen2, MutationAssessor and DEOGEN2, has been found and fixed (we thank Dr. Daniele Raimondi). If you downloaded dbNSFP v4.0b1 before December 30, please download it again.

    

    NEW VERSION (December 8, 2018): dbNSFP v4.0b1 is released for beta testing. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 29/Ensembl 94 with human reference sequence hg38. Eight deleteriousness prediction scores (ALoFT, DEOGEN2, FATHMM-XF, MPC, MVP,  PrimateAI, LINSIGHT, SIFT4G) have been added. Three conservation scores (phyloP17way_primate, phastCons17way_primate, bStatistic)  have been added. Allele frequencies from the gnomAD controls subsets, eQTLs from the Geuvadis project, and genotypes of a Vindija33.19  Neanderthal have been added. Some resources have been updated, including VEST (We thank Dr. Karchin), CADD, M-CAP, ancestral alleles, dbSNP, ClinVar, GTEx and InterPro. The presentation of the prediction scores has been further improved by adding the correspondence to transcript/protein ids in a systematic way. APPRIS, GENCODE_basic, TSL and VEP_canonical have been added to facilitate the choice of appropriate transcripts. dbNSFP_gene has also been completely rebuilt using the up-to-date resources. HIPred, gene constraint scores from the gnomAD data, essential genes predictions based on CRISPR, gene-trap and gene networks have been added. 

    Two branches of dbNSFP are provided: dbNSFP4.0b1a suitable for academic use, which includes all the resources,  and dbNSFP4.0b1c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon. 

    dbNSFP4.0b1a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0b1c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.  

    Columns added (dbNSFP_variant): VindijiaNeandertal, Uniprot_acc, Uniprot_entry, APPRIS, GENCODE_basic, TSL, VEP_canonical, MVP_score, MVP_rankscore, MPC_score, MPC_rankscore, PrimateAI_score, PrimateAI_rankscore, PrimateAI_pred, DEOGEN2_score, DEOGEN2_rankscore, DEOGEN2_pred, fathmm-XF_coding_score, fathmm-XF_coding_rankscore, fathmm-XF_coding_pred, bStatistic, bStatistic_rankscore, Aloft_Fraction_transcripts_affected, Aloft_prob_Tolerant, Aloft_prob_Recessive, Aloft_prob_Dominant, Aloft_pred, Aloft_Confidence, UK10K_AC, UK10K_AF, gnomAD_exomes_controls_AC, gnomAD_exomes_controls_AN, gnomAD_exomes_controls_AF, gnomAD_exomes_controls_nhomalt, gnomAD_exomes_controls_AFR_AC, gnomAD_exomes_controls_AFR_AN, gnomAD_exomes_controls_AFR_AF, gnomAD_exomes_controls_AFR_nhomalt, gnomAD_exomes_controls_AMR_AC, gnomAD_exomes_controls_AMR_AN, gnomAD_exomes_controls_AMR_AF, gnomAD_exomes_controls_AMR_nhomalt, gnomAD_exomes_controls_ASJ_AC, gnomAD_exomes_controls_ASJ_AN, gnomAD_exomes_controls_ASJ_AF, gnomAD_exomes_controls_ASJ_nhomalt, gnomAD_exomes_controls_EAS_AC, gnomAD_exomes_controls_EAS_AN, gnomAD_exomes_controls_EAS_AF, gnomAD_exomes_controls_EAS_nhomalt, gnomAD_exomes_controls_FIN_AC, gnomAD_exomes_controls_FIN_AN, gnomAD_exomes_controls_FIN_AF, gnomAD_exomes_controls_FIN_nhomalt, gnomAD_exomes_controls_NFE_AC, gnomAD_exomes_controls_NFE_AN, gnomAD_exomes_controls_NFE_AF, gnomAD_exomes_controls_NFE_nhomalt, gnomAD_exomes_controls_SAS_AC, gnomAD_exomes_controls_SAS_AN, gnomAD_exomes_controls_SAS_AF, gnomAD_exomes_controls_SAS_nhomalt, gnomAD_exomes_controls_POPMAX_AC, gnomAD_exomes_controls_POPMAX_AN, gnomAD_exomes_controls_POPMAX_AF, gnomAD_exomes_controls_POPMAX_nhomalt, gnomAD_exomes_nhomalt, gnomAD_exomes_AFR_nhomalt, gnomAD_exomes_AMR_nhomalt, gnomAD_exomes_ASJ_nhomalt, gnomAD_exomes_EAS_nhomalt, gnomAD_exomes_FIN_nhomalt, gnomAD_exomes_NFE_nhomalt, gnomAD_exomes_SAS_nhomalt, gnomAD_exomes_POPMAX_AC, gnomAD_exomes_POPMAX_AN, gnomAD_exomes_POPMAX_AF, 
gnomAD_exomes_POPMAX_nhomalt, gnomAD_exomes_flag, gnomAD_genomes_flag, gnomAD_genomes_nhomalt,gnomAD_genomes_AFR_nhomalt, gnomAD_genomes_AMR_nhomalt, gnomAD_genomes_ASJ_nhomalt, gnomAD_genomes_EAS_nhomalt, gnomAD_genomes_FIN_nhomalt,gnomAD_genomes_NFE_nhomalt, gnomAD_genomes_POPMAX_nhomalt, gnomAD_genomes_controls_AC, gnomAD_genomes_controls_AN, gnomAD_genomes_controls_AF, gnomAD_genomes_controls_nhomalt, gnomAD_genomes_controls_AFR_AC, gnomAD_genomes_controls_AFR_AN, gnomAD_genomes_controls_AFR_AF, gnomAD_genomes_controls_AFR_nhomalt, gnomAD_genomes_controls_AMR_AC, gnomAD_genomes_controls_AMR_AN, gnomAD_genomes_controls_AMR_AF, gnomAD_genomes_controls_AMR_nhomalt, gnomAD_genomes_controls_ASJ_AC, gnomAD_genomes_controls_ASJ_AN, gnomAD_genomes_controls_ASJ_AF, gnomAD_genomes_controls_ASJ_nhomalt, gnomAD_genomes_controls_EAS_AC, gnomAD_genomes_controls_EAS_AN, gnomAD_genomes_controls_EAS_AF, gnomAD_genomes_controls_EAS_nhomalt, gnomAD_genomes_controls_FIN_AC, gnomAD_genomes_controls_FIN_AN, gnomAD_genomes_controls_FIN_AF, gnomAD_genomes_controls_FIN_nhomalt, gnomAD_genomes_controls_NFE_AC, gnomAD_genomes_controls_NFE_AN, gnomAD_genomes_controls_NFE_AF, gnomAD_genomes_controls_NFE_nhomalt, gnomAD_genomes_controls_POPMAX_AC, gnomAD_genomes_controls_POPMAX_AN, gnomAD_genomes_controls_POPMAX_AF, gnomAD_genomes_controls_POPMAX_nhomalt, Geuvadis_eQTL_target_gene, clinvar_hgvs, clinvar_var_source, Eigen-raw_coding_rankscore, SIFT4G_score, SIFT4G_pred, SIFT4G_converted_rankscore, phyloP17way_primate, phyloP17way_primate_rankscore, phastCons17way_primate, phastCons17way_primate_rankscore

    Column name changes (dbNSFP_variant): MutationAssessor_score_rankscore to MutationAssessor_rankscore, VEST3_score to VEST4_score, VEST3_rankscore to VEST4_rankscore, GenoCanyon_score_rankscore to GenoCanyon_rankscore, integrated_fitCons_score_rankscore to integrated_fitCons_rankscore, GM12878_fitCons_score_rankscore to GM12878_fitCons_rankscore, H1-hESC_fitCons_score_rankscore to H1-hESC_fitCons_rankscore, HUVEC_fitCons_score_rankscore to HUVEC_fitCons_rankscore, phyloP20way_mammalian to phyloP30way_mammalian, phyloP20way_mammalian_rankscore to phyloP30way_mammalian_rankscore, phastCons20way_mammalian to phastCons30way_mammalian, phastCons20way_mammalian_rankscore to phastCons30way_mammalian_rankscore, clinvar_golden_stars to clinvar_review, GTEx_V6p_gene to GTEx_V7_gene, GTEx_V6p_tissue to GTEx_V7_tissue, Eigen-raw to Eigen-raw_coding, Eigen-phred to Eigen-phred_coding, Eigen-PC-raw to Eigen-PC-raw_coding, Eigen-PC-phred to Eigen-PC-phred_coding, Eigen-PC-raw_rankscore to Eigen-PC-raw_coding_rankscore, rs_dbSNP150 to rs_dbSNP151, clinvar_rs to clinvar_id.
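
    Scripts written against the earlier column names can be adapted with a simple rename table. The Python sketch below covers only a handful of the renames listed above; the function name and the idea of rewriting a list of header names are illustrative assumptions, not part of dbNSFP.

```python
# A few of the v4.0b1 column renames listed above (old name -> new name).
RENAMED_COLUMNS = {
    "VEST3_score": "VEST4_score",
    "VEST3_rankscore": "VEST4_rankscore",
    "clinvar_golden_stars": "clinvar_review",
    "clinvar_rs": "clinvar_id",
    "rs_dbSNP150": "rs_dbSNP151",
    "GTEx_V6p_gene": "GTEx_V7_gene",
    "GTEx_V6p_tissue": "GTEx_V7_tissue",
}


def update_column_names(columns):
    """Map old column names to their new names; other names pass through."""
    return [RENAMED_COLUMNS.get(name, name) for name in columns]
```

    Columns that were deleted rather than renamed are not handled by this table and need to be removed from downstream scripts separately.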

    Columns deleted (dbNSFP_variant): Uniprot_acc_Polyphen2,   Uniprot_id_Polyphen2, Uniprot_aapos_Polyphen2, MutationAssessor_UniprotID, MutationAssessor_variant, Transcript_id_VEST3, Transcript_var_VEST3, gnomAD_exomes_OTH_AC, gnomAD_exomes_OTH_AN, gnomAD_exomes_OTH_AF, gnomAD_genomes_OTH_AC, gnomAD_genomes_OTH_AN, gnomAD_genomes_OTH_AF, Eigen_coding_or_noncoding

    Columns added (dbNSFP_gene): gnomAD_pLI, gnomAD_pRec, gnomAD_pNull, HIPred_score, HIPred, Essential_gene_CRISPR, Essential_gene_CRISPR2, Essential_gene_gene-trap, Gene_indispensability_score, Gene_indispensability_pred

    

    REMINDER: For whole genome annotation, we recommend our whole genome annotation pipeline WGSA. Currently it supports SNP and indel annotation using hg19 and hg38 coordinates. dbNSFP v2.9.3 (the last dbNSFP native on hg19) is a component resource. 

    REMINDER: If your SNP coordinates are based on hg19, remember to add the option "-v hg19" when using the search program, because the default coordinates are now in hg38.

ATTACHED DATABASE:

    

    dbMTS collects all potential SNVs in microRNA target seed regions in human 3’UTRs and provides their functional predictions and annotations to facilitate the steps of filtering and prioritizing SNVs from a huge list of all SNVs discovered in a whole exome sequencing (WES) study. The core functional annotations in the database are the targeting efficacy scores for the reference and mutant loci based on three microRNA target prediction algorithms: TargetScan v7.0, RNAhybrid, and miRanda. Based on their predictions, we further classify the effect of each SNV into three categories: substitution, target loss and target gain. The maximum difference between the reference score and the variant-induced score is calculated to estimate how the microRNA targeting efficacy changes after introducing the variant. Additional functional annotations of the SNVs are also collected in dbMTS, including variant consequences by SnpEff, VEP and ANNOVAR, dbSNP variant IDs, GWAS Catalog entries, allele frequencies from various populations, clinical consequences from ClinVar, expression quantitative trait loci (eQTLs) from GTEx, mappability scores, etc.

    dbMTS v1.0 is available for download from box or softgenetics ftp. A description of the columns can be found here. A preprint describing the database can be found on bioRxiv.

    If you use dbMTS, please cite our paper:

      dbscSNV includes all potential human SNVs within splicing consensus regions (−3 to +8 at the 5’ splice site and −12 to +2 at the 3’ splice site), i.e. scSNVs, related functional annotations and two ensemble prediction scores for predicting their potential of altering splicing. 
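
    As a small illustration of the region definition above, the Python sketch below tests whether a position falls within the splicing consensus region. It assumes that positions are given as signed offsets relative to the splice site with no position 0 (negative values upstream, positive values downstream); this offset convention and all names used are assumptions of the sketch, not part of dbscSNV.

```python
# Consensus windows from the text, as inclusive (first, last) offsets.
CONSENSUS_WINDOWS = {
    "5prime": (-3, 8),   # -3 to +8 at the 5' splice site
    "3prime": (-12, 2),  # -12 to +2 at the 3' splice site
}


def in_consensus_region(site, offset):
    """Return True if a signed offset lies in the consensus window.

    Offsets skip 0 (..., -2, -1, +1, +2, ...); that convention is an
    assumption of this sketch.
    """
    if offset == 0:
        raise ValueError("offset 0 is not defined in this convention")
    first, last = CONSENSUS_WINDOWS[site]
    return first <= offset <= last
```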

    UPDATE (April 12, 2015): dbscSNV has been updated to v1.1, which adds hg38 positions lifted over from its hg19 positions. dbscSNV1.1 is available for download from the Box or softgenetics ftp. Since v3.0b2 the companion search program supports searching dbNSFP along with dbscSNV1.1 using option "-s".

    dbscSNV v1.0 is available for download from the Box or googledrive. Since v2.6 the companion search program supports searching dbNSFP along with dbscSNV using option "-s". The description of the columns is here

    If you use dbscSNV, please cite our paper:

    1. Jian X, Boerwinkle E and Liu X. 2014. In silico prediction of splice-altering single nucleotide variants in the human genome. Nucleic Acids Research 42(22):13534-13544.

    SPIDEX: Since v3.0 the companion search program supports searching the free non-commercial research version of SPIDEX 1.0 along with dbscSNV and SpliceAI using the "-s" option. The SPIDEX free non-commercial research version 1.0 can be downloaded from ANNOVAR. To enable the search, the unzipped dbscSNV files must already be in the same folder as the dbNSFP files. For the SPIDEX file, first download hg19_spidex.zip, then unzip it and put the file hg19_spidex.txt in the same folder as the dbNSFP files. The coordinates in the input file must be in hg19. Matching SPIDEX entries will be output to files with the user-specified output file name and an extension of “.SPIDEX”. The description of the columns is here.

    SpliceAI: Since v4.1 search_dbNSFP supports searching the free non-commercial v1.3 of SpliceAI along with dbscSNV and SPIDEX using the "-s" option. The SpliceAI free non-commercial version 1.3 can be downloaded from https://basespace.illumina.com. To enable the search, the unzipped dbscSNV files must already be in the same folder as the dbNSFP files. To download the SpliceAI files, log in and find the project “Predicting splicing from primary sequence”. Then, from the tab “FILES”, click the folder “genome_scores_v1.3”. Download the file spliceai_scores.masked.snv.hg38.vcf.gz for querying hg38-based input files and spliceai_scores.masked.snv.hg19.vcf.gz for querying hg19-based input files. Put the files in the same folder as the dbNSFP files. Do not unzip them. The coordinates in the input file can be in hg38 or hg19. Matching SpliceAI entries will be output to files with the user-specified output file name and an extension of ".SpliceAI". The description of the columns is here.
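
    A quick pre-flight check can catch misplaced files before running a search. The Python sketch below reports which of the SpliceAI files named above are missing from the dbNSFP folder; the function name and the folder argument are hypothetical.

```python
from pathlib import Path

# SpliceAI file expected for each genome build, as named in the text.
SPLICEAI_FILES = {
    "hg38": "spliceai_scores.masked.snv.hg38.vcf.gz",
    "hg19": "spliceai_scores.masked.snv.hg19.vcf.gz",
}


def missing_spliceai_files(dbnsfp_dir, builds=("hg38", "hg19")):
    """Return the SpliceAI files for the given builds that are not present."""
    folder = Path(dbnsfp_dir)
    return [
        SPLICEAI_FILES[build]
        for build in builds
        if not (folder / SPLICEAI_FILES[build]).is_file()
    ]
```

    An empty list means the files for the requested builds are in place. Only the build that matches the input coordinates needs to be present.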

UPDATE HISTORY:

    UPDATE (January 12, 2018): search_dbNSFP35a has been updated to support searching the SPIDEX file downloaded from ANNOVAR. You can download it here. Just replace the search_dbNSFP35a.class with the one from the zip file. More details can be found in the updated readme file.

    

    UPDATE (August 6, 2017): dbNSFP v3.5 is released. Allele frequencies from the exomes and genomes of the Genome Aggregation Database (gnomAD) have been added. Interpro, dbSNP, clinvar, ancestral alleles, Altai Neanderthal genotypes, Denisova genotypes and GTEx eQTLs have been updated. dbNSFP_gene has been rebuilt with updated annotations. Other changes to dbNSFP_gene include: Interactions columns now show the gene list instead of the total number; GTEx gene expression annotations have been removed; LoF FDR p-value from RVIS has been added; Genome-wide haploinsufficiency score (GHIS) has been added; LoF and CNV intolerance/tolerance scores based on ExAC data have been added. 

    Two branches of dbNSFP v3.5 are provided: dbNSFP3.5a suitable for academic use, which includes all the resources, and dbNSFP3.5c suitable for commercial use, which does not include Polyphen2, VEST3, REVEL, CADD and DANN. dbNSFP3.5a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.5c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.  

    Columns added (dbNSFP_variant): gnomAD_exomes_AC, gnomAD_exomes_AN, gnomAD_exomes_AF, gnomAD_exomes_AFR_AC, gnomAD_exomes_AFR_AN, gnomAD_exomes_AFR_AF, gnomAD_exomes_AMR_AC, gnomAD_exomes_AMR_AN, gnomAD_exomes_AMR_AF, gnomAD_exomes_ASJ_AC, gnomAD_exomes_ASJ_AN, gnomAD_exomes_ASJ_AF, gnomAD_exomes_EAS_AC, gnomAD_exomes_EAS_AN, gnomAD_exomes_EAS_AF, gnomAD_exomes_FIN_AC, gnomAD_exomes_FIN_AN, gnomAD_exomes_FIN_AF, gnomAD_exomes_NFE_AC, gnomAD_exomes_NFE_AN, gnomAD_exomes_NFE_AF, gnomAD_exomes_SAS_AC, gnomAD_exomes_SAS_AN, gnomAD_exomes_SAS_AF, gnomAD_exomes_OTH_AC, gnomAD_exomes_OTH_AN, gnomAD_exomes_OTH_AF, gnomAD_genomes_AC, gnomAD_genomes_AN, gnomAD_genomes_AF, gnomAD_genomes_AFR_AC, gnomAD_genomes_AFR_AN, gnomAD_genomes_AFR_AF, gnomAD_genomes_AMR_AC, gnomAD_genomes_AMR_AN, gnomAD_genomes_AMR_AF, gnomAD_genomes_ASJ_AC, gnomAD_genomes_ASJ_AN, gnomAD_genomes_ASJ_AF, gnomAD_genomes_EAS_AC, gnomAD_genomes_EAS_AN, gnomAD_genomes_EAS_AF, gnomAD_genomes_FIN_AC, gnomAD_genomes_FIN_AN, gnomAD_genomes_FIN_AF, gnomAD_genomes_NFE_AC, gnomAD_genomes_NFE_AN, gnomAD_genomes_NFE_AF, gnomAD_genomes_OTH_AC, gnomAD_genomes_OTH_AN, gnomAD_genomes_OTH_AF.

    Column name changes (dbNSFP_variant): rs_dbSNP147 replaced by rs_dbSNP150, GTEx_V6_gene replaced by GTEx_V6p_gene, GTEx_V6_tissue replaced by GTEx_V6p_tissue.

    Columns added (dbNSFP_gene): LoF-FDR_ExAC, GHIS, ExAC_pLI, ExAC_pRec, ExAC_pNull, ExAC_nonTCGA_pLI, ExAC_nonTCGA_pRec, ExAC_nonTCGA_pNull, ExAC_nonpsych_pLI, ExAC_nonpsych_pRec, ExAC_nonpsych_pNull, ExAC_del.score, ExAC_dup.score, ExAC_cnv.score, ExAC_cnv_flag.

    Column name changes (dbNSFP_gene): RVIS replaced by RVIS_EVS, RVIS_percentile replaced by RVIS_percentile_EVS, RVIS_ExAC_0.05%(AnyPopn) replaced by RVIS_ExAC, %RVIS_ExAC_0.05%(AnyPopn) replaced by RVIS_percentile_ExAC

    Columns removed (dbNSFP_gene): GTEx expression columns (222 to 433)  

   UPDATE (March 12, 2017): dbNSFP v3.4 and v2.9.3 are released. REVEL score (doi: 10.1016/j.ajhg.2016.08.016) and MutPred score (doi: 10.1093/bioinformatics/btp528) have been added. Please note REVEL is free for non-commercial usage only. SORVA gene ranking scores (doi: 10.1101/103218) have been added to gene annotation. We thank Ms. Aliz R Rao for providing SORVA scores. 

    Two branches of dbNSFP v3.4 are provided: dbNSFP3.4a suitable for academic use, which includes all the resources, and dbNSFP3.4c suitable for commercial use, which does not include Polyphen2, VEST3, REVEL, CADD and DANN. dbNSFP3.4a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.4c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.  dbNSFP2.9.3 can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. 

    Columns added (dbNSFP_variant): REVEL_score, REVEL_rankscore, MutPred_score, MutPred_rankscore, MutPred_protID, MutPred_AAchange, MutPred_Top5features.

    Columns added (dbNSFP_gene): SORVA_LOF_MAF0.005_HetOrHom, SORVA_LOF_MAF0.005_HomOrCompoundHet, SORVA_LOF_MAF0.001_HetOrHom, SORVA_LOF_MAF0.001_HomOrCompoundHet, SORVA_LOForMissense_MAF0.005_HetOrHom, SORVA_LOForMissense_MAF0.005_HomOrCompoundHet, SORVA_LOForMissense_MAF0.001_HetOrHom, SORVA_LOForMissense_MAF0.001_HomOrCompoundHet.

    UPDATE (November 30, 2016): dbNSFP v3.3 and v2.9.2 are released. M-CAP score (DOI: 10.1038/ng.3703) has been added. We thank Dr. Gill Bejerano for providing the score. Eigen and Eigen PC scores have been updated to v1.1. dbSNP has been updated to v147. clinvar has been updated to 20161101.

    Two branches of dbNSFP v3.3 are provided: dbNSFP3.3a suitable for academic use, which includes all the resources, and dbNSFP3.3c suitable for commercial use, which does not include Polyphen2, VEST3, CADD and DANN. dbNSFP3.3a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.3c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.  dbNSFP2.9.2 can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. 

    Column name changes (dbNSFP_variant): rs_dbSNP146 replaced by rs_dbSNP147.

    Columns added (dbNSFP_variant): M-CAP_score, M-CAP_rankscore, M-CAP_pred, Eigen_coding_or_noncoding, Eigen-PC-phred.

    Columns removed (dbNSFP_variant): Eigen-raw_rankscore.

    UPDATE (March 20, 2016): dbNSFP v3.2 is released. Eigen score, Eigen PC score (doi: 10.1038/ng.3477) and GenoCanyon score (doi:10.1038/srep10576) have been added. Allele frequencies of two commonly used subsets of ExAC data (nonTCGA and nonpsych) have been added. Mutation Assessor scores have been updated to release 3. PhyloP7way_vertebrate and PhastCons7way_vertebrate conservation scores have been updated to phyloP100way_vertebrate and PhastCons100way_vertebrate, respectively. Rankscores have been updated accordingly. Ancestral alleles have been updated based on Ensembl 84. dbSNP has been updated to build 146. Clinvar has been updated to 20160302, and review status (golden stars) has been added. InterPro has been updated to v56. Gene name cross-links, IntAct, Uniprot, GWAS catalog, BioGRID, GO, ConsensusPathDB, mouse gene and zebrafish gene information for the dbNSFP_gene table have been updated.

    Two branches of dbNSFP are provided: dbNSFP3.2a suitable for academic use, which includes all the resources, and dbNSFP3.2c suitable for commercial use, which does not include Polyphen2, VEST3, CADD and DANN. dbNSFP3.2a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.2c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

    Column name changes (dbNSFP_variant): rs_dbSNP144, Uniprot_id_MutationAssessor, Uniprot_variant_MutationAssessor, phyloP7way_vertebrate, phyloP7way_vertebrate_rankscore, phastCons7way_vertebrate, phastCons7way_vertebrate_rankscore replaced by rs_dbSNP146, MutationAssessor_UniprotID, MutationAssessor_variant, phyloP100way_vertebrate, phyloP100way_vertebrate_rankscore, phastCons100way_vertebrate, phastCons100way_vertebrate_rankscore, respectively.

    Columns added (dbNSFP_variant): Eigen-raw, Eigen-phred, Eigen-raw_rankscore, Eigen-PC-raw, Eigen-PC-raw_rankscore, GenoCanyon_score, GenoCanyon_score_rankscore, ExAC_nonTCGA_AC, ExAC_nonTCGA_AF, ExAC_nonTCGA_Adj_AC, ExAC_nonTCGA_Adj_AF, ExAC_nonTCGA_AFR_AC, ExAC_nonTCGA_AFR_AF, ExAC_nonTCGA_AMR_AC, ExAC_nonTCGA_AMR_AF, ExAC_nonTCGA_EAS_AC, ExAC_nonTCGA_EAS_AF, ExAC_nonTCGA_FIN_AC, ExAC_nonTCGA_FIN_AF, ExAC_nonTCGA_NFE_AC, ExAC_nonTCGA_NFE_AF, ExAC_nonTCGA_SAS_AC, ExAC_nonTCGA_SAS_AF, ExAC_nonpsych_AC, ExAC_nonpsych_AF, ExAC_nonpsych_Adj_AC, ExAC_nonpsych_Adj_AF, ExAC_nonpsych_AFR_AC, ExAC_nonpsych_AFR_AF, ExAC_nonpsych_AMR_AC, ExAC_nonpsych_AMR_AF, ExAC_nonpsych_EAS_AC, ExAC_nonpsych_EAS_AF, ExAC_nonpsych_FIN_AC, ExAC_nonpsych_FIN_AF, ExAC_nonpsych_NFE_AC, ExAC_nonpsych_NFE_AF, ExAC_nonpsych_SAS_AC, ExAC_nonpsych_SAS_AF, clinvar_golden_stars.

    UPDATE (March 30, 2016): dbNSFP v2.9.1 is released, which is an update of the v2.x versions whose core SNV set was based on hg19. MutationTaster scores have been updated to those based on Ensembl 69, i.e. the same version as in dbNSFP v3. Mutation Assessor has been updated to release 3. Since the release of dbNSFP v3, the v2.x versions have been under limited maintenance. WGSA is currently our flagship annotation pipeline for hg19. The scores in dbNSFP v3 but not in v2.9 can be obtained through WGSA. dbNSFP2.9.1 can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

   NOTICE (January 21, 2016): We were informed that commercial usage of Polyphen-2 scores also needs a licence. Therefore, we have removed Polyphen-2 scores from the "c" branches of dbNSFP v3.x. Those scores are still available in dbNSFP "a" branches, but users need to get their own licences for commercial usage.  

    UPDATE (November 24, 2015): dbNSFP v3.1 is released. Significant eQTLs from GTEx V6 have been added. dbSNP rs has been updated to build 144. Gene expression information (rpkm of RNAseq) of 53 tissues from GTEx V6 has been added to dbNSFP_gene. Three gene intolerance scores (RVIS based on ExAC r0.3, GDI and LoFtool) have been added to dbNSFP_gene. Please join our Email group for future update announcements from dbNSFP. 

    Two branches of dbNSFP are provided: dbNSFP3.1a suitable for academic use, which includes all the resources, and dbNSFP3.1c suitable for commercial use, which does not include VEST3, CADD and DANN. dbNSFP3.1a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.1c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

    Columns changes (dbNSFP_variant): rs_dbSNP142 replaced by rs_dbSNP144. 

    Columns added (dbNSFP_variant): GTEx_V6_gene, GTEx_V6_tissue.

    Columns added (dbNSFP_gene): 212 columns (for GTEx) inserted between Expression(GNF/Atlas) and Interactions(IntAct); 15 columns (for RVIS, GDI and LoFtool) inserted between RVIS_percentile and Essential_gene.

    UPDATE (October 11, 2015): There is a bug in search_dbNSFP30a and search_dbNSFP30c when using vcf input files with the -p option, which causes missing vcf columns when multiple rows in dbNSFP match the same SNV in the vcf file. For those who have already downloaded dbNSFPv3.0a or dbNSFPv3.0c, you can download the bug-fixed version of search_dbNSFP30a and search_dbNSFP30c and replace the old files. The distributions of dbNSFPv3.0a and dbNSFPv3.0c have been updated to include the bug-fixed search program.

    

    UPDATE (September 4, 2015): In the variant files released on August 3, when there were multiple MutationTaster scores/predictions for a SNV, only a subset including all different score/prediction combinations was presented. This may make it difficult to relate the scores/predictions to the transcripts shown by http://www.mutationtaster.org/ChrPos.html. This issue has now been fixed. Please download the database again using the following link if MutationTaster scores/predictions for all the transcripts are needed. Please note the corresponding transcript IDs are not included in dbNSFP but can be queried at http://www.mutationtaster.org/ChrPos.html. We thank Ian Maurer for reporting this issue.

    

    UPDATE (August 13, 2015): In the variant files released on August 3, when there were multiple FATHMM scores/predictions for a SNV, only the (predicted) most deleterious one was presented, instead of all scores/predictions. This issue has now been fixed. Please download the database again using the following link if transcript-specific FATHMM scores/predictions are needed. We thank Zena Ng for reporting this issue.

    

    NEW VERSION (August 3, 2015): dbNSFP v3.0 is released. Three new functional prediction scores (DANN, fathmm-MKL and fitCons) and two conservation scores (phyloP20way_mammalian and phastCons20way_mammalian) have been added. For commercial application of DANN, please contact Daniel Quang (dxquang@uci.edu). CADD scores have been updated to v1.3. For commercial licensing of CADD please contact Jennifer McCullar (mccullaj@uw.edu). I thank Kirill Prusov and Dr. Xueqiu Jian for suggestions on README files. Please join our Email group for future update announcements from dbNSFP. 

    Two branches of dbNSFP are provided: dbNSFP3.0a suitable for academic use, which includes all the resources, and dbNSFP3.0c suitable for commercial use, which does not include VEST3, CADD and DANN. dbNSFP3.0a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.0c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

    Columns updated: CADD_raw (dbNSFP v3.0a only), CADD_raw_rankscore (dbNSFP v3.0a only), CADD_phred (dbNSFP v3.0a only). 

    New columns: DANN_score (dbNSFP v3.0a only), DANN_rankscore (dbNSFP v3.0a only), fathmm-MKL_coding_score, fathmm-MKL_coding_rankscore, fathmm-MKL_coding_pred, fathmm-MKL_coding_group, integrated_fitCons_score, integrated_fitCons_rankscore, integrated_confidence_value, GM12878_fitCons_score, GM12878_fitCons_rankscore, GM12878_confidence_value, H1-hESC_fitCons_score, H1-hESC_fitCons_rankscore, H1-hESC_confidence_value, HUVEC_fitCons_score, HUVEC_fitCons_rankscore, HUVEC_confidence_value.

    UPDATE (June 11, 2015): Two of the column descriptions were found to be missing from the readme files: ESP6500_AA_AC and ESP6500_EA_AC. Here are the updated readme files: dbNSFP3.0b2a.readme.txt and dbNSFP3.0b2c.readme.txt. I thank Dr. Seung-Tae Lee for reporting this issue.

    UPDATE (April 12, 2015): dbNSFP v3.0 beta2 is released. This update fixed the issues due to inconsistent mitochondrial reference sequences used by different resources. I thank Dr. Lishuang Shen at MEEI for helping to solve the issues. For mitochondrial SNVs, the pos (i.e. hg38) refers to the rCRS (GenBank: NC_012920) and hg19_pos refers to a YRI sequence (GenBank: AF347015). The ancestral allele of mitochondrial SNVs now comes from the Reconstructed Sapiens Reference Sequence (RSRS, doi:10.1016/j.ajhg.2012.03.002). The affected content includes ancestral alleles, Neanderthal/Denisova genotypes and MutationTaster columns of the chrM file. The rankscores of MutationTaster have also been updated to reflect the update of its chrM scores. dbscSNV has been updated to v1.1, with hg38 positions added by lifting over its hg19 positions. Using search_dbNSFP30b2a or search_dbNSFP30b2c you can search dbscSNV1.1 along with dbNSFP v3.0b2 with either hg19 coordinates or hg38 coordinates. If you find any bugs/issues or have questions/comments please feel free to contact me.

    Two branches of dbNSFP are provided: dbNSFP3.0b2a suitable for academic use, which includes all the resources, and dbNSFP3.0b2c suitable for commercial use, which does not include VEST3 and CADD. dbNSFP3.0b2a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.0b2c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (April 6, 2015): dbNSFP v3.0 beta1 is released. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 22/Ensembl 79 with human reference sequence hg38. Putative genes have been included. Genes with an incomplete 5' end have been excluded (I thank Chris Gillies for reporting the issues for genes with an incomplete 5' end). Genes on mitochondrial DNA have been included. Allele frequencies from the UK10K cohorts and genotypes of two Neanderthals have been added. Some resources have been updated, including MutationTaster (I thank Dr. Dominik Seelow for kindly providing the scores), allele frequencies from the 1000 Genomes Project populations, ancestral alleles, dbSNP, ClinVar and InterPro. The presentation of the prediction scores has been improved by adding columns for the corresponding transcript/protein ids. PhyloP and PhastCons conservation scores based on hg19 have been replaced by the scores based on hg38. Some resources have been dropped for various reasons, including SLR test statistic, UniSNP ids, allele frequencies from the ARIC cohorts and allele counts in COSMIC. dbNSFP_gene has also been completely rebuilt using up-to-date resources. Residual Variation Intolerance Scores (RVIS) have been added. GO Slim terms have been replaced by full GO terms. If you find any bugs/issues or have questions/comments please feel free to contact me.

    Two branches of dbNSFP are now provided: dbNSFP3.0b1a suitable for academic use, which includes all the resources, and dbNSFP3.0b1c suitable for commercial use, which does not include VEST3 and CADD. dbNSFP3.0b1a can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP3.0b1c can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (February 3, 2015): dbNSFP v2.9 is released. SIFT score has been updated to ensembl66 version. PROVEAN (Protein Variation Effect Analyzer) score v1.1 has been added. I thank Dr. Yongwook Choi from J. Craig Venter Institute for providing the SIFT and PROVEAN scores. CADD score has been updated to 1.2 version. Please note the following copyright statement for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)." Allele frequency v0.3 of ~60,706 unrelated individuals from the Exome Aggregation Consortium (ExAC) has been added. ExAC data are released under a Fort Lauderdale Agreement. Please refer to http://exac.broadinstitute.org/terms for terms of use. The zipped database (7.8 Gb in size) can be downloaded from Amazon, googledrive or onedrive. The md5sum of the zip file can be found here. A README file is here. I thank Dr. CS (Jonathan) Liu from Softgenetics for providing hosting space.

    UPDATE (December 16, 2014): Some rows in the dbNSFP2.8_gene and dbNSFP2.8_gene.complete were truncated. I thank Jocob Hsu for identifying this issue. If you have already downloaded dbNSFPv2.8 you can download and replace the old files with the updated files: dbNSFP2.8_gene and dbNSFP2.8_gene.complete. The updated complete database can be downloaded with the links below.

    UPDATE (November 21, 2014): dbNSFP v2.8 is released. COSMIC (Catalogue Of Somatic Mutations In Cancer) annotations have been added. Pathway information from BioCarta and KEGG (old version) has been added to the dbNSFP2.8_gene. A bug causing inconsistency between MutationTaster scores and MutationTaster_pred, which affects v2.5 to v2.7, has been fixed. I thank Adam Novak for reporting this bug. The zipped database (6.8 Gb in size) can be downloaded from Amazon or googledrive. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (September 12, 2014): dbNSFP v2.7 is released. Chromosomes and positions of human reference hg38 have been added. search_dbNSFP27.class now supports querying dbNSFP using positions based on hg38 with the "-v hg38" option. ClinVar (freeze 20140902) annotations have been added. Allele frequencies from 2303 exomes of African Americans and 3203 exomes of European Americans from the Atherosclerosis Risk in Communities (ARIC) cohort study have been added. As the columns for gene interactions in the dbNSFP_gene table contain very long strings, especially for gene UBC, which may cause problems when viewing the results in Excel, we now report only the number of interacting genes in those columns. Full information is retained in the dbNSFP_gene.complete table. The zipped database (6.8 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.

    

    UPDATE (July 26, 2014): dbNSFP v2.6 is released. rs numbers from dbSNP 141 have been added to the variant database files. Mouse and zebrafish homolog genes and phenotypes have been added to the gene database file (I thank Alex Li for his suggestion and help). Trait_association(GWAS) was also updated. The zipped database (6.7 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.

    CORRECTION (September 9, 2014): the rs numbers in v2.6 are from the latest dbSNP 141 (not 138 as previously noted). The README file has been updated accordingly. I thank Jason J. Corneveaux for pointing this out. 

    UPDATE (June 1, 2014): dbNSFP v2.5 is released. A new functional score VEST 3.0 has been added. We thank Dr. Karchin for kindly providing the score. Non-commercial use of VEST is free. Commercial users of VEST please contact the Johns Hopkins Technology Transfer office. A bug that has caused MutationTaster score errors since v2.1 for variants with a prediction of "Polymorphism_automatic" has been fixed. We thank John McGuigan and James Ireland for reporting this bug. As MutationTaster can also predict splicing change and other functional effects, in case a variant has multiple predictions based on its different models, we took the most damaging score and prediction for dbNSFP. The zipped database (7.3 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (March 5, 2014): dbNSFP v2.4 is released. A whole genome functional prediction score called CADD was added, along with five more conservation scores (phyloP46way_primate, phyloP100way_vertebrate, phastCons46way_primate, phastCons46way_placental, phastCons100way_vertebrate). Please note the following copyright statement for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)." To facilitate comparison between scores, we added rank scores for most functional prediction scores and conservation scores, replacing the "converted" scores in the previous versions. In short, for a given type of prediction/conservation score, all its scores in dbNSFP were first ranked, and the rankscore is the rank divided by the total number of its scores. Roughly speaking, the rankscore ranges from 0 to 1, and the larger the rankscore, the higher the score ranks in dbNSFP and therefore the more likely the SNP is to have a damaging effect. The zipped database (6.9 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.
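
    The rankscore idea described above (rank divided by the total number of scores) can be illustrated with a minimal Python sketch. This is not the code used to build dbNSFP: the function name is hypothetical, the input is assumed to be already oriented so that a larger score means more damaging, and giving tied scores their average rank is an assumption made here for illustration.

```python
def rank_scores(scores):
    """Return rank / N for each score, in the original order.

    Assumptions (not taken from dbNSFP itself): ranks start at 1 for the
    smallest score, and tied scores share the average of their ranks.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    result = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the run of scores tied with position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank / n
        i = j + 1
    return result


example_scores = [0.1, 0.9, 0.5, 0.5]
example_rankscores = rank_scores(example_scores)
print(example_rankscores)  # [0.25, 1.0, 0.625, 0.625]
```

    Because each value is a rank fraction, rankscores from different prediction methods fall on the same 0 to 1 scale even when the raw scores do not.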

    UPDATE (February 12, 2014): A bug was fixed in dbNSFP v2.2 and v2.3, which caused missing delimiters in columns aapos_SIFT, SIFT_score_converted and SIFT_pred. For those who need to use information from those columns, please re-download the database(s) using the above links.

    UPDATE (January 26, 2014): dbNSFP v2.3 is released. In collaboration with Dr. Kai Wang's lab at USC, we constructed two ensemble scores (MetaSVM and MetaLR) based on 10 component scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in the 1000 genomes populations. Based on our comparison, the two ensemble scores outperform all their component scores. A manuscript describing the ensemble scores has been published in Human Molecular Genetics. This release added the two ensemble scores and their predictions. The zipped database (4.4 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (January 23, 2014): dbNSFP v2.2 is released. SIFT and FATHMM now have multiple scores corresponding to different Ensembl ENSP ids and amino acid positions (aapos_SIFT and aapos_FATHMM). Accordingly, our companion search program now supports SNP searches based on Ensembl ENSP ids and amino acid positions. A bug is fixed for a small proportion of MutationTaster scores. The zipped database (4 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (October 3, 2013): dbNSFP v2.1 is released. MutationTaster and FATHMM scores have been updated. To facilitate interpretation of the prediction scores, converted scores of SIFT, LRT, MutationTaster, MutationAssessor and FATHMM have been added. The converted scores are all scaled to the range 0 to 1, with a larger number indicating that the variant is more likely to be damaging. Columns of SIFT and FATHMM predictions have been added. The gene database has also been updated. Database IDs are updated. GO Slim terms, pathway and protein interaction information from the ConsensusPathDB, and a list of essential and non-essential genes (based on phenotypes of mouse homologs) have been added. The zipped database (3.3 Gb in size) can be downloaded from Amazon. The md5sum of the zip file can be found here. A README file is here.

    UPDATE (August 12, 2013): The java search program is updated with an option for users to choose whether to output all columns from the input vcf file to the output file. You can download it from here.

    

    UPDATE (May 31, 2013): The source code of the companion Java search program is now available under the RECEX SHARED SOURCE LICENSE. You can download it from here.  

    UPDATE (March 22, 2013): A bug which caused a lot of missing FATHMM scores has been fixed. The database files have been updated. Please use the above link (February 25, 2013) to download the database. The alternative companion java search program (March 12, 2013) is now the default search program included in the zip file. 

    UPDATE (March 12, 2013): Here is an alternative companion java search program, which outputs queries that are not found into an error file instead of the system output. It can be downloaded from here. You can just replace the companion search program packed with the database file. 

    NEW VERSION (February 25, 2013): Finally dbNSFP v2.0 is released. A new functional prediction score FATHMM is added.  It can be downloaded from Amazon. A README file is here.

    UPDATE (November 19, 2012): A bug was found in the companion java search program search_dbNSFP20b4, which causes missing output when only position queries are included in the input file. The fixed program can be downloaded from here. The program in the database zip file linked above has been replaced too.

    UPDATE (October 27, 2012): dbNSFP v2.0b4 is released. A new functional prediction score MutationAssessor is added. Allele frequencies from the ESP 5400 data set are replaced by those from the ESP 6500 data set. It can be downloaded from Amazon. A README file is here.

    UPDATE (August 28, 2012): The companion java search program search_dbNSFP20b3 is updated. Added features include support for vcf files as input and options for output contents (columns). It can be downloaded from here. A README file is here. Simply replace the old search_dbNSFP20b3.class file with the new file.

    UPDATE (July 2, 2012): dbNSFP v2.0b3 is released. To facilitate filtering, an additional 2.2 million splicing site SNPs have been added to dbNSFP_variant. In the table those SNPs have a missing value (".") in aaref and aaalt and "-1" in aapos. There is no change to the format of the search input file. It can be downloaded from Amazon. A README file is here. Bug reports are very welcome.
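
    As a minimal sketch of the filtering this enables, the Python snippet below separates splicing site rows from non-synonymous rows using the "." and "-1" markers described above. The column names aaref, aaalt and aapos come from the note; the tab-delimited layout, the simplified header, the sample rows and the function names are assumptions for illustration only.

```python
import csv
import io

# Hypothetical two-row excerpt; real dbNSFP_variant files have many more columns.
SAMPLE = (
    "chr\tpos\tref\talt\taaref\taaalt\taapos\n"
    "1\t1000\tA\tG\tK\tR\t12\n"
    "1\t2000\tC\tT\t.\t.\t-1\n"
)


def is_splice_site_row(row):
    """True when aaref/aaalt are missing (".") and aapos is "-1"."""
    return row["aaref"] == "." and row["aaalt"] == "." and row["aapos"] == "-1"


def split_rows(handle):
    """Split tab-delimited rows into (non-synonymous, splicing site) lists."""
    reader = csv.DictReader(handle, delimiter="\t")
    ns_rows, ss_rows = [], []
    for row in reader:
        if is_splice_site_row(row):
            ss_rows.append(row)
        else:
            ns_rows.append(row)
    return ns_rows, ss_rows


ns_rows, ss_rows = split_rows(io.StringIO(SAMPLE))
print(len(ns_rows), len(ss_rows))
```

    For a real file, an opened file handle would be passed to split_rows in place of the in-memory sample.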

    UPDATE (June 2, 2012): dbNSFP v2.0 beta 2 is released, which includes both the dbNSFP_variant and dbNSFP_gene sub-databases. Slight changes have been made to the Ensembl gene and transcript ids of dbNSFP_variant in order to be compatible with other database sources. For each gene, dbNSFP_gene includes various ids of the gene for different databases, function description, gene expression information, gene interaction information, diseases or traits the gene causes or is associated with, estimated probability of haploinsufficiency, estimated probability of causing recessive disease, etc. It can be downloaded from Amazon. A README file is here. Bug reports are very welcome.

    UPDATE (April 11, 2012): The long-awaited dbNSFP v2.0 is on the horizon now. The new database is rebuilt based on the Gencode release 9 / Ensembl version 64. The default coordinate is hg19, but hg18 is still supported. There will be two parts of the database: one focuses on variant annotation and the other focuses on gene annotation. The variant sub-database is now open for beta test and can be downloaded from Amazon. A README file is here. SIFT, Polyphen-2 and MutationTaster scores are updated. Please note that now all scores are RAW scores, without imputation and transformation. One more conservation score, SiPhy, is added along with other new annotations such as the protein functional domains, the allele frequencies observed in the 1000 Genomes phase 1 data and the NHLBI's Exome Sequencing Project data, etc. Bug reports are very welcome.

    

    dbNSFP_light is a light version of dbNSFP, which contains fewer annotation entries but an additional 9,285,316 NSs that are not in CCDS version 20090327.

    dbNSFP_light v1.0 can be downloaded from Amazon. A README file is here. Scores of PhyloP, SIFT, Polyphen2, LRT and MutationTaster are included but missing data are not imputed. Predictions of LRT and MutationTaster are also included, as well as the omega estimated by LRT. A companion Java program called search_dbNSFP_light.class can be downloaded from here and used for local queries.

    dbNSFP_light v1.1 added GERP++ neutral rates and RS scores. It can be downloaded from Amazon (including readme and the corresponding java search program). A README file is here.

    dbNSFP_light v1.2 added Uniprot ID, accession number and amino acid position based on the Polyphen-2 annotations. Users can now search amino acid change directly referring to a Uniprot ID or accession number. dbNSFP_light v1.2 can be downloaded from Amazon (including readme and the corresponding java search program). A README file is here.

    dbNSFP_light v1.3 updated SIFT scores (August, 2011 version) and Polyphen-2 scores (May, 2011 version). SIFT: 7,097,009 scores added, 48,011,111 updated. Polyphen-2: 2,136,757 scores added, 53,712,654 updated. Uniprot ID, accession number and amino acid position based on the Polyphen-2 annotations have been updated too. It can be downloaded from Amazon (including readme and the corresponding java search program). A README file is here.

    dbNSFP v1.3 added Uniprot ID, accession number and amino acid position based on the Polyphen-2 annotations. Users can now search amino acid change directly referring to a Uniprot ID or accession number. dbNSFP v1.3 can be downloaded from Amazon (including readme and the corresponding java search program). A README file is here.

    Update (Nov. 10, 2011): A bug was found in the companion search program for dbNSFP v1.3, which causes invalid searches when using AA mutations with a Uniprot ID or accession number. Please use the updated search program. The search program in the dbNSFP v1.3 zip file has been updated.

    dbNSFP v1.2 added GERP++ neutral rates and RS scores. It can be downloaded from Amazon (including readme and the corresponding java search program). A README file is here.

    dbNSFP v1.1 added the following entries: rs numbers from UniSNP (a cleaned version of dbSNP build 129), allele frequency recorded in dbSNP, allele frequency reported by 1000 Genomes Project, alternative gene names, descriptive gene name, database cross references (gene IDs of HGNC, MIM, Ensembl and HPRD). The unzipped database is 18 Gb.

    dbNSFP v1.1 can be downloaded from Amazon. A README file is here.

    A companion Java program called search_dbNSFP11.class can be downloaded from here and used for local queries.

    

    dbNSFP v1.0 can be downloaded from Amazon. A README file is here. More details about the database can be found in our paper.

    A companion Java program called search_dbNSFP.class can be downloaded from here and used for local queries.