http://en.wikipedia.org/wiki/Virtual_screening
From Wikipedia, the free encyclopedia
Virtual screening (VS) is a computational technique used in drug discovery research. It involves the rapid in silico assessment of large libraries of chemical structures in order to identify those structures most likely to bind to a drug target, typically a protein receptor or enzyme.[1][2]
Virtual screening has become an integral part of the drug discovery process. Although it is related to the more general and long-pursued concept of database searching, the term "virtual screening" is relatively new. Walters et al. define virtual screening as "automatically evaluating very large libraries of compounds" using computer programs.[3] As this definition suggests, VS has largely been a numbers game, focused on questions such as how to filter the enormous chemical space of over 10^60 conceivable compounds[citation needed] down to a manageable number that can be synthesized, purchased, and tested. Although filtering the entire chemical universe might be a fascinating question, more practical VS scenarios focus on designing and optimizing targeted combinatorial libraries and on enriching libraries of available compounds from in-house compound repositories or vendor offerings.
The purpose of virtual screening is to come up with hits of novel chemical structure that bind to the macromolecular target of interest. Thus, the success of a virtual screen is defined in terms of finding interesting new scaffolds rather than many hits. Interpretations of VS accuracy should therefore be treated with caution. Low hit rates of interesting scaffolds are clearly preferable to high hit rates of already known scaffolds.
There are two broad categories of screening techniques: ligand-based and structure-based.[4]
Given a set of structurally diverse ligands that bind to a receptor, a model of the receptor can be built based on what binds to it. Such models are known as pharmacophore models. A candidate ligand can then be compared to the pharmacophore model to determine whether it is compatible with it and therefore likely to bind.[5]
Another approach to ligand-based virtual screening is to use chemical similarity analysis methods to scan a database of molecules against one active ligand structure.
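As an illustration of similarity-based screening, the following minimal sketch ranks a small library against one active query structure. It assumes that each molecule has already been reduced to a binary fingerprint, represented here as the set of its "on" bit indices, and it uses the Tanimoto coefficient as the similarity measure, which is one common choice among several. The function names, the compound names, the fingerprints, and the 0.5 threshold are all hypothetical and chosen only for the example.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled as the set of bit positions that are switched on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def screen(query: Fingerprint,
           library: Dict[str, Fingerprint],
           threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, s) for name, s in scored if s >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits


# Toy data (made up): one known active and three library compounds.
query = frozenset({1, 4, 7, 9})
library = {
    "cmpd_A": frozenset({1, 4, 7, 9}),
    "cmpd_B": frozenset({1, 4, 7, 12}),
    "cmpd_C": frozenset({2, 3}),
}

if __name__ == "__main__":
    for name, similarity in screen(query, library):
        print(f"{name}\t{similarity:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit rather than being written by hand, and the threshold would be tuned to the fingerprint type in use.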
Structure-based virtual screening involves docking of candidate ligands into a protein target followed by applying a scoring function to estimate the likelihood that the ligand will bind to the protein with high affinity.[6][7]
The size of the task requires a parallel computing infrastructure, such as a cluster of Linux systems, running a batch queue processor to handle the work, such as Sun Grid Engine or Torque PBS.
A means of handling the input from large compound libraries is needed. This requires a form of compound database that can be queried by the parallel cluster, delivering compounds in parallel to the various compute nodes. Commercial database engines may be too ponderous, and a high-speed indexing engine, such as Berkeley DB, may be a better choice. Furthermore, it may not be efficient to run one comparison per job, because the ramp-up time of the cluster nodes could easily outstrip the amount of useful work. To work around this, it is necessary to process batches of compounds in each cluster job, aggregating the results into some kind of log file. A secondary process that mines the log files and extracts high-scoring candidates can then be run after the whole experiment has finished.
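The batching and log-mining pattern described above can be sketched as follows. This is a simplified single-process illustration, not a cluster implementation: the batches would normally be submitted as separate jobs to the queue, and the log lines would be written to files rather than kept in memory. The helper names, the tab-separated log format, the toy scoring function, and the assumption that a higher score is better are all hypothetical.

```python
import heapq
import itertools
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` compound identifiers."""
    iterator = iter(items)
    while True:
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch


def run_batch(batch: List[str],
              score_fn: Callable[[str], float]) -> List[str]:
    """Stand-in for one cluster job: score a batch, emit one log line each."""
    return [f"{cid}\t{score_fn(cid):.3f}" for cid in batch]


def mine_logs(log_lines: Iterable[str],
              top_n: int) -> List[Tuple[str, float]]:
    """Secondary pass: parse all log lines and keep the top_n scores."""
    records = []
    for line in log_lines:
        cid, score = line.rstrip("\n").split("\t")
        records.append((cid, float(score)))
    return heapq.nlargest(top_n, records, key=lambda record: record[1])


def toy_score(cid: str) -> float:
    """Made-up scoring function standing in for a real comparison."""
    return float(int(cid.split("_")[1]) % 7)


compounds = [f"cmpd_{i}" for i in range(10)]

# Each batch amortizes job start-up cost over several compounds.
log: List[str] = []
for batch in batched(compounds, 4):
    log.extend(run_batch(batch, toy_score))

top_hits = mine_logs(log, top_n=3)

if __name__ == "__main__":
    for cid, score in top_hits:
        print(f"{cid}\t{score}")
```

The batch size is the main tuning parameter: it should be large enough that the useful work per job clearly exceeds the job's start-up time, and small enough that the work still spreads evenly across the available nodes.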