Post date: 30-Jul-2007 00:26:54
: A Functional Overview of Homology Modelling :
This is the basic project layout. The source code is in Git and is updated incrementally. However, I haven't hosted the site yet.
Project Modules:
Multiple Sequence Alignment:
The original idea was to use Clustal W for multiple sequence alignment (MSA) and for generating phylogenetic trees. As the introduction to the tool puts it, Clustal W performs multiple sequence alignment on a set of unknown sequences supplied as a flat file. In addition, it creates phylogenetic trees, which can be used for evolutionary studies.
This application will be written in either JSF (I have always hated it, just because it makes us write sooo much crap to do even simple things) or JavaFX (kinda sucks as of now, but I'm hoping it gets better with Java 8).
I do realize this idea is even Bigger than the Big Idea. And that's why this web presence comes into picture. Guys please do Like and +1 my site (you know, we live in a market economy, so I do need a little SEO to get myself high in ranks, so that people may see that I exist and I can garner as many hands as I can, you see, like they say, the Government is not going to bail you out). But also take some time out to visit my Git page and help me out. That's something I'll honestly appreciate.
Homology modelling:
This is a part of Version 2.0 and beyond. Just a small intro of what lies ahead.
Create a complete visualization tool along the lines of SwissPDBViewer or PyMol.
Basically, the tool is supposed to have the following features:
Visual interactive real time structure modifications.
Heat Map.
Ramachandran Plot.
Multiple Regression and Maximum Likelihood analysis.
H-Bond and S-Bond interactions.
Stereo studies.
Complete integration with the Structure Prediction Module.
Every time a new sequence model is created, it should be possible to auto-export the model to this module, rendered as a 3D structure.
The reverse should also hold true, i.e., when we interact with the model and add or remove chains, the resulting changes should be easy to realign using MSA and HMMs.
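Of the features listed above, the Ramachandran plot is the easiest to pin down in code: it is a scatter plot of the backbone dihedral angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). Below is a minimal sketch of the dihedral calculation, assuming atom coordinates are already available as plain `double[3]` arrays. The class and method names are made up for illustration and are not part of the project's code.

```java
final class Dihedral {

    // Signed dihedral angle, in degrees, for four points a-b-c-d (each {x, y, z}).
    static double dihedralDegrees(double[] a, double[] b, double[] c, double[] d) {
        double[] b1 = sub(b, a);
        double[] b2 = sub(c, b);
        double[] b3 = sub(d, c);
        double[] n1 = cross(b1, b2); // normal of the plane through a, b, c
        double[] n2 = cross(b2, b3); // normal of the plane through b, c, d
        double x = dot(n1, n2);
        double y = Math.sqrt(dot(b2, b2)) * dot(b1, n2);
        return Math.toDegrees(Math.atan2(y, x));
    }

    static double[] sub(double[] p, double[] q) {
        return new double[] {p[0] - q[0], p[1] - q[1], p[2] - q[2]};
    }

    static double dot(double[] p, double[] q) {
        return p[0] * q[0] + p[1] * q[1] + p[2] * q[2];
    }

    static double[] cross(double[] p, double[] q) {
        return new double[] {
            p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0]
        };
    }

    public static void main(String[] args) {
        double[] a = {0, 1, 0}, b = {0, 0, 0}, c = {1, 0, 0};
        // Fourth point on the same side as a (cis), then on the opposite side (trans).
        System.out.println(dihedralDegrees(a, b, c, new double[] {1, 1, 0}));
        System.out.println(dihedralDegrees(a, b, c, new double[] {1, -1, 0}));
    }
}
```

For a real structure, the four points for phi and psi would come from the backbone atoms of consecutive residues, and each residue would contribute one (phi, psi) point to the plot.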
Profile HMMs are built using a three-step process.
hmmsearch - this step widens our search horizon and provides suggestive leads from the sequence databank.
hmmalign - this step performs the actual alignment.
hmmcalibrate - this step calibrates the profile HMM's score statistics, so that further searches are more sensitive.
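Since the plan is to drive everything from Java, here is a rough sketch of how the three HMMER steps above could be assembled as command lines, in the order listed. It assumes the HMMER 2.x tools are on the PATH and take their arguments as `program hmmfile [seqfile]`. The class name and file names are placeholders, and nothing is actually executed.

```java
import java.util.List;

final class HmmerSteps {

    // Builds the command lines only; running them is left to the caller.
    static List<List<String>> commands(String hmmFile, String seqDb, String seqFile) {
        return List.of(
            List.of("hmmsearch", hmmFile, seqDb),    // search the databank with the profile
            List.of("hmmalign", hmmFile, seqFile),   // align sequences to the profile
            List.of("hmmcalibrate", hmmFile));       // calibrate the profile's statistics
    }

    public static void main(String[] args) {
        // Placeholder file names, for illustration only.
        for (List<String> cmd : commands("family.hmm", "databank.fasta", "hits.fasta")) {
            System.out.println(String.join(" ", cmd));
            // To run a step for real, one could use:
            // new ProcessBuilder(cmd).inheritIO().start().waitFor();
        }
    }
}
```

Keeping the commands as plain lists means they can be logged or tested without HMMER being installed.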
The big idea is to design my own algorithms, preferably in Java.
Yeah! Yeah! I know... slow, it is!
But then, for web purposes, Java EE provides the most stable platform other than .NET (I don't really know much about .NET, since in our days it was still very much a proprietary thingy, even for students!). PHP is there, but it's just toooo scripty and doesn't have the solid feel of objects. And then there's RoR, but that's too sketchy. So, well, Java it is.
Structure Prediction:
This is the final step for Version 1.0
In the past, I have mostly relied on Modeller (written and compiled by Andrej Sali), but then I was a student and there were no Copyright issues.
Now, things have changed a lot. I am not a student anymore, and Andrej Sali has gotten kinda famous, so he is a little restrictive about the use of his software.
Nevertheless, the code is available upon request, and as far as I remember, he once said he would be only too happy if somebody ported the code to Java.
Basically, Modeller scripts work as follows.
align: this module aligns the submitted sequences. Since our sequence is already aligned, it works faster. The algorithm is different from the ones used above, so the accuracy is also improved.
Basically, it aligns a block of sequences with a block of structures. The gap penalty depends on the 3D structure of all sequences in block 1. The variable gap penalty can favor gaps in exposed regions, avoid gaps within secondary structure elements, favor gaps in curved parts of the main chain, and minimize the distance between the two positions spanning a gap. The modeller align command is preferred for aligning a sequence with structure(s) in comparative modeling because it tends to place gaps in a better structural context.
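To make the variable gap penalty idea concrete, here is a toy sketch in Java. It is NOT Modeller's actual function: the weights, the formula, and all names are invented for illustration. It covers only three of the effects described above (exposure, secondary structure, and the distance spanned by the gap) and leaves out main-chain curvature.

```java
final class VariableGapPenalty {

    // All weights are made-up illustration values, not Modeller's parameters.
    static final double BASE_OPEN = 10.0;
    static final double SECONDARY_STRUCTURE_FACTOR = 2.0;
    static final double DISTANCE_WEIGHT = 0.5;

    /**
     * Cost of opening a gap at a given structural position.
     *
     * @param inHelixOrStrand       true if the position lies inside a helix or strand
     * @param relativeAccessibility solvent accessibility from 0 (buried) to 1 (exposed)
     * @param spanDistance          distance in angstroms between the residues flanking the gap
     */
    static double gapOpenPenalty(boolean inHelixOrStrand,
                                 double relativeAccessibility,
                                 double spanDistance) {
        // Clamp accessibility into [0, 1] so bad input cannot flip the sign.
        double acc = Math.max(0.0, Math.min(1.0, relativeAccessibility));
        // Exposed positions are cheaper to gap than buried ones.
        double penalty = BASE_OPEN * (1.5 - acc);
        // Gaps inside secondary structure elements cost more.
        if (inHelixOrStrand) {
            penalty *= SECONDARY_STRUCTURE_FACTOR;
        }
        // Gaps whose flanking residues are far apart cost more.
        return penalty + DISTANCE_WEIGHT * spanDistance;
    }

    public static void main(String[] args) {
        System.out.println("exposed loop: " + gapOpenPenalty(false, 1.0, 4.0));
        System.out.println("buried helix: " + gapOpenPenalty(true, 0.0, 4.0));
    }
}
```

With these invented weights, a gap in an exposed loop comes out much cheaper than one in a buried helix, which is the behaviour the paragraph above describes.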
model: this module performs the actual modelling.
evaluate: this module evaluates the models built from the aligned sequences and picks out the best ones, using the DOPE score. For the most part, the number of templates used to build a model does not matter much to the evaluation algorithm.
However, the molpdf and DOPE scores are not 'absolute' measures: they can only be used to rank models calculated from the same alignment. Other scores are transferable. The evaluation also provides a plot of the structures, which can be used later for regression analysis (NOT to be a part of Version 1.0).
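The "same alignment only" rule is easy to get wrong in code, so here is a small sketch of how the ranking could look in Java. It assumes each model carries an identifier for the alignment it was built from, and that a lower DOPE score is better. The class names, field names, and scores are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DopeRanking {

    static final class Model {
        final String name;
        final String alignmentId; // hypothetical tag for the alignment the model came from
        final double dope;

        Model(String name, String alignmentId, double dope) {
            this.name = name;
            this.alignmentId = alignmentId;
            this.dope = dope;
        }
    }

    // Returns only the models built from the given alignment, lowest DOPE first.
    static List<Model> rank(List<Model> models, String alignmentId) {
        List<Model> sameAlignment = new ArrayList<>();
        for (Model m : models) {
            if (m.alignmentId.equals(alignmentId)) {
                sameAlignment.add(m);
            }
        }
        sameAlignment.sort(Comparator.comparingDouble((Model m) -> m.dope));
        return sameAlignment;
    }

    public static void main(String[] args) {
        // Scores are made-up numbers, not real Modeller output.
        List<Model> models = List.of(
            new Model("model1", "aln-A", -38000.0),
            new Model("model2", "aln-A", -39500.0),
            new Model("model3", "aln-B", -41000.0));
        for (Model m : rank(models, "aln-A")) {
            System.out.println(m.name + " " + m.dope);
        }
    }
}
```

Note that model3 never enters the ranking for aln-A, even though its score is the lowest overall, because it comes from a different alignment.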
Ohh ya! long long ago, when I first started out on the project, it used to look like this: