The generation of the consensus, for both variants, has been written in PHP and is contained in consensus.php, which is attached.

Consensus 1: Simple Consensus

This consensus looks at the cuts made by each method and groups the methods whose respective cuts fall within a 20% residue boundary of the reference method. This 20% is worked out in two ways:

(i) For chains where the domains are all continuous, this is the length of the chain divided by the number of domains assigned in the chain, multiplied by 100 and then divided by the percentage figure to use, in this case 20. There is future scope to experiment with this figure.

(ii) For chains where domains are fragmented, this is the length of the chain divided by the number of cuts made in the chain, multiplied by 100 and divided by the percentage.

The first method is placed in a new group. Each cut made by the next method is then compared with the cuts made by the first method. If every cut falls within the 20% residue boundary set around the position of the first method's corresponding cut, the second method is put in the same group; if not, it is put in a new group. All subsequent methods then work their way through the groups, each joining the group in which all of its boundaries fall within the window.

Each group is then given a score: the number of methods in the group divided by the total number of methods that have made an assignment on this chain. The consensus is the top-scoring group if its score is over 0.4 and no other group has a score within 0.1 of it. If the top score is over 0.4 but another group scores within 0.1 of it, the top two groups are presented as potential consensus assignments. If no group scores over 0.4, no consensus is found.

The consensus is displayed using the same applet as the comparison graph, alongside the assignments made by CATH and SCOP, so that these can be compared.
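
The grouping and scoring steps of the simple consensus can be sketched as follows. This is an illustrative Python sketch, not the attached consensus.php. All function names are hypothetical. It assumes that the window is 20% of the average domain (or segment) length, that a method with a different number of cuts from a group's reference method cannot join that group, that the first member of each group acts as its reference, and that a method matching no existing group starts a new one.

```python
from typing import Dict, List


def window_size(chain_length: int, n_units: int, percent: float = 20.0) -> float:
    """Tolerance in residues, read here as `percent` of the average unit length.

    n_units is the number of domains (continuous chains) or the number of
    cuts (fragmented chains). This reading of the arithmetic is an assumption.
    """
    return chain_length / n_units * percent / 100.0


def cuts_agree(cuts: List[int], ref_cuts: List[int], window: float) -> bool:
    """True if every cut lies within `window` residues of the matching reference cut."""
    if len(cuts) != len(ref_cuts):  # assumption: differing cut counts never agree
        return False
    return all(abs(a - b) <= window for a, b in zip(sorted(cuts), sorted(ref_cuts)))


def group_methods(assignments: Dict[str, List[int]], window: float) -> List[List[str]]:
    """Place each method in the first group whose reference (first member) it agrees with."""
    groups: List[List[str]] = []
    for method, cuts in assignments.items():
        for group in groups:
            if cuts_agree(cuts, assignments[group[0]], window):
                group.append(method)
                break
        else:
            groups.append([method])  # no matching group: start a new one
    return groups


def simple_consensus(groups: List[List[str]],
                     min_score: float = 0.4,
                     margin: float = 0.1) -> List[List[str]]:
    """Return one group (clear consensus), two groups (candidates), or none."""
    total = sum(len(g) for g in groups)
    scored = sorted(((len(g) / total, g) for g in groups),
                    key=lambda pair: pair[0], reverse=True)
    if not scored or scored[0][0] <= min_score:
        return []
    if len(scored) > 1 and scored[0][0] - scored[1][0] <= margin:
        return [scored[0][1], scored[1][1]]
    return [scored[0][1]]


# Hypothetical cut positions for a 300-residue chain with two continuous domains.
example = {"A": [150], "B": [160], "C": [140], "D": [60], "E": [150, 220]}
example_groups = group_methods(example, window_size(300, 2))
example_result = simple_consensus(example_groups)
```

In this made-up example, methods A, B and C share a group because their single cuts lie within the window of A's cut, while D and E each form a group of their own.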
Consensus 2: Weighted Consensus

This approach takes the performance of each algorithm into account, as well as other knowledge such as secondary structure partitioning, the type of fold (based on the CATH definition) and the tendency of the algorithm to fragment the domains. The methods are grouped in the same way as in the previous approach, but the score for each group is generated differently. Each algorithm has an initial weighting gained through previous performance analysis (Holland et al, 2006). This weighting is then increased or reduced based on the method's performance in comparison with the rest of the methods. Initially, each algorithm is given the following weight:

• PDP: 84.4%
• NCBI: 81.9%
• DomainParser2: 78.1%
• DDomain: 76.5%
• PUU: 74.0%
• DHcL: 68.3%
• Dodis: 40%

There is a page on the website (http://pdomains.sdsc.edu/v2/consensus2rules.php) which details how these are adjusted to find the weighting for each method, which is then used to create a reliability score for the group.

The PHP files for these pages are attached, as well as the PHP for the page containing the rules of the weighted consensus. Also attached are the two "working" pages, which are displayed while the Simple and Weighted consensus are being calculated (working.php and working2.php respectively), and the GIF spinner that is used on those pages.
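
As a rough illustration of how per-method weights could feed a group score in the weighted consensus, the Python sketch below uses only the initial weights listed above. The adjustment rules described on the consensus2rules.php page are not reproduced here. The way the weights are combined (a group's share of the total weight of all participating methods) is an assumption made for illustration and may differ from the formula the site uses. The function name is hypothetical.

```python
from typing import Dict, List

# Initial weights (percent) as listed in the text, before any adjustment.
INITIAL_WEIGHTS: Dict[str, float] = {
    "PDP": 84.4,
    "NCBI": 81.9,
    "DomainParser2": 78.1,
    "DDomain": 76.5,
    "PUU": 74.0,
    "DHcL": 68.3,
    "Dodis": 40.0,
}


def group_reliability(group: List[str],
                      participating: List[str],
                      weights: Dict[str, float] = INITIAL_WEIGHTS) -> float:
    """Share of the total participating weight held by `group`.

    Assumption: this normalisation is illustrative only; the real rules
    first adjust each weight and may combine them differently.
    """
    total = sum(weights[m] for m in participating)
    return sum(weights[m] for m in group) / total


# Hypothetical chain on which only three methods made an assignment.
score = group_reliability(["PDP", "NCBI"], ["PDP", "NCBI", "Dodis"])
```

Under this assumed normalisation, a group containing two highly weighted methods outweighs a single low-weighted dissenter by more than the simple 2-to-1 head count would suggest.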