Please direct correspondence regarding SuperFine and DACTAL to Tandy Warnow.
SuperFine is software for constructing supertrees from source phylogenies.
This section details steps for installing and running SuperFine. We ran SuperFine on a machine running Linux release 2.6.24.6+desktop10+25.63. If you experience difficulty installing or running the software, please contact us at the e-mail address below.
1. Install PAUP*.
2. Install the Python packages.
The software packages listed below are Python source distributions. To use them, you must first have Python installed on your system; for details on obtaining and installing Python, please visit the Python home page. We used Python version 2.6.
1. Install the newick_modified package.
2. Install the spruce package: spruce-1.0.tar.gz
3. Install the dendropy package: DendroPy-v2.4.0.tar.gz (shown below)
4. Install the reup package: reup-1.0.tar.gz (shown below)
Running SuperFine: Copy the "runReup.py" script from the "bin" subdirectory of the location in which you installed the Python packages to the directory in which you are working. Run the script using the command "python runReup.py -r gmrp <source_trees_file>", where "<source_trees_file>" is the name of a file containing source trees in Newick format.
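For scripted runs, the command above can be wrapped in a small Python helper. This is a minimal sketch, not part of the SuperFine distribution: the function names build_superfine_command and run_superfine are hypothetical, and it assumes that "python" on your PATH is the interpreter under which the packages were installed and that runReup.py has already been copied to the working directory. The helper itself is written for Python 3.

```python
import subprocess


def build_superfine_command(source_trees_file, script="runReup.py",
                            python_exe="python"):
    """Return the argument list for the SuperFine command described above.

    Hypothetical helper: only the "-r gmrp <source_trees_file>" form given
    in the text is used; no other options are assumed.
    """
    return [python_exe, script, "-r", "gmrp", source_trees_file]


def run_superfine(source_trees_file):
    """Run SuperFine on a file of Newick source trees.

    check=True raises subprocess.CalledProcessError if the script exits
    with a non-zero status.
    """
    command = build_superfine_command(source_trees_file)
    return subprocess.run(command, check=True)


# Example (assumes "source_trees.tre" exists in the working directory):
# run_superfine("source_trees.tre")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the source trees file name contains spaces.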
S. Nelesen, K. Liu, L.-S. Wang, C. R. Linder, and T. Warnow, "DACTAL: divide-and-conquer trees (almost) without alignments," Bioinformatics, vol. 28, ISMB 2012, pp. i274-i282. The DACTAL method uses SuperFine within a divide-and-conquer framework.
DACTAL (Divide-And-Conquer Trees without ALignments) is a software package for estimating phylogenies from ultra-large datasets of up to tens of thousands of unaligned nucleotide sequences, each many kilobases long.
To install DACTAL, first extract the distribution archive:
tar xjf dactal-ship.tar.bz2
Then follow the installation instructions in ./dactal-ship/README.
K. Liu and T. Warnow, "Treelength Optimization for Phylogeny Estimation," PLoS ONE, vol. 7, no. 3:e33104, 2012, doi: 10.1371/journal.pone.0033104.
All files are in compressed .tar.bz2 format. To extract a compressed <file>, use the command: tar xjf <file>
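If the tar command is not available, the same extraction can be done with Python's standard tarfile module. The following is a minimal sketch; the function name extract_archive is hypothetical, and it assumes the archive comes from a source you trust, since extraction writes whatever paths the archive contains.

```python
import tarfile


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its member names.

    Mode "r:bz2" opens a bzip2-compressed tar archive for reading, which
    corresponds to the "xjf" flags of the tar command above.
    """
    with tarfile.open(archive_path, "r:bz2") as archive:
        archive.extractall(dest_dir)
        return archive.getnames()


# Example (assumes sim_100.tar.bz2 is in the working directory):
# members = extract_archive("sim_100.tar.bz2")
```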
There are 30 model conditions with 100 taxa each, with long, medium, or short gap length types. Each model condition is identified by a string of the form "<number of taxa><gap length type: L for long, M for medium, S for short><id number>". Each model condition has 20 replicate datasets.
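The naming scheme above can be decoded programmatically when looping over model conditions. The sketch below is illustrative only: the names ModelCondition and parse_model_condition are hypothetical, and it assumes the three fields are written with no separators and that the id number consists of decimal digits. The name "100M3" in the example is made up to show the format and is not taken from the dataset listing.

```python
import re
from typing import NamedTuple

# Gap length codes as described in the text above.
GAP_LENGTH_TYPES = {"L": "long", "M": "medium", "S": "short"}

# Assumed layout: digits, one gap length letter, digits, no separators.
_NAME_PATTERN = re.compile(r"^(\d+)([LMS])(\d+)$")


class ModelCondition(NamedTuple):
    num_taxa: int
    gap_length: str
    id_number: int


def parse_model_condition(name):
    """Split a model condition string into its three fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("not a model condition string: %r" % (name,))
    taxa, gap_code, id_number = match.groups()
    return ModelCondition(int(taxa), GAP_LENGTH_TYPES[gap_code], int(id_number))


# Example with a made-up name:
# parse_model_condition("100M3")
```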
After extracting the compressed file sim_100.tar.bz2, the directory structure <model condition parameter string>/R<replicate number>/ will have the following files:
There are 4 biological datasets. For each dataset, an associated link to a tarball is listed below. The tarball contains a single directory <dataset name>/R0/ that contains the following files:
rRNA datasets:
For reference purposes, the original Gutell lab CRW datasets can be accessed at the following links. Please note that these original datasets differ from the cleaned datasets above, which were the datasets actually used in the experiments. Furthermore, these original uncleaned curated alignments are not in FASTA format, and are instead in a more verbose GenBank format explained here.
Original uncleaned datasets: