fdiscboot |
To carry out a bootstrap (or jackknife, or permutation test) with some method in the package, you may need to use three programs. First, run SEQBOOT to take the original data set and produce a large number of bootstrapped or jackknifed data sets (somewhere between 100 and 1000 is usually adequate). Then find the phylogeny estimate for each of these, using the particular method of interest. For example, if you were using DNAPARS you would first run SEQBOOT and make a file with 100 bootstrapped data sets. You would then give this file the proper name so that it serves as the input file for DNAPARS. Running DNAPARS with the M (Multiple Data Sets) menu choice and telling it to expect 100 data sets, you would generate a large output file as well as a treefile containing the trees from the 100 data sets. This treefile could then be renamed so that it serves as the input for CONSENSE. When CONSENSE is run, the majority rule consensus tree results, showing the outcome of the analysis.
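
The first step above, producing the bootstrapped data sets, can be illustrated with a minimal sketch. It assumes the data are held as a mapping from species name to a string of character states, and it resamples characters (columns) with replacement. The function name `bootstrap_replicates` and the data layout are hypothetical, chosen for illustration; this is not the SEQBOOT implementation and it does not read or write PHYLIP files.

```python
import random


def bootstrap_replicates(matrix, reps=100, seed=1):
    """Yield `reps` bootstrap replicates of a discrete character matrix.

    `matrix` maps a species name to a string of character states; every
    string must have the same length. Each replicate is built by drawing
    character positions (columns) with replacement, so it keeps the
    original number of characters.
    """
    lengths = {len(states) for states in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all species must have the same number of characters")
    n_chars = lengths.pop()
    rng = random.Random(seed)  # fixed seed makes the replicates reproducible
    for _ in range(reps):
        # The same sampled columns are applied to every species, so each
        # character stays intact across the rows.
        columns = [rng.randrange(n_chars) for _ in range(n_chars)]
        yield {
            name: "".join(states[c] for c in columns)
            for name, states in matrix.items()
        }
```

Each replicate has the same species and the same number of characters as the original, and would be analysed by the tree-building program in the same way as the original data set.
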
This may sound tedious, but the run of CONSENSE is fast, and that of SEQBOOT is fairly fast, so the whole procedure will not actually take any longer than a run of a single bootstrap program with the same original data and the same number of replicates. It is not very hard, and it allows bootstrapping or jackknifing with many of the methods in this package. The same steps are necessary with all of them. Done this way, some of the intermediate files (the tree file from the DNAPARS run, for example) can also be used to summarize the results of the bootstrap in ways other than the majority rule consensus method.
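
As an illustration of summarizing the replicate trees, the sketch below counts how often each group of species appears across the trees and keeps the groups found in more than half of them, which is the idea behind a majority rule consensus. It assumes each tree has already been reduced to a collection of clades, each clade being a set of species names; parsing a tree file is not shown. The names `clade_support` and `majority_rule` are hypothetical, and this is not how CONSENSE itself is implemented.

```python
from collections import Counter


def clade_support(trees):
    """Return {clade: fraction of trees that contain it}.

    `trees` is a sequence of trees, each given as an iterable of clades;
    a clade is any collection of species names.
    """
    normalized = [{frozenset(clade) for clade in tree} for tree in trees]
    if not normalized:
        raise ValueError("at least one tree is required")
    counts = Counter(clade for tree in normalized for clade in tree)
    return {clade: n / len(normalized) for clade, n in counts.items()}


def majority_rule(trees):
    """Keep only the clades present in more than half of the trees."""
    return {c: s for c, s in clade_support(trees).items() if s > 0.5}
```

The full table returned by `clade_support` is one example of a summary other than the consensus tree: it reports the support for every group seen, including those that fall below the majority threshold.
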
If you are using the Distance Matrix programs, you will have to add one extra step: calculating a distance matrix from each of the replicate data sets, using DNADIST or GENDIST. So (for example) you would run SEQBOOT, then run DNADIST using the output of SEQBOOT as its input, then run (say) NEIGHBOR using the output of DNADIST as its input, and then run CONSENSE using the tree file from NEIGHBOR as its input.
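
The order of programs in the two workflows described above can be written down as a small sketch. The function `bootstrap_pipeline` is hypothetical and only lists the stages with the source of each stage's input; it does not run any program or rename any file.

```python
def bootstrap_pipeline(tree_program, distance_program=None):
    """Return the ordered (program, input source) stages of a bootstrap run.

    `distance_program` is given only for distance matrix methods, where a
    distance matrix must be computed from every replicate data set before
    the tree-building program is run.
    """
    programs = ["SEQBOOT"]
    if distance_program is not None:
        programs.append(distance_program)
    programs += [tree_program, "CONSENSE"]
    # Each stage after the first reads what the previous stage wrote.
    sources = ["original data set"] + [f"output of {p}" for p in programs[:-1]]
    return list(zip(programs, sources))
```

For example, `bootstrap_pipeline("NEIGHBOR", "DNADIST")` lists four stages, while `bootstrap_pipeline("DNAPARS")` lists three.
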
The resampling methods available are:
Andrew Rambaut's BEAST XML format | http://evolve.zoo.ox.ac.uk/beast/introXML.html and http://evolve.zoo.ox.ac.uk/beast/referenindex.html | A format for alignments. There is also a format for phylogenies described there. |
MSAML | http://xml.coverpages.org/msaml-desc-dec.html | Defined by Paul Gordon of University of Calgary. See his big list of molecular biology XML projects. |
BSML | http://www.bsml.org/resources/default.asp | Bioinformatic Sequence Markup Language includes a multiple sequence alignment XML format |
Standard (Mandatory) qualifiers:
  [-infile]            discretestates (no help text) discretestates value
  [-outfile]           outfile    Output file name
  [-outancfile]        outfile    Out ancestor file name
  [-outmixfile]        outfile    Out mix file name
  [-outfactfile]       outfile    Out fact file name

Additional (Optional) qualifiers (* if not always prompted):
   -mixfile            properties File of mixtures
   -ancfile            properties File of ancestors
   -weights            properties Weights file
   -factorfile         properties Factors file
   -test               menu       Choose test
*  -regular            toggle     Altered sampling fraction
*  -fracsample         float      Samples as percentage of sites
*  -morphseqtype       menu       Output format
*  -blocksize          integer    Block size for bootstraping
*  -reps               integer    How many replicates
*  -justweights        menu       Write out datasets or just weights
*  -seed               integer    Random number seed between 1 and 32767 (must be odd)
   -printdata          boolean    Print out the data at start of run
*  -[no]dotdiff        boolean    Use dot-differencing
   -[no]progress       boolean    Print indications of progress of run

Advanced (Unprompted) qualifiers: (none)

Associated qualifiers:

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   "-outancfile" associated qualifiers
   -odirectory3        string     Output directory

   "-outmixfile" associated qualifiers
   -odirectory4        string     Output directory

   "-outfactfile" associated qualifiers
   -odirectory5        string     Output directory

General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More information on associated and general qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report deaths
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Standard (Mandatory) qualifiers | | | |
[-infile] (Parameter 1) | (no help text) discretestates value | Discrete states file | |
[-outfile] (Parameter 2) | Output file name | Output file | <sequence>.fdiscboot |
[-outancfile] (Parameter 3) | Out ancestor file name | Output file | |
[-outmixfile] (Parameter 4) | Out mix file name | Output file | |
[-outfactfile] (Parameter 5) | Out fact file name | Output file | |
Additional (Optional) qualifiers | | | |
-mixfile | File of mixtures | Property value(s) | |
-ancfile | File of ancestors | Property value(s) | |
-weights | Weights file | Property value(s) | |
-factorfile | Factors file | Property value(s) | |
-test | Choose test | | b |
-regular | Altered sampling fraction | Toggle value Yes/No | No |
-fracsample | Samples as percentage of sites | Number from 0.100 to 100.000 | 100.0 |
-morphseqtype | Output format | | p |
-blocksize | Block size for bootstrapping | Integer 1 or more | 1 |
-reps | How many replicates | Integer 1 or more | 100 |
-justweights | Write out datasets or just weights | | d |
-seed | Random number seed between 1 and 32767 (must be odd) | Integer from 1 to 32767 | 1 |
-printdata | Print out the data at start of run | Boolean value Yes/No | No |
-[no]dotdiff | Use dot-differencing | Boolean value Yes/No | Yes |
-[no]progress | Print indications of progress of run | Boolean value Yes/No | Yes |
Advanced (Unprompted) qualifiers | | | |
(none) | | | |
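
The constraints listed for the qualifiers can be checked before a run is launched. The sketch below builds an argument list for `fdiscboot` from a few of the qualifiers documented above and validates `-reps`, `-blocksize` and `-seed` against their stated ranges. The helper name `fdiscboot_args` and the file names in the usage note are hypothetical, and passing each qualifier and its value as separate tokens is an assumption about how the command would be invoked; the function only constructs the list and does not execute anything.

```python
def fdiscboot_args(infile, outfile, reps=100, seed=1, blocksize=1, auto=True):
    """Build an fdiscboot argument list after validating the values.

    The ranges follow the qualifier table: -reps and -blocksize are
    integers of 1 or more, and -seed is an odd integer from 1 to 32767.
    """
    if reps < 1:
        raise ValueError("reps must be 1 or more")
    if blocksize < 1:
        raise ValueError("blocksize must be 1 or more")
    if not 1 <= seed <= 32767 or seed % 2 == 0:
        raise ValueError("seed must be an odd integer between 1 and 32767")
    args = [
        "fdiscboot",
        "-infile", infile,
        "-outfile", outfile,
        "-reps", str(reps),
        "-seed", str(seed),
    ]
    if blocksize != 1:  # 1 is the documented default, so it is omitted
        args += ["-blocksize", str(blocksize)]
    if auto:
        args.append("-auto")  # general qualifier: turn off prompts
    return args
```

For instance, `fdiscboot_args("chars.dat", "boot.out", reps=200, seed=4333)` returns a list that could be handed to a process launcher, while an even seed such as 4332 raises `ValueError` before anything is run.
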
Program name | Description |
---|---|
ednacomp | DNA compatibility algorithm |
ednadist | Nucleic acid sequence Distance Matrix program |
ednainvar | Nucleic acid sequence Invariants method |
ednaml | Estimates phylogenies from nucleic acid sequence Maximum Likelihood |
ednamlk | Estimates phylogenies from nucleic acid sequence Maximum Likelihood with molecular clock |
ednapars | DNA parsimony algorithm |
ednapenny | Penny algorithm for DNA |
eprotdist | Protein distance algorithm |
eprotpars | Protein parsimony algorithm |
erestml | Restriction site Maximum Likelihood method |
eseqboot | Bootstrapped sequences algorithm |
fdnacomp | DNA compatibility algorithm |
fdnadist | Nucleic acid sequence Distance Matrix program |
fdnainvar | Nucleic acid sequence Invariants method |
fdnaml | Estimates nucleotide phylogeny by maximum likelihood |
fdnamlk | Estimates nucleotide phylogeny by maximum likelihood |
fdnamove | Interactive DNA parsimony |
fdnapars | DNA parsimony algorithm |
fdnapenny | Penny algorithm for DNA |
fdolmove | Interactive Dollo or Polymorphism Parsimony |
ffreqboot | Bootstrapped genetic frequencies algorithm |
fproml | Protein phylogeny by maximum likelihood |
fpromlk | Protein phylogeny by maximum likelihood |
fprotdist | Protein distance algorithm |
fprotpars | Protein parsimony algorithm |
frestboot | Bootstrapped restriction sites algorithm |
frestdist | Computes distance matrix from restriction sites or fragments |
frestml | Restriction site maximum Likelihood method |
fseqboot | Bootstrapped sequences algorithm |
fseqbootall | Bootstrapped sequences algorithm |
Although we take every care to ensure that the results of the EMBOSS version are identical to those from the original package, we recommend that you check your inputs give the same results in both versions before publication.
Please report all bugs in the EMBOSS version to the EMBOSS bug team, not to the original author.