TAXODIUM utility web-page

(formerly TAXON)

Download executables, version 1.2

 

Windows

Mac

Linux

 

Description

The TAXODIUM utility builds three-item statement (3TS) matrices from binary, ordered multistate, and unordered multistate characters, with fractional or uniform weighting of the resulting statements.

Command line interface

TAXODIUM requires no installation. Run the program without arguments to see the command reference.

The first argument must always be the name of the CSV file containing the input matrix. One or more options may follow the input file name, in any order. Table 1 lists the available options.

 

Table 1 TAXODIUM v1.2 options

Option   Description

Input symbols
-ib      input: binary (default)
-iom     input: ordered multistate
-ium     input: unordered multistate
-idna    input: DNA/RNA
-ip      input: protein

Conversion method
-m3      method: 3TS (default, G-conversion = the value of the outgroup exhaustive)

Output symbols
-ob      output: binary (default)
-om      output: multistate
-odna    output: DNA/RNA
-op      output: protein

Fractional weights and outgroups
-mus     unique statements per input statement only (default: off)
-fw      print fractional weights and save all 3TSs in matrix (default: off)
-og      print outgroup (in case of G-conversion, default: off)

Output formats
-phy     enable PHYLIP output (default: on if no other output selected)
-nex     enable NEXUS output (default: off)
-csv     enable CSV output (default: off)
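The calling convention described above (input file first, then any options in any order) can be sketched in Python as a small helper that assembles the argument list. This is only an illustration: build_command is a hypothetical helper and not part of TAXODIUM, the executable name is taken from Example 2, and the option names are copied from Table 1.

```python
# Option names copied from Table 1. build_command is a hypothetical helper
# for illustration only; it is not shipped with TAXODIUM.
KNOWN_OPTIONS = {
    "-ib", "-iom", "-ium", "-idna", "-ip",   # input symbols
    "-m3",                                   # conversion method
    "-ob", "-om", "-odna", "-op",            # output symbols
    "-mus", "-fw", "-og",                    # fractional weights and outgroups
    "-phy", "-nex", "-csv",                  # output formats
}


def build_command(executable, input_csv, options=()):
    """Return an argument list: executable, input file, then the options."""
    unknown = [opt for opt in options if opt not in KNOWN_OPTIONS]
    if unknown:
        raise ValueError(f"options not listed in Table 1: {unknown}")
    # The input file name must be the first argument; option order is free.
    return [str(executable), str(input_csv), *options]


cmd = build_command("taxodium.exe", "input.csv", ["-idna", "-ob", "-nex"])
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where the executable is available.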

 

Example 1

An example of the input matrix format is shown below:

 

taxonA,0,0

taxonB,=,0

taxonC,>,3

taxonD,@,4

taxonE,@,6

 

The first (leftmost) column contains the names of the taxa; all following columns contain characters. The symbols allowed for each input option are shown in Table 2.

 

Table 2 Input file symbols

Input option           Symbols

Binary                 0 1
Ordered multistate     0 1 2 3 4 5 6 7 8 9 : < = > @ A B C D E F G H I J K
Unordered multistate   0 1 2 3 4 5 6 7 8 9 : < = > @ A B C D E F G H I J K
DNA/RNA                A C G T U R Y S W K M B D H V
Protein                A C D E F G H I K L M N P Q R S T V W Y
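Before running the utility, an input file can be checked against Table 2. The following Python sketch is one way to do that; it is not part of TAXODIUM. The function name find_invalid_symbols is hypothetical, and the sketch assumes that each cell holds exactly one symbol, that symbols are case-sensitive, and that a line whose first field is the keyword Out is not a taxon row.

```python
import csv
import io

# Allowed symbols per input option, copied from Table 2 and keyed by the
# matching input option from Table 1.
MULTISTATE = set("0123456789:<=>@ABCDEFGHIJK")
ALLOWED = {
    "-ib": set("01"),
    "-iom": MULTISTATE,
    "-ium": MULTISTATE,
    "-idna": set("ACGTURYSWKMBDHV"),
    "-ip": set("ACDEFGHIKLMNPQRSTVWY"),
}


def find_invalid_symbols(csv_text, input_option="-ib"):
    """Return (taxon, column, symbol) for every cell not allowed by Table 2."""
    allowed = ALLOWED[input_option]
    problems = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumption: blank lines and the "Out" line are not taxon rows.
        if not row or row[0] == "Out":
            continue
        taxon, cells = row[0], row[1:]
        for column, symbol in enumerate(cells, start=1):
            if symbol not in allowed:
                problems.append((taxon, column, symbol))
    return problems


EXAMPLE_1 = "taxonA,0,0\ntaxonB,=,0\ntaxonC,>,3\ntaxonD,@,4\ntaxonE,@,6\n"
print(find_invalid_symbols(EXAMPLE_1, "-iom"))  # → []
```

Checked with "-ib" instead, the same matrix yields seven invalid cells, the first being ("taxonB", 1, "=").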

 

Additionally, the input file can name a predefined outgroup taxon. This must always be the last line of the input file, in the following format:

Out,taxonB

In the example above, “Out” is a reserved keyword; no taxon in an input file may use it as its name. “taxonB” is the name of the outgroup taxon. If an outgroup taxon is specified in the input file, the utility performs the following operations for each input character individually:

1.   Find the symbol that the requested outgroup taxon (taxonB in this example) carries for that character.

2.   Write output statements only if their outgroup matches the symbol found in step 1.
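The two steps above can be illustrated with a short Python sketch. It is not TAXODIUM's implementation: read_matrix, outgroup_symbols, and keep_statement are hypothetical names, and the way a statement's outgroup symbol is passed to keep_statement is an assumption made only for the example.

```python
import csv
import io


def read_matrix(csv_text):
    """Split CSV text into ({taxon: [symbols]}, outgroup name or None)."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    outgroup = None
    # "Out" is the reserved keyword and may only appear on the last line.
    if rows and rows[-1][0] == "Out":
        outgroup = rows[-1][1]
        rows = rows[:-1]
    matrix = {row[0]: row[1:] for row in rows}
    return matrix, outgroup


def outgroup_symbols(matrix, outgroup):
    """Step 1: the outgroup taxon's symbol for each character."""
    if outgroup not in matrix:
        raise KeyError(f"outgroup taxon {outgroup!r} is not in the matrix")
    return list(matrix[outgroup])


def keep_statement(statement_outgroup_symbol, character_index, symbols):
    """Step 2 (hypothetical representation): keep a statement only if its
    outgroup symbol equals the symbol found in step 1 for that character."""
    return statement_outgroup_symbol == symbols[character_index]


TEXT = "taxonA,0,0\ntaxonB,=,0\ntaxonC,>,3\ntaxonD,@,4\ntaxonE,@,6\nOut,taxonB\n"
matrix, outgroup = read_matrix(TEXT)
symbols = outgroup_symbols(matrix, outgroup)
print(outgroup, symbols)  # → taxonB ['=', '0']
```

With this input, a statement for the first character would be kept only if its outgroup symbol is "=", and for the second character only if it is "0".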

Example 2

G-conversion of a standard DNA matrix into a binary 3TS matrix, written in simplified NEXUS format with the outgroup added and all 3TSs fractionally weighted:

 

taxodium.exe input.csv -idna -ob -og -fw -nex

 

Please note that the command line interface may change in future versions. Please see the documentation provided with each version of the utility for complete details.

 

Limitations and performance

Currently, the input matrix may contain at most 5000 taxa and 100000 characters. These limits can be changed in the source code if necessary. The output matrix is built entirely in RAM before being written to disk. If the computer has enough RAM to hold the entire output matrix, processing runs at full speed. If it does not, a typical operating system (such as Windows or Linux) will fall back on disk swapping, which slows processing severely but still lets the program finish. Finally, if the swap file is also too small, TAXODIUM reports a memory allocation error and shows the amount of memory required to hold the output matrix. In that case, increase the size of the swap file and rerun the utility.
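To get a feeling for how quickly the output matrix can grow, the following Python sketch estimates its size for binary input. It rests on several assumptions that are not taken from TAXODIUM itself: a binary character in which k of n taxa share state 1 is counted as yielding C(k, 2) * (n - k) statements (one polarity only), every statement becomes one column of a taxa-by-statements matrix, and each cell occupies one byte. The function names and the bytes_per_cell parameter are hypothetical; the real memory footprint may differ.

```python
from math import comb

MAX_TAXA = 5000          # limits quoted in the text above
MAX_CHARACTERS = 100000


def count_statements(n_taxa, group_size):
    """Statements for one binary character: pairs of taxa sharing state 1,
    each combined with one taxon that has state 0 (one polarity only)."""
    if not 0 <= group_size <= n_taxa:
        raise ValueError("group_size must be between 0 and n_taxa")
    return comb(group_size, 2) * (n_taxa - group_size)


def estimate_output_bytes(n_taxa, group_sizes, bytes_per_cell=1):
    """Rough size of a taxa x statements matrix.

    group_sizes holds, for each character, the number of taxa with state 1.
    bytes_per_cell=1 is an assumption, not a documented TAXODIUM value.
    """
    if n_taxa > MAX_TAXA or len(group_sizes) > MAX_CHARACTERS:
        raise ValueError("input exceeds the documented limits")
    n_statements = sum(count_statements(n_taxa, k) for k in group_sizes)
    return n_taxa * n_statements * bytes_per_cell


# 100 taxa, 50 characters, each with 50 taxa in state 1.
print(estimate_output_bytes(100, [50] * 50))  # → 306250000
```

Under these assumptions, even a modest matrix of 100 taxa and 50 characters corresponds to roughly 306 MB, which shows why large inputs can push the program into swap.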

Contact

Evgeny_at_ufl_dot_edu