User Manual for SplitsTree4 V4.13.1
Contents
[25] D. H. Huson, M. A. Steel, and J. Whitfield. Reducing distortion in phylogenetic networks. In P. Bücher and B. M. E. Moret, editors, Algorithms in Bioinformatics, LNBI 4175, pages 150-161, 2006.
[26] P. J. Lockhart, M. A. Steel, and D. H. Huson. Invariable site models and their uses in phylogeny reconstruction. Syst. Biol., 49(2):225-232, 2000.
[27] D. R. Maddison, D. L. Swofford, and W. P. Maddison. NEXUS: an extendible file format for systematic information. Syst. Biol., 46(4):590-621, 1997.
[28] E. Mossel and M. Steel. A phase transition for a random cluster model on phylogenetic trees. Submitted, 2003.
[29] M. Nei and J. C. Miller. A simple method for estimating average number of nucleotide substitutions within and between populations from restriction data. Genetics, 125:873-879, 1990.
[30] N. Saitou and M. Nei. The Neighbor-Joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4:406-425, 1987.
[31] R. R. Sokal and C. D. Michener. A statistical method for evaluating systematic relationships. University of Kansas Scientific Bulletin, 28:1409-1438, 1958.
[32] M. A. Steel. Recovering a tree from the leaf colorations it generates under a Markov model. Appl. Math. Lett., 7(2):19-24, 1994.
[33] D. L. Swofford, G. J. Olsen, P. J. Waddell, and D. M. Hillis. Chapter 11: Phylogenetic inference. In D. M. Hillis, C. Moritz, and B. K. Mable, editors, Molecular Systematics, pages 407-514. Sin
If there are no non-gap and non-missing data states on one side of a split, then the other side must show only one state and is not allowed to have any gap or missing data state. After determining the supporting characters, press Show on All Taxa or Show on Selected Taxa to see the character states in the network view.

16.4 Distances Tab

The Pipeline:Distances tab has only one sub-tab. The Pipeline:Distances:Method sub-tab is used to choose and configure a method to apply to the current Distances block. See Section 22 for a description of all available methods.

16.5 Quartets Tab

The Pipeline:Quartets tab has only one sub-tab. The Pipeline:Quartets:Method sub-tab is used to choose and configure a method to apply to the current Quartets block. See Section 22 for a description of all available methods.

16.6 Trees Tab

The Pipeline:Trees tab has three sub-tabs. The Pipeline:Trees:Method sub-tab is used to choose and configure a method to apply to the current Trees block. See Section 22 for a description of all available methods. The Pipeline:Trees:Filter sub-tab is used to exclude trees from the analysis. The Pipeline:Trees:Select sub-tab is used to highlight a set of trees in the displayed network by selecting all splits in the network that are contained in the set of chosen trees.

16.7 Splits Tab

The Pipeline:Splits tab has two sub-tabs. The Pipeline:Splits:Method sub-tab is used to choose and configure a method to a
In menu item.
• The Zoom Out item is a shortcut to the View>Zoom Out menu item.
• The Reset item is a shortcut to the View>Reset menu item.
• The Reset Label Positions item resets all node labels to their default positions.
• The Restore All Taxa item restores all excluded taxa.
• The Restore All Splits item restores all excluded splits.
• The Set Window Size item can be used to set the size of the Main window.

14 Tool Bar

For easier access to frequently used options and methods, SplitsTree4 provides a tool bar at the top of the Main window. The tool bar can be configured using the Preferences:Toolbar tab. By default, the tool bar contains the following items: File>Open, File>Clone, File>Save As, File>Print, File>Export Image, Edit>Find/Replace, View>Reset, View>Zoom In, View>Zoom Out, View>Rotate Right, View>Rotate Left, Edit>Preferences, Window>Message Window, Draw>EqualAngle, Draw>RootedEqualAngle, Draw>Phylogram, Trees>Previous Tree and Trees>Next Tree.

15 Data Entry Dialog

The Data Entry dialog can be used to enter data by hand, or by copy and paste, in any of the file formats that the program supports (see Section 21).

16 Pipeline Window

The Pipeline window is accessed using the Analysis>Configure Pipeline item. It controls all aspects of the computational pipeline that SplitsTree4 uses. It is organized into nine tabs reflecting the order of the blocks
Register item is used to open the register dialog.
• The Window>Nexus Syntax submenu provides the syntax of all Nexus blocks that SplitsTree4 can process.
• The Window>Command Syntax item summarizes the syntax of the commands that can be used with the command-line mode version of the program, or that can be present in an input file.
• The Window>Enter a Command item allows entry of a single command.
• The Window>Message Window item opens the message window.

The bottom of the Window menu contains a list of all open windows.

12.11 Configuring Methods

By default, selecting a menu item that applies some method to the given data, such as Trees>NJ, will open a dialog in the Pipeline window, which one can use to configure the method by selecting appropriate options. For methods that have no options, or that are always used with the same options, selecting the "Don't show this dialog for this method again" checkbox will tell the program not to open the dialog in the future, but rather to apply the named method immediately. Note that configuration dialogs hidden in this way can still be accessed using the Analysis>Configure Recent Methods submenu.

13 Popup Menus

Right-clicking on the Main window will open a popup menu. There are three different menus that will appear, depending upon what is hit by the mouse click. If the mouse is clicked on a node of a network, then this opens the Node popup menu, which has
This method externally runs the ClustalW sequence alignment program [34].
Usage: ClustalW GapOpen=<int> GapExtension=<double> WeightMatrix=<java.lang.String> OptionalParameter=<java.lang.String> PathToCommand=<java.lang.String>
Input: Unaligned
Output: Characters

Coalescent
This method transforms a set of quartets to a set of splits representing a binary phylogenetic tree by applying the coalescent method described in [28].
Usage: Coalescent
Input: Quartets
Output: Splits

ConsensusNetwork
This method computes the consensus splits of trees [3, 18] to produce a consensus network.
Usage: ConsensusNetwork Threshold=<double> EdgeWeights=<String>
Input: Trees
Output: Splits

ConsensusTree
This method computes different types of consensus trees.
Usage: ConsensusTree EdgeWeights=<String> Method=<String>
Input: Trees
Output: Splits

ConvexHull
This method computes a splits graph using the convex hull extension algorithm [9].
Usage: ConvexHull Weights=<boolean> ScaleNodesMaxSize=<int>
Input: Splits
Output: Network

DNA2Splits
This method converts DNA characters to splits by setting the majority state against all other states.
Usage: DNA2Splits AddAllTrivial=<boolean> MinSplitWeight=<int>
Input: Characters
Output: Splits

DQuartets
This method computes all quartets with positive isolation index [2].
Usage: DQuartets threshold=<double>
In
a file
LOAD TREEFILES=file1 ... filen - load trees from one or more files
LOAD CHARFILES=file1 ... filen - concatenate sequences from one or more files
LOAD FILE=file - open or import a file
SAVE FILE=file REPLACE={YES|NO} APPEND={YES|NO} DATA={ALL|list-of-blocks} - save all data or named blocks to a file in Nexus format
EXPORT FILE=file FORMAT=format REPLACE={YES|NO} APPEND={YES|NO} DATA=list-of-blocks - export data in the named format
EXPORTGRAPHICS format={EPS|PNG|GIF|JPG|SVG} file=file REPLACE={YES|NO} TEXTASSHAPES={YES|NO} TITLE=title - export graphics in specified format (default is EPS)
UPDATE - rerun computations to bring data up to date
BOOTSTRAP RUNS=number-of-runs - perform bootstrapping on character data
DELETEEXCLUDED - delete all sites from characters block that are currently excluded
ASSUME assumption - set an assumption using any statement defined in the ST_ASSUMPTIONS block
SHOW DATA=list-of-blocks - show the named data blocks
CYCLE {KEEP|cycle} - set the graph layout cycle to KEEP or to a given cycle
HELP - show this info
HELP DATA=list-of-blocks - show syntax of named blocks
HELP TRANSFORM=transform - show usage of a specific data transformation
VERSION - report version
ABOUT - show info on version and authors
QUIT - exit program

To begin or end a multi-line input, enter a backslash '\'.

24 Examples

Example files are provided with the program. They are contained in the examples sub direct
a set of splits that are called weakly compatible, while Neighbor-Net produces a set of splits that is circular, as originally defined in [2].

A split network is a more general type of phylogenetic graph that can represent any collection of splits, whether incompatible or not [2]. For a compatible set of splits, it is always possible to represent each split by a single branch, and thus the resulting graph is a tree. In general, however, this will not be possible, and in a split network a whole band of parallel branches (also called parallel edges) is usually required to represent a single split. A phylogenetic tree is therefore a special case of a split network.

In SplitsTree4, if you click on any branch in the representation of a split network, then all branches corresponding to the same split will be highlighted. If one were to delete all branches corresponding to a given split S, then the remaining graph would consist of precisely two components G_A and G_B and, as above, we have S = A/B, where A is the set of all taxa contained in G_A and B is the set of all taxa contained in G_B.

A more detailed discussion of the fundamental concepts can be found in [22] and [21].

6 Opening, Reading and Writing Files

To open a file, select the File>Open menu item and then browse to the desired file. Alternatively, if the file was recently opened by the program, then it may be contained in the File>Open Recent submenu. The native file format of SplitsT
already been kept.
• The Data>Greedily Make Weakly Compatible item uses a greedy approach to make the splits in the Splits block weakly compatible: in decreasing order of weight, the algorithm adds the next split to the set of kept splits if it is weakly compatible with all splits that have already been kept.
• The Data>Exclude Selected Splits item removes all splits from the analysis whose edges in the network are selected. The Splits and Network blocks are modified accordingly. See also Draw>Hide Selected Splits.
• The Data>Restore All Splits item restores all splits that were excluded using the previous menu item.
• The Data>Filter Splits item opens the Pipeline:Splits:Filter tab, which can be used to interactively include or exclude splits from the analysis.
• The Data>Filter Trees item opens the Pipeline:Trees:Filter tab, which can be used to interactively include or exclude trees from the analysis.
• The Data>Set Tree Names item opens a dialog that allows one to set the tree names.

Distances Menu

The Distances menu contains the following items:
• The Distances>UncorrectedP item requests the program to compute distances using the UncorrectedP method.
• The Distances>LogDet item requests the program to compute distances using the LogDet method.
• The Distances>HKY85 item requests the program to compute distances using the HKY85 method.
• The Distances>JukesCantor item requests the program to compute distances using the JukesCa
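As an illustration of what the first and last of these distance items compute, here is a minimal sketch of an uncorrected p-distance and the Jukes-Cantor correction. It is not SplitsTree4's implementation: the function names are hypothetical, and skipping sites that contain a gap or missing-data symbol is an assumption made for this example. The correction itself is the textbook formula d = -(3/4) ln(1 - (4/3) p).

```python
import math


def uncorrected_p(seq1, seq2, ignore="-?"):
    """Proportion of compared sites at which two aligned sequences differ.

    Assumption for this sketch: sites where either sequence shows a gap
    or missing-data symbol (characters in `ignore`) are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq1, seq2):
        if x in ignore or y in ignore:
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an uncorrected p-distance p."""
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = uncorrected_p("ACGTACGT", "ACGTACGA")  # one difference in eight sites
d = jukes_cantor(p)
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site.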
interactively.
Show Scale Bar: The program uses a small scale bar in the top left-hand corner of the Network tab to indicate the scale of the network.
use Split Selection Mode: When this feature is on, clicking on an edge will select all edges that correspond to the same split.

Any choices made here can either be applied to the current document or can be made the default for all subsequently opened documents.

The Preferences:Defaults tab controls the default sizes, shapes, colors and fonts for nodes and edges (see also Format Nodes and Edges).

The Preferences:Layout tab controls aspects of the layout of trees and split networks:
Recompute: The taxon layout is recomputed each time the Splits block is modified.
Stabilize: Use the union of the set of current splits and of the previously computed splits to compute a taxon layout. This is useful when one wants to compare two different networks computed on the same taxon set: by switching back and forth between the two computations, the program will find a joint taxon layout for both graphs, if one exists. In this case, both graphs will be laid out in such a way that the taxa appear in approximately the same region of the canvas in both graphs.
Snowball: Similar to the previous method, this method uses all splits ever computed in a given dataset to produce a taxon layout. This doesn't work very well.
Use this layout: Use the provided taxon layout to draw the network.

The Preferences:Too
item is selected, then the query is interpreted as a Java regular expression. The scope can be global or restricted to items that are already selected. The direction in which the next match is searched for can be selected using the Forward and Backward buttons. Press the Close, Find First or Find Next button to close the dialog, or to find the first or next occurrence of the query, respectively. Press the Find All button to find all occurrences of the query. Press the Replace or Replace All button to replace the next or all occurrences of the query.

The search can be applied to different targets:
• Nodes: search all node labels
• Edges: search among edge labels
• Source: search in the source tab
• Messages: search among text in the Messages window

Press the From File button to load a set of queries, one per line, from a file.

19.5 Format Nodes and Edges Window

The Format Nodes and Edges window is opened using the View>Format Nodes and Edges item. Its purpose is to modify the appearance of nodes and their labels, and of edges and their labels, in the displayed network. The first row of items is used to set the font, choosing the font family, italics and bold. The second row of items is used to set the color of nodes and edges. The check boxes Line Color, Fill Color, Label Color and Label Fill Color determine which color features are set. The Color Chooser can be used to set a specific color. Use the Random Colors or Inv
make the program select all characters that support the selected split or splits, that is, all characters for which the character states are different on both sides of the split and constant on at least one side of the split.
• The Draw>Hide Selected Splits item can be used to remove all selected splits directly from the depicted network. This does not change the Splits block. See also Data>Exclude Selected Splits.
• The Draw>Hide Non-Selected Splits item can be used to remove all non-selected splits directly from the depicted network. This does not change the Splits block. See also Data>Exclude Selected Splits.
• The Draw>Hide Incompatible Splits item can be used to remove all splits that are incompatible with the set of currently selected splits directly from the depicted network. This does not change the Splits block.
• The Draw>Redraw All Splits item redraws the network using all splits, including those excluded using the previous menu item.
• The Draw>Select Trees item opens the Pipeline:Trees:Select tab, which can be used to select all splits in a given network that are contained in a given set of selected input trees.

12.10 Window Menu

The Window menu contains the following items:
• The Window>About item shows information about the version of SplitsTree4. Under MacOS, this item is found in the SplitsTree menu.
• The Window>How to Cite item gives instructions on how to cite the program.
• The Window
syntax:

BEGIN TAXA;
DIMENSIONS NTAX=number-of-taxa;
TAXLABELS taxon_1 taxon_2 ... taxon_ntax;
TAXINFO info_1 info_2 ... info_ntax;
END;

The TAXLABELS statement is optional if it is followed by a source block that contains all taxa labels.

20.2 Unaligned Block

The Unaligned block contains unaligned sequences. It has the following syntax:

BEGIN UNALIGNED;
DIMENSIONS NTAX=number-of-taxa;
FORMAT
  DATATYPE={STANDARD|DNA|RNA|NUCLEOTIDE|PROTEIN}
  RESPECTCASE
  MISSING=symbol
  SYMBOLS="symbol symbol ..."
  LABELS={LEFT|NO};
MATRIX
  taxonlabel1 sequence
  taxonlabel2 sequence
  ...
  taxonlabelN sequence
;
END;

20.3 Characters Block

The Characters block contains aligned character sequences. It has the following syntax:

BEGIN CHARACTERS;
DIMENSIONS NTAX=number-of-taxa NCHAR=number-of-characters;
FORMAT
  DATATYPE={STANDARD|DNA|RNA|PROTEIN}
  RESPECTCASE
  MISSING=symbol
  GAP=symbol
  SYMBOLS="symbol symbol ..."
  LABELS={NO|LEFT}
  TRANSPOSE={NO|YES}
  INTERLEAVE={NO|YES}
  TOKENS={NO|YES};
CHARWEIGHTS wgt_1 wgt_2 ... wgt_nchar;
CHARSTATELABELS character-number character-name state-name state-name ...;
MATRIX
  sequence data in specified format
;
END;

20.4 Distances Block

The Distances block contains a matrix of pairwise distances. It has the following syntax:

BEGIN DISTANCES;
DIMENSIONS NTAX=number-of-taxa;
FORMAT
  TRIANGLE={LOWER|UPPER|BO
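To show how the Taxa and Characters block syntax fits together, the following sketch writes a small alignment as Nexus text. It is an illustration only: the helper name `to_nexus` is hypothetical, the punctuation follows general Nexus conventions (statements ending in a semicolon, keyword=value pairs), and whether SplitsTree4 accepts exactly this output has not been verified here. Taxon labels are assumed to contain no whitespace or punctuation.

```python
def to_nexus(sequences, datatype="DNA", missing="?", gap="-"):
    """Render aligned sequences as Nexus text with TAXA and CHARACTERS blocks.

    `sequences` maps taxon labels to aligned sequences of equal length.
    Labels are assumed to be simple tokens (no spaces or punctuation).
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    labels = list(sequences)
    nchar = len(sequences[labels[0]])
    if any(len(seq) != nchar for seq in sequences.values()):
        raise ValueError("all sequences must have the same length")

    lines = [
        "#NEXUS",
        "",
        "BEGIN TAXA;",
        f"DIMENSIONS NTAX={len(labels)};",
        "TAXLABELS " + " ".join(labels) + ";",
        "END;",
        "",
        "BEGIN CHARACTERS;",
        f"DIMENSIONS NTAX={len(labels)} NCHAR={nchar};",
        f"FORMAT DATATYPE={datatype} MISSING={missing} GAP={gap} LABELS=LEFT;",
        "MATRIX",
    ]
    # One row per taxon: label on the left, then the aligned sequence.
    lines += [f"{label} {sequences[label]}" for label in labels]
    lines += [";", "END;"]
    return "\n".join(lines) + "\n"


nexus_text = to_nexus({"taxonA": "ACGT-A", "taxonB": "ACGTTA", "taxonC": "AC?TTA"})
```

Text produced this way could be pasted into the Data Entry dialog or saved to a file with a .nex suffix and opened with File>Open.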
2 [NO] CODON3;
EXTREES={NONE|list-of-original-tree-labels};
LAYOUTSTRATEGY={STABILIZE|SNOWBALL|KEEP};
[NO] AUTOLAYOUTNODELABELS;
UPTODATE;
END;

21 File Formats

By default, SplitsTree4 reads and writes data in Nexus format. The program can read and export the following additional formats: FastA, Phylip and ClustalW.

Name        File suffix         Data type
new Nexus   .nex, .nxs          taxa, unaligned sequences, aligned sequences (characters), distances, quartets, trees, splits, networks
old Nexus   .nex, .nxs          aligned characters, distances, trees [27]
FastA       .fa, .fasta         unaligned sequences or aligned characters
Phylip      .phy, .dst, .dist   aligned characters or distances [13]
ClustalW    .aln                aligned characters [34]

22 All Methods

Here we list all methods supported by SplitsTree4. We describe the usage and list the input and output Nexus blocks.

Binary2Splits
This method converts binary characters to splits.
Usage: Binary2Splits AddAllTrivial=<boolean> MinSplitWeight=<int>
Input: Characters
Output: Splits

BioNJ
This method computes the Bio-NJ tree [15].
Usage: BioNJ
Input: Distances
Output: Trees

BunemanQuartets
This method computes all quartets with positive Buneman index [8].
Usage: BunemanQuartets threshold=<double>
Input: Distances
Output: Quartets

BunemanTree
This method computes the Buneman tree [8].
Usage: BunemanTree
Input: Distances
Output: Splits

ClustalW
Method_2=<String> DistanceMeasure_1=<String> DistanceMeasure_2=<String> ObjectiveScore=<String> LogFile=<boolean> LogFileName=<String> OptionalParameter=<String> PathToCommand=<String>
Input: Unaligned
Output: Characters

NeiMiller
This method computes distances from restriction sites using the Nei and Miller method [29].
Usage: NeiMiller
Input: Characters
Output: Distances

NJ
This method computes the Neighbor-Joining tree [30].
Usage: NJ
Input: Distances
Output: Trees

NeighborNet
This method computes the Neighbor-Net splits [7] to produce a Neighbor-Net network.
Usage: NeighborNet Variance=<String> Minimize_AIC=<boolean>
Input: Distances
Output: Splits

NoAlign
This method obtains a trivial sequence alignment by adding gaps to the end of each sequence to make all sequences have the same length.
Usage: NoAlign
Input: Unaligned
Output: Characters

NoGraph
This method prevents the program from constructing a final network.
Usage: NoGraph
Input: Splits
Output: Network

NoSplits
This method prevents the program from constructing splits from a set of trees.
Usage: NoSplits
Input: Trees
Output: Splits

ParsimonySplits
This method computes the set of parsimony splits [2].
Usage: ParsimonySplits
Input: Characters
Output: Splits

PhylipParsimony
This method computes the maximum parsimony tree from DNA sequences using an external call to t
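The NoAlign method described in this list is simple enough to sketch directly. The code below is an illustration rather than the program's own code: the function name `no_align` is hypothetical and the gap symbol "-" is an assumed default.

```python
def no_align(sequences, gap="-"):
    """Pad every sequence with trailing gaps up to the longest length.

    `sequences` maps taxon labels to unaligned sequences. The result has
    the same labels, with all sequences of equal length.
    """
    width = max((len(seq) for seq in sequences.values()), default=0)
    return {
        label: seq + gap * (width - len(seq))
        for label, seq in sequences.items()
    }


padded = no_align({"taxonA": "ACGTAC", "taxonB": "ACG", "taxonC": "ACGT"})
```

The result is only a trivial alignment: homologous sites are not matched up, so it is mainly useful when the input sequences are already aligned except for their ends.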
{NO|YES} ALL={YES|NO};
RUNS=the-number-of-runs;
LENGTH={sample-length|SAME};
SEED=random-number-seed;
SAVEWEIGHTS={yes|no};
FIXSPLITS={yes|no};
COMPUTEDIST={yes|no};
OUTPUTFILE=file-name;
MATRIX
  label_1 value_1 split_1
  label_2 value_2 split_2
  ...
  label_nsplits value_nsplits split_nsplits
  label_nsplits+1 value_nsplits+1 splits_nsplits+1
  ...
  label_n value_n splits_n
;
END;

20.10 Sets Block

The Sets block can be used to define sets of taxa or characters. It has the following syntax:

BEGIN Sets;
TAXSET taxset-name = taxon-list;
CHARSET charset-name = character-list;
CHARPARTITION charpart-name = 1 charset-name character-list;
END;

20.11 ST_Assumptions Block

The ST_Assumptions block controls the processing of the data along the pipeline. It contains all choices made by the user that affect computations. It has the following syntax:

BEGIN ST_ASSUMPTIONS;
UNALIGNTRANSFORM=name parameters;
CHARTRANSFORM=name parameters;
DISTTRANSFORM=name parameters;
SPLITSTRANSFORM=name parameters;
SPLITSPOSTPROCESS
  [NO] LEASTSQUARES
  FILTER={GREEDYCOMPATIBLE|WEAKLYCOMPATIBLE|WEIGHT VALUE=value|CONFIDENCE VALUE=value|DIMENSION VALUE=value|NONE};
EXTAXA={NONE|list-of-original-taxa-labels};
EXCHAR={NONE|list-of-original-char-positions};
EXCLUDE [NO] GAPS [NO] NONPARSIMONY {[NO] CONSTANT|CONSTANT number} [NO] CODON1 [NO] CODON
TH}
  [NO] DIAGONAL
  LABELS={LEFT|NO};
MATRIX
  distance data in specified format
;
END;

20.5 Quartets Block

The Quartets block contains a list of quartets. It has the following syntax:

BEGIN Quartets;
DIMENSIONS NTAX=number-of-taxa NQUARTETS=number-of-quartets;
FORMAT
  LABELS={LEFT|NO}
  WEIGHTS={YES|NO};
MATRIX
  label1 weight1 a1 b1 c1 d1
  ...
  labeln weightn an bn cn dn
;
END;

20.6 Trees Block

The Trees block contains a list of phylogenetic trees. It has the following syntax:

BEGIN Trees;
PROPERTIES PARTIALTREES={YES|NO};
TRANSLATE
  nodeLabel1 taxon1,
  nodeLabel2 taxon2,
  ...
  nodeLabelN taxonN
;
TREE name1 = tree1-in-Newick-format;
TREE name2 = tree2-in-Newick-format;
...
TREE nameM = treeM-in-Newick-format;
END;

20.7 Splits Block

The Splits block contains a list of splits. It has the following syntax:

BEGIN Splits;
DIMENSIONS NTAX=number-of-taxa NSPLITS=number-of-splits;
FORMAT
  LABELS={LEFT|NO}
  WEIGHTS={YES|NO}
  CONFIDENCES={YES|NO}
  INTERVALS={YES|NO};
THRESHOLD=non-negative-number;
PROPERTIES
  FIT=non-negative-number
  leastsquares
  {COMPATIBLE|CYCLIC|WEAKLY COMPATIBLE|INCOMPATIBLE};
CYCLE taxon_i_1 taxon_i_2 ... taxon_i_ntax;
SPLITSLABELS label_1 label_2 ... label_nsplits;
MATRIX
  label_1 weight_1 confidence_1 split_1,
  label_2 weight_2 confidence_2 split_2,
  ...
  label_nsplits weight_nspli
This item opens a dialog for opening multiple files. The program parses all given files, extracts any trees found in them, and opens a new document containing all trees found.
• The File>Tools>Load Multi-Labeled Tree item can be used to read a single tree containing multiple labels and to convert it into a single-labeled split network. It will be made available upon publication of the paper describing this new method.
• The File>Tools>Concatenate Sequences item can be used to concatenate sequences from different files into one document. It requires that each of the input files uses exactly the same set of taxon labels.
• The File>Tools>Group Identical Haplotypes tool is used to detect multiple identical sequences or taxa. These are recognised from the Characters block or, if there is no Characters block, from the Distances block. Only selected sites are used to test whether two taxa are distinguishable. A new document is created in which identical taxa are collapsed into a single taxon with label TYPEx (x = 1, 2, 3, ...), and the original taxa are stored in the info field of the Taxa block. Note that files with hundreds or thousands of identical sequences can cause SplitsTree to stall. One solution is to cancel the computations after the data is read in, and then to use this tool to create a new, smaller document containing only distinct sequences.
• The File>Print item is used to print the current network.
• The File>Quit item quits the progra
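The grouping step performed by the Group Identical Haplotypes tool can be sketched as follows. This is a simplified illustration, not the program's implementation: the function name `group_identical` and the `sites` parameter are hypothetical, sequences are compared as exact strings (how SplitsTree4 treats gaps and missing data when comparing is not covered here), and groups are numbered in order of first appearance.

```python
def group_identical(sequences, sites=None):
    """Collapse taxa with identical sequences into groups TYPE1, TYPE2, ...

    `sequences` maps taxon labels to aligned sequences. If `sites` is
    given, only those 0-based positions are compared, mirroring the idea
    that only selected sites are used to distinguish taxa.
    """
    groups = {}
    for label, seq in sequences.items():
        key = seq if sites is None else "".join(seq[i] for i in sites)
        groups.setdefault(key, []).append(label)
    # Python dicts keep insertion order, so numbering follows first appearance.
    return {
        f"TYPE{number}": members
        for number, members in enumerate(groups.values(), start=1)
    }


example = {"t1": "ACGT", "t2": "ACGT", "t3": "ACGA", "t4": "ACGT"}
by_all_sites = group_identical(example)
by_first_three = group_identical(example, sites=[0, 1, 2])
```

The member lists play the role of the original taxa that the tool stores in the info field of the Taxa block.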
To change this number to 99, say, type setprop label layout iterations 99 using the Window>Enter a Command item.
• The View>Node Label Layout>Radial check box item controls whether node labels are displayed in a radial fashion. At present, there is no way to change the individual angles used in the display.
• The View>Node Label Layout>Simple check box item controls whether a simple node label layout is used.

12.4 Data Menu

The Data menu contains the following items:
• The Data>Keep Only Selected Taxa item removes all taxa from the analysis whose nodes in the network are unselected. All data associated with the removed taxa is removed from the original source block and all subsequent data is recomputed.
• The Data>Exclude Selected Taxa item removes all taxa from the analysis whose nodes in the network are selected. All data associated with the removed taxa is removed from the original source block and all subsequent data is recomputed.
• The Data>Restore All Taxa item restores all taxa that were previously removed using either of the previous two menu items or the next menu item.
• The Data>Filter Taxa item opens the Pipeline:Taxa:Filter tab, which can be used to interactively include or exclude taxa from the analysis.
• The Data>Taxon Sets submenu is used to select all nodes labeled by a given taxon set or to add new taxa sets to the Sets block. The Sets block can be copied to other files to provide a convenie
User Manual for SplitsTree4 V4.13.1

Daniel H. Huson and David Bryant
April 16, 2013
with contributions from Markus Franz, Miguel Jette, Tobias Kloepper and Michael Schröder
www.splitstree.org

Contents

1 Introduction
2 Getting Started
3 Obtaining and Installing the Program
4 Program Overview
5 Splits, Trees and Networks
6 Opening, Reading and Writing Files
7 Estimating Distances
8 Building and Processing Trees
9 Building and Drawing Networks
10 Main Window
  10.1 Network Tab
  10.2 Data Tab
  10.3 Source Tab
11 Graphical Interaction with the Network
12 Main Menus
  12.1 File Menu
  12.2 Edit Menu
  12.3 View Menu
  12.4 Data Menu
  12.5 Distances Menu
  12.6 Trees Menu
  12.7 Networks Menu
  12.8 Analysis Menu
  12.9 Draw Menu
  12.10 Window Menu
  12.11 Configuring Methods
13 Popup Menus
14 Tool Bar
15 Data Entry Dialog
16 Pipeline Wind
25. auer Associates Inc., 2nd edition, 1996.
[34] J. D. Thompson, D. G. Higgins, and T. J. Gibson. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res., 22:4673-4680, 1994.
26. 20.7 Splits Block
20.8 Network Block
20.9 Bootstrap Block
20.10 Sets Block
20.11 Assumptions Block
21 File Formats
22 All Methods
23 Command Line Options and Mode
24 Examples
25 Acknowledgments
References
Index

1 Introduction

Disclaimer: This software is provided "AS IS" without warranty of any kind. This is developmental code, and we make no pretension as to it being bug-free and totally reliable. Use at your own risk. We will accept no liability for any damages incurred through the use of this software. Use of SplitsTree4 is free, however the program is not open source.

Type-setting conventions: In this manual we use e.g. Network > NeighborNet to indicate the NeighborNet menu item in the Network menu. We use e.g. Main Source to indicate the Source tab in the Main window.

How to cite: If you publish results obtained in part by using SplitsTree4, then we require that you acknowledge this by citing the program as follows:
• D. H. Huson and D. Bryant, Application of Phylogenetic Networks in Evolutionary Studies, Mo
27. computes all RY splits.
Usage: RYSplits
Input: Characters
Output: Splits

SpectralSplits
This method computes all splits arising using spectral analysis [17].
Usage: SpectralSplits Threshold <double> Method <String> Weight_ATvsGC <double> Weight_AGvsCT <double> Weight_ACvsGT <double>
Input: Characters
Output: Splits

SplitDecomposition
This method computes the split decomposition [2].
Usage: SplitDecomposition
Input: Distances
Output: Splits

SuperNetwork
This method computes a super network from partial trees using the Z-closure algorithm [23].
Usage: SuperNetwork EdgeWeights <String> ZRule <boolean> NumberOfRuns <int> SuperTree <boolean> ApplyRefineHeuristic <boolean>
Input: Trees
Output: Splits

TreeSelector
This method is used to select a single tree from a set of trees.
Usage: TreeSelector Which <int>
Input: Trees
Output: Splits

UncorrectedP
This method calculates uncorrected (observed) P distances. This is identical to the Hamming method.
Usage: Uncorrected_P ignoregaps <boolean>
Input: Characters
Output: Distances

UPGMA
This method computes a UPGMA tree [31].
Usage: UPGMA
Input: Distances
Output: Trees

23 Command Line Options and Mode

SplitsTree4 has the following command line options:
-g <switch> (default: true): GUI mode
-p <String> (default: HOME SplitsTree def): Properties file
-i <String>
28. constructing trees both from distance and character data. The following distance-based methods are available: NJ, BioNJ, UPGMA, BunemanTree and RefinedBunemanTree. The program provides front ends for several tree-based programs, currently PhyML and PhylipParsimony. SplitsTree4 will pass these programs the current sequences and place any trees produced in the SplitsTree4 Trees block. Our distribution of SplitsTree4 does not provide any external programs, so you must install them separately. In the dialog that is associated with such a method, you must specify the location of the external application, e.g. using the PhyML Path text field. Once entered, the program will remember this entry as the default value.

SplitsTree4 provides a number of methods for processing a collection of trees. The Trees > TreeSelector menu contains items to select individual trees or to compute a consensus of trees. If the Trees > TreeSelector is selected, then the Trees > Previous Tree and Trees > Next Tree items can be used to move from one tree to the next in the Trees block. Pressing the shift key when selecting either of these methods will move to the first tree or the last tree, respectively. The Trees > ConsensusTree can be used to compute the majority or strict consensus tree.

As with distances, when you select such a menu item, the Pipeline window will open and will display a panel for setting the parameters of the method. If you always use the sam
30. e parameters for a given method, or if the method has no parameters, then selecting the Don't show this dialog to configure this method again will prevent the dialog from appearing again. All methods can be configured using the Pipeline window, accessed by Analysis > Configure Pipeline or Analysis > Configure Recent Methods.

9 Building and Drawing Networks

The Network menu provides methods for computing phylogenetic networks from character sequences, distances and trees.

Methods that compute a split network directly from character sequences provided in the Characters block are ParsimonySplits, MedianNetwork, MedianJoining and SpectralSplits. Note that the Median network method requires binary sequences. However, given DNA or RNA, this program will detect all sites that contain precisely two character states and will build a Median network from these. The MedianJoining method computes an unrooted network from binary sequences, DNA and other multi-state sequences. This is an implementation of the algorithm described in [4]. In the case of non-binary sequences, the resulting network will not be a split network.

Two methods for computing split networks from distances provided in the Distances block are SplitDecomposition and NeighborNet.

If a set of phylogenetic trees in the Trees block all contain the full set of taxa listed in the Taxa block, then the ConsensusNetwork method can be applied to compute a consensus network. If howev
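The detection of binary sites described above for the Median network method can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's implementation: the function names binary_site_indices and to_binary are hypothetical, the alignment is assumed to be a dictionary mapping taxon names to equal-length sequences, and columns containing a gap or missing symbol are assumed to be skipped (the manual does not say how such columns are treated).

```python
from typing import Dict, List


def binary_site_indices(alignment: Dict[str, str], skip_symbols: str = "-?") -> List[int]:
    """Return 0-based indices of columns with exactly two character states.

    Assumption: a column containing any symbol from skip_symbols is ignored.
    """
    seqs = [s.upper() for s in alignment.values()]
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    columns = []
    for col in range(length):
        states = {s[col] for s in seqs}
        if states & set(skip_symbols):
            continue  # assumed handling of gaps and missing data
        if len(states) == 2:
            columns.append(col)
    return columns


def to_binary(alignment: Dict[str, str], columns: List[int]) -> Dict[str, str]:
    """Recode the chosen columns as 0/1, using the first taxon's state as 0."""
    first = next(iter(alignment.values())).upper()
    return {
        name: "".join("0" if seq[c].upper() == first[c] else "1" for c in columns)
        for name, seq in alignment.items()
    }
```

For example, with the three sequences ACGT-, ACTTA and AGTTA, only the second and third columns contain exactly two states and no gap, so only those two columns would be recoded as binary characters.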
31. e the whole image and the Save visible image to save only the part of the image that is currently visible in the main viewer. If the chosen format is EPS, then selecting the Convert text to graphics check box will request the program to render all text as graphics rather than fonts. Pressing the apply button will open a standard file save dialog to determine where to save the graphics file.

18 Preferences Window

The Preferences window is opened using the Edit > Preferences item. This window controls program preferences. Many of the preferences are persistent, that is, they remain effective even after closing and re-opening the program. To remember these choices, SplitsTree4 creates and maintains a properties file, which is placed in the user's home directory and is called something like SplitsTree def. The Preferences window has six tabs.

The Preferences General tab controls general aspects of the program:
• Allow Graph Editing: if set, the user can add or delete nodes and edges from the displayed network. To create a node, control-double-click on the canvas. To create an edge between two nodes, control-click on a node and then drag to create an edge. If dragging ends on a second node, then this is connected to the first; otherwise a new node is created. To delete a node or delete an edge, select it and press the delete key.
• Lock Edge Lengths: Usually, when interacting with a displayed network, we want to lock edge lengths so that they cannot be changed
32. ein 84 model [33].
Usage: F84 Maximum_Likelihood <boolean> Estimate_Base_Frequencies <boolean> Normalize <boolean> TRatio <double> A <double> C <double> G <double> T_U <double>
Input: Characters
Output: Distances

GapDist
This method calculates the gap distance from a set of sequences.
Usage: GapDist
Input: Characters
Output: Distances

GeneContentDistance
This method computes distances based on gene content [24].
Usage: GeneContentDistance UseMLDistance <boolean>
Input: Characters
Output: Distances

Hamming
This method calculates distances using the Hamming distance. This is identical to the UncorrectedP method.
Usage: Hamming ignoregaps <boolean>
Input: Characters
Output: Distances

HKY85
This method calculates distances using the Hasegawa, Kishino and Yano model.
Usage: HKY85 Estimate_Base_Frequencies <boolean> Normalize <boolean> TRatio <double> A <double> C <double> G <double> T_U <double> P_Invar <double> Gamma <double>
Input: Characters
Output: Distances

JukesCantor
This method computes distances using the Jukes-Cantor model [33].
Usage: JukesCantor Maximum_Likelihood <boolean>
Input: Characters
Output: Distances

K2P
This method calculates distances using the Kimura 2P model [33].
Usage: K2P Maximum_Likelihood <boolean> TRatio <double>
Input: Characters
Outpu
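To make the Hamming (UncorrectedP) and JukesCantor entries above more concrete, here is a minimal Python sketch of the textbook formulas. It is not taken from the program: the function names are hypothetical, the treatment of the ignoregaps option (skipping any site where either sequence has a gap or missing symbol) is an assumption, and the Jukes-Cantor correction shown is the standard closed form d = -3/4 ln(1 - 4p/3), not the maximum likelihood variant that the Maximum_Likelihood parameter refers to.

```python
import math


def uncorrected_p(seq_a: str, seq_b: str, ignore_gaps: bool = True,
                  gap_chars: str = "-?") -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same aligned length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Assumed meaning of ignoregaps: drop sites with a gap in either sequence.
        if ignore_gaps and (x in gap_chars or y in gap_chars):
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p: float) -> float:
    """Standard Jukes-Cantor correction of a p-distance (valid for p < 0.75)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    if p == 0.0:
        return 0.0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance is always at least as large as the observed one, since the correction accounts for multiple substitutions at the same site.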
33. ending on the type of the source block. In the case of a Characters block, the program will use UncorrectedP, NeighborNet and then EqualAngle to compute a network for the data. In the case of a Distances block, NeighborNet and EqualAngle are used. In the case of a Trees block, the first tree in the block is displayed using the EqualAngle algorithm. The data contained in the different blocks of the pipeline is displayed in the Data tab.

The program is designed to operate in two different modes. In GUI mode, the program provides a GUI for the user to interact with the program. In command-line mode, the program reads commands from a file or from standard input and writes output to files or to standard output.

5 Splits, Trees and Networks

Here we give a brief introduction to some of the concepts from phylogenetics that the program is based on. Evolutionary relationships between taxa are usually represented using a phylogenetic tree. Such trees are often computed from molecular sequence data, either directly, using methods such as parsimony, or indirectly, by first computing a distance matrix and then applying a method such as Neighbor-Joining. Such approaches implicitly or explicitly model the evolution of a single gene under the assumption that the process is dominated by two types of events: mutations and speciation events. Under more complex models of evolution, i.e. involving gene loss and duplication, or hybridization, horizontal gene transfe
34. er the Trees block contains partial trees, that is, trees that do not necessarily all involve identical sets of taxa, then the SuperNetwork or FilteredSuperNetwork method can be used to compute a super network.

The Draw menu determines which algorithm is used to construct the final visualization of the tree or network. Existing methods are EqualAngle, RootedEqualAngle and Phylogram. Additionally, the Draw > Hide Selected Splits can be used to remove selected splits from the network, and the Draw > Select Trees can be used to highlight different trees contained in a split network.

10 Main Window

With SplitsTree4, multiple documents can be opened and worked on simultaneously. Each document is represented by a Main window. Additional windows or dialogs are sometimes opened to perform certain tasks. The Main window is split vertically into two panels:
• the left-hand Data panel is used to display all data associated with a given document in text format, and
• the right-hand Network panel contains the current tree or network.
We now discuss each separately.

10.1 Network Tab

The Network tab is used to display the computed tree or network. A status line of text along the bottom of the network pane gives a summary of the data, such as number of taxa, length of sequences etc., and a summary of the methods used to compute the network. Optionally, a scale bar can be displayed in the upper left corner of the Network tab to indicate the sca
35. ew Data item provides a submenu of items that are enabled when the Data tab is selected. The View > Data > Characters, View > Data > Distances and View > Data > Splits submenus control the format used to write the Characters, Distances and Splits blocks, respectively.
• The View > Reset item resets the layout of the network.
• The View > Zoom In item is used to zoom into the network.
• The View > Zoom Out item is used to zoom out from the network.
• The View > Rotate Left and View > Rotate Right items are used to rotate the network.
• The View > Flip item changes the layout of the network to its mirror image.
• The View > Format Nodes and Edges item opens the Format Nodes and Edges window that is used to modify the graphical attributes of the nodes and edges of the network.
• The View > Highlight Confidence item opens the Confidence window that can be used to control how confidence values for splits or edges are highlighted in the network.
• The View > Use Magnifier item is used to turn the magnifier functionality on and off.
• The View > Magnify All Mode item modifies the magnification process so that the whole network or tree gets mapped into the magnifier.
• The View > Node Label Layout > No Overlaps check box item controls whether node labels are automatically placed so as to prevent any overlaps. By default, the program tries computing a label layout in 10 different random orders of the nodes.
36. he Phylip package [13].
Usage: PhylipParsimony PhylipPath <String> SearchMode <String> NumberOfTreesToSave <int> InputOrderSeed <int> InputOrderJumbles <int> OutgroupRoot <String> ThresholdParsimony <double> TranversionParsimony <boolean> WeightsFile <String>
Input: Characters
Output: Trees

Phylogram
This method computes a traditional phylogenetic tree.
Usage: Phylogram Weights <boolean> Cladogram <boolean> Outgroup <String> UseOutgroup <boolean>
Input: Splits
Output: Network

PhyML
This method calculates maximum likelihood trees from DNA sequences using PHYML [16].
Usage: PhyML PHYMLPath <String> TreePath <String> Bootstrap <boolean> NumberOfBootstrapReplicates <int> PrintBootstrap <boolean> SubstitutionModel <java.lang.String> OptimiseEquilibriumFrequencies <boolean> EquilibriumFrequenciesEmpirical <boolean> FrequencyA <double> FrequencyC <double> FrequencyG <double> FrequencyT <double> SubstitutionParameterAC <int> SubstitutionParameterAG <int> SubstitutionParameterAT <int> SubstitutionParameterCG <int> SubstitutionParameterCT <int> SubstitutionParameterGT <int> EmpiricalBaseFrequencyEstimates <boolean> GammaDistributionParameter <double> GammaDistributionParameterFixed <boolean> InvariableS
38. ies no filter. Use the Exclude Selected Splits to remove all splits that are currently selected in the displayed network.

17 Export and Export Images Dialogs

17.1 Export

The Export dialog is opened using the File > Export item. It is used to save individual blocks of data in any of the formats described in Section 21. The dialog shows two lists. On the left, the set of available Nexus blocks is listed. Depending on the selection of blocks made by the user, on the right the set of available export formats is listed. Most of the dependencies between blocks and formats should be apparent. One interesting feature is the following: if one has enabled graph editing using the Allow Graph Editing option in the Preferences General tab and has interactively constructed a phylogenetic tree, then this tree can be saved in Newick format. Hence, SplitsTree4 can be used for editing trees.

17.2 Export Image

The Export Image dialog is opened using the File > Export Image item. This dialog is used to save a picture of the current network in a number of different formats. The following graphics formats are supported:
• JPEG, Joint Photographic Experts Group.
• GIF, Graphics Interchange Format.
• EPS, Encapsulated PostScript.
• SVG, Scalable Vector Graphics.
• PNG, Portable Network Graphics.
• BMP, Bitmap.
• PDF, Portable Document Format.
There are two radio buttons, the Save whole image to sav
39. in one of the following formats: Nexus, ClustalW, PhylipParsimony, FastA or Newick. Alternatively, select the File > Enter Data menu item and enter data in one of these formats by hand. Example files are provided with the program. They are contained in the examples sub-directory of the installation directory. The precise location of the installation directory depends upon your operating system.

Use the different menu items to determine which methods are applied to the input data. The Distances, Trees and Network menus contain items that determine how to compute distances, trees or a network from the given data. Some of the methods provided have parameters that can be set using the Analysis > Configure Pipeline or Analysis > Configure Recent Methods items.

The computed data can be viewed in text form in the Data tab and can be saved using the File > Save As item. The computed network or tree is displayed in the Network tab. Attributes of the network can be changed by selecting nodes or branches (also called edges) and using the Format Nodes and Edges dialog, which is reachable using the View > Format Nodes and Edges item.

Individual blocks of data can be saved in a number of different formats using the File > Export item. The network displayed in the Network tab can be saved in a graphics format using the File > Export Image item.

3 Obtaining and Installing the Program

SplitsTree4 is written in Java and requires a Java runtime env
40. in the pipeline: Pipeline Taxa, Pipeline Unaligned, Pipeline Characters, Pipeline Distances, Pipeline Quartets, Pipeline Trees and Pipeline Splits. This is a potential source of confusion, because in the main menus of the Main window we organize the methods by the type of their output, rather than by the type of their input, as is done here. Each tab controls how a block of the given type is processed to produce a block of a subsequent type. Each of the tabs contains up to three sub-tabs, as mentioned below. We now describe each of the main tabs.

16.1 Taxa Tab

The Pipeline Taxa tab consists of precisely one Pipeline Taxa Filter sub-tab, which is used to include (show) or exclude (hide) taxa from all calculations. All taxa listed in the show list are included in all calculations, whereas all taxa listed in the hide list are removed from the data set. If a set of taxa is selected in the network displayed in the Main window, then these can be added to the show or hide list by pressing the appropriate From Graph button. Press Show All or Hide All to show or hide all known taxa. Please note that the set of shown or hidden taxa will change automatically when viewing different partial trees, to reflect the set of taxa contained in the current tree.

16.2 Unaligned Tab

The Pipeline Unaligned tab consists of precisely one sub-tab. The Pipeline Unaligned Method sub-tab is used to choose and configure a method to apply to
43. ingEmbedderIterations <int> LabelEdges <boolean> ShowHaplotypes <boolean> SubdivideEdges <boolean> ScaleNodesByTaxa <boolean>
Input: Characters
Output: Network

PrunedQuasiMedian
This method computes a geodesically pruned quasi-median network [1]. It uses all sites in the character alignment and treats gaps or missing data as additional sites.
Usage: PrunedQuasiMedian SpringEmbedderIterations <int> LabelEdges <boolean> ShowHaplotypes <boolean> SubdivideEdges <boolean> ScaleNodesByTaxa <boolean>
Input: Characters
Output: Network

MinSpanningNetwork
This method computes a minimum spanning network [11]. It computes the unnormalized Hamming distance between every pair of sequences. A spring embedder is used to compute a layout of the graph. The number of iterations required will vary by the size and complexity of the graph and can be set using the Spring Embedder Iterations item. If Subdivide Edges is selected, then edges are divided into sub-edges, one for each change along the edge.
Usage: MinSpanningNetwork Epsilon <int> SpringEmbedderIterations <int> LabelEdges <boolean> ShowHaplotypes <boolean> SubdivideEdges <boolean> ScaleNodesByTaxa <boolean>
Input: Characters
Output: Network

Muscle
This method externally runs the Muscle sequence alignment program [10].
Usage: Muscle Maxiters <int> ClusterMethod_1 <String> Cluster
44. ironment version 1.5 or newer, freely available from www.java.org. SplitsTree4 is installed using an installer program that is freely available from www.splitstree.org. There are different installers, targeting different operating systems:
• splitstree_windows_4.13.1.exe provides an installer for Windows.
• splitstree_macos_4.13.1.dmg provides an installer for Mac OS.
• splitstree_unix_4.13.1.sh provides a shell installer for Linux and Unix.

4 Program Overview

In this section, we give an overview of the main design goals and features of this program. Basic knowledge of the underlying design of the program should make it easier to use the program.

SplitsTree4 is written in the programming language Java. The advantage of this is that we can provide versions that run under the Linux, MacOS, Windows and Unix operating systems. Additionally, this makes it possible for the program to support plug-ins that can add new functionality to the program, such as new methods and import or export modules. A potential drawback is that an algorithm implemented in Java will generally run slower than the same algorithm implemented in C or C++. Earlier versions (1-3) of the program [20] were written in C++ and only contain a small part of what is now available with SplitsTree4.

SplitsTree4 uses multi-threading and supports multiple documents. This means that you can work on more than one data set simultaneously in different windows and run m
45. isible button to use random colors or to set no color. The third row of items is used to set the Node Size and Node Shape. The fourth row of items is used to set the Edge Width and Edge Style. The fifth row is used to set the node labels to Names or IDs and also to rotate the node labels using the buttons Rotate Left and Rotate Right. The final row is used to set the edge labels to Weights, IDs, Confidence values or confidence Intervals.

Please note that any changes made only apply to the currently selected nodes or edges. If there are no edges or nodes selected, then changes will apply to all nodes and edges. Any change made in this dialog box can be reversed using Edit > Undo.

19.6 Highlight Confidence Window

The Highlight Confidence window is opened using the View > Highlight Confidence item. Use this window to request that edges and/or edge shading reflect the confidence values associated with each split.

19.7 Bootstrap Window

The Bootstrap window is opened using the Analysis > Bootstrap item. Enter the Number of Replicates and then press Run to execute.

19.8 Message Window

The Message window is opened using the Window > Message Window item. The program writes all messages to this window.

19.9 About Window

The About window is opened using the Window > About item. It reports the version of the program.

19.10 Register Window

Use of SplitsTree4 is free. However, we require that each use
46. itesProportion <double> InvariableSitesProportionFixed <boolean> NumberOfSubstitutionCategories <int> OneSubstitutionCategory <boolean> OptimiseStart <boolean> OptimiseStartKeepingTopology <boolean> TstvRatio <double> TstvRatioFixed <boolean> UseBioNJstart <boolean> OptimiseRelativeRateParameters <boolean>
Input: Characters
Output: Trees

ProteinMLdist
This method calculates maximum likelihood protein distance estimates using the following models: cpREV45, Dayhoff, JTT, mtMAM, mtREV24, pmb, Rhodopsin and WAG [33].
Usage: ProteinMLdist Gamma <double> Model <String> PInvar <double> Estimate_variances <boolean>
Input: Characters
Output: Distances

PTreeSplits
This method computes the parsimony splits tree [2].
Usage: PTreeSplits
Input: Characters
Output: Splits

RefinedBunemanTree
This method computes the Refined Buneman Tree [5]. This module was implemented by Lasse Westh-Nielsen and Christian N. S. Pedersen.
Usage: RefinedBunemanTree
Input: Distances
Output: Splits

RootedEqualAngle
This method computes a rooted split network using the rooted equal angle algorithm [14].
Usage: RootedEqualAngle OptimizeDaylight <boolean> UseWeights <boolean> RunConvexHull <boolean> DaylightIterations <int> OutGroup <String> MaxAngle <int> SpecialSwitch <boolean>
Input: Splits
Output: Network

RYSplits
This method
47. lbar tab allows the user to interactively configure the tool bar.

The Preferences Status Line tab allows the user to configure the status line displayed along the bottom of the Network tab. Here two items are of particular interest. If the displayed graph was computed either by the BunemanTree or SplitDecomposition method, then the pairwise distances in the graph may underestimate the true distances in the given Distances block. The difference between the two can be expressed in terms of the fit value, which is activated using the Fit check box and is defined as the sum of all pairwise distances in the graph divided by the sum of all pairwise distances in the given matrix, times 100. This is not applicable to other tree or network building methods. For other methods, please use the LSFit item, which computes the least squares fit between the pairwise distances in the graph and the pairwise distances in the matrix.

19 Additional Windows and Dialog

Here we list all other windows.

19.1 Open File

The Open File dialog is opened using the File > Open item. Use it to open any file containing phylogenetic data in one of the formats described in Section 21. If an error is encountered, then the file is opened in the Main Source tab. If possible, the offending line is highlighted.

19.2 Choose Datatype

When opening a file containing character sequences or importing sequences from the Data Entry dialog, the program must know whethe
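The fit value defined in the Preferences Status Line description above can be illustrated with a short Python sketch. The fit function follows the stated definition directly. The ls_fit function is only an assumption: the manual says that LSFit is a least squares fit but gives no formula, so a commonly used form, 100 * (1 - sum (d - g)^2 / sum d^2), is shown for illustration. All function names and the data layout (a dictionary keyed by sorted taxon pairs) are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def fit(graph: Dict[Pair, float], matrix: Dict[Pair, float], taxa: Iterable[str]) -> float:
    """Sum of pairwise graph distances over sum of input distances, times 100."""
    pairs = list(combinations(sorted(taxa), 2))
    total_graph = sum(graph[p] for p in pairs)
    total_matrix = sum(matrix[p] for p in pairs)
    return 100.0 * total_graph / total_matrix


def ls_fit(graph: Dict[Pair, float], matrix: Dict[Pair, float], taxa: Iterable[str]) -> float:
    """Assumed least squares fit: 100 * (1 - sum (d - g)^2 / sum d^2)."""
    pairs = list(combinations(sorted(taxa), 2))
    residual = sum((matrix[p] - graph[p]) ** 2 for p in pairs)
    total = sum(matrix[p] ** 2 for p in pairs)
    return 100.0 * (1.0 - residual / total)


# Input distances and the distances induced by a hypothetical graph.
matrix = {("a", "b"): 2.0, ("a", "c"): 4.0, ("b", "c"): 4.0}
graph = {("a", "b"): 2.0, ("a", "c"): 3.0, ("b", "c"): 4.0}
print(round(fit(graph, matrix, "abc"), 2))
print(round(ls_fit(graph, matrix, "abc"), 2))
```

In this toy example the graph underestimates one of the three distances, so both values fall below 100. A value of 100 would indicate that the graph represents the input distances exactly.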
48. le of branches. The scale bar can be turned on or off using the Preferences General tab. The scale bar can be configured using the Preferences Status Line tab.

10.2 Data Tab

The Data tab provides a textual display of the data associated with the given document in the program's native Nexus format, organized in a linear list of items that can be either collapsed or expanded. This view of the data is read-only.

10.3 Tool bar

The tool bar associated with the window provides buttons for quick access to many of the menu items. It can be configured using the Preferences Toolbar tab.

11 Graphical Interaction with the Network

We now describe how the user can interactively modify the layout and attributes of the displayed network. The View menu is used to rotate, move and zoom in and out. Alternatively, the view can be changed using a wheel mouse together with the following modifier keys:
• none: zoom in and out
• Shift: rotate
• ALT (option): move vertically
• Shift & ALT (option): move horizontally

Here are additional modifications that can be performed:
• The graph can be dragged around by clicking and dragging a node.
• If all selected nodes lie on one side of a band of parallel branches representing a single split, then clicking and dragging on one of the nodes will change the angle of the branches.
• By default, clicking on an edge will select all edges representing the same split and all nodes on the smaller side of the
49. lecular Biology and Evolution, 23(2):254-267, 2006, software available from www.splitstree.org.

Evolutionary relationships are usually represented using phylogenetic trees, based on a model of evolution dominated by mutations and speciation events. More realistic models must also account for gene genesis, loss and duplication events, hybridization, horizontal gene transfer or recombination. Here, phylogenetic networks have a role to play.

Moreover, network methods also provide a valuable tool for phylogenetic inference even when reticulation events do not play an important role. The combined effect of sampling error and systematic error makes phylogenetics an uncertain science, and network methods provide tools for representing and quantifying this uncertainty.

The aim of SplitsTree4 is to provide a framework for evolutionary analysis using both trees and networks. The program takes as input a set of taxa represented by characters (that is, aligned sequences), distances, quartets, trees or splits, and produces as output trees or networks using a number of different methods.

This document provides both an introduction and a reference manual for SplitsTree4.

2 Getting Started

This section describes how to get started. First, download an installer for the program from www.splitstree.org, see Section 3 for details.

Use the File > Open menu item to open a file containing data such as character sequences, distances or trees. The file must be
50. m after asking whether to save unsaved changes. 12.2 Edit Menu. The Edit menu contains the usual edit-related items. The Edit > Undo item is used to undo text editing, interactive network manipulation, and any item chosen from the View menu. The Edit > Redo item is used to redo text editing, interactive network manipulation, and any item chosen from the View menu. The Edit > Cut item is used to cut text. The Edit > Copy item is used to copy text or to copy the current network as an image. The Edit > Paste item is used to paste text. The Edit > Select All item is used to select all nodes and edges of a network. The Edit > Select Nodes item is used to select all nodes of a network. The Edit > Select Labeled Nodes item is used to select all labeled nodes of a network. The Edit > Select Edges item is used to select all edges of a network. The Edit > Invert Selection item is used to invert the selection of nodes. The Edit > Deselect All item is used to deselect all nodes and edges of a network. The Edit > Deselect Nodes item is used to deselect all nodes of a network. The Edit > Deselect Edges item is used to deselect all edges of a network. The Edit > Find/Replace item opens the Find/Replace dialog. The Edit > Preferences item opens the Preferences window. 12.3 View Menu. The View menu contains items that control aspects of the visualization of a network, all of which are undoable and redoable. The Vi
51. mple model of sequence data. Mol. Biol. Evol., 14:685-695, 1997. [16] S. Guindon and O. Gascuel. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol., 52(5):696-704, 2003. [17] M. D. Hendy and D. Penny. Spectral analysis of phylogenetic data. Journal of Classification, 10:5-24, 1993. [18] B. Holland and V. Moulton. Consensus networks: A method for visualizing incompatibilities in collections of trees. In G. Benson and R. Page, editors, Proceedings of Workshop on Algorithms in Bioinformatics, volume 2812 of LNBI, pages 165-176. Springer, 2003. [19] B. R. Holland, K. T. Huber, A. Dress, and V. Moulton. Delta plots: A tool for analyzing phylogenetic distance data. Mol. Biol. Evol., 19(12):2051-2059, December 2002. [20] D. H. Huson. SplitsTree: A program for analyzing and visualizing evolutionary data. Bioinformatics, 14(10):68-73, 1998. [21] D. H. Huson. Introduction to phylogenetic networks. Tutorial presented at ISMB, 2005. [22] D. H. Huson and D. Bryant. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution, 23:254-267, 2006. Software available from www.splitstree.org. [23] D. H. Huson, T. Dezulian, T. Kloepper, and M. A. Steel. Phylogenetic super-networks from partial trees. IEEE/ACM Transactions in Computational Biology and Bioinformatics, 1(4):151-158, 2004. [24] D. H. Huson and M. Steel. Phylogenetic trees based on gene content. Bioinformatics, 20(13):2044-9, 2004.
52. nt method to quickly label or select taxa in different taxa groups. The Data > TaxonSets > All item selects all taxa. The Data > TaxonSets > New taxa set item creates a new taxa set with the currently selected taxa and a name provided by the user. Note that the name should be a valid NEXUS name. If a set exists with this name, it is overwritten. The Data > TaxonSets > Clear All Taxa Sets item removes all taxa sets from the Sets block. The Data > Exclude Gap Sites item excludes from computation all sites in a Characters block that contain a gap or missing character in any of the sequences. The Data > Exclude Parsimony Uninformative Sites item excludes from computation all sites in a Characters block that are parsimony-uninformative, that is, constant across all but one sequence. The Data > Exclude Constant Sites item excludes from computation all sites in a Characters block that are constant across all sequences. The Data > Restore All Sites item restores all sites that were excluded using the above menu items. The Data > Filter Characters item opens the Pipeline Characters Filter tab, which can be used to interactively include or exclude sites from the analysis. The Data > Greedily Make Compatible item uses a greedy approach to make the splits in the Splits block compatible: in decreasing order of weight, the algorithm adds the next split to the set of kept splits if it is compatible with all splits that have
53. ntains the original data that is provided to the program. Any computations performed by the program will update blocks from left to right along the pipeline, starting after the source block. Note that some types of computations do not fit into this pipeline design; for example, SplitsTree4 cannot provide any method that takes a Splits block and produces a Trees block, because the latter occurs before the former in the pipeline. Typically, the user will provide a source block and will then use different menu items to determine how the program will compute data along the pipeline, until a Network block has been computed and an image of the network has been drawn in the Main window. The program is designed to keep all blocks in the pipeline synchronized by enforcing that the different blocks only contain data that has been computed via the pipeline. For example, if you attempt to load data from a file in Nexus format that contains a Taxa block and two or more other blocks (e.g. a Characters block and a Distances block), then the program will ask you to choose which of the latter blocks you want to keep. This does not apply if the file was created by SplitsTree4 using the File > Save or File > Save As command, in which case the blocks in the file are synchronized, so none must be discarded and no computations are necessary. Once a source block has been provided, the program will proceed to perform a chain of default calculations, which differ dep
54. ntor method. The Distances > K2P item requests the program to compute distances using the K2P method. The Distances > K3ST item requests the program to compute distances using the K3ST method. The Distances > F81 item requests the program to compute distances using the F81 method. The Distances > F84 item requests the program to compute distances using the F84 method. The Distances > ProteinMLdist item requests the program to compute distances using the ProteinMLdist method. The Distances > NeiMiller item requests the program to compute distances using the NeiMiller method. The Distances > GeneContentDistance item requests the program to compute distances using the GeneContentDistance method. 12.6 Trees Menu. The Trees menu contains the following items. The Trees > NJ item requests the program to compute a tree using the NJ method. The Trees > BioNJ item requests the program to compute a tree using the BioNJ method. The Trees > UPGMA item requests the program to compute a tree using the UPGMA method. The Trees > BunemanTree item requests the program to compute a tree using the BunemanTree method. The Trees > RefinedBunemanTree item requests the program to compute a tree using the RefinedBunemanTree method. The Trees > TreeSelector item requests the program to compute a tree using the TreeSelector method. If the Trees > TreeSelector item is selected, then the Trees > Previou
55. ory of the installation directory. The precise location of the installation directory depends upon your operating system. 25 Acknowledgments. We would like to thank Barry G. Hall, Pete Lockhart, David Morrison, and Mike Steel for many helpful comments. This product includes software developed by the Apache Software Foundation (http://www.apache.org), namely the batik library for generating image files. It also contains Jama, a Java matrix package (http://math.nist.gov/javanumerics/jama), and MRJAdapter, a Java package used to help construct user interfaces for the Apple Macintosh. References. [1] Sarah C. Ayling and Terence A. Brown. Novel methodology for construction and pruning of quasi-median networks. BMC Bioinformatics, 9:115, 2008. [2] H.-J. Bandelt and A. W. M. Dress. A canonical decomposition theory for metrics on a finite set. Advances in Mathematics, 92:47-105, 1992. [3] H.-J. Bandelt, P. Forster, B. C. Sykes, and M. B. Richards. Mitochondrial portraits of human population using median networks. Genetics, 141:743-753, 1995. [4] Hans-Jürgen Bandelt, Peter Forster, and Arne Röhl. Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16:37-48, 1999. [5] G. S. Brodal, R. Fagerberg, A. Östlin, C. N. S. Pedersen, and S. S. Rao. Computing refined Buneman trees in cubic time. Lecture Notes in Computer Science, 2812:259-270, 2003. Springer-Verlag. [6] T. Bruen, H. Philippe, and D. Bryant. A
56. ow. 16.1 Taxa Tab. 16.2 Unaligned Tab. 16.3 Characters Tab. 16.4 Distances Tab. 16.5 Quartets Tab. 16.6 Trees Tab. 16.7 Splits Tab. 17 Export and Export Images Dialogs. 17.1 Export. 17.2 Export Image. 18 Preferences Window. 19 Additional Windows and Dialog. 19.1 Open File. 19.2 Choose Datatype. 19.3 Save As. 19.4 Find/Replace. 19.5 Format Nodes and Edges Window. 19.6 Highlight Confidence Window. 19.7 Bootstrap Window. 19.8 Message Window. 19.9 About Window. 19.10 Register Window. 20 Nexus Blocks. 20.1 Taxa Block. 20.2 Unaligned Block. 20.3 Characters Block. 20.4 Distances Block. 20.5 Quartets Block. 20.6 Trees Blo
57. pply to the current Splits block. See Section 22 for a description of all available methods. The Pipeline Splits Filter sub-tab is used to modify the weights of splits using a least squares optimization or to exclude certain splits from the analysis. Possible filters are as follows. Greedy Compatible: uses a greedy approach to make the splits compatible; in decreasing order of weight, the algorithm adds the next split to the set of kept splits if it is compatible with all splits that have already been kept. Closest Tree: makes the splits compatible by computing the closest tree. Greedy Weakly Compatible: uses a greedy approach to make the splits weakly compatible; in decreasing order of weight, the algorithm adds the next split to the set of kept splits if it is weakly compatible with all splits that have already been kept. Weight Threshold: removes any split whose weight does not exceed the given threshold. Pressing the Set button will open a histogram and slider to set the threshold. Confidence Threshold: removes all splits whose confidence does not exceed the given threshold. A confidence value usually lies in the range [0, 1] and can be obtained by bootstrapping, for example. Pressing the Set button will open a histogram and slider to set the threshold. Set Maximum Dimension: greedily removes a subset of splits that cause boxes in the network of dimension higher than the given threshold, as described in [23]. None: appl
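The Greedy Compatible filter can be illustrated with a short sketch. This is not the program's own code; it assumes a split is stored as a pair of taxon sets and uses the standard pairwise criterion that two splits are compatible when at least one of the four intersections of their parts is empty. The names `compatible`, `greedy_compatible`, and `make_split` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

Split = Tuple[FrozenSet[str], FrozenSet[str]]


def make_split(part: Iterable[str], taxa: Iterable[str]) -> Split:
    """Build a split from one part; the other part is the complement."""
    part = frozenset(part)
    return (part, frozenset(taxa) - part)


def compatible(s: Split, t: Split) -> bool:
    """Two splits are compatible if some part of s misses some part of t."""
    return any(not (p & q) for p in s for q in t)


def greedy_compatible(
    weighted: List[Tuple[Split, float]]
) -> List[Tuple[Split, float]]:
    """Visit splits by decreasing weight; keep one if it fits all kept splits."""
    kept: List[Tuple[Split, float]] = []
    for split, weight in sorted(weighted, key=lambda sw: sw[1], reverse=True):
        if all(compatible(split, other) for other, _ in kept):
            kept.append((split, weight))
    return kept


TAXA = "abcde"
splits = [
    (make_split("ab", TAXA), 3.0),
    (make_split("bc", TAXA), 2.0),  # conflicts with ab | cde, so it is dropped
    (make_split("de", TAXA), 1.0),
]
kept = greedy_compatible(splits)
kept_weights = [w for _, w in kept]
```

In this toy input the split with part {b, c} is discarded because it conflicts with the heavier split with part {a, b}, while the split with part {d, e} is kept.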
58. put: Distances; Output: Quartets. EqualAngle: This method computes a planar split network for a circular subset of splits [9]. If the RunConvexHull option is chosen, then the convex hull algorithm is subsequently applied to obtain a graph for the complete set of splits. This method provides a number of heuristics for obtaining a better layout [14]: set DaylightIterations to 5 to apply the equal daylight heuristic, set OptimizeBoxesIterations to 10 to apply the box opening heuristic, and set SpringEmbedderIterations to 500 to apply a modified spring embedder. Usage: EqualAngle UseWeights=<boolean> RunConvexHull=<boolean> DaylightIterations=<int> OptimizeBoxesIterations=<int> SpringEmbedderIterations=<int>; Input: Splits; Output: Network. FilteredSuperNetwork: This method computes a super network from partial trees using the Z-closure algorithm [23] and a distortion filter [25]. Usage: FilteredSuperNetwork MinNumberTrees=<int> MaxDistortionScore=<int> EdgeWeights=<String> AllTrivial=<boolean> UseTotalScore=<boolean>; Input: Trees; Output: Splits. F81: This method calculates distances using the Felsenstein 81 model [33]. Usage: F81 Maximum_Likelihood=<boolean> Estimate_Base_Frequencies=<boolean> Base_Freqs=<N double1 double2 ... doubleN> Normalize=<boolean>; Input: Characters; Output: Distances. F84: This method calculates distances using the Felsenst
59. quick and robust statistical test to detect the presence of recombination. Genetics, in press, 2005. [7] D. Bryant and V. Moulton. NeighborNet: An agglomerative method for the construction of planar phylogenetic networks. In R. Guigó and D. Gusfield, editors, Algorithms in Bioinformatics, WABI 2002, volume LNCS 2452, pages 375-391, 2002. [8] P. Buneman. The recovery of trees from measures of dissimilarity. In F. R. Hodson, D. G. Kendall, and P. Tautu, editors, Mathematics in the Archaeological and Historical Sciences, pages 387-395. Edinburgh University Press, 1971. [9] A. W. M. Dress and D. H. Huson. Constructing splits graphs. IEEE/ACM Transactions in Computational Biology and Bioinformatics, 1(3):109-115, 2004. [10] R. C. Edgar. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5):1792-97, 2004. [11] L. Excoffier and P. Smouse. Using allele frequencies and geographic subdivision to reconstruct gene trees within a species. Genetics, Jan 1994. [12] D. P. Faith. Conservation evaluation and phylogenetic diversity. Biol. Conserv., 61:1-10, 1992. [13] J. Felsenstein. PHYLIP: phylogeny inference package (version 3.2). Cladistics, 5:164-166, 1989. [14] P. Gambette and D. H. Huson. Improved layout of phylogenetic networks. To appear in TCBB, 2008. [15] O. Gascuel. BIONJ: An improved version of the NJ algorithm based on a si
60. r obtains a license code from our website www.splitstree.org. The register dialog is used to enter this license code. This unlocks a number of features of the program that cannot be used without such a code. 20 Nexus Blocks. In this section we describe the Nexus format as implemented in SplitsTree4, based on the definition provided in [27]. Unfortunately, there exist two variants of the Nexus format, which we will call old Nexus and new Nexus. SplitsTree4 is based on the latter, as this is what is defined in [27]. Given a file formatted in old Nexus, the program will often be able to parse it, as it contains code to automatically convert from old to new Nexus format. However, the algorithm that does this does not provide a full implementation of the old Nexus format, and thus it will sometimes be necessary to reformat a file by hand. It is easy to tell the difference between old Nexus and new Nexus: if a file in Nexus format contains a Taxa block, then it is new Nexus. Most blocks in both formats have the same name and similar syntax. One main difference is that the Characters block in new Nexus is called a Data block in old Nexus. In the following syntax descriptions we use upper-case letters for keywords, square brackets for optional statements, and curly brackets to indicate a list of choices. 20.1 Taxa Block. The Taxa block is the only mandatory block in a Nexus file. Its purpose is to list the names of all taxa. It has the following
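The rule of thumb for telling the two Nexus variants apart can be sketched as a small check. This is only an illustration and not how SplitsTree4 itself decides: it assumes that a block header has the form `BEGIN TAXA;` in any letter case, and it does not skip Nexus comments. The function name `looks_like_new_nexus` is hypothetical.

```python
import re


def looks_like_new_nexus(text: str) -> bool:
    """Return True if the text contains a Taxa block header."""
    # Assumption: headers look like "BEGIN TAXA;" with arbitrary case and spacing.
    return re.search(r"\bbegin\s+taxa\s*;", text, flags=re.IGNORECASE) is not None


new_style = "#NEXUS\nBEGIN TAXA;\n  DIMENSIONS NTAX=2;\nEND;\n"
old_style = "#NEXUS\nBEGIN DATA;\n  DIMENSIONS NTAX=2 NCHAR=4;\nEND;\n"

is_new = looks_like_new_nexus(new_style)
is_old_flagged_new = looks_like_new_nexus(old_style)
```

A file that only has a Data block, as in the second example, would be treated as old Nexus under this rule.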
61. r or recombination, a single phylogenetic tree will often not be an appropriate representation of the phylogenetic history or of the different incompatible phylogenetic signals. Also, the presence of noise in a data set, or uncertainty due to inadequacies of the underlying model upon which a tree inference is based, may make it necessary to use a more general graph (that is, a network) to represent the data. The aim of SplitsTree4 is to provide a framework for computing phylogenetic networks. As the name of the program suggests, it is based on the fundamental mathematical concept of a split. For example, if we are given an alignment of binary sequences a: 010011010110, b: 100001011110, c: 011001101110, d: 010001101111, then each non-constant column in the alignment defines a split of the taxon set, consisting of those taxa with the value 0 and those with the value 1. For example, the first column partitions the taxa into two sets, {a, c, d} and {b}, and thus gives rise to the split {a, c, d}/{b}, while the fourth column does not define a split because the characters are constant. Mathematically, for a set X of taxa, any phylogenetic tree T defines a set of such splits, called the split encoding Σ(T) of T, as follows: deletion of any branch (also called edge) e in the tree produces two subtrees, T_A and T_B say, and they give rise to a split S = A/B consisting of the set A of all taxa contained in T_A and the set B of all taxa contained in T_B. In the li
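The way alignment columns define splits can be made concrete with a few lines of code. The sketch below uses the four binary sequences from the example above; the function name `column_split` and the representation of a split as a set of two taxon sets are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Optional

ALIGNMENT: Dict[str, str] = {
    "a": "010011010110",
    "b": "100001011110",
    "c": "011001101110",
    "d": "010001101111",
}


def column_split(
    alignment: Dict[str, str], col: int
) -> Optional[FrozenSet[FrozenSet[str]]]:
    """Return the split defined by a 0-based column, or None if it is constant."""
    zeros = frozenset(t for t, seq in alignment.items() if seq[col] == "0")
    ones = frozenset(t for t, seq in alignment.items() if seq[col] == "1")
    if not zeros or not ones:
        return None  # a constant column does not separate the taxa
    return frozenset({zeros, ones})


n_cols = len(ALIGNMENT["a"])
all_splits = [column_split(ALIGNMENT, c) for c in range(n_cols)]
distinct_splits = {s for s in all_splits if s is not None}

first = column_split(ALIGNMENT, 0)   # first column of the alignment
fourth = column_split(ALIGNMENT, 3)  # fourth column, which is constant
```

Several columns can define the same split, so collecting the results in a set gives the distinct splits supported by the alignment.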
62. r the sequences are to be interpreted as DNA, RNA, protein, or standard (0/1) data. In Nexus files the datatype is explicitly given. In other file formats this is usually not the case. The program employs a simple heuristic to guess the datatype. If this fails (e.g. if all character states are A), then the program will display a Choose Datatype dialog and prompt the user to specify the datatype. The choices are: dna; rna; protein; standard, which means 0/1 data; and unknown, which the program cannot deal with. 19.3 Save As. The Save As dialog is opened using the File > Save As item. Its purpose is to save the complete state of a document in Nexus format. To save parts of the document in Nexus format, or in some other supported format listed in Section 21, use the Export dialog. 19.4 Find/Replace Window. The Find/Replace window can be opened using the Edit > Find/Replace item. Its purpose is to find strings in a displayed network, in the source tab, or in the message window. It can also be used to replace text. Enter a query in the top text region. Optionally, enter a replacement string below it. Use the following check boxes to parameterize the search. If the Whole words only item is selected, then only taxa or reads matching the complete query string will be returned. If the Case sensitive item is selected, then the case of letters is distinguished in comparisons. If the Regular Expression
63. ree4 is based on the Nexus format, see [27]. However, the program can also parse a number of other formats, including ClustalW format for aligned sequences, Phylip format for sequences and distances, FastA format for aligned sequences, and Newick format for trees. Earlier versions had separate Open and Import menu items; these have now been combined. If a file is opened while the Network tab or Data tab is open, then the program will attempt to parse and execute the file. If the file is a complete Nexus file previously generated by SplitsTree4, then the network described in the file will be displayed. If the file is in some other format, then, depending on the type of content, the program will perform a chain of default calculations. To save the complete data associated with a given window in SplitsTree4's native Nexus format, use the File > Save or File > Save As menu items. To save selected blocks of data, or to export data in a different file format, use the File > Export menu item. A picture of the computed network can be saved using the File > Export Image menu item. 7 Estimating Distances. Many methods in phylogenetics begin by estimating distances between the taxa. This is done by taking sequences two at a time and estimating the average number of mutations that occurred on the paths from them to their most recent common ancestor. If the rate of mutation was constant, then this will be proportional to their divergence time. SplitsTree4 provide
64. represented split. To select all nodes on the other side of the split, use the View > Invert Selection menu item. To allow selection of individual edges, deselect the Split Selection Mode check box in the Preferences General tab. Select a node by clicking on it. Press the mouse button outside of the network and drag a rectangle to select several nodes at once. Hold the Shift key and click to add further nodes to, or remove them from, the selected set. Click on a text label to edit it. Double-click on an edge to edit its label. By default, node labels are automatically positioned to avoid overlap. Any label that has been interactively repositioned by the user is no longer automatically positioned. To apply automatic positioning to all node labels, including those that have been moved by the user, deselect and then select the View > Node Label Layout > No Overlaps check box. Many aspects of the visual representation of nodes and edges can be modified using the Format Nodes and Edges window, which is opened using the View > Format Nodes and Edges menu item. 12 Main Menus. We now discuss all menus of the Main window. 12.1 File Menu. The File menu contains the usual file-related items. The File > New item opens a new Main window. The File > Open item provides an Open File dialog to open a file containing input data in one of the supported formats. If the file contains character sequences, then the program must know whe
65. s Tree and Trees > Next Tree items are used to move from one tree to the next in the Trees block. Pressing the Shift key when selecting either of these items will move to the first tree or the last tree, respectively. The Trees > ConsensusTree item requests the program to compute the majority or strict consensus tree. The Trees > PhylipParsimony item requests the program to compute a tree using the PhylipParsimony method. The Trees > PhyML item requests the program to compute a tree using the PhyML method. 12.7 Network Menu. The Network menu contains the following items. The Networks > NeighborNet item requests the program to compute a network using the NeighborNet method. The Networks > SplitDecomposition item requests the program to compute a network using the SplitDecomposition method. The Networks > ParsimonySplits item requests the program to compute a network using the ParsimonySplits method. The Networks > ConsensusNetwork item requests the program to compute a network using the ConsensusNetwork method. The Networks > SuperNetwork item requests the program to compute a network using the SuperNetwork method. The Networks > FilteredSuperNetwork item requests the program to compute a network using the FilteredSuperNetwork method. The Networks > MedianNetwork item requests the program to compute a network using the MedianNetwork method. The Networks > MedianJoining item requests the program to compute a network using the MedianJoining method. The Networks > MinSpanningNetwork item reques
66. s a large number of methods for estimating distances for sequences and other types of data. The UncorrectedP method computes the proportion of positions at which two sequences differ. For DNA or RNA sequences, there are choices over how ambiguous state codes such as W, M, V are handled. Ignore means that these states are treated as missing states. Average means that the contribution at a site is averaged over all possible resolutions of the ambiguous codes, with the exception that sites having the same ambiguous code contribute zero. Match looks at each possible state in each sequence, counts one if the state is not a possible resolution of the ambiguous code in the other sequence, and normalises the count by the number of states for each ambiguous code. The Distances menu lists several standard distance estimation methods. Most of these have parameters that can be changed. When you select the menu item, the Pipeline window will open and will display a panel for setting the parameters of the method. If you always use the same parameters for a given method, or if the method has no parameters, then selecting "Don't show this dialog to configure this method again" will prevent the dialog from appearing again. All methods can be configured using the Pipeline window, accessed by Analysis > Configure Pipeline or Analysis > Configure Recent Methods. 8 Building and Processing Trees. The Trees menu provides a number of methods for
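As a minimal sketch of what the UncorrectedP distance measures, the following function computes the proportion of differing positions between two aligned sequences. It is not the program's implementation: skipping sites with a gap or missing character is an assumption made here, and the handling of ambiguous state codes (Ignore, Average, Match) is left out. The name `uncorrected_p` is hypothetical.

```python
def uncorrected_p(seq1: str, seq2: str, missing: str = "-?") -> float:
    """Proportion of compared positions at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x in missing or y in missing:
            continue  # assumption: sites with a gap or missing state are skipped
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


p_example = uncorrected_p("ACGTACGT", "ACGTTCGA")
```

For the two eight-letter sequences shown, two positions differ, so the value is 2/8.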
67. t default Input file; x <String> (default: empty) Execute this command at startup; V <switch> (default: false) Show version string; S <switch> (default: false) Silent mode; d <switch> (default: false) Debug mode; s <switch> (default: true) Show startup splash screen; h <switch> (default: false) Show usage. Launching the program with option g will make the program run in non-GUI command-line mode, first reading commands from a file supplied with the i option, if any, then executing any command given with the x option, and then finally reading additional commands from standard input. Please be aware that the command-line version of the program uses the same properties file as the interactive version, so any preferences set using the interactive version of the program will also apply to the command-line version of the program. If this is not desired, then please use the p option to supply a different properties file. Commands provided to the program from within a file must be grouped together in a SplitsTree block, so: begin SplitsTree; commands end; Here we list all commands known to SplitsTree4. EXECUTE FILE=file: open and execute a file in Nexus format. OPEN FILE=file: open, but don't execute, a file in Nexus format. IMPORT FILE=file DATATYPE={PROTEIN|RNA|DNA|STANDARD|UNKNOWN}: open, but don't execute, a file in non-Nexus or old Nexus format. LOAD FILE=file: open or import
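The grouping of commands into a SplitsTree block can be sketched as a small helper that builds the text of such a block. This is only an illustration: the semicolon after each command follows the usual Nexus convention and is an assumption here, and both the function name `splitstree_block` and the file name `example.nex` are hypothetical.

```python
from typing import Iterable


def splitstree_block(commands: Iterable[str]) -> str:
    """Wrap commands in 'begin SplitsTree; ... end;' (punctuation is assumed)."""
    lines = ["begin SplitsTree;"]
    for command in commands:
        command = command.strip()
        if not command.endswith(";"):
            command += ";"  # assumption: each command ends with a semicolon
        lines.append("  " + command)
    lines.append("end;")
    return "\n".join(lines) + "\n"


# 'example.nex' is a placeholder file name, not one taken from the manual.
block_text = splitstree_block(["EXECUTE FILE=example.nex"])
```

The resulting text could be saved to a file and supplied to the program as a command file.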
68. t: Distances. K3ST: This method calculates distances using the Kimura 3ST model [33]. Usage: K3ST Maximum_Likelihood=<boolean> TRatio=<double> AC_vs_ATRatio=<double>; Input: Characters; Output: Distances. LogDet: This method calculates the LogDet distance [32]. Usage: LogDet Impute_Gaps=<boolean>; Input: Characters; Output: Distances. LogHamming: This method calculates distances using the log-Hamming distance. Usage: LogHamming ignoregaps=<boolean>; Input: Characters; Output: Distances. MedianNetwork: This method computes an unreduced median network [3]. It uses all sites in the character alignment that contain exactly two different states and no gaps or missing states. If UseRYAlphabet is selected, then DNA and RNA sequences are translated using R = {A, G} and Y = {C, T, U}. If UseRelaxedSupport is selected, a character need only be constant on one side of a split to count toward the support of the split. Usage: MedianNetwork AddAllTrivial=<boolean> MinimalSupport=<int> UseRYAlphabet=<boolean> LabelEdges=<boolean> UseRelaxedSupport=<boolean>; Input: Characters; Output: Splits. MedianJoining: This method computes a median joining network [4]. It uses all sites in the character alignment and treats gaps or missing data as additional sites. The parameter epsilon is used to control the level of homoplasy considered in the analysis. Usage: MedianJoining Epsilon=<int> Spr
69. terature, a split A/B is sometimes denoted by A|B or {A, B}. For example, consider the tree T displayed in Figure 1. Its split encoding Σ(T) contains 5 trivial splits that separate a single taxon from all other taxa, and 2 non-trivial splits that contain at least two taxa in both parts. The trivial splits are {a}/{b,c,d,e}, {b}/{a,c,d,e}, {c}/{a,b,d,e}, {d}/{a,b,c,e}, and {e}/{a,b,c,d}, and the non-trivial ones are {a,b}/{c,d,e} and {a,b,e}/{c,d}. Figure 1: An unrooted phylogenetic tree. A fundamental result in mathematical phylogeny states that any given set of splits Σ corresponds to some phylogenetic tree T with Σ(T) = Σ if and only if Σ is compatible [8]. Thus, any phylogenetic tree may be viewed as a graph whose task it is to give a visual representation of a given set of compatible splits. A phylogenetic tree can be thought of as an idealized representation of the historical relationships among a set of taxa, and tree-building methods are attempts to find the set of compatible splits that is most consistent with the data, according to some algorithm. Often there are multiple trees that are equally consistent with the data, i.e. multiple sets of compatible splits. In general, that collection of splits will be incompatible. Moreover, there exist inference methods, such as split decomposition and Neighbor-Net, that compute a set of incompatible splits from the data, in the form of a given distance matrix. Note that split decomposition produces
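The split encoding of a tree can be computed by deleting each edge in turn and recording which taxa end up on either side. The sketch below does this for a hypothetical five-taxon unrooted tree in which a and b form one cherry and c and d another; it is an illustrative example and not necessarily the tree of Figure 1. The internal node names u, v, w and the function name `split_encoding` are assumptions.

```python
from collections import defaultdict
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def split_encoding(
    edges: List[Edge], taxa: Iterable[str]
) -> Set[FrozenSet[FrozenSet[str]]]:
    """Return one split per edge: the taxa on either side once it is removed."""
    taxa = frozenset(taxa)
    adjacent = defaultdict(set)
    for x, y in edges:
        adjacent[x].add(y)
        adjacent[y].add(x)
    splits = set()
    for x, y in edges:
        # Collect every node reachable from x without crossing the edge (x, y).
        seen = {x}
        stack = [x]
        while stack:
            node = stack.pop()
            for nxt in adjacent[node]:
                if {node, nxt} == {x, y} or nxt in seen:
                    continue
                seen.add(nxt)
                stack.append(nxt)
        side = frozenset(seen) & taxa
        splits.add(frozenset({side, taxa - side}))
    return splits


# Hypothetical tree: leaves a..e, internal nodes u, v, w.
EDGES = [("a", "u"), ("b", "u"), ("u", "v"), ("e", "v"),
         ("v", "w"), ("c", "w"), ("d", "w")]
encoding = split_encoding(EDGES, "abcde")
trivial = {s for s in encoding if min(len(part) for part in s) == 1}
non_trivial = encoding - trivial
```

For this tree the encoding has seven splits, one per edge, of which five are trivial and two are non-trivial.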
70. the current Unaligned block. Currently, the program provides three methods. The NoAlign item simply adds gaps to the ends of sequences to make them all have the same length. The ClustalW item can be used to call the program ClustalW externally to compute an alignment of the sequences. The Muscle item can be used to call the program Muscle externally to compute an alignment of the sequences. 16.3 Characters Tab. The Pipeline Characters tab has three sub-tabs. The Pipeline Characters Method sub-tab is used to choose and configure a method to apply to the current Characters block. See Section 22 for a description of all available methods. The Pipeline Characters Filter sub-tab is used to exclude certain sites from the analysis. By default, the sites remain present in the Characters block and are simply masked during calculations. To remove the sites completely, press the Delete button. The Pipeline Characters Select sub-tab is used to select certain sites in the Characters block for display of sites in the network. If you have selected a split in the main viewer and you press Select Supporting Characters, then SplitsTree will fill the text box on the left of the dialog with all sites that support the selected split. These are all characters for which the set of character states on one side of the split is disjoint from the set of character states on the other side of the split. In this calculation, gaps and missing data are ignored.
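The definition of supporting characters can be sketched as follows. This is an illustration rather than the program's own code: sites are numbered from 1, a site is required to have at least one observed state on each side of the split, and the gap and missing symbols are taken to be `-` and `?`; all three choices are assumptions. The function name `supporting_sites` is hypothetical, and the small binary alignment is used only as sample data.

```python
from typing import Dict, Iterable, List

ALIGNMENT: Dict[str, str] = {
    "a": "010011010110",
    "b": "100001011110",
    "c": "011001101110",
    "d": "010001101111",
}


def supporting_sites(
    alignment: Dict[str, str], side_a: Iterable[str], missing: str = "-?"
) -> List[int]:
    """Return 1-based sites whose states on the two sides of a split are disjoint."""
    side_a = set(side_a)
    n_sites = len(next(iter(alignment.values())))
    sites = []
    for i in range(n_sites):
        states_a = {seq[i] for t, seq in alignment.items()
                    if t in side_a and seq[i] not in missing}
        states_b = {seq[i] for t, seq in alignment.items()
                    if t not in side_a and seq[i] not in missing}
        # Gaps and missing data were dropped above; both sides must be observed.
        if states_a and states_b and states_a.isdisjoint(states_b):
            sites.append(i + 1)
    return sites


support_ab = supporting_sites(ALIGNMENT, "ab")
```

For the split that separates a and b from c and d, only the seventh and eighth columns of this alignment qualify.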
71. the following items. The Copy Label item copies the node label to the system clipboard. The Edit Label item opens a dialog to edit the current node label. The Exclude Selected Taxa item excludes the selected taxa from all computations. The Show Name item labels the selected node by the names of any corresponding taxa. The Show Id item labels the selected node by the ids of any corresponding taxa. The Hide Label item hides the label of the selected node. The Format item opens the Format Nodes and Edges window. If the mouse is clicked on an edge of a network, then this opens the Edge popup menu, which has the following items. The Copy Label item copies the edge label to the system clipboard. The Edit Label item opens a dialog to edit the current edge label. The Show Id item labels the selected edge by the id of the corresponding split. The Show Weight item labels the selected edge by the weight of the corresponding split. The Show Confidence item labels the selected edge by the confidence of the corresponding split. The Show Interval item labels the selected edge by the confidence interval of the corresponding split. The Hide Label item hides the label of the selected edge. The Format item opens the Format Nodes and Edges window. If the mouse is clicked on the drawable region of the window, but not on any node or edge, then the Window popup menu will open, which has the following items. The Zoom In item is a short cut to the View Zoom
72. ther the sequences are DNA, RNA, protein, or standard (0/1) data. In Nexus files the datatype is explicitly given. In other file formats this is not always the case. If the program cannot guess the data type (e.g. if all character states are A), then the program will display a Choose Datatype dialog and prompt the user to specify the data type. If the current document is non-empty, then the selected file is opened in a new Main window. The File > Open Recent submenu provides access to recently opened or saved files. The File > Replace item is used to replace the current data by a new data set. The File > Clone item clones the current window. The File > Enter Data item opens the Data Entry dialog, which can be used to enter data by hand or to copy and paste it in a number of different formats. The File > Close item closes the current document. The File > Save item saves the current document to its file, if known. The File > Save As item provides a Save As dialog and saves the current document to the selected file. The File > Export item opens the Export dialog, which is used to save individual data in a number of different formats. The File > Export Image item opens the Export Image dialog, which is used to save the current network in a number of different formats. The File > Tools item provides a submenu of tools. The File > Tools > Load Trees item can be used to merge a set of trees into one document
73. ts confidence_nsplits split_nsplits END;

    20.8 Network Block

    The Network block contains the definition of a phylogenetic network. It has the following syntax:

    BEGIN NETWORK;
    DIMENSIONS NTAX=number-taxa NVERTICES=number-vertices NEDGES=number-edges;
    DRAW ROTATE=rotation;
    TRANSLATE
      vertex_1 taxon_1,
      vertex_2 taxon_2,
      ...
      vertex_ntax taxon_ntax,
    ;
    VERTICES
      1 x_1 y_1 WIDTH=n HEIGHT=n SHAPE=RECT|OVAL FGC=color BGC=color LINE=n,
      2 x_2 y_2 WIDTH=n HEIGHT=n SHAPE=RECT|OVAL FGC=color BGC=color LINE=n,
      ...
      nvertices x_nvertices y_nvertices WIDTH=n HEIGHT=n SHAPE=RECT|OVAL FGC=color BGC=color LINE=n,
    ;
    VLABELS
      vertex_id label X=xoffset Y=yoffset FGC=color FONT=font,
      ...
      vertex_id label X=xoffset Y=yoffset FGC=color FONT=font,
    ;
    EDGES
      1 vertex_id vertex_id ECLASS=n FGC=color LINE=n,
      2 vertex_id vertex_id ECLASS=n FGC=color LINE=n,
      ...
      nedges vertex_id vertex_id FGC=color LINE=n,
    ;
    ELABELS
      edge_id label X=xoffset Y=yoffset FGC=color FONT=font,
      ...
      edge_id label X=xoffset Y=yoffset FGC=color FONT=font,
    ;
    INTERNAL
      edge_id x y x y ... x y x y,
    ;
    END;

    20.9 Bootstrap Block

    The ST_Bootstrap block contains the results of a bootstrap analysis. It has the following syntax:

    BEGIN ST_BOOTSTRAP;
    DIMENSIONS NTAX=number-of-taxa NCHAR=number-of-characters NSPLITS=number-of-splits;
    FORMAT LABELS LEFT NO SPLITS
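To make the Network block layout in excerpt 73 more concrete, here is a minimal Python sketch that renders a small block as text. It assumes the usual NEXUS punctuation (KEY=value pairs, a comma after each entry, a semicolon closing each section), which the extracted excerpt does not show. The function name `format_network_block` and its arguments are hypothetical, only the DIMENSIONS, TRANSLATE, VERTICES and EDGES sections are written, and it has not been checked that SplitsTree4 accepts this exact text.

```python
def format_network_block(taxa, vertices, edges):
    """Render a minimal NETWORK block as text (illustrative sketch only).

    taxa:     dict mapping vertex id -> taxon label (TRANSLATE section)
    vertices: dict mapping vertex id -> (x, y) coordinates
    edges:    list of (vertex_id, vertex_id) pairs; edge ids are assigned 1, 2, ...
    """
    lines = [
        "BEGIN NETWORK;",
        # Punctuation below is an assumption based on general NEXUS conventions.
        f"DIMENSIONS NTAX={len(taxa)} NVERTICES={len(vertices)} NEDGES={len(edges)};",
        "TRANSLATE",
    ]
    lines += [f"{v} {label}," for v, label in sorted(taxa.items())]
    lines += [";", "VERTICES"]
    lines += [f"{v} {x} {y}," for v, (x, y) in sorted(vertices.items())]
    lines += [";", "EDGES"]
    lines += [f"{i} {a} {b}," for i, (a, b) in enumerate(edges, start=1)]
    lines += [";", "END;"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Two labeled vertices joined by a single edge.
    print(format_network_block({1: "A", 2: "B"},
                               {1: (0.0, 0.0), 2: (1.0, 0.0)},
                               [(1, 2)]))
```

Optional attributes such as WIDTH, SHAPE or FGC, and the VLABELS, ELABELS and INTERNAL sections, are left out to keep the sketch short.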
74. ts to compute a network using the MinSpanningNetwork method.
    The Networks > SpectralSplits item requests to compute a network using the SpectralSplits method.

    12.9 Analysis Menu

    The Analysis menu contains the following items:
    The Analysis > Bootstrap item opens the Bootstrap dialog.
    The Analysis > Show Bootstrap Network item opens a new Main window depicting a network that is based on all splits that occurred in any of the bootstrap replicates. Note that this item is enabled only after bootstrapping has been completed.
    The Analysis > Show Confidence Network item opens a new Main window containing a network that represents a 95% confidence set for the trees or networks estimated. Note that this item is enabled only after bootstrapping has been completed.
    The Analysis > Estimate Invariable Sites item uses the capture-recapture method of [26].
    The Analysis > Compute Phylogenetic Diversity item is enabled when a set of taxa is selected. It computes the sum of the weights for all splits that separate these taxa into two non-empty groups. On a tree, this is equivalent to the phylogenetic diversity measure of [12], as these splits will be exactly those lying on a path between two taxa.
    The Analysis > Compute delta scores item is enabled when a set of taxa is selected and there is a valid distances block. It computes the average of the delta scores for all quartets selected from that set of taxa, see [19]. The averages for each individual ta
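The two computations described in excerpt 74 can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's own code. Splits are represented as hypothetical `(side, weight)` pairs, where `side` is the set of taxa on one side of the split. The excerpt does not define the delta score of a quartet, so the sketch assumes the commonly used definition: form the three pairwise distance sums of the quartet, sort them as m1 >= m2 >= m3, and take (m1 - m2) / (m1 - m3), with 0 when all three sums are equal. All function names are made up for this example.

```python
from itertools import combinations


def phylogenetic_diversity(splits, selected):
    """Sum the weights of all splits that separate `selected` into two non-empty groups.

    splits:   iterable of (side, weight); `side` is the taxon set on one side of the split
    selected: the selected taxa
    """
    selected = set(selected)
    total = 0.0
    for side, weight in splits:
        inside = selected & set(side)    # selected taxa on this side of the split
        outside = selected - set(side)   # selected taxa on the other side
        if inside and outside:
            total += weight
    return total


def quartet_delta(d, x, y, u, v):
    """Delta score of one quartet (assumed definition, see lead-in); d[a][b] is a distance."""
    m1, m2, m3 = sorted(
        [d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]],
        reverse=True,
    )
    if m1 == m3:
        return 0.0
    return (m1 - m2) / (m1 - m3)


def average_delta(d, taxa):
    """Average delta score over all quartets drawn from `taxa`."""
    quartets = list(combinations(taxa, 4))
    if not quartets:
        raise ValueError("at least four taxa are needed to form a quartet")
    return sum(quartet_delta(d, *q) for q in quartets) / len(quartets)
```

On a tree with leaf edges a:1, b:2, c:3, d:4 and an internal edge of weight 0.5 separating {a, b} from {c, d}, selecting a and c gives 1 + 0.5 + 3 = 4.5, which is the length of the path between the two taxa. Distances taken from the same tree give a delta score of 0 for the quartet, since the two largest sums are equal.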
75. ultiple calculations simultaneously, making use of multiple processors when available. A Document consists of an individual data set and possesses its own Main window. The document is discarded when its window is closed. The program is based on the Nexus format, as introduced in [27], and the data associated with a document is organized into blocks, each of which is represented by a corresponding block in the Nexus format. The blocks are:
    • Taxa: the names of all taxa
    • Unaligned: unaligned sequences
    • Characters: aligned character sequences
    • Distances: pairwise distances between taxa
    • Quartets: (possibly weighted) quartet topologies
    • Trees: list of (possibly partial) trees
    • Splits: (possibly weighted) splits
    • Network: phylogenetic tree or network
    • ST_Assumptions: contains all methods and options used to compute data
    • ST_Bootstrap: bootstrap support of splits
    The first eight blocks (Taxa, Unaligned, Characters, Distances, Quartets, Trees, Splits and Network) are organized as a pipeline, and data is processed from left to right along this pipeline. Any non-empty document must contain a Taxa block and will usually contain an ST_Assumptions block. All other blocks are optional, and the presence or absence of a given block depends on the set of computations that the user has selected. We will use the term source block to denote the left-most block in the pipeline, excluding the Taxa block. The source block co
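The definition of the source block in excerpt 75 can be illustrated with a short Python sketch. The pipeline order is taken from the excerpt; the function name `source_block` and the use of plain strings for block names are assumptions made for this example.

```python
# Pipeline order as listed in the excerpt; the Taxa block always comes first.
PIPELINE = ["Taxa", "Unaligned", "Characters", "Distances",
            "Quartets", "Trees", "Splits", "Network"]


def source_block(present_blocks):
    """Return the left-most pipeline block that is present, excluding Taxa.

    Blocks outside the pipeline (ST_Assumptions, ST_Bootstrap) are ignored.
    Returns None if the document holds no pipeline block other than Taxa.
    """
    present = set(present_blocks)
    for name in PIPELINE[1:]:  # skip the Taxa block
        if name in present:
            return name
    return None
```

For a document holding Taxa, Distances, Splits, Network and ST_Assumptions blocks, the sketch returns Distances, since it is the first block after Taxa in pipeline order.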
76. xon are printed in the message menu. The delta score is computed using the distances block.
    The Analysis > Conduct Phi test for Recombination item tests for recombination using the Phi test of [6].
    The Analysis > Configure Pipeline item opens the Pipeline window, which can be used to configure the parameters of a given method.
    The Analysis > Configure Recent Methods submenu lists all recently used methods and can be used to open the Pipeline window in the appropriate tab to configure the parameters of a given method.

    Draw Menu

    The Draw menu contains the following items:
    The Draw > EqualAngle item requests to draw a network using the EqualAngle method.
    The Draw > RootedEqualAngle item requests to draw a rooted network using the RootedEqualAngle method.
    The Draw > ConvexHull item requests to draw a network using the ConvexHull method.
    The Draw > Phylogram item requests to draw a tree using the Phylogram method.
    The Draw > NoGraph item requests that no network be computed.
    The Draw > Reroot item requests that the selected edge become the new root of a tree in a rooted tree display.
    The Draw > Select Characters item opens the Pipeline Characters Select tab, which can be used to select individual characters (that is, sites in the given sequence alignment) to be displayed on the nodes of the network set of selected input trees. If one or more splits are selected in the main viewer, then pressing on the Select Supporting Characters button will