Common PAUP analysis commands

 

By Peter Unmack

 

I usually paste the blocks below at the end of my NEXUS files, open the command-line version of PAUP (which runs faster than the GUI), and execute the file. I usually copy the data file to my PAUP directory, or copy PAUP to where my data file is. The parts that usually get changed by the user, depending on the dataset and needs, are things like file names, outgroup taxa, and model parameters (highlighted in yellow in the original).

 

Parsimony Analysis

 

begin paup;

set autoclose=yes;

set criterion=parsimony;

set root=outgroup;

set storebrlens=yes;

set increase=auto;

outgroup fish1 fish2 etc;

hsearch addseq=random nreps=1000 swap=tbr hold=1;

savetrees file=datasetname.cb.pa.tree.nex format=altnex brlens=yes;

pscores /tl ci ri rc;

end;

 

Parsimony Bootstrap Analysis

 

begin paup;

set autoclose=yes;

set criterion=parsimony;

set root=outgroup;

set storebrlens=yes;

set increase=auto;

outgroup fish1 fish2 etc;

 

bootstrap nreps=1000 search=heuristic/ addseq=random nreps=10 swap=tbr hold=1;

savetrees from=1 to=1 file=datasetname.cb.pab.tree.nex format=altnex brlens=yes savebootp=NodeLabels MaxDecimals=0;

end;

 

ML Analysis

 

[! Likelihood settings from best-fit model (GTR+I+G) selected by AIC in Modeltest 3.7 on Tue Mar 13 23:03:48 2007]

 

begin paup;

set criterion=like;

set autoclose=yes;

set root = outgroup;

outgroup fish1 fish2 etc;

set storebrlens=yes;

set increase=auto;

 

log file=PeterML.log;

 

Lset  Base=(0.2892 0.2928 0.1309)  Nst=6  Rmat=(3.7285 46.5293 1.3888 2.3793 16.4374)  Rates=gamma  Shape=0.9350  Pinvar=0.5691;

hsearch addseq=random nreps=5 swap=tbr;

savetrees file=datasetname.cb.ml.tree.nex format=altnex brlens=yes maxdecimals=6;

end;

 

ML Bootstrap Analysis

 

[! Likelihood settings from best-fit model (GTR+I+G) selected by AIC in Modeltest 3.7]

 

begin paup;

set autoclose=yes;

set criterion=like;

set root = outgroup;

set storebrlens=yes;

set increase=auto;

Lset  Base=(0.2741 0.3195 0.1179)  Nst=6  Rmat=(2.1733 28.3895 1.0110 1.6886 15.2865)  Rates=gamma  Shape=1.6307  Pinvar=0.5765;

outgroup fish1 fish2 etc;

bootstrap nreps=1000 search=heuristic/ addseq=random swap=tbr hold=1;

savetrees from=1 to=1 file=datasetname.cb.mlb.tree.nex format=altnex brlens=yes savebootp=NodeLabels MaxDecimals=0;

end;

 

Molecular Clock Test

 

BEGIN PAUP;

log file= datasetname.cb.clock.log replace;

dset distance=JC objective=ME base=equal rates=equal pinv=0 subst=all negbrlen=setzero;

NJ showtree=no breakties=random;

End;

 

BEGIN PAUP;

Set criterion=like;

lscores 1/ Base=(0.2741 0.3195 0.1179)  Nst=6  Rmat=(2.1733 28.3895 1.0110 1.6886 15.2865)  Rates=gamma  Shape=1.6307  Pinvar=0.5765 scorefile=datasetname.cb.clock.scores replace;

 

outgroup fish1 fish2 etc /only;

roottrees;

 

lscores 1/ Base=(0.2741 0.3195 0.1179)  Nst=6  Rmat=(2.1733 28.3895 1.0110 1.6886 15.2865)  Rates=gamma  Shape=1.6307  Pinvar=0.5765 clock=yes scorefile=datasetname.cb.clock.scores append=yes;

 

log stop;

END;
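
The two lscores runs above write the -ln L of the same tree without and with the clock enforced. A common way to compare them is a likelihood ratio test with n - 2 degrees of freedom, where n is the number of taxa. The Python sketch below does that arithmetic outside PAUP; it is only an illustration, and the function names (chi2_sf, clock_lrt) and the example scores are hypothetical, not part of PAUP or of any real dataset.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    log_half = math.log(half)
    if df % 2 == 0:
        # Even df: sum of half**i * exp(-half) / i! for i = 0 .. df/2 - 1.
        powers = [float(i) for i in range(df // 2)]
        base = 0.0
    else:
        # Odd df: start from the df = 1 tail (erfc) and add the half-integer terms.
        powers = [i - 0.5 for i in range(1, (df - 1) // 2 + 1)]
        base = math.erfc(math.sqrt(half))
    # Each term is computed in log space to avoid overflow for large statistics.
    tail = sum(math.exp(p * log_half - half - math.lgamma(p + 1.0)) for p in powers)
    return min(1.0, base + tail)

def clock_lrt(neg_lnl_no_clock, neg_lnl_clock, n_taxa):
    """Return (statistic, df, p) from the two -ln L values reported by lscores."""
    stat = 2.0 * (neg_lnl_clock - neg_lnl_no_clock)
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)

if __name__ == "__main__":
    # Made-up scores for a 12-taxon dataset, purely for illustration.
    stat, df, p = clock_lrt(1000.0, 1010.0, 12)
    print("statistic = %.3f, df = %d, p = %.4f" % (stat, df, p))
```

A small p value would suggest rejecting the clock for that dataset; the two -ln L values have to be read from the score file or the log by hand.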

 

Parsimony Constraint Analysis

 

This is for testing alternative tree topologies (usually the monophyly of a group).  You compare the shortest tree from a normal, unconstrained parsimony analysis with the shortest tree found when a particular group is constrained to be monophyletic.

 

begin paup;

set warntree=no;

set warnreset=no;

set autoclose=yes;

set criterion=parsimony;

set root=outgroup;

set storebrlens=yes;

set increase=auto;

outgroup fish1 fish2 etc;

hsearch start=stepwise addseq=random nreps=10 swap=tbr hold=1;

savetrees file=datasetname.cb.noconst.pa.tree.nex format=altnex brlens=yes;

pscores /tl ci ri rc;

constraints australis=((M.australis.DeGrey.1 M.australis.Westrelly.1 M.australis.Ashburton.1 M.australis.Fortescue.1 M.australis.Fortescue.2 M.australis.Sherlock.1 M.australis.Sherlock.3 M.australis.Ord.1 M.australis.Manning.1 M.australis.Carson.1 M.australis.Isdel.1 M.australis.Sturt.1 M.australis.0137.Finnis.1 M.australis.0137.Finnis.2 M.australis.0121.Reynolds.1 M.australis.97126.King.1 M.australis.97128.Douglas.1 M.australis.97130.Blackmore.1 M.australis.97130.Blackmore13 M.australis.97129.Charlotte.1 M.australis.Mitchell.1 M.australis.King.Edward.1 M.australis.0121.Reynolds.2 M.australis.HL003.Katherine.1 M.australis.HL005.Katherine.1));

hs enforce constraints=australis addseq=random nreps=10 swap=tbr hold=1;

savetrees file=datasetname.cb.const.pa.tree.nex format=altnex brlens=yes;

pscores /tl ci ri rc;

[pscores /nonparamtest=yes;]

 

[seems as if one has to manually edit the appended tree file otherwise it only recognizes a single tree]

 

end;
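
Typing a long constraints line like the one above by hand is error-prone. As a rough sketch, the Python helper below builds the same style of command from a list of taxon names; the function names and the short taxon list are hypothetical, and the output format simply copies the constraints line used in this block.

```python
def taxa_with_prefix(all_taxa, prefix):
    """Pick the taxa whose names start with the given prefix."""
    return [t for t in all_taxa if t.startswith(prefix)]

def constraint_command(name, taxa):
    """Format a monophyly constraint in the same style as the block above."""
    if not taxa:
        raise ValueError("a constraint needs at least one taxon")
    return "constraints {}=(({}));".format(name, " ".join(taxa))

if __name__ == "__main__":
    # A made-up subset of taxon names, for illustration only.
    all_taxa = ["fish1", "fish2", "M.australis.DeGrey.1", "M.australis.Ord.1"]
    print(constraint_command("australis", taxa_with_prefix(all_taxa, "M.australis.")))
```

The printed line can then be pasted into the PAUP block in place of the hand-typed constraint.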

 

ML Constraint Analysis

 

[! Likelihood settings from best-fit model (TrN+I+G) selected by AIC in Modeltest 3.6]

 

begin paup;

set criterion=distance;

set autoclose=yes;

set root = outgroup;

outgroup fish1 fish2 etc;

set storebrlens=yes;

set increase=auto;

 

DSet distance=JC objective=ME base=equal rates=equal pinv=0 subst=all negbrlen=setzero;

NJ showtree=no breakties=random;

 

set criterion=like;

 

Lset  Base=(0.2559 0.3195 0.1444)  Nst=6  Rmat=(1.0000 20.4705 1.0000 1.0000 11.9869)  Rates=gamma  Shape=0.7570  Pinvar=0.5644;

hsearch addseq=random nreps=5 swap=tbr;

savetrees file=datasetname.cb.const.ml.tree.nex format=altnex brlens=yes maxdecimals=6;

 

cleartrees nowarn=yes;

 

constraints australis=((M.australis.DeGrey.1 M.australis.Westrelly.1 M.australis.Ashburton.1 M.australis.Fortescue.1 M.australis.Fortescue.2 M.australis.Sherlock.1 M.australis.Sherlock.3 M.australis.Ord.1 M.australis.Manning.1 M.australis.Carson.1 M.australis.Isdel.1 M.australis.Sturt.1 M.australis.0137.Finnis.1 M.australis.0137.Finnis.2 M.australis.0121.Reynolds.1 M.australis.97126.King.1 M.australis.97128.Douglas.1 M.australis.97130.Blackmore.1 M.australis.97130.Blackmore13 M.australis.97129.Charlotte.1 M.australis.Mitchell.1 M.australis.King.Edward.1 M.australis.0121.Reynolds.2 M.australis.HL003.Katherine.1 M.australis.HL005.Katherine.1));

 

hsearch enforce constraints=australis addseq=random nreps=5 swap=tbr;

savetrees file=datasetname.cb.const.ml.tree.nex format=altnex brlens=yes maxdecimals=6 append=yes;

 

[gettree file=datasetname.cb.const.ml.tree.nex;]

[lscore all/ shtest=RELL; or shtest=fullopt]

 

end;

 

[seems as if one has to manually edit the appended tree file otherwise it only recognizes a single tree]

 

To Set Up Data Partitions And Character Sets

 

This is a complicated example for a dataset with five genes (R1, R2, S7, cytb and 12S), in which two of the nuclear genes (R1 and S7) are split into their separate exons and introns.  The different partitions are for running analyses using different gene combinations.

 

charpartition all=R1E1:1-222,R1I1:223-309,R1E2:310-1443,R1I2:1444-2301,R1E3:2302-3846,R2:3847-4753,S7I1:4754-5472,S7I2:5473-6056,cytb:6057-7196,12S:7197-7745;

charpartition nuccytb=R1E1:1-222,R1I1:223-309,R1E2:310-1443,R1I2:1444-2301,R1E3:2302-3846,R2:3847-4753,S7I1:4754-5472,S7I2:5473-6056,cytb:6057-7196;

charpartition gene=R1:1-3846,R2:3847-4753,S7:4754-6056,cytb:6057-7196,12S:7197-7745;

charpartition gene.cytb=R1:1-3846,R2:3847-4753,S7:4754-6056,cytb:6057-7196;

charpartition gene.12S=R1:1-3846,R2:3847-4753,S7:4754-6056,12S:7197-7745;

charpartition gene.nuc=R1:1-3846,R2:3847-4753,S7:4754-6056;

charpartition gene.nuc2=R1:1-3846,R2:3847-4753,S7I1:4754-5472,S7I2:5473-6056;

charpartition gene.rag=R1:1-3846,R2:3847-4753;

charpartition nuc12s=R1E1:1-222,R1I1:223-309,R1E2:310-1443,R1I2:1444-2301,R1E3:2302-3846,R2:3847-4753,S7I1:4754-5472,S7I2:5473-6056,12S:7197-7745;

charpartition nuc=R1E1:1-222,R1I1:223-309,R1E2:310-1443,R1I2:1444-2301,R1E3:2302-3846,R2:3847-4753,S7I1:4754-5472,S7I2:5473-6056;

charpartition rag=R1E1:1-222,R1I1:223-309,R1E2:310-1443,R1I2:1444-2301,R1E3:2302-3846,R2:3847-4753;

charpartition mt=cytb:6057-7196,12S:7197-7745;
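
With this many ranges it is easy to mistype a boundary. The Python sketch below is one possible way to sanity-check a charpartition definition before pasting it into a file: it parses the name:start-end list, reports the length of each block, and flags reversed or overlapping ranges. The function names are hypothetical and it only handles the simple name:start-end form used above.

```python
def parse_charpartition(spec):
    """Parse 'name:start-end,name:start-end,...' into (name, start, end) tuples."""
    blocks = []
    for item in spec.split(","):
        name, span = item.strip().split(":")
        start, end = (int(v) for v in span.split("-"))
        blocks.append((name, start, end))
    return blocks

def check_blocks(blocks):
    """Return (sizes, problems): block lengths plus reversed or overlapping ranges."""
    sizes = {name: end - start + 1 for name, start, end in blocks}
    problems = ["{} is reversed".format(n) for n, s, e in blocks if e < s]
    ordered = sorted(blocks, key=lambda b: b[1])
    # After sorting by start, any overlap shows up between neighbouring blocks.
    for (n1, s1, e1), (n2, s2, e2) in zip(ordered, ordered[1:]):
        if s2 <= e1:
            problems.append("{} overlaps {}".format(n1, n2))
    return sizes, problems

if __name__ == "__main__":
    spec = ("R1E1:1-222,R1I1:223-309,R1E2:310-1443,R1I2:1444-2301,"
            "R1E3:2302-3846,R2:3847-4753,S7I1:4754-5472,S7I2:5473-6056,"
            "cytb:6057-7196,12S:7197-7745")
    sizes, problems = check_blocks(parse_charpartition(spec))
    print(sizes, sum(sizes.values()), problems)
```

Gaps between blocks are not reported, since several of the partitions above deliberately leave genes out.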

 

 

charset r1e1=1-222;

charset r1i1=223-309;

charset r1e2=310-1443;

charset r1i2=1444-2301;

charset r1e3=2302-3846;

charset rag2=3847-4753;

charset S7I1=4754-5472;

charset S7I2=5473-6056;

charset S7=4754-6056;

charset cytb=6057-7196;

charset 12S=7197-7745;

 

EXCLUDE character-list [/ONLY];

 

Use this for defining different codon positions. 

 

CHARSET  pos_1  =  1-253\3;

CHARSET  pos_2  =  2-254\3;

CHARSET  pos_3  =  3-255\3 255;

 

or to combine first and second codon positions

 

CHARSET pos_1_2 = 1-253\3 2-254\3;

CHARSET pos_3 = 3-255\3 255;

 

or for the same codon positions spanning multiple genes with different starting points, plus one gene with a whole partition.

 

charset 1st = 1-2021\3 2022-2179\3 2180-2862\3 2863-3174\3 3175-5696\3;

charset 2nd = 2-2021\3 2023-2179\3 2181-2862\3 2864-3174\3 3176-5696\3;

charset 3rd = 3-2021\3 2024-2179\3 2182-2862\3 2865-3174\3 3177-5696\3;

charset s7 = 5697-6827;

 

or

 

codonposset * coding=

N:1-10,

1:11-.\3,

2:12-.\3,

3:13-.\3;

This designates bases 1-10 as noncoding, and assigns the remaining bases to codon positions in the order 123123123....
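
The \3 in these definitions means "every third character", starting from the first number of the range. As an illustration, the Python sketch below expands such a range and generates the 1st/2nd/3rd charset lines for several coding segments. It assumes that every segment starts at codon position 1, and the function names are hypothetical.

```python
def expand(start, end, step=3):
    """List the character numbers selected by 'start-end\\step'."""
    return list(range(start, end + 1, step))

def codon_charsets(segments):
    """Build charset lines from (start, end) pairs that each begin at codon position 1."""
    lines = []
    for offset, label in enumerate(("1st", "2nd", "3rd")):
        ranges = " ".join("{}-{}\\3".format(start + offset, end)
                          for start, end in segments)
        lines.append("charset {} = {};".format(label, ranges))
    return lines

if __name__ == "__main__":
    print(expand(3, 255)[:4], "...", expand(3, 255)[-1])
    segments = [(1, 2021), (2022, 2179), (2180, 2862), (2863, 3174), (3175, 5696)]
    for line in codon_charsets(segments):
        print(line)
```

With the five segments shown, this reproduces the multi-gene charset lines given above.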

 

BEGIN SETS;

charset 1st = 1-2021\3 2022-2179\3 2180-2862\3 2863-3174\3 3175-5696\3;

charset 2nd = 2-2021\3 2023-2179\3 2181-2862\3 2864-3174\3 3176-5696\3;

charset 3rd = 3-2021\3 2024-2179\3 2182-2862\3 2865-3174\3 3177-5696\3;

charset s7 = 5697-6827;

END;

 

[INCLUDE 1st 2nd /ONLY;]

[INCLUDE 1st /ONLY;]

[INCLUDE 2nd /ONLY;]

[INCLUDE 3rd /ONLY;]

 

Use this to set up a weighting scheme for different changes.

 

USERTYPE a STEPMATRIX= 4

     A C G T

 [A] . 2 1 2

 [C] 2 . 2 1

 [G] 1 2 . 2

 [T] 2 1 2 .

;

 

USERTYPE b STEPMATRIX= 4

     A C G T

 [A] . 1 0 1

 [C] 1 . 1 0

 [G] 0 1 . 1

 [T] 1 0 1 .

;

 

USERTYPE c STEPMATRIX= 4

     A C G T

 [A] . 0 1 0

 [C] 0 . 0 1

 [G] 1 0 . 0

 [T] 0 1 0 .

;
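
Matrices b and c above count only transversions (transitions cost 0) and only transitions (transversions cost 0), respectively. The Python sketch below shows one way to generate a step matrix in the same layout from a transition cost and a transversion cost; the function names are hypothetical, and the layout is simply copied from the matrices above.

```python
BASES = "ACGT"
PURINES = {"A", "G"}

def step_cost(a, b, ts, tv):
    """Cost of a change: ts within purines or within pyrimidines, tv otherwise."""
    same_class = (a in PURINES) == (b in PURINES)
    return ts if same_class else tv

def stepmatrix(name, ts, tv):
    """Return a USERTYPE step matrix as text, in the layout used above."""
    lines = ["USERTYPE {} STEPMATRIX= 4".format(name), "     " + " ".join(BASES)]
    for a in BASES:
        cells = ["." if a == b else str(step_cost(a, b, ts, tv)) for b in BASES]
        lines.append(" [{}] {}".format(a, " ".join(cells)))
    lines.append(";")
    return "\n".join(lines)

if __name__ == "__main__":
    # Transversions weighted twice as heavily as transitions.
    print(stepmatrix("tv2", ts=1, tv=2))
```

Calling stepmatrix("b", 0, 1) or stepmatrix("c", 1, 0) gives the same rows as matrices b and c.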

 

Partition Homogeneity Test (PHT) or Incongruence Length Difference (ILD)

 

This tests whether two datasets can be combined, in this case cytb and S7.  The log file contains the p value for the test.  If it is significant, then according to the test the two datasets are incongruent and should not be combined for analysis.

 

begin paup;

log file=datasetname.pht.log;

 

charpartition s7cb=cb:1-601,S7:602-1492;

charset cb=1-601;

charset S7=602-1492;

 

set autoclose=yes;

set criterion=parsimony;

set root=outgroup;

set increase=auto;

outgroup fish1 fish2 etc;

hompart partition=s7cb nreps=100 search=heuristic/ addseq=random nreps=10 swap=tbr hold=1;

end;

 

Majority Rule Tree

 

gettrees file=pp.cb.mrb.nex.run1.t mode=7;
gettrees file=pp.cb.mrb.nex.run2.t mode=7;
contree all/ majrule treefile=pp.cb.mrb1.tree.nex;

 

Acknowledgements

 

Thanks to Martin Wojciechowski at Arizona State University who initially provided me with most of these paup blocks.

 
