This site hosts supporting information for PACo: A Novel Procrustes Application to Cophylogenetic Analysis (Balbuena JA, Míguez-Lozano R, Blasco-Costa I, 2013, PLOS ONE 8(4): e61048). Although the R scripts provided here are fully functional, if you intend to apply PACo to your own data we recommend the more recent R package paco (Hutchinson et al. 2017). A major improvement of the package over the original scripts is that it provides a large suite of null models by employing the swap algorithms of vegan. In addition, it includes the approach of de Vienne et al. (2011) for transforming non-Euclidean phylogenetic distance matrices into Principal Coordinates.
In the Downloads section you will find the R code and examples to implement PACo as described in Balbuena et al. (2013). You will also be able to access the code and tutorial of the Rumbling-Orchids Pipeline to assess divergent evolution between nuclear and organelle sequences, as shown in Pérez-Escobar et al. (2016).
1. The Problem
2. Why PACo?
|Figure 1. Example of host and parasite phylogenies illustrating four common evolutionary events contemplated in cophylogenetic studies: cospeciation (hosts and parasites speciate in parallel), host-switch (the parasite is able to colonize a new unrelated host), lineage sorting (failure to speciate or disappearance of a parasite lineage on a host lineage) and duplication (independent speciation of the parasite). (Based on Page ).|
The diversification patterns over evolutionary time of tightly associated organisms, such as parasites and their hosts, are seldom independent. Therefore some degree of congruence (i.e., topological similarity) between the phylogenies of the associated taxa is expected to occur. Congruence expresses the extent to which each node in a given tree maps to a corresponding position in the other tree and perfect congruence can be interpreted as evidence for cospeciation, which may or may not result from coevolutionary mechanisms. Such perfect congruence is rarely, if ever, observed in nature, because in addition to cospeciation, other types of evolutionary events can act concurrently (Fig. 1). Thus, the historical reconstruction of the associations between two given sets of organisms is not straightforward because it needs to evaluate and disentangle the relative roles played by each evolutionary process.
PACo is a global fit method for cophylogenetic analysis based on Procrustes analysis that
Although PACo does not explicitly evaluate the contribution of the evolutionary events set forth above, the amount of phylogenetic congruence can be viewed as a measurement of the degree of coevolution in the system studied. For greater usability, PACo is implemented in the free, open-source statistical software R and runs in a reasonable amount of computing time, which makes the analysis of large datasets feasible.
Since there are already many methods for cophylogenetic analysis out there, you might be wondering whether yet another test is really necessary. However, PACo includes several innovative features with respect to previous global-fit methods, such as ParaFit of Legendre et al. (2002) or the cospeciation test described by Hommola et al. (2009):
PACo contemplates a given parasite occurring in more than one host species and, conversely, a host harbouring more than one parasite species. Figure 2 gives an overview of the method. The test builds on three pieces of information: two phylogenetic trees corresponding to hosts and parasites, and a binary matrix coding the host-parasite associations (H-P link matrix). Let h and p be the numbers of host and parasite species in the respective phylograms. The H-P link matrix is then an h × p matrix in which 1 denotes the presence of a given parasite species in a given host species and 0 denotes its absence.
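To make the layout of the H-P link matrix concrete, here is a minimal sketch in plain Python. It is only an illustration: the function name `build_link_matrix` and the toy host and parasite labels are hypothetical and are not part of the PACo scripts or the paco package.

```python
def build_link_matrix(hosts, parasites, associations):
    """Return an h x p binary matrix as a list of rows.

    Rows follow the order of `hosts`, columns the order of `parasites`.
    A 1 marks a recorded host-parasite association, a 0 its absence.
    """
    host_index = {name: i for i, name in enumerate(hosts)}
    parasite_index = {name: j for j, name in enumerate(parasites)}
    matrix = [[0] * len(parasites) for _ in hosts]
    for host, parasite in associations:
        matrix[host_index[host]][parasite_index[parasite]] = 1
    return matrix


# Hypothetical toy data: three hosts, two parasites, four links.
hosts = ["H1", "H2", "H3"]
parasites = ["P1", "P2"]
associations = [("H1", "P1"), ("H2", "P1"), ("H2", "P2"), ("H3", "P2")]

hp = build_link_matrix(hosts, parasites, associations)
for name, row in zip(hosts, hp):
    print(name, row)
```

In this toy example, host H2 harbours both parasites and parasite P1 occurs in two hosts, which is the kind of multiple association the method is designed to accommodate.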
|Figure 2. Method overview of PACo: (1) The phylogenetic information encapsulated by the host-parasite (H-P) tanglegram gives way to two distance matrices of hosts and parasites, and a binary matrix of H-P links. (2) The distance matrices are transformed by Principal Coordinates. (3) The H-P link matrix is converted into an identity matrix to account for multiple host-parasite associations. (4) Rows in the Principal Coordinate matrices are duplicated (arched arrows) following the order dictated by the identity matrix. (5) The extended Principal Coordinate matrices (X and Y) are centred by mean column vectors and subjected to Procrustes analysis, where the parasite configuration is rotated and scaled to fit the host configuration. The fit can be visualised in a Procrustes superimposition plot. (6) The analysis yields a global goodness-of-fit statistic, whose significance can be established by a randomization procedure. The importance of each H-P link can be assessed by its squared residual, which, together with its 95% confidence interval, is estimated using a jackknife method.|
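The steps in Figure 2 can be sketched numerically as follows. This is a rough illustration in Python with NumPy, not the R code distributed on this site nor the paco package. The function names (`pcoa`, `paco_m2`, `permutation_p_value`) are hypothetical. Several simplifications are assumptions of the sketch: zero and negative Principal Coordinate axes are simply dropped rather than corrected, the null model shuffles each parasite's column of the link matrix, and the jackknife estimation of link residuals is omitted.

```python
import numpy as np


def pcoa(dist):
    """Classical Principal Coordinates of a square distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Assumption: keep only clearly positive axes (no correction applied).
    keep = eigvals > 1e-9 * eigvals.max()
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])


def paco_m2(host_dist, para_dist, links):
    """Procrustes residual sum of squares; smaller means better fit."""
    host_idx, para_idx = np.nonzero(links)   # one entry per H-P link
    X = pcoa(host_dist)[host_idx]            # duplicate host rows per link
    Y = pcoa(para_dist)[para_idx]            # duplicate parasite rows per link
    X = X - X.mean(axis=0)                   # centre both configurations
    Y = Y - Y.mean(axis=0)
    # Optimal rotation and scaling of Y onto X via the singular values.
    singular = np.linalg.svd(X.T @ Y, compute_uv=False)
    return (X ** 2).sum() - singular.sum() ** 2 / (Y ** 2).sum()


def permutation_p_value(host_dist, para_dist, links, n_perm=999, seed=0):
    """Share of randomized link matrices that fit at least as well."""
    rng = np.random.default_rng(seed)
    observed = paco_m2(host_dist, para_dist, links)
    hits = 0
    for _ in range(n_perm):
        # Assumed null model: reassign hosts at random for each parasite.
        shuffled = np.column_stack([rng.permutation(col) for col in links.T])
        if paco_m2(host_dist, para_dist, shuffled) <= observed + 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy case: identical host and parasite trees ((A,B),(C,D)),
# given as patristic distances, with one-to-one links.
tree_dist = np.array([[0, 2, 4, 4],
                      [2, 0, 4, 4],
                      [4, 4, 0, 2],
                      [4, 4, 2, 0]], dtype=float)
links = np.eye(4, dtype=int)

m2, p_value = permutation_p_value(tree_dist, tree_dist, links, n_perm=99)
print("m2:", round(float(m2), 6), "p:", p_value)
```

Because the two toy trees are identical and the links are one-to-one, the residual sum of squares is zero up to rounding error, which corresponds to perfect congruence. With real data the statistic is positive, and the permutation step indicates whether the observed fit is better than expected under random associations.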
My fellow coauthors and I will be happy to answer any questions regarding PACo. Suggestions, criticism and feedback will also be most welcome. Please address your queries to firstname.lastname@example.org.
Last update: 28 Aug. 2017