Introducing the R package TransPhylo

Xavier Didelot


TransPhylo is a R package that can reconstruct infectious disease transmission using genomic data. The input is a dated phylogeny, where leaves correspond to pathogens sampled from the known infected hosts. The main output is a transmission tree which indicates who infected whom, including the potential existence of unsampled individuals who may have acted as missing transmission links. TransPhylo works by colouring the branches of the phylogeny using a separate colour for each host, sampled or not. Each section of the tree coloured in a unique colour represents the pathogen evolution happening within a distinct host. Changes of colours on branches therefore correspond to transmission events from one host to another.

If you have not already done so, you can install TransPhylo using the following R command:


You should then be able to load TransPhylo using:


In the first part of this tutorial we will simulate a dataset. In the second part we will analyse the dataset simulated in the first part. If you already have a dataset on which you want to apply TransPhylo you can skip the first part, although you might still find it useful to read as it introduces some of the concepts in our model.