Computational Biology: The Beauty and the Beast
The problem of comparing hierarchies occurs in diverse areas of biology. Phylogenetic trees or more general forms of hierarchies need to be compared to identify symbiosis between host and parasite, to evaluate tumor progression models, or to transfer knowledge from the Gene Ontology to large-scale interaction networks. In recent years, matching-based alternatives to the most popular measure of tree similarity, the Robinson-Foulds (RF) metric, have been proposed. In particular, the generalized Robinson-Foulds metric corrects some of its flaws while retaining its widely appreciated properties. Here, we build upon our previous proof-of-concept study and propose a method and practical software tool, Trajan, that for the first time allows to efficiently compute the NP-hard generalized RF metric and variants thereof for non-trivial real-world instances. Trajan combines polyhedral insights with a state-of-the-art non-linear solver to achieve a remarkable speed-up and makes the generalized RF metric a "computable" distance.
In experiments, we demonstrate that the tree-consistent “best corresponding node” mapping established by Trajan allows comparing gene expression dynamics in a meaningful way.