Taxonomic concept identification, reconciliation, and the Open Tree of Life – Part 1

This is “Part 1”, leaving room for subsequent parts (e.g., the blue/non-blue use case, now added at the end of this post). The relevant OpenTree background discussion is here:

https://github.com/OpenTreeOfLife/muriqui/issues/15

There are several interesting and closely overlapping issues and views in that discussion. I will pick mainly the one introduced by Jonathan Rees (November 02, 2014).

“Here’s another example I’m struggling with: there are currently a couple of species in OTT that are misclassified as crustaceans instead of molluscs. When we fix this problem, there will be an incompatible ‘change’ in the membership of Arthropoda. Does this mean that the new group should get a new identifier? – after all its identity in some sense has changed. If so, annotations and OTU mappings linked to the old id have no home in the tree. It doesn’t get a new id with the current taxonomy generator, which assumes that names are tied uniquely to taxon concepts (with some exceptions), but with a more principled system where groups are defined by membership or phylogenetic hypotheses, it might. This would have an impact on OTU mappings and annotation carryover. I don’t have a good answer to this one, but am working on ways to anchor the semantics of ids.”
 

I am attempting to reproduce this under varying scenarios in the Euler/X toolkit, starting simple and then expanding. I have created an initial scenario with two phylogenetic perspectives of the Ecdysozoa (molting animals): Ecdysozoa sec. ott1 (= OpenTree Topology at time = 1; the later version) versus Ecdysozoa sec. ott0 (OTT at time = 0; the earlier version). The initial alignment shows complete taxonomic congruence.

1A. Complete congruence, input visualization.

mollusca-static-input

1B. Complete congruence, alignment visualization.

mollusca-static-merge

Added to this are two species-level concepts, “ProblemSpeciesA” and “ProblemSpeciesB”, each present sec. ott1 and sec. ott0. At time = 0 these concepts are “erroneously” assigned to a crustacean genus-level concept in ott0, i.e., as children of ott0.Crustacea_Genus17, but at time = 1 they are placed as children of the molluscan genus-level concept ott1.Mollusca_Genus3. Shown is the toolkit’s default alignment (-e mnpw --rcgo).

2A. With Problem Species, input visualization.

mollusca-change1-input

2B. With Problem Species, alignment visualization.

mollusca-change1-merge

Some insights emerge. First, and evidently, the combination of the default reasoner setting and default input annotation, plus the “radically” different placement of the two low-level concepts, causes a good amount of taxonomic incongruence at superordinate levels. Second, the “noise ends” at the level of Ecdysozoa sec. ott1/ott0. In this particular case, because we presumably have (1) differential taxonomic placements and (2) differential phenotypic property assignments to the four problematic concepts in question, I am not sure (at the moment) that property-centric individuation of concepts will change things.
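The way the incongruence propagates upward and then stops at Ecdysozoa can be sketched with plain set membership. The Python below is only an illustration under stated assumptions: it is not how Euler/X reasons internally, and the toy hierarchy (including `CrustSpecies1`, `MollSpecies1`, and the intermediate `Arthropoda`/`Mollusca` levels) is hypothetical rather than the actual input used for the figures. It also assumes that same-named species-level concepts are congruent across ott0 and ott1. Each higher-level concept is reduced to its set of leaf concepts, and the ott1 set is compared with the ott0 set using the five RCC-5 relations (`==`, `<`, `>`, `><`, `!`).

```python
# Toy child -> parent maps. Names other than Ecdysozoa, Crustacea_Genus17,
# Mollusca_Genus3 and the two ProblemSpecies are hypothetical placeholders.
OTT0 = {
    "Arthropoda": "Ecdysozoa",
    "Mollusca": "Ecdysozoa",
    "Crustacea_Genus17": "Arthropoda",
    "Mollusca_Genus3": "Mollusca",
    "CrustSpecies1": "Crustacea_Genus17",
    "MollSpecies1": "Mollusca_Genus3",
    "ProblemSpeciesA": "Crustacea_Genus17",  # placement at time = 0
    "ProblemSpeciesB": "Crustacea_Genus17",
}
# At time = 1 only the two problem species change parent.
OTT1 = {
    **OTT0,
    "ProblemSpeciesA": "Mollusca_Genus3",
    "ProblemSpeciesB": "Mollusca_Genus3",
}


def leaf_sets(parent_of):
    """Map every non-leaf concept to the set of leaf concepts beneath it."""
    parents = set(parent_of.values())
    leaves = [name for name in parent_of if name not in parents]
    members = {}
    for leaf in leaves:
        node = leaf
        while node in parent_of:  # walk up to the root
            node = parent_of[node]
            members.setdefault(node, set()).add(leaf)
    return members


def rcc5(a, b):
    """RCC-5 relation of leaf set a to leaf set b."""
    if a == b:
        return "=="  # congruent
    if a < b:
        return "<"   # a is properly included in b
    if a > b:
        return ">"   # a properly includes b
    if a & b:
        return "><"  # overlap
    return "!"       # disjoint


def compare(t0, t1):
    """Relation of each ott1 concept to its same-named ott0 concept."""
    m0, m1 = leaf_sets(t0), leaf_sets(t1)
    return {name: rcc5(m1[name], m0[name]) for name in m0 if name in m1}


if __name__ == "__main__":
    for name, relation in sorted(compare(OTT0, OTT1).items()):
        print(f"ott1.{name} {relation} ott0.{name}")
```

In this toy case the genus-level concepts and their parents come out as `<` or `>` (the ott1 crustacean side shrinks, the ott1 molluscan side grows), while Ecdysozoa comes out as `==` because the two species never leave it. That mirrors the point raised in the quoted question: under a membership-based reading, Arthropoda sec. ott1 is not the same concept as Arthropoda sec. ott0, even though the name is unchanged.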


And here is a simplified version of Arlin Stoltzfus’ example, reduced to represent just blue and non-blue as character-state concepts. These are assigned incongruent sets of taxon concepts at time = 1 versus time = 0. Nevertheless, they retain congruence.

3A. Blue/non-blue example, input visualization.

blue-nonblue2-input

3B. Blue/non-blue example, alignment visualization.

blue-nonblue2-merge

Under this particular representation, the taxon/character duality is apparent – two views of “the same phenomenon”.
