Last week we read and appreciated Seltmann et al.’s (2012) effort to carefully describe the benefits, use, and user community roll-out of the spectacularly well annotated Hymenoptera Anatomy Ontology Portal. We clearly need and want something like this for Coleoptera. That said, we continue to explore options to maybe do things a little differently. Looking for inspiration, we are reading once more what is to my mind one of the best demonstrations of how phenotype ontologies can be used to address research questions – by phylogenetic systematists, for phylogenetic systematists.
Ramírez, M.J. & P. Michalik. 2014. Calculating structural complexity in phylogenies using ancestral ontologies. Cladistics (Early View). Available here.
We are also starting, based on this semester’s cumulative readings, to formulate some interests of our own. Hence the following homework for all; due by next Wednesday’s discussion.
Formulate three research themes or questions that are comparative/phylogenetic in nature and could possibly make use of phenotype ontologies. Be very specific, ideally starting with the taxonomic group and character system that you are most intimately acquainted with. (In my case, e.g., that might be acalyptine weevil mouthparts.) Best to work outward from the current core of your taxonomic expertise. Research ideas might take into account (yet are clearly not limited to):
- Evolution of phenotype complexity, reduction.
- Correlations across character systems.
- Presence/absence of traits across larger phylogenetic groups and within/among subgroups.
- Relationships of traits to non-organismal variables (e.g., environment).
- Annotations and inferences targeting the specimen level versus higher taxon entities.
- Evolutionary rates, timing.
- Associations, coevolutionary themes.
- Information availability, completeness, suitability for analysis.
- … [insert your favored domain of phenomena or inquiry here]
The idea is to engage in a bit of a reverse engineering exercise. We know that the earliest phenotype ontologies came out of the model organism community – what Nelson & Platnick (1981) might refer to as “general biology” (pages 4-5). Yet systematists tend to ask comparative questions. What (if any) general structures, entities, and relationships do these comparative/phylogenetic questions entail? Which kinds of inferences are we (most) interested in? How would the components needed to accommodate the inferences be fruitfully translated into a logic framework?
In other words, let’s pretend we are well advised to engage in some conceptual modeling for the future design of a Coleoptera Anatomy Ontology (which may not carry such a name in the end). Start with nailing down our most highly domain-specific questions. Abstract overarching design needs from these. Pretend that solutions will follow.
“Part 1” – allowing for subsequent parts (e.g., for the blue/non-blue use case; now added at end of post). The relevant OpenTree background discussion is here:
There are several interesting and generally closely overlapping issues and views here. I will pick just (or mainly) the one introduced by Jonathan Rees (November 02, 2014).
“Here’s another example I’m struggling with: there are currently a couple of species in OTT that are misclassified as crustaceans instead of molluscs. When we fix this problem, there will be an incompatible ‘change’ in the membership of Arthropoda. Does this mean that the new group should get a new identifier? – after all its identity in some sense has changed. If so, annotations and OTU mappings linked to the old id have no home in the tree. It doesn’t get a new id with the current taxonomy generator, which assumes that names are tied uniquely to taxon concepts (with some exceptions), but with a more principled system where groups are defined by membership or phylogenetic hypotheses, it might. This would have an impact on OTU mappings and annotation carryover. I don’t have a good answer to this one, but am working on ways to anchor the semantics of ids.”
I am attempting to reproduce this under varying scenarios in the Euler/X toolkit. Starting simple, then expanding. I have created an initial scenario with two phylogenetic perspectives of the Ecdysozoa (molting animals): Ecdysozoa sec. ott1 (= OpenTree Topology at time = 1; the later version) versus Ecdysozoa sec. ott0 (OTT at time = 0; the earlier version). Initially, an alignment with complete taxonomic congruence.
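To make the identifier question concrete, here is a minimal sketch of how two versions of a taxon concept could be compared by membership alone. It is not Euler/X input or output; it only assumes that each concept can be reduced to a set of leaf-level members and that the comparison is expressed with the five RCC-5 relations (congruent, includes, included in, overlaps, disjoint). All member names (`arthropod_a`, `misplaced_sp`, and so on) and the `rcc5` helper are hypothetical and purely illustrative.

```python
from enum import Enum


class Rcc5(Enum):
    """The five RCC-5 relations between two regions (here: member sets)."""
    CONGRUENT = "=="
    INCLUDES = ">"
    INCLUDED_IN = "<"
    OVERLAPS = "><"
    DISJOINT = "!"


def rcc5(a, b):
    """Return the RCC-5 relation of member set a relative to member set b."""
    a, b = frozenset(a), frozenset(b)
    if a == b:
        return Rcc5.CONGRUENT
    if a > b:  # a is a proper superset of b
        return Rcc5.INCLUDES
    if a < b:  # a is a proper subset of b
        return Rcc5.INCLUDED_IN
    if a & b:  # some shared members, neither contains the other
        return Rcc5.OVERLAPS
    return Rcc5.DISJOINT


# Hypothetical toy taxonomies; "misplaced_sp" stands in for a species
# wrongly placed among the arthropods at time 0 and moved at time 1.
ott0 = {
    "Arthropoda": {"arthropod_a", "arthropod_b", "misplaced_sp"},
    "Nematoda": {"nematode_a"},
    "Mollusca": {"mollusc_a"},
}
ott1 = {
    "Arthropoda": {"arthropod_a", "arthropod_b"},
    "Nematoda": {"nematode_a"},
    "Mollusca": {"mollusc_a", "misplaced_sp"},
}

# Align same-named concepts across the two versions.
alignment = {name: rcc5(ott1[name], ott0[name]) for name in ott0}

for name, relation in alignment.items():
    print(f"{name} sec. ott1 {relation.value} {name} sec. ott0")
```

In this toy case Nematoda is congruent across versions, while Arthropoda sec. ott1 is properly included in Arthropoda sec. ott0 and Mollusca sec. ott1 properly includes Mollusca sec. ott0. Under a membership-based notion of identity, only the congruent concept could keep its identifier without further argument, which is exactly the tension described in the quote above.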
David Lowery and Paul Morris of the Filtered Push project visited the Franz Lab at ASU from January 6-10, 2014, for a focused Filtered Push/Symbiota hackathon. Ed Gilbert was also present. All enjoyed an intense week of specifying and implementing workflows, schemas, and new components in SCAN (and ultimately Symbiota) to search, display, and annotate images for remote identification and label transcription. The results will gradually come live, starting later this month.
Presentation slides are now posted for the TDWG 2013 SCAN talk.
Date: 2013-10-29 02:45 PM – 02:50 PM
SCAN – the Southwest Collections of Arthropods Network (http://symbiota1.acis.ufl.edu/scan/portal/) – is the first regional arthropod biodiversity data network that utilizes the Symbiota software platform (http://symbiota.org/tiki/tiki-index.php). Since its origin in 2012, SCAN has unified and newly created specimen-level occurrence records on-line pertaining to nearly 15 south-western United States arthropod collections, including more than 515,000 records that represent some 18,000 species. However, due to the mismatch between the diversity of the region's focal taxa and the available taxonomic expertise, at least one third of the specimens are not identified (authoritatively or otherwise) to the level of species, with concomitant limitations for derivative taxonomic or evolutionary/ecological research. The member collections are typically separated from each other geographically by distances that prohibit frequent interactions with regional or global experts, except in the virtual realm. SCAN has therefore implemented a Filtered Push (FP) based service (http://wiki.filteredpush.org/wiki/) whose primary purpose is to connect high-quality images of as yet insufficiently identified specimens with suitable experts who can provide identifications remotely. This is achieved through the FP-server system, which both records these contributions externally and pushes them back into the source Symbiota platform for review, acceptance, or rejection by the respective collection/node leaders. SCAN is therefore primed to utilize FP at a large scale and with a well circumscribed focal purpose that is relevant to the specific needs of this collections network. We illustrate the SCAN/FP workflow, underlying concepts and technology, and current state of implementation and usage.
FP allows experts to gradually accumulate credit and “reputations” for their identification contributions, and thus represents a promising means to improve data quality through transparent and distributed expert involvement and attribution.
A new, innovative Biodiversity Data Journal has been created in association with Pensoft Publishers, a company whose thematic vision and entrepreneurial and technical savvy jointly represent one of the most significant steps into the future of biodiversity informatics made over the past decade. Read more about the Journal’s agenda and platform in the inaugural paper.
Pending final database settings adjustments on the iDigBio servers, SCAN is set to become the first major biodiversity data portal to facilitate “smart” expert annotations using Filtered Push semantics and technology. Thanks to the wonderful team at Harvard University including Maureen Kelly, Paul Morris, and David Lowery, we can now (1) “flag” imaged yet under-identified SCAN specimens as such, (2) assign them to experts connected to our virtual environment, (3) have them perform annotations and identifications, and (4) record these updates both inside SCAN and in the external Filtered Push environment. The annotations are embedded in an ontology and can be propagated to other Filtered Push compatible platforms, such as Specify – http://specifysoftware.org/.
The general workflow is outlined here: http://scan1.acis.ufl.edu/content/annotations.
In the coming weeks there will be a good bit of tweaking and optimizing, and a need to serve up many pertinent images and recruit testers. That said, we now have a very powerful and pioneering annotations system available for use and promotion.