Thoughts: How many concepts are we talking about

Fourth post in this sequence (here are posts 1, 2, 3, respectively). Changing gears a little. The motivation for this post is to explore the interactions of explicitly and implicitly communicated taxonomic concepts in conversations among (living, meeting) humans with comparable levels of taxonomic expertise. How many identifiers are we talking about?

The exploration has two parts. The first part simulates a brief conversation of the kind that two human speakers may engage in while meeting in the hallways at a taxonomically oriented conference. The speakers know of each other, either through prior personal interactions or (minimally) by having read several of each other’s taxonomic publications. The conversation is hypothetical, and even though certain real persons are mentioned, the sole purpose of this is to add some realism, not to pass my judgment on any taxonomic particulars. The post is about exploring how the issue of taxonomic name/concept identifier resolution relates to this kind of communication, generally.

The second part examines the conversation from the perspective of representing taxonomic reference – “logically”. By that I mean framing the taxonomic content identifiers communicated explicitly or implicitly by the human speakers in such a way that a computational, logic-based application can adequately represent them. Ok, so here goes (in part, as it will turn out).

Thoughts: Humans, computers, and identifier granularity

Third post in this sequence. In the first post, I reviewed that biological nomenclature promotes (even requires) fairly deep taxonomic semantics, due to semantically forceful principles such as Typification, Priority, Coordination, and Binomial Names. In the second post, I suggested (again, nothing very new here) that the Linnaean system has many features which, given the task at hand (reliably identifying nature’s hierarchy), are nearly optimally aligned with evolutionarily constrained human cognitive universals.

Both posts are ultimately about advancing biodiversity informatics infrastructure design. That motivation points to finding sound models of knowledge communication in the taxonomic domain. Lessons from the two preceding posts may be as follows. (1) If the goal is to build data environments that largely continue to reflect the strengths and weaknesses of human cognitive universals, then the particular balance struck by Linnaean names and name relationships acting as identifiers of evolving human taxonomy making is adequate. (2) There may be better solutions out there, particularly solutions that more effectively utilize the reasoning and scalability strengths of computational logic.

New publication: Reasoning over taxonomic change – Perelleschus

The first fleshed-out use case of the Euler/X project was published yesterday in PLoS ONE. This paper is a companion to the phylogenetic revision of the acalyptine weevil genus Perelleschus sec. Franz & Cardona-Duque (2013), and translates the 54 taxonomic concepts and 75 RCC-5 articulations provided in that paper into 13 logically consistent alignments and visualizations, with additional inferred articulations.

Franz, N.M., M. Chen, S. Yu, P. Kianmajd, S. Bowers & B. Ludäscher. 2015. Reasoning over taxonomic change: exploring alignments for the Perelleschus use case. PLoS ONE 10(2): e0118247. doi:10.1371/journal.pone.0118247. Available on-line here.

Very glad to see this one published; at the same time there are other use case papers in the pipeline (Andropogon, Primates). The particular motivation for this paper was to resolve sets of several small-scale yet taxonomically and phylogenetically complex input trees with the RCC-5 concept alignment approach and Euler/X toolkit. The paper is written in a “how to?” style, successively exploring and explaining the connections between the user-provided input constraints and the over-, under-, or well-specified reasoning outcomes. It deals with issues of logical consistency, input sufficiency, ambiguity, and alternative ways to align (parent) concepts in reference to either (1) their intensionally circumscribed properties (which may include synapomorphies) or (2) the ostensively indicated members. This corresponds to the program outlined in Franz & Thau (2010).
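
For readers new to the RCC-5 vocabulary, here is a minimal sketch of the five possible articulations between two taxonomic concepts, under the ostensive reading only (each concept is treated as nothing more than the set of its indicated members). This is not the Euler/X toolkit or its input format, which reasons logically over user-provided constraints. The function name `rcc5`, the relation symbols, and the member labels are assumptions made for illustration.

```python
from typing import FrozenSet


def rcc5(a: FrozenSet[str], b: FrozenSet[str]) -> str:
    """Return the RCC-5 relation of concept a to concept b.

    Concepts are modeled as non-empty sets of member labels
    (a simplifying assumption; real alignments also use
    intensional properties and expert-asserted articulations).
    """
    if not a or not b:
        raise ValueError("RCC-5 regions must be non-empty")
    if a == b:
        return "=="   # congruence
    if a < b:
        return "<"    # a is properly included in b
    if a > b:
        return ">"    # a properly includes b
    if a & b:
        return "><"   # overlap: shared and unshared members on both sides
    return "!"        # exclusion: no shared members


# Hypothetical member labels, not taken from the revision itself.
later_concept = frozenset({"species_a", "species_b", "species_c"})
earlier_concept = frozenset({"species_a", "species_b"})
other_concept = frozenset({"species_c", "species_d"})

print(rcc5(later_concept, earlier_concept))
print(rcc5(earlier_concept, other_concept))
print(rcc5(later_concept, other_concept))
```

The sketch only classifies one pair at a time from complete member sets. The harder problem that the paper addresses is different: the input articulations may be incomplete or contradictory, and the reasoner has to infer the remaining ones or report the inconsistency.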

One reviewer wrote: “With an exceptionally suited use case, the complexity of taxonomic reasoning and its translation to machine processing are depicted in unprecedented form.” Our ultimate goal is to develop a widely applicable reference and linkage system for taxonomic products that human users create but which is actually optimized for computational processing – without compromising the Linnaean system whose services to humans are profoundly valuable.

ICE 2016 Orlando Symposium proposal – Building the Biodiversity Knowledge Graph for Insects

Calls are currently going out to submit Symposium proposals for the XXV International Congress of Entomology (ICE 2016) in Orlando. Here is the summary for one such proposal led by Nico Franz and Katja Seltmann and intended for the category: Biodiversity, Biogeography and Conservation Biology. If you would like to contribute as a potential speaker to this Symposium proposal, please contact Nico Franz (soon).

Title: Building the Biodiversity Knowledge Graph for Insects – Components, Progress, Challenges.
Presentation Type: Combination Oral and Poster Presentations.

BIGCB Workshop at UC Berkeley: Tackling the Taxon Concept Problem

November 7-9, 2014. Coolest workshop theme, like, ever. Organized by Brent Mishler and Staci Markos of the UC Berkeley Jepson Herbarium.

Understanding Taxon Ranges in Space and Time: Tackling the Taxon Concept Problem

A Workshop Sponsored by the Berkeley Initiative in Global Change Biology

Friday, November 7, 2014, 1002 VLSB, open to all who are interested.

1:00 pm: Brent Mishler  –  Introduction to the themes and goals of this workshop.

1:10 pm: Edward Gilbert  –  Ideas for incorporating taxonomic concepts into Symbiota.

1:40 pm: Robert Guralnick  –  Map of Life and the challenge of heterogeneous names data for determining species ranges.

2:10 pm: Gaurav Vaidya  –  Tracking taxonomic changes: how, where and why, with examples from the Avibase database of taxon concepts. Link & PDF

2:40 pm: Coffee Break.

3:00 pm: Robert Peet  –  Taxon concepts as essential infrastructure for large-scale data integration: lessons from VegBank, SEEK and BIEN. PDF

3:30 pm: Nico Franz  –  Tracking taxonomic change across classifications and phylogenies. PDF

4:00 pm: Alan Weakley  –  Applying concept maps to 7100 vascular plants of the Southeastern United States, and some thoughts on ‘atomic concepts’ and their utility at the specimen level.  PDF

4:30 pm: Nico Cellinese  –  Thoughts on the right approach to query trees based on phyloreferences (ontologized phylogenetic definitions).

Additional Workshop Notes

  • Communicated by Robert Peet  –  Taxonomic concept infrastructure-related entities requiring identifiers. PDF

2014 CU Boulder Meaning of Names Conference in Review

A relatively short review of the timely conference “The Meaning of Names: Naming Diversity in the 21st Century”, held from September 30 to October 2, 2014, and organized by Rob Guralnick and the University of Colorado at Boulder Museum of Natural History.

I have uploaded the Conference Program for reference. I gave an update on Euler/X; the slides are shared again here. Some photos of the conference participants are posted on Flickr.

Having 30 minutes to present allowed me to review some general ideas about names and concepts, which apparently (given the positive reactions) made the presentation more accessible. A number of engaging and thematically diverse presentations were in the line-up, although the diversity of application domains did not necessarily mean immediate directional friction. Names – the “right ones” – remain essential to information transmission that employs human cognition and memorization. Among other fleeting observations, it seemed clear to me that the standard OBO Foundry approach to fixing the meaning of terms is not all that biodiversity informatics needs to integrate taxonomically annotated data. I also think we are on the cusp of separating more clearly and consistently what conventional taxonomic names can achieve for human communication from what they need to achieve in addition to support scalable computational integration. Two Global Names Architecture presentations (Ellinor Michel and David Patterson, respectively) pointed that way. To what extent the “additional layer” for logic integration is needed, and justified given its apparent representational and infrastructural costs, was an underlying theme of the conference. In other words – progress.

Conference presentation: Explaining taxonomy’s legacy to computers – how and why?

I will give an updated presentation on the Euler/X project and concept taxonomy at the conference “The Meaning of Names: Naming Diversity in the 21st Century”, held at the Museum of Natural History, University of Colorado – Boulder, on September 29 to October 01, 2014. Slides are posted on Slideshare, and linked here. Thanks to Rob Guralnick for the invitation!

New postdoctoral position in revisionary insect systematics

We are excited to have the opportunity to recruit a new postdoc into our lab. This is ASU job # 10742; please contact me (details below) if you are interested in applying.

Postdoctoral Researcher – Revisionary Insect Systematics
School of Life Sciences
Arizona State University

