Knowledge Representation in Systematic Biology – Edited book proposal seeking comments, contributions
I have an opportunity to edit a new book in the series “Species and Systematics” (originally UC Press; now CRC Press). The draft outline is below, but is subject to change and expansion as deemed appropriate.
I am looking for suggestions, and for potential contributors (naturally, while reserving standard, common-sense rights to kindly accept or decline). The book will collect a number of strong, diverse chapters on various projects and directions in this still very young field. Lead authors of chapters will coordinate with co-authors as preferred. I also intend to give authors much freedom to do and say things they might not be able to express in other publication outlets (while keeping things fair and high-minded).
Another key issue is (of course) – who may have time and motivation to contribute an original and impactful chapter in the coming six months? Either way, I am open to suggestions; please contact me on- or off-line.
Knowledge Representation in Systematic Biology
The practice of systematic biology is fundamentally about generating and refining a body of language with which we describe and organize the phylogenetic diversity of life. Perceived species and higher-level lineages receive names that reflect their historical, hierarchical relationships (phylogeny). Traits are described in ways that bring out their phylogenetic identity (homology). Given that the language of systematics is meant to reflect the causal organization among life’s entities and their traits, we might ask to what extent this language is also amenable to computational logic or, more specifically, to the inferential powers of knowledge representation and reasoning (KRR).
An early attempt to reconcile the practice of systematics with logic was made in Woodger’s (1937) The Axiomatic Method in Biology, which significantly influenced Hennig’s (1966) conceptualizations of relationship in the groundbreaking Phylogenetic Systematics. Nevertheless, many components of contemporary systematic treatments remain unclaimed by logic representation.
In the past decade, KRR has appeared on the horizon of systematic practice through the infusion of several independent research communities and interests. These developments are in line with (1) rapidly expanding access to powerful and usable KRR tools and (2) an overarching programmatic trend to “explain science to computers” which, although trivial neither in theory nor in application, promises vast data integration benefits and pathways for systematics to the Big Data world.
Under the theme of creating the Semantic Web Stack, we have seen (inter alia) the rise of the Open Biological and Biomedical Ontologies (OBO) Foundry paradigm, with strictly regulated vocabularies (ontologies) that reflect biological processes and functions, anatomical and phenotypical entities and relationships of diverse model organisms and higher-level lineages, taxonomic hierarchies, and even of collection objects such as specimen vouchers (Biological Collections Ontology). When fully implemented, the OBO approach will facilitate unparalleled access to comparative biological data for systematic and other ontology-driven inferences. However, the process of standardizing and annotating all systematics-related products is immense and complex.
Other KRR efforts concentrate on translating Darwin Core, the lingua franca of the biodiversity database community, into Resource Description Framework (RDF) concepts and graphs, with the intent to connect them to the Linked Open Data cloud (hence abandoning altogether the “single database paradigm”).
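To make the Darwin Core to RDF idea concrete, here is a minimal sketch that writes one flat specimen record as N-Triples. It assumes only the standard Darwin Core term namespace; the specimen IRI, the catalog number, and the helper name `to_ntriples` are hypothetical and chosen purely for illustration, and real efforts would add typing, language tags, and links to other resources.

```python
# Minimal sketch: a flat Darwin Core record expressed as RDF triples (N-Triples).
# The record values and the example.org IRI are invented for illustration.

DWC = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core term namespace


def to_ntriples(subject_iri, record):
    """Serialize a dict of Darwin Core term -> string value as N-Triples lines."""
    lines = []
    for term, value in sorted(record.items()):
        # Escape backslashes and double quotes so the literal stays valid.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_iri}> <{DWC}{term}> "{escaped}" .')
    return "\n".join(lines)


record = {
    "scientificName": "Apis mellifera Linnaeus, 1758",
    "catalogNumber": "EX-0001",            # hypothetical catalog number
    "basisOfRecord": "PreservedSpecimen",
}

triples = to_ntriples("http://example.org/specimen/EX-0001", record)
print(triples)
```

Each printed line is one subject-predicate-object statement, which is what allows such records to be merged with statements published elsewhere rather than living in a single database.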
Yet another novel area of logic application in systematics is provenance tracking of taxonomic concepts, i.e., logically reconciling shifting circumscriptions of perceived taxonomic entities across succeeding classifications and phylogenies. This amounts to a new proposal to resolve the products of systematic research more finely than possible with Linnaean names, yet without abandoning the advantages of the latter for human communication and learning about biological diversity.
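As a rough illustration of what reconciling circumscriptions can mean, the sketch below treats each taxonomic concept as the set of lower-level entities it includes and derives one of five set relations between two concepts. This set-based encoding and the five relation labels are assumptions made for the example, not a description of any particular tool; the names "Aus", "Bus", the authors, and the species labels are hypothetical.

```python
# Minimal sketch: comparing two taxonomic concepts by their circumscriptions.
# Concepts are modeled as sets of member labels; all names are invented.


def articulation(a, b):
    """Return the set relation of circumscription a to circumscription b."""
    a, b = frozenset(a), frozenset(b)
    if a == b:
        return "congruent"
    if a < b:
        return "is included in"
    if a > b:
        return "includes"
    if a & b:
        return "overlaps"
    return "disjoint"


# The same name "Aus" is circumscribed differently in two classifications.
aus_sec_smith_2000 = {"sp1", "sp2", "sp3"}
aus_sec_jones_2010 = {"sp1", "sp2"}
bus_sec_jones_2010 = {"sp3", "sp4"}

print(articulation(aus_sec_smith_2000, aus_sec_jones_2010))
print(articulation(aus_sec_smith_2000, bus_sec_jones_2010))
print(articulation(aus_sec_jones_2010, bus_sec_jones_2010))
```

The point of the toy example is that the name "Aus" alone does not say which circumscription is meant, whereas the name plus its source ("sec. Smith 2000") does, and the relation between the two usages can then be stated explicitly.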
In light of these novel and diverse logic-based applications now becoming available to systematic practitioners, it is timely to provide an accessible overview of this potentially transformative field and a critical synthesis of prospects and challenges, as viewed with an explicit systematic interest. What flavors of logic are available to systematists through usability-designed tools? What functions can they serve, and what are their relevant strengths and limitations? The representation of theory-laden concepts – such as homology, phylogenetic transformation, or the nomenclatural and taxonomic provenance of systematic hypotheses – likely poses new challenges for computational logic. Less theoretically complex logic translations are easier to implement and propagate but may not support the kinds of inferences that systematists are most interested in. What are the trade-offs? In general, the extent to which particular aspects of contemporary systematic practice are, or are not, amenable to knowledge representation and reasoning is not sufficiently well resolved. Hence an updated, open-minded characterization of the “boundaries of logic” in systematics is overdue, and is needed to inform the development of an effective interaction between systematics and KRR in the future.
The envisioned collection of articles will range from easy-to-read introductions, reviews, and empirical studies, to engaging position papers that aim to clarify the most fruitful ways forward for the development of logic-based tools and practices in systematics.
2. Other technicalities
- Edited book with 12-18 chapters by different author teams.
- Approximately 400 pages in length (10-30 pages per chapter).
- Approximately 5-10 illustrations per chapter, overwhelmingly line drawings, many in B&W.
- Suggested chapter submission date: June 1st, 2015.
3. Table of contents – main sections (subject to change; mainly provided to solicit input!)
I. “What is computational logic, how can the different logic forms and approaches support systematic practice, products, and inferences?”
- Overview of essentials and aims of (computational) logic.
- Optional – a historical review of logic representations in systematics.
- What flavors of logic and reasoning exist, are readily accessible, and how may they suit the systematic community’s interests?
- What domains of systematic practice are particularly easy, or challenging, to represent in logic, and why? What does this mean? (See also Section IV.)
II. “The OBO Foundry approach – origins, achievements, promise, and limitations.”
- Higher-level review of the Open Biological and Biomedical Ontologies (OBO) Foundry paradigm – an endorsing take.
- 3-4 chapters on diverse accomplishments; prospects and challenges.
- 1-2 chapters taking a more critical stance; presenting alternative Knowledge Representation and Reasoning solutions.
III. “Logic and biodiversity data – from specimens to the Linked Open Data cloud.”
- BCO – Overview of the Biological Collections Ontology.
- TDWG-RDF – An emerging solution to re-create the Taxonomic Databases Working Group’s Darwin Core standard in RDF and thereby ready it for the Linked Open Data cloud.
- iDigBio – the central North American Hub for integrating collections-based data and KRR.
- Prospects, challenges.
IV. “Concept taxonomy – logic representation of nomenclatural and taxonomic provenance.”
- Provenance – what is it and why is it important? (could move to Section I).
- Workflow provenance, annotation/revision provenance, content provenance, logic provenance.
- TaxMeOn – TaxonMeta-Ontology.
- Reasoning about Taxonomies.
- Exploring Taxonomic Concepts.
V. “Other developments in KRR for systematics.”
- Answer Set Programming approaches for phylogenetics, nomenclature.
- Additional novel themes.
VI. “Synthesis – where are we, and where are we going?”
- 1-2 chapters at most.