Posts tagged ‘inference’

Weekly reading: Ramírez & Michalik on structural complexity in ancestral ontologies (again)

Last week we read and appreciated Seltmann et al.’s (2012) effort to carefully describe the benefits, use, and user community roll-out of the spectacularly well-annotated Hymenoptera Anatomy Ontology Portal. We clearly need and want something like this for Coleoptera. That said, we continue to explore options for doing things a little differently. Looking for inspiration, we are reading once more what is to my mind one of the best demonstrations of how phenotype ontologies can be used to address research questions – by phylogenetic systematists, for phylogenetic systematists.

Ramírez, M.J. & P. Michalik. 2014. Calculating structural complexity in phylogenies using ancestral ontologies. Cladistics (Early View). Available here.

We are also starting, based on this semester’s cumulative readings, to formulate some interests of our own. Hence the following homework for all; due by next Wednesday’s discussion.

Formulate three research themes or questions that are comparative/phylogenetic in nature and could possibly make use of phenotype ontologies. Be very specific; ideally, start with the taxonomic group and character system that you are most intimately acquainted with (in my case, e.g., that might be acalyptine weevil mouthparts). Best to work outward from the current core of your taxonomic expertise. Research ideas might take into account (yet are clearly not limited to):

  • Evolution of phenotype complexity, reduction.
  • Correlations across character systems.
  • Presence/absence of traits across larger phylogenetic groups and within/among subgroups.
  • Relationships of traits to non-organismal variables (e.g., environment).
  • Annotations and inferences targeting the specimen level versus higher taxon entities.
  • Evolutionary rates, timing.
  • Associations, coevolutionary themes.
  • Information availability, completeness, suitability for analysis.
  • … [insert your favored domain of phenomena or inquiry here]

The idea is to engage in a bit of a reverse engineering exercise. We know that the earliest phenotype ontologies came out of the model organism community – what Nelson & Platnick (1981) might refer to as “general biology” (pages 4-5). Yet systematists tend to ask comparative questions. What (if any) general structures, entities, and relationships do these comparative/phylogenetic questions entail? Which kinds of inferences are we (most) interested in? How would the components needed to accommodate the inferences be fruitfully translated into a logic framework?

In other words, let’s pretend we are well advised to engage in some conceptual modeling for the future design of a Coleoptera Anatomy Ontology (which may not carry such a name in the end). Start with nailing down our most highly domain-specific questions. Abstract overarching design needs from these. Pretend that solutions will follow.

Weekly reading: Smith 2004 on ontology as reality representation

Last week’s paper on the merits of “realism as practiced by the BFO” left us with a sense of dissatisfaction (which cannot fairly be credited to the paper itself). First, since this was predominantly a “con” paper, it seems important to also examine the “pro” stance. And second, yes, we are getting further away from applications. We will address both issues, though necessarily in sequence. Therefore, up this week is a foundations paper on how to conceive of and construct realist (OBO-compliant) ontologies.

Smith, B. 2004. Beyond concepts: ontology as reality representation; pp. 73-84. In: Proceedings of the 3rd International Conference on Formal Ontology in Information Systems (FOIS 2004); November 4-6, 2004; Torino, Italy. IOS Press, Amsterdam. Available on-line here.

Weekly reading: Adding a little reality to building ontologies for biology

We are moving from practical designs and implementations of ontologies in systematics to design theory. One issue to understand, or at least have an intuitive position on, is the strength of the interaction or interdependency between ontology design and functionality. And “design” could reach as far up the chain of representation as “why Description Logics and not another flavor of logic?” The term “Realism” plays a role. About five years ago there was a fairly spirited debate on this topic, reviewed here. We are reading one paper from the longer list.

Lord, P. & R. Stevens. 2010. Adding a little reality to building ontologies for biology. PLoS ONE 5(9): e12258. Available on-line here.

Weekly reading: Balhoff et al. on a semantic model for wasp species description

Following Dahdul et al. and Vogt et al., our third paper in the phenotype ontologies Weekly Discussion series will dive into an applied example by Balhoff and co-authors (mainly of the Deans Lab) with a clear taxonomic emphasis. Already we have seen that different scientific orientations draw on phenotype ontologies with the expectation of reframing and solving specific problem complexes.

Dahdul et al.’s focus was firmly within the bounds of evolutionary and phylogenetic analyses of phenotypes across broader and deeper taxonomic scales. Implementation challenges notwithstanding, there was an underlying agreement that the legacy of phenotype-centric systematic work could be appropriated towards the outlined representation and inference goals.

Vogt et al., in turn, emphasized a need for consistent, machine-processable standards with regard to phenotype syntactics, semantics, etc., including a separation of descriptive and evolutionary/explanatory elements in our morphological terminology. This has the makings of a potentially divergent paradigm in relation to Dahdul et al.’s program and perspective.

Another interesting development is the Phenoscape team’s exploration of homology relations in ontologies, outlined here: http://phenoscape.org/wiki/Reasoning_over_homology_statements.

In light of these different lines of research, we set ourselves two immediate questions to address:

1. What are actual applications that utilize phenotype ontologies and (optionally) reasoning for (a) multi-taxon studies with (b) an evolutionary/systematic orientation?

2. Suppose we had the “awesome ontology & reasoning” infrastructure on hand, where current technological limits no longer apply. What kinds of questions would we ask this infrastructure to solve for us (that cannot be addressed otherwise)?

The paper for next week applies directly to these questions.

Balhoff, J.P., I. Mikó, M.J. Yoder, P.L. Mullins & A.R. Deans. 2013. A semantic model for species description applied to the ensign wasps (Hymenoptera: Evaniidae) of New Caledonia. Systematic Biology 62: 639–659. Available on-line here.

Weekly reading: NGS technology and inference challenges in review

Summary of Weekly Discussion papers read last semester (Fall 2014) on Next-Generation Sequencing methods and related phylogenomic inference challenges. With a few diversions in between. In chronological sequence.

  1. Barker. 2014. Philosophy of statistical phylogenetic methods.
  2. Witteveen. 2014. Naming and contingency: the type method of biological taxonomy.
  3. Lemmon & Lemmon. 2013. High-throughput genomic data in systematics and phylogenetics.
  4. Mardis. 2013. Next-Generation Sequencing platforms.
  5. Bybee et al. 2011. Targeted Amplicon Sequencing.
  6. Lemmon et al. 2012. Anchored Hybrid Enrichment.
  7. Kumar et al. 2012. Statistics and truth in phylogenomics.
  8. Wright & Hillis. 2014. Bayesian analysis outperforms parsimony for morphological data.
  9. Nguyen et al. 2012. Intermittent evolution and robustness of phylogenetic models.
  10. Schwartz et al. 2014. SISRS – Site Identification from Short Read Sequences.
  11. Narechania et al. 2012. RADICAL – Random Addition Concatenation Analysis.
  12. Philippe et al. 2011. Why more sequences are not enough.

We had some fun discussions. Topic for Spring 2015: Phenotype Ontologies.

Weekly reading: SISRS – Site Identification from Short Read Sequences

This week’s discussion will focus on a novel method to separate signal from noise in Next-Generation Sequencing datasets – SISRS.

Schwartz, R.S., K. Harkins, A.C. Stone & R.A. Cartwright. 2014. A composite genome approach to identify phylogenetically informative data from Next-Generation Sequencing. http://arxiv.org/abs/1305.3665

Weekly reading: Intermittent evolution and robustness of phylogenetic models

Models of evolution used in phylogenetic reconstruction make specific assumptions which (in their entirety, and globally applied) are ultimately wrong. They are also approximately right. What does this even mean? This week’s reading gets us into the notion of robustness of phylogenetic models to violations of their inherent assumptions. An important piece of the “which method should I use?” puzzle. Let’s see if we can identify other pieces too.

Nguyen, M.A.T., T. Gesell & A. von Haeseler. 2012. ImOSM: Intermittent evolution and robustness of phylogenetic methods. Molecular Biology and Evolution 29: 663-673. Available on-line here.

Weekly reading: Bayesian analysis outperforms parsimony for morphological data

We had a lively Weekly Discussion of Kumar et al. 2012, and are staying with the general theme (hereby undemocratically coined) of “new insights in statistical phylogenetics/phylogenomics”. Models, biases, assumptions, data. Thus, for next week:

Wright, A.M. & D.M. Hillis. 2014. Bayesian analysis using a simple likelihood model outperforms parsimony for estimation of phylogeny from discrete morphological data. PLoS ONE 9(10): e109210. Available on-line here.