Taxonomic names, identifications, and concepts – how to reconcile?

This post is motivated by my (late) discovery of the GBIF “Guidelines for the capture and management of digital zoological names information” (Version 1.1, released in March 2013), authored by Dr. Francisco Welter-Schultes who is (i.a.) the project leader of the resource http://www.animalbase.org/. Jump to the Addendum [response to Stephen Thorpe, May 06, 2014].

The GBIF Guidelines, a dense, informative, and authoritative 126-page document on representing and managing zoological names (which, as GBIF makes clear, ultimately reflects the author’s perspective), also include (pages 3-5) a Section 1.1.2 on Taxon concept models. I found this section to contain a mix of more or less accurate statements and assessments of the interaction among taxonomic names, identification events, and taxonomic concepts. This issue is of interest to me, and has on occasion been discussed on Taxacom, in the TDWG community, and elsewhere. I (henceforth NMF, regular font) will take the opportunity to examine Dr. Welter-Schultes’ (henceforth FWS, italics) perspectives and examples, point by point. Hopefully some readers will find this post helpful.

FWS: “The meaning of a name used in one source at a given time may differ from that of the same name used in another source at another time. Bioscientific research is progressing steadily, authors can have different views on the same group of animals, or names are applied incorrectly (misidentifications).”

“Time is an important factor under such conditions. Only a proportion of specific names is used in the same sense as 100 years ago. This varies substantially among disciplines. Names of birds and mammals tend to have conserved their meaning for longer times than hemipterous insects or acanthocephalan worms, simply because birds and mammals are better known.”

NMF: The “better known taxa equals better conserved meaning of names” correlation may be doubtful. Better known taxa may also mean: more frequently revised names and thus more frequent rearrangements of meanings at higher and/or lower levels. Cherry-picked case in point: in 2011 the number of bovid species – a group comprising cattle, bison, buffalo, goats, sheep, and antelopes – was raised from 143 to 279. Researching the past years’ papers on this issue, one may conclude that even though mammals are generally well known, mammal taxonomy is not exactly at its end point today. Minor issue, but not entirely trivial. The larger point is – we need more data on the subject to draw sound and wide-reaching conclusions.

FWS: “In European non-marine molluscs, which is perhaps a more or less representative group between the extremes, we can integrate data from an estimated 70-80% of the specific names established until 1900 from pre-1900 sources, and a much lower proportion has not modified its generic classification (9% of 1380 species known before 1900 are still in the same genus as originally classified). Linnean names allow to communicate between centuries, but you have to know the limits.”

NMF: No major issues here; “same” genus is an underspecified term, however (same name?; same circumscription?; same number of entailed species?; etc.).

FWS: “The name/meaning divergence is regarded as a major impediment to the integration of biological information (Franz & Peet 2009). This is a general problem of Linnean names and can neither be solved here nor by using other identifiers.”

NMF: Acknowledged, and no argument.

FWS: “I have a few more comments about recent proposals to overcome problems associated with the name/meaning divergence. Several approaches were proposed to improve tracking the quality behind the information connected to a Linnean name in a certain context. These approaches involved models of taxonomic concepts (Franz 2005, Kennedy et al. 2006, Franz et al. 2008, Franz & Peet 2009) by which the use of Linnean names should be more accurately delimited using more or less objectively defined standards. The proposed solutions involved adding metadata to the Linnean name to improve the quality of the biodiversity information behind it. Berendsohn (1995) proposed using the term secundum (sec., = according to) to label different usages of a name, for example Carya ovata Gleason 1952 sec. Stone 1997 (plants: Juglandaceae), Stone 1997 being the author of a revision generally accepted as important, where the concept of this species was delimited. The idea behind this proposal was that ambiguities in the use of Linnean names would mainly result from multiple revisions of a taxonomic name. The term “secundum” was not new; in zoology it was already used in similar contexts in the early 1800s.”

NMF: We are getting into more critical issues here. The key phrases are highlighted in bold font. First, I am certain that the term “objective” as an adjective is not my usage. I checked all my citations above and “objective” is not once used as an adjective (roughly in the sense of “unbiased”) but only (and infrequently) as a noun in the sense of “goal”. This is not accidental; I happen to like scientific realism as a philosophical approach to understanding and practicing science. Under that framework, bias is ineliminable, and having (approximately, relevantly) “right” biases is necessary to engage in reliable scientific inferences.

Clarification 1: The taxonomic concept approach is not objective, nor does it aim for objectivity in taxonomy in the sense of finding an “unbiased” way to delimit the meanings of names. Indeed it is not a normative view about taxonomic research and naming per se. To the best degree possible, the approach represents taxonomic knowledge as is. Instead the aims are transparency in recording and accrediting taxonomic perspectives, exposure of implicit knowledge and assumptions, consistency in such exposure, and the creation of semantic linkages among concepts that permit human and computational provenance tracking. None of these aims are equivalent to reaching objectivity.

Clarification 2: Kennedy et al. 2006 and Franz et al. 2008 do talk about quality – of data, identifications, and concepts. But not in the sense of “improving the quality of the biodiversity information behind (the Linnaean names)”. Quality instead can mean “extent of relevant associated metadata” or “ease of enabling precise and unambiguous interpretations” of identifications or concepts. So then quality translates into an ability to create precise linkages among biodiversity data “packages” because of the well specified identifications and/or concepts associated with these “packages”. Again, establishing quality linkages among these packages of data is different from improving the quality of data themselves (and labelled with names).

Clarification 3: The “idea […] that ambiguities […] mainly result from multiple revisions” is not something that one must commit to strongly in order to endorse the taxonomic concept approach. There are issues with the relationships among (1) identifications to concepts and names, and then there are issues with the relationships among (2) names to names, or (3) names to concepts, or (4) concepts to concepts (and so on). Most likely each of these issues has some significance – small or large – in specific use cases where integration matters. Working on improved solutions for (e.g.) (3) and (4) is not the same as saying that issues mainly result from (3) and (4), or worse, that (1) and (2) are irrelevant. It merely means that (3) and (4) may have some relevance and thus merit explorations of solutions.

FWS: “A major shortcoming of these taxonomic concept models is that they were elaborated in botanical environments using few well-studied model groups; they only seem to be of limited value for zoological contexts. Other than in botany, the use of zoological names seems to be much more diverse and not as well defined – if the cited models reflect current trends in botanical taxonomy at all. It is extremely difficult to determine whether or not a zoological name was correctly applied in a publication, and it is certainly not possible to define objective criteria for a reliable data quality assessment. In zoology, only in some cases are name/meaning divergences provoked by multiple revisions of a taxonomic name.”

NMF: Major botanical concept floras include Koperski et al. 2000 (8544 names; 24,390 concepts; 7891 articulations) and Weakley 2012; the latter presently includes some 52,362 concept-to-concept articulations. I have no sound evidence that botanical names and taxonomies are inherently or consistently better defined than their zoological counterparts.

Clarification 4: See Clarification 1 for the part regarding objectivity, and Clarification 3 for the part regarding sources of ambiguity. We can agree that interpreting the historical or current zoological literature is not trivial. But frequently the most well trained and experienced experts can achieve more here than one might imagine. The key issue here is not doable versus impossible, but instead doable to what degree of precision? Humans, and computers, can represent and reason over information that reflects varying degrees of uncertainty. More on this below.

FWS: “The authors of these theoretical models also failed to provide evidence that such a model would actually be applicable in practice as well as improve the situation for biodiversity data management. The model depends on the assumption that any author who used a name explicitly defined in which sense the name was used. For such a model it would also be necessary to define which publication should be regarded as a “major revision”, qualifying for a “secundum” authorship. A revision of a genus would certainly be one, but where are the limits? If an author revised a genus and separated a subgenus containing two species without mentioning to which subgenus four additional species belong, can we add those species later as having been implicitly contained and thus revise the author’s incompletely presented concept of the subgenus?”

NMF: Evidence on practical applicability exists (e.g., Weakley 2012, cited above), but yes, “more data are needed”. So then time will tell. Whether the taxonomic concept approach can improve biodiversity data management is indeed largely an open question (point well taken).

Clarification 5: The “what are the limits?” issue is apparently longstanding. Again, this is not readily resolved unless one also addresses some deeper (dare I say philosophical) issues. Some (not all) good definitions in science must (necessarily) be naturalistic and hence somewhat vague. Think of the evolutionary species concept – which argues (in my mind not irrationally) that issues of delimiting historical identity are to some extent up to the particular scientific community working on a set of related lineages.

So here is (not just) my take on the “limit” issue: whether a name/source instance merits “elevation” to the concept “level” has potentially very little to do with “what is there” in terms of explicitly published and verifiable taxonomic information; and instead may almost entirely depend on the ability of one or more experts to assert an articulation from the name/source label in question to another concept. Case in point: Wibmer & O’Brien’s (1986) perspective is merely a checklist! However, this is one lineage of insects that I happen to know well (in fact I have personally worked with the authors). Hence I am quite confident that I can take this very bare-bones set of name/source instances and translate it into articulated concepts.

Concepts become meaningful and facilitate integration by virtue of the articulations that experts assign to them. Historically there may have been an emphasis on addressing the “what is a concept?” question by focusing mainly on the properties of an isolated concept instance (e.g., the TDWG 2005 TCS; and thanks to Steve Baskauf, this update). Instead it is more appropriate to shift the focus towards articulations. We can use a naturalistic definition of boundaries: if no expert can confidently assert an articulation to another concept, then likely what we are looking at is an instance of a name whose meaning is specifiable only in terms of nomenclatural precision. If, on the other hand, an expert can assert one or more specific articulations to other concepts, then we have concept-level resolution.
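The naturalistic boundary just described can be written down as a tiny rule. This is only a sketch of the idea; the function name and the placeholder labels are hypothetical and not part of any existing standard or tool.

```python
def resolution_level(asserted_articulations):
    """Classify a name/source instance by what experts can say about it.

    If at least one articulation to another concept has been asserted,
    the instance has concept-level resolution; otherwise its meaning is
    specifiable only with nomenclatural precision.
    """
    return "concept-level" if asserted_articulations else "name-level"


# Hypothetical placeholder labels, for illustration only.
checklist_entry = ["== Genus X sec. Source 2"]  # an expert asserted congruence
orphan_entry = []                               # nobody can assert anything

print(resolution_level(checklist_entry))
print(resolution_level(orphan_entry))
```

The point of the sketch is that the test looks at the articulations an expert is able to assert, not at properties of the isolated name/source instance.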

In my mind the requirement to “set strict, a priori boundaries” as to what qualifies as a concept or not is pretty much equivalent to a foundational view that naturalistic definitions are not useful in this context. It is reasonable to hold such a non-naturalistic view about definitions (and many scientists do), but if that is the case then it should be openly acknowledged, and (I suppose) justified for this particular application. In the meantime, we may benefit from de-emphasizing the need to provide upfront definitions and instead explore how far the abilities of experts to interpret our taxonomic legacy can reach in specific use cases.

To answer the author’s question about assigning after the fact a concept to another author who implicitly posited a subgenus-level concept…if doing so facilitates better data integration towards that author’s perspective and in relation to a current, succeeding perspective, then why not? Interpreting the explicit and implicit meanings of other authors’ taxonomic works is part of what taxonomic experts are “there for”. If the original author is made aware of this she may even posit such articulations herself. Users will likely appreciate the added resolution.

FWS: “The major source of incongruent uses of zoological names is simply misidentifications. This occurs frequently, for example in molecular studies which deal with a decreasing ability of bioscientists to identify species correctly before studying them on a molecular basis. Taxonomists can sometimes recognise such misidentifications if a locality was given where the species is known not to live, or if voucher specimens were deposited somewhere. But even then, it is not possible to improve the data record, for example in GenBank, where molecular data deposited under a name of a misidentified species cannot subsequently be shifted to the (presumably) correct name of the species. The database provider will not allow metadata to be added afterwards.”

“So although an author may have cited a certain source as the basis for the use of a name (an equivalent to Stone 1997 in the example given above), which could serve as an objective criterion, the species in question could have been misidentified anyway. The reliability behind a name must be evaluated for each use of each name individually. One must know that a certain name was used in another sense prior to the 1970s than afterwards.”

NMF: What is a suitable representation model for the interaction between taxonomic names, identifications, and concepts? My position here (Franz & Peet 2009: 6) is congruent with that of Baskauf & Webb (2014) – Darwin Core – Semantic Web. Accordingly, identifications have relationships to taxonomic concepts, which in turn have relationships to each other and to taxonomic names (which in turn have relationships to each other as well). The identification-to-concept relationships may be highly precise and reliable, or less so, depending on the situation at hand. No argument that this is problematic, often to an unknown (but likely high) degree. But it does not follow that concept-to-concept articulations are therefore not also desirable, or relevant, to obtain. See also Clarification 3. Here is an analogy. Let us say that both faulty steering and faulty brakes contribute, to some degree, to cars of a particular brand having frequent accidents. Faulty steering (identifications to concepts) is a major problem. Does it follow that faulty brakes (concept-to-concept articulations) are not worth investing research into, ultimately to provide novel solutions?

[Figure: Baskauf & Webb 2014, Semantic Web Journal, Fig. 1.]

Source: http://www.semantic-web-journal.net/system/files/swj635.pdf
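As a rough illustration of the layering described above (identifications point to concepts, concepts point to names), here is a minimal Python sketch. The class and field names are my own assumptions for illustration; they are not Darwin Core terms and not the Baskauf & Webb model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonName:
    """A nomenclatural name string, independent of any usage."""
    name: str


@dataclass(frozen=True)
class TaxonConcept:
    """A name as used according to (sec.) one particular source."""
    name: TaxonName
    source: str

    @property
    def label(self) -> str:
        return f"{self.name.name} sec. {self.source}"


@dataclass(frozen=True)
class Identification:
    """Links one occurrence record to a concept, with a stated reliability."""
    occurrence_id: str      # hypothetical record identifier
    concept: TaxonConcept
    reliability: str        # e.g. "high", "low", "unknown"


balthica = TaxonConcept(TaxonName("Radix balthica"), "Bargues et al. 2001")
ident = Identification("occ-001", balthica, "unknown")
print(ident.concept.label)
```

Keeping the reliability on the identification, separate from the concept, mirrors the argument that poor identifications and concept-to-concept articulations are distinct problems.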

I would think that an improved GenBank environment would allow both the correction of apparent misidentifications (with provenance, ideally) and representation of taxonomic concepts. In either case the particularities of how GenBank currently operates are not a very strong argument against an alternative way of doing business – just lobby to improve GenBank then! Practical and theoretical arguments for such improvements are on our side.

FWS: “One must know that a shift in the use of the name was not always initiated by a published source (an important difference from the theoretical model proposed by Franz 2005), and that even if so, subsequent authors were not always aware that in previous sources a modification of the meaning of the name was proposed. One must know how taxonomy works in practical zoological life.”

NMF: Agreed. And I think this is readily modeled in practice using taxonomic concepts.

FWS: “Below are some examples regarding terrestrial molluscs which illustrate the main problems. In contrast to many marine species and fishes, terrestrial molluscs have no dominating central internet resource; they reflect developments in a diverse body of experts.”

“Examples:

Case 1 – Anisus. Some researchers analysed the type specimens of 6 European endangered Anisus species (Gastropoda) and did not publish detailed results. Based on these analyses, Falkner et al. 2001 and Falkner et al. 2002 rejected on species checklists the traditional use of the six names and simply used the names in a different sense. In a much more detailed study of the group, Glöer 2002 came to different conclusions and only partially accepted these changes. The names as used by the Falkner et al. 2001 checklist were copied to the Fauna Europaea database file (www.faunaeur.org) in 2004, a very important internet resource used by many researchers. It is necessary to know this as well as the fact that the data behind these entries were not based on a sound and scientifically published study, and that the classification had been rejected in a much more detailed scientific study by Glöer in 2002. Glöer & Meier-Brook 2008 finally rejected Falkner et al.’s 2001 entire concept. Around 2010 www.faunaeur.org returned to the traditionally used names, but many internet sources had meanwhile copied the information, classified taxa under the untenable concept and associated information on distributional ranges to names which were not corrected later.”

NMF: Great example. Let us see if I get this right (well, approximately – not a mollusc expert [unlike Dr. Welter-Schultes]). It would seem that at least eight sources of name usages are involved, as follows:

  1. Anisus (1 genus- and 6 species-level concepts) sec. “the traditional use” (prior to the trajectory that FWS recounts).
  2. Anisus (1 genus- and 6 species-level concepts) sec. “some researchers” (results not formally published but used by others); concepts not congruent with 1.
  3. Anisus (1 genus- and 6 species-level concepts) sec. Falkner et al. 2001; all concepts congruent with 2.
  4. Anisus (1 genus- and 6 species-level concepts) sec. Falkner et al. 2002; all concepts congruent with 2 & 3.
  5. Anisus (1 genus- and 5 [?] species-level concepts?) sec. Glöer 2002; taxonomically distinct perspective from 2-4 (but congruent with 1?).
  6. Anisus (1 genus- and 6 species-level concepts) sec. Fauna Europaea 2004; all concepts congruent with 2, 3 & 4.
  7. Anisus (1 genus- and 6 species-level concepts) sec. Glöer & Meier-Brook 2008; taxonomically distinct perspective (but congruent with 5?).
  8. Anisus (1 genus- and 6 species-level concepts) sec. Fauna Europaea 2010; all concepts congruent with 1?

Unfortunately I only have access to perspectives 7 & 8; nevertheless this example is likely less complex than Andropogon. Here is the current (2014) Fauna Europaea perspective (congruent with FE 2010?):

  • Genus Anisus sec. FE 2014
    • Subgenus Anisus sec. FE 2014
      • Anisus (Anisus) leucostoma (Millet 1813) sec. FE 2014
      • Anisus (Anisus) septemgyratus (Rossmassler 1835) sec. FE 2014
      • Anisus (Anisus) spirorbis (Linnaeus 1758) sec. FE 2014
      • Anisus (Anisus) strauchianus (Clessin 1886) sec. FE 2014
    • Subgenus Disculifer sec. FE 2014
      • Anisus (Disculifer) vortex (Linnaeus 1758) sec. FE 2014
      • Anisus (Disculifer) vorticulus (Troschel 1834) sec. FE 2014
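The hierarchy above can be held as a small nested structure in which every node gets its own "sec." label. This is a minimal sketch; the dictionary layout and the helper name are assumptions made for illustration.

```python
# "FE 2014" abbreviates the 2014 Fauna Europaea perspective, as in the list above.
SOURCE = "FE 2014"

# Genus -> subgenus -> species-level epithets (authorship as listed above).
anisus = {
    "Anisus": {
        "Anisus": [
            "leucostoma (Millet 1813)",
            "septemgyratus (Rossmassler 1835)",
            "spirorbis (Linnaeus 1758)",
            "strauchianus (Clessin 1886)",
        ],
        "Disculifer": [
            "vortex (Linnaeus 1758)",
            "vorticulus (Troschel 1834)",
        ],
    }
}


def concept_labels(tree, source):
    """Yield one 'name sec. source' label per concept in the hierarchy."""
    for genus, subgenera in tree.items():
        yield f"Genus {genus} sec. {source}"
        for subgenus, species in subgenera.items():
            yield f"Subgenus {subgenus} sec. {source}"
            for epithet in species:
                yield f"{genus} ({subgenus}) {epithet} sec. {source}"


labels = list(concept_labels(anisus, SOURCE))
n_species = sum(len(s) for s in anisus["Anisus"].values())
print(len(labels), n_species)
```

Each label is a separate concept, so a later perspective can articulate to the genus, a subgenus, or a single species without touching the others.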

According to FWS’ reconstruction (caveat: as I understand it), among the eight cited perspectives there are minimally two sets of taxonomically non-congruent concepts; e.g. perspectives 1 & 2 evidently have such non-congruent concepts. It is not clear that any of the instances of non-congruence affect entities matching members in the subgenus Disculifer sec. FE 2010. Maximally we may have five taxonomically non-congruent sets: 1 versus 2-4 & 6 versus 5 versus 7 versus 8. Though possibly 1 and 8 are congruent, and 7 & 8 are congruent, and so on. Glöer & Meier-Brook 2008 treat essentially three species-level concepts; in my quick assessment the differences with Falkner et al.’s 2001/2002 perspectives are focused on the interpretations and assignments of types, subtle diagnostic features, and distributions of A. leucostoma sec. G & M-B 2008 and A. septemgyratus sec. G & M-B 2008. While I cannot confidently assert concept articulations myself, I am quite certain that an expert as informed as FWS can.
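The "minimally two, maximally five" count can be checked mechanically by grouping perspectives that are linked by congruence. The sketch below makes a strong simplifying assumption: it treats each of the eight perspectives as a single unit that is either wholly congruent with another or not, whereas a real alignment would work concept by concept. The pairs marked as possible are the ones flagged with question marks in the text.

```python
def count_groups(n, congruent_pairs):
    """Count groups of perspectives 1..n linked by congruence (union-find)."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in congruent_pairs:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(1, n + 1)})


# Congruences stated without a question mark: 2, 3, 4 and 6 agree.
asserted = [(2, 3), (3, 4), (4, 6)]
# Congruences flagged as merely possible in the text.
possible = [(1, 5), (5, 7), (1, 8), (7, 8)]

max_groups = count_groups(8, asserted)             # no possible pair holds
min_groups = count_groups(8, asserted + possible)  # every possible pair holds
print(min_groups, max_groups)
```

Under these assumptions the two extremes come out as two and five non-congruent sets, matching the estimate in the text; an expert's articulations would settle where the true number lies.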

So in short, this case can be represented adequately with concepts and articulations and is not exceedingly complex. The “many internet sources” can be treated either as additional concept sets, or as identifications, with their respective inherent degrees of resolution. Yes, resolution may be poor (given the poor annotation practices adopted by some of the aforementioned entities) – it does not follow that concepts are inadequate or useless in this example.

FWS: “The gastropod Viviparus viviparus does not live in Austria, a rare case of a true distributional gap of a common European freshwater mollusc. But many do not know this, and the relatively similar species Viviparus contectus is repeatedly misidentified and reported as V. viviparus from Austrian localities, as can be seen in GBIF range maps. It is almost impossible to trace the nature of the errors in the original sources – true misidentifications by the authors or copied data of previous publications. The only solution is to know that the Austrian records must be incorrect.”

NMF: From a concept perspective this case is rather trivial. The example could be handled just via a set of qualifiers linking the occurrences via identification to the two “correct” concepts. Or one could assert that three concepts are involved, as follows.

  1. Viviparus viviparus sec. “the correct perspective (CP)” (i.e., that which is congruent with FWS’ view).
  2. Viviparus contectus sec. “the correct perspective (CP)” (ditto).
  3. Viviparus viviparus sec. “the incorrect perspective (IP)” (i.e., the one leading to misidentification of actual V. contectus sec. CP occurrences).

Then one can assert these articulations: 1 | 2 (exclusion), 1 < or >< 3 (inverse proper inclusion or overlap), 2 < or >< 3 (ditto), and possibly 1 + 2 == or >< or > 3 (neither <, nor |). The disjunctive (or) articulations reflect my understanding that V. contectus sec. CP (2) occurs also outside of Austria, where I (as a non-expert) have no evidence to assert proper articulations to (e.g.) concept 1. Within Austria, it seems that 1 + 2 == 3. Either way, concepts are adequate to represent this example, even if they cannot miraculously make misidentifications go away.
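Uncertain articulations of this kind can be stored as sets of candidate relations, one set per concept pair. The following is a minimal sketch using the five relation symbols employed in this post; the dictionary layout and the key "1+2" for the sum of concepts 1 and 2 are assumptions made for illustration.

```python
from math import prod

# The five basic set relations used for articulations in this post.
RCC5 = frozenset({"==", ">", "<", "><", "|"})

# Concept pair -> relations that the (non-expert) author considers possible.
articulations = {
    ("1", "2"): {"|"},
    ("1", "3"): {"<", "><"},
    ("2", "3"): {"<", "><"},
    ("1+2", "3"): {"==", "><", ">"},
}

# Every candidate must be one of the five basic relations.
assert all(rels <= RCC5 for rels in articulations.values())

n_disjunctive = sum(1 for rels in articulations.values() if len(rels) > 1)
# Upper bound on combinations before any consistency reasoning prunes them.
n_combinations = prod(len(rels) for rels in articulations.values())

for pair, rels in articulations.items():
    kind = "certain" if len(rels) == 1 else "disjunctive"
    print(pair, sorted(rels), kind)
print(n_disjunctive, n_combinations)
```

An expert who knows the group could shrink each set to a single relation; the representation stays the same, only the uncertainty goes down.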

FWS: “For more than 5 decades the genus Oxychilus was studied intensively by A. Riedel who published a summary of his work in 1998. In the same year, Giusti & Manganelli studied a few species and mentioned that the subgeneric classification of Oxychilus could be questioned. In their uncommented checklist of European species, Falkner et al. 2001 subdivided the genus and elevated three subgenera to generic rank. Giusti & Manganelli 2002 made clear that they did not intend to modify Riedel’s 1998 classification. The data provided by Falkner et al. 2001 were copied to the Fauna Europea online database mentioned above, and since then have been used by many researchers. Today these names are used in the sense of an internet database, or a simple checklist, without being based on a taxonomic concept. No detailed scientific study has proposed a sound concept for a new classification after Riedel 1998.”

NMF: Again, not an insurmountable representation challenge for concept taxonomy. AnimalBase lists some 114-115 species-level concepts in this entity. It is perfectly acceptable (at a minimum) to attribute to Falkner et al. 2001 the subgeneric concepts (with included species-level concepts) which they evidently held at the time of publication (see also Clarification 5). These concepts are apparently also congruent with those authored in the Fauna Europaea (post 2001). Giusti & Manganelli’s (1998) perspective can be represented as is; and articulations to (e.g.) Falkner et al.’s (2001) subgeneric concepts need not take into account anything other than a cumulative list of the corresponding species-level concepts presented in Giusti & Manganelli (1998) (or, for that matter, in Riedel 1998). Hence (as an example): Oxychilus subgenus A sec. Falkner et al. (2001) == [sum of all entailed species-level concepts] sec. Riedel 1998. Concept resolution at the subgeneric level obtained across treatments, and case likely closed.

FWS: “In some zoological revisions, responsible authors who were aware of the problem that the use of a name before and after a revision will stand in contrast to each other, associated the new taxon concept simply with a previously forgotten old name. Sometimes this is an option to avoid ambiguous name/meaning divergences.”

NMF: Naturally. But this relies on the (frequently disappointed) expectation that a unique concept/name assignment (now using the rare old name) will stand the test of time. Concepts and articulations are the superior long-term management solution here: the assertion – pre-revision name sec. author/source A   [<= concept articulation =>]   post-revision name sec. author/source B – will stand the test of time even if the pre- and post-revision names undergo additional, non-congruent revisions in subsequent years. So then adding the concept representation and articulations is even more responsible than just selecting the rare old name.

FWS: “Example: The names of several species of Radix (Gastropoda) had to change after a detailed study by Bargues et al. 2001 who found three species A, B and C. Prior to 2001, the name ovata was used for species A and B, peregra for species C. The authors found that the types of ovata and peregra both belonged to species A. Then they proposed to use the older and forgotten name balthica for species A, lagotis for B and labiata for C. Both names ovata and peregra would no longer be used after 2001; they were junior synonyms of balthica. This means that all old peregra records referred to labiata, the old ovata records referred partly to balthica and to lagotis, usually to balthica. Today, the use of balthica will automatically indicate identification according to the classification after 2001. New records of peregra and ovata after 2001 will automatically indicate usage of the old concept.”

NMF: Certainly less complex than the Anisus example. We have:

  1. Radix ovata sec. “prevailing pre-2001 classification”. p.1
  2. Radix peregra sec. “prevailing pre-2001 classification”. p.2
  3. Radix balthica (“A”) sec. Bargues et al. 2001. b.3
  4. Radix lagotis (“B”) sec. Bargues et al. 2001. b.4
  5. Radix labiata (“C”) sec. Bargues et al. 2001. b.5

Articulations: b.3 < p.1, b.4 < p.1, b.3 + b.4 == p.1, and b.5 == p.2. This looks as follows with the Euler/X toolkit merge. The remainder of the example can be dealt with through representations of identifications.

[Figure: Euler/X concept alignment (merge taxonomy) for the Radix example.]
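The Radix articulations can also be verified with plain sets. To be clear, this is not how the Euler/X toolkit works internally; it is a hypothetical extensional sketch in which each concept is modelled as the set of FWS' species A, B and C that it covers.

```python
# Pre-2001 usage: ovata covered species A and B, peregra covered C.
p1 = frozenset({"A", "B"})  # Radix ovata sec. pre-2001 classification
p2 = frozenset({"C"})       # Radix peregra sec. pre-2001 classification
b3 = frozenset({"A"})       # Radix balthica sec. Bargues et al. 2001
b4 = frozenset({"B"})       # Radix lagotis sec. Bargues et al. 2001
b5 = frozenset({"C"})       # Radix labiata sec. Bargues et al. 2001


def rcc5(x, y):
    """Return the articulation symbol relating set x to set y."""
    if x == y:
        return "=="
    if x < y:       # x is a proper subset of y
        return "<"
    if x > y:       # x is a proper superset of y
        return ">"
    if x & y:       # partial overlap
        return "><"
    return "|"      # exclusion


post_2001 = {"balthica": b3, "lagotis": b4, "labiata": b5}


def candidates(old_concept):
    """Post-2001 names to which a record under an old concept may refer."""
    return sorted(
        name for name, concept in post_2001.items()
        if rcc5(concept, old_concept) in ("==", "<")
    )


print(rcc5(b3, p1), rcc5(b4, p1), rcc5(b3 | b4, p1), rcc5(b5, p2))
print(candidates(p1))  # an old ovata record
print(candidates(p2))  # an old peregra record
```

The last two calls reproduce the translation FWS describes: old ovata records map to balthica or lagotis, old peregra records map to labiata. Which of the two an individual ovata record belongs to is an identification question that the articulations alone cannot answer.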

FWS: “Google hits 2007-2009, Google Scholar 2011 (the figures for Google hits were only of limited reliability until 2009, but it was possible to verify the results from other computers. After 2010 they depended increasingly on the geographic location of the querying IP address and became totally unreliable. The figures for Google Scholar included citations and were still based on true hits in 2011): 07.2007: “radix-ovata” 650, “radix-peregra” 850, “radix-balthica” 1000, “radix-lagotis” 90, “radix-labiata” 350. 07.2007: “lymnaea-ovata” 620, “lymnaea-peregra” 18,500, “lymnaea-balthica” 15, “lymnaea-lagotis” 30, “lymnaea-labiata” 0. 09.2009: “radix-ovata” 6300, “radix-peregra” 3100, “radix-balthica” 1700, “radix-lagotis” 150, “radix-labiata” 500. 09.2009: “lymnaea-ovata” 800, “lymnaea-peregra” 5400, “lymnaea-balthica” 200, “lymnaea-lagotis” 230, “lymnaea-labiata” 0. 11.2011 (“since 1992”): “radix-ovata” 360, “radix-peregra” 502, “radix-balthica” 266, “radix-lagotis” 30, “radix-labiata” 95. 11.2011: “lymnaea-ovata” 150, “lymnaea-peregra” 1080, “lymnaea-balthica” 7, “lymnaea-lagotis” 10, “lymnaea-labiata” 0.

The name balthica is only slowly gaining importance in internet resources. Interestingly, many sources still use the genus Lymnaea which has not been used as the genus for these species by taxonomic experts for several decades.”

NMF: These are valuable insights into the shifting dynamics of identification; and one would need to create more concept representations for (e.g.) the Lymnaea-labeled concepts. Yet these insights are not in themselves challenging to the taxonomic concept perspective; the concept-level articulations are rather unambiguous and can have the desired resolution benefits.
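For readers who want to compare the quoted counts, the share of "balthica" among the five Radix-labeled queries can be computed as below. The numbers are copied from the FWS quote; the dictionary keys are my own labels. Keep FWS' caveats in mind: the 2007 and 2009 figures are general Google hits of limited reliability, and the 2011 figures are Google Scholar counts, so the three rows are not strictly comparable.

```python
# Hit counts for "radix-<epithet>" queries, as quoted by FWS.
radix_hits = {
    "2007-07 (Google)": {
        "ovata": 650, "peregra": 850, "balthica": 1000,
        "lagotis": 90, "labiata": 350,
    },
    "2009-09 (Google)": {
        "ovata": 6300, "peregra": 3100, "balthica": 1700,
        "lagotis": 150, "labiata": 500,
    },
    "2011-11 (Google Scholar)": {
        "ovata": 360, "peregra": 502, "balthica": 266,
        "lagotis": 30, "labiata": 95,
    },
}


def share(counts, epithet):
    """Fraction of all hits in one sample that fall on the given epithet."""
    return counts[epithet] / sum(counts.values())


balthica_share = {
    sample: round(share(counts, "balthica"), 3)
    for sample, counts in radix_hits.items()
}
for sample, value in balthica_share.items():
    print(sample, value)
```

The same helper works for any other epithet, or for the Lymnaea-labeled queries if those counts are added in the same layout.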


Summary. I think for the most part, the points made and examples given by FWS miss the target – if there was one. Concepts and articulations can potentially have wide applicability and use, including applicability to the examples given. Concepts and articulations can provide superior taxonomic perspective-based resolution in comparison to names/nomenclatural relationships-based resolution, as demonstrated here. Poor annotation or identification practices represent valid concerns but are largely separate issues that the taxonomic concept approach in isolation is not intended to, and evidently cannot, resolve. However the combination of (1) good representation of identification events and (2) concept-level resolution of succeeding taxonomic perspectives remains worth pursuing.

In my current view, a moderately defensible view against the taxonomic concept approach is the following: (1) it may be too cumbersome to implement, and (2) it is still unclear whether it will have very significant benefits to taxonomic experts and other users in the longer term. Neither of these concerns “rescues” the current names-centric practice from having its own well known strengths and limitations transparently acknowledged. But from a present-day, pragmatic, and perhaps also sociological and technical perspective, they make some sense to me. However I am not convinced that they count as our best first-principles arguments. I think on first principles, meaning in terms of intellectual strength, and also in terms of proper knowledge representation and reasoning (it is not just us anymore in the taxonomic meanings universe, computers joined a while ago and are here to help), concepts and articulations may win out over names and nomenclatural relationships. And I think that the mere possibility of this being the case means that it is prudent to try. Not just concepts, anything that might make systematics more transparent, explain the implicit to humans and machines, provide more refined semantics for us and others to use, and so on.


Addendum, May 06, 2014. Stephen Thorpe comments: “The issue is complex. We need to distinguish between two very different things, both of which result in the same name being used for different species: (1) nomenclatural issues: a lectotype is designated from a mixed series, or the holotype is found not to belong to the species the name is used for; (2) taxonomic issues (lumping/splitting): one author sees two species where another sees only one. I suggest that the only way of dealing with all the complexity is simply to track all the relevant literature associated with a name, in the way that I try to do on Wikispecies. With a good set of references, you can work out what is going on.”

NMF: Thank you, Stephen. I think one way to model this is as follows; call it the nomenclatural/taxonomic change & provenance tracking square:

Nomenclatural/taxonomic change & provenance tracking square.

The Codes, i.e. our legacy-rich rules about name identities and nomenclatural relationships (frequently type-driven), provide sound provenance tracking for strictly nomenclatural changes (by definition, so to speak). That is the yellow region. These rules also partially serve to track provenance when taxonomic changes are involved – those are the two orange regions. Yet of course the Codes are not designed to provide a perfectly archived system of provenance for all possible taxonomic changes (Pterygota, a name associated with a Subclass of arthropods, is an example, I think). Lastly, there are kinds of taxonomic change – e.g., moving a junior genus concept from one tribal concept to another (where both tribal names and type genera are more senior) – where no nomenclatural change is implied. That is the red region.
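
The red-region case can be sketched in a few lines of Python. This is only an illustration under loud assumptions: each classification is reduced to a child-to-parent mapping, "nomenclatural change" is crudely approximated as a change in the set of names in use, and the genus and tribe names (Aus, Bus, Xus, Aini, Bini) are hypothetical placeholders.

```python
def names_in_use(classification: dict) -> set:
    """All names appearing as a child or as a parent."""
    return set(classification) | set(classification.values())


def compare(before: dict, after: dict) -> dict:
    """Report name-level and placement-level differences separately."""
    shared = before.keys() & after.keys()
    moved = sorted(name for name in shared if before[name] != after[name])
    return {
        # Crude proxy: did the set of names in use change at all?
        "names_changed": names_in_use(before) != names_in_use(after),
        # Taxa whose parent differs between the two classifications.
        "moved": moved,
    }


# Hypothetical classifications: the junior genus Xus moves from tribe Aini
# to tribe Bini; the type genera Aus and Bus stay where they are.
before = {"Aus": "Aini", "Xus": "Aini", "Bus": "Bini"}
after = {"Aus": "Aini", "Xus": "Bini", "Bus": "Bini"}

result = compare(before, after)
print(result)
```

Here the set of names is identical in both classifications, so a purely name-based record registers no change, yet the placement of Xus differs. Capturing that difference requires representing the two classifications, not just the names.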

You may well argue that tracking taxonomic provenance fully in the orange and red regions is not possible, or not desirable. But I happen to be curious about pushing our provenance tracking limits further into the orange and red zones, both personally and for the sake of computational logic representation. I am not arguing that everyone must share this curiosity. I am just making the point that this territory (1) should be taken for what it is (i.e., represented adequately) and (2) remains an area where more exploration may produce novel insights and opportunities for systematics.

8 Comments
  1. Stephen Thorpe

    I think we need to take a step back and start thinking about this again from first principles. It seems to me to be a nonsense to think that we can somehow associate every historical use of a taxon name with a taxon concept, and even if we could, why would we want to? What matters is the taxon concept of the user, not the concept of the historical author. Relative to the concept of the user, each historical usage of a name is either correctly identified or misidentified. If the latter, then we disregard the associated data. We don’t need to know the concept of the historical author! If the user has no concept, then none of the historical information associated with the name will be meaningful. So, what the user wants to know is whether a historical usage of a name refers to the user’s concept for that name. But how can we know this? Only, I suggest, by (in the case of species) tracking down the material specimens and checking their IDs. Data associated with supraspecific taxa is trickier, and I’m not sure I have my head around that yet. But, my main point is that for species there is no way to specify a concept for historical usages unless a concept was explicitly stated with the usage. This is not the case for names used in ecological contexts, and often not even in taxonomic contexts. The only way is to check the material. Otherwise, any data associated with a name is to some extent doubtful. For a hypothetical example, suppose that Aus bus in Smith (1900) is now split into cryptic species. So Aus bus today is one species, but in Smith (1900) was several species. Chances are that we can never resolve which data in Smith (1900) truly applies to Aus bus (in the modern sense). But somehow specifying Smith’s concept doesn’t help, even if it is possible.

    May 5, 2014
    • Thank you, Stephen. Maybe another way to think about this is: the history of taxonomy is happening today, in 2014. Just looking forward, can we individuate in time “the current view”, so that when the next significant update is generated, we have two such instances of “the current view” that are separately citable and can be semantically connected? In a way, that is what synonymy relationships are meant (in part) to achieve. Articulations can be thought of as more powerful synonymy relationships, if only looking forward.

      May 5, 2014
  2. Stephen Thorpe

    The issue is complex. We need to distinguish between two very different things, both of which result in the same name being used for different species:

    (1) nomenclatural issues: a lectotype is designated from a mixed series, or the holotype is found not to belong to the species the name is used for;

    (2) taxonomic issues (lumping/splitting): one author sees two species where another sees only one.

    I suggest that the only way of dealing with all the complexity is simply to track all the relevant literature associated with a name, in the way that I try to do on Wikispecies. With a good set of references, you can work out what is going on.

    May 5, 2014
  3. Stephen Thorpe

    Now I’m a little bit confused which of two things you are mainly talking about:

    (1) tracking taxonomic changes more “formally” (as they do in botany);

    (2) dealing with competing taxonomic views on the same taxa.

    May 6, 2014
  4. Thanks again, Stephen. (1) In fact, so formally that we can represent them in computational logic (i.e., beyond botanical Code rules). (2) Of course; that is one of the key motivations: tracking the sequence, with multiple taxonomies side by side. I think at this point I may have to ask you to look at one of my (collaborative) recent manuscripts on multi-taxonomy alignment, e.g. this one: http://taxonbytes.org/pdf/FranzEtAl2014-ReasoningTaxonomicChangeAndropogon.pdf Pages 13 onwards have tables and figures that should serve to illustrate the approach (one hopes). Table 1 lists the concepts. Figure 1 presents the use case. Figure 5 has a legend that specifies (1) taxonomic and (2) nomenclatural identity, allowing one to contrast the two.

    May 6, 2014
  5. Stephen Thorpe

    I just get the feeling that you (and certainly GBIF) are trying to impose more precision on taxonomy than the underlying practice actually supports. All of my attempts to find precision in taxonomy have failed. Even the International Code of Zoological Nomenclature is a vague mishmash of contradictory rules. It works in practice for standard cases, but as soon as something “nonstandard” happens, the Code falls apart. Similarly, here, I think you could cherry pick examples which seem to work, but applying your ideas more broadly may not work.

    Something to think about (which may or may not be relevant!): we would not want to give every taxonomic opinion equal weight. This is because (1) they don’t have equal weight; and (2) there is no objective criterion to determine their weight. Alternative taxonomic opinions are never precisely simultaneous, so to give them equal weight would be to go back all the way to Linnaeus and give his taxonomic opinions equal weight to those of modern workers, and to give Hoser’s opinions equal weight to Wuster’s, etc.

    May 7, 2014
    • Once more – thank you for the comment. Maybe instead of “impose” I would like to say “encourage to explore”. I concur with everything else.

      May 7, 2014
