PathwayMatrix visualization software shows Euler/X taxonomy alignment products and ambiguities
This post serves as an update on a new Euler/X compatible visualization software called PathwayMatrix, and also as a mini-review of the Exploring Taxonomic Concepts (ETC) Information Visualization Workshop, held on May 11-13, 2015, at the National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign. The workshop was organized by Bertram Ludäscher of the Euler/X Project and ETC lead information scientist Hong Cui.
Workshop participants came from diverse backgrounds, with skill sets spanning biological taxonomy, computer science (knowledge representation and reasoning), and software visualization. I cannot link to everybody’s personal and project websites, but here is at least an alphabetical list of participants: Hong Cui, Tuan Dang, Angus Forbes, Nico Franz, Martin Graham, Jessie Kennedy, Curtis Lisle, Bertram Ludäscher, Paula Mabee (remote), James Macklin (ETC co-lead), Michael McGuffin, Paul Murray, Martín Ramirez, Thomas Rodenhausen, Michael Twidale, Matt Yoder, and Shizhuo Yu.
The line-up included four visualization teams that had not previously interacted with the ETC and Euler/X software applications. The primary goal was to learn about each other’s visualization needs and opportunities, taking into account such powerful tools as the D3 Library and other novel visualization solutions.
One immediate and surprisingly smooth new connection was formed between the Euler/X toolkit products and Tuan Dang’s PathwayMatrix visualization software. This intuitive tool displays multiple possible relationships (RCC-5 in our case) in each matrix cell, and additionally allows the sets of classes displayed in the columns and rows to be hierarchically arranged. The cell content can be arranged in various ways and then explored interactively across the matrix interface. A YouTube video of PathwayMatrix used for another purpose is posted here.
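To make the matrix idea concrete, here is a minimal Python sketch of cells that each hold a set of possible RCC-5 relations. It is only an illustration and not PathwayMatrix's or Euler/X's actual data model: the `RCC5` symbols, the `build_matrix` helper, and the concept labels such as `T1.A` are all hypothetical names chosen for this example.

```python
from enum import Enum


class RCC5(Enum):
    """The five RCC-5 base relations (symbols are an assumed notation)."""
    EQUALS = "="
    PROPER_PART = "<"
    INVERSE_PROPER_PART = ">"
    OVERLAPS = "><"
    DISJOINT = "!"


def build_matrix(articulations):
    """Build a row -> column -> relation-set lookup.

    `articulations` is an iterable of (row_concept, column_concept, relations),
    where `relations` is a set of RCC5 members. A cell with more than one
    member is a disjunction, i.e. an ambiguous articulation.
    """
    matrix = {}
    for row, col, relations in articulations:
        matrix.setdefault(row, {})[col] = frozenset(relations)
    return matrix


# Hypothetical toy alignment between concepts of taxonomies T1 and T2.
toy_articulations = [
    ("T1.A", "T2.A", {RCC5.EQUALS}),
    ("T1.A", "T2.B", {RCC5.DISJOINT}),
    ("T1.B", "T2.A", {RCC5.DISJOINT}),
    ("T1.B", "T2.B", {RCC5.PROPER_PART, RCC5.OVERLAPS}),  # ambiguous cell
]

toy_matrix = build_matrix(toy_articulations)
```

In this sketch, a cell such as `toy_matrix["T1.B"]["T2.B"]` carries two candidate relations, which is the kind of cell a matrix view can make visually prominent.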
Tuan was able to readily reconfigure the Euler/X input/output so that the set of Maximally Informative Relations (MIR) can be rendered in PathwayMatrix. This confers two immediate and new visualization services.
(1) In cases where certain concept-to-concept articulations are ambiguous (RCC-5 disjunctions) in the output, the corresponding concepts can be spatially aggregated and thus identified very easily by the user. This can lead to an accelerated understanding and subsequent removal of the ambiguity issues. Without the visualization, one has to instead “comb through” a spreadsheet that may have many thousands of rows. We can now already do this in PathwayMatrix, even with 153,111 articulations.
(2) We can show “information expression” that is newly acquired through the Euler/X toolkit reasoning process. In the primate use case, the user provides 402 articulations as input. The reasoning process translates this set into 153,111 MIR, thereby expressing a roughly 380-fold increase in the number of articulations that are logically implied by the input but are not explicitly stated therein. The differential levels of information expression before and after the reasoning process could be correspondingly visualized with PathwayMatrix through two matrix versions, and thus show the power of the reasoning approach. We have not implemented this service yet.
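The two services above can be sketched in a few lines of Python. This is a hedged illustration only: the tuple layout of the MIR rows, the relation symbols, and the helper names (`ambiguous`, `rows_by_ambiguity`, `fold_increase`) are my assumptions for the example, not the Euler/X output format or PathwayMatrix code. Only the counts 402 and 153,111 come from the primate use case.

```python
from collections import Counter


def ambiguous(mir):
    """Return the MIR rows that still hold more than one RCC-5 relation."""
    return [(a, b, rels) for a, b, rels in mir if len(rels) > 1]


def rows_by_ambiguity(mir):
    """Order row concepts so the most ambiguous ones come first.

    This mimics, very loosely, how a matrix view can cluster ambiguous
    concepts together instead of leaving them scattered in a spreadsheet.
    """
    counts = Counter(a for a, _, rels in mir if len(rels) > 1)
    rows = {a for a, _, _ in mir}
    return sorted(rows, key=lambda r: (-counts[r], r))


def fold_increase(n_input, n_mir):
    """Ratio of inferred MIR to user-provided input articulations."""
    return n_mir / n_input


# Hypothetical toy MIR rows: (concept 1, concept 2, set of relation symbols).
toy_mir = [
    ("T1.A", "T2.A", {"="}),
    ("T1.A", "T2.B", {"!"}),
    ("T1.B", "T2.B", {"<", "><"}),   # ambiguous
    ("T1.B", "T2.C", {"!", "><"}),   # ambiguous
    ("T1.C", "T2.B", {"<", "="}),    # ambiguous
]

ambiguous_rows = ambiguous(toy_mir)
ordered_rows = rows_by_ambiguity(toy_mir)

# Counts from the primate use case: 402 input articulations, 153,111 MIR.
primate_ratio = fold_increase(402, 153_111)
```

With the primate numbers, `primate_ratio` comes out between 380 and 381, consistent with the figure quoted in (2). A real implementation would need to parse the actual MIR output, which is not shown here.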
Overall, it was a great, intense, and productive workshop, with diverse new connections and plans to move things forward. To my mind, ETC and Euler/X occupy an important niche in the biodiversity informatics domain, providing semantically expressed cross-taxonomy linkage services that may perform more reliably at large taxonomic information scales than name-based data integration solutions.