Sunday, July 26, 2009

ICBO: Tools for Annotation and Ontology Mapping

The International Conference on Biomedical Ontology was held in Buffalo, NY this weekend. Several papers addressed the tools side of enriching documents with annotations derived from domain ontologies, drawing on more than one source of terminology.

Michael Bada (from Larry Hunter's group at U. Colo. Denver) described a project called CRAFT (Colorado Richly Annotated Full-Text) to create a corpus of scientific documents annotated with respect to the Gene Ontology (GO) and other sources from ChEBI and NCBI. The annotation tool is called Knowtator; it plugs into the Protege editorial environment and performs some automatic tagging, which a domain expert can then correct. All annotation is 'stand-off', i.e., not inserted into the document as inline tags, but kept in another file. The semantic tagging performed in this effort relies in part upon syntactic annotations performed elsewhere (U. Colo. Boulder).
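To make the 'stand-off' idea concrete, here is a minimal sketch in Python. It is not Knowtator's file format or API; the Annotation record, the helper functions, and the "EX:" concept identifiers are all hypothetical placeholders invented for illustration. The point is only that the source text is never modified, and each annotation points back into it by character offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One stand-off annotation (hypothetical record, not Knowtator's format)."""
    start: int       # character offset where the span begins (inclusive)
    end: int         # character offset where the span ends (exclusive)
    concept_id: str  # placeholder ontology identifier, not a real GO term


def annotate(text: str, phrase: str, concept_id: str) -> Annotation:
    """Build an annotation for the first occurrence of `phrase` in `text`."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    return Annotation(start, start + len(phrase), concept_id)


def covered_text(text: str, annotation: Annotation) -> str:
    """Recover the annotated span from the untouched source text."""
    return text[annotation.start:annotation.end]


# The document itself stays free of inline tags.
document = "The mutation disrupts protein binding."

# The annotations would live in a separate file; here they are a plain list.
annotations = [
    annotate(document, "mutation", "EX:0000001"),         # assumed identifier
    annotate(document, "protein binding", "EX:0000002"),  # assumed identifier
]

for a in annotations:
    print(a.start, a.end, a.concept_id, covered_text(document, a))
```

Because the annotations only reference offsets, several annotation layers (say, syntactic and semantic ones produced by different groups) can sit side by side over the same unmodified text.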

Various representational issues arise in the course of applying GO to documents. For example, verb nominalizations in English tend to be ambiguous between process and result: "mutation" can refer either to the process by which change occurs or to the results of that change. These ambiguities pose difficulties for automatic annotation, and can also perplex human annotators.

Cui Tao (from Chris Chute's group at the Mayo Clinic) presented a paper on LexOwl, an information model and API that connects LexGrid's lexical resources with OWL's logic-based representation. LexOwl is being released as a plug-in for the Protege 4 editorial environment. It's heartening to see high-quality tools being made available in a manner that can be shared and deployed by the community as a whole.