HCLSIG/SWANSIOC/Document-Annotation-Subtask/UseCases/2

From W3C Wiki

Do coarse-grained rhetorical blocks improve the performance of entity annotation tools?

prepared by Paolo Ciccarese, October 5, 2010 (derived from and related to the Rhetorical Structure Use Case 3 by Tim Clark, December 4, 2009)

Text Mining Related Use Case: This use case concerns "entity recognition" tools, which establish the initial links from free text to terminology/ontology systems.

1. Introduction

It is possible that merely defining and applying a simple, coarse-grained model of rhetorical structure, one that captures the rhetorical purpose of major document sections, could dramatically improve the performance of entity recognition software in ambiguous cases, reducing the need for human intervention and improving the cost-effectiveness of the software.

2. Use case

Given a document, such as a biomedical journal article, we want to enable the annotation of its blocks according to the coarse-grained rhetorical blocks vocabulary currently under development in the Rhetorical Structure subtask.
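
As a rough illustration of the idea, the sketch below labels each block of a document with a rhetorical label and lets an entity recognizer consult that label when a term has more than one candidate sense. Everything named here is an assumption made for the example: the block labels ("Methods", "Discussion"), the sense identifiers, the `CANDIDATES` and `PREFERRED` tables, and the functions `resolve` and `annotate` are hypothetical and do not come from the vocabulary under development or from any existing tool.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Block:
    label: str  # coarse-grained rhetorical label (hypothetical values)
    text: str   # free text of the block


# Hypothetical candidate senses for an ambiguous term.
CANDIDATES = {"CAT": ["sense:enzyme", "sense:organism"]}

# Hypothetical preferences: which sense to pick for a term in a given block.
# These entries are invented for illustration, not derived from data.
PREFERRED = {("CAT", "Methods"): "sense:enzyme"}


def resolve(term: str, block: Block) -> Optional[str]:
    """Pick a sense for `term` using the block label; None means 'ask a human'."""
    senses = CANDIDATES.get(term, [])
    if len(senses) == 1:
        return senses[0]  # unambiguous, no rhetorical context needed
    return PREFERRED.get((term, block.label))


def annotate(blocks: List[Block]) -> List[Tuple[str, str, Optional[str]]]:
    """Return (block label, term, chosen sense) for every known term found."""
    found = []
    for block in blocks:
        for token in block.text.split():
            word = token.strip(".,;:()")
            if word in CANDIDATES:
                found.append((block.label, word, resolve(word, block)))
    return found
```

In this sketch, an occurrence that cannot be resolved from the block label is returned with `None`, which stands in for the cases that would still need human intervention.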

(in progress)