[BioNLP] AnEM: a new corpus of anatomical entity mentions

Sampo Pyysalo smp at is.s.u-tokyo.ac.jp
Mon May 14 07:19:05 EDT 2012

Dear colleagues,

We are delighted to announce the first public release of the Anatomical
Entity Mention (AnEM) corpus, a manually annotated resource comprising over
3,000 entity mentions in 500 documents (over 90,000 words)
selected randomly from PubMed citation abstracts and PMC OA full-text
papers. The corpus is intended to serve as reference data for the
development and evaluation of automatic tools for anatomical entity mention
detection, and is provided along with evaluation tools under the open
CC-BY-SA license.

For more information and downloads, see the resource home page


The corpus is presented in the following paper:

    Tomoko Ohta, Sampo Pyysalo, Jun'ichi Tsujii and Sophia Ananiadou (2012).
    Open-domain Anatomical Entity Mention Detection.
    In Proceedings of ACL 2012 Workshop on Detecting Structure in
    Scholarly Discourse (DSSD) (to appear)

As there has been substantial recent interest in this topic, both in the
domain and on this list, we have opted to announce the new resource early;
please excuse any omissions from the current resource home page, which is
still partly under construction.

Any and all feedback, questions and criticism are very welcome!

Best regards,
