[BioNLP] New Paper on Recognition of Chemical Entities
peter.corbett at linguamatics.com
Fri Apr 20 05:00:56 EDT 2012
On 19/04/12 18:19, Phil Gooch wrote:
> Hi Ulf
> Thanks for this. Unfortunately I don't have access to the full paper.
> Can I ask: is the 68.1% F1 measure calculated using strict (exact
> boundary match) or lenient (some overlap allowed) criteria?
No access here either.
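(For anyone unfamiliar with the distinction Phil is asking about, here is a
minimal sketch of the two criteria; the span offsets and function names are
invented for illustration, not taken from any of the papers discussed.)

```python
# Strict vs lenient entity matching: a gold span counts strictly only on
# exact boundaries, and leniently if it overlaps any predicted span.

def strict_match(gold, pred_spans):
    """Exact boundary match: identical (start, end) offsets."""
    return gold in pred_spans

def lenient_match(gold, pred_spans):
    """Some overlap allowed: any predicted span overlapping the gold one."""
    gs, ge = gold
    return any(ps < ge and gs < pe for ps, pe in pred_spans)

gold_spans = [(0, 7), (12, 20)]   # two gold chemical-name spans
pred_spans = [(0, 7), (10, 20)]   # second prediction has a boundary error

print(sum(strict_match(g, pred_spans) for g in gold_spans))   # -> 1
print(sum(lenient_match(g, pred_spans) for g in gold_spans))  # -> 2
```

The same system output thus scores differently under the two criteria, which
is why it matters which one a reported F score used.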
I think there's a bigger issue with evaluation here. I've reported F
scores as high as 83.2% on chemistry before (strict boundary match):
http://www.biomedcentral.com/1471-2105/9/S11/S4/ - I think a lot depends on:
a) What the source text for the evaluation corpus was.
b) Exactly which chemical named entities were being annotated.
c) How well-defined the annotation task was; i.e. how extensive the
annotation guidelines were.
d) How good the inter-annotator agreement was.
e) Whether the software was developed for the corpus - i.e. whether
development sets were annotated with the same guidelines as the test data.
f) Whether the training set was annotated with the same guidelines as
the test set (e.g. by cross validation).
Given all of these, it's not hard to see how F scores might go up or
down by 20% or so depending on evaluation conditions. Really, we need a
BioCreative for chemical NER.
(Incidentally, F is a perverse metric, as precision-recall curves are
typically the mirror image of F score contours, so another point is: g)
Whether the software tried to balance precision and recall. But that's
just a pet peeve of mine.)
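(To make point (g) concrete, a quick sketch of how the same F1 can hide very
different precision/recall trade-offs; the numbers below are illustrative,
not from any of the systems discussed.)

```python
# F1 is the harmonic mean of precision and recall, so quite different
# trade-offs can land near the same F score.

def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.832, 0.832), 3))  # balanced system -> 0.832
print(round(f1(0.95, 0.70), 3))    # high-precision, low-recall -> 0.806
```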