[BioNLP] recommendation NER software tools

Beatrice Alex balex at staffmail.ed.ac.uk
Thu Oct 8 13:09:46 EDT 2009


Hi,

These results are interesting, though it's difficult to compare them if
one tagger is trained on data similar to the data it is evaluated on
and the others aren't. It seems unsurprising that BANNER performs best
on the BCII data, given that it was trained AND evaluated on BCII data.

We've used the Curran and Clark tagger trained on various biomedical  
data sets to tag entities in biomedical texts.

http://svn.ask.it.usyd.edu.au/trac/candc/wiki

It's a maxent tagger that is very fast to train and run.  As far as I  
know, there are currently no models trained on biomedical texts  
provided in its release.  However, the tagger is very easy to train on  
annotated data, though some feature optimisation is required.
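
To give an idea of what "feature optimisation" involves for a token-level maxent tagger, here is a minimal Python sketch of the kind of contextual and orthographic features that are commonly tried for gene/protein names. It is an illustration only: the function names and the feature set are hypothetical, and it is neither the C&C tagger's actual feature set nor its configuration format.

```python
import re
from typing import Dict, List


def word_shape(token: str) -> str:
    """Collapse characters into classes, e.g. 'IL-2' -> 'AA-0', 'p53' -> 'a00'."""
    shape = re.sub(r"[A-Z]", "A", token)
    shape = re.sub(r"[a-z]", "a", shape)
    return re.sub(r"[0-9]", "0", shape)


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Hypothetical feature set for the token at position i (not C&C's own)."""
    tok = tokens[i]
    return {
        "word": tok.lower(),
        "shape": word_shape(tok),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
        "has_digit": str(any(c.isdigit() for c in tok)),
        # Neighbouring words, with sentence-boundary placeholders.
        "prev": tokens[i - 1].lower() if i > 0 else "<S>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "</S>",
    }


features = token_features(["The", "IL-2", "gene"], 1)
```

Optimising then amounts to adding or removing entries like these and checking the effect on held-out data.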

Best Regards,

Bea

------------------
Beatrice Alex
Research Fellow and Project Manager at the School of Informatics,  
University of Edinburgh.

http://homepages.inf.ed.ac.uk/balex
http://www.linkedin.com/in/beatricealex
http://twitter.com/bea_alex


On 8 Oct 2009, at 07:25, Nathan Harmston wrote:

> Hi,
>
> So I've done some evaluations of NER taggers on the BCII corpus (no
> cross-validation). The first set of results uses sloppy matching and
> the second uses strict matching criteria. As you can see, there's a
> massive jump in performance by GENIA and JNET between the two, mainly
> due to the differences in annotation between the training sets of the
> BC and GENIA corpora. "Medpos" indicates that the Medpos POS tagger
> was used. BANNER was trained on BCII.
>
> Sloppy matching:
>
> Tool                   Prec    Recall  F
> ABNER                  0.6603  0.8043  0.7252
> BANNER - Medpos        0.9397  0.8904  0.9144
> GENIA Tagger           0.5885  0.7795  0.6707
> JNET Tagger - Medpos   0.8661  0.6490  0.742
>
> Strict matching:
>
> Tool                   Prec    Recall  F
> ABNER                  0.4584  0.5584  0.5035
> BANNER - Medpos        0.7593  0.7195  0.7388
> GENIA Tagger           0.437   0.5789  0.4981
> JNET Tagger            0.5074  0.3802  0.4347
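
For readers comparing the two sets of figures, the following is a minimal Python sketch of how precision, recall and F can differ under strict and sloppy span matching. It assumes "strict" means identical start and end offsets and "sloppy" means any character overlap with a gold mention. The evaluation script actually used for the numbers above is not specified in this thread, and the spans below are made-up toy data.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def f_measure(precision: float, recall: float) -> float:
    """Balanced F: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def prf(gold: Set[Span], pred: Set[Span], sloppy: bool = False):
    """Precision, recall and F for predicted spans against gold spans."""
    if sloppy:
        # Assumed definition: a span counts if it overlaps any span on the other side.
        hit_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
        hit_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    else:
        # Strict: boundaries must match exactly.
        hit_pred = hit_gold = len(gold & pred)
    precision = hit_pred / len(pred) if pred else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


# Toy data: one exact match, one boundary mismatch, one spurious prediction.
gold = {(0, 5), (10, 18), (25, 30)}
pred = {(0, 5), (12, 18), (40, 44)}
strict_scores = prf(gold, pred)
sloppy_scores = prf(gold, pred, sloppy=True)
```

In this toy case the boundary mismatch is an error under strict matching but a hit under sloppy matching, which is the kind of effect differing annotation guidelines can produce.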
>
> On top of these there are NLProt, ABGene, Whatizit (a web service) and
> lots more, though I haven't evaluated them yet.
>
> Hope this helps,
> Nathan
>
> 2009/10/7 kang ning <emukang at gmail.com>:
>> Genia Tagger is also nice.
>>
>> On Wed, Oct 7, 2009 at 9:48 PM, Kevin B. Cohen  
>> <kevin.cohen at gmail.com>
>> wrote:
>>>
>>> Don't forget LingPipe.
>>>
>>> Kev
>>>
>>> On Wed, Oct 7, 2009 at 1:12 PM, Jose Maria Gomez Hidalgo
>>> <jmgomezh at yahoo.es> wrote:
>>>>
>>>> I recommend searching the list archive with Google; it turns up some
>>>> interesting threads (BANNER and ABNER rediscovered):
>>>>
>>>>
>>>> http://www.google.es/search?q=named+entity+tool+site:lists.ccs.neu.edu+-job+-TPs+-position
>>>>
>>>> Best
>>>> --
>>>>
>>>> José María Gómez Hidalgo
>>>> http://www.esp.uem.es/jmgomez
>>>> http://jmgomezhidalgo.blogspot.com
>>>>
>>>>
>>>> --- On Wed, 7/10/09, Alexandra Rostin <alros at gmx.de> wrote:
>>>>
>>>>> From: Alexandra Rostin <alros at gmx.de>
>>>>> Subject: [BioNLP] recommendation NER software tools
>>>>> To: bionlp at lists.ccs.neu.edu
>>>>> Date: Wednesday, 7 October 2009 8:30
>>>>> Hello everyone,
>>>>>
>>>>> I'm searching for freely available software tools (ideally open
>>>>> source) for named entity recognition (NER) of gene/protein names,
>>>>> like BANNER or ABNER, for comparing and combining various NER
>>>>> techniques.
>>>>>
>>>>> Do you have any suggestions for me?
>>>>>
>>>>> Thank you very much,
>>>>> Alexandra
>>>>>
>>>>> _______________________________________________
>>>>> BioNLP mailing list
>>>>> BioNLP at lists.ccs.neu.edu
>>>>> https://lists.ccs.neu.edu/bin/listinfo/bionlp
>>>>> The BioNLP website: http://www.bionlp.org
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> K. B. Cohen
>>> Biomedical Text Mining Group Lead, Center for Computational  
>>> Pharmacology
>>> and
>>> Lead Artificial Intelligence Engineer, The MITRE Corporation, Human
>>> Language Technology Division
>>> 303-916-2417 (cell) 303-377-9194 (home)
>>> http://compbio.uchsc.edu/Hunter_lab/Cohen
>>>
>>>
>>>
>>
>>
>>
>>
>
>

