[BioNLP] using mechanical turk

Leon French leonfrench at gmail.com
Wed Nov 30 19:18:17 EST 2011


I'm wondering if anyone has any updates on using Mechanical Turk for
biomedical text mining - say for gene or protein name recognition?

The only paper I have found is:
Using Amazon’s Mechanical Turk for Annotating Medical Named Entities
Yetisgen-Yildiz, Solti, and Xia


On 29 April 2009 10:46, Bob Carpenter <carp at alias-i.com> wrote:
>  > I would be very curious to know what annotation tasks you are throwing
>  > at Mechanical Turk and what you mean by "surprisingly effective" esp. if
>  > you had anything quantitative to share (throughput, accuracy, cost per
>  > task, etc.)  I can think of a number of experiments that would be fun to
>  > try if I thought they were remotely plausible....
>
> It's very easy to use, and if you're comfortable
> setting up JavaScript-enabled web pages (really
> just cutting and pasting on the JavaScript front),
> it's easy to do it very flexibly.
>
> So far, we've done newswire named entity and
> morphology for English.
>
> We have been able to get people to pass the
> qualifying task on linking gene mentions to
> Entrez-Gene, but it seems like it's too involved
> a task for Mechanical Turk, even though we kept
> upping the payment.  The tasks that are fairly simple
> and self-contained seem much more popular.
>
> By effective, I mean both in terms of cost
> and in terms of data.  I got better results
> recreating MUC-6 named entities than in the
> original data and similarly for textual
> inference (though I didn't collect the data
> for that -- Snow et al. did [see below for
> a pointer to their papers and data]).
>
> I wrote up a short two-page conference version
> here:
>   http://lingpipe.files.wordpress.com/2008/09/hierarchical-data-annotation-nyas-08.pdf
> a typical ACL-like 9-page conference-submission
> version here:
>   http://lingpipe.files.wordpress.com/2009/01/anno-bayes-entities-09.pdf
> and a longer tech report version here:
>   http://lingpipe.files.wordpress.com/2008/11/carp-bayesian-multilevel-annotation.pdf
>
> The R and BUGS code is in LingPipe's sandbox
> project "hierAnno"; instructions for retrieving
> it are here:
>   http://alias-i.com/lingpipe/web/sandbox.html
>
> There's also a stream of blog entries and pointers
> to other work, and the papers contain extensive
> references, especially the tech report.
>
> Here's a blog entry on stemming that's
> not in the papers:
> http://lingpipe-blog.com/2009/02/25/stemming-morphology-corpus-coding-standards/
>
> And here are some pointers to work by Snow et al. and
> Dolores Labs:
> http://lingpipe-blog.com/2008/09/15/dolores-labs-text-entailment-data-from-amazon-mechanical-turk/
> They did a whole bunch of different tasks.
>
> There's also been a ton of work on the Turk in
> the image processing and search space.
>
> - Bob Carpenter
>   Alias-i
