[BioNLP] New Clinical NLP Publication

Imre Solti imre.solti at gmail.com
Wed Dec 26 10:54:51 EST 2012

A new clinical NLP paper was published in JAMIA.
The paper is Open Access.

A sequence labeling approach to link medications and their attributes in
clinical notes and clinical trial announcements for information extraction

Qi Li, Haijun Zhai, Louise Deleger, Todd Lingren, Megan Kaiser, Laura
Stoutenborough, Imre Solti

Objective The goal of this work was to evaluate machine learning methods,
binary classification and sequence labeling, for medication–attribute
linkage detection in two clinical corpora.

Data and methods We double annotated 3000 clinical trial announcements
(CTA) and 1655 clinical notes (CN) for medication named entities and their
attributes. A binary support vector machine (SVM) classification method
with parsimonious feature sets, and a conditional random fields (CRF)-based
multi-layered sequence labeling (MLSL) model were proposed to identify the
linkages between the entities and their corresponding attributes. We
evaluated the system's performance against the human-generated gold
standard.
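
To make the binary classification framing concrete, here is a minimal
sketch of how linkage detection can be posed as a yes/no decision over
candidate medication-attribute pairs. It is an illustration only, not the
authors' implementation: the function name, the example entities, and the
exhaustive pairing strategy are all assumptions, and the abstract does not
describe the actual features or candidate generation.

```python
from itertools import product


def candidate_pairs(medications, attributes, gold_links):
    """Pair every medication with every attribute and label each pair.

    Hypothetical helper: a pair is labeled True when it appears in the
    annotated (gold) set of links, and False otherwise.
    """
    return [
        (med, attr, (med, attr) in gold_links)
        for med, attr in product(medications, attributes)
    ]


# Invented example entities from a single sentence (not from the corpora).
meds = ["aspirin", "metoprolol"]
attrs = ["81 mg", "daily", "25 mg"]
gold = {("aspirin", "81 mg"), ("aspirin", "daily"), ("metoprolol", "25 mg")}

pairs = candidate_pairs(meds, attrs, gold)
for med, attr, linked in pairs:
    print(med, "|", attr, "|", linked)
```

A binary classifier such as an SVM would then be trained on features
derived from each pair to predict the True/False label.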

Results The experiments showed that the two machine learning approaches
performed statistically significantly better than the baseline rule-based
approach. The binary SVM classification achieved 0.94 F-measure with
individual tokens as features. The SVM model trained on a parsimonious
feature set achieved 0.81 F-measure for CN and 0.87 for CTA. The CRF MLSL
method achieved 0.80 F-measure on both corpora.
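
For readers unfamiliar with the metric, the sketch below shows one common
way to compute an F-measure over predicted versus gold-standard links. It
assumes the balanced F1 (harmonic mean of precision and recall) and exact
matching of links; the abstract does not state which variant or matching
criterion was used, and the function name and data are hypothetical.

```python
def f_measure(predicted, gold):
    """Balanced F-measure (F1) between two sets of (medication, attribute) links."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: two of three predicted links match the gold standard.
gold_links = {("aspirin", "81 mg"), ("aspirin", "daily"), ("metoprolol", "25 mg")}
predicted_links = {("aspirin", "81 mg"), ("aspirin", "daily"), ("metoprolol", "daily")}

score = f_measure(predicted_links, gold_links)
print(round(score, 2))
```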

Discussion and conclusions We compared the novel MLSL method with a binary
classification and a rule-based method. The MLSL method performed
statistically significantly better than the rule-based method. However, the
SVM-based binary classification method was statistically significantly
better than the MLSL method for both the CTA and CN corpora. Using
parsimonious feature sets, both the SVM-based binary classification and
CRF-based MLSL methods achieved high performance in detecting medication
name and attribute linkages in CTA and CN.

Imre Solti, MD, PhD, MA
Assistant Professor
Division of Biomedical Informatics
University of Cincinnati &
Cincinnati Children's Hospital Medical Center