Center for Biomedical Data and Language Processing (BioDLP)


Disease Named Entity Recognition

We argue that manual annotation can be sped up and made more efficient if tools can accurately identify the sentences that contain disease mentions, which can then be annotated for relations. We propose a sentence-based evaluation metric for the development of disease named entity recognition methods, which under this metric already show an F1 score of over 95.39%. Our system combines dictionary-based and machine learning approaches.
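
As a rough illustration of what a sentence-based metric could look like, the sketch below scores a system on whether it flags the right sentences rather than the exact mention boundaries. It assumes each sentence carries one boolean (gold: the sentence contains a disease mention; predicted: the system found one). The function name and input layout are hypothetical and are not taken from our system.

```python
def sentence_level_f1(gold_flags, predicted_flags):
    """Return (precision, recall, F1) over sentences.

    gold_flags and predicted_flags are parallel lists of booleans,
    one entry per sentence (assumed layout, for illustration only).
    """
    if len(gold_flags) != len(predicted_flags):
        raise ValueError("gold and predicted lists must have the same length")

    pairs = list(zip(gold_flags, predicted_flags))
    true_pos = sum(1 for gold, pred in pairs if gold and pred)
    false_pos = sum(1 for gold, pred in pairs if not gold and pred)
    false_neg = sum(1 for gold, pred in pairs if gold and not pred)

    # Guard against division by zero when a class is empty.
    flagged = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / relevant if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Five sentences: three contain a disease mention, the system flags three.
gold = [True, True, False, False, True]
predicted = [True, False, True, False, True]
precision, recall, f1 = sentence_level_f1(gold, predicted)
```

A sentence counts as correct here as soon as the system detects any mention in it, so a boundary error inside a correctly flagged sentence is not penalized.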


Classifying Drug-Drug Interactions with Two-Stage SVM and Post-Processing

We presented our system for the DDIExtraction-2013 shared task of classifying drug-drug interactions (DDIs), given labeled drug mentions. The challenge called for a five-way classification of all drug pairs in each sentence: a drug pair is either non-interacting or interacting as one of four types. Our approach uses a two-stage weighted SVM classifier to handle the highly unbalanced class distribution: the first stage classifies drug pairs into positive (i.e., interacting) and negative (i.e., non-interacting) classes, and the second stage assigns each pair classified as positive in the first stage to one of the four positive types. The classifier uses a variety of features, including stemmed words, lemmas, bigrams, part-of-speech tags, verb lists, and similarity measures. For each stage, we also developed a set of post-processing rules based on observations in the training data.
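
The two-stage structure described above can be sketched as a simple cascade. This is a minimal illustration only. The two classifiers are passed in as plain callables (stand-ins for trained SVMs), the inverse-frequency weighting is one common way to weight classes and is an assumption here, and the names `NEGATIVE`, `inverse_frequency_weights`, and `two_stage_predict` are hypothetical. Post-processing rules are omitted.

```python
from collections import Counter

NEGATIVE = "none"  # hypothetical label for non-interacting pairs


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get larger weights. This heuristic is an assumption;
    the weighting actually used in the system is not specified here.
    """
    counts = Counter(labels)
    total = len(labels)
    return {label: total / (len(counts) * n) for label, n in counts.items()}


def two_stage_predict(is_interacting, interaction_type, pair_features):
    """Run the cascade over a list of drug-pair feature records.

    Stage 1 (is_interacting) decides positive vs. negative.
    Stage 2 (interaction_type) runs only on stage-1 positives.
    """
    predictions = []
    for features in pair_features:
        if is_interacting(features):
            predictions.append(interaction_type(features))
        else:
            predictions.append(NEGATIVE)
    return predictions


# Toy stand-ins for the two trained classifiers (illustrative only).
def toy_stage_one(features):
    return features["score"] > 0.5


def toy_stage_two(features):
    return features["type_guess"]


pairs = [
    {"score": 0.9, "type_guess": "effect"},
    {"score": 0.1, "type_guess": "mechanism"},
]
labels = two_stage_predict(toy_stage_one, toy_stage_two, pairs)
weights = inverse_frequency_weights(["neg"] * 8 + ["pos"] * 2)
```

Splitting the decision this way lets the first stage focus on the dominant negative class, while the second stage only sees pairs already judged to interact.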