The Clinical Named Entity Recognition system (CliNER) is an open-source natural language processing system for named entity recognition in the clinical text of electronic health records. CliNER is designed to follow best practices in clinical concept extraction.
CliNER is implemented as a two-pass machine learning system for named entity recognition, currently using a Conditional Random Fields (CRF) classifier to establish concept boundaries and a Support Vector Machine (SVM) classifier to establish the type of concept.
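The two-pass design can be illustrated with a minimal Python sketch. This is not CliNER's code: the function names and the rule-based stand-ins `toy_boundary_tagger` and `toy_type_classifier` are hypothetical placeholders for a trained CRF and SVM, and the B/I/O tag scheme is an assumption made for illustration. The sketch only shows the data flow, in which the first pass marks concept boundaries and the second pass assigns a type to each span found.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_from_iob(tags: List[str]) -> List[Span]:
    """Collect (start, end) spans from a sequence of B/I/O boundary tags."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new span begins; close any span that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def two_pass_ner(
    tokens: List[str],
    boundary_tagger: Callable[[List[str]], List[str]],
    type_classifier: Callable[[List[str]], str],
) -> List[Tuple[int, int, str]]:
    """Pass 1 finds concept boundaries; pass 2 labels each span with a type."""
    tags = boundary_tagger(tokens)
    return [(s, e, type_classifier(tokens[s:e])) for s, e in spans_from_iob(tags)]


def toy_boundary_tagger(tokens: List[str]) -> List[str]:
    """Hypothetical stand-in for a trained CRF: tags words from a tiny lexicon."""
    lexicon = {"chest", "pain", "aspirin"}
    tags, prev_inside = [], False
    for tok in tokens:
        inside = tok.lower() in lexicon
        tags.append("O" if not inside else ("I" if prev_inside else "B"))
        prev_inside = inside
    return tags


def toy_type_classifier(span_tokens: List[str]) -> str:
    """Hypothetical stand-in for a trained SVM: a single hand-written rule."""
    lowered = [t.lower() for t in span_tokens]
    return "treatment" if "aspirin" in lowered else "problem"


if __name__ == "__main__":
    sentence = "Patient reports chest pain and was given aspirin".split()
    print(two_pass_ner(sentence, toy_boundary_tagger, toy_type_classifier))
```

Separating the two passes means the boundary model and the type model can be trained and replaced independently, which is consistent with the description of the classifiers as the ones "currently" in use.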
CliNER identifies clinically relevant entities mentioned in a clinical narrative, such as diseases/disorders, signs/symptoms, medications, and procedures. CliNER must first be trained on a corpus of clinical notes in which humans have annotated the relevant entities. Using this data, CliNER learns to identify entities like the ones marked up in the annotated text.
Any existing annotated data may be used for training. For example, the i2b2 2010 challenge data contains several hundred discharge summaries in which mentions of problems, treatments, and tests have been identified.
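As a rough illustration of what such annotations look like to a program, the sketch below parses one concept annotation line. It assumes the line layout commonly associated with the i2b2 2010 concept files, `c="text" line:token line:token||t="type"`; that layout, the `Concept` type, and `parse_concept_line` are assumptions for this example, not CliNER's own reader, and the sample line is invented rather than taken from the corpus.

```python
import re
from typing import NamedTuple


class Concept(NamedTuple):
    """One annotated entity mention (hypothetical container for this sketch)."""
    text: str
    start_line: int
    start_token: int
    end_line: int
    end_token: int
    label: str


# Assumed layout: c="concept text" L:T L:T||t="type"
_CONCEPT_RE = re.compile(
    r'^c="(?P<text>.*)" '
    r'(?P<sl>\d+):(?P<st>\d+) (?P<el>\d+):(?P<et>\d+)'
    r'\|\|t="(?P<label>[^"]*)"$'
)


def parse_concept_line(line: str) -> Concept:
    """Parse a single annotation line, raising ValueError if it does not match."""
    match = _CONCEPT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized annotation line: {line!r}")
    return Concept(
        text=match["text"],
        start_line=int(match["sl"]),
        start_token=int(match["st"]),
        end_line=int(match["el"]),
        end_token=int(match["et"]),
        label=match["label"],
    )


if __name__ == "__main__":
    # Invented sample line, used only to exercise the parser.
    print(parse_concept_line('c="chest pain" 12:3 12:4||t="problem"'))
```

Data in another annotation scheme would need its own reader, but the result is the same kind of record: a span of tokens paired with an entity type.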