The Clinical Named Entity Recognition system (CliNER) is an open-source natural language processing system for named entity recognition in the clinical text of electronic health records. CliNER is designed to follow best practices in clinical concept extraction.

CliNER is implemented as a two-pass machine learning system for named entity recognition, currently using a Conditional Random Fields (CRF) classifier to establish concept boundaries and a Support Vector Machine (SVM) classifier to establish the type of concept.
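The data flow between the two passes can be sketched as follows. This is a minimal illustration, not CliNER's actual code: it assumes the first pass emits B/I/O boundary tags per token, and the names `spans_from_bio`, `two_pass`, `toy_boundary_tagger`, and `toy_type_classifier` are hypothetical. The two toy functions are simple lookup stand-ins for the trained CRF and SVM models.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_from_bio(tags: List[str]) -> List[Span]:
    """Group B/I/O boundary tags (the assumed pass-1 output) into token spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new concept begins; close any concept that was still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def two_pass(
    tokens: List[str],
    boundary_tagger: Callable[[List[str]], List[str]],
    type_classifier: Callable[[List[str]], str],
) -> List[Tuple[str, str]]:
    """Pass 1 finds concept boundaries; pass 2 assigns a type to each span."""
    tags = boundary_tagger(tokens)
    return [
        (" ".join(tokens[s:e]), type_classifier(tokens[s:e]))
        for s, e in spans_from_bio(tags)
    ]


# Hypothetical stand-ins for the trained models (a CRF and an SVM in CliNER).
def toy_boundary_tagger(tokens: List[str]) -> List[str]:
    lexicon = {"chest": "B", "pain": "I", "aspirin": "B"}
    return [lexicon.get(tok.lower(), "O") for tok in tokens]


def toy_type_classifier(span_tokens: List[str]) -> str:
    return "treatment" if "aspirin" in span_tokens else "problem"


tokens = "Patient denies chest pain , started aspirin".split()
concepts = two_pass(tokens, toy_boundary_tagger, toy_type_classifier)
print(concepts)
```

Keeping boundary detection and type assignment as separate callables mirrors the two-pass design: either model can be swapped out without touching the other.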

  • Free software: Apache v2.0 license

What does it mean? Just tell me what it does!

CliNER identifies clinically relevant entities mentioned in a clinical narrative, such as diseases/disorders, signs/symptoms, medications, and procedures. CliNER needs to be trained on a corpus of clinical notes in which humans have annotated the relevant entities. Using this data, CliNER learns to identify entities like those marked up in the annotated text.

Any existing annotated data may be used for training. For example, the i2b2 2010 challenge data contains several hundred discharge summaries in which the mentions of problems, treatments, and tests have been identified.
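As a rough illustration of what such annotations look like to a program, the sketch below parses one concept annotation line. It assumes the line layout commonly used for i2b2 2010 concept files, `c="text" line:token line:token||t="type"`; verify this against your own copy of the data. The `Concept` type, the `parse_concept_line` helper, and the example line are hypothetical and are not taken from the corpus or from CliNER.

```python
import re
from typing import NamedTuple


class Concept(NamedTuple):
    text: str
    start_line: int
    start_token: int
    end_line: int
    end_token: int
    label: str


# Assumed layout: c="<text>" <line>:<token> <line>:<token>||t="<type>"
_CON_LINE = re.compile(
    r'^c="(?P<text>.*)" (?P<l1>\d+):(?P<t1>\d+) (?P<l2>\d+):(?P<t2>\d+)'
    r'\|\|t="(?P<label>[^"]*)"$'
)


def parse_concept_line(line: str) -> Concept:
    """Parse one annotation line; raise ValueError if it does not match."""
    m = _CON_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized annotation line: {line!r}")
    return Concept(
        text=m["text"],
        start_line=int(m["l1"]),
        start_token=int(m["t1"]),
        end_line=int(m["l2"]),
        end_token=int(m["t2"]),
        label=m["label"],
    )


# Made-up example line for illustration only.
example = parse_concept_line('c="chest pain" 12:3 12:4||t="problem"')
print(example.text, example.label)
```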

Here is an output sample for CliNER trained on the i2b2 2010 challenge data.


W. Boag, K. Wacome, T. Naumann, A. Rumshisky. CliNER: A Lightweight Tool for Clinical Named Entity Recognition. (poster) AMIA Joint Summits on Clinical Research Informatics 2015. San Francisco, CA.