The Clinical Named Entity Recognition system (CliNER) is an open-source natural language processing system for named entity recognition in the clinical text of electronic health records. CliNER is designed to follow best practices in clinical concept extraction.

CliNER currently supports two options: (1) a traditional machine learning architecture for named entity recognition, using a Conditional Random Fields (CRF) classifier, and (2) a deep learning architecture for sequence labelling, using a recurrent neural network with long short-term memory (LSTM).
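
Both options treat entity recognition as sequence labelling: each token in a sentence receives a tag, and runs of tags mark entity spans. The sketch below illustrates one common way to encode spans as per-token tags, the BIO scheme. It is only an illustration of the general technique; the function name `spans_to_bio`, the sample sentence, and the choice of BIO encoding are assumptions, not a description of CliNER's internal representation.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert entity spans into per-token BIO tags.

    Each span is (start, end, label) with token indices, end inclusive.
    This helper is hypothetical and not part of CliNER.
    """
    tags = ["O"] * len(tokens)  # "O" marks tokens outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first token of the entity
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"  # remaining tokens inside the entity
    return tags


# Invented example sentence, not taken from any clinical corpus.
tokens = ["Patient", "denies", "chest", "pain", "after", "starting", "aspirin", "."]
spans = [(2, 3, "problem"), (6, 6, "treatment")]
tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence labeller such as a CRF or an LSTM is then trained to predict the tag column from the token column.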

  • Free software: Apache v2.0 license


What does it mean? Just tell me what it does!

CliNER identifies clinically relevant entities mentioned in a clinical narrative, such as diseases/disorders, signs/symptoms, medications, and procedures. CliNER needs to be trained on a corpus of clinical notes in which humans have annotated the relevant entities. Using this data, CliNER learns to identify entities like the ones marked up in the annotated text.

Any existing annotated data may be used for training. For example, the i2b2 2010 challenge data contains several hundred discharge summaries in which the mentions of problems, treatments, and tests have been identified.
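
To make the idea of annotated training data concrete, here is a minimal sketch that parses a single concept annotation. It assumes the pipe-delimited layout commonly used for the i2b2 2010 concept files, `c="text" line:token line:token||t="type"`. The `Concept` type, the `parse_concept` helper, and the sample line are hypothetical and written for illustration only; the sample is not taken from the actual corpus, and this is not CliNER's own reader.

```python
import re
from typing import NamedTuple


class Concept(NamedTuple):
    text: str         # surface text of the annotated mention
    line: int         # line number of the mention in the note
    start_token: int  # index of the first token of the mention
    end_token: int    # index of the last token of the mention
    label: str        # concept type, e.g. problem, treatment, test


# Assumed layout: c="<text>" <line>:<tok> <line>:<tok>||t="<type>"
CONCEPT_RE = re.compile(
    r'^c="(?P<text>.*)" (?P<l1>\d+):(?P<t1>\d+) (?P<l2>\d+):(?P<t2>\d+)'
    r'\|\|t="(?P<label>[^"]+)"$'
)


def parse_concept(line: str) -> Concept:
    """Parse one annotation line; raise ValueError if it does not match."""
    m = CONCEPT_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised annotation line: {line!r}")
    return Concept(m["text"], int(m["l1"]), int(m["t1"]), int(m["t2"]), m["label"])


# Invented sample line for illustration.
example = 'c="chest pain" 12:3 12:4||t="problem"'
concept = parse_concept(example)
print(concept)
```

Each parsed annotation pairs a span of tokens with a concept type, which is the supervision a sequence labeller needs.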

Here is an output sample for CliNER trained on the i2b2 2010 challenge data.


W. Boag, K. Wacome, T. Naumann, A. Rumshisky. CliNER: A Lightweight Tool for Clinical Named Entity Recognition. AMIA Joint Summits on Clinical Research Informatics 2015. San Francisco, CA.

W. Boag, E. Sergeeva, S. Kulshreshtha, P. Szolovits, A. Rumshisky, T. Naumann. CliNER 2.0: Accessible and Accurate Clinical Concept Extraction. ML4H: Machine Learning for Health Workshop at NIPS 2017. Long Beach, CA.