Medical Concept Normalization
Normalization maps clinical terms in medical notes to concepts in standardized medical vocabularies. We develop the MCN corpus, a corpus for medical concept normalization that consists of 100 discharge summaries and provides normalization for a total of 10,919 concept mentions, using 3,792 unique concepts from two controlled vocabularies, RxNorm and SNOMED CT.
We develop a baseline system for this task that complements the traditional approach based on lexical transformations with a hybrid normalization method, which incorporates a deep learning model to capture semantic similarity between different surface expressions of the same concept. Evaluated on the mentions in the ShARe/CLEF eHealth 2013 dataset that can be normalized to existing concepts, our hybrid system achieves 90.6% accuracy and outperforms a strong exact match + edit distance baseline by 2.6%. The results suggest that the deep learning model can further improve normalization performance by mapping concept mentions to concepts through semantic similarity.
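To make the idea of an exact match + edit distance baseline with a semantic fallback concrete, here is a minimal Python sketch. It is not the system described above: the toy lexicon, the placeholder concept identifiers, the `max_distance` threshold, and the `semantic_fallback` hook are all assumptions made for illustration, and a plain callable stands in for the deep learning model.

```python
from typing import Callable, Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalize(
    mention: str,
    lexicon: Dict[str, str],
    max_distance: int = 1,  # hypothetical threshold, not taken from the papers
    semantic_fallback: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return a concept identifier for `mention`, or None if nothing matches."""
    key = mention.lower().strip()

    # Step 1: exact match against the lowercased lexicon terms.
    if key in lexicon:
        return lexicon[key]

    # Step 2: closest lexicon term, accepted only within the distance threshold.
    best = min(lexicon, key=lambda term: edit_distance(key, term), default=None)
    if best is not None and edit_distance(key, best) <= max_distance:
        return lexicon[best]

    # Step 3: optional semantic matcher (stand-in for a learned model).
    if semantic_fallback is not None:
        return semantic_fallback(mention)
    return None


# Toy lexicon with placeholder identifiers (not real RxNorm or SNOMED CT codes).
TOY_LEXICON = {
    "hypertension": "CONCEPT_HTN",
    "aspirin": "CONCEPT_ASA",
}

if __name__ == "__main__":
    for text in ["Hypertension", "hypertenson", "high blood pressure"]:
        print(text, "->", normalize(text, TOY_LEXICON))
```

In this sketch, a mention such as "high blood pressure" shares almost no characters with "hypertension", so the lexical steps return nothing. That is the kind of case a similarity-based matcher is meant to cover.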
Publications
@article{luo2019mcn,
  title={MCN: A Comprehensive Corpus for Medical Concept Normalization},
  author={Luo, Yen-Fu and Sun, Weiyi and Rumshisky, Anna},
  journal={Journal of Biomedical Informatics},
  pages={103132},
  year={2019},
  publisher={Elsevier}
}
@article{luo2019hybrid,
  title={A Hybrid Normalization Method for Medical Concepts in Clinical Narrative using Semantic Matching},
  author={Luo, Yen-Fu and Sun, Weiyi and Rumshisky, Anna},
  journal={AMIA Summits on Translational Science Proceedings},
  volume={2019},
  pages={732},
  year={2019},
  publisher={American Medical Informatics Association}
}