I am a post-doctoral associate at the Computer Science Department, University of Massachusetts (Lowell). I hold a Ph.D. degree from the Department of Language and Information Sciences at the University of Tokyo (Japan). My research spans the intersection of linguistics, natural language processing, and machine learning.
This page lists my work with the Text Machine Lab, which I joined in 2017. See my full research profile here.
RuSentiment is a new high-quality dataset for sentiment analysis in Russian, enriched with active learning. We also present a lightweight annotation scheme for social media that ensures high speed and consistency, and can be applied to other languages (Russian and English versions released).
Word embeddings are the most widely used kind of distributional meaning representation in both industrial and academic NLP systems, and they can make a dramatic difference in system performance. However, the absence of a reliable intrinsic evaluation metric makes it hard to choose between dozens of models and their parameters. This work presents Linguistic Diagnostics (LD), a new methodology for evaluation, error analysis, and development of word embedding models, implemented in an open-source Python library. In a large-scale experiment with 14 datasets, LD successfully highlights differences in the output of the GloVe and word2vec algorithms that correlate with their performance on different NLP tasks.
@inproceedings{rogers2018rusentiment,
  title={RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian},
  author={Rogers, Anna and Romanov, Alexey and Rumshisky, Anna and Volkova, Svitlana and Gronas, Mikhail and Gribov, Alex},
  booktitle={Proceedings of the 27th International Conference on Computational Linguistics},
  pages={755--763},
  year={2018}
}
@article{rogers_whats_2018,
  author = "Rogers, Anna and Ananthakrishna, Shashwath Hosur and Rumshisky, Anna",
  title = "What's in {Your} {Embedding}, {And} {How} {It} {Predicts} {Task} {Performance}",
  language = "en",
  journal = "Proceedings of COLING",
  year = "2018",
  pages = "14"
}