Distributional compositional semantics in the age of word embeddings:
tasks, resources and methodology

Tutorial 4 at LREC 2018

May 7th, 2018, Miyazaki, Japan

Bibliography

1. Compositionality in linguistics and philosophy

Wallace L. Chafe. Idiomaticity as an Anomaly in the Chomskyan Paradigm. Foundations of Language, 4(2):109–127, 1968. URL: http://www.jstor.org/stable/25000002.

Donald Davidson. Theories of meaning and learnable languages. In Y. Bar-Hillel, editor, Proceedings of the 1964 International Congress for Logic, Methodology and Philosophy of Science, pages 383–394. Tel-Aviv, 1965.

David R. Dowty. Word Meaning and Montague Grammar. Dordrecht: Reidel, 1979.

Katrin Erk and Sebastian Padó. A structured vector space model for word meaning in context. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 897–906. Association for Computational Linguistics, 2008. URL: http://dl.acm.org/citation.cfm?id=1613831.

Ray Jackendoff. Semantic Structures. Volume 18 of Current studies in linguistics. MIT Press, Cambridge, Mass., 1990. ISBN 978-0-585-37890-9.

J. Katz and P. Postal. The semantic interpretation of idioms and sentences containing them. MIT Research Laboratory of Electronics Quarterly Progress Report, pages 275–282, 1963.

Max Kisselew, Sebastian Padó, Alexis Palmer, and Jan Šnajder. Obtaining a Better Understanding of Distributional Models of German Derivational Morphology. IWCS 2015, page 58, 2015. URL: http://anthology.aclweb.org/W/W15/W15-01.pdf#page=74.

Angeliki Lazaridou, Elia Bruni, and Marco Baroni. Is this a wampimuk? Cross-modal mapping between distributional semantics and the visual world. In ACL (1), 1403–1414. 2014. URL: http://www.aclweb.org/old_anthology/P/P14/P14-1132.pdf.

A. Makkai. Idiom Structure in English. Volume 48 of Janua Linguarum, series maior. The Hague: Mouton, 1972.

Martha McGinnis. On the Systematic Aspect of Idioms. Linguistic Inquiry, 33(4):665–672, October 2002. URL: https://doi.org/10.1162/ling.2002.33.4.665, doi:10.1162/ling.2002.33.4.665.

Richard Montague. Universal grammar. Theoria, 36(3):373–398, 1970.

Peter Pagin and Dag Westerståhl. Compositionality II: Arguments and Problems. Philosophy Compass, 5(3):265–282, March 2010. URL: http://doi.wiley.com/10.1111/j.1747-9991.2009.00229.x, doi:10.1111/j.1747-9991.2009.00229.x.

Barbara Partee. Compositionality. In F. Landman and F. Veltman, editors, Varieties of Formal Semantics: Proceedings of the 4th Amsterdam Colloquium, Sept. 1982, pages 281–311. Foris Pubs., Dordrecht, 1984.

David Pitt and Jerrold J. Katz. Compositional Idioms. Language, 76(2):409–432, 2000. URL: http://www.jstor.org/stable/417662, doi:10.2307/417662.

James Pustejovsky. The generative lexicon. Computational Linguistics, 17(4):409–441, 1991. URL: http://dl.acm.org/citation.cfm?id=176324.

James Pustejovsky. Co-compositionality in grammar. In Markus Werning, Wolfram Hinzen, and Edouard Machery, editors, The Oxford Handbook of Compositionality, Oxford handbooks in linguistics, pages 371–383. Oxford University Press, Oxford ; New York, NY, 2012.

Siva Reddy, Diana McCarthy, and Suresh Manandhar. An Empirical Study on Compositionality in Compound Nouns. In Proceedings of the 5th International Joint Conference on Natural Language Processing, 210–218. Chiang Mai, Thailand, 2011. AFNLP. URL: http://www.aclweb.org/anthology/I11-1024.

Bahar Salehi, Paul Cook, and Timothy Baldwin. A Word Embedding Approach to Predicting the Compositionality of Multiword Expressions. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 977–983. Denver, Colorado, 2015. URL: https://aclanthology.info/papers/N15-1099/n15-1099, doi:10.3115/v1/N15-1099.

Zoltán Gendler Szabó. Compositionality. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, summer 2017 edition, 2017. URL: https://plato.stanford.edu/archives/sum2017/entries/compositionality/.

Masashi Tsubaki, Kevin Duh, Masashi Shimbo, and Yuji Matsumoto. Modeling and Learning Semantic Co-Compositionality through Prototype Projections and Neural Networks. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 130–140. Seattle, Washington, USA, 18-21 October 2013, 2013. URL: http://www.aclweb.org/anthology/D13-1014.

Stephanie Wulff. Marrying cognitive-linguistic theory and corpus-based methods: On the compositionality of English V NP-idioms. In Dylan Glynn and Kerstin Fischer, editors, Quantitative Methods in Cognitive Semantics: Corpus-Driven Approaches, number 46 in Cognitive linguistics research, pages 223–238. De Gruyter Mouton, Berlin ; New York, 2010.

2. Introduction to distributional semantic models

2.1. Introduction to distributional semantic models

Marco Baroni, Georgiana Dinu, and Germán Kruszewski. Don't count, predict! a systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, volume 1, 238–247. 2014. URL: http://anthology.aclweb.org/P/P14/P14-1023.pdf.

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5:135–146, June 2017. URL: https://transacl.org/ojs/index.php/tacl/article/view/999.

John A. Bullinaria and Joseph P. Levy. Extracting semantic representations from word co-occurrence statistics: A computational study. Behavior Research Methods, 39(3):510–526, August 2007. URL: http://link.springer.com/article/10.3758/BF03193020, doi:10.3758/BF03193020.

John A. Bullinaria and Joseph P. Levy. Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD. Behavior Research Methods, 44(3):890–907, 2012. URL: http://link.springer.com/article/10.3758/s13428-011-0183-8.

Anna Gladkova, Aleksandr Drozd, and Satoshi Matsuoka. Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn't. In Proceedings of the NAACL-HLT SRW, 47–54. San Diego, California, June 12-17, 2016, 2016. ACL. URL: https://www.aclweb.org/anthology/N/N16/N16-2002.pdf, doi:10.18653/v1/N16-2002.

Douwe Kiela and Stephen Clark. A systematic study of semantic vector space model parameters. In Proceedings of the 2nd Workshop on Continuous Vector Space Models and Their Compositionality (CVSC) at EACL, 21–30. 2014. URL: http://anthology.aclweb.org/W/W14/W14-15.pdf#page=31.

Siwei Lai, Kang Liu, Shizhu He, and Jun Zhao. How to generate a good word embedding. IEEE Intelligent Systems, 31(6):5–14, 2016. URL: https://ieeexplore.ieee.org/document/7478417/.

Gabriella Lapesa and Stefan Evert. A large scale evaluation of distributional semantic models: parameters, interactions and model selection. Transactions of the Association for Computational Linguistics, 2:531–545, 2014. URL: http://www.aclweb.org/anthology/Q14-1041.

Rémi Lebret and Ronan Collobert. Word embeddings through Hellinger PCA. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 482–490. Gothenburg, Sweden, April 26-30 2014, 2014. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/E14-1051.

Omer Levy and Yoav Goldberg. Neural word embedding as implicit matrix factorization. In NIPS'14 Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 2177–2185. Montreal, Canada, December 08 - 13, 2014, 2014. URL: https://papers.nips.cc/paper/5477-neural-word-embedding-as-implicit-matrix-factorization.pdf.

Omer Levy and Yoav Goldberg. Linguistic Regularities in Sparse and Explicit Word Representations. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning, 171–180. 2014. URL: http://anthology.aclweb.org/W/W14/W14-1618.pdf, doi:10.3115/v1/W14-1618.

Omer Levy, Yoav Goldberg, and Ido Dagan. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3:211–225, 2015. URL: http://www.aclweb.org/anthology/Q15-1016.

Kevin Lund and Curt Burgess. Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2):203–208, 1996. doi:10.3758/BF03204766.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In Proceedings of International Conference on Learning Representations (ICLR). 2013. URL: https://arxiv.org/pdf/1301.3781.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), volume 12, 1532–1543. 2014. URL: http://llcao.net/cu-deeplearning15/presentation/nn-pres.pdf, doi:10.3115/v1/D14-1162.

Alexandre Salle, Aline Villavicencio, and Marco Idiart. Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 419–424. Berlin, Germany, August 7-12, 2016, 2016. Association for Computational Linguistics. URL: http://aclweb.org/anthology/P16-2068, doi:10.18653/v1/P16-2068.

Hinrich Schütze. Word Space. In Advances in Neural Information Processing Systems 5 (NIPS 1992), 895–902. 1993. URL: https://papers.nips.cc/paper/603-word-space.pdf.

2.2. Evaluation paradigms and problems

Judit Acs and András Kornai. Evaluating Embeddings on Dictionary-Based Similarity. In Proceedings of The 1st Workshop on Evaluating Vector Space Representations for NLP, 78. Berlin, Germany, 2016. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/W/W16/W16-25.pdf#page=90.

Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, and Yoav Goldberg. Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks. In ICLR, 1–13. Toulon, France, 2017. URL: https://openreview.net/pdf?id=BJh6Ztuxl.

Eneko Agirre, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, and Weiwei Guo. *SEM 2013 shared task: Semantic Textual Similarity. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task, 32–43. Atlanta, Georgia, 9–14 June 2013, 2013. URL: http://www.aclweb.org/anthology/S13-1004.

Eneko Agirre, Mona Diab, Daniel Cer, and Aitor Gonzalez-Agirre. SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity. In Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, SemEval '12, 385–393. Stroudsburg, PA, USA, 2012. Association for Computational Linguistics. URL: http://dl.acm.org/citation.cfm?id=2387636.2387697.

João António Rodrigues, Chakaveh Saedi, Vladislav Maraev, João Silva, and António Branco. Ways of Asking and Replying in Duplicate Question Detection. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), 262–270. Association for Computational Linguistics, 2017. URL: http://www.aclweb.org/anthology/S17-1030, doi:10.18653/v1/S17-1030.

Jeremy Auguste, Arnaud Rey, and Benoit Favre. Evaluation of word embeddings against cognitive processes: primed reaction times in lexical decision and naming tasks. In Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP, 21–26. 2017. URL: http://www.aclweb.org/anthology/W17-5304.

Marco Baroni and Alessandro Lenci. How We BLESSed Distributional Semantic Evaluation. In Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics, GEMS '11, 1–10. Stroudsburg, PA, USA, 2011. Association for Computational Linguistics. URL: http://dl.acm.org/citation.cfm?id=2140490.2140491.

Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 632–642. Lisbon, Portugal, 17-21 September 2015, 2015. Association for Computational Linguistics. URL: http://aclweb.org/anthology/D15-1075, doi:10.18653/v1/D15-1075.

Jonathan Chang, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei. Reading tea leaves: How humans interpret topic models. In Advances in Neural Information Processing Systems, 288–296. 2009. URL: http://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models.pdf.

Alexis Conneau, Douwe Kiela, Holger Schwenk, Loïc Barrault, and Antoine Bordes. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 670–680. Copenhagen, Denmark, September 7–11, 2017, 2017. Association for Computational Linguistics. URL: http://aclweb.org/anthology/D17-1070, doi:10.18653/v1/D17-1070.

W.B. Dolan and C. Brockett. Automatically constructing a corpus of sentential paraphrases. In Third International Workshop on Paraphrasing (IWP2005). Asia Federation of Natural Language Processing, 2005. URL: https://www.aclweb.org/anthology/I/I05/I05-5002.pdf.

Allyson Ettinger, Naomi H. Feldman, Philip Resnik, and Colin Phillips. Modeling N400 amplitude using vector space models of word representation. In Proceedings of the 38th Annual Conference of the Cognitive Science Society, 1445–1450. 2016. URL: https://pdfs.semanticscholar.org/0845/af7a35cf89c1d6bfdb3237d0c2c1f04e3cd0.pdf.

Allyson Ettinger and Tal Linzen. Evaluating vector space models using human semantic priming results. In Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, 72–77. Berlin, Germany, August 12, 2016, 2016. Association for Computational Linguistics. URL: http://anthology.aclweb.org/W16-25#page=84.

Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, and Chris Dyer. Problems With Evaluation of Word Embeddings Using Word Similarity Tasks. Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, pages 30–35, 2016. URL: http://aclanthology.info/papers/problems-with-evaluation-of-word-embeddings-using-word-similarity-tasks.

Anna Gladkova and Aleksandr Drozd. Intrinsic evaluations of word embeddings: what can we do better? In Proceedings of The 1st Workshop on Evaluating Vector Space Representations for NLP, 36–42. Berlin, Germany, August 12, 2016, 2016. ACL. URL: http://www.aclweb.org/anthology/W/W16/W16-2507.pdf, doi:10.18653/v1/W16-2507.

Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. Teaching Machines to Read and Comprehend. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS'15, 1693–1701. Cambridge, MA, USA, 2015. MIT Press. URL: http://dl.acm.org/citation.cfm?id=2969239.2969428.

Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston. The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations. arXiv:1511.02301 [cs], November 2015. URL: http://arxiv.org/abs/1511.02301, arXiv:1511.02301.

Shankar Iyer, Nikhil Dandekar, and Kornél Csernai. First Quora Dataset Release: Question Pairs. January 2017. URL: https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs.

Michael N. Jones, Walter Kintsch, and Douglas J.K. Mewhort. High-dimensional semantic space accounts of priming. Journal of Memory and Language, 55(4):534–552, November 2006. URL: http://linkinghub.elsevier.com/retrieve/pii/S0749596X06000829, doi:10.1016/j.jml.2006.07.003.

Sigrid Klerke, Héctor Martínez Alonso, and Anders Søgaard. Looking hard: Eye tracking for detecting grammaticality of automatically compressed sentences. Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015), pages 97–105, 2015. URL: https://aclanthology.coli.uni-saarland.de/papers/W15-1814/w15-1814.

Michal Konkol, Tomáš Brychcín, Michal Nykl, and Tomáš Hercig. Geographical Evaluation of Word Embeddings. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), volume 1, 224–232. 2017. URL: http://aclweb.org/anthology/I17-1023.

Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, and Eduard Hovy. RACE: Large-scale ReAding Comprehension Dataset From Examinations. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 785–794. Association for Computational Linguistics, 2017. URL: http://aclweb.org/anthology/D17-1082, doi:10.18653/v1/D17-1082.

Gabriella Lapesa and Stefan Evert. Evaluating neighbor rank and distance measures as predictors of semantic priming. In Proceedings of the ACL Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2013), 66–74. 2013. URL: http://www.aclweb.org/anthology/W13-2608.

Xin Li and Dan Roth. Learning Question Classifiers. In Proceedings of the 19th International Conference on Computational Linguistics - Volume 1, COLING '02, 1–7. Stroudsburg, PA, USA, 2002. Association for Computational Linguistics. URL: https://doi.org/10.3115/1072228.1072378, doi:10.3115/1072228.1072378.

Teng Long, Emmanuel Bengio, Ryan Lowe, Jackie Chi Kit Cheung, and Doina Precup. World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 825–834. Association for Computational Linguistics, 2017. URL: http://aclweb.org/anthology/D17-1086, doi:10.18653/v1/D17-1086.

Kevin Lund and Curt Burgess. Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2):203–208, 1996. doi:10.3758/BF03204766.

Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330, June 1993. URL: http://dl.acm.org/citation.cfm?id=972470.972475.

Marco Marelli, Stefano Menini, Marco Baroni, Luisa Bentivogli, Raffaella Bernardi, and Roberto Zamparelli. A SICK cure for the evaluation of compositional distributional semantic models. In LREC, 216–223. 2014. URL: http://www.aclweb.org/anthology/L14-1314.

Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. In HLT-NAACL, 746–751. 2013. URL: http://www.aclweb.org/anthology/N13-1#page=784.

Saif Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, and Colin Cherry. A Dataset for Detecting Stance in Tweets. In Nicoletta Calzolari (Conference Chair), Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). Portorož, Slovenia, May 2016. European Language Resources Association (ELRA). URL: http://www.lrec-conf.org/proceedings/lrec2016/pdf/232_Paper.pdf.

Nasrin Mostafazadeh, Michael Roth, Nathanael Chambers, and Annie Louis. LSDSem 2017 Shared Task: The Story Cloze Test. In Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-Level Semantics, 46–51. Association for Computational Linguistics, 2017. URL: http://www.aclweb.org/anthology/W17-0900.

Neha Nayak, Gabor Angeli, and Christopher D. Manning. Evaluating Word Embeddings Using a Representative Suite of Practical Tasks. In Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, 19–23. Berlin, Germany, August 12, 2016, 2016. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/W16-2504, doi:10.18653/v1/W16-2504.

Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2383–2392. Association for Computational Linguistics, 2016. URL: http://aclweb.org/anthology/D16-1264, doi:10.18653/v1/D16-1264.

Matthew Richardson, Christopher J C Burges, and Erin Renshaw. MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 193–203. Seattle, Washington, USA, 18-21 October 2013, 2013. URL: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/MCTest_EMNLP2013.pdf.

Parinaz Sobhani, Diana Inkpen, and Xiaodan Zhu. A Dataset for Multi-Target Stance Detection. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 551–557. Valencia, Spain, April 3-7, 2017, 2017. Association for Computational Linguistics. URL: http://aclweb.org/anthology/E17-2088, doi:10.18653/v1/E17-2088.

Anders Søgaard. Evaluating word embeddings with fMRI and eye-tracking. In Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, 116–121. Berlin, Germany, August 12, 2016, 2016. Association for Computational Linguistics. URL: http://anthology.aclweb.org/W16-25#page=128.

Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to Sequence Learning with Neural Networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 3104–3112. Curran Associates, Inc., 2014. URL: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf.

Erik F. Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003-Volume 4, 142–147. Association for Computational Linguistics, 2003. URL: http://aclweb.org/anthology/W03-0419.

Erik F. Tjong Kim Sang and Sabine Buchholz. Introduction to the CoNLL-2000 shared task: Chunking. In Proceedings of CoNLL. Association for Computational Linguistics, 2000. URL: http://aclweb.org/anthology/W00-0726.

Kristina Toutanova, Dan Klein, and Christopher D Manning. Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, volume 1, 173–180. 2003. URL: http://aclweb.org/anthology/N03-1033.

Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Guillaume Lample, and Chris Dyer. Evaluation of word vector representations by subspace alignment. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2049–2054. Lisbon, Portugal, 17-21 September 2015, 2015. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/D15-1243.

Yulia Tsvetkov, Manaal Faruqui, and Chris Dyer. Correlation-based Intrinsic Evaluation of Word Vector Representations. Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, pages 111–115, 2016. URL: http://aclanthology.info/papers/correlation-based-intrinsic-evaluation-of-word-vector-representations.

Cyma Van Petten. Examining the N400 semantic context effect item-by-item: Relationship to corpus-based measures of word co-occurrence. International Journal of Psychophysiology, 94(3):407–419, December 2014. URL: http://www.sciencedirect.com/science/article/pii/S0167876014016377, doi:10.1016/j.ijpsycho.2014.10.012.

Ekaterina Vylomova, Laura Rimell, Trevor Cohn, and Timothy Baldwin. Take and took, gaggle and goose, book and read: evaluating the utility of vector differences for lexical relation learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1671–1682. Berlin, Germany, 2016. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/P16-1158, doi:10.18653/v1/P16-1158.

Adina Williams, Nikita Nangia, and Samuel R. Bowman. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1112–1122. New Orleans, Louisiana, 2018. Association for Computational Linguistics. URL: http://aclweb.org/anthology/N18-1101, doi:10.18653/v1/N18-1101.

Geoffrey Zweig and Christopher J C Burges. The Microsoft Research Sentence Completion Challenge. Technical Report MSR-TR-2011-129, Microsoft Research, Redmond, WA, 2011. URL: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR_SCCD.pdf.

3. Compositionality at sentence level

3.1. Compositionality at sentence level: algebraic approaches

Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, and Yoav Goldberg. Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks. In ICLR, 1–13. Toulon, France, 2017. URL: https://openreview.net/pdf?id=BJh6Ztuxl.

Marco Baroni and Roberto Zamparelli. Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 1183–1193. MIT, Massachusetts, USA, 9-11 October 2010, 2010. URL: https://www.aclweb.org/anthology/D/D10/D10-1115.pdf.

Islam Beltagy, Katrin Erk, and Raymond Mooney. Probabilistic Soft Logic for Semantic Textual Similarity. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1210–1219. Baltimore, Maryland, 2014. URL: http://aclweb.org/anthology/P14-1114.

I. Beltagy, Stephen Roller, Pengxiang Cheng, Katrin Erk, and Raymond J. Mooney. Representing Meaning with a Combination of Logical and Distributional Models. Computational Linguistics, 42(4):763–808, December 2016. URL: http://www.aclweb.org/anthology/J16-4007, arXiv:1505.06816, doi:10.1162/COLI_a_00266.

William Blacoe and Mirella Lapata. A Comparison of Vector-based Representations for Semantic Composition. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 546–556. Stroudsburg, PA, USA, 2012. Association for Computational Linguistics. URL: http://dl.acm.org/citation.cfm?id=2390948.2391011.

C. De Boom, S. Van Canneyt, S. Bohez, T. Demeester, and B. Dhoedt. Learning Semantic Similarity for Very Short Texts. In 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 1229–1234. November 2015. URL: https://ieeexplore.ieee.org/document/7395808/, doi:10.1109/ICDMW.2015.86.

Daoud Clarke. A context-theoretic framework for compositionality in distributional semantics. Computational Linguistics, 38(1):41–71, March 2012. URL: http://dl.acm.org/citation.cfm?id=2122944.2122946, doi:10.1162/COLI_a_00084.

Stephen Clark and Stephen Pulman. Combining Symbolic and Distributional Models of Meaning. In AAAI Spring Symposium: Quantum Interaction, 52–55. 2007. URL: http://www.aaai.org/Papers/Symposia/Spring/2007/SS-07-08/SS07-08-008.pdf.

Bob Coecke, Mehrnoosh Sadrzadeh, and Stephen Clark. Mathematical Foundations for a Compositional Distributional Model of Meaning. Linguistic Analysis, 36(1-4):345–384, 2011. URL: https://ora.ox.ac.uk/objects/uuid:2932007a-6911-400f-9992-e0f85fb9de93/datastreams/ATTACHMENT01.

Edilson Anselmo Corrêa Júnior, Vanessa Queiroz Marinho, and Leandro Borges dos Santos. NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis. In Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval-2017), 611–615. Association for Computational Linguistics, 2017. URL: http://aclweb.org/anthology/S17-2100, doi:10.18653/v1/S17-2100.

Katrin Erk and Sebastian Padó. A structured vector space model for word meaning in context. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 897–906. Association for Computational Linguistics, 2008. URL: http://dl.acm.org/citation.cfm?id=1613831.

Alona Fyshe, Leila Wehbe, Partha P. Talukdar, Brian Murphy, and Tom M. Mitchell. A Compositional and Interpretable Semantic Space. Proceedings of the NAACL-HLT, Denver, USA, 2015. URL: http://www.cs.cmu.edu/~fmri/papers/naacl2015/comp_nnse.pdf.

Edward Grefenstette and Mehrnoosh Sadrzadeh. Experimental support for a categorical compositional distributional model of meaning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1394–1404. Association for Computational Linguistics, 2011. URL: http://dl.acm.org/citation.cfm?id=2145580.

Edward Grefenstette and Mehrnoosh Sadrzadeh. Concrete Models and Empirical Evaluations for the Categorical Compositional Distributional Model of Meaning. Computational Linguistics, 41(1):71–118, March 2015. URL: http://www.mitpressjournals.org/doi/10.1162/COLI_a_00209, doi:10.1162/COLI_a_00209.

Thomas K. Landauer, Darrell Laham, Bob Rehder, and Missy E. Schreiner. How well can passage meaning be derived without using word order? A comparison of Latent Semantic Analysis and humans. In Proceedings of the 19th Annual Meeting of the Cognitive Science Society, 412–417. 1997. URL: http://lsa.colorado.edu/papers/cogsci97.pdf.

Quoc V. Le and Tomas Mikolov. Distributed Representations of Sentences and Documents. In International Conference on Machine Learning - ICML 2014, volume 32, 1188–1196. 2014. URL: http://arxiv.org/abs/1405.4053, doi:10.1145/2740908.2742760.

Mike Lewis and Mark Steedman. Combining distributional and logical semantics. Transactions of the Association for Computational Linguistics, 1:179–192, 2013. URL: http://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/93.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26 (NIPS 2013), 3111–3119. 2013. URL: http://papers.nips.cc/paper/5021-di.

Jeff Mitchell and Mirella Lapata. Composition in distributional models of semantics. Cognitive Science, 34(8):1388–1429, 2010. URL: https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1551-6709.2010.01106.x, doi:10.1111/j.1551-6709.2010.01106.x.

Xiaochang Peng and Daniel Gildea. Exploring phrase-compositionality in skip-gram models. arXiv:1607.06208 [cs], July 2016. URL: http://arxiv.org/abs/1607.06208, arXiv:1607.06208.

Sebastian Rudolph and Eugenie Giesbrecht. Compositional Matrix-Space Models of Language. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 907–916. Uppsala, Sweden 11-16 June 2010, 2010. URL: http://www.aclweb.org/anthology/P10-1093.

Tim Van de Cruys, Thierry Poibeau, and Anna Korhonen. A tensor-based factorization model of semantic compositionality. In Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), 1142–1151. 2013. URL: https://hal.archives-ouvertes.fr/hal-00997334/.

Lyndon White, Roberto Togneri, Wei Liu, and Mohammed Bennamoun. How Well Sentence Embeddings Capture Meaning. In Proceedings of the 20th Australasian Document Computing Symposium, ADCS '15, 9:1–9:8. New York, NY, USA, 2015. ACM. URL: http://doi.acm.org/10.1145/2838931.2838932, doi:10.1145/2838931.2838932.

Dominic Widdows. Word-Vectors and Search Engines. In Geometry and Meaning. Stanford: Center for the Study of Language and Information, 2004. URL: http://www.puttypeg.net/papers/vector-chapter.pdf.

3.2. Compositionality at sentence level: holistic approaches

Baotian Hu, Zhengdong Lu, Hang Li, and Qingcai Chen. Convolutional Neural Network Architectures for Matching Natural Language Sentences. In Advances in Neural Information Processing Systems 27 (NIPS 2014), volume 2, 2042–2050. Montreal, Canada, December 8-13, 2014. URL: https://dl.acm.org/citation.cfm?id=2969055.

Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 655–665. Baltimore, Maryland, USA, June 23-25, 2014. URL: http://www.aclweb.org/anthology/P14-1062.

Ryan Kiros, Yukun Zhu, Ruslan R Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. Skip-Thought Vectors. In Advances in Neural Information Processing Systems 28 (NIPS 2015), volume 2, 3294–3302. Montreal, Canada, December 7-12, 2015. URL: https://papers.nips.cc/paper/5950-skip-thought-vectors.pdf.

Quoc V. Le and Tomas Mikolov. Distributed Representations of Sentences and Documents. In International Conference on Machine Learning - ICML 2014, volume 32, 1188–1196. 2014. URL: http://arxiv.org/abs/1405.4053, doi:10.1145/2740908.2742760.

Tal Linzen. Issues in evaluating semantic spaces using word analogies. In Proceedings of the First Workshop on Evaluating Vector Space Representations for NLP. Association for Computational Linguistics, 2016. URL: http://anthology.aclweb.org/W16-2503, doi:10.18653/v1/W16-2503.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26 (NIPS 2013), 3111–3119. 2013. URL: http://papers.nips.cc/paper/5021-di.

Jonas Mueller and Aditya Thyagarajan. Siamese Recurrent Architectures for Learning Sentence Similarity. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), 2786–2792. Phoenix, Arizona, February 12-17, 2016. AAAI Press. URL: https://dl.acm.org/citation.cfm?id=3016291.

Richard Socher, Brody Huval, Christopher D Manning, and Andrew Y Ng. Semantic Compositionality through Recursive Matrix-Vector Spaces. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 1201–1211. Stroudsburg, PA, USA, 2012. doi:10.1162/153244303322533223.

Richard Socher, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, and Christopher D. Manning. Semi-supervised Recursive Autoencoders for Predicting Sentiment Distributions. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '11, 151–161. Stroudsburg, PA, USA, 2011. Association for Computational Linguistics. URL: http://dl.acm.org/citation.cfm?id=2145432.2145450.

Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to Sequence Learning with Neural Networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 3104–3112. Curran Associates, Inc., 2014. URL: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf.

Kai Sheng Tai, Richard Socher, and Christopher D. Manning. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 1556–1566. Beijing, China, July 26-31, 2015. Association for Computational Linguistics. URL: http://aclweb.org/anthology/P15-1150, doi:10.3115/v1/P15-1150.

3.3. Compositionality at subword level

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5:135–146, June 2017. URL: https://transacl.org/ojs/index.php/tacl/article/view/999.

Yuanzhi Ke and Masafumi Hagiwara. Radical-level Ideograph Encoder for RNN-based Sentiment Analysis of Chinese and Japanese. arXiv:1708.03312 [cs], August 2017. URL: http://arxiv.org/abs/1708.03312, arXiv:1708.03312.

Yanran Li, Wenjie Li, Fei Sun, and Sujian Li. Component-enhanced Chinese character embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 829–834. Lisbon, Portugal, 17-21 September 2015. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/D15-1098.

Viet Nguyen, Julian Brooke, and Timothy Baldwin. Sub-character Neural Language Modelling in Japanese. In Proceedings of the First Workshop on Subword and Character Level Models in NLP, 148–153. 2017. URL: http://aclweb.org/anthology/W17-4122.

Yuya Sakaizawa and Mamoru Komachi. Construction of a Japanese Word Similarity Dataset. In Nicoletta Calzolari (Conference chair), Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, and Takenobu Tokunaga, editors, Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan, May 2018. European Language Resources Association (ELRA). URL: http://www.lrec-conf.org/proceedings/lrec2018/pdf/96.pdf.

John Wieting, Mohit Bansal, Kevin Gimpel, and Karen Livescu. Charagram: Embedding words and sentences via character n-grams. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 1504–1515. Austin, Texas, November 1-5, 2016. Association for Computational Linguistics. URL: http://aclweb.org/anthology/D16-1157.

Rongchao Yin, Quan Wang, Peng Li, Rui Li, and Bin Wang. Multi-Granularity Chinese Word Embedding. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 981–986. Austin, Texas, November 1-5, 2016. Association for Computational Linguistics. URL: http://aclweb.org/anthology/D16-1100, doi:10.18653/v1/D16-1100.

Jinxing Yu, Xun Jian, Hao Xin, and Yangqiu Song. Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 286–291. Copenhagen, Denmark, September 7–11, 2017. Association for Computational Linguistics. URL: http://aclweb.org/anthology/D17-1027, doi:10.18653/v1/D17-1027.

4. Meaning components in word embeddings

Aleksandr Drozd, Anna Gladkova, and Satoshi Matsuoka. Word embeddings, analogies, and machine learning: beyond king - man + woman = queen. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 3519–3530. Osaka, Japan, December 11-17, 2016. URL: https://www.aclweb.org/anthology/C/C16/C16-1332.pdf.

Alona Fyshe, Leila Wehbe, Partha P. Talukdar, Brian Murphy, and Tom M. Mitchell. A Compositional and Interpretable Semantic Space. In Proceedings of NAACL-HLT. Denver, USA, 2015. URL: http://www.cs.cmu.edu/~fmri/papers/naacl2015/comp_nnse.pdf.

Anna Gladkova and Aleksandr Drozd. Intrinsic evaluations of word embeddings: what can we do better? In Proceedings of The 1st Workshop on Evaluating Vector Space Representations for NLP, 36–42. Berlin, Germany, August 12, 2016. ACL. URL: http://www.aclweb.org/anthology/W/W16/W16-2507.pdf, doi:10.18653/v1/W16-2507.

Anna Gladkova, Aleksandr Drozd, and Satoshi Matsuoka. Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn't. In Proceedings of the NAACL-HLT SRW, 47–54. San Diego, California, June 12-17, 2016. ACL. URL: https://www.aclweb.org/anthology/N/N16/N16-2002.pdf, doi:10.18653/v1/N16-2002.

Hongyin Luo, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. Online Learning of Interpretable Word Embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1687–1692. Lisbon, Portugal, 17-21 September 2015. Association for Computational Linguistics. URL: http://www.anthology.aclweb.org/D/D15/D15-1196.pdf.

Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. In HLT-NAACL, 746–751. 2013. URL: http://www.aclweb.org/anthology/N13-1#page=784.

Brian Murphy, Partha Talukdar, and Tom Mitchell. Learning effective and interpretable semantic models using non-negative sparse embedding. Proceedings of COLING 2012, pages 1933–1950, 2012. URL: http://aclanthology.info/pdf/C/C12/C12-1118.pdf.

Anna Rogers, Aleksandr Drozd, and Bofang Li. The (Too Many) Problems of Analogical Reasoning with Word Vectors. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), 135–148. 2017. URL: http://www.aclweb.org/anthology/S17-1017.

Fei Sun, Jiafeng Guo, Yanyan Lan, Jun Xu, and Xueqi Cheng. Sparse Word Embeddings Using L1 Regularized Online Learning. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI'16, 2915–2921. New York, New York, USA, 2016. AAAI Press. URL: http://dl.acm.org/citation.cfm?id=3060832.3061029.

Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Guillaume Lample, and Chris Dyer. Evaluation of word vector representations by subspace alignment. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2049–2054. Lisbon, Portugal, 17-21 September 2015. Association for Computational Linguistics. URL: http://www.aclweb.org/anthology/D15-1243.