HG7017: Computational Lexical Semantics

Francis Bond, 2014.

Thursday 9:30-13:30 HSS Computer Lab 2 (HSS-01-05)

In this course students will become familiar with how to represent word meanings computationally and with a variety of methods for automatically determining the meaning of words. These include dictionary-based methods such as LESK, graph-based methods such as UKB, supervised methods such as sequence classification and best paths, and vector space methods. We will finish with a discussion of how to provide feedback from word sense disambiguation to meaning representation.

The course is a seminar course, with participants choosing and presenting papers each week. Because of this, content varies from year to year depending on the composition of the class.

The course includes a strong computational component: students will experiment with implementing systems and algorithms, in addition to learning about them. By the end of this course, graduate students will have an advanced knowledge of current computational approaches to lexical meaning. Students should be able to evaluate lexical resources for coverage and quality as well as to design and implement systems for determining meaning in text.

This year (2014) we will try to cover at least:

Lexical Semantics in Definitions (LESK)
Distributional Approaches (e.g. Vector Spaces)
Structural Semantics (e.g. Semantic Dependencies)
Graph based approaches (e.g. personalised page rank)
BabelNet (a much richer graph)
Combinations
- Structure + Distributional Semantics (Socher)
- Structural + Lexical (AMR)
Knowledge Acquisition (e.g. learning from definitions or text snippets; also learning regular expressions from examples)
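
As a taste of the graph-based approaches listed above, the following is a minimal sketch of personalised PageRank computed by power iteration. It is a toy illustration only, not UKB itself: the graph, the node names (`ctx:river`, `bank#shore`, and so on), the damping factor and the iteration count are all assumptions made for the example. The seeds are assumed to be a non-empty set of nodes that all occur in the graph.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank with teleportation restricted to `seeds`.

    graph: dict mapping each node to a list of its out-neighbours
           (list an edge in both directions for an undirected graph).
    seeds: non-empty set of nodes in `graph` that receive the teleport mass.
    """
    nodes = list(graph)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Every node starts each round with its share of the teleport mass.
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if neighbours:
                share = damping * rank[n] / len(neighbours)
                for m in neighbours:
                    new_rank[m] += share
            else:
                # Dangling node: return its mass through the teleport vector.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * teleport[m]
        rank = new_rank
    return rank


# Hypothetical toy graph: context words linked to candidate senses of "bank".
TOY_GRAPH = {
    "ctx:river": ["bank#shore"],
    "ctx:water": ["bank#shore"],
    "ctx:money": ["bank#finance"],
    "bank#shore": ["ctx:river", "ctx:water"],
    "bank#finance": ["ctx:money"],
}

ranks = personalized_pagerank(TOY_GRAPH, seeds={"ctx:river", "ctx:water"})
best_sense = max(["bank#shore", "bank#finance"], key=ranks.get)
```

With the context words `river` and `water` as seeds, all of the probability mass stays in the part of the graph connected to `bank#shore`, so that sense is chosen. A real system would run over a full lexical knowledge base rather than a hand-built graph.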

Schedule

Seminar 1 2014-08-14
- Various Approaches to WSD (HG8003 notes)
- Simplified Extended LESK (Baldwin et al 2010) (Slides)
- Research at VU: NAF and Newsreader
Seminar 2 2014-08-21
- Semantic Dependencies (Rose Chen: slides)
  - Chen et al (2012) Semantic Labeling of Chinese Verb-complement Structure Based on Feature Structure, CLSW'12 Proceedings of the 13th Chinese conference on Chinese Lexical Semantics, Pages 784-790
  - Chen et al (????) Building a Chinese Semantic Resource Based on Feature Structure IJCPOL
  - Chen et al (2013) Semantic Labeling of Chinese Serial Verb Sentences Based on Feature Structure
- Deep Learning over distributional spaces applied to learning predominant senses (Huizhen Wang: slides)
  - Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013.
  - code (in python: tutorial)
  - McCarthy et al. (2004) Finding predominant word senses in untagged text ACL 2004
  - Lau et al (2014) Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models (slides)
Seminar 3 2014-08-28
- Recursive Deep Learning (Egor: slides)
  - Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks, Richard Socher, Christopher D. Manning, Andrew Y. Ng.
    Deep Learning and Unsupervised Feature Learning Workshop - NIPS 2010, Oral.
  - NAACL 2013 talk by Manning and Socher (video, slides)
- Cross-lingual sense projection (Giulia: slides)
  - Luisa Bentivogli and Emanuele Pianta (2005) Exploiting Parallel Texts in the Creation of Multilingual Semantically Annotated Resources: The MultiSemCor Corpus, In Natural Language Engineering, Special Issue on Parallel Texts, Volume 11, Issue 03, September 2005, pp. 247-261.
- BabelNet & WSD (Luis: slides)
  - R. Navigli and S. Ponzetto. BabelNet: The Automatic Construction, Evaluation and Application of a Wide-Coverage Multilingual Semantic Network. Artificial Intelligence, 193, Elsevier, 2012, pp. 217-250.
  - R. Navigli, D. A. Jurgens, D. Vannella. SemEval-2013 Task 12: Multilingual Word Sense Disambiguation. Proc. of 7th International Workshop on Semantic Evaluation (SemEval), in the Second Joint Conference on Lexical and Computational Semantics (*SEM 2013), Atlanta, USA, June 14-15th, 2013, pp. 222-231.
Seminar 4 2014-09-04 (FCB has a clash 10:30-12:30)
- Graph Based Word Sense Disambiguation and Similarity with Personalized Page Rank (Yukun: slides)
  - Eneko Agirre and Aitor Soroa (2009) Personalizing PageRank for Word Sense Disambiguation. Proceedings of the 12th conference of the European chapter of the Association for Computational Linguistics (EACL-2009). Athens, Greece.
  - Eneko Agirre, Oier Lopez de Lacalle and Aitor Soroa (2013) Random Walks for Knowledge-Based Word Sense Disambiguation. Computational Linguistics. 40:1. ISSN 0891-2017. doi:10.1162/COLI_a_00164 (optional)
Seminar 5 2014-09-11
- Learning Ontologies from Definitions (David: slides)
  - Francis Bond, Eric Nichols, Sanae Fujita and Takaaki Tanaka (2004) Acquiring an Ontology for a Fundamental Vocabulary. In 20th International Conference on Computational Linguistics (COLING-2004), 1319–1325, Geneva.
  - Eric Nichols, Francis Bond, and Daniel Flickinger (2005) Robust ontology acquisition from machine-readable dictionaries. In Proceedings of the International Joint Conference on Artificial Intelligence IJCAI-2005, 1111–1116, Edinburgh.
- Word Sense Disambiguation (Tuan Anh: slides)
  - Zhong, Zhi, & Ng, Hwee Tou (2010). It Makes Sense: A Wide-Coverage Word Sense Disambiguation System for Free Text. Proceedings of the ACL 2010 System Demonstrations. (pp. 78–83). Uppsala, Sweden.
  - Other resources:
    - What is machine learning? Supervised learning vs unsupervised learning?
    - A very good book (and free access) for hardcore learners
Seminar 6 2014-09-18
- Taxonomy Construction Using Syntactic Contextual Evidence (Luu Anh Tuan and Kim Jung-jae:
- On Research (Francis Bond: slides)
  - Richard Hamming (1968) You and Your Research
  - NTU Authorship Guidelines
Seminar 7 2014-09-25
- A computational approach to generate a sensorial lexicon (Tuan Anh)
- Panlex (Michael)
  - Building a Sense-Distinguished Multilingual Lexicon from Monolingual Corpora and Bilingual Lexicons
Recess
Seminar 8 2014-10-16
- Combining Resources (David: slides)
  - Margaretha and Manurung (2008) Comparing the value of Latent Semantic Analysis on two English-to-Indonesian lexical mapping tasks
- The Integrated Semantic Framework (Francis: slides)
Seminar 9 2014-10-23 (short class) PROJECT ONE DUE 2014-10-24 11:59 (email is fine)
- KORE: Keyphrase Overlap Relatedness for Entity Disambiguation (Yukun)
Seminar 10 2014-10-30 (+ invited talk)
- Phrase clustering for discriminative learning by Lin & Wu (Egor)
- An Enhanced Lesk Word Sense Disambiguation algorithm through a Distributional Semantic Model (P. Basile, A. Caputo, and G. Semeraro. Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 1591-1600) (Giulia)
- 31st 15:30-18:30: Talks on Wordnet (Christiane Fellbaum) and the ERG (Dan Flickinger)
Seminar 11 2014-11-06 (no presentations --- individual meetings about your projects as necessary)
Seminar 12 2014-11-13 (Late start: 12:30–)
- Final discussion:
  What are the most important problems in your field?
  - Are you working on them?
Final Deadline 2014-12-01 PROJECT TWO DUE 2014-12-01 11:59 (email is fine)

Assessment

Presentation One (20%): assigned subject

Present a paper (or series of papers), along with an extended motivation/background and some discussion of applications. We will discuss both the content of the paper, and also the form of your presentation. One goal of the course is to learn how to present information effectively.

Presentations should take an hour, with 30-40 minutes of presentation and 20-30 minutes of discussion. You should give me the corrected slides within a week of the presentation, and I will add them to the web page.

Seminar 2: Rose Chen; Huizhen Wang
Seminar 3: Egor, Giulia
Seminar 4: Yukun, Luis
Seminar 5: David, Tuan Anh

Presentation Two (20%): your choice (must be OKed: should fit in with your research)

Present a paper (or series of papers), along with an extended motivation/background and some discussion of applications. Can be your own work (if it is ready).

Seminar 7: Tuan Anh, Mike
Seminar 8: David
Seminar 9: ???
Seminar 11: ???

Project One (30%): assigned topic

Sample: critically evaluate one lexical resource (such as WordNet or a tagged corpus). The evaluation will include writing code to summarize its properties and compare it to other resources.
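
As a rough illustration of the kind of code this involves, the sketch below measures how much of a tokenized text a lexicon covers, by type and by token, and how two lexicons overlap. The function names and the toy data are hypothetical, and each lexicon is assumed to be a plain set of lowercase lemmas; a real evaluation would also need lemmatization and a look at sense-level quality.

```python
from collections import Counter


def coverage(lexicon, tokens):
    """Return (type coverage, token coverage) of `lexicon` over `tokens`."""
    counts = Counter(t.lower() for t in tokens)
    if not counts:
        return 0.0, 0.0
    covered = [w for w in counts if w in lexicon]
    type_cov = len(covered) / len(counts)
    token_cov = sum(counts[w] for w in covered) / sum(counts.values())
    return type_cov, token_cov


def overlap(lexicon_a, lexicon_b):
    """Return the sizes of (only in A, shared, only in B)."""
    return (
        len(lexicon_a - lexicon_b),
        len(lexicon_a & lexicon_b),
        len(lexicon_b - lexicon_a),
    )


# Hypothetical toy data, for illustration only.
LEXICON_A = {"dog", "cat"}
LEXICON_B = {"cat", "zebra", "emu"}
TEXT = ["dog", "Dog", "dog", "cat", "zebra"]

type_cov, token_cov = coverage(LEXICON_A, TEXT)
only_a, shared, only_b = overlap(LEXICON_A, LEXICON_B)
```

Reporting type and token coverage separately is useful because a resource can miss many rare words while still covering most running text.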

The project must include a computational component, a presentation, and a written component. In the computational component the student will be expected to write code to analyse data and solve problems. In the presentation, they will present the results to their lecturer and peers. In the written component they will explain their results in the form of a short paper. They should incorporate any feedback from their presentation in the final paper. The final paper should follow the submission guidelines for TACL.

Tuan Anh: Extended Simplified LESK with the wordnet gloss corpus
Egor: Examining compositionality with vector-spaces OR phrase clustering with
Giulia: Cross-lingual mapping: projection vs intersection
David: ? identifying and exploiting definitions in Indonesian and maybe other languages
Yukun: Named entity

Project Two (30%): your choice (must be OKed: should fit in with your research)

Sample: implement (or extend an existing implementation of) a word sense disambiguation system.
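
A possible starting point is a dictionary-based baseline. The sketch below is a minimal simplified Lesk: it picks the sense whose gloss shares the most words with the context. The sense inventory, sense identifiers, glosses and stopword list are hypothetical toy data written for this example, not taken from WordNet, and ties are broken by taking the first listed sense.

```python
import re

# Hypothetical toy sense inventory: word -> {sense id: gloss}.
TOY_SENSES = {
    "bank": {
        "bank.n.1": "a financial institution that accepts deposits and lends money",
        "bank.n.2": "sloping land beside a body of water such as a river",
    }
}

STOPWORDS = {"a", "an", "the", "of", "and", "that", "such", "as",
             "to", "in", "on", "at", "by", "is"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords; return a set."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def simplified_lesk(word, context, senses=TOY_SENSES):
    """Return the sense of `word` whose gloss overlaps most with `context`.

    Ties, including zero overlap, go to the first listed sense.
    Returns None if the word is not in the inventory.
    """
    context_tokens = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


river_sense = simplified_lesk("bank", "She sat on the bank of the river and watched the water")
money_sense = simplified_lesk("bank", "He went to the bank to deposit money")
```

Extensions in the spirit of the course readings would add words from related senses or example sentences to each gloss before computing the overlap.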

Course Outline

Sample topics

Representing Word Meaning
Word Sense Disambiguation
Dictionary based methods
Supervised Methods
Cross lingual disambiguation
Graph based methods
Vector Space Methods
Method Combination
WSD and applications
Machine Translation
Intelligent Indexing and the Semantic Web

Sample Discussions

Computational Lexical Semantics: Relational meaning and WordNets; Vector Spaces; Ontologies (Stevenson, 2003, ch 3; Miller, 1998; Vossen, 1998, ch 2)
Splitting or Lumping: Granularity; Meaning across languages; Multiword expressions (Navigli, 2006; Stevenson, 2003, ch 2)
Dictionary based methods for WSD: LESK, ALT (Lesk, 1986; Ikehara et al., 1996; Baldwin et al., 2010)
Supervised methods for WSD: CL Special Issue on WSD (www.aclweb.org/anthology-new/J/J98/)
Cross lingual disambiguation for WSD (Dagan and Pereira, 1994; Zhong and Ng, 2009)
Graph based methods for WSD (Agirre and Soroa, 2009; Agirre et al., 2009)
Vector space methods for WSD (Widdows, 2004, ch 2–4)
Method Combination (Stevenson, 2003, ch 6)
Machine Translation (Carpuat et al., 2006; Chan et al., 2007)
Intelligent Indexing and the Semantic Web (Shadbolt et al., 2006)
WSD and the Representation of Meaning (Stevenson, 2003, ch 7)

References

Agirre, Eneko and Aitor Soroa. (2009). Personalizing pagerank for word sense disambiguation. In 12th conference of the European chapter of the Association for Computational Linguistics (EACL-2009), Greece.
Agirre, Eneko, Oier Lopez de Lacalle, and Aitor Soroa. (2009). Knowledge-based WSD and specific domains: performing over supervised WSD. In International Joint Conference on Artificial Intelligence (IJCAI-2009), Pasadena.
Baldwin, Timothy, Su Nam Kim, Francis Bond, Sanae Fujita, David Martinez, and Takaaki Tanaka. (2010). A reexamination of MRD-based word sense disambiguation. ACM Transactions on Asian Language Information Processing (TALIP). (to appear).
Chan, Yee Seng, Hwee Tou Ng, and David Chiang. (2007). Word sense disambiguation improves statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 33–40, Prague, Czech Republic. Association for Computational Linguistics. URL http://www.aclweb.org/anthology/P/P07/P07-0005.
Carpuat, Marine, Pascale Fung, and Grace Ngai. (2006). Aligning word senses using bilingual corpora. ACM Transactions on Asian Language Information Processing (TALIP), 5(2):89–120, ISSN 1530-0226. doi: http://doi.acm.org/10.1145/1165255.1165256.
Dagan, Ido and Fernando Pereira. (1994). Similarity-based estimation of word cooccurrence probabilities. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, pages 272–278, Las Cruces, New Mexico, USA. Association for Computational Linguistics. doi: 10.3115/981732.981770. URL http://www.aclweb.org/anthology/P94-1038.
Ikehara, Satoru, Satoshi Shirai, and Francis Bond. (1996). Approaches to disambiguation in ALT-J/E. In International Seminar on Multimodal Interactive Disambiguation: MIDDIM-96, pages 107–117, Grenoble.
Lesk, Michael (1986). Automatic sense disambiguation: How to tell a pine cone from an ice cream cone. In Proceedings of the 1986 SIGDOC Conference, pages 24–26, New York, ACM.
Miller, George. (1998). Nouns in WordNet. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database, chapter 1, pages 23–46. MIT Press.
Navigli, Roberto. (2006). Meaningful clustering of senses helps boost word sense disambiguation performance. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 105–112, Sydney, Australia.
Shadbolt, Nigel, Wendy Hall, and Tim Berners-Lee. (2006). The semantic web revisited. IEEE Intelligent Systems, 21(3):96–101, ISSN 1541-1672. (http://eprints.ecs.soton.ac.uk/12614/1/Semantic_Web_Revisted.pdf).
Stevenson, Mark. (2003). Word Sense Disambiguation. CSLI Publications.
Vossen, Piek, editor. (1998). EuroWordNet. Kluwer.
Widdows, Dominic. (2004). Geometry and Meaning. CSLI Publications.
Zhong, Zhi and Hwee Tou Ng. (2009). Word sense disambiguation for all words without hard labor. In International Joint Conference on Artificial Intelligence (IJCAI-2009), pages 1616–162.

Motivation

One of the core areas of study in computational linguistics is how word and sentence meaning is represented. This course covers both linguistic and computational issues, with an emphasis on the latter. The course also offers a grounded introduction to some general natural language processing techniques, such as sequence labeling, graph search, comparison of similarity and method combination.

Aims and objectives

This course is designed to provide an advanced knowledge of current computational approaches to lexical meaning. Students should be able to evaluate lexical resources for coverage and quality; design and implement systems for determining meaning in text; and present their results effectively. Students will do two small research projects during the course.

On completion of this course, the students should be able to:

Critically evaluate lexical resources for coverage and accuracy.
Understand the main approaches to representing word meaning computationally.
Read, understand and present other people's research.
Implement or enhance an existing method for word sense disambiguation.
Conduct independent research on word sense disambiguation.

Francis Bond <bond@ieee.org>
Computational Linguistics Lab
Division of Linguistics and Multilingual Studies
Nanyang Technological University
Level 3, Room 55, 14 Nanyang Drive, Singapore 637332
Tel: (+65) 6592 1568; Fax: (+65) 6794 6303