LAC: Language and the Computer: Project 3

In-Class On-Line Open-Book Programming Challenge

This assignment constitutes 30% of your final grade for LAC.

Comparing and describing wordnets

Given a pair of wordnets, compare them. Given a single wordnet, describe it or illustrate it.

Please start with these wordnets:

Single wordnet description

Write a function (or functions) that takes a wordnet lexicon id as input. You can assume the wordnet has already been read.

  1. How many synsets, senses and words are there?
  2. How many synsets, senses and words are there for each pos (part of speech)?
  3. Show these as charts, e.g. pie charts
    Make the size of each pie chart proportional to the actual size of the wordnet
    E.g., if one wordnet has 100,000 synsets and another has 5,000, the first pie chart should be 20 times bigger
  4. Show three examples of each pos
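
The counting and chart sizing above can be sketched as follows. This is only a minimal sketch under stated assumptions: it works on plain (synset id, pos, lemma) triples, one per sense, which you would first extract from the wordnet yourself. It also assumes that a word is a (lemma, pos) pair and that "20 times bigger" refers to the area of the pie. The names `describe` and `pie_radius` are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def describe(senses):
    """Count synsets, senses and words, overall and per part of speech.

    `senses` is an iterable of (synset_id, pos, lemma) triples, one per sense.
    Assumption: a word is a (lemma, pos) pair and each synset has one pos.
    """
    synsets = defaultdict(set)
    words = defaultdict(set)
    n_senses = defaultdict(int)
    for synset_id, pos, lemma in senses:
        synsets[pos].add(synset_id)
        words[pos].add(lemma)
        n_senses[pos] += 1
    per_pos = {
        pos: {
            "synsets": len(synsets[pos]),
            "senses": n_senses[pos],
            "words": len(words[pos]),
        }
        for pos in n_senses
    }
    # Totals are the sums over the parts of speech.
    total = {
        key: sum(counts[key] for counts in per_pos.values())
        for key in ("synsets", "senses", "words")
    }
    return total, per_pos


def pie_radius(size, largest, max_radius=1.0):
    """Radius that makes the area of a pie proportional to `size`.

    Area grows with the square of the radius, so the radius scales
    with the square root of the relative size.
    """
    return max_radius * sqrt(size / largest)


# Toy data, not taken from any real wordnet.
toy = [
    ("s1", "n", "dog"), ("s1", "n", "hound"), ("s2", "n", "dog"),
    ("s3", "v", "run"), ("s4", "v", "dog"),
]
total, per_pos = describe(toy)
```

If you plot with matplotlib, the `radius` argument of `pie` is one place where such a value could be passed. Scaling the radius directly by 20 would make the area 400 times bigger, which is probably not what you want.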

Compare wordnets

  1. How many synsets, senses and words are there only in one, in both or only in the other?
  2. Show these as e.g. bar charts --- left for one wordnet, right for the other
  3. Show three examples of words in one, both or other for each pos
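
One possible way to get the three-way split is plain set arithmetic, sketched below. The sketch assumes that items from the two wordnets are directly comparable: for words this could be (lemma, pos) pairs, and for synsets it could be some shared identifier (for example an interlingual index), which is an assumption on my part and not part of the task description. The function names are hypothetical.

```python
from collections import defaultdict


def compare(a, b):
    """Split two collections into only-in-a, in-both and only-in-b sets."""
    a, b = set(a), set(b)
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


def compare_counts(a, b):
    """Sizes of the three groups, ready to be plotted as bars."""
    return {name: len(group) for name, group in compare(a, b).items()}


def examples_by_pos(words, n=3):
    """Up to `n` example lemmas per pos from (lemma, pos) pairs.

    Examples are taken in alphabetical order so the output is repeatable.
    """
    grouped = defaultdict(list)
    for lemma, pos in sorted(words):
        if len(grouped[pos]) < n:
            grouped[pos].append(lemma)
    return dict(grouped)


# Toy data, not taken from any real wordnet.
wn_a = {("dog", "n"), ("cat", "n"), ("run", "v")}
wn_b = {("cat", "n"), ("bird", "n"), ("run", "v"), ("fly", "v")}
split = compare(wn_a, wn_b)
counts = compare_counts(wn_a, wn_b)
shared_examples = examples_by_pos(split["both"])
```

The same `compare` call works for synsets, senses and words, as long as you decide first what counts as "the same" item in both wordnets.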

Compare wordnet to curated list

  1. Try this with the core and basic lists
  2. How many synsets from the curated concept list are in the wordnet?
  3. See the coverage over the semantic fields (lexicographer files)
  4. Graph this and give examples
  5. Compare the two lists
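
Coverage per semantic field could be computed roughly as below. This is a sketch under assumptions: the curated list is represented as a dictionary from concept id to lexicographer file name, the wordnet as a set of concept ids using the same id scheme, and the function name `coverage_by_field` is made up for illustration. The concept ids in the toy data are placeholders.

```python
from collections import Counter


def coverage_by_field(curated, wordnet_concepts):
    """Coverage of a curated concept list, broken down by semantic field.

    `curated` maps concept id -> semantic field (lexicographer file).
    `wordnet_concepts` is the set of concept ids found in the wordnet.
    Returns {field: (covered, total, fraction_covered)}.
    """
    total = Counter(curated.values())
    covered = Counter(
        field for concept, field in curated.items()
        if concept in wordnet_concepts
    )
    return {
        field: (covered[field], n, covered[field] / n)
        for field, n in total.items()
    }


# Toy data with placeholder concept ids.
curated = {"c1": "noun.animal", "c2": "noun.animal", "c3": "verb.motion"}
in_wordnet = {"c1", "c3", "c9"}
coverage = coverage_by_field(curated, in_wordnet)
```

The fractions can be graphed directly, and the concepts that are not covered in a field make natural examples to list next to the graph.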

Illustrate a basic list

  1. Go through the basic list, and show a picture for each word
    • Easiest to do by concepts
    • Also show synonyms and examples, if any
    • You can get a definition from another wordnet (like omw-en:1.4)
  2. Try to break things into smaller groups (pos, semantic field)
  3. If the word has no picture, experiment with a hypernym or hyponym
  4. Note how many words are missing pictures, and list them
  5. You can show images in Google Colab like this:
    from IPython.display import display, Markdown
    
    ara_id = 2910  # Arasaac pictogram id
    
    # Render a bold label and the pictogram as Markdown with inline HTML
    display(Markdown(f"""**Computer**
    <img width='48' src='https://static.arasaac.org/pictograms/{ara_id}/{ara_id}_300.png'> *from Arasaac*"""))
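
The hypernym fallback and the list of missing pictures could look something like this. It is a sketch only: it assumes you already have a dictionary from concept to pictogram id and a dictionary from concept to its hypernyms, both of which you would have to build yourself. The function names and the `max_depth` cut-off are hypothetical, and apart from 2910 the pictogram ids in the toy data are made-up placeholders.

```python
from collections import deque


def find_picture(concept, pictures, hypernyms, max_depth=3):
    """Return (picture_id, concept_used) for the concept itself or,
    failing that, for its nearest hypernym that has a picture.

    Returns (None, None) if nothing is found within `max_depth` steps.
    """
    queue = deque([(concept, 0)])
    seen = {concept}
    while queue:
        current, depth = queue.popleft()
        if current in pictures:
            return pictures[current], current
        if depth < max_depth:
            # Breadth-first, so closer hypernyms are tried first.
            for parent in hypernyms.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, depth + 1))
    return None, None


def missing_pictures(concepts, pictures, hypernyms):
    """Concepts with no picture, even after trying hypernyms."""
    return [
        c for c in concepts
        if find_picture(c, pictures, hypernyms)[0] is None
    ]


# Toy data: 2910 is the id used above, 1 is a made-up placeholder.
pictures = {"computer": 2910, "animal": 1}
hypernyms = {
    "laptop": ["computer"],
    "poodle": ["dog"],
    "dog": ["animal"],
    "idea": ["content"],
}
missing = missing_pictures(["laptop", "poodle", "idea"], pictures, hypernyms)
```

When a hypernym's picture is used, it is worth showing which concept the picture really belongs to, since a picture of an animal is only a rough stand-in for a poodle. The same search works for hyponyms if you pass a hyponym dictionary instead.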

Deliverables

Pedagogical Goals


Project Three for LAC: Language and the Computer Francis Bond.