Lab 7

Preliminaries

These instructions might get edited a bit over the next couple of days. I'll try to flag changes.

As usual, check the write-up instructions first.

Raising vs Control

Raising
- Takes the situation as an argument: the matrix subject of the raising verb is the subject of the innermost verb, and the raising verb itself has no (semantic) subject
Control
- Matrix subject is the subject of both verbs

Larry continued to smile. (raising)
Larry decided to smile. (control)
There continued to be a problem. (raising)
* There decided to be a problem. (control)

Requirements for this assignment

0. Make sure you have a baseline test suite corresponding to your lab 6 grammar.
1. Add the equivalent of can "able to".
2. Check that negation is working, and fix it if necessary.
3. Make sure your MRSs for I can eat glass. It doesn't hurt me. are right.
4. Make sure your grammar can still generate, and debug as necessary.
5. Fix at least one (more) thing about your grammar.
6. Test your grammar using [incr tsdb()]. [incr tsdb()] should be part of your test-development cycle. In addition, you'll need to run a final test suite instance for this lab to submit along with your baseline.

Run a baseline test suite

Before making any changes to your grammar for this lab, run a baseline test suite instance. If you decide to add items to your test suite for the material covered here, consider doing so before modifying your grammar so that your baseline can include those examples. (Alternatively, if you add examples in the course of working on your grammar and want to make the snapshot later, you can do so using the grammar you turned in for Lab 6.)

Background

The goal of this lab is to be able to parse the two sentences I can eat glass. It doesn't hurt me., assign them appropriate semantics, and generate back. You have already done some of the work: from previous labs, your grammar should already handle pronouns, case (if applicable), and transitive verbs. Negation may already be working from the customization system and/or previous work you've done. You may need to add some vocabulary and possibly some verb forms. In addition, depending on how the sentences translate in your language, you might need to consider a new valence pattern for verbs and a new type of nouns (mass nouns).

Semantic representations

Your semantic representations for the two sentences should look approximately like this, modulo the relations showing up in a different order, the variables (e's, x's, and h's) showing up with different numbers, and the SEMSORT information showing up in different places. Also, if your language tends to use pro-drop rather than overt pronouns, you might end up without any representation of the pronouns in these sentences. Finally, if you need a complex predicate in place of, say, "hurt", then you'll also have some differences.

I can eat glass.
```
<h1,e2:SEMSORT,
{h3:pronoun_n_rel(x4:SEMSORT:+:FIRST:SG),
h5:exist_q_rel(x4, h7, h6),
h1:_can_v_rel(e2:SEMSORT:TENSE:ASPECT:MOOD, h8),
h9:_eat_v_rel(e10:SEMSORT:TENSE:ASPECT:MOOD, x4, x11:SEMSORT:BOOL:THIRD:SG),
h12:_glass_n_rel(x11),
h13:exist_q_rel(x11,h15,h14)},
{h8 qeq h9,
h6 qeq h3,
h14 qeq h12} >
```
Things to note about this representation: _can_v_rel is a one-place relation (i.e., we're treating can as a raising verb) whose ARG1 is qeq (equal modulo quantifiers) to the handle of the _eat_v_rel. The _eat_v_rel is a two-place relation taking x4 (the index from the first-person pronoun) and x11 (the ARG0 of _glass_n_rel) as its arguments.

It doesn't hurt me.
```
<h1,e2:SEMSORT,
{h3:pronoun_n_rel(x4:SEMSORT:+:THIRD:SG),
h5:exist_q_rel(x4, h7, h6),
h1:_neg_r_rel(u9:SEMSORT, h8),
h10:_hurt_v_rel(e2:SEMSORT:TENSE:ASPECT:MOOD, x4, x11:SEMSORT:+:FIRST:SG),
h12:pronoun_n_rel(x11),
h13:exist_q_rel(x11, h15, h14)},
{h6 qeq h3,
h8 qeq h10,
h14 qeq h12} >
```
Things to note about this representation: The _neg_r_rel takes a handle as its argument, which is related through a qeq to the handle of the _hurt_v_rel. The handle of _neg_r_rel is itself in turn the local top handle of the clause. These qeqs allow quantifiers to scope above or below _neg_r_rel so that I can't eat some cheese can either mean 'There is some cheese that I can't eat', or 'I can't eat just some cheese (I end up eating more)'.

Modals

can as an auxiliary verb

Use this version if in your language the morpheme expressing the same notion as can is a separate word which takes a VP complement and a subject.

Define a new verb type which inherits from your verb-lex and trans-first-arg-raising-lex-item-1 (and take a look at the definition of the latter in matrix.tdl so you know what it is doing). If you already have auxiliaries from the customization system, see if you have a type like this already. (Note that this is the type for semantically contentful auxiliaries, i.e., ones that contribute an elementary predication. If the rest of your auxiliaries are of the semantically empty type, you'll need to create a new type.)
- In addition to inheriting from these types, your new type should put appropriate constraints on the values of ARG-ST and the valence features.
- If your auxiliary can be the input to any of your lexical rules, make sure that it has the right supertypes (xxx-rule-dtr) to serve as the input to the right rules.
- Make sure that it constrains the part of speech of each argument.
Define a lexical entry (with PRED value "_can_v_rel") which inherits from your new type.
Create the appropriate form of the verb meaning eat, if necessary. This can be done either directly as a lexical entry, or via a lexical rule.
If you needed an additional form of eat, ensure that only that form of eat can appear as the complement of can (and add whatever items you use to test this to your master testsuite), and that the new form of eat can or can't appear in matrix clauses (as appropriate).
- In English, this involves defining a feature FORM on verb (subtype of head), somewhat similar to CASE on noun. You may have a FORM feature and some appropriate types already from the customization system.
Parse your translation of I can eat glass, and examine the chart for extra edges. Are they legitimate, or spurious? If they're spurious, try to rule them out (and then rerun your master testsuite to see if they were, in fact, spurious :-).
Parse your translation of I can eat glass and see if you get the right semantics. Debug as necessary.
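
The steps above could be sketched roughly as follows for an English-like language. This is only an illustration under assumptions: the names can-aux-verb-lex, finite and nonfinite are hypothetical, the FORM declarations are needed only if the customization system did not already give you a FORM feature, and the exact constraints inherited from verb-lex will differ from grammar to grammar.

```tdl
; Only if your grammar does not already define FORM (assumed value names).
verb :+ [ FORM form ].
form := *top*.
finite := form.
nonfinite := form.

; Hypothetical auxiliary type: a subject-raising verb taking a VP complement.
; The first argument is a nominal subject; the second is a verbal projection
; that has found its complements but still needs its subject.
can-aux-verb-lex := verb-lex & trans-first-arg-raising-lex-item-1 &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >,
                           COMPS < #comps >,
                           SPR < >,
                           SPEC < > ],
    ARG-ST < #subj &
             [ LOCAL.CAT.HEAD noun ],
             #comps &
             [ LOCAL.CAT [ HEAD verb & [ FORM nonfinite ],
                           VAL [ SUBJ < [ ] >,
                                 COMPS < > ] ] ] > ].

; Lexical entry; the STEM value is whatever your language uses.
can := can-aux-verb-lex &
  [ STEM < "can" >,
    SYNSEM.LKEYS.KEYREL.PRED "_can_v_rel" ].
```

The FORM nonfinite constraint on the complement is what keeps finite forms of eat out of the complement position; whether the nonfinite form is also barred from matrix clauses has to be enforced elsewhere (for example, on your root condition).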

can as a bound morpheme

Use this version if the morpheme expressing the same meaning as can in your language attaches morphologically to the main verb of the sentence.

The first step is to decide which lexical rule type is appropriate. Look at the section of matrix.tdl titled "Lexical Rules" and see if any of the xxx-only-xxx-rule types are appropriate. If not, construct an appropriate one out of the next level of supertypes. Unless you have concomitant changes to the valence features (such as the CASE value required on one of the arguments), something like the following is probably appropriate:

    infl-add-ccont-ltow-rule := same-non-local-lex-rule &
                                same-cat-lex-rule &
                                same-ctxt-lex-rule &
                                same-agr-lex-rule &
                                inflecting-lex-rule &
                                nocoord.

You'll also need to add lexeme-to-lexeme-rule or lexeme-to-word-rule, depending on how this rule fits into your morphology.

Your subtype for this particular rule will now need to constrain all three features within its C-CONT: RELS, HCONS and HOOK. (The C-CONT.HOOK.XARG can be identified with the HOOK.XARG of the daughter.)

The lexical rule's C-CONT.RELS is a diff list containing a single relation of type arg1-ev-relation. The PRED value of that relation should be "_can_v_rel", the LBL should be identified with the C-CONT.HOOK.LTOP, the ARG0 with the C-CONT.HOOK.INDEX.
The lexical rule's C-CONT.HCONS is a diff list containing one qeq. The HARG of the qeq should be identified with the ARG1 of the arg1-ev-relation and its LARG with the daughter's LTOP.
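
Put together, the C-CONT constraints just described might look something like the sketch below. The rule name can-lex-rule is hypothetical, and the supertype is the infl-add-ccont-ltow-rule suggested above; adjust both to whatever fits your morphology.

```tdl
; Hypothetical rule type for "can" as verbal inflection.
can-lex-rule := infl-add-ccont-ltow-rule &
  [ C-CONT [ HOOK [ LTOP #ltop,
                    INDEX #index,
                    XARG #xarg ],
             ; The rule itself contributes the _can_v_rel.
             RELS <! arg1-ev-relation &
                     [ PRED "_can_v_rel",
                       LBL #ltop,
                       ARG0 #index,
                       ARG1 #harg ] !>,
             ; The ARG1 of can is qeq to the verb's own label.
             HCONS <! qeq &
                      [ HARG #harg,
                        LARG #larg ] !> ],
    DTR.SYNSEM.LOCAL.CONT.HOOK [ LTOP #larg,
                                 XARG #xarg ] ].
```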

In order to get the affix in the right place in a chain of affixes, constrain the DTR value so that the possible inputs to the rule are the class of affixes that can occur just inside this one. If further affixes attach afterwards, add the appropriate xxx-rule-dtr supertype to the rule type.
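
One way this ordering could be set up is sketched below. The names can-rule-dtr and can-lex-rule are hypothetical, and the pattern assumes your grammar follows the xxx-rule-dtr convention with word-or-lexrule as the common supertype.

```tdl
; Hypothetical class of signs that may feed the "can" rule.
can-rule-dtr := word-or-lexrule.

; Add can-rule-dtr as a supertype in the existing definition of every
; type allowed just inside this affix (verb lexemes, or an inner rule).

; Restrict the input of the rule. Merge this constraint into the
; definition of your own rule type rather than defining the type twice.
can-lex-rule := infl-add-ccont-ltow-rule &
  [ DTR can-rule-dtr ].
```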

Add an instance for your lexical rule to irules.tdl, with the appropriate spelling change information.
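
As a rough illustration, an irules.tdl instance for a suffixing language might look like this. The instance name, the affix "ka" and the rule type name can-lex-rule are all made up for the example; use %prefix instead of %suffix if the morpheme is a prefix.

```tdl
; Hypothetical instance: attach the suffix "ka" to any stem.
can-suffix :=
%suffix (* ka)
can-lex-rule.
```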

If appropriate, make sure that your rule applies only to verbs, and test this.

Parse your translation of I can eat glass and see if you get the right semantics. Debug as necessary.

Negation

Two-part negation

Use this version if your language expresses negation with both an affix on the verb and an adverb (e.g., French ne ... pas). If both elements are arguably affixes, you probably just want to write a pair of lexical rules, i.e., take the "Negation as a verbal affix" route, but write two rules and make sure you can require that they both apply or neither apply.

The strategy here is to add the affix with a lexical rule similar to the one above, but to have it change the COMPS value and not add any semantics. The COMPS list will be the same as the input's COMPS list, with the addition of a negative adverb.
We're assuming that the second part of the negation has an independent life as a negative adverb for constituent (i.e., not sentential) negation. This seems to work for French at first glance; I'd be curious about other languages. That adverb will have to have constraints on its MOD value that keep it from modifying finite verbs (which would give sentential negation). However, if you don't have any head-modifier rules in your grammar, you don't need to worry about that yet.
The negative adverb needs some kind of distinguished value (a feature inside HEAD, and in particular, HEAD.KEYS.KEY might be a good candidate) so that the rule won't license verbs just picking up any adverb as a complement.
The inflected verb will "hand" its LTOP value to the negative particle, and adopt the negative particle's LTOP value as its own. Regular semantic composition should take care of the rest.
Again, you'll need to look at the lexrule types and pick an appropriate supertype. Since we're changing both VAL.COMPS and CONT.HOOK, you might need to create your own supertype. Talk to me :).
Test your grammar: Try parsing your translation of It doesn't hurt me. and see if you're getting the right semantics. Test sentences with each of the two parts of the negation independently, and verify that they don't parse. Try putting the two part negation on a non-verb, and verify that it doesn't parse. Debug as necessary.

Negation: markers on either end

This option is for languages that mark negation with particles on either end of the clause or VP (or alternatively, with intonation or [in signed languages] non-manual signs which extend the length of the clause/constituent and are represented in transcription with markers on either end).

If the two markers show up immediately adjacent to the verb (rather than VP or S), consider whether it might be more appropriate to treat them as inflection.

The first thing is to consider whether there is any evidence for attaching the markers one at a time. In these instructions I focus on the case where there is not, so please contact me if you think the markers should attach one at a time in your language. Rather than attaching one of these markers before the other, the most straightforward thing appears to be to create a ternary rule. I've added some types supporting ternary rules to the matrix (included in the patch provided last week).

We're going to take a construction-y approach to analysis, creating a phrase structure rule which calls for specific elements in two of the three daughters and does the right thing in the semantics itself. Specifically, do the following:

Create a lexical type which encodes the constraints in common to the left and right markers. If the same elements have some other function in the grammar, write the constraints accordingly. If they don't, make the lexical type a subtype of norm-zero-arg and make sure its valence lists, MOD value, RELS and HCONS lists are all empty. You should also give it a distinctive HEAD value (lest it show up as the argument of something else). For example, in the ASL case, there should probably be a new subtype of head for non-manual markers:
```
nmm := head.
```
Create subtypes of your lexical type for the left-hand and right-hand elements. These subtypes don't need additional constraints, just contrasting names.
Create lexical entries of each of the subtypes, with the appropriate STEM value. These elements don't need any KEYREL information.
Create a type negation-phrase which inherits from ternary-head-middle-phrase. This supertype will make sure that the HEAD and VAL values come from the head daughter (the middle daughter).
You may need to specify the value of other features inside CAT on the mother, either copying them from the head daughter or otherwise specifying them appropriately. If you do need to, you'll be able to tell because your grammar will be over-generating.
To get the construction to contribute the right semantics, you'll need to constrain its C-CONT value. In particular:
- The C-CONT.RELS is a diff list containing a single relation of type arg1-ev-relation. The PRED value of that relation should be "_neg_r_rel", the LBL should be identified with the C-CONT.HOOK.LTOP, the ARG0 with the C-CONT.HOOK.INDEX.
- C-CONT.HCONS is a diff list containing one qeq. The HARG of the qeq should be identified with the ARG1 of the arg1-ev-relation and its LARG with the head daughter's LTOP.
- The C-CONT.HOOK.XARG should be identified with the head daughter's HOOK.XARG.
Constrain the first and third elements of the ARGS list of negation-phrase to be the lexical types of your left and right markers, respectively.
Create an instance of negation-phrase in rules.tdl and test it! Make sure you get the right semantics.
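
The pieces above could fit together along the following lines. Treat this as a sketch under assumptions: the type names neg-marker-lex, neg-left-marker-lex and neg-right-marker-lex and the STEM values are hypothetical, nmm is the head subtype from the ASL example, and the sketch assumes that ternary-head-middle-phrase exposes the middle daughter as HEAD-DTR.

```tdl
; Constraints shared by both markers: no valence, no MOD, no semantics.
neg-marker-lex := norm-zero-arg &
  [ SYNSEM.LOCAL [ CAT [ HEAD nmm & [ MOD < > ],
                         VAL [ SUBJ < >, COMPS < >,
                               SPR < >, SPEC < > ] ],
                   CONT [ RELS <! !>,
                          HCONS <! !> ] ] ].

neg-left-marker-lex := neg-marker-lex.
neg-right-marker-lex := neg-marker-lex.

; Lexical entries (hypothetical spellings).
neg-left := neg-left-marker-lex & [ STEM < "neg-l" > ].
neg-right := neg-right-marker-lex & [ STEM < "neg-r" > ].

; The construction contributes _neg_r_rel and scopes over the head daughter.
negation-phrase := ternary-head-middle-phrase &
  [ C-CONT [ HOOK [ LTOP #ltop,
                    INDEX #index,
                    XARG #xarg ],
             RELS <! arg1-ev-relation &
                     [ PRED "_neg_r_rel",
                       LBL #ltop,
                       ARG0 #index,
                       ARG1 #harg ] !>,
             HCONS <! qeq &
                      [ HARG #harg,
                        LARG #larg ] !> ],
    HEAD-DTR.SYNSEM.LOCAL.CONT.HOOK [ LTOP #larg,
                                      XARG #xarg ],
    ARGS < neg-left-marker-lex, [ ], neg-right-marker-lex > ].

; rules.tdl
negation := negation-phrase.
```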

Negation as an adverb modifier

If your language uses an adverbial strategy, the customization script probably did the right thing. This is included just in case.

Use this version if your language expresses sentential negation via an adverb which modifies the V, VP or S.

(Note: English has two forms of sentential negation: the "contracted" form, which is actually an affix on the verb (cf. Zwicky and Pullum 1983), and the full-form adverb. This adverb is not actually treated syntactically as a modifier in sentential negation, but rather is selected by auxiliary verbs, including the do of do-support. For the details of this analysis, see Sag, Wasow and Bender 2003 chapter 13 and Kim and Sag 1995. I would be surprised if another language being treated in this class had a system very similar to the English one, as it seems like a pretty quirky part of English grammar. Further, it's a subtle matter to establish what is actually going on in English, and I don't think anyone would have time in one week to show the same about another language.)

Determine where your negative adverb attaches: to V, VP or S and whether it attaches to the left or to the right of the node it attaches to.
For testing purposes, develop a set of sentences contrasting the correct attachment with the incorrect attachments.
Negative adverbs are scopal modifiers, so even though we did adverbs in the previous lab, you'll need to add some machinery:
- Define an instance of adj-head-scop-phrase and/or head-adj-scop-phrase in rules.tdl. (These types are fairly fully specified, which means you'll most likely not need to put anything in esperanto.tdl.) Which one you pick depends on whether your negative adverb is prehead or posthead (or either). Look at the type definitions in matrix.tdl to decide which is appropriate.
- Create a subtype of basic-scopal-adverb-lex. That type (through its supertypes) does much of the work for you. You will need to constrain its VAL and MOD..CAT values. The matrix type also leaves ARG-ST underconstrained. For consistency, if your adverb takes no arguments (and I'd be a bit surprised if the negative element did), ARG-ST should be constrained to be empty as well.
- Create a lexical entry which inherits from your new type (e.g., scopal-adverb-lex) and introduces a relation with the PRED value '_neg_r_rel.
Test your grammar. Does the adverb show up only where it's supposed to? Do you get the right semantics for It doesn't hurt me.? Debug as necessary.
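
For a pre-head adverb that attaches to VP, the machinery above might be sketched like this. The names neg-adv-lex and neg-adv and the STEM value are hypothetical, and the MOD constraints (a verbal projection with its complements saturated and its subject still required) are just one of the attachment options; change them for V or S attachment, and use head-adj-scop-phrase with POSTHEAD + for a post-head adverb.

```tdl
; rules.tdl: pick the variant(s) matching the adverb's position.
adj-head-scop := adj-head-scop-phrase.

; Language-specific type file.
neg-adv-lex := basic-scopal-adverb-lex &
  [ SYNSEM.LOCAL.CAT [ VAL [ SUBJ < >, COMPS < >,
                             SPR < >, SPEC < > ],
                       POSTHEAD -,
                       ; Modify VP: complements found, subject still needed.
                       HEAD.MOD < [ LOCAL.CAT [ HEAD verb,
                                                VAL [ SUBJ cons,
                                                      COMPS null ] ] ] > ],
    ARG-ST < > ].

; Lexicon.
neg-adv := neg-adv-lex &
  [ STEM < "not" >,
    SYNSEM.LKEYS.KEYREL.PRED '_neg_r_rel ].
```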

Negation as a verbal affix

If your language uses an affixal strategy, the customization script probably did the right thing. This is included just in case.

Use this version if your language expresses sentential negation by adding a morpheme to the main verb.

The lexical rule for a negative verb will add a semantic relation in much the same way as the potential form (for can as verbal inflection) described above, so follow the directions there about selecting a supertype for the rule.

Here too, the C-CONT RELS and HCONS lists will each have one item on them:

    RELS <! adv-relation &
            [ PRED '_neg_r_rel,
              ARG1 #harg ] !>

    HCONS <! qeq &
             [ HARG #harg,
               LARG #larg ] !>

In addition, the C-CONT.HOOK.LTOP should be identified with the LBL of the adv-relation and the C-CONT.HOOK.INDEX with the ARG0.
Add an instance for your lexical rule to irules.tdl, with the appropriate spelling change information.
Constrain your lexical rule to apply only to verbs, and test this.
If you have other verbal inflection in your language, determine which lexical rule has to go first, and constrain the rules to only apply in that order. Make up nonsense forms with the affixes attached the other way around, and make sure that they don't parse. If your other affixes attach to the other end of the verb, you'll want to constrain the order anyway, or you will end up getting double parses.
Parse your translation of It doesn't hurt me and see if you get the right semantics. Debug as necessary.
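
Assembled into a single rule type, the constraints in this section might look roughly like the sketch below. The names neg-lex-rule and neg-prefix and the affix "na" are hypothetical, the supertype is the one suggested in the section on can as a bound morpheme, and #larg is assumed to be the daughter's LTOP, as in that section.

```tdl
; Hypothetical rule type for negation as verbal inflection.
neg-lex-rule := infl-add-ccont-ltow-rule &
  [ C-CONT [ HOOK [ LTOP #ltop,
                    INDEX #index,
                    XARG #xarg ],
             RELS <! adv-relation &
                     [ PRED '_neg_r_rel,
                       LBL #ltop,
                       ARG0 #index,
                       ARG1 #harg ] !>,
             HCONS <! qeq &
                      [ HARG #harg,
                        LARG #larg ] !> ],
    ; Only verbs may undergo the rule.
    DTR.SYNSEM.LOCAL [ CAT.HEAD verb,
                       CONT.HOOK [ LTOP #larg,
                                   XARG #xarg ] ] ].

; irules.tdl (hypothetical prefix "na").
neg-prefix :=
%prefix (* na)
neg-lex-rule.
```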

Grammar clean up

This section asks you to find something about your grammar which needs fixing, and fix it (with help from me). This could be something that isn't quite working right from previous labs, or something that is important in your language but doesn't look like it will be otherwise covered in the class.

For your final documentation: Write up your analyses

For each of the following phenomena, please include the following in your write up:

A descriptive statement of the facts of your language.
Illustrative IGT examples from your testsuite.
A statement of how you implemented the phenomenon (in terms of types you added/modified and particular tdl constraints).
If the analysis is not (fully) working, a description of the problems you are encountering.
A statement of whether or not you can generate from examples illustrating the phenomenon.

"Can" (modals)
Negation
Whatever you fixed about your grammar.

In addition, your write up should include a statement of the current coverage of your grammar over your test suite (using numbers you can get from Analyze | Coverage and Analyze | Overgeneration in [incr tsdb()]) and a comparison between your baseline test suite run and your final one for this lab (see Compare | Competence).


Francis Bond

Course materials borrow heavily from Linguistics 567: Knowledge Engineering for NLP at the University of Washington. Thanks to Emily Bender for letting us use them.