The first step is to get the translation system running
from your language to English (xxx2eng). Here are step-by-step
instructions:
If you get an error, you'll need to compare the MRSs to see
what the difference is. I expect that for Dogs sleep you won't
need any transfer rules (depending on what you called your
predicates), and thus any errors should be addressed through
harmonization (aka cleaning up your MRS) and/or work on your semi.vpm
file.
Comparing MRSs
To compare the MRSs, you can look at the MRS from the English
grammar directly, but this can be a bit misleading, since you really
want to look at the input to the generator (i.e., the transfer
output). To do this, you can select "Generate | Display Input MRS" or
"Generate | Display Internal MRS" from the "target" LKB Top menu.
- Generate | Display Internal MRS
- Parse the expected output
- Choose Indexed MRS from the pop-up menu
There are a number of things that could be wrong:
- Missing RELS or HCONS (broken diff-list append).
- Misspelled PRED values (look carefully at the underscores).
- Misspelled/differently spelled feature values (e.g. sing
instead of sg).
- Misspelled/differently spelled feature names (e.g., PERS
instead of PER).
- Incompatible variable properties (features and values).
You may have noticed that you get many variants on generation if
you start with a form that is underspecified for, e.g., aspect or
evidentiality. We can get a handle on this by using variable property
mapping to supply default values in the unmarked case (either in
monolingual generation or in the MT scenario). The basic strategy is
to take any underspecified values in variable properties and translate
them, via the VPM, to something that conflicts with any more specific
values your grammar can produce.
The file semi.vpm provides a mapping between grammar-external
features of indices (referential indices and events) and their values,
and grammar-internal ones. For background on VPM, see the
DELPH-IN wiki.
Once you start using a VPM file, only the variable properties
(features on indices) that are handled in the file are
preserved; any others are dropped.
- You should already have a semi.vpm file provided by
the customization system. Open it up and see which variable
properties are there, and then look in your grammar to
see what is missing. In general, we'd expect to see all
of the features of the types event and ref-ind
represented in a mature semi.vpm file.
- You need to tell the LKB to load the semi.vpm file by uncommenting
the following line in lkb/script:
(mt:read-vpm (lkb-pathname (parent-directory) "semi.vpm") :semi)
- This line needs to be moved higher in the script file,
specifically, it needs to be before the code block that loads
the trigger rules.
- You'll also need to add this line to lkb/mrsglobals.lsp:
(setf *variable-type-mapping* :semi)
- If your grammar uses a PERNUM feature, you'll need to map
separate PER and NUM features from the external (right-hand side) of
the VPM to a single PERNUM feature on the internal (left-hand side).
See the example under "Properties: An Example" on the DELPH-IN wiki page.
- If your grammar encodes aspectual distinctions, you'll need
to add an ASPECT section, modeled on tense. This should allow you
to specify and use a default value of ASPECT.
- If you have any other features you have added on indices, you
will need to provide VPM entries for them as well.
- If your language has aspect marked in some sentences but other forms that are just underspecified for aspect, you'll want to have the default aspect be "no-aspect". Define this as a subtype of aspect in your grammar, but don't have anything other than the semi.vpm mention it otherwise.
- You can do a similar trick for other kinds of generation ambiguity
relating to variable properties.
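As a rough illustration of the PERNUM item in the list above, a mapping section might look something like the sketch below. The internal path PNG.PERNUM and the value names (1sg, 3pl, and so on) are assumptions made for the example; use whatever feature path and PERNUM subtypes your own grammar defines. The external PER and NUM values follow the names mentioned earlier on this page (e.g. sg rather than sing).

```
; Hypothetical sketch: grammar-internal PERNUM on the left,
; grammar-external PER and NUM on the right.
; The path and the internal value names are assumptions.
PNG.PERNUM : PER NUM
  1sg <> 1 sg
  1pl <> 1 pl
  2sg <> 2 sg
  2pl <> 2 pl
  3sg <> 3 sg
  3pl <> 3 pl
```

This only covers fully specified person/number combinations. If your grammar also has underspecified PERNUM values (person without number, say), those need rows of their own; the "Properties: An Example" section of the DELPH-IN wiki page shows how such cases are handled there.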
Test your semi.vpm file by parsing and then generating. You
should see fewer strings coming out.
Preliminaries
This week, we'll be using the LOGON MT setup, which doesn't
respect ICONS. I hope to also try the ACE setup, which does.
(But you'll still need the LOGON/LKB version in order to debug
transfer, I believe.)
Running the translation system
The first step is to get the translation system running
for English to Frisian (eng2frr). Here are step-by-step
instructions:
Update semi.vpm, if necessary
The file semi.vpm provides a mapping between grammar-external
features of indices (referential indices and events) and their values,
and grammar-internal ones. For background on VPM, see the
DELPH-IN wiki.
- If your grammar uses a PERNUM feature, you'll need to map
separate PER and NUM features from the external (right-hand side) of
the VPM to a single PERNUM feature on the internal (left-hand side).
See the example under "Properties: An Example" on the DELPH-IN wiki page. (There is also an example in the semi.vpm file in the eng grammar.)
- If your grammar encodes aspectual distinctions, you'll need
to add an ASPECT section, modeled on tense. This should allow you
to specify a default value of ASPECT as well.
- If your language has aspect marked in some sentences but other forms that are just underspecified for aspect, you'll want to have the default aspect be "no-aspect". Define this as a subtype of aspect in your grammar, but don't have anything other than the semi.vpm mention it otherwise. In the semi.vpm file, at the bottom of your section on aspect, add:
* >> no-aspect
no-aspect << [e]
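To show where those two default lines sit, here is a sketch of what a complete ASPECT section might look like. The path E.ASPECT and the values perfective and imperfective are assumptions made for illustration; substitute the feature path and aspect subtypes that your grammar actually uses. Only the last two lines come from the instructions above.

```
; Hypothetical ASPECT section, modeled on a TENSE section.
; perfective and imperfective are placeholder value names.
E.ASPECT : ASPECT
  perfective <> perfective
  imperfective <> imperfective
  ; Defaults go last, after the specific values:
  * >> no-aspect
  no-aspect << [e]
```

The intent, as described above, is that specific aspect values pass through unchanged in both directions, while anything left underspecified ends up as no-aspect, which conflicts with every more specific value and so cuts down the variants produced in generation.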
Create a transfer grammar
Once you have Dogs sleep translating, it's time to try
a broader range of the MMT sentences.
Note that you will be modifying the English and Italian grammars
for this part of the lab. You will need to add
mt-mrs.tdl, mtr.tdl and acm.tdl. Of
those, acm.tdl should be the most interesting. You'll
want to edit the file acm.mtr to create instances of the
transfer rules that you need for your grammar. It will be simplest to
edit this file in one grammar (say the English one) and create a
symbolic link to it in the other grammar, so that you have one
transfer grammar for your language.
- Try translating all of the MMT sentences from English to your
language and Italian to your language.
- For each one that doesn't go through, compare the input MRS
to the MRS your expected output is giving.
- Do any harmonization that is warranted.
- For the remaining differences, look to see if one of the existing
transfer rule types in acm.tdl will do the trick. If so,
create an instance of that transfer rule type in acm.mtr, e.g.:
pro-drop := pronoun-delete-mtr.
- If you need a different transfer rule, ask Petter or me about what
you need, and we'll work out how to formulate it.
- Reload the "source" grammar and try translating again.
- Rinse and repeat.
Running the translation system
If you would like to try translating with ACE instead
(included in Ubuntu+LKB 17, 64-bit version), you can try out
these instructions, compiled by
Sanghoun Song.
Attempt to translate into your language
Update semi.vpm, if necessary
The file semi.vpm provides a mapping between grammar-external
features of indices (referential indices and events) and their values,
and grammar-internal ones. For background on VPM, see the
DELPH-IN wiki.
- If your grammar uses a PERNUM feature, you'll need to map
separate PER and NUM features from the external (right-hand side) of
the VPM to a single PERNUM feature on the internal (left-hand side).
See the example under "Properties: An Example" on the DELPH-IN wiki page. (There is also an example in the semi.vpm file in the eng grammar.)
- If your grammar encodes aspectual distinctions, you'll need
to add an ASPECT section, modeled on tense. This should allow you
to specify a default value of ASPECT as well. Note that the English
and Frisian grammars don't encode tense or aspect, so this is strictly
for the MT demo.
- If your language has aspect marked in some sentences but other forms that are just underspecified for aspect, you'll want to have the default aspect be "no-aspect". Define this as a subtype of aspect in your grammar, but don't have anything other than the semi.vpm mention it otherwise.
bond@ieee.org
Course materials borrow heavily from Linguistics 567: Knowledge
Engineering for NLP at the University of Washington. Thanks to
Emily Bender for letting us use them.