Lab 10 - GIZA++ and Moses
The goal of the lab is to get GIZA++ and Moses running and to carry out an experiment comparing alignment error rate (AER) of several alignments with the final BLEU score.
In general, follow:
- Moses installation
- Baseline model construction (includes GIZA++ compilation); you do not need the last step, EMS.
More background info is available in the Moses Training Tutorial (see the left menu, section Training).
Detailed Steps
- Download and compile GIZA++: https://github.com/moses-smt/giza-pp
- Download and compile Moses:
https://github.com/moses-smt/mosesdecoder/
- Pick an English->Czech corpus from OPUS:
http://opus.lingfil.uu.se/?src=en&trg=cs
- Follow the Baseline Model Construction mentioned above to:
- Tokenize the corpus using: moses/scripts/tokenizer/tokenizer.pl
- Extract a language model from the target side of the corpus (lmplz)
- "Train" the model (moses/scripts/training/train-model.perl; includes GIZA++, phrase extraction and the final config)
- "Tune" the model (moses/scripts/training/mert-moses.pl: MERT, i.e. weight optimization)
- Translate the test set: run moses with the optimized config (moses/bin/moses -f mert-work/moses.ini -i testcorpus.src.txt > mt-output.txt)
- Score the translations (moses/bin/evaluator --sctype BLEU --candidate mt-output.txt --reference testcorpus.tgt.txt --bootstrap 1000)
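The actual scoring must be done with moses/bin/evaluator as shown above, but to illustrate what that BLEU number measures, here is a minimal corpus-level BLEU sketch in Python (single reference, no smoothing, no bootstrap resampling; this is a didactic simplification, not the Moses implementation):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Corpus-level BLEU for parallel lists of tokenized sentences (one reference each)."""
    log_prec = 0.0
    for n in range(1, max_n + 1):
        match = total = 0
        for cand, ref in zip(candidate, reference):
            c, r = ngrams(cand, n), ngrams(ref, n)
            match += sum(min(cnt, r[g]) for g, cnt in c.items())  # clipped n-gram counts
            total += sum(c.values())
        if match == 0:  # any zero precision makes the geometric mean zero
            return 0.0
        log_prec += math.log(match / total) / max_n
    c_len = sum(len(s) for s in candidate)
    r_len = sum(len(s) for s in reference)
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)  # brevity penalty
    return bp * math.exp(log_prec)
```

For example, a candidate identical to the reference scores 1.0, while a too-short candidate with perfect n-gram precisions is penalized only by the brevity penalty.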
HW04 Assignment: BLEU vs. AER
The previous lab 09 and this lab 10 lead directly to the solution of your homework 04.
For your homework:
- Run and evaluate your baseline MT system, record BLEU on the test set.
- Apply the same word alignment technique (i.e. GIZA++) to the concatenation of your training corpus and the test corpus. Extract the test set alignments, record AER on the test set.
- To get GIZA++ alignments for this combined corpus, use moses/scripts/training/train-model.perl --first-step=1 --last-step=3.
- To use a different symmetrization technique (e.g. union), use --first-step=3 --last-step=3 --alignment=union (assuming that step 2, GIZA++, was already performed).
- Repeat the previous two steps for two or more variations of the alignment or the symmetrization (e.g. intersection or union instead of the default gdfa, grow-diag-final-and).
- At least one of the setups has to use the IBM1 alignment script from lab 09; you may or may not experiment with token variation (stemming etc.) for this.
- Feel free to reduce the training data size if your implementation cannot handle the same amount of data as GIZA++. (Yes, this invalidates the comparison, but I don't want to torture you with running MERT etc. again to also have GIZA++ and MT results on this reduced training corpus.)
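As a quick reference for the metric and the symmetrization variants named above, here is a sketch of AER (Och & Ney: AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|), where S are sure and P possible gold links, S ⊆ P) together with intersection/union symmetrization of the two GIZA++ directions; grow-diag-final-and is more involved and omitted here:

```python
def aer(hypothesis, sure, possible):
    """Alignment error rate; `possible` should be a superset of `sure`."""
    a, s, p = set(hypothesis), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

def symmetrize(src2tgt, tgt2src, method="intersection"):
    """Combine two directional alignments, given as (src, tgt) index pairs."""
    fwd = set(src2tgt)
    bwd = {(i, j) for (j, i) in tgt2src}  # flip target-to-source pairs
    return fwd & bwd if method == "intersection" else fwd | bwd
```

Note that a perfect alignment (A = S = P) gives AER 0, and intersection typically yields high-precision, sparse alignments while union yields high-recall, dense ones.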
Please submit:
- A brief report on your experiment (just notes on 1 page are sufficient).
Make sure to indicate:
- What training corpus you used (describe exactly all parts).
- A table listing the number of parallel sentences, source and target tokens for all sections: training corpus, development (=tuning) corpus and the final test corpus. Everyone should have the same test corpus.
- A table listing your results (3 or more setups of word alignment/symmetrization technique), e.g.:

| Setup | BLEU | AER |
|---|---|---|
| baseline (GIZA++ default config, gdfa) | ??? | ??? |
| my IBM1, intersection | ??? | ??? |
| my IBM1, only source-to-target | ??? | ??? |
| my IBM1, intersection, based on stems | ??? | ??? |
- The input file (tokenized etc., as fed to Moses) for a sanity check.
- The output file of the run with the best BLEU score, as emitted by Moses.
- The output file of the run with the best AER score, as emitted by Moses.
Send your solutions to Ondrej Bojar (bojar -at- ufal.mff.cuni.cz); mention FEL HW04 in the subject.
Deadline: 23:59 8th January 2017