The latest release of CzEng is:

  • CzEng 1.6 (~62.5 M parallel sentences, fully automatically annotated): used in WMT17. Note that the text basis is identical to CzEng 1.6pre; the larger number of sentence pairs is due only to the document-level (as opposed to segment-level) deduplication used in CzEng 1.6.

For reproducibility of past experiments, we also provide previous CzEng releases: