Preface to Version 2.0

Although the title of this report inherits the word "Manual" from the previous version, it is no longer intended to guide annotators. Rather, it attempts to describe the current state of the morphological annotation in PDT 2.0. Most of the added information resulted from several semi-automatic checks performed on the data before its release. In some cases it was not feasible to bring the data to the desired state; in those cases, both the desired and the current state of the data are described.

PDT 2.0 contains 1,960,657 morphologically annotated tokens in 126,831 sentences. There are 168,454 distinct word forms, 71,716 distinct lemmas, and 1,740 morphological tags.

The final checking and analysis of the data, as well as the work on this revision of the manual, were supported by the Czech Academy of Sciences program "Information Society", project No. 1ET101120503.