Speech & Dialog

Prague Database of Spoken Language

The project focuses on speech reconstruction of Czech and English. It is part of the Prague Dependency Treebank family of annotated corpus resources and tools, to which it adds the spoken language layer(s). It consists of the Prague DaTabase of Spoken English and the Prague DaTabase of Spoken Czech ...


ROMi is a subcorpus of CZESL (Czech as a Second Language). It collects examples of spoken and written language use by Czech Romani children and teenagers. The materials exceed 1.5 million words. ...


Other Speech & Dialog Data