Basic Czech grapheme-to-phoneme conversion rules

This is actually a very rough simplification, but it'll do for our example and it should (hopefully) produce intelligible Czech speech. Phonemes here are shown in the SAMPA notation for Czech used by MBROLA. Let us know if you find a mistake.

  1. ch (two graphemes) is pronounced as one phoneme, [x]

  2. Similarly, dz and dž are pronounced as single phonemes, [dz] and [dZ]

  3. di, dí, ti, tí, ni, ní are pronounced as ďi, ďí, ťi, ťí, ňi, ňí

  4. All y's (and ý's) are pronounced the same as i's (and í's), except that they do not soften a preceding d, t, n as in Rule 3. ů is the same as ú

  5. ě is a tricky letter: dě, tě, ně are pronounced as ďe, ťe, ňe; bě, pě, vě, fě as bje, pje, vje, fje; and mě as mňe

  6. The vowel groups ie, ia, iu, io, ii, ií are pronounced as ije, ija, iju, ijo, iji, ijí.

  7. For all the remaining rules (voicing & stress), consider 1-syllable prepositions (k(u/e), v(e), z(e), s(e), bez, do, na, nad, o, od, po, pod, pro, před, při, u, za) as being part of the following word.

  8. At the end of each word, all of the following voiced consonants turn to unvoiced: b → p, v → f, d → t, z → s, dz → c, ž → š, dž → č, ď → ť, g → k, h → ch

  9. Czech uses regressive assimilation of consonant voicing. If multiple consonants from Rule 8 occur together, the last one of the group determines whether all consonants in the group are voiced. Note that any r, l, m, n will break the groups. The voiced-unvoiced pairs are the same as in Rule 8, with one exception: v gets affected by the change, but does not trigger it itself. Here are some examples: kde is pronounced as gde, lebka as lepka, vstup as fstup, but sval stays sval (the v does not voice the s).

  10. Stress goes on the 1st syllable in each word. Some short words have no stress (ho, je, jsem, jsi, jsme, jsou, jste, li, mě, mi, mně, mu, se, si, tě, ti, to, tu).
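
To make the rules a bit more concrete, here is a minimal Python sketch of Rules 1-4, 6, 8 and 9. It is only one possible way to code them, not a reference implementation. All function and variable names (rewrite_letters, tokenize, apply_voicing, convert) are made up for this example. To keep it short, the output tokens are labelled with Czech letters ("ch", "ď", ...) rather than SAMPA symbols, so a mapping to the MBROLA notation would still be needed. Rule 5 (ě), Rule 7 (the caller is assumed to have glued prepositions onto the next word already) and Rule 10 (stress) are left out.

```python
import re
import unicodedata

# Voiced -> unvoiced pairs from Rule 8. Tokens are Czech letters, not SAMPA
# (an assumption of this sketch).
DEVOICE = {"b": "p", "v": "f", "d": "t", "z": "s", "dz": "c",
           "ž": "š", "dž": "č", "ď": "ť", "g": "k", "h": "ch"}
VOICE = {unvoiced: voiced for voiced, unvoiced in DEVOICE.items()}
OBSTRUENTS = set(DEVOICE) | set(VOICE)
DIGRAPHS = ("ch", "dz", "dž")
SOFTEN = {"d": "ď", "t": "ť", "n": "ň"}


def rewrite_letters(word):
    """Apply Rules 3, 6 and 4 (in that order) to a lowercase word."""
    # Rule 3: d, t, n directly before i or í are softened.
    word = re.sub(r"[dtn](?=[ií])", lambda m: SOFTEN[m.group()], word)
    # Rule 6: insert j between i and the second vowel of the listed groups.
    word = re.sub(r"i(?=[eauoií])", "ij", word)
    # Rule 4: only now merge y/ý into i/í (so they never soften), ů into ú.
    return word.translate(str.maketrans("yýů", "iíú"))


def tokenize(word):
    """Split a word into tokens, keeping ch, dz, dž together (Rules 1-2)."""
    tokens, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            tokens.append(word[i:i + 2])
            i += 2
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def apply_voicing(tokens):
    """Apply final devoicing (Rule 8) and regressive assimilation (Rule 9)."""
    out = list(tokens)
    i = 0
    while i < len(out):
        if out[i] not in OBSTRUENTS:
            i += 1  # vowels and r, l, m, n (and anything else) break groups
            continue
        start = i
        while i < len(out) and out[i] in OBSTRUENTS:
            i += 1
        end = i  # exclusive end of the consonant group
        if end == len(out):
            voiced = False  # Rule 8: a word-final group is unvoiced
        else:
            # v does not trigger assimilation: leave trailing v's untouched
            # and let the consonant before them decide.
            while end > start and out[end - 1] == "v":
                end -= 1
            if end == start:
                continue
            voiced = out[end - 1] in DEVOICE  # keys are the voiced members
        table = VOICE if voiced else DEVOICE
        for k in range(start, end):
            out[k] = table.get(out[k], out[k])
    return out


def convert(word):
    """Run the whole sketch on one word (prepositions already attached)."""
    word = unicodedata.normalize("NFC", word.lower())
    return apply_voicing(tokenize(rewrite_letters(word)))


print("".join(convert("kde")))     # gde
print("".join(convert("vstup")))   # fstup
print("".join(convert("sval")))    # sval
print("".join(convert("vpraze")))  # fpraze (v Praze, joined as in Rule 7)
```

Doing the letter rewrites before the y → i merge is what keeps y from softening a preceding d, t, n, and working on a token list (rather than a plain string) lets ch, dz and dž take part in the voicing rules as single units.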