Homework 04

Download either the English data or the Czech data, whichever you prefer. The data come from the English and Czech Universal Dependencies corpora, respectively.

Write a btred script that prints all sentences in the data. Assemble each sentence by concatenating the values of the attribute form of all (non-root) nodes in its tree, and print the sentences in the following format and with the following details:

As your solution to the homework, send both the btred script and its output as a plain-text file.

Before submitting the homework, make sure that your output starts with these lines (in English):

DOCUMENT: weblog-blogspot.com_nominations_20041117172713_ENG_20041117_172713

From the AP comes this story :

President Bush on Tuesday nominated two individuals to replace retiring jurists on federal courts in the Washington area.
Bush nominated Jennifer M. Anderson for a 15-year term as associate judge of the Superior Court of the District of Columbia, replacing Steffen W. Graae.
Bush also nominated A. Noel Anketell Kramer for a 15-year term as associate judge of the District of Columbia Court of Appeals, replacing John Montague Steadman.

DOCUMENT: weblog-blogspot.com_gettingpolitical_20030906235000_ENG_20030906_235000

The sheikh in wheel-chair has been attacked with a F-16-launched bomb.
He could be killed years ago and the israelians have all the reasons, since he founded and he is the spiritual leader of Hamas, but they did n't .
Today 's incident proves that Sharon has lost his patience and his hope in peace.

or with these lines (in Czech):

DOCUMENT: cmpr9406-001

Třikrát rychlejší než slovo

Faxu škodí především přetížené telefonní linky *
Pomocí může být systém ECM

Šetřete peníze, netelefonujte, faxujte!
Je tento reklamní slogan pravdivý?
Hlasité přečtení dobře čitelného textu na stránce A4 , při řádkování 1.5, trvá zhruba 3 minuty.
Podle prospektů se faxem přenese normalizovaný obsah jedné stránky A4 za 10 až 30 sekund.
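
The sample output suggests that some tokens are attached directly to the preceding one (for example "area.") while others are separated by a space (for example "story :"). The following Python sketch illustrates only the concatenation logic, not btred itself. It assumes each token is a dictionary with the attribute form and a hypothetical flag no_space_after; the real name of that attribute in the data is not given here and must be looked up.

```python
def assemble_sentence(tokens):
    """Join token forms into one line.

    A space is inserted after every token except the last one and
    except tokens flagged with the hypothetical key 'no_space_after'.
    """
    parts = []
    last_index = len(tokens) - 1
    for i, tok in enumerate(tokens):
        parts.append(tok["form"])
        if i != last_index and not tok.get("no_space_after", False):
            parts.append(" ")
    return "".join(parts)


# Tokens without the flag are separated by single spaces.
spaced = [{"form": f} for f in ["From", "the", "AP", "comes", "this", "story", ":"]]
print(assemble_sentence(spaced))

# A flagged token is glued to the token that follows it.
glued = [
    {"form": "Washington"},
    {"form": "area", "no_space_after": True},
    {"form": "."},
]
print(assemble_sentence(glued))
```

In a btred script the same loop would run over the non-root nodes of each tree in surface order, reading form from each node instead of from a dictionary.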

Further notes about using btred (useful for homework 04)