Homework 5

Write a PML-TQ query that searches for direct sons of a predicate (functor="PRED") on the tectogrammatical layer of the Prague Dependency Treebank 3.0. We are only interested in predicates whose sons all have one of the following functors: ACT, PAT, ADDR, ORIG, EFF; predicates that have at least one son with any other functor are to be ignored. We are interested in the sequences of these functors as they appear below predicates in the trees (i.e., you need to use the attribute deepord, which controls the left-right order of nodes in tectogrammatical trees).

For example, if a predicate has these four direct sons: ACT, ADDR, PAT, TWHEN, it is to be ignored, because the son with functor TWHEN is not allowed in our query. On the other hand, if a predicate has exactly three direct sons: ACT, ADDR, PAT (in this deep order), it counts as one occurrence of the sequence "ACT ADDR PAT".
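To make the rule concrete, here is a minimal Python sketch of the same logic applied to one predicate. It is not a PML-TQ query and not the solution. The function name functor_sequence, the representation of sons as (functor, deepord) pairs, and the deepord values are all made up for illustration. Skipping predicates that have no sons at all is an assumption based on the wording "predicates with sons".

```python
# Functors allowed among the direct sons of a predicate (from the assignment).
ALLOWED = {"ACT", "PAT", "ADDR", "ORIG", "EFF"}


def functor_sequence(sons):
    """Return the functor sequence of one predicate, or None if it is ignored.

    `sons` is a hypothetical representation: a list of (functor, deepord)
    pairs, one pair per direct son of the predicate.
    """
    if not sons:
        # Assumption: a predicate without any sons is skipped.
        return None
    if any(functor not in ALLOWED for functor, _ in sons):
        # A single son with a disallowed functor disqualifies the predicate.
        return None
    # Order the sons left to right by their deepord value.
    ordered = sorted(sons, key=lambda son: son[1])
    return " ".join(functor for functor, _ in ordered)


# The two predicates from the example above, with invented deepord values.
ignored = functor_sequence([("ACT", 1), ("ADDR", 3), ("PAT", 4), ("TWHEN", 6)])
counted = functor_sequence([("PAT", 5), ("ACT", 1), ("ADDR", 3)])
```

Here ignored is None because of the TWHEN son, and counted is the string "ACT ADDR PAT", regardless of the order in which the sons were listed.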

Use an output filter to generate a distribution of these sequences of functors as they appear in the data, and sort the distribution in descending order of frequency. As a solution of the homework, submit the textual version of the query.
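The kind of table the output filter should produce can be sketched in Python as follows. This only illustrates the counting and sorting; it is not PML-TQ syntax. The helper names distribution and format_rows are hypothetical, the input list is toy data rather than real PDT counts, and breaking ties alphabetically is an assumption the assignment does not state.

```python
from collections import Counter


def distribution(sequences):
    """Count identical sequences and sort them by descending frequency.

    Ties are broken alphabetically (an assumption; the assignment does not
    say how equally frequent sequences should be ordered).
    """
    counts = Counter(sequences)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def format_rows(rows):
    """Render rows as tab-separated lines: sequence, then frequency."""
    return "\n".join(f"{sequence}\t{count}" for sequence, count in rows)


# Toy input: one functor sequence per matching predicate (not real PDT data).
toy_sequences = ["ACT PAT", "PAT ACT", "ACT PAT", "PAT"]
rows = distribution(toy_sequences)
table = format_rows(rows)
```

For the toy input, rows is [("ACT PAT", 2), ("PAT", 1), ("PAT ACT", 1)], and table has one tab-separated line per sequence, in the same layout as the expected output of the homework.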

For the search, use the web browser interface to the PML-TQ server for the PDT 3.0 data (direct link).

The output should start with the following lines:

ACT PAT	4177
PAT ACT	2063
PAT	1165
ACT EFF	632
ACT	584
ACT ADDR PAT	519

For the query, you will need to know:

Please consult the documentation and write me an e-mail if you are stuck.