Write a PML-TQ query that searches for direct sons of a predicate (functor="PRED") on the tectogrammatical layer of the Prague Dependency Treebank 3.0.
We are only interested in predicates with sons that have the following functors: ACT, PAT, ADDR, ORIG, EFF and we want to ignore predicates with other sons (i.e., predicates that have at least one son with another functor).
We are interested in sequences of these functors as they appear below predicates in the trees (i.e., you need to use the attribute deepord, which controls the left-to-right order of nodes in tectogrammatical trees).
For example, if a predicate has these four direct sons: ACT, ADDR, PAT, TWHEN, it is to be ignored, as there is a son with functor TWHEN, which is not allowed in our enquiry. On the other hand, if a predicate has three direct sons: ACT, ADDR, PAT (in this deep-order), this should count as one occurrence of sequence "ACT ADDR PAT".
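The rule in this example can be sketched outside PML-TQ as a small Python function. This is only an illustration of the intended semantics on toy data, not the query to submit. The function name, the (functor, deepord) pair representation and the handling of predicates without sons are assumptions made for the sketch.

```python
# Functors allowed among the direct sons of a predicate (from the assignment).
ALLOWED = {"ACT", "PAT", "ADDR", "ORIG", "EFF"}

def functor_sequence(children):
    """Return the functors of one predicate's sons in deepord order.

    children: list of (functor, deepord) pairs (hypothetical representation).
    Returns None when the predicate is to be ignored.
    """
    if not children:
        # Assumption: a predicate without sons yields no sequence.
        return None
    if any(functor not in ALLOWED for functor, _ in children):
        # A single son with another functor disqualifies the whole predicate.
        return None
    ordered = sorted(children, key=lambda pair: pair[1])
    return " ".join(functor for functor, _ in ordered)
```

With the sons ACT, ADDR, PAT, TWHEN the function returns None. With ACT, ADDR, PAT given in any input order it returns "ACT ADDR PAT", because the sons are sorted by deepord before joining.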
Use an output filter to generate a distribution of these sequences of functors as they appear in the data, and sort the distribution in descending order of frequency. As the solution to the homework, submit the textual version of the query.
For the search, use the web browser interface to the PML-TQ server for the PDT 3.0 data (direct link).
The output should start with the following lines:
ACT PAT 4177
PAT ACT 2063
PAT 1165
ACT EFF 632
ACT 584
ACT ADDR PAT 519
For the query, you will need to know:
- t-node [ functor = "PRED", 0x t-node [ functor = "PAT" ]] searches for PREDs that do not have PAT among their children;
- >>give concat($2, " " over $1); do not forget to add sort by in the concat function to make sure the functors are sorted according to deepord;
- give distinct;
- >>for $1 give $1, count().
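How these pieces might fit together can be modelled in Python on toy rows. This is a sketch under stated assumptions, not PML-TQ and not the solution itself. It assumes that concat with an over clause behaves like a window function (one result row per input row), which is why a distinct step follows. The row layout (predicate id, functor, deepord) and the function name are hypothetical.

```python
from collections import Counter, defaultdict

# Toy rows, one per (predicate, son) match: (predicate id, functor, deepord).
rows = [
    ("p1", "PAT", 3), ("p1", "ACT", 1),
    ("p2", "ACT", 2), ("p2", "PAT", 5),
    ("p3", "PAT", 1), ("p3", "ACT", 4),
]

def distribution(rows):
    # Step 1: group the sons by predicate (the role of "over $1").
    groups = defaultdict(list)
    for pred, functor, deepord in rows:
        groups[pred].append((deepord, functor))
    # Step 2: join the functors of each group in deepord order ("sort by").
    # Assumption: one result per input row, so sequences repeat per predicate.
    windowed = [
        (pred, " ".join(f for _, f in sorted(groups[pred])))
        for pred, _, _ in rows
    ]
    # Step 3: drop the repeats so each predicate counts once ("give distinct").
    distinct = set(windowed)
    # Step 4: count predicates per sequence, most frequent first.
    counts = Counter(seq for _, seq in distinct)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

On the toy rows, p1 and p2 both yield "ACT PAT" and p3 yields "PAT ACT". Without the sort in step 2 the joined order would depend on the order of the rows, and without step 3 each predicate would be counted once per son.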