The talk will deal with language resources for Latin and with issues in both computational and theoretical linguistics.
In particular, it will focus on the Index Thomisticus Treebank (featuring works of Thomas Aquinas) and its relation to some theoretical and practical aspects of the Prague Dependency Treebank.
The talk will be organized into three sections:
a) the long-standing connection between Latin and computational linguistics.
Some documents from the 'Busa Archive' will be presented concerning (1) the genesis of the Index Thomisticus in the 1940s and (2) the relations between Father Roberto Busa SJ and Prague in the 1960s;
b) issues in tectogrammatical annotation of Latin.
The treatment of some Latin-specific constructions will be shown;
c) a comparison between the tectogrammatical-based valency lexicon Latin-Vallex and Latin WordNet.
The degree of overlap between some subsets taken from the two lexical resources will be presented.
Marco Passarotti is head of the "Index Thomisticus" Treebank project at the Università Cattolica del Sacro Cuore in Milan, where he started the project in 2006.
A pupil of one of the pioneers of humanities computing, Father Roberto Busa SJ, he focuses his research on developing and disseminating language resources and NLP tools for Latin.
He has organized and chaired several international scientific events, among which are the eighth workshop on 'Treebanks and Linguistic Theories' (TLT8, Milan, 2009) and the 'Computational Linguistics and Latin Philology' workshop (Innsbruck, 2009). He co-chairs the series of workshops on 'Corpus-based Research in the Humanities' (CRH).
He is the author of one book and of around seventy papers published in scientific journals and in the proceedings of national and international conferences.
In 2010, he founded the CIRCSE research centre in computational linguistics. He runs the 'Busa Archive', i.e. the personal archive of Father R. Busa, which the Jesuit donated to the library of the Università Cattolica.