This talk presents a dependency treebank of Buddhist Chinese texts, which contains over 50K characters drawn from four sutras in the Chinese Buddhist Canon. The treebank has been annotated based on the part-of-speech tagset of the Penn Chinese Treebank and the Stanford Dependencies for Chinese. We apply the treebank to explore linguistic changes in Medieval Chinese, focusing on the vernacular and literary styles as reflected in usage patterns of classifiers, demonstratives, and copulae. We also discuss the use of the treebank for profiling characters in the Canon, including their associations with verbs and toponyms and their conversational networks.
***The talk will be streamed via Zoom. For details on how to join the Zoom meeting, please write to sevcikova et ufal.mff.cuni.cz***
Dr. John Lee is Associate Professor at the Department of Linguistics and Translation at City University of Hong Kong. He obtained his PhD in Computer Science from the Massachusetts Institute of Technology. His research interests are in natural language processing and its applications in digital humanities and computer-assisted language learning. Dr. Lee’s recent projects include writing assistance tools for English as a foreign language and corpus-driven learning of Chinese as a foreign language, as well as treebank development of ancient Chinese texts.