Monday, October 15, 2018 - 13:30

ÚFAL first-year PhD student microconference

Karolína Hořeňovská, Michal Auersperger, Tomáš Musil (ÚFAL MFF UK)

Karolína Hořeňovská: Automated text simplification

Abstract: The goal of text simplification is to reduce some aspects of text complexity (e.g. lexical, syntactic) while preserving meaning, making the text more accessible either to a general audience or to a specific target group (e.g. children, L2 learners, people with aphasia). Manual simplification is often performed for these target groups, but it is time-consuming, and it is hard to obtain a simplified version of a particular text that a person is interested in, which creates the need to automate the task.

This talk will start by briefly reviewing the motivation for text simplification. It will then introduce the simplification subtasks and present current approaches, results, and possible future work for the most intensively studied ones. Special attention will be paid to the challenges of simplifying Czech, given the nature of the language and the lack of training data.

Michal Auersperger: Vector Representations of Text

Vector representation of text is a basic building block for many NLP applications, such as information retrieval, machine translation, sentiment analysis, and document classification. The aim of the talk is to give a brief overview of approaches to representing documents in different domains. These approaches include simple count-based bag-of-words methods, topic modeling, combinations of word vectors, and the use of artificial neural networks.

Tomáš Musil: Interpretation of Deep Neural Networks and Theory of Language

Neural networks and deep learning currently dominate NLP applications as well as other fields of AI. The downside of this new technology is that the models are difficult to interpret. In this talk I will outline the approach that I intend to take in my dissertation. I will summarize the demands that we need to place on a theory of language if we want it to help us understand what is going on inside modern NLP applications. Then I will present the results that I have obtained so far. I will propose an interpretation of the Skip-gram language model as a model of Fregean equivalence of meaning, and I will describe preliminary findings of ongoing research on the interpretation of word embeddings in NMT and other NLP tasks.