Monday, June 5, 2017 - 15:30

Deep Learning for Bilingual Sentiment Analysis of Short Texts

Yelaman Abdullin

Sentiment analysis of short texts such as Twitter messages and comments on news portals is challenging because of the limited contextual information they normally contain. In this presentation, I will talk about a deep neural network model that uses bilingual word embeddings to solve the classification problem effectively for both languages. I will demonstrate the approach, which was applied to two corpora covering two different language pairs: English-Russian and Russian-Kazakh. I will also show how to train a classification model in one language and predict in another. The approach achieved good results, with 73% accuracy for English and 74% accuracy for Russian. A baseline method built for Kazakh sentiment analysis reached 60% accuracy. A method was also proposed for learning bilingual embeddings from a large unlabelled corpus using a set of bilingual word pairs.
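
The idea of training in one language and predicting in another can be illustrated with a small toy sketch. This is not the model from the talk: it assumes synthetic embeddings, a linear least-squares mapping learned from a seed dictionary of word pairs (a common stand-in technique), and a logistic regression classifier in place of a deep neural network. All names (emb_a, emb_b, seed, train_logreg, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_words, text_len = 8, 40, 5

# Toy assumption: word i in language A translates to word i in language B,
# and language B's embedding space is a rotated copy of language A's.
emb_a = rng.normal(size=(n_words, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
emb_b = emb_a @ rotation

# Seed dictionary: only the first 20 word pairs are assumed to be known.
seed = np.arange(20)

# Learn a linear map W so that emb_b[seed] @ W approximates emb_a[seed].
W, *_ = np.linalg.lstsq(emb_b[seed], emb_a[seed], rcond=None)
emb_b_mapped = emb_b @ W  # language B words projected into A's space


def features(emb, texts):
    """Represent each text (a row of word indices) by its mean embedding."""
    return emb[texts].mean(axis=1)


def train_logreg(x, y, steps=2000, lr=1.0):
    """Plain gradient-descent logistic regression without a bias term."""
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w)))
        w -= lr * x.T @ (p - y) / len(y)
    return w


# Synthetic sentiment: a hidden direction in the shared space decides polarity.
polarity = rng.normal(size=dim)
train_texts = rng.integers(0, n_words, size=(200, text_len))
test_texts = rng.integers(0, n_words, size=(200, text_len))
y_train = (features(emb_a, train_texts) @ polarity > 0).astype(float)
y_test = (features(emb_a, test_texts) @ polarity > 0).astype(float)

# Train on language A texts only, then predict on language B texts
# using the mapped language B embeddings.
w = train_logreg(features(emb_a, train_texts), y_train)
pred = (features(emb_b_mapped, test_texts) @ w > 0).astype(float)
accuracy = (pred == y_test).mean()
print(f"cross-lingual accuracy on toy data: {accuracy:.2f}")
```

Because the toy data is constructed so that the two spaces differ only by a rotation, the mapping is recovered almost exactly and the classifier transfers well; real corpora are far noisier, which is one reason the accuracies reported above are much lower than what a toy setup like this produces.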