Principal investigator (ÚFAL): 
Provider: 
Grant id: 3537/2011
ÚFAL budget: 660 000 Kč
Duration: 2011-2013

Sentence-Level Polarity Detection in a Computer Corpus

The project aims to analyze possible relations between the syntactic structure and the polarity of a Czech sentence (or a larger text span) by means of sentiment analysis of a dependency treebank.

Sentiment analysis (SA) is a subfield of opinion mining. Opinion mining tasks deal with the automatic extraction of subjective information from text and the determination of the speaker’s attitude.

The main goal of sentiment analysis is to detect the positive or negative polarity, or the neutrality, of a sentence (or, more broadly, of a text). Most often this is done by detecting polarity items, i.e. words or phrases that inherently bear a positive or negative value. These items are collected in subjectivity lexicons. Marking the polarity items from the subjectivity lexicon in the data is the first step towards SA.
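The lexicon-based step described above can be illustrated with a minimal sketch. It is not the project's implementation: the toy lexicon, its entries and values, and the function names find_polarity_items and sentence_polarity are all assumptions made for illustration, and summing the values is just one simple way to turn polarity items into a sentence label.

```python
import string

# Hypothetical toy lexicon: entries and values are illustrative only and are
# not taken from the project's Czech subjectivity lexicon.
SUBJECTIVITY_LEXICON = {
    "výborný": 1,   # "excellent"
    "dobrý": 1,     # "good"
    "špatný": -1,   # "bad"
    "hrozný": -1,   # "terrible"
}


def find_polarity_items(sentence, lexicon):
    """Return (token, value) pairs for the tokens found in the lexicon."""
    # Naive tokenization: lowercase, split on whitespace, trim punctuation.
    tokens = [tok.strip(string.punctuation) for tok in sentence.lower().split()]
    return [(tok, lexicon[tok]) for tok in tokens if tok in lexicon]


def sentence_polarity(sentence, lexicon=SUBJECTIVITY_LEXICON):
    """Label a sentence by summing the values of its polarity items."""
    score = sum(value for _, value in find_polarity_items(sentence, lexicon))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(sentence_polarity("Ten film byl výborný."))
    print(sentence_polarity("Film trval dvě hodiny."))
```

The sketch matches exact word forms only. Czech is highly inflected, so a realistic system would presumably match lemmas rather than surface forms, and word counting alone ignores syntactic context such as negation.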

Publications

  • Veselovská, Kateřina: Czech Subjectivity Lexicon: A Lexical Resource for Czech Polarity Classification. In: Proceedings of SLOVKO, 7th International Conference of NLP, Corpus Linguistics and E-Learning, Bratislava, Slovakia, 2013.
  • Veselovská, Kateřina; Hajič jr., Jan: Why Words Alone Are Not Enough: Error Analysis of Lexicon-based Polarity Classifier for Czech. In: Proceedings of the 6th International Joint Conference on Natural Language Processing, Nagoya, Japan, ISBN 978-4-9907348-0-0, pp. 1-5, 2013.
  • Hajič jr., Jan; Veselovská, Kateřina: Developing Sentiment Annotator in UIMA – the Unstructured Management Architecture for Data Mining Applications. In: ITAT 2013: Information Technologies - Applications and Theory (Workshops, Posters, and Tutorials), Donovaly, Slovakia, ISBN 978-1490952086, pp. 5-10, 2013.
  • Šindlerová, Jana; Veselovská, Kateřina: Building a Corpus of Evaluative Sentences in Multiple Domains. In: Corpus Linguistics 2013 - Abstract Book, Lancaster: UCREL, pp. 273-275.
  • Veselovská, Kateřina; Hajič jr., Jan; Šindlerová, Jana: Creating Annotated Resources for Polarity Classification in Czech. In: Proceedings of the 11th Conference on Natural Language Processing, Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence (ÖGAI), Vienna, Austria, ISBN 3-85027-005-X, 2012.
  • Veselovská, Kateřina: Sentence-Level Sentiment Analysis in Czech. In: Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics, Copyright © ACM, New York, NY, USA, ISBN 978-1-4503-0915-8, 2012.
  • Veselovská, Kateřina: Sentence-Level Polarity Detection in a Computer Corpus. In: WDS'11 Proceedings of Contributed Papers, Part I, Copyright © Matfyzpress, Praha, Czech Republic, ISBN 978-80-7378-184-2, pp. 167-170, 2011.