Jan Hajič jr.
Main Research Interests
Optical Music Recognition: I've recently published the MUSCIMA++ dataset.
Music Information Retrieval in general (see e.g. the defended Bc. thesis of Marek Židek).
Bayesian models, non-parametric Bayesian models
Neural networks for text modeling
Multimodal (text/image) models
Ribosomal RNA secondary structure prediction
Multimodal Optical Music Recognition (GAUK 1444217), 2017 - 2019 (PI).
Convolutional Neural Networks for Optical Music Recognition (GAUK 170217), 2017 - 2018 (Co-investigator).
rRNA Secondary Structure Prediction (GAUK 550214), 2015 - 2016 (PI).
My CV is available here: CV_HajicJr.pdf
Hajič jr., J. & Pecina, P.: The MUSCIMA++ Dataset for Handwritten Optical Music Recognition. Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, Osaka Prefecture University, Kyoto, Japan, November 2017. pp. ?? [accepted manuscript] [pdf]
Hajič jr., J. & Pecina, P.: Detecting Noteheads with ConvNets and Bounding Box Regression. Technical report, to appear in ArXiv e-prints, 2017 [pdf]
Hajič jr., J. & Pecina, P.: In Search of a Dataset for Handwritten Optical Music Recognition: Introducing MUSCIMA++. ArXiv e-prints, 1703.04824, 2017 [pdf]
Hajič jr., J.; Novotný, J.; Pecina, P. & Pokorný, J.: Further Steps towards a Standard Testbed for Optical Music Recognition. Proceedings of the 17th International Society for Music Information Retrieval Conference, New York University, 2016, 157-163 [pdf]
Straka, M.; Hajič, J.; Straková, J. & Hajič jr., J.: Parsing Universal Dependency Treebanks using Neural Networks and Search-Based Oracle. 14th International Workshop on Treebanks and Linguistic Theories (TLT 2015), IPIPAN, 2015, 208-220
Hajič jr., J. & Pecina, P.: Matching Illustrative Images to “Soft News” Articles. In: UFAL WDS 2015 (Conference of PhD Students in Mathematical Linguistics), Institute of Formal and Applied Linguistics, Charles University in Prague, 2015, 49-56
Veselovská, K.; Hajič jr., J. & Šindlerová, J.: Subjectivity Lexicon for Czech: Implementation and Improvements. Journal for Language Technology and Computational Linguistics, German Society for Computational Linguistics and Language Technology, 2014, 29, 47-61 [pdf]
Veselovská, K. & Hajič jr., J.: Why Words Alone Are Not Enough: Error Analysis of Lexicon-based Polarity Classifier for Czech. Proceedings of the 6th International Joint Conference on Natural Language Processing, Asian Federation of Natural Language Processing, 2013, 1-5 [pdf]
Veselovská, K.; Hajič jr., J. & Šindlerová, J.: Creating Annotated Resources for Polarity Classification in Czech. Proceedings of the 11th Conference on Natural Language Processing, Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence (ÖGAI), 2012 [pdf]
I am open to topics concerning music technology. Currently, I am supervising:
Marek Židek (defended Bc. thesis on generating music with LSTMs, includes significant effort in evaluation)
Jiří Balhar (working on Bc. thesis on melody extraction from orchestral audio)
I am a PhD student at ÚFAL, writing my thesis under RNDr. Pavel Pecina on the topic of Neural Network Models for Interpretation of Multimodal Data. This work is done for the CEMI project. In 2016, I started focusing on Optical Music Recognition (with the eventual goal of applying these multimodal models). I am generally interested in music informatics: if you are a student with an interest in music, especially machine learning for musical applications, I will be happy to hear about it! For instance, we did some music generation (interview in Czech).
My Mgr. thesis, also under RNDr. Pecina, was on the topic of automatically selecting images for news articles. This work was also done for the CEMI project. My thesis is available here: Matching Images to Texts
I previously worked on sentiment analysis in the SEANCE project, first in my Bc. thesis and then in the following years.