Report on the participation of the Charles University team in the CLEF eHealth Evaluation Lab 2016 health information retrieval competition.
In this paper we present the participation of the Charles University team in Task 3, Patient-Centred Information Retrieval. In the monolingual task and its subtasks, we submitted two runs: one based on a language model approach and one based on a vector space model. For the multilingual task, we use the Khresmoi translator, a Statistical Machine Translation (SMT) system, to translate the queries into English and obtain an n-best list of translations. The baseline system takes the 1-best translation and uses it for retrieval, while the other runs use a machine learning model to rerank the n-best list and predict the translation that gives the best CLIR performance in terms of P@10. We present a set of features for training the model; these features are generated from the SMT verbose output, from resources such as the UMLS Metathesaurus and MetaMap, from the document collection, and from Wikipedia articles. Experiments on the test sets of previous CLEF eHealth IR tasks show a significant improvement of the reranker over the baseline system.
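The reranking idea can be illustrated with a minimal sketch. It is not the system described in the paper: the linear scorer, the feature names (`smt_score`, `umls_terms`), the weights, and the example queries are all hypothetical and chosen only to show how a reranker may pick a different candidate than the 1-best baseline, and how P@10 is computed for a ranked result list.

```python
from typing import Dict, List, Sequence, Set


def precision_at_k(ranked_doc_ids: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved documents that are relevant (P@k)."""
    top_k = ranked_doc_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant) / k


def linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Hypothetical stand-in for the trained model: a weighted sum of features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def pick_translation(n_best: List[dict], weights: Dict[str, float]) -> str:
    """Return the candidate translation with the highest predicted score."""
    best = max(n_best, key=lambda cand: linear_score(cand["features"], weights))
    return best["text"]


# Illustrative n-best list, ordered by SMT score (the first entry is the 1-best).
n_best = [
    {"text": "cardiac attack signs", "features": {"smt_score": -1.5, "umls_terms": 0.0}},
    {"text": "heart attack symptoms", "features": {"smt_score": -2.0, "umls_terms": 2.0}},
]
weights = {"smt_score": 1.0, "umls_terms": 1.0}  # assumed, not learned

baseline_query = n_best[0]["text"]
reranked_query = pick_translation(n_best, weights)
```

In this toy setting the baseline keeps the translation with the best SMT score, while the reranker prefers the candidate whose additional feature raises its predicted score. In the paper the model is trained to predict the candidate with the best P@10, so `precision_at_k` would serve as the training target rather than being called at query time.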
Krisztian Balog; Linda Cappellato; Nicola Ferro; Craig Macdonald
School of Sciences and Technology of the University