Provider: 
Grant id: TA02010182
Duration: 2012-2015

INTLIB

Intelligent library

The aim is to provide a tool for querying textual documents that is more efficient and user-friendly than full-text search. As input, we assume a collection of documents related to a particular problem domain (e.g., legislation, medicine, environment, or architecture). In the first phase, we use natural language processing tools to extract a knowledge base (a set of objects and their relationships) from the documents. In the second phase, we deal with efficient and user-friendly visualization and browsing (querying) of the extracted knowledge. The whole system is proposed as a general framework that can be modified and extended for particular data domains. To illustrate its features, we use the legislation and environmental domains.
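
The two phases can be illustrated with a toy sketch. It is not the project's implementation: a regular expression stands in for the natural language processing tools, and the relation names, sample sentences, and function names (extract_triples, build_index, related) are assumptions made up for this example. It only shows the general shape of the idea, namely extracting (object, relationship, object) triples from text and then querying them.

```python
import re
from collections import defaultdict

# Hypothetical sentence shape: "<subject> amends|repeals|cites <object>."
# A real system would use NLP tools here; the regex is only a stand-in.
PATTERN = re.compile(r"^(?P<subj>.+?) (?P<rel>amends|repeals|cites) (?P<obj>.+?)\.?$")


def extract_triples(documents):
    """Phase 1 (toy): turn each matching sentence into a (subject, relation, object) triple."""
    triples = []
    for doc in documents:
        for sentence in doc.split(". "):
            match = PATTERN.match(sentence.strip())
            if match:
                triples.append((match["subj"], match["rel"], match["obj"]))
    return triples


def build_index(triples):
    """Group the extracted relationships by their subject for fast lookup."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((rel, obj))
    return index


def related(index, subject, relation=None):
    """Phase 2 (toy): list objects related to a subject, optionally filtered by relation."""
    return [obj for rel, obj in index.get(subject, [])
            if relation is None or rel == relation]


# Made-up sample documents from an imaginary legislation collection.
documents = [
    "Act 12/2013 amends Act 5/2001. Act 12/2013 cites Decision 7.",
    "Act 3/2010 repeals Act 1/1999.",
]
triples = extract_triples(documents)
index = build_index(triples)
print(related(index, "Act 12/2013"))            # everything Act 12/2013 refers to
print(related(index, "Act 12/2013", "amends"))  # only what it amends
```

Unlike a full-text search for "Act 5/2001", a query over such triples distinguishes documents that amend the act from documents that merely mention it.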

Partners

Publications

  • Hladká Barbora, Holub Martin and Kríž Vincent: Feature Engineering in the NLI Shared Task 2013: Charles University Submission Report. Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, pages 232–241, Atlanta, Georgia, June 13 2013
     
  • Holubová Irena, Knap Tomáš, Kríž Vincent, Nečaský Martin, Hladká Barbora: INTLIB - an INTelligent LIBrary. In: Proceedings of the Dateso 2014 Annual International Workshop on DAtabases, TExts, Specifications and Objects, Copyright © Czech Technical University in Prague, Faculty of Information Technology, Praha, Czechia, ISBN 978-80-01-05482-6, ISSN 1613-0073, pp. 13-24, 2014
     
  • Jakub Klímek, Jiří Helmich, Martin Nečaský: Application of the Linked Data Visualization Model on Real World Data from the Czech LOD Cloud. Proceedings of the Workshop on Linked Data on the Web co-located with the 23rd International World Wide Web Conference (WWW 2014), Seoul, Korea, April 8, 2014. CEUR-WS.org 2014 CEUR Workshop Proceedings.
     
  • Jakub Klímek, Jiří Helmich, Martin Nečaský: Payola: Collaborative Linked Data Analysis and Visualization Framework. Proceedings of 10th Extended Semantic Web Conference (ESWC 2013), Satellite Events, Demonstration Session. Montpellier, France, May 2013. Springer, LNCS 7955. Pp 147-151. ISBN 978-3-642-41241-7.
     
  • Tomáš Knap, Jan Michelfeit, Jakub Daniel, Petr Jerman, Dušan Rychnovský, Tomáš Soukup, Martin Nečaský: ODCleanStore: A Framework for Managing and Providing Integrated Linked Data on the Web. In the Proceedings of Web Information Systems Engineering - WISE 2012 - 13th International Conference, pp. 815-816. Paphos, Cyprus, November 28-30, 2012. Lecture Notes in Computer Science 7651 Springer 2012, ISBN 978-3-642-35062-7.
     
  • Jakub Kozák, Martin Nečaský, Jan Dědek, Jakub Klímek, Jaroslav Pokorný: Linked Open Data for Healthcare Professionals. Proceedings of the 15th International Conference on Information Integration and Web-based Applications & Services (iiWAS 2013), Vienna, Austria. ACM International Conference Proceeding Series. ISBN 978-1-4503-2113-6. Pages 400-409.
     
  • Kríž Vincent: Detecting Semantic Relations in Texts and Their Integration with External Data Resources. In: WDS'13 Proceedings of Contributed Papers, Copyright © Matfyzpress, Praha, Czechia, ISBN 978-80-7378-250-4, pp. 18-23, 2013
     
  • Kríž Vincent, Hladká Barbora: RExtractor: a Robust Information Extractor. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, Copyright © Association for Computational Linguistics, Denver, CO, USA, pp. 21-25, 2015
     
  • Kríž Vincent, Hladká Barbora, Nečaský Martin, Dědek Jan: Statistical Recognition of References in Czech Court Decisions. In: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014, Tuxtla Gutiérrez, Mexico, November 16-22, 2014. Proceedings, Part I, Copyright © Springer International Publishing, Switzerland, ISBN 978-3-319-13646-2, pp. 51-61, 2014
     
  • Kríž Vincent, Hladká Barbora, Nečaský Martin, Knap Tomáš: Data Extraction Using NLP Techniques and Its Transformation to Linked Data. In: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014, Tuxtla Gutiérrez, Mexico, November 16-22, 2014. Proceedings, Part I, Copyright © Springer International Publishing, Switzerland, ISBN 978-3-319-13646-2, pp. 113-124, 2014
     
  • Nečaský Martin, Knap Tomáš, Klímek Jakub, Holubová Irena, Hladká Barbora: Linked Open Data for Legislative Domain - Ontology and Experimental Data. In: Lecture Notes in Business Information Processing, Copyright © Springer Berlin Heidelberg, ISBN 978-3-642-41686-6, pp. 172-183, 2013.
     
  • Jakub Stárka, Irena Holubová, Martin Nečaský: Strigil: A Framework for Data Extraction in Semi-Structured Web Documents. Proceedings of the 15th International Conference on Information Integration and Web-based Applications & Services (iiWAS 2013), Vienna, Austria. ACM International Conference Proceeding Series. ISBN 978-1-4503-2113-6. Pages 453-462.

Corpora

Demos

Presentations

  • 2015
     
    • Vincent Kríž. RExtractor: a Robust Information Extractor. September 2015. Recent developments in natural language processing and corpus Linguistics. Seminar on the 35th Anniversary of the Cooperation between Charles University in Prague and Hamburg University. MFF UK, Prague. (presentation)
       
    • Vincent Kríž. RExtractor: a Robust Information Extractor. May 2015. NAACL HLT 2015, Denver, Colorado, USA.
       
    • Vincent Kríž. RExtractor: a Robust Information Extractor. May 2015. Seminar KEG, VŠE, Prague.
       
    • Vincent Kríž. RExtractor: a Robust Information Extractor. March 2015. NLP Applications, MFF UK, Prague. (presentation)
       
  • 2014
     
    • Vincent Kríž. Statistical Recognition of References in Czech Court Decisions. November 2014. MICAI 2014, Tuxtla Gutiérrez, Mexico. (presentation)
       
    • Vincent Kríž. Data Extraction using NLP techniques and its Transformation to Linked Data. November 2014. MICAI 2014, Tuxtla Gutiérrez, Mexico. (presentation)
       
    • Vincent Kríž. Statistical Recognition of References in Czech Court Decisions. June 2014. Police of the Czech Republic, Prague.
       
    • Martin Nečaský, Barbora Hladká, Vincent Kríž. Data Extraction with NLP techniques and its Transformation to Linked Data. May 2014. Linguistic Mondays, MFF UK, Prague. (presentation, video)
       
    • Vincent Kríž. Statistical Recognition of References in Czech Court Decisions. February 2014. Seminář strojového učení a modelování, MFF UK, Prague. (presentation)
       
  • 2013
     
    • Vincent Kríž. Detecting Entity Relations in Texts. December 2013. Seminar KEG, VŠE, Prague. (presentation)
       
    • INTLIB team. Audit, September 2013. (documentation)
       
    • Martin Nečaský. Linked Open Data for Czech Legislation. June 2013. (presentation)
       
    • Vincent Kríž. Detecting Semantic Relations in Texts and Their Integration with External Data Resources. June 2013. WDS 2013, MFF UK, Prague. (presentation)
       
    • Martin Nečaský, Barbora Hladká. Half way through the INTLIB project. May 2013. Linguistic Mondays, MFF UK, Prague. (video)
       
    • Barbora Hladká, Vincent Kríž. Manual syntactic analysis of Czech legal texts. 2013. (presentation)
       
    • Barbora Hladká, Vincent Kríž. Extraction relations between named entities from Czech (legal) texts. 2013. (presentation)