Institute of Formal and Applied Linguistics

at Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic


Publication


Year 2014
Type in proceedings
Status published
Language English
Author(s) Kríž, Vincent; Hladká, Barbora; Nečaský, Martin; Knap, Tomáš
Title Data Extraction Using NLP Techniques and Its Transformation to Linked Data
Czech title Extrakce dat pomocí NLP technik a jejich transformace do Linked Data
Proceedings 2014: Switzerland: MICAI 2014: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014, Tuxtla Gutiérrez, Mexico, November 16-22, 2014. Proceedings, Part I
Pages range 113-124
How published print
Supported by 2012-2015 TA02010182 (Inteligentní knihovna - INTLIB); 2012-2016 PRVOUK P46 (Informatika)
Czech abstract We present a system for extracting a knowledge base from unstructured texts. We define the knowledge base as a set of entities and the relations between them, and we represent it in an ontological framework. The extraction procedure processes input texts with linguistic procedures and extracts entities and the relations between them from their syntactic representation. The extracted information is then represented according to the Linked Data principles. The system is designed to be independent of the domain and language of the texts, so that it offers users more intelligent search than full-text search. We present first results on Czech legislative documents.
English abstract We present a system that extracts a knowledge base from raw, unstructured texts. The knowledge base is designed as a set of entities and their relations and is represented in an ontological framework. The extraction pipeline processes input texts with linguistically aware tools and extracts entities and relations from their syntactic representation. The extracted data is then represented according to the Linked Data principles. The system is designed to be both domain- and language-independent and provides users with data for more intelligent search than full-text search. We present our first case study on processing Czech legal texts.
Specialization linguistics ("jazykověda")
Confidentiality default – not confidential
Open access no
ISBN* 978-3-319-13646-2
Address* Switzerland
Publisher* Springer International Publishing
Institution* Mexican Society for Artificial Intelligence
Organization* Instituto Tecnológico de Tuxtla Gutiérrez
Creator: Common Account
Created: 10/15/14 1:29 PM
Modifier: Common Account
Modified: 11/9/15 10:31 AM