Due to the variability and ambiguity inherent in natural languages, Language Understanding has always been considered a bottleneck in the development of Dialogue Systems with respect to domain coverage and scalability. This is especially true in Spoken Dialogue Systems (SDS), in which the understanding component is usually placed in a pipeline after the Automatic Speech Recognition (ASR) component and thus receives its propagated errors.
In this project, we propose to improve Spoken Language Understanding (SLU) components in two ways.
First, we enhance the accuracy of SLU by proposing a statistical model that enforces expressive linguistic knowledge as a set of constraints on top of standard machine learning methods. In other words, the constraints fine-tune the results of the underlying machine learning methods. The constraints are built automatically from large RDF databases using schema matching and graph search algorithms.
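To make the constraint idea concrete, the following is a minimal sketch of reranking a base model's slot-filling hypotheses against type constraints. The slot names, entity types, constraint set, and penalty scheme are all invented for illustration; the proposed model is statistical, whereas this sketch uses a simple fixed penalty.

```python
# Hypothetical sketch: rerank slot-filling hypotheses from a base ML model
# using type constraints of the kind that could be mined from an RDF schema.

# Invented constraints: which entity types are admissible fillers for each slot.
CONSTRAINTS = {
    "departure_city": {"City"},
    "airline": {"Organization"},
}

# Toy type lookup standing in for a query against an RDF database.
ENTITY_TYPES = {
    "Boston": "City",
    "Delta": "Organization",
}

def constrained_score(hypothesis, base_score, penalty=0.5):
    """Downweight hypotheses whose slot fillers violate a constraint."""
    score = base_score
    for slot, value in hypothesis.items():
        allowed = CONSTRAINTS.get(slot)
        if allowed and ENTITY_TYPES.get(value) not in allowed:
            score -= penalty  # constraint violation
    return score

def rerank(hypotheses):
    """hypotheses: list of (slot_dict, base_score) pairs from the base model."""
    return max(hypotheses, key=lambda h: constrained_score(h[0], h[1]))

best = rerank([
    ({"departure_city": "Delta"}, 0.9),   # higher base score, violates constraint
    ({"departure_city": "Boston"}, 0.7),  # consistent with the constraint
])
# The type-consistent hypothesis wins despite its lower base score.
```

The point of the sketch is that the constraints do not replace the base model; they adjust its scores, so a confident, constraint-consistent hypothesis can overtake a higher-scored but linguistically implausible one.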
Second, we aim to build an SLU component for an open-domain environment. Instead of confining the SLU to predefined symbols, we aim to make a seamless connection between the SLU component and large RDF databases to expand its domain of understanding. This approach eliminates the dependence of the SLU component on predefined domains and makes it useful in an open-domain environment.
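One way the RDF connection could work is sketched below: a mention unseen during training is grounded by searching the database's type and subclass hierarchy until a concept the SLU already handles is reached. The triples, predicate names, and concept inventory here are invented for illustration.

```python
from collections import deque

# Hypothetical fragment of an RDF database as (subject, predicate, object) triples.
TRIPLES = [
    ("Quinoa", "rdf:type", "Grain"),
    ("Grain", "rdfs:subClassOf", "Food"),
    ("Food", "rdfs:subClassOf", "Thing"),
]

KNOWN_CONCEPTS = {"Food"}  # concepts the SLU component already understands

def ground(mention):
    """Breadth-first search up the type/subclass hierarchy until a known
    concept is reached; returns None if the mention cannot be grounded."""
    graph = {}
    for s, p, o in TRIPLES:
        if p in ("rdf:type", "rdfs:subClassOf"):
            graph.setdefault(s, []).append(o)
    queue, seen = deque([mention]), {mention}
    while queue:
        node = queue.popleft()
        if node in KNOWN_CONCEPTS:
            return node
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None
```

Under these toy triples, `ground("Quinoa")` resolves the unseen mention to the known concept `Food`, so the SLU need not have the symbol `Quinoa` predefined; the database supplies the link.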