Abstract: In NLP we rely on manually annotated data, e.g. treebanks. Such data is hard to come by, which explains the recent interest in semi-supervised NLP. However, our labeled data is also (almost always) extremely biased. This talk presents bias correction techniques and discusses their applicability in NLP.
Anders Søgaard received his Ph.D. in mathematical linguistics from the University of Copenhagen in 2007. He has been a Senior Researcher at the University of Potsdam and now works as an Associate Professor at the University of Copenhagen. He was recently awarded a European Research Council Starting Grant.