Jan Hajic

Tel: +420 221 914 257 Fax: +420 221 914 309
Email: hajic@ufal.mff.cuni.cz
WWW: http://ufal.mff.cuni.cz/~hajic
Office: UFAL MFF UK Malostranske nam. 25, 4th floor, rm 420

Research  
Short Bio  
Publications  
Teaching  
Service  
Archive/Links  


News

2007/12/04 Some small updates made; 2007 publications added.
2007/07/18 (These) new web pages are finally in place! Look below, or click on the above shortcut links to the individual sections of my pages.


Research Interests, Grants

My research interests have evolved from morphology and tagging of inflective languages (lexicons, analysis and generation tools - see this demo) to machine translation (French-English while at IBM, and Czech-English; also Czech-Russian and other closely related languages). I am also interested in parsing (see e.g. the CLSP Workshop on parsing Czech) and generation. However, in the past 10 years, I have devoted most of my research time to creating linguistic resources, such as the Prague Dependency Treebank family of projects (Czech, English, Arabic).

I am also interested in spoken language understanding. I have participated in the Malach project, which is now finishing, working on language modeling (for ASR), on thesaurus translation, and on the Czech IR test collection.

I work closely not only with my students, but also with other Czech and foreign teams, such as the University of West Bohemia in the Czech Republic, the Center for Language and Speech Processing at JHU, the Center for Spoken Language Research at CU-Boulder, the Linguistic Data Consortium, and several European universities on EU projects (see below).

I am or have been the PI or the national PI of several major Czech, EU, and NSF (US) grants. The current projects are listed below.

2006-2009 EuroMatrix, STREP of the 6th FP of the EU (Coordinator: Hans Uszkoreit, Univ. of Saarland, Germany)
2006-2010 Companions, IP of the 6th FP of the EU (Coordinator: Yorick Wilks, Univ. of Sheffield, GB)
2002-2007 Malach, a project for automatic speech recognition (in many languages) of taped interviews with Holocaust survivors, collected by the Shoah Visual History Foundation. It also covers Information Retrieval experiments and resource creation.
2006-2010 PIRE, a project funded by the NSF to promote U.S. graduate student education in Europe. Topic: Investigation of Meaning Representations in Language Understanding for Speech Reconstruction and Machine Translation Systems.
2005-2009 From Language to the Semantic Web, a project funded by the Academy of Sciences of the Czech Republic: design of a knowledge representation system and its relation to Natural Language.
2006-2008 Language Understanding and Machine Translation, a project funded by the Grant Agency of the Czech Republic to create resources for understanding-based MT and Speech Understanding.
2005-2009 Center for Computational Linguistics, a virtual center for joint research with the University of West Bohemia, Masaryk University of Brno, and the Institute of the Czech Language in Prague.

Before that, I was the PI or Co-PI of many other projects, such as the highly collaborative, nationwide Czech National Corpus project supported by the Czech Grant Agency (2003-2006), several collaborative grants for mutual visits to and from U.S. institutions (Johns Hopkins University, University of Pennsylvania, Univ. of Colorado), and several smaller subcontracting grants (such as the U.S.-based GALE project). In the 1990s, I was the Czech PI of several collaborative EU projects specifically aimed at the former Soviet Bloc countries (EU project STEEL, EU project CEGLEX).

I have also worked on other grants as a researcher, such as the predecessor Center for Computational Linguistics (2000-2004), the Laboratory for Linguistic Data (1996-2000), the Czech-English MT project MATRACE, supported by the Czech Grant Agency (1993-1995), and many smaller projects.

I have also worked on several industrial projects: the Czech Grammar Checker project and certain lexicon(s) for Microsoft; morphological databases for companies such as IBM, Xerox, Lotus, Morphologic, Zi Corp., and Lernout & Hauspie; and cooperation on product development for several Czech companies, such as ASPI (a legal information system using NL search), Oracle (the Oracle Context product), and morphological dictionary development for the Czech and Slovak portals centrum.cz and centrum.sk.



Short Bio

My full current CV can be downloaded here (Czech language version).

2003- Director, Institute of Formal and Applied Linguistics, School of Computer Science, Faculty of Mathematics and Physics, Charles University in Prague.
2007- Full Professor, Charles University in Prague
2003-2007 Associate Professor, Charles University in Prague
2002 Team Leader, CLSP JHU Summer Workshop, Generation in the Context of Machine Translation
1999-2000 Visiting Assistant Professor, Computer Science Dept. and Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA. Taught "Introduction to NLP" and "Data Structures".
1998 Team Leader, CLSP JHU Summer Workshop, Core Natural Language Processing Technology Applicable to Multiple Languages
1994 PhD ("Dr.") in Computational Linguistics, Faculty of Mathematics and Physics, Charles University in Prague. Topic: Computational Morphology of Czech.
1993-2003 Researcher, Assistant Professor, Institute of Formal and Applied Linguistics, School of Computer Science, Faculty of Mathematics and Physics, Charles University in Prague.
1991-1993 Visiting Scientist, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA. Project: Candide (Statistical Machine Translation French -> English, project heads: Robert Mercer, Peter Brown).
1990,1991 Visiting Scientist, ISSCO, Univ. of Geneva, Switzerland. Project: Multilingual Morphological Analysis.
1984-1991 Researcher, Research Institute of Mathematical Machines, Prague. Project: Machine Translation Czech -> Russian (software documentation).
1979-1984 Bachelor's and Master's degree studies, Faculty of Mathematics and Physics, Charles University in Prague (high honors, RNDr. 1984, thesis topic: Natural Language Robot Control).



Publications

My publications are listed in two parts: those after 1998 and until 2005, and those from 2006-2007. For a complete list of my publications until the end of 2005, please see this PDF.



Teaching

I am now teaching an adapted version of the "Introduction to (statistical) NLP" course, which I developed while at JHU. The current course is divided into two parts: NPFL067 and NPFL068. Please see also my Hopkins archive web pages for more information and the complete set of foils in HTML form.

My other current and former teaching at Charles University in Prague can be found here.



Service

General Chair

2010 ACL'10, Uppsala, Sweden

Program Committee Chair, Co-chair

2007 TLT'07 (Treebanks and Linguistic Theories), Bergen, Norway
2006 TLT'06 (Treebanks and Linguistic Theories), Prague, Czech Rep.
2003 EACL'03 (European ACL Conference), Budapest, Hungary
2002 EMNLP'02 (Empirical Methods in NLP), Philadelphia, PA, USA
1999 Thematic Session on "Parsing inflective and free word order languages" ACL '99, June 1999, College Park, MD, USA

Program Committee Area Chair, Full PC Member

2004 EMNLP'04, Barcelona, Spain
2004 EAMT Workshop, La Valletta, Malta
2002 ACL'02, Philadelphia, PA, USA
1995 EACL'95, Dublin, Ireland
2003- Text, Speech and Dialogue Conference, Czech Rep. (standing PC (SC) Member)

I have also served as a reviewer for an additional 29 conferences or workshops (between 1994 and 2008).

Organization of conferences and workshops

2007 ACL'07, Prague, Czech Republic (Local Coordinator)
2006 TLT'06, Prague, Czech Republic
2006- Vilem Mathesius Courses (Schools), Prague, Czech Republic

Committees, Boards

2008-2010 Computational Linguistics, Editorial Board Member
2003- NSF Panels (ITR, HLT)
1999-2002 TEI Consortium Board of Directors Member, ACL Representative
1998-1999 TEI Steering Committee Member, ACL Representative
1997- EU Evaluation Committee(s), Research Projects
1996- Grant Agency of the Czech Republic, reviewer (Linguistic and Computer Science Programs)
1995-1996 European Chapter of the ACL Advisory Board Member
1990- Czech National Corpus Founding Member

Awards

2001 Silver Medal of the Charles University in Prague (for the Czech National Corpus)

Membership

I am a member of the ACL, ISCA, ACM, IEEE, the Czech Cybernetics Society, and the Prague Linguistic Circle.



Former Web Page(s)

You might want to visit my previous page(s) and teaching pages at http://www.cs.jhu.edu/~hajic.

You might also want to visit our Institute's pages at http://ufal.mff.cuni.cz.
