Otakar Smrž, Ph.D.
Personal Information
LinkedIn:
cz.linkedin.com/in/otakarsmrz
E-mail:
otakar.smrz seznam.cz
Education
2001 – 2007 Institute of Formal and Applied Linguistics
Faculty of Mathematics and Physics, Charles University in Prague
Ph.D. studies in Mathematical Linguistics
[www]
2004 – 2005 Department of Computer and Information Science
University of Pennsylvania
Fulbright-Masaryk Fellowship grantee
[www]
2000 – 2001 Faculty of Mathematics and Physics, Charles University in Prague
one year of MSc. studies in Informatics
1996 – 2001 Faculty of Mathematics and Physics, Charles University in Prague
MSc. degree in Physics, summa cum laude, master's thesis in Geophysics
1992 – 1996 Gymnázium Josefa Ressela Chrudim
grammar school graduation
Employment
January 2005 – present Institute of Formal and Applied Linguistics, Charles University
researcher, project coordinator
[www]
September 2005 – February 2006 Department of Middle Eastern Studies, University of West Bohemia
lecturer
[www]
September 2004 – June 2005 Linguistic Data Consortium, University of Pennsylvania
visiting scholar and researcher
[www]
October 2001 – December 2004 Center for Computational Linguistics, Charles University
researcher, project coordinator
[www]
January 2002 – December 2006 Institute of Comparative Linguistics, Charles University
programmer of tools for dictionary compilation
March 2001 – November 2001 Language Centre David Holiš
webmaster and system administrator
IT Experience
Programming Languages: Perl (experienced), PHP, C, Pascal, Fortran
Haskell (experienced), Oz, Lisp
TeX (dictionary typesetting)
Programming Tools: Perl modules and libraries, TrEd contexts and macros, PPM, POD
Haskell libraries, Cabal, Haddock, QuickCheck
TeX/LaTeX packages, ArabTeX, Beamer, Listings, PSTricks, PGF
XML technology (XPath, XSLT, DTD, XSD), HTML, CSS, XSH
Matlab/Octave, IDL
Operating Systems: Red Hat Linux, MS Windows
Application Software: Emacs, SVN/CVS, TextPad, WinEdt, Servant Salamander, MS Office, TrEd
Language Skills
English fluent (TOEFL in Sep 2002, total score 267, essay rating 4.0)
Arabic fluent (Basic State Exam in Dec 2000, with high distinction)
Farsi elementary (most recent study)
Korean conversational (university courses in 2002, previous experience in 2001 and earlier study)
German conversational (six non-recent years of study)
French passive (experience in 2002, earlier one semester of study)
Russian passive (two non-recent years of study)
Slovak passive (perfect comprehension)
Czech fluent (native language)
Recent Projects
Prague Arabic Dependency Treebank research project of Charles University in Prague with application in machine translation
scientific contributions, coordination of international working group
[www] [www]
ElixirFM – Functional Arabic Morphology doctoral thesis in Mathematical Linguistics
original theory and implementation
[www] [pdf] [html]
Encode Arabic Haskell [Encode] and Perl [Encode::Arabic] libraries for encodings of Arabic [www] [cgi] [pdf]
MorphoTrees methodology and implementation of efficient morphological annotation [gif] [gif] [gif] [pdf]
ArabTeX Plus extensions of ArabTeX concerning its notation and the support for colorizing [www] [pdf] [pdf]
Haskell and Domain-Specific Languages graduate course for the Faculty of Mathematics and Physics, Charles University
lecturing, original design of the course
[www] [pdf]
ANLP – Arabic Natural Language Processing undergraduate course for Middle Eastern Studies, University of West Bohemia
lecturing, original design of the course
[www] [pdf]
Qamus open project for compiling an Arabic-Czech and Czech-Arabic dictionary
computer method design, resource compilation, programming and data processing
[html/xml]
ArabSpell spell-checker of Arabic following formal rules [www] [pdf]
... projects in Oriental languages processing of dictionaries, typesetting of books etc.
design and implementation of tools for public use
Printed Publications
Prague Arabic Dependency Treebank: A Word on the Million Words Otakar Smrž, Viktor Bielický, Iveta Kouřilová, Jakub Kráčmar, Jan Hajič, Petr Zemánek
LREC 2008 Workshop on Arabic and Local Languages, Marrakech, Morocco, 2008
[pdf] [www]
Building the Valency Lexicon of Arabic Verbs Viktor Bielický, Otakar Smrž
LREC 2008 Conference, Marrakech, Morocco, 2008
[pdf] [www]
Functional Arabic Morphology: Dissertation Summary Otakar Smrž
Prague Bulletin of Mathematical Linguistics 88, 2007
[pdf] [www]
Functional Arabic Morphology. Formal System and Implementation Otakar Smrž
Doctoral Thesis, Charles University in Prague, July 2007
[pdf] [pdf-short]
ElixirFM – Implementation of Functional Arabic Morphology Otakar Smrž
Computational Approaches to Semitic Languages, ACL 2007, Prague
[pdf]
Tips and Tricks of the Prague Arabic Dependency Treebank Otakar Smrž
The Challenge of Arabic for NLP/MT, London, UK, 2006
[pdf-paper]
[pdf-slides]
Encode Arabic: Exercise in Functional Parsing
(under review)
Otakar Smrž
(unpublished manuscript)
[pdf]
Information Structure with the Prague Arabic Dependency Treebank
(under review)
Otakar Smrž, Petr Zemánek, Jakub Kráčmar, Viktor Bielický
Communication and Information Structure in Spoken Arabic, College Park, USA
[pdf-paper]
[pdf-slides]
The Other Arabic Treebank: Prague Dependencies and Functions
(under review)
Otakar Smrž, Jan Hajič
Arabic Computational Linguistics: Current Implementations, CSLI Publications (to appear)
[pdf]
Feature-Based Tagger of Approximations of Functional Arabic Morphology Jan Hajič, Otakar Smrž, Tim Buckwalter, Hubert Jin
Proceedings of TLT 2005, Barcelona, Spain, 2005
[pdf-paper]
[pdf-slides]
Learning to Use the Prague Arabic Dependency Treebank Otakar Smrž, Petr Pajas, Zdeněk Žabokrtský, Jan Hajič, Jiří Mírovský, Petr Němec
Perspectives on Arabic Linguistics XIX, John Benjamins, 2007
[pdf] [ps]
MorphoTrees of Arabic and Their Annotation in the TrEd Environment Otakar Smrž, Petr Pajas
NEMLAR Conference Proceedings, Cairo, Egypt, 2004
[pdf] [ps] [pps]
Prague Arabic Dependency Treebank: Development in Data and Tools Jan Hajič, Otakar Smrž, Petr Zemánek, Jan Šnaidauf, Emanuel Beška
NEMLAR Conference Proceedings, Cairo, Egypt, 2004
[pdf] [ps] [pps]
Arabic Syntactic Trees: from Constituency to Dependency Zdeněk Žabokrtský, Otakar Smrž
EACL'03 Research Note, Budapest, Hungary, 2003
[pdf] [ps] [pps]
Sherds from an Arabic Treebanking Mosaic Otakar Smrž, Petr Zemánek
Prague Bulletin of Mathematical Linguistics 78, 2002
[pdf] [www]
Searching for Non-linearities in Natural Language Kiril Ribarov, Otakar Smrž
7th Experimental Chaos Conference, San Diego, USA, 2002
[pdf]
External Tools Not Only for ArabTeX Documents Karel Mokrý, Otakar Smrž
International Symposium on Processing of Arabic, Manouba, Tunisia, 2002
[pdf] [ps] [pps]
Earthquake of Athens, 1999: Study of Aftershocks Otakar Smrž
Master's Thesis, Charles University in Prague, 2001
[pdf-I] [pdf-II]
[zip/ps-I+II]
Lectures & Reviews
Programming the Arabic Treebank Otakar Smrž
Invited Lecture, National Centre for Language Technology, Dublin City University, 2008
[pdf]
Demo Proposal: Extensible Integrated Treebank Annotation Environment
Otakar Smrž
Computational Approaches to Arabic Script-based Languages, Stanford, 2007
[pdf]
Functional Arabic Morphology: Principles of Design
(Funkční arabská morfologie)
Otakar Smrž
Research Report, Formal Linguistics Seminar, Charles University in Prague, 2006
[pdf-slides] [pdf]
Prague Arabic Dependency Treebank
(Prague Treebanking for Everyone)
Otakar Smrž
Invited Lecture, Vilém Mathesius Lecture Series 21, Prague, 2006
[pdf-slides] [pdf]
[www]
Impressive Haskell
Otakar Smrž
Invited Lecture, Seminar of the Institute of Formal and Applied Linguistics, Kvilda, 2006
[pdf]
[www]
Yet Another Introduction to Arabic NLP
Otakar Smrž
Lecture Notes, Faculty of Philosophy, University of West Bohemia in Pilsen, 2005
[pdf]
[www]
Functional Morphology
by Markus Forsberg and Aarne Ranta
Otakar Smrž
LDC Institute Presentation, University of Pennsylvania, Philadelphia, USA, 2005
[pdf] [pps]
[www]
Review of Finite State Morphology
by Kenneth R. Beesley and Lauri Karttunen
Otakar Smrž
Prague Bulletin of Mathematical Linguistics 81, 2004
[pdf]
[www]
Review of A Student Grammar of Modern Standard Arabic
by Eckehard Schulz
Iveta Kouřilová, Otakar Smrž
The Linguist List, online issue 16.2221, 2005
[www] [www]
Intro to Natural Language Processing
(in Czech)
Otakar Smrž
Presentation of PADT, Faculty of Arts, Charles University in Prague, 2003
[pps]
[www]
Scripts of Ancient Orient and Modern Information Technology
(in Czech + English)
Otakar Smrž
Orientalia Antiqua Nova, University of West Bohemia in Pilsen, 2006
[pdf]
[www]
International Meetings
October 2006 The British Computer Society, London, UK
The Challenge of Arabic for NLP/MT
September 2006 Portland, OR, USA
International Conference on Functional Programming 2006
August–September 2006 International Center for Persian Studies, Tehran, Iran
Elementary course in Farsi
August 2006 Faculty of Literature and Human Science, University of Tehran, Tehran, Iran
Invited lecture on Prague Arabic Dependency Treebank
June 2006 University of Maryland, College Park, USA
Conference on Communication and Information Structure in Spoken Arabic
December 2005 University of Barcelona, Barcelona, Spain
The Fourth Workshop on Treebanks and Linguistic Theories TLT 2005
August 2005 Heriot-Watt University, Edinburgh, Scotland, UK
ESSLLI 2005 European Summer School in Logic, Language and Information
July 2005 Johns Hopkins University, Baltimore, MD, USA
CLSP Workshop '05 on Parsing Arabic Dialects (visiting participant)
June 2005 University of Michigan, Ann Arbor, MI, USA
ACL 2005 Conference
May 2005 University of Chicago, Chicago, IL, USA
Chicago 2005 Machine Learning Summer School
April 2005 University of Illinois, Urbana-Champaign, IL, USA
XIXth Arabic Linguistics Symposium
September 2004 Cairo, Egypt
NEMLAR International Conference on Arabic Language Resources and Tools
November–December 2003 Yemen Language Center, Sanaa, Yemen
Individual advanced course in Arabic
July 2003 Charles University in Prague, Czech Republic
XVIIth International Congress of Linguists
July 2003 Johns Hopkins University, Baltimore, MD, USA
CLSP Workshop '03 Summer School in Natural Language Processing
May 2003 Charles University in Prague, Czech Republic
Prague-Penn Arabic Treebanking Workshop
April 2003 Budapest, Hungary
EACL'03 Conference
December 2002 Pisa, Italy
ISLE/EAGLES Workshop, CNR meeting on Arabic
July 2002 University of Pennsylvania, Philadelphia, PA, USA
Penn-Prague Arabic Treebanking Workshop, ACL'02 Conference
May 2002 Las Palmas, Canary Islands, Spain
LREC'02 Conference
April 2002 University of Manouba, Tunis, Tunisia
International Symposium on Processing of Arabic
July–August 2001 International House, Seoul, South Korea
Korean language and culture lessons
June 2001 Smolenice Castle, Slovak Republic
Czech-Slovak Seismological Days
July–August 1999 Bourguiba Institute of Modern Languages, Tunis, Tunisia
35th Intensive Summer Course in Arabic
Other Information
Non-profit organizations Opus arabicum, former vice-chairman and webmaster [www]
ACL SIG on Computational Approaches to Semitic Languages [www]
Fellowships & grants Fulbright-Masaryk Fellowship, University of Pennsylvania, September 2004 – June 2005 [www]
Grant Agency of Charles University in Prague, project UK 373/2005, 2005 – 2006 [www]