Lectures (fall semester):
Where: Malostranské nam. 25, 1st floor, S8
When: Thu 9:00-10:30
Seminars (fall semester):
Where: Malostranské nam. 25, 1st floor, S8
When: Thu 10:40-12:10
Instructor: Jan Hajič/Pavel Pecina
Email: hajic@ufal.mff.cuni.cz
WWW: http://ufal.mff.cuni.cz/~hajic
Office(s): MFF UK Malostranske nam., 4th floor, rm 420/422
Students should have substantial programming experience in C, C++, Java, and/or Perl, and should preferably have taken Data Structures (Datove struktury, NTIN060), Unix (NSWI095), and Intro to Probability (NMAI059) or their equivalents, even though all the probability theory needed will be re-explained. Knowledge of Perl, or a willingness to learn its basics as you go (and on your own), is also important. One of the benefits of the course is that it is given in English; it should enable you to read current literature on NLP more smoothly, since the literature is almost exclusively in English. Czech terminology will be explained for those interested.
The material covered in this course is selected in such a way that at its completion you should be able to understand papers in the field of Natural Language Processing, and it should also make your life easier when taking more advanced courses either at UFAL MFF UK or elsewhere.
No background in NLP is necessary.
Manning, C. D. and H. Schütze: Foundations of Statistical Natural Language Processing. The MIT Press. 1999. ISBN 0-262-13360-1. Eight copies of this book are available at the CS library for borrowing. Please be considerate to other students and do not keep the book(s) longer than absolutely necessary.
Jurafsky, D. and J. H. Martin: Speech and Language Processing. Prentice-Hall. 2000. ISBN 0-13-095069-6. Three copies of Jurafsky's book are available at UFAL's library.
Wall, L., Christiansen, T. and R. L. Schwartz: Programming Perl, 3rd ed. O'Reilly. 1996. ISBN 0-596-00027-8.
Allen, J.: Natural Language Understanding. The Benjamin/Cummings Publishing Company Inc. 1994. ISBN 0-8053-0334-0.
Cover, T. M. and J. A. Thomas: Elements of Information Theory. Wiley. 1991. ISBN 0-471-06259-6.
Charniak, E.: Statistical Language Learning. The MIT Press. 1996. ISBN 0-262-53141-0.
Jelinek, F.: Statistical Methods for Speech Recognition. The MIT Press. 1998. ISBN 0-262-10066-5. Four copies of Jelinek's book are available at UFAL's library, but they are primarily reserved for those taking Nino Peterek's and/or Filip Jurcicek's courses.
Some of the Proceedings are available at UFAL's library, physically and/or in electronic form. Most of them are, however, freely available through the ACL Anthology, including all volumes of the Computational Linguistics journal and the new Transactions of the ACL journal.
For MFF UK students, please see http://www.ms.mff.cuni.cz/labs/unix. For others, please visit http://www.ms.mff.cuni.cz/students/externisti.html.
cd
tar -czvf ~/username.assignx.tgz ./*
Send the resulting file by e-mail (as an attachment) to
with the following subject line:
Subject:
e.g.
Subject: Jan.Novak 2
for Jan Novák, turning in the second assignment.
No plagiarism will be tolerated. The assignments are to be completed on your own; please respect this. If the instructor determines that the similarities between submissions are too substantial to be coincidental, he will call the two (or more) students to explain them and possibly to take an immediate test (or assignment, at the discretion of the instructor, not to exceed four hours of work) to determine each student's abilities related to the offending work. *All* cases of confirmed plagiarism will be reported to the Student Office.
For each day your submission is late, 5 points will be subtracted from the points awarded to the solution or a part of it, up to a maximum of 50 points per homework. Submissions received less than 4 weeks before the closing date of the term will not be graded and will be awarded 0 points.
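The late-submission rule can be illustrated with a short sketch. The function names are hypothetical, and flooring the adjusted score at zero is an assumption that the rule above does not state; the 4-week cutoff is not modeled here.

```python
POINTS_PER_LATE_DAY = 5   # from the rule above
MAX_LATE_PENALTY = 50     # cap per homework, from the rule above


def late_penalty(days_late):
    """Points subtracted for a submission that is days_late days late."""
    days = max(days_late, 0)  # an early or on-time submission has no penalty
    return min(POINTS_PER_LATE_DAY * days, MAX_LATE_PENALTY)


def adjusted_score(awarded, days_late):
    """Awarded points minus the late penalty.

    Assumption: the result is not allowed to drop below zero.
    """
    return max(awarded - late_penalty(days_late), 0)


if __name__ == "__main__":
    print(adjusted_score(80, 3))   # 3 days late
    print(adjusted_score(80, 12))  # penalty is capped at 50 points
```

Under these assumptions, a homework awarded 80 points and handed in 3 days late would count for 65 points, and the penalty stops growing after 10 days.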
No. | Course | Due date | Task | Resources |
---|---|---|---|---|
#1 | NPFL067 | February 29, 2016 | Exploring Entropy and Language Modeling | TEXTEN1.txt (large!) TEXTCZ1.txt (large!) |
#2 | NPFL068 | June 30, 2016 | Word Classes | TEXTEN1.txt (large!) TEXTCZ1.txt (large!) TEXTEN1.ptg (large!) TEXTCZ1.ptg (large!) |
#3 | NPFL068 | July 31, 2016 | Tagging | texten2.ptg (large!) textcz2.ptg (large!) |
Exam | Date, Time | Where |
---|---|---|
NPFL067 | Jan. 12, 2016, 10:40-11:40 | S6 |
NPFL068 | May 16, 2016, 10:40-12:10 | S6 |
NPFL068, 2nd date | June 30, 2016, 10:00-11:30 | 420 |
Both the mid-term and the final exams will be written (not oral), with about 6 major questions and some subquestions. You will have 60 minutes for the mid-term (fall semester or "ZS", NPFL067), and up to 90 minutes for the final exam (i.e., spring semester or "LS", NPFL068) to write down the answers.
To get an idea of the type of exam questions, please see the questionnaire from one of the previous years' final exams (Questionnaire).
As stated above, your final grade (or pass/fail for PhD students) will be determined by the exam and your assignment results, weighted 50:50 for NPFL067 and 1:1:1 (in other words, roughly 33:33:33) for NPFL068.
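As a rough illustration of the weighting, the sketch below averages the exam and assignment results with equal weights. It assumes that every component is expressed on the same scale (for example, percent of the maximum) and that the three equal parts for NPFL068 are the exam and the two NPFL068 assignments; neither assumption is confirmed by this page, and the function name is hypothetical.

```python
def combined_score(exam, assignments):
    """Equal-weight average of the exam and all assignment results.

    With one assignment this is a 50:50 split (NPFL067); with two
    assignments it is 1:1:1 (NPFL068). All inputs are assumed to be
    on the same scale.
    """
    components = [exam, *assignments]
    return sum(components) / len(components)


if __name__ == "__main__":
    print(combined_score(80, [60]))      # NPFL067: exam and one assignment
    print(combined_score(90, [60, 75]))  # NPFL068: exam and two assignments
```

How the combined score maps to a final grade is given by the official grading table, not by this sketch.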
In special circumstances (long-term absence etc.), some other schedule and grading scheme could be worked out individually, but please try hard to hand in all assignments on time and come to the final exam on the regular date.
The official, 'usual' grading table is now available here. You will need a username and password to access it - I will email it to you.
The official, 'usual' grading table is now available here. You will need a username and password to access it - same as above (it has been mailed to you).
The web pages from 2012/2013, including grading (password needed as usual) are available at http://ufal.mff.cuni.cz/~hajic/courses/archive/npfl067/1213/syllabus.html.
The original web pages for this course are also still active at http://www.cs.jhu.edu/~hajic/courses/cs465/syllabus.html.