========================================================
========================================================
ACL 2007 NEWSLETTER NO.3 (February 23, 2007)
========================================================
========================================================

========================================================
:: Important Dates
========================================================

Posters and Demos submission deadline: March 1, 2007
Workshops and Co-Located Events deadlines: see Sect. 4
No extensions will be granted.

Workshops: June 23-24 and 28-30, 2007
Tutorials: June 24, 2007
Main conference: June 25-27, 2007

========================================================
:: Table of Contents
========================================================

1. ACL 2007 in Prague, Czech Republic
2. Organising Committee
3. Last Call for Posters and Demos
4. Workshops and Co-Located Events
5. Abstracts of Tutorials
6. Sponsorship News
7. Czech Airlines Discounts
8. Deadline for a (very) Cheap Accommodation in TOP Hotel
9. Newsletter No.4

========================================================
:: 1. ACL 2007 in Prague, Czech Republic
========================================================

The conference is organized by the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague ("Univerzita Karlova v Praze"), Czech Republic, the oldest university in Europe north of the Alps (founded in 1348).

The conference will take place in the TOP HOTEL Praha, located in a quiet neighborhood of the Prague 4 district, only 15-20 minutes from the historic center of Prague. The hotel can accommodate up to 1000 participants on-site (with a small number of dormitory rooms available nearby). The hotel offers one auditorium, three large lecture rooms, a number of smaller rooms for tutorials and workshops, several restaurants and cafes, and plenty of open-air space for walks and informal discussions.
The conference banquet and a conference concert will take place in historic buildings in the city center -- one in the Municipal Hall (built in the Art Nouveau style of the early 20th century) and the other in the 14th-century main University Hall.

========================================================
:: 2. Organising Committee
========================================================

General Chair: John Carroll
Local Arrangements Chair: Eva Hajicova
Program Chairs: Annie Zaenen and Antal van den Bosch
Student Workshop Chairs: Violeta Seretan and Chris Biemann, with Ellen Riloff
Workshops Chair: Simone Teufel
Tutorials Chair: Joakim Nivre
Demos/Posters Chair: Sophia Ananiadou
Exhibits Chair: Jaroslava Hlavacova and Pavel Pecina
Sponsorship: Martha Palmer, Gabor Proszeky and Jan Hajic
Publicity: Pavel Stranak and Jiri Mirovsky
Publications: Su Jian
Mentoring Service: Florence Reeder
Student Volunteers: Marketa Lopatkova
Website: Zlatica Subrova and Juraj Simlovic
Secretariat: Anna Kotesovcova
Registration: Priscilla Rasmussen

========================================================
:: 3. Last Call for Posters and Demos
========================================================

Demos/Posters Chair: Sophia Ananiadou (School of Computer Science, University of Manchester, UK)

Important Dates:
  Paper submission deadline: March 1, 2007, 5pm US Eastern time (10pm GMT)
  Notification of acceptance: April 2, 2007
  Camera ready copies due: May 4, 2007

ACL-07 features a special session for posters and demonstrations. Topics of interest cover all aspects of computational linguistics, as outlined in the main conference call for papers. Posters and demos will be held from June 25 to June 27 at the main conference venue.

Posters should describe original work in progress, and present innovative methodologies used to solve a problem in computational linguistics or NLP.
Demos should be concerned with mature systems or prototypes in which computational linguistics or NLP technologies are used to solve practically important problems. Demo papers should describe how the system is used to solve practical problems and should include a discussion of the implementation and case studies of how the system is applied.

For further information, see the conference web site:
http://ufal.mff.cuni.cz/acl2007/demos/

========================================================
:: 4. Workshops and Co-Located Events
========================================================

This is the schedule of Workshops and Co-Located Events at ACL 2007. Paper submission deadlines are listed, too. For further information and calls for papers, see
http://ufal.mff.cuni.cz/acl2007/workshops/

Co-Located Events
-----------------

June 23 (Sat) and 24 (Sun)
  IWPT: International Workshop on Parsing Technology
    Submission deadline: March 26, 2007

June 28 (Thu) and 29 (Fri)
  EMNLP/CoNLL: Empirical Methods in Natural Language Processing
    Submission deadline: March 26, 2007

Workshops
---------

June 23 (Sat) and 24 (Sun)
  WS1: SemEval-2007: 4th International Workshop on Semantic Evaluations
    Submission deadline for Description Papers: April 17, 2007

June 23 (Sat)
  WS2: Second Workshop on Statistical Machine Translation
    Submission deadline: April 2, 2007

June 28 (Thu) and June 29 (Fri)
  WS9: ACL-PASCAL Workshop on Textual Entailment and Paraphrasing
    Submission deadline for general papers: March 26, 2007
  WS10: The Linguistic Annotation Workshop (The LAW)
    A Merger of NLPXML 2007 and FLAC 2007
    Submission deadline: March 26, 2007

June 28 (Thu)
  WS3: Computational Approaches to Semitic Languages: Common Issues and Resources
    Submission deadline: March 11, 2007
  WS4: Language Technology for Cultural Heritage Data (LaTeCH 2007)
    Submission deadline: March 26, 2007
  WS14: Embodied Language Processing
    Submission deadline: March 26, 2007
  WS6: A Broader Perspective on Multiword Expressions
    Submission deadline: March 26, 2007
  WS7: Deep Linguistic Processing
    Submission deadline: March 26, 2007
  WS8: Computing and Historical Phonology
    Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
    Submission deadline: March 26, 2007
  WS11: 4th ACL-SIGSEM Workshop on Prepositions
    Submission deadline: March 26, 2007

June 29 (Fri)
  WS12: Balto-Slavonic Natural Language Processing 2007
    Special Theme: Information Extraction and Enabling Technologies
    Submission deadline: April 1, 2007
  WS13: Grammar-based approaches to spoken language processing
    Submission deadline: March 26, 2007
  WS5: BioNLP 2007
    Submission deadline: March 26, 2007
  WS15: Cognitive Aspects of Computational Language Acquisition
    Submission deadline: March 26, 2007

========================================================
:: 5. Abstracts of Tutorials
========================================================

The following five tutorials will be offered at ACL 2007 in Prague, June 24, 2007:

Tutorial #1: Bayesian Nonparametric Structured Models
-----------------------------------------------------
Percy Liang, Dan Klein

Probabilistic modeling is a dominant approach for both supervised and unsupervised learning tasks in NLP. One constant challenge for models with latent variables is determining the appropriate model complexity, i.e. the question of "how many clusters." While cross-validation can be used to select between a limited number of options, it cannot be feasibly applied in the context of larger hierarchical models where we must balance complexity in many parts of the model at the same time. Nonparametric "infinite" priors such as Dirichlet processes are powerful tools from the Bayesian statistics literature which address exactly this issue. Such priors, which have seen increasing use in recent NLP work, allow the complexity of the model to adapt to the data and admit more tractable and elegant inference methods than traditional model selection approaches.
In explaining how to do inference in these new models, we try to dispel two myths: first, that Bayesian methods are too slow and cumbersome, and, second, that Bayesian techniques require a whole new set of algorithmic ideas. We depart from the traditional sampling methodology which has dominated past expositions and focus on variational inference, an efficient technique which is a natural extension of EM. This approach allows us to tackle structured models such as HMMs and PCFGs with the benefits of Bayesian nonparametrics while maintaining much of the existing EM machinery so familiar to this community. In addition to our foundational presentation, we discuss concrete implementation issues and demonstrate the empirical advantages of these methods.

Tutorial #2: Usability and Performance Evaluation for Advanced Spoken Dialogue Systems
--------------------------------------------------------------------------------------
Michael McTear, Kristiina Jokinen

This tutorial will focus on methods, problems and challenges in the evaluation of advanced spoken dialogue systems. It is grounded in research that combines various speech and language technology components into an integrated system, and surveys the issues related to the design, evaluation and comparison of such systems. A number of different approaches to robust and efficient interaction management will be reviewed, together with various performance and user evaluation methods in academic and industrial environments. A closer look will be taken at the different metrics and usability criteria, as well as automatic design and evaluation methods. Practical requirements for dialogue systems, such as robustness, scalability and portability, will also be discussed and exemplified from the point of view of performance evaluation and usability. Special attention will be paid to user evaluation, and to the user’s expectations and experience of the system.
The tutorial is targeted at researchers and system developers who wish to learn more about theoretical and practical issues concerning development and evaluation of spoken dialogue systems.

The tutorial will consist of two parts. The first part will survey techniques in spoken dialogue management and motivate the need for evaluation studies by discussing novel challenges and requirements for interactive systems. Issues concerning conversational dialogue modelling, such as knowledge representation, context, adaptation, learning, error handling, multimodality, ubiquitous computing, as well as practical requirements for commercial dialogue systems, such as robustness, scalability and portability, will be discussed and exemplified from the point of view of performance evaluation, usability, and user evaluation. The second part of the tutorial will focus on methodologies and practices of evaluation, and will deal with basic concepts, methods and metrics for system performance evaluation and usability. In particular, challenges for the evaluation of advanced dialogue systems will be discussed.

Tutorial #3: Textual Entailment
-------------------------------
Ido Dagan, Dan Roth, Fabio Massimo Zanzotto

Recognizing Textual Entailment is the task of determining, for example, that the sentence: "Google files for its long awaited IPO" entails that "Google goes public". Determining whether the meaning of a given text passage entails that of another or whether they have the same meaning is a fundamental problem in natural language understanding that requires the ability to abstract over the inherent syntactic and semantic variability in natural language. This challenge is at the heart of many natural language understanding tasks including Question Answering, Information Retrieval and Extraction, Machine Translation, and others that attempt to reason about and capture the meaning of linguistic expressions.
The task has attracted significant interest over the last couple of years, mainly fostered by the PASCAL Recognizing Textual Entailment Challenge (RTE). A substantial number of papers on these topics have been published in major conferences and workshops in the last couple of years.

The primary goals of this tutorial are to review the framework of applied Textual Entailment and motivate it as a generic paradigm for natural language semantics. We will present some of the key computational approaches proposed and some of the obstacles identified by the research community in this area, as a way to promote further research. The tutorial will thus be useful for many of the senior and junior researchers who have prior or new interest in this area, providing a concise overview of recent perspectives and research results.

Tutorial #4: From Web Content Mining to Natural Language Processing
-------------------------------------------------------------------
Bing Liu

Web mining is a growing research area. It consists of Web usage mining, Web structure mining, and Web content mining. Web usage mining refers to the discovery of user access patterns from Web usage logs. Web structure mining tries to discover useful knowledge from hyperlinks. Web content mining aims to extract/mine useful information or knowledge from Web page contents. This tutorial focuses on Web content mining and its extensive connection with natural language processing (NLP).

In the past few years, there has been a rapid expansion of activities in Web content mining. This is not surprising because of the huge amount of valuable information of almost any imaginable type on the Web and the significant economic benefits of such mining. However, due to the heterogeneity and the lack of structure of Web data, automated discovery of targeted or unexpected knowledge/information still presents many challenging problems.
This tutorial will introduce several such problems and some state-of-the-art techniques for dealing with them, e.g., data/information extraction, Web information integration, opinion mining, and information synthesis. These problems all have strong connections with NLP. In the tutorial, I will pay special attention to such connections and discuss how NLP researchers may contribute towards solving these problems. Many real-life examples will also be given to help participants understand research concepts and see how the technologies may be deployed to real-life applications. The tutorial will thus have a mix of research and industry flavor, addressing seminal research ideas and looking at the technology from an industry angle.

Tutorial #5: Quality Control of Corpus Annotation Through Reliability Measures
------------------------------------------------------------------------------
Ron Artstein

The need for quality control of corpus annotation should be obvious: research based on annotated corpora can only be as good as the annotations themselves. In recent years, corpus annotation has expanded from marking basic morphological and syntactic structure to many new kinds of linguistic phenomena. Each new annotation scheme and every individual set of annotation guidelines need to be checked for quality, because quality inferences do not carry over from one scheme to another.

A standard way of assessing the quality of an annotation scheme and guidelines is to compare annotations of the same text by two or more independent annotators. While researchers are generally aware of this technique, it seems that the inner workings of the statistics involved and how to interpret them are understood by few. Mechanical application of agreement coefficients found in software packages can lead to serious errors, and it is therefore crucial for people who do research on annotation to become intimately familiar with these statistics.
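
As a minimal illustration of the kind of statistic at issue (an editorial sketch, not material from the tutorial itself), one widely used chance-corrected coefficient is Cohen's kappa for two annotators: kappa = (A_o - A_e) / (1 - A_e), where A_o is the observed proportion of agreement and A_e is the agreement expected by chance from each annotator's own label distribution. The function name `cohen_kappa` and the toy part-of-speech labels below are hypothetical and chosen only for this example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(labels_a)

    # Observed agreement: proportion of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement: chance that both annotators pick the same
    # category, estimated from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: two annotators tagging ten tokens as N or V.
annotator_1 = ["N", "N", "V", "N", "V", "N", "N", "N", "V", "N"]
annotator_2 = ["N", "N", "V", "N", "N", "N", "N", "V", "V", "N"]

raw = sum(a == b for a, b in zip(annotator_1, annotator_2)) / len(annotator_1)
print(f"raw agreement: {raw:.2f}")
print(f"Cohen's kappa: {cohen_kappa(annotator_1, annotator_2):.2f}")
```

On this toy data the annotators agree on 8 of 10 items, yet kappa comes out at roughly 0.52, because with such a skewed label distribution a good deal of agreement is expected by chance alone. This is one reason raw percentage agreement can overstate reliability.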
This tutorial is a thorough introduction to the statistics used for measuring agreement between corpus annotators, and hence for inferring the reliability of the annotation. The tutorial will focus on the mathematics of the various agreement measures, and consequently on the implicit assumptions they make about annotators and annotation errors. A major part of the tutorial will be devoted to agreement coefficients of the kappa family, which are the most commonly used reliability measures in computational linguistics, but the tutorial will also discuss alternative measures such as latent class analysis. The tutorial will not assume advanced mathematical knowledge beyond basic probability theory, and will thus be accessible to most researchers in computational linguistics.

========================================================
:: 6. Sponsorship News
========================================================

The ACL 2007 Sponsorship Chairs very gratefully acknowledge the following commitments in sponsorship:

Google - Gold Sponsor, $5,000.00
Microsoft - Silver Sponsor, $4,000.00
TextKernel - Silver Sponsor, $2,500
BBN - Bronze Sponsor, $1,000.00
IBM - Bronze Sponsor, $1,000.00, earmarked for the Best Student Paper Award
Xerox - Bronze Sponsor, $1,000.00
MorphoLogic - Bronze Sponsor, $1,000.00

Anyone interested in becoming a sponsor or finding out more about sponsorship opportunities is encouraged to contact any of the chairs at the following e-mail addresses:

Gabor Proszeky: proszeky@morphologic.hu
Jan Hajic: hajic@ufal.mff.cuni.cz
Jun'ichi Tsujii: j.tsujii@manchester.ac.uk
Martha Palmer: Martha.Palmer@colorado.edu

========================================================
:: 7. Czech Airlines Discounts
========================================================

Czech Airlines is the Official Carrier of ACL 2007. Czech Airlines offers a comprehensive global route network linking many major cities around the world with Prague.
Feel free to ask for a discount of 15% off applicable Economy Class fares and 25% off Business Class fares at Czech Airlines offices worldwide. More information at
http://ufal.mff.cuni.cz/acl2007/travel/

There is also a list of low-cost airline routes to Prague available on the internet:
http://www.flycheapo.com/flights/prague

========================================================
:: 8. Deadline for a (very) Cheap Accommodation in TOP Hotel
========================================================

Please note that the deadline for reserving very cheap accommodation in TOP Hotel (the conference venue) is April 15, 2007. After this date the hotel can guarantee neither the accommodation nor the prices. More information at
http://ufal.mff.cuni.cz/acl2007/accommodation/

========================================================
:: 9. Newsletter No.4
========================================================

ACL Newsletter No.4 will be published in the first half of April.