
=================================================================
Semeval-2007: 4th International Workshop on Semantic Evaluations
*Information for Task Participants*
Contributors to this document:
Eneko Agirre, Lluís Màrquez, Richard Wicentowski
=================================================================
This document provides useful information for task participants at
SemEval-2007. Some of the information is necessarily partial and
incomplete at the moment. This document is available at the
SemEval-2007 official website.
Updates on the document:
Jan 18 2007:
* added section on registration IDs
Feb 13 2007:
* revised the sections on registration and on general rules about
team-system definition and paper authorship
Contact Information
==========================================================
SemEval-2007 website: http://nlp.cs.swarthmore.edu/semeval/
SemEval-2007 organizers: semeval@cs.swarthmore.edu
Task participants are responsible for
==========================================================
- submitting their system results
- writing a paper describing the system
- presenting their system (talk or poster) at the SemEval-2007 workshop
General Scheduling
==========================================================
This is the general schedule. Note that some tasks may have specific
constraints. We will be working under a very tight schedule. Hard
deadlines for task participants are marked below:
Trial datasets: January 3, 2007
Evaluation start: February 26, 2007
Evaluation end: April 1, 2007, 24:00 (GMT-7) HARD DEADLINE
Task coordinators send evaluation: April 10, 2007
Description papers due: April 17, 2007 HARD DEADLINE
Paper reviews due: April 27, 2007
Camera-ready papers: May 6, 2007 HARD DEADLINE
Workshop (in conjunction with ACL-07 in Prague): June 23-24, 2007
Task descriptions
==========================================================
Descriptions of the 19 accepted tasks have been available at the
SemEval-2007 website since October 11.
Registration
==========================================================
Registration is required before downloading task training/test
datasets.
When registering, teams need to carefully choose how they wish to be
identified. Note that both your team name and the name(s) of your
system(s) will be used on the website, in the proceedings, in the
result tables, and also in your system paper title. For this reason,
participants must follow the guidelines below.
(Additional information can be found here: "General rules about
team-system definition and paper authorship")
- Each team should be identified by a descriptive 2-5 letter
abbreviation, preferably taken from the affiliation (e.g. UBC for
the University of the Basque Country)
- We allow sites to register more than one team, where each of the
teams signs in for a number of tasks. Team members don't need to be
disjoint, but two teams with exactly the same members may not register
as two teams. These teams should use abbreviation-label as
identification. Teams should choose the label with care. Some options
include using the first letters of the members' names (e.g. UBC-ALMB,
UBC-AC) or just a number (e.g. UBC1, UBC2)
- In order to ease browsing the proceedings and finding citations,
teams will have to put their team name in the title of their system
paper (e.g. "UBC-ALMB: WSD using ..."). Please take into account the
"General rules about team-system definition and paper authorship"
in these guidelines, and try to match team identifications and papers.
Please try to follow these guidelines. If you have trouble with them,
please contact the SemEval coordinators.
General rules about team-system definition and paper authorship
================================================================
The organizers would like to avoid multiple submissions of very
similar systems from the same teams, as well as multiple papers from
the same team describing one basic system applied to similar tasks. We
would also like to make the proceedings easier to browse, making the
mapping from system identification to paper title easier.
We therefore would like to set general rules regarding these issues:
1. On registration, participant teams will need to provide a unique
team ID and a list of its members.
2. The team will then specify the tasks it is participating in, adding
a short description of its systems (e.g. "supervised system based on
SVMs and kernels"). The webpage will then return unique "team-task"
keywords (e.g. UBC-ALMB-11 for task 11).
3. Each team can only submit a single system for each task. The system
will be identified by the "team-task" keyword. If multiple results are
uploaded, all but the last will be automatically discarded.
4. Each team will get one paper in the proceedings.
Exceptions to rule 3 are allowed when one team prepares two
substantially different systems for the same task (or the same family
of tasks). Participants should contact the task organizers and the
SemEval organizers, who may then grant authorization. In any case, all
systems will have to be described in a single (possibly longer) paper.
Exceptions to rule 4 are allowed when one team participates in two
different kinds of tasks. Participants should contact the task
organizers and the SemEval organizers, who may then grant authorization.
Downloading the Datasets
==========================================================
Trial data
By January 3, 2007, we expect task organizers to provide a trial
dataset for the task. This dataset can be fairly small but should be
as complete as possible. The goal of the trial datasets is to allow
participant teams to start working on their systems. Ideally, the
degree of completeness of the trial datasets should allow participants
to work exactly in the same framework as during the evaluation period
except for the amount of data. For that, trial datasets should be
accompanied by detailed documentation on the data formatting,
evaluation measures and software, accompanying resources, baseline
systems, etc. We have recommended that organizers provide participants
with scripts to ease data management, feature generation, etc.
Final training/test datasets
Complete datasets (training and test), full documentation, and
evaluation software will be available by February 26.
The SemEval-2007 website will centralize the uploading/downloading of
all datasets.
Evaluation Period
==========================================================
The evaluation period will comprise the 5 weeks from February 26 to
April 1. During this period, participants can download training/test
data for any given task at any time, with the following restrictions:
* Results for a given task have to be submitted no later than 21 days
after downloading the training data for that particular task.
* Results for a given task have to be submitted no later than 7 days
after downloading the test data for that particular task.
Time constraints will be checked automatically by the downloading
application. Before the test period expires, participants will upload
the answer files output by their systems, again to the application in
the SemEval-2007 website.
Most tasks will abide by the described time constraints, but there are
a few exceptional cases with different evaluation schedules which will
be announced by task organizers.
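As an illustration only (the actual downloading application at the
SemEval-2007 website implements its own checks), the effective
submission deadline for a task is simply the earliest of the three
limits above. The function name and example dates below are
hypothetical:

```python
from datetime import datetime, timedelta

# Assumed stand-in for the evaluation-period hard deadline
# (April 1, 2007, end of day, GMT-7).
EVALUATION_END = datetime(2007, 4, 1, 23, 59)

def submission_deadline(train_download, test_download):
    """Effective deadline for one task: at most 21 days after
    downloading the training data, at most 7 days after downloading
    the test data, and never later than the evaluation period end."""
    return min(train_download + timedelta(days=21),
               test_download + timedelta(days=7),
               EVALUATION_END)

# Example: training data fetched March 1, test data fetched March 20.
# Here the 21-day training window is the binding constraint.
deadline = submission_deadline(datetime(2007, 3, 1), datetime(2007, 3, 20))
print(deadline)  # 2007-03-22 00:00:00
```

Downloading the test data late does not extend the overall window; the
21-day clock started by the training download still applies.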
Results analysis and paper preparation
==========================================================
Task organizers will be provided with the answer files submitted by
the participants in the corresponding tasks (by April 2). Task
organizers evaluate the submissions and return the results to
participants (by April 10).
Each participant and each task organizer will have to write a paper
describing their system/task (April 17). Papers are reviewed by the
SemEval-2007 program committee (reviews due April 27), then revised by
the authors, and final camera-ready papers are submitted by May 6.
This procedure has internal dependencies. In particular, task
organizers should contact participants as needed, anytime during April
2-16, to obtain additional information about their participation in
the task (basic architecture of the system, knowledge sources,
features, etc.) so that they can include this information in the task
description paper. In the other direction, task organizers should
provide participants with any information about the task that is
useful for producing a better system description paper. In particular,
task organizers may consider informing participants about baseline
results, results obtained by other participants, etc. Coherence
between the task description paper and participants' system
description papers is very welcome. One good practice is to enforce a
common formatting style among all participants for the presentation of
results. Also, task organizers might release guidelines requiring
participants to include in their papers specific information relevant
to the task analysis (e.g., knowledge sources and features used,
learning algorithms, training/testing times, error analysis, etc.).
All paper submissions have to be compliant with ACL styles, available
at the conference website. Paper submission will be online. The
website with the submission form will be announced in advance.
At the Workshop
==========================================================
Task participants are invited to present a description of their
systems at the workshop, which will take place in conjunction with
ACL-2007 in Prague on June 23-24. Once workshop details are decided,
we will announce the type of presentations to be given (oral
presentations, posters, etc.).
Acknowledgements
==========================================================
Thanks to Rada Mihalcea and Phil Edmonds for their continuous advice
and for providing us with the very useful handbook of Senseval
organizers.