Documentation for EVALD 4.0, EVALD 4.0 for Foreigners and EVALD 4.0 for Beginners

EVALD 4.0 and EVALD 4.0 for Foreigners serve for automatic evaluation of surface coherence (cohesion) in Czech texts written by native speakers of Czech and non-native speakers of Czech, respectively. EVALD 4.0 for Beginners aims at overall evaluation of texts written by beginner-level non-native speakers of Czech.

The evaluation part (the backend server) is implemented in Treex (http://ufal.cz/treex), a highly modular NLP framework written in the Perl programming language, and uses the Weka toolkit (http://www.cs.waikato.ac.nz/ml/weka/) for the final prediction of a coherence mark. It can be used directly from a command line or as a backend for a client. The frontend part is implemented as a web server, accessible with a web browser.

There are three possible ways of using any of the three versions of EVALD 4.0:

  1. interactively as a web demo and RESTful web service hosted at the LINDAT/CLARIN server,

  2. interactively but locally, with both the server and the client running on the same machine (or two machines in the same network),

  3. in a batch mode run on the local machine.

1. EVALD as a LINDAT/CLARIN web service

No installation is needed in this case; in a web browser (such as Firefox or Chrome), go to https://lindat.mff.cuni.cz/services/evald/ for EVALD 4.0, https://lindat.mff.cuni.cz/services/evald-foreign/ for EVALD 4.0 for Foreigners, or to https://lindat.mff.cuni.cz/services/evald-begin/ for EVALD 4.0 for Beginners.

How to use the web interface:

Type or paste the text to be evaluated into the text field and click the "Evaluate!" button. The evaluation takes 10–20 seconds; the result is then displayed, e.g.:

Evaluation
Evaluation class: B1
Probability of the evaluation: 0.59
Language aspects stronger than B1:
    Syntax: complexity and diversity
    Text structure: coreference
    Text syntax: sentence information structure
Language aspects corresponding to B1:
    Vocabulary: complexity and diversity
    Morphology: complexity and diversity
    Text structure: frequency of discourse connectives
    Text structure: diversity of discourse connectives
    Text comprehension: readability
Language aspects weaker than B1:
    Spelling: unrecognized words
Readability measures:
    Flesch-Kincaid Grade Level Formula: 10
    Flesch Reading Ease: 54
    SMOG index: 13
    Coleman-Liau index: 7
    Automated readability index: 5
Estimated reading time: 1 min. 24 sec.
Variety of vocabulary:
    Number of lemmas used in the text: 111
    Richness of vocabulary: Yule's K characteristics: 198
    Richness of vocabulary: Simpson's Diversity Index (D): 199

2. EVALD as a backend and frontend server

Both the backend and the frontend components are distributed via the Docker software (https://www.docker.com/), which needs to be installed first. Docker greatly simplifies the installation process of the two components and allows them to be run on Linux-based operating systems, Windows 10, as well as Mac OS X.

How to install the app:
  1. To install Docker, please follow the instructions for your operating system: MS Windows, Linux, Mac OS X. Please see the memory requirements for the EVALD backend and frontend components below and set up the Docker environment accordingly.
  2. Download the backend component by running:
    docker pull ufal/evald.treex-server:4.0
  3. Download the frontend component by running:
    docker pull ufal/evald.php-server:4.0
How to run the app:
  1. The backend component requires one parameter, --port=$PORT, which specifies the port on which the backend server listens for requests (e.g. 34567). An optional parameter, --app=$APP, specifies the application to run: L1 or native for EVALD 4.0, L2 or foreign for EVALD 4.0 for Foreigners, and L2b or begin for EVALD 4.0 for Beginners. If --app is omitted, all three applications are run and the one to use is chosen at runtime. Run the following command to start the backend server in the background:
    docker run -d --expose $PORT -p $PORT:$PORT -t ufal/evald.treex-server:4.0 \
      run_evald_server.pl --port=$PORT [--app=$APP]
  2. The previous command prints out the ID of the running backend container (represented by a $ID variable in the following). Another way to find out the ID is to run the following command and look for the ID of the running ufal/evald.treex-server:4.0 component:
    docker ps
  3. Find out the IP address of the running backend component (represented by a $IP_ADDR variable in the following, e.g., 172.17.0.2):
    docker inspect -f '{{ .NetworkSettings.IPAddress }}' $ID
    If you are going to run the frontend component on a different machine in the same network, set $IP_ADDR to the IP address of the machine on which the backend server runs instead.
  4. The frontend component accepts three parameters: --L1=$L1_URL gives the URL of a running backend component that processes L1 queries, and analogously --L2=$L2_URL and --L2b=$L2b_URL give the URLs of backend components that process L2 and L2b queries. Each URL combines the backend component's IP address, the port on which it listens, and a path that selects the application. The combination must be in the form "http://$IP_ADDR:$PORT/$PATH" (e.g. http://172.17.0.2:34567/native). Note that if you run a backend component without the --app=$APP parameter, the L1 application is accessible using $PATH=native, L2 using $PATH=foreign and L2b using $PATH=begin. Start the frontend server in the background; it listens for requests on the standard HTTP port 80:
    docker run -d -p 80:80 ufal/evald.php-server:4.0 run_server.sh [--L1=$L1_URL] \
      [--L2=$L2_URL] [--L2b=$L2b_URL]
  5. Open your favorite web browser and visit http://localhost/index.php, http://localhost/index-foreign.php or http://localhost/index-begin.php to start working with EVALD 4.0, EVALD 4.0 for Foreigners or EVALD 4.0 for Beginners, respectively.
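
The steps above can be tied together in a short shell sketch. It is only an illustration and not part of the EVALD distribution: the function names evald_path, evald_url and start_evald are hypothetical, and the default port 34567 is simply the example value used above. The sketch assumes a single backend started without --app on the same machine as the frontend, so that all three applications are reachable under the paths native, foreign and begin.

```shell
#!/bin/sh
# Hypothetical helper: map an application name to the URL path served by a
# backend that was started without --app (paths as described in step 4).
evald_path() {
    case "$1" in
        L1|native)  echo native ;;
        L2|foreign) echo foreign ;;
        L2b|begin)  echo begin ;;
        *) echo "unknown application: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build a backend URL of the form http://$IP_ADDR:$PORT/$PATH.
# $1 = IP address, $2 = port, $3 = application name
evald_url() {
    url_path=$(evald_path "$3") || return 1
    echo "http://$1:$2/$url_path"
}

# Start the backend and then the frontend (assumes Docker is installed and
# both images have already been pulled). $1 = port, defaults to 34567.
start_evald() {
    port=${1:-34567}
    # "docker run -d" prints the ID of the new container.
    id=$(docker run -d --expose "$port" -p "$port:$port" -t \
        ufal/evald.treex-server:4.0 run_evald_server.pl --port="$port") || return 1
    ip=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' "$id") || return 1
    docker run -d -p 80:80 ufal/evald.php-server:4.0 run_server.sh \
        --L1="$(evald_url "$ip" "$port" native)" \
        --L2="$(evald_url "$ip" "$port" foreign)" \
        --L2b="$(evald_url "$ip" "$port" begin)"
}
```

Calling start_evald 34567 would then perform steps 1 to 4 in one go; the web pages from step 5 are opened manually afterwards.
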
How to stop the app:
  1. Execute the following command to find the IDs of the two running components:
    docker ps
  2. Use Docker to stop the running background processes:
    docker stop $BACKEND_ID
    docker stop $FRONTEND_ID
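
Instead of copying the IDs by hand, the two steps can be combined, as in the following sketch. The function names stop_by_image and stop_evald and the DOCKER override variable are hypothetical; the sketch assumes that every running container created from the two EVALD images should be stopped.

```shell
#!/bin/sh
# Hypothetical helper: stop every running container created from one image.
# DOCKER may be set to another command (for example a stub) for a dry run.
stop_by_image() {
    # "docker ps -q" prints only container IDs; the ancestor filter keeps
    # the containers that were created from the given image.
    ids=$(${DOCKER:-docker} ps -q --filter "ancestor=$1") || return 1
    # Nothing to do if no container of this image is running.
    [ -n "$ids" ] || return 0
    # $ids is left unquoted on purpose so that several IDs become several arguments.
    ${DOCKER:-docker} stop $ids
}

# Stop the frontend first, then the backend.
stop_evald() {
    stop_by_image ufal/evald.php-server:4.0
    stop_by_image ufal/evald.treex-server:4.0
}
```
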
System requirements:
  • Linux with sudo privileges, Windows 10, Mac OS X
  • Docker
  • backend component:
    • 2.5 GB free on a hard drive, 5 GB RAM
  • frontend component:
    • 500 MB free on hard drive, 30 MB RAM

3. EVALD in a batch mode

EVALD can be run in batch mode to process just the selected documents, stored as plain text in UTF-8 encoding, and then stop. There are two ways to install and use it in batch mode: as a dockerized container, or directly as a Treex scenario.

Batch mode as a dockerized container

The dockerized backend component ufal/evald.treex-server:4.0 can also be used in batch mode. The advantages of this approach are easy installation and the possibility of running the application on operating systems other than Linux.

How to install the app:

Follow the instructions above to install the backend component ufal/evald.treex-server:4.0.

How to run the app:
  1. Create two directories: one for the input documents and one for the outputs. The absolute paths to the directories are represented by the variables $FROM_DIR_ABS_PATH and $TO_DIR_ABS_PATH, respectively. Put your texts into the $FROM_DIR_ABS_PATH directory.
  2. To process all the texts inside the $FROM_DIR_ABS_PATH directory with the scenario specified by the $APP parameter (see above), run the following command:
    docker run -v $FROM_DIR_ABS_PATH:/from/ -v $TO_DIR_ABS_PATH:/to/ -t ufal/evald.treex-server:4.0 \
      run_evald.sh -f /from -t /to -a $APP
    The -v directory_outside_Docker:directory_inside_Docker parameter maps, at runtime, a directory outside the Docker container to the specified directory inside it, which makes the whole content of the outside directory visible to the container. However, additional steps are needed for this feature to work in Docker for Windows. See the instructions here.
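
The two steps can be wrapped in a small function, sketched below. The name run_evald_batch and the DOCKER override variable are hypothetical, and the absolute-path check assumes POSIX-style paths starting with a slash, so it would need adjusting for Windows paths.

```shell
#!/bin/sh
# Hypothetical wrapper around the batch command shown above.
# $1 = absolute input directory, $2 = absolute output directory,
# $3 = application (for example native, foreign or begin)
# DOCKER may be set to another command (for example a stub) for a dry run.
run_evald_batch() {
    from_dir=$1; to_dir=$2; app=$3
    # Volume mappings are given as absolute paths, so reject relative ones.
    for dir in "$from_dir" "$to_dir"; do
        case "$dir" in
            /*) ;;
            *) echo "not an absolute path: $dir" >&2; return 1 ;;
        esac
    done
    [ -d "$from_dir" ] || { echo "no such input directory: $from_dir" >&2; return 1; }
    # Create the output directory if it does not exist yet.
    mkdir -p "$to_dir" || return 1
    ${DOCKER:-docker} run -v "$from_dir:/from/" -v "$to_dir:/to/" -t \
        ufal/evald.treex-server:4.0 run_evald.sh -f /from -t /to -a "$app"
}
```

A call could look like run_evald_batch "$PWD/texts" "$PWD/results" foreign, where texts and results are example directory names.
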
System requirements:

Requirements are the same as for the backend component in the previous case.

Batch mode as a Treex scenario

In this approach, the main processing tool, Treex, is installed directly, without being wrapped in a Docker container. On the one hand, this gives the user better control over the processing pipeline; on the other hand, installing Treex and its supporting tools is a rather complex task that is not recommended for inexperienced users. Furthermore, the version of Treex used in EVALD is not guaranteed to work out of the box on operating systems other than Linux.

How to install the app:

The instructions given here are only a rough outline. Please contact us at evald@ufal.mff.cuni.cz for details and help with the installation.

Treex (http://ufal.cz/treex) needs to be installed on the local machine, along with all (mostly CPAN) dependencies for the Czech text analysis. Treex must be in the revision tagged as EVALD_4.0 (https://github.com/ufal/treex/releases/tag/EVALD_4.0). In addition, Vowpal Wabbit 8.1.1 (https://github.com/JohnLangford/vowpal_wabbit/releases/tag/8.1.1) must be installed to the location installed_tools/ml/vowpal_wabbit-v8.1-3cf3f692/ relative to the Treex Share directory. The Treex scenario to be run is a part of the LINDAT/CLARIN EVALD 4.0 distribution (http://hdl.handle.net/11234/1-3065, file Evald-4.0.scen), the LINDAT/CLARIN EVALD 4.0 for Foreigners distribution (http://hdl.handle.net/11234/1-3066, file Evald-4.0-Foreign.scen), or the LINDAT/CLARIN EVALD 4.0 for Beginners distribution (http://hdl.handle.net/11234/1-3067, file Evald-4.0-Begin.scen).

How to run the app:

Replace $FILE with the actual file name and run one of the following commands (for texts written by native speakers, non-native speakers, or non-native beginners, respectively):

treex -Lcs from=$FILE Read::Text Evald-4.0.scen
treex -Lcs from=$FILE Read::Text Evald-4.0-Foreign.scen
treex -Lcs from=$FILE Read::Text Evald-4.0-Begin.scen
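
To process several files in a row, the commands above can be wrapped in a loop, as in the following sketch. The function names evald_scenario and evald_treex and the TREEX override variable are hypothetical, and the application names are borrowed from the --app values of the Docker version purely for convenience. The sketch assumes that the scenario files are in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: choose the scenario file for an application name.
evald_scenario() {
    case "$1" in
        L1|native)  echo Evald-4.0.scen ;;
        L2|foreign) echo Evald-4.0-Foreign.scen ;;
        L2b|begin)  echo Evald-4.0-Begin.scen ;;
        *) echo "unknown application: $1" >&2; return 1 ;;
    esac
}

# Run the chosen scenario on every file given after the application name.
# TREEX may be set to another command (for example a stub) for a dry run.
evald_treex() {
    scen=$(evald_scenario "$1") || return 1
    shift
    for file in "$@"; do
        # Same argument order as in the commands above.
        ${TREEX:-treex} -Lcs from="$file" Read::Text "$scen" || return 1
    done
}
```

For example, evald_treex foreign essay1.txt essay2.txt would evaluate two (hypothetical) files with the scenario for non-native speakers.
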

The evaluation of the text (the overall mark and the marks for the individual feature sets) is printed while the Treex scenario runs, along with the probabilities of the predicted results. The first mark (with an unnamed feature set) is the overall evaluation; the subsequent marks reflect individual qualities of the text, e.g.:

 - feature set:                       class: B1       probability: 0.67
 - feature set: spelling              class: B1       probability: 0.516
 - feature set: morphology            class: B1       probability: 0.79
 - feature set: vocabulary            class: C2       probability: 0.65
 - feature set: syntax                class: B1       probability: 0.69
 - feature set: connectives_quantity  class: B1       probability: 0.62
 - feature set: connectives_diversity class: B2       probability: 0.66
 - feature set: coreference           class: B1       probability: 0.58
 - feature set: tfa                   class: B2       probability: 0.66
 - feature set: readability           class: B1       probability: 0.69
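
For further processing, the lines shown above can be converted to tab-separated rows, as in the following sketch. The function name evald_marks is hypothetical, and the sketch assumes the output has exactly the layout of the example above; it does not know whether Treex prints these lines to standard output or standard error, so both may need to be captured.

```shell
#!/bin/sh
# Hypothetical helper: read the Treex output on standard input and print one
# "feature_set<TAB>class<TAB>probability" row per reported mark.
# The unnamed first feature set (the overall evaluation) is printed as "overall".
evald_marks() {
    awk '/feature set:/ {
        name = "overall"; cls = ""; prob = ""
        for (i = 1; i <= NF; i++) {
            # The word after "set:" is the feature set name, unless it is missing.
            if ($i == "set:" && $(i + 1) != "class:") name = $(i + 1)
            if ($i == "class:") cls = $(i + 1)
            if ($i == "probability:") prob = $(i + 1)
        }
        print name "\t" cls "\t" prob
    }'
}
```

A possible use is treex -Lcs from=$FILE Read::Text Evald-4.0.scen 2>&1 | evald_marks, which keeps only the mark lines and drops all other output.
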
System requirements:

This version of the application (i.e., as a Treex scenario) can be installed and run only on Linux. On the other hand, there is no need for sudo privileges and for Docker to be installed. The hardware requirements correspond to those listed above for the backend component.

Technical Support

If you have questions or need technical support, please contact evald@ufal.mff.cuni.cz.

How to cite EVALD 4.0, EVALD 4.0 for Foreigners and EVALD 4.0 for Beginners

  • Michal Novák, Jiří Mírovský, Kateřina Rysová, Magdaléna Rysová, Eva Hajičová: EVALD 4.0 – Evaluator of Discourse. Data/software, LINDAT/CLARIN digital library, Prague, Czech Republic, http://hdl.handle.net/11234/1-3065, Oct 2019.
  • Michal Novák, Jiří Mírovský, Kateřina Rysová, Magdaléna Rysová, Eva Hajičová: EVALD 4.0 for Foreigners – Evaluator of Discourse. Data/software, LINDAT/CLARIN digital library, Prague, Czech Republic, http://hdl.handle.net/11234/1-3066, Oct 2019.
  • Michal Novák, Jiří Mírovský, Kateřina Rysová, Magdaléna Rysová, Eva Hajičová: EVALD 4.0 for Beginners – Evaluator of Discourse. Data/software, LINDAT/CLARIN digital library, Prague, Czech Republic, http://hdl.handle.net/11234/1-3067, Oct 2019.

There are also papers describing the related research and experiments:

  • Kateřina Rysová, Magdaléna Rysová, Michal Novák, Jiří Mírovský, Eva Hajičová: EVALD – a Pioneer Application for Automated Essay Scoring in Czech. In: The Prague Bulletin of Mathematical Linguistics, Vol. 113, Univerzita Karlova, Prague, Czech Republic, ISSN 0032-6585, pp. 9–30, Oct 2019. WWW: https://ufal.mff.cuni.cz/pbml/113/art-rysova-et-al.pdf
  • Michal Novák, Jiří Mírovský, Kateřina Rysová, Magdaléna Rysová: Exploiting Large Unlabeled Data in Automatic Evaluation of Coherence in Czech. In: Lecture Notes in Computer Science, Vol. 11697, Proceedings of the 22nd International Conference on Text, Speech and Dialogue – TSD 2019, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-030-27946-2, ISSN 0302-9743, pp. 197–210, 2019. WWW: https://link.springer.com/chapter/10.1007%2F978-3-030-27947-9_17
  • Michal Novák, Jiří Mírovský, Kateřina Rysová, Magdaléna Rysová: Topic–Focus Articulation: A Third Pillar of Automatic Evaluation of Text Coherence. In: Advances in Computational Intelligence (LNAI 11289): 17th Mexican International Conference on Artificial Intelligence, MICAI 2018, Proceedings, Part II, Springer, Switzerland, ISBN 978-3-030-04497-8, pp. 92–105, 2018. WWW: https://link.springer.com/chapter/10.1007/978-3-030-04497-8_8
  • Michal Novák, Kateřina Rysová, Magdaléna Rysová, Jiří Mírovský: Incorporating Coreference to Automatic Evaluation of Coherence in Essays. In: Statistical Language and Speech Processing, Springer International Publishing, Cham, Switzerland, ISBN 978-3-319-68455-0, ISSN 1611-3349, pp. 58–69, 2017. WWW: https://link.springer.com/content/pdf/10.1007%2F978-3-319-68456-7_5.pdf
  • Kateřina Rysová, Magdaléna Rysová, Jiří Mírovský, Michal Novák: Introducing EVALD – Software Applications for Automatic Evaluation of Discourse in Czech. In: Proceedings of the International Conference Recent Advances in Natural Language Processing, INCOMA Ltd., Šumen, Bulgaria, ISBN 978-954-452-048-9, ISSN 1313-8502, pp. 634–641, 2017. WWW: https://acl-bg.org/proceedings/2017/RANLP%202017/pdf/RANLP082.pdf
  • Kateřina Rysová, Magdaléna Rysová, Jiří Mírovský: Automatic Evaluation of Surface Coherence in L2 Texts in Czech. In: Proceedings of the 28th Conference on Computational Linguistics and Speech Processing ROCLING XXVIII (2016), The Association for Computational Linguistics and Chinese Language Processing (ACLCLP), Taipei, Taiwan, ISBN 978-957-30792-9-3, pp. 214–228, 2016. WWW: http://aclweb.org/anthology/O/O16/O16-1021.pdf

Acknowledgment

Software applications EVALD 4.0, EVALD 4.0 for Foreigners and EVALD 4.0 for Beginners were developed in the years 2016–2019 at the Institute of Formal and Applied Linguistics (ÚFAL, http://ufal.mff.cuni.cz/), Faculty of Mathematics and Physics, Charles University, with the financial support of the Ministry of Culture of the Czech Republic, project Automatic Evaluation of Text Coherence in Czech (DG16P02B016, http://ufal.mff.cuni.cz/grants/evald-evaluator-discourse).