Flect is a morphological generation tool based on Python and Scikit-Learn that learns morphological inflection patterns from corpora.

Given any morphologically annotated corpus, the system learns to generate inflected word forms from lemmas and morphological features. It can inflect even previously unseen words, because it uses lemma suffixes as features and predicts “edit scripts” that describe the difference between the lemma and the form.

[Figure: overall scheme of how Flect works]

The way this works is similar to Morfette, which uses edit scripts for lemmatization.
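
To make the edit-script idea concrete, here is a minimal sketch in Python. It is not Flect's code: it assumes a simplified, suffix-only edit script of the form (number of characters to strip from the end of the lemma, suffix to append), and it replaces the trained classifier with a toy lookup table keyed on the last lemma character and a tag. The function names, the tag `VBZ`, and the training triples are all hypothetical and chosen only for illustration.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def make_edit_script(lemma, form):
    """Describe the lemma -> form difference as (chars to strip, suffix to add)."""
    k = common_prefix_len(lemma, form)
    return (len(lemma) - k, form[k:])


def apply_edit_script(lemma, script):
    """Apply a (strip, suffix) edit script to a lemma."""
    strip, suffix = script
    keep = max(0, len(lemma) - strip)  # never strip more than the lemma has
    return lemma[:keep] + suffix


def suffix_features(lemma, max_len=3):
    """Lemma suffixes of length 1..max_len, usable as classifier features."""
    return {f"suffix{n}": lemma[-n:]
            for n in range(1, max_len + 1) if len(lemma) >= n}


# Toy stand-in for the learned model: remember one edit script per
# (last lemma character, tag). A real system would train a classifier
# on the suffix features instead (assumption, for illustration only).
training = [("try", "VBZ", "tries"), ("walk", "VBZ", "walks")]
rules = {}
for lemma, tag, form in training:
    rules[(lemma[-1:], tag)] = make_edit_script(lemma, form)


def inflect(lemma, tag):
    """Inflect a lemma; fall back to the unchanged lemma if no rule matches."""
    script = rules.get((lemma[-1:], tag), (0, ""))
    return apply_edit_script(lemma, script)


print(make_edit_script("try", "tries"))  # (1, 'ies')
print(inflect("cry", "VBZ"))             # cries (unseen lemma)
print(suffix_features("cry"))
```

Because the edit script refers only to the end of the lemma, the script learned from "try" carries over to the unseen lemma "cry". This is the property that lets a suffix-based model generalize beyond its training vocabulary.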

Usage

You may download Flect from GitHub under the Apache 2.0 license.

CoNLL and ARFF training data formats are supported. Treex can be used to generate both formats; a how-to for training Flect using Treex is available in the Treex SVN. If you run into any problems, please do not hesitate to contact Ondřej Dušek.
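
As a rough illustration of what CoNLL-style training data provides, the sketch below pulls the word form, lemma, tag, and morphological features out of tab-separated lines. It assumes the CoNLL-X column order (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, ...); the exact column layout Flect expects may differ, and the function name `read_conll_triples` and the sample line are hypothetical.

```python
def read_conll_triples(lines):
    """Yield (form, lemma, postag, feats) from CoNLL-X style lines.

    Assumes tab-separated columns in CoNLL-X order:
    ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, ...
    """
    for line in lines:
        line = line.rstrip("\n")
        # Skip sentence separators (blank lines) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 6:
            continue  # malformed line: not enough columns
        yield cols[1], cols[2], cols[4], cols[5]


# Hypothetical sample sentence fragment in CoNLL-X layout.
sample = [
    "1\tdogs\tdog\tN\tNNS\tnum=pl\t2\tnsubj\t_\t_\n",
    "\n",
]
for form, lemma, tag, feats in read_conll_triples(sample):
    print(form, lemma, tag, feats)  # dogs dog NNS num=pl
```

Each tuple pairs a lemma and its morphological description with the observed form, which is the information a morphological generation model needs for training.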

Citing

Flect and our experiments on six different languages are described in the following paper:

Ondřej Dušek and Filip Jurčíček: Robust Multilingual Statistical Morphology Generation Models. In: ACL Student Research Workshop, Sofia, 2013.
[paper PDF] [slides PDF]