Introduction to Natural Language Processing (600.465) Word Classes: Programming Tips & Tricks

10/10/00



Table of Contents

Introduction to Natural Language Processing (600.465) Word Classes: Programming Tips & Tricks

The Algorithm (review)

Complexity Issues

Trick #1: Recomputing the MI the Smart Way: Subtracting...

...and Adding

Trick #2: Precompute the Counts-to-be-Subtracted

Formulas for Tricks #1 and #2

Formulas - cont.

Trick #3: Ignore Zero Counts

Trick #4: Use Updated Loss of MI

Formulas for Trick #4 (s_{k-1}, L_{k-1})

Completing Trick #4

Effective Implementation

Implementation: the Initialization Phase

Implementation: Select & Update

Towards the Next Iteration

Moving Words Around

Using the Hierarchy

Numbering the Classes (within the Hierarchy)

Author: Jan Hajic

Email: hajic@cs.jhu.edu

Home Page: http://www.cs.jhu.edu/~hajic/courses/cs465/syllabus.html