PA153: Natural Language Processing

Technical information

Homeworks

General small homeworks (1-3 points per example)

For any presentation in class:

to improve understanding

Multi-sense words in Little Prince (1-3 points)

Find words in the book The Little Prince that have more than one meaning.

Stability of word embeddings (3-10 points)

Choose one or more methods for creating word embeddings (word2vec, FastText, GloVe, …), run the training on the same data with different parameters (and/or numbers of epochs), and evaluate the stability.

Stability can be computed in several ways:

  1. How many pair similarities stay the same. It can be computed on the whole vocabulary or on a sample (for example: 10 words with frequencies from [100, 400, 1600, 6400, 25600, …]).

  2. Percentage of changes in analogy tasks. The same overall percentage in the task doesn't mean that the same analogy items were successful. Calculate how many items changed between a successful and an unsuccessful estimation.

  3. Percentage of changes in the Outlier Detection task.
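
The measures above could be sketched as follows. This is only an illustration under stated assumptions, not a prescribed solution: the function names, the toy vectors, and the tolerance `tol` (what counts as the "same" similarity) are hypothetical choices, and each embedding run is assumed to be a plain dictionary mapping a word to its vector.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_similarity_stability(emb_a, emb_b, words, tol=0.05):
    """Share of word pairs whose cosine similarity differs by at most
    `tol` between two training runs (tol is an assumed threshold)."""
    pairs = list(combinations(words, 2))
    same = sum(
        abs(cosine(emb_a[w1], emb_a[w2]) - cosine(emb_b[w1], emb_b[w2])) <= tol
        for w1, w2 in pairs
    )
    return same / len(pairs)


def changed_items(results_a, results_b):
    """Share of evaluation items (analogy or outlier detection) whose
    outcome flipped between successful and unsuccessful."""
    flipped = sum(a != b for a, b in zip(results_a, results_b))
    return flipped / len(results_a)


# Toy data: run_b has its two coordinates swapped relative to run_a,
# so the raw vectors differ but all pair similarities are preserved.
run_a = {"king": [1.0, 0.0], "queen": [0.9, 0.1], "apple": [0.0, 1.0]}
run_b = {"king": [0.0, 1.0], "queen": [0.1, 0.9], "apple": [1.0, 0.0]}
stability = pair_similarity_stability(run_a, run_b, list(run_a))

# Both runs solve 2 of 4 items (same accuracy), but not the same items.
analogy_a = [True, True, False, False]
analogy_b = [True, False, True, False]
changed = changed_items(analogy_a, analogy_b)

print(stability, changed)
```

Comparing pair similarities rather than raw vectors is a deliberate choice here: two runs can produce differently oriented vector spaces that still agree on which words are similar. The `changed_items` function takes per-item outcomes, so it could be used for both the analogy task and the Outlier Detection task.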

Projects