MIPT Deep Learning Club #13



Taras Khakhulin on “Breaking the Softmax Bottleneck: A High-Rank RNN Language Model”

The paper treats language modeling as a matrix factorization problem. The authors show that Softmax has a bottleneck that limits the expressive power of the model, and they propose a Mixture of Softmaxes (MoS) to address it.
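As a rough illustration of the mixture idea, here is a minimal sketch in plain Python. It assumes the mixture-weight logits and the per-component logits are already given; in the paper both are computed from the RNN hidden state, which is omitted here. The function names and the toy values are hypothetical and chosen only for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_softmaxes(prior_logits, component_logits):
    """Weighted average of K softmax distributions over the same vocabulary.

    prior_logits:     K logits for the mixture weights.
    component_logits: K lists, each holding V logits (one per word).
    """
    weights = softmax(prior_logits)
    vocab_size = len(component_logits[0])
    probs = [0.0] * vocab_size
    for weight, logits in zip(weights, component_logits):
        for i, p in enumerate(softmax(logits)):
            probs[i] += weight * p
    return probs


# Toy values (assumed for illustration): 2 components, 3-word vocabulary.
prior_logits = [0.0, 0.0]
component_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
probs = mixture_of_softmaxes(prior_logits, component_logits)
print(probs)
```

The averaging happens in probability space, after each softmax. The logarithm of the result is therefore no longer a single linear function of the hidden state, which is what lets the mixture escape the low-rank constraint.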

The results reported in the paper are impressive: the authors ran many experiments and achieved state-of-the-art results on a number of datasets.

To conclude, the expressive power of even a very strong RNN language model is restricted by a single Softmax output layer, because natural language needs a high-rank representation.
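The rank argument behind this restriction can be sketched as follows. The notation is an assumption made for this summary: $H \in \mathbb{R}^{N \times d}$ stacks the context vectors, $W \in \mathbb{R}^{M \times d}$ stacks the word embeddings, $A \in \mathbb{R}^{N \times M}$ is the matrix of true log-probabilities, and $\Lambda$ is a diagonal matrix of per-context shifts (softmax ignores a constant added to every logit of a row).

```latex
\begin{align}
  \text{Softmax matches the true distribution}
    &\iff H W^{\top} = A + \Lambda J
      \quad \text{for some diagonal } \Lambda,\; J = \mathbf{1}\mathbf{1}^{\top}, \\
  \operatorname{rank}\!\left(H W^{\top}\right) &\le d, \qquad
  \operatorname{rank}(\Lambda J) \le 1, \\
  \Rightarrow\; \operatorname{rank}(A)
    &\le \operatorname{rank}\!\left(H W^{\top}\right) + \operatorname{rank}(\Lambda J)
    \le d + 1 .
\end{align}
```

So if the true log-probability matrix has rank greater than $d + 1$, no single-softmax model with embedding size $d$ can represent it exactly, however powerful the network that produces $H$.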
