Biased Estimates: Topic Models Reading List

Below is a list of topic modeling papers and other resources that I've found helpful and informative. If any of the links are broken or behind a paywall, please let me know in the comments or use the "Contact me directly" form to your right, and I'll see what I can do to fix it.

Topic Modeling & LDA Basics

The Topic Modeling Software I Use

My own textmineR package
R's lda package by Jonathan Chang

On Priors and Zipf's Law

Rethinking LDA: Why Priors Matter (This is a good paper, though I am skeptical of the conclusion.)
Comparison of topic models, their estimation algorithms, and priors. (Very underrated, MUST READ.)
Incorporating Zipf's law in language models
A note on estimating LDA with asymmetric priors

Evaluating LDA/Issues With LDA

LDA is an inconsistent estimator
Reading Tea Leaves: How humans interpret topic models (Also, MUST READ.)
A coherence (cohesion?) metric for topic models. (Note: This metric has the issue of "liking" topics full of statistically independent words. It is still useful, though.)
My working paper on an R-squared for topic models

Other Topic Models

Spherical topic models
Dynamic topic models
Ensembles of topic models (not our stuff, but from Jordan Boyd-Graber, who is super smart and a friend of DC-NLP)

Other Stuff

KERA keyword extraction used to label topics in one of my examples. (The paper applying it to LDA is forthcoming, however.)
Rethinking Language: How probabilities shape the words we use (MUST READ, though not about topic modeling specifically.)
David Blei's topic modeling website
