#### Topic Modeling & LDA Basics

- The clearest statement of LDA I've seen is on Wikipedia.
- Here is David Blei et al.'s original paper.
- This paper introduces Gibbs sampling for LDA.
- pLSA (also called pLSI) is the frequentist counterpart of LDA; the two are equivalent under certain conditions (pLSA corresponds to MAP estimation of LDA under a uniform Dirichlet prior).
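For readers new to the Gibbs sampling paper above, here is a minimal sketch of collapsed Gibbs sampling for LDA. Everything here (the function name, the symmetric `alpha`/`beta` priors, the toy interface) is my own illustration, not code from any of the linked papers or packages; real implementations are far more optimized.

```python
import random
from collections import defaultdict

def gibbs_lda(docs, K, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of word tokens.
    K: number of topics; alpha, beta: symmetric Dirichlet priors.
    Returns (doc-topic counts, topic-word counts).
    """
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})  # vocabulary size

    # z[d][i] is the topic of token i in document d, initialized at random
    z = [[rng.randrange(K) for _ in d] for d in docs]
    ndk = [[0] * K for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(K)]  # topic-word counts
    nk = [0] * K                                # tokens per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # remove this token's current assignment from the counts
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # full conditional:
                # P(z=t | rest) ∝ (n_dt + alpha) * (n_tw + beta) / (n_t + V*beta)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + V * beta)
                           for t in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk, nkw
```

The key trick is that the topic-mixture and topic-word distributions are integrated out, so the sampler only ever touches count tables.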

#### The Topic Modeling Software I Use

- My own textmineR package
- R's lda package by Jonathan Chang

#### On Priors and Zipf's Law

- Rethinking LDA: Why Priors Matter (This is a good paper, though I am skeptical of the conclusion.)
- Comparison of topic models, their estimation algorithms, and priors. (Very underrated, MUST READ.)
- Incorporating Zipf's law in language models
- A note on estimating LDA with asymmetric priors
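As background for the Zipf's-law link above: Zipf's law says word frequency falls off roughly as 1/rank, so log frequency vs. log rank is close to a line of slope -1. Here is a small sketch (my own illustration, not from the linked paper) that estimates the Zipf exponent by least squares on the log-log rank-frequency curve.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Estimate the Zipf exponent s, where freq(rank r) ~ C / r^s.

    Fits a least-squares line to log(frequency) vs. log(rank);
    the exponent is the negated slope (near 1 for natural language).
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

On a real corpus you would usually drop the very top and very bottom ranks before fitting, since both tails deviate from the power law.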

#### Evaluating LDA/Issues With LDA

- LDA is an inconsistent estimator
- Reading Tea Leaves: How humans interpret topic models (Also, MUST READ.)
- A coherence metric for topic models. (Note: this metric has the issue of "liking" topics full of statistically independent words. It is still useful, though.)
- My working paper on an R-squared for topic models
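To make the coherence discussion concrete, here is a sketch of a UMass-style coherence score computed from document co-occurrence counts. This is my own illustration of the general idea and may differ in detail from the metric in the linked paper; it assumes every top word appears in at least one document.

```python
import math

def umass_coherence(top_words, docs, eps=1.0):
    """UMass-style topic coherence (sketch).

    Sums log[(D(wi, wj) + eps) / D(wj)] over ordered pairs of a
    topic's top words, where D counts documents containing the
    word(s). Higher (closer to 0) means more coherent.
    """
    doc_sets = [set(d) for d in docs]
    def dcount(*words):
        return sum(1 for s in doc_sets if all(w in s for w in words))
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            wi, wj = top_words[i], top_words[j]
            score += math.log((dcount(wi, wj) + eps) / dcount(wj))
    return score
```

The smoothing term `eps` is exactly where the "statistical independence" issue bites: pairs that never co-occur are only mildly penalized, so a topic of mutually independent but individually rare words can still score respectably.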

#### Other Topic Models

- Spherical topic models
- Dynamic topic models
- Ensembles of topic models (not our stuff, but from Jordan Boyd-Graber, who is super smart and a friend of DC-NLP)

#### Other Stuff

- KERA keyword extraction, which I used to label topics in one of my examples. (The paper applying it to LDA is forthcoming, however.)
- Rethinking Language: How probabilities shape the words we use (MUST READ, though not about topic modeling specifically.)
- David Blei's topic modeling website
