Friday, February 23, 2018

textmineR 2.1.0 is up

Over the weekend I released textmineR 2.1.0 to CRAN (current version here). This release contains a couple of minor updates and 5 vignettes to get you up and running with text mining.

The vignettes cover the philosophy of textmineR, basic corpus statistics, document clustering, topic modeling, text embeddings (basically topic modeling applied to a term co-occurrence matrix), and building a basic document summarizer. That last vignette uses text embeddings plus a variation of the TextRank algorithm.
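If "term co-occurrence matrix" is unfamiliar, here is a rough sketch of the idea. It is written in Python rather than R to keep it dependency-free, and it is not how textmineR builds the matrix: the function name `cooccurrence_counts`, the `window` argument, and the toy documents are all made up for illustration. It simply counts how often two terms fall within a few tokens of each other.

```python
from collections import Counter


def cooccurrence_counts(docs, window=2):
    """Count unordered term pairs that appear within `window` tokens of each other.

    Hypothetical helper for illustration only; not textmineR's API.
    """
    counts = Counter()
    for doc in docs:
        # Naive tokenizer: lowercase and split on whitespace.
        tokens = doc.lower().split()
        for i, term in enumerate(tokens):
            # Look ahead only, so each pair of positions is visited once.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != term:
                    # Sort the pair so (a, b) and (b, a) share one key.
                    counts[tuple(sorted((term, other)))] += 1
    return counts


# Toy corpus, invented for this example.
docs = ["the cat sat on the mat", "the dog sat on the log"]
tcm = cooccurrence_counts(docs, window=2)
```

The counts in `tcm` are a sparse, symmetric version of the matrix. Roughly speaking, an embedding model is then fit to a matrix like this one instead of to a document-term matrix.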

The other updates are relatively minor. First, @manuelbickle discovered that my implementation of CalcProbCoherence was scaled differently from what I'd intended. That's fixed, though it shouldn't affect the qualitative use of probabilistic coherence. Second, I realized that my documentation for CreateTcm was misleading. That's now fixed as well.