Looks great, but not a mixed-membership model (i.e., it apparently assumes that each document contains a single topic).
Looks like a seamless way to extract the text of news documents. Appears simple to use.
NLP toolkit by the same team that built the Java Wikipedia database indexer/API. Looks pretty good.