Somebody’s list of drawbacks for topic extraction using Latent Dirichlet Allocation reads like a list of reasons why I like it: You cannot influence the topics that show up. You can tweak parameters and hope, but you can’t influence them directly.
Main disadvantages of LDA
Lots of fine-tuning
LDA is fast to run, but getting good results from it takes some effort. That’s why knowing in advance how to fine-tune it will really help you.
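To make "fine-tuning" concrete, here is a minimal sketch of what that tweaking might look like, assuming scikit-learn's LatentDirichletAllocation. The toy documents, the parameter grid, and the choice of perplexity as a score are all illustrative assumptions, not something the list above prescribes. Perplexity measured on the training data is a crude yardstick, so treat the "best" setting as a starting point rather than an answer.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus, made up for illustration only.
docs = [
    "cats and dogs are popular pets",
    "dogs chase cats in the garden",
    "stocks and bonds are common investments",
    "investors trade stocks on the market",
]

# LDA works on raw term counts, so build a document-term count matrix.
counts = CountVectorizer(stop_words="english").fit_transform(docs)

# Try a few combinations of topic count and document-topic prior (alpha).
results = {}
for n_topics in (2, 3):
    for alpha in (0.1, 1.0):
        lda = LatentDirichletAllocation(
            n_components=n_topics,
            doc_topic_prior=alpha,
            topic_word_prior=0.1,  # held fixed here; could be tuned as well
            max_iter=20,
            random_state=0,        # fixed seed so runs are repeatable
        )
        lda.fit(counts)
        # Lower perplexity means the model fits these counts better.
        results[(n_topics, alpha)] = lda.perplexity(counts)

best = min(results, key=results.get)
print("lowest perplexity at (n_topics, alpha) =", best)
```

Even in this small grid there are three knobs (topic count and two priors), and none of them lets you ask for a particular topic.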
It needs human interpretation
Topics are found by a machine. A human needs to label them in order to present the results to non-experts.
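As a rough illustration of that labeling step: a fitted model only hands back a weight for every word in every topic, and a person reads the highest-weighted words to come up with a name. The sketch below uses a hypothetical helper, top_words, and made-up weights rather than the output of a real model.

```python
def top_words(topic_word_weights, vocabulary, n=3):
    """Return the n highest-weighted words for each topic."""
    summaries = []
    for weights in topic_word_weights:
        # Rank word indices by descending weight within this topic.
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        summaries.append([vocabulary[i] for i in ranked[:n]])
    return summaries


# Made-up vocabulary and topic-word weights, for illustration only.
vocabulary = ["cat", "dog", "pet", "stock", "bond", "market"]
topic_word_weights = [
    [0.30, 0.28, 0.25, 0.06, 0.05, 0.06],
    [0.04, 0.05, 0.06, 0.32, 0.25, 0.28],
]

summaries = top_words(topic_word_weights, vocabulary)
for index, words in enumerate(summaries):
    print(f"Topic {index}: {', '.join(words)}")
# Topic 0: cat, dog, pet
# Topic 1: stock, market, bond
```

The model stops at "Topic 0" and "Topic 1". Calling them "pets" and "finance" is the human part.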
You cannot influence topics
Knowing that some of your documents cover a particular topic and then not finding it among the topics LDA produces is frustrating. There is no way to tell the model that certain words should belong together. You have to sit and wait for LDA to give you what you want.