Topic Coherence for DTM (or any model) #808
Comments
@bhargavvader thanks for raising the issue! I'll answer this in two parts.

Plugging in another model:

    topics = []
    for id, dist in hm.show_topics(formatted=False):
        topic = []
        for t, prob in dist:
            topic.append(t)
        topics.append(topic)

and then plug the topics into the coherence model. Can this be done with the DTM wrapper? Sorry, I don't have much experience with it.

Making your own coherence pipeline: I hope this clarifies your doubts a little.
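To make the "extract word lists, then score them" idea above concrete, here is a minimal, self-contained sketch. It does not use gensim: `extract_topics` and `umass_coherence` are hypothetical helper names, the input mimics the `(id, [(word, prob), ...])` shape iterated over in the loop above, and the score is a simple UMass-style measure averaged over word pairs, which may differ in detail from what the library computes.

```python
import math
from itertools import combinations


def extract_topics(shown_topics):
    """Turn (topic_id, [(word, prob), ...]) pairs into plain lists of words."""
    return [[word for word, _prob in dist] for _topic_id, dist in shown_topics]


def umass_coherence(topic, texts, eps=1.0):
    """UMass-style coherence of one topic over a tokenized corpus.

    For each pair of topic words (higher-ranked word first), take
    log((D(w_i, w_j) + eps) / D(w_i)), where D counts documents that
    contain the given words, then average over all pairs.
    """
    docs = [set(doc) for doc in texts]

    def doc_freq(*words):
        return sum(1 for doc in docs if all(w in doc for w in words))

    scores = []
    for i, j in combinations(range(len(topic)), 2):
        w_i, w_j = topic[i], topic[j]
        denom = doc_freq(w_i)
        if denom == 0:
            continue  # word never occurs in the corpus; skip the pair
        scores.append(math.log((doc_freq(w_i, w_j) + eps) / denom))
    return sum(scores) / len(scores) if scores else float("nan")


# Toy data standing in for an externally trained model and its corpus.
shown = [
    (0, [("cat", 0.5), ("dog", 0.3)]),
    (1, [("stock", 0.6), ("bond", 0.2)]),
]
texts = [
    ["cat", "dog"],
    ["cat", "dog"],
    ["cat", "fish"],
    ["stock", "bond"],
    ["stock", "cat"],
    ["stock"],
]

topics = extract_topics(shown)
for words in topics:
    print(words, umass_coherence(words, texts))
```

The point of the sketch is that the scoring step only needs word lists and a tokenized corpus, not a model object, so topics from DTM or any other external source can be fed in the same way once they are in this form.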
Ah yes, this certainly clears things up, thanks @dsquareindia. :)
Yeah I'll do that asap. Please keep this open so that I can reference this in my PR later.
@dsquareindia any update on the tutorial notebook for this? I'll be opening PRs for an easy way to get topics ready for coherence for the DTM wrapper and python DTM, so I thought a more thorough notebook from your side on plugging in models (maybe for LSI/HDP/Mallet) and a few examples of custom coherence pipelines would be handy.
Sorry for the delay @bhargavvader. I'll also write a blog post soon for this.
Hey @bhargavvader I've also added a small gist here where I have included a small snippet of how to make your own pipeline. Hope this helps too.
This is great! You should add it to the notebook.
Thanks for all the help. I'm closing the issue.
Wanted to know if there was any way to plug in the topics (topic-term distributions, doc-topic distributions, vocabulary counts, etc) to the coherence pipeline and get a measure.
Right now the notebook talks about using models (either the LdaModel or a wrapper), but if trained through an external source, I am unsure of how to do it.
pyLDAvis does this neatly in its prepare method, which accepts plain matrices as inputs if you don't have a model object.
This post also mentions using individual pipeline modules for your own coherence measure - is there any documentation/tutorial on how to do the same?
@dsquareindia, would love help regarding this.
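For readers wondering what "individual pipeline modules" could look like, coherence measures are commonly described as four stages: segmentation, probability estimation, a confirmation measure, and aggregation. The sketch below wires up one such pipeline in plain Python. All function names are hypothetical and chosen for illustration; this is not the gensim API, and the specific choices (word pairs, document-level probabilities, NPMI, arithmetic mean) are assumptions, not the pipeline the post refers to.

```python
import math
from itertools import combinations
from statistics import mean


def segment_pairs(topic):
    """Segmentation: split a topic's word list into unordered word pairs."""
    return list(combinations(topic, 2))


def estimate_boolean_document(texts):
    """Probability estimation: P(words) is the share of documents containing all of them."""
    docs = [set(doc) for doc in texts]

    def prob(*words):
        return sum(all(w in doc for w in words) for doc in docs) / len(docs)

    return prob


def confirm_npmi(pair, prob):
    """Confirmation measure: normalized pointwise mutual information of one pair."""
    w1, w2 = pair
    p12 = prob(w1, w2)
    if p12 == 0.0:
        return -1.0  # the words never co-occur
    if p12 == 1.0:
        return 1.0  # both words appear in every document
    pmi = math.log(p12 / (prob(w1) * prob(w2)))
    return pmi / -math.log(p12)


def coherence_pipeline(topic, texts, segment=segment_pairs,
                       estimate=estimate_boolean_document,
                       confirm=confirm_npmi, aggregate=mean):
    """Run the four stages in order; each stage can be swapped independently."""
    prob = estimate(texts)
    return aggregate(confirm(pair, prob) for pair in segment(topic))


# Toy corpus: "a" and "b" always co-occur, "c" never appears with either.
texts = [["a", "b"], ["a", "b"], ["c"], ["c"]]
print(coherence_pipeline(["a", "b"], texts))
print(coherence_pipeline(["a", "b", "c"], texts))
```

Keeping each stage as its own function is the design choice that matters here: a different measure can be assembled by passing another `segment`, `confirm`, or `aggregate` callable without touching the rest.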