Port metrics to select number of LDA topics to Python from R #1275

tmylk · 2017-04-11T21:05:37Z

The R package ldatuning implements 4 metrics for selecting the best number of LDA topics.
The metrics are quite easy to implement.
An ipynb that implements those metrics and plots them would be great.

Also see MDL for LSI ticket #28

souravsingh · 2017-04-13T20:14:40Z

Do we make use of the rpy2 bridge or write the code in Python from scratch?

tmylk · 2017-04-13T20:30:10Z

It is easier to write it in Python from scratch.

souravsingh · 2017-08-03T15:10:26Z

@menshikh-iv There is an implementation of the Arun metric here: https://github.com/AdrienGuille/TOM/blob/master/tom_lib/nlp/topic_model.py#L63

Would it be fine to use the implementation for Gensim?

souravsingh · 2017-11-11T17:42:57Z

The repo https://github.com/WZBSocialScienceCenter/tmtoolkit implements the Griffiths, Cao Juan and Arun metrics, but currently only Cao Juan can be used with Gensim.

We can mention this somewhere in the docs and close the issue.
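
For reference, a minimal sketch of the Cao Juan metric as I understand it from ldatuning: the average cosine similarity over all pairs of topic-word distributions, where lower is better. The function name `cao_juan_2009` and the toy matrices are made up for illustration. With Gensim the rows would presumably come from `LdaModel.get_topics()`, but this sketch only needs plain lists of word weights.

```python
import math
from itertools import combinations


def cao_juan_2009(topic_word):
    """Average pairwise cosine similarity between topic-word rows.

    topic_word: sequence of K rows, one per topic, each a sequence of
    word weights. Lower values mean the topics overlap less.
    """
    if len(topic_word) < 2:
        raise ValueError("need at least two topics")

    def cosine(x, y):
        dot = sum(a * b for a, b in zip(x, y))
        norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
        return dot / norm

    # Every unordered pair of topics, K * (K - 1) / 2 pairs in total.
    pairs = list(combinations(topic_word, 2))
    return sum(cosine(x, y) for x, y in pairs) / len(pairs)


# Toy topic-word matrices (hypothetical numbers, not from a trained model).
distinct = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]
overlapping = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]

scores = {
    "distinct": cao_juan_2009(distinct),
    "overlapping": cao_juan_2009(overlapping),
}
# The candidate with the lowest score is preferred under this metric.
best = min(scores, key=scores.get)
```

To pick a number of topics, one would train a model per candidate topic count, score each model's topic-word matrix this way, and take the count with the lowest score.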

menshikh-iv · 2017-11-13T05:06:36Z

@souravsingh TM evaluation typically isn't trivial; for this reason, we want to see it as a part of gensim (in the current lib OR in a notebook, depending on how difficult it is to calculate the metric).

moyid · 2020-11-07T04:38:34Z

Hi, has this been done yet?

tmylk added the labels "difficulty medium" (Medium issue: required good gensim understanding & python skills) and "wishlist" (Feature request) on Apr 11, 2017

souravsingh mentioned this issue on Aug 19, 2017: Add Arun metric #1541 (Closed)
