How to accelerate the Representation step? #2158
shizhediao asked this question in Q&A (Unanswered)
Thank you for sharing the issue. Have you tried going through the example notebook in the README? It also shows some tricks for running c-TF-IDF on large datasets. It might also be worthwhile to check which representation model is slow for you. You technically have three: MMR, PoS, and c-TF-IDF. I believe you commented out the OpenAI one. Knowing which one is slow would help figure out where to optimize.
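To make "check which one is slow" concrete, here is a minimal sketch of a timing harness using only the Python standard library. The names `time_steps` and `slowest` are hypothetical helpers, and the `time.sleep` calls are stand-ins for the real workloads, not measurements of BERTopic itself.

```python
import time
from typing import Callable, Dict


def time_steps(steps: Dict[str, Callable[[], object]]) -> Dict[str, float]:
    """Run each named step once and return its wall-clock time in seconds."""
    timings: Dict[str, float] = {}
    for name, step in steps.items():
        start = time.perf_counter()
        step()
        timings[name] = time.perf_counter() - start
    return timings


def slowest(timings: Dict[str, float]) -> str:
    """Return the name of the step that took the longest."""
    return max(timings, key=timings.get)


# Stand-in workloads (assumption): replace each lambda with the real call
# for one representation model at a time.
steps = {
    "c-TF-IDF": lambda: time.sleep(0.01),
    "MMR": lambda: time.sleep(0.03),
    "PoS": lambda: time.sleep(0.05),
}

timings = time_steps(steps)
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {seconds:.3f}s")
print("Slowest step:", slowest(timings))
```

With an already fitted model, each stand-in could presumably be swapped for something like `lambda: topic_model.update_topics(docs, representation_model=model)`, passing one representation model per entry, so that only the representation step is re-run and timed. Running this on a small sample (for example the 100K docs) should be enough to see which model dominates.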
Hi,
I am using BERTopic to process a large dataset (> 10M docs).
Currently, processing 100K docs takes around 12 minutes. If that scales linearly, 10M docs would take about 1,200 minutes (roughly 20 hours), which is too slow.
Could you help me think about any acceleration methods? Thanks!
Here is my code:
This is my log: