Use in-memory graph struct for storing and querying data #7
Comments
Question about networkx and C code:
networkx graph performance:
So, we can draw two conclusions:
So for the low-level APIs in Synonyms, maybe we should just use dict for quick access. However, networkx provides many useful APIs for higher-level tasks such as text summarization and NER, for example pagerank (the basis of textrank). https://networkx.github.io/documentation/latest/tutorial.html Unless networkx can achieve better performance, we will rely on dict. Another option is LevelDB. It may make installation less convenient, but it could help with both performance and capabilities.
I read the part of the NetworkX source code that finds neighbors, as follows:
So I guess there are two reasons for the slow query: (1) two type conversions, and (2) multiple function calls. I then modified the code like this:
It runs faster, only two times slower than dict.
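The two suspected costs can be illustrated without NetworkX. The following is a minimal sketch, not NetworkX's actual code: the adjacency data and every function name are hypothetical, and the wrapped version simply adds one extra call layer and two conversions on top of a plain dict lookup.

```python
import timeit

# Hypothetical toy adjacency data: word -> {neighbor: distance}.
ADJ = {
    "apple": {"pear": 0.2, "fruit": 0.1},
    "pear": {"apple": 0.2},
    "fruit": {"apple": 0.1},
}


def neighbors_direct(word):
    # A single dict lookup; returns the inner mapping without copying.
    return ADJ[word]


def _validate(word):
    # Extra call layer, standing in for the nested calls described above.
    if word not in ADJ:
        raise KeyError(word)


def neighbors_wrapped(word):
    # One more function call plus two conversions (dict -> iterator -> list).
    _validate(word)
    return list(iter(ADJ[word]))


def compare(number=100_000):
    # Returns (seconds_direct, seconds_wrapped) for `number` lookups each.
    direct = timeit.timeit(lambda: neighbors_direct("apple"), number=number)
    wrapped = timeit.timeit(lambda: neighbors_wrapped("apple"), number=number)
    return direct, wrapped
```

Calling `compare()` returns the two timings. The exact ratio depends on the machine and Python version, so no figure is claimed here.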
Great job. Do the modifications have side effects on other functions in networkx?
We could borrow the networkx code, make the changes now, and contribute them back afterwards.
https://github.com/snap-stanford/snap FYI, this library has a C++ interface.
@tyrinwu thanks.
description
Data is currently stored in Python dicts and lists, which is inefficient when building relationships among many words.
solution
Store each word's vector, its adjacent words, and their distances in a graph.
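As a rough illustration of that data model, here is a minimal sketch using only plain dicts. The `WordGraph` class, its method names, and the sample words and distances are all hypothetical; none of them come from Synonyms or from any of the libraries listed in this issue.

```python
class WordGraph:
    """Hypothetical in-memory store for word vectors and weighted edges."""

    def __init__(self):
        self.vectors = {}  # word -> vector (any sequence of floats)
        self.adj = {}      # word -> {adjacent word: distance}

    def add_word(self, word, vector):
        self.vectors[word] = vector
        self.adj.setdefault(word, {})

    def add_edge(self, a, b, distance):
        # Undirected edge: store the distance in both directions.
        self.adj.setdefault(a, {})[b] = distance
        self.adj.setdefault(b, {})[a] = distance

    def nearest(self, word, k=10):
        # Adjacent words sorted by ascending distance; empty list if unknown.
        pairs = self.adj.get(word, {}).items()
        return sorted(pairs, key=lambda kv: kv[1])[:k]


g = WordGraph()
g.add_word("apple", [0.1, 0.3])
g.add_word("pear", [0.2, 0.3])
g.add_word("car", [0.9, 0.7])
g.add_edge("apple", "pear", 0.1)
g.add_edge("apple", "car", 0.9)
closest = g.nearest("apple", k=1)
```

A dict-of-dicts layout keeps neighbor lookup to two hash accesses, which is the same basic shape a graph library would wrap.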
Possible solutions:
Option 1: graphlite
https://pypi.python.org/pypi/graphlite
Option 2: projx
http://davebshow.github.io/projx/getting-started/
Option 3: networkx
https://pypi.python.org/pypi/networkx/
Supports loading and dumping pickle files.
Which format should we use? https://networkx.github.io/documentation/networkx-1.9.1/reference/readwrite.html
Is it easy to install? It should stay installable via pip, with few dependencies, and cross-platform.
Does it support advanced queries, such as Cypher?
How is the performance?
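On the load/dump point above, a plain dict-of-dicts adjacency structure can also be round-tripped with the standard library `pickle` module, without any graph library. This is a minimal sketch: the helper names, the file name, and the toy data are hypothetical.

```python
import os
import pickle
import tempfile


def dump_graph(adj, path):
    # Serialize the adjacency mapping to a binary pickle file.
    with open(path, "wb") as f:
        pickle.dump(adj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_graph(path):
    # Only unpickle files from a trusted source.
    with open(path, "rb") as f:
        return pickle.load(f)


adj = {"apple": {"pear": 0.1}, "pear": {"apple": 0.1}}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "word_graph.pkl")  # hypothetical file name
    dump_graph(adj, path)
    restored = load_graph(path)
```

Pickle files are Python-specific, so a text format would be a better fit if the data must be read from other languages.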