Small changes to metrics computation + additional metrics #107
With this mechanism an implementation can provide additional information, for example the number of distance computations required to answer all queries. Add this as a plotting variant.
ann_benchmarks/plotting/metrics.py
    for i in range(len(run_distances)):
        t = threshold(dataset_distances[i], count, epsilon)
        actual = 0
        for j in range(min(count, len(run_distances[i]))):
not blocking but could have been a bit more elegant :)
    for d in run_distances[i][:count]:
        if d <= t:
Thanks! Much better. Changed it in cdda15b. (Also removed the break statement because it assumes that algorithms sort their answers to queries.)
i think this change looks good – i don't find the plots super easy to understand, but the code seems fine!
looks like build is passing except for flann (will solve separately)
I think we don't enforce that answers to queries should be sorted in some way. The break statement would be the only place where this would be visible.
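To illustrate the point about the break statement, here is a minimal sketch, not the code from the PR. The `threshold` definition (distance of the `count`-th true neighbor plus `epsilon`) and both helper functions are assumptions made for this example. It compares counting all hits among the first `count` answers with a variant that stops at the first miss.

```python
def threshold(true_distances, count, epsilon):
    # Assumed definition: distance of the count-th true neighbor plus a small slack.
    return true_distances[count - 1] + epsilon


def count_within_threshold(run_distances, true_distances, count, epsilon=1e-10):
    """Count every answer among the first `count` that lies within the threshold."""
    t = threshold(true_distances, count, epsilon)
    return sum(1 for d in run_distances[:count] if d <= t)


def count_with_break(run_distances, true_distances, count, epsilon=1e-10):
    """Hypothetical variant that stops at the first miss (assumes sorted answers)."""
    t = threshold(true_distances, count, epsilon)
    actual = 0
    for d in run_distances[:count]:
        if d > t:
            break
        actual += 1
    return actual


true_distances = [0.1, 0.2, 0.3, 0.9]
unsorted_answer = [0.3, 0.95, 0.1]  # two correct neighbors, returned out of order
sorted_answer = sorted(unsorted_answer)

full = count_within_threshold(unsorted_answer, true_distances, count=3)
early = count_with_break(unsorted_answer, true_distances, count=3)
print(full, early)
```

On the unsorted answer the early-exit variant undercounts, while both variants agree once the answer is sorted.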
That seems to be related to flann-lib/flann#399. I've added the libraries listed there as explicit requirements in ce74df9, but there seems to be another issue with the Python wrapper.
A caption is definitely necessary. ;-) We made quite a lot of new plots for a recent paper. I don't think that all such scripts should go into the base repo. What do you think about a repo just for additional plotting functionality?
Flann is still not building properly, opened up flann-lib/flann#406.
thanks! i might disable flann for now |
Small changes to metrics computation + additional metrics
This PR makes the following changes:
The general goal of these changes is to make it easier to work with our result files, e.g. for visualization of results beyond those provided at the moment.
We use it to create plots like this, which provide more insight into what the distribution of individual recall values looks like compared to the average (the single dot):

This gives more detail on the inner workings of different algorithmic ideas. (To be discussed: should such plotting scripts be added to ann-benchmarks as well?)
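As a rough sketch of the quantity such a plot would show, the snippet below computes one recall value per query alongside the average. The function name `per_query_recall`, the threshold rule (distance of the `count`-th true neighbor plus `epsilon`), and the toy data are all assumptions for illustration, not the result-file format or code used in ann-benchmarks.

```python
from statistics import mean


def per_query_recall(all_run_distances, all_true_distances, count, epsilon=1e-10):
    """Return one recall value per query instead of a single aggregate."""
    recalls = []
    for run, truth in zip(all_run_distances, all_true_distances):
        # Assumed threshold: distance of the count-th true neighbor plus a small slack.
        t = truth[count - 1] + epsilon
        hits = sum(1 for d in run[:count] if d <= t)
        recalls.append(hits / count)
    return recalls


# Toy data: four queries, two neighbors requested per query.
true = [[0.1, 0.2], [0.1, 0.2], [0.1, 0.2], [0.1, 0.2]]
runs = [[0.1, 0.2], [0.1, 0.5], [0.6, 0.7], [0.2, 0.1]]

recalls = per_query_recall(runs, true, count=2)
average = mean(recalls)
print(recalls, average)
```

Here the average hides that one query found nothing while two were answered perfectly, which is the kind of spread a per-query plot makes visible.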