Small changes to metrics computation + additional metrics #107

maumueller · 2019-03-06T13:10:44Z

This PR makes the following changes:

Added a mechanism that allows an implementation to report additional characteristics. This is showcased by having IVF and HNSW (from faiss) report the number of distance computations done to answer the set of queries (a more independent general measure of quality than running times).
Small refactorings in metric computation; report not only the average but also the standard deviation. (To be added later as error bars in plots.)
Store computed metrics inside the result file to avoid recomputation.
Improved plotting time by a factor of ~4 through better caching of hdf5 access and small changes to the metric computation.
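The caching and standard-deviation items above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: a plain dict stands in for the metrics stored in the hdf5 result file, and the names `recall_stats` and `get_or_compute` are hypothetical.

```python
import statistics


def recall_stats(per_query_recalls):
    """Summarise per-query recall values by mean and population standard deviation."""
    return {
        "mean": statistics.mean(per_query_recalls),
        "std": statistics.pstdev(per_query_recalls),
    }


def get_or_compute(cache, name, compute):
    """Return cache[name], computing and storing it only on first access.

    `cache` is a plain dict here; it is an assumed stand-in for the place
    in the result file where computed metrics would be kept.
    """
    if name not in cache:
        cache[name] = compute()
    return cache[name]


cache = {}
calls = []  # records how often the expensive computation actually runs


def compute():
    calls.append(1)
    return recall_stats([1.0, 0.5, 0.75, 0.75])


first = get_or_compute(cache, "knn", compute)
second = get_or_compute(cache, "knn", compute)  # served from the cache
```

The second lookup does not recompute anything, which is the point of storing metrics next to the results.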

The general goal of these changes is to make it easier to work with our result files, e.g. for visualization of results beyond those provided at the moment.

We use it to create plots like this, which provide more insight into what the distribution of individual recall values looks like compared to the average (the single dot):

This gives more detail about the inner workings of different algorithmic ideas. (To be discussed: should such plotting scripts be added to ann-benchmarks as well?)

With this mechanism an implementation can provide additional information, for example the number of distance computations required to answer all queries. Add this as a plotting variant.
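As a rough sketch of how such a mechanism might look: the method name `get_additional` comes from the commit message in this PR, but the class below, its signature, and the `dist_comps` key are assumptions made for illustration, not the actual faiss wrapper.

```python
class BruteForceCounting:
    """Toy exact-search index that counts distance computations (hypothetical)."""

    def __init__(self):
        self._points = []
        self._dist_comps = 0

    def fit(self, points):
        self._points = list(points)

    def query(self, q, n):
        scored = []
        for idx, p in enumerate(self._points):
            self._dist_comps += 1  # one distance computation per candidate
            dist = sum((a - b) ** 2 for a, b in zip(p, q))
            scored.append((dist, idx))
        scored.sort()
        return [idx for _, idx in scored[:n]]

    def get_additional(self):
        # Assumed shape: a dict of extra characteristics keyed by name.
        return {"dist_comps": self._dist_comps}


index = BruteForceCounting()
index.fit([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)])
for q in [(0.1, 0.0), (0.9, 0.1)]:
    index.query(q, 1)
extra = index.get_additional()
```

A benchmark runner could read this dict after the query phase and store its entries alongside the timing results, so that a plot can use the number of distance computations as an axis.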

erikbern · 2019-03-06T16:02:19Z

ann_benchmarks/plotting/metrics.py

+    for i in range(len(run_distances)):
+        t = threshold(dataset_distances[i], count, epsilon)
+        actual = 0
+        for j in range(min(count, len(run_distances[i]))):


not blocking but could have been a bit more elegant :)

for d in run_distances[i][:count]:
    if d <= t:

Thanks! Much better. Changed it in cdda15b. (Also removed the break statement because it assumes that algorithms sort their answers to queries.)
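For reference, a self-contained sketch of the simplified loop without the break. The body of `threshold` is an assumption (the distance of the `count`-th true neighbor, relaxed by `1 + epsilon`), and `count_within_threshold` is a hypothetical name; only the loop shape follows the suggestion above.

```python
def threshold(true_distances, count, epsilon):
    # Assumed definition: distance of the count-th true neighbor, slightly relaxed.
    return true_distances[count - 1] * (1.0 + epsilon)


def count_within_threshold(run_distances, dataset_distances, count, epsilon=1e-3):
    """Count, per query, how many of the first `count` answers lie within the threshold.

    No early exit: the answers are not assumed to be sorted by distance.
    """
    per_query = []
    for found, truth in zip(run_distances, dataset_distances):
        t = threshold(truth, count, epsilon)
        per_query.append(sum(1 for d in found[:count] if d <= t))
    return per_query


truth = [[0.1, 0.2, 0.3, 0.9]]      # true neighbor distances, sorted
unsorted_run = [[0.5, 0.1, 0.25]]   # an algorithm's answers, not sorted
hits = count_within_threshold(unsorted_run, truth, count=3)
```

With this input, two of the three answers fall within the threshold. A loop that stopped at the first distance above the threshold would count none of them, because the first answer (0.5) is already too far.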

erikbern · 2019-03-06T16:04:19Z

i think this change looks good – i don't find the plots super easy to understand, but the code seems fine!

erikbern · 2019-03-06T16:04:39Z

looks like build is passing except for flann (will solve separately)

maumueller · 2019-03-07T08:06:45Z

> looks like build is passing except for flann (will solve separately)

That seems to be related to flann-lib/flann#399. I've added the libraries mentioned there as explicit requirements in ce74df9, but there seems to be another issue with the Python wrapper.

maumueller · 2019-03-07T08:09:39Z

> i think this change looks good – i don't find the plots super easy to understand, but the code seems fine!

A caption is definitely necessary. ;-) We made quite a lot of new plots for a recent paper. I don't think all such scripts should go into the base repo. What do you think about a repo just for additional plotting functionality? (ann-benchmarks-plotting? It would be easiest if everything went under an ann-benchmarks organization.)

maumueller · 2019-03-07T08:30:53Z

Flann is still not building properly, opened up flann-lib/flann#406.

erikbern · 2019-03-07T17:51:46Z

thanks! i might disable flann for now

maumueller added 5 commits March 6, 2019 13:06

Cache computed metrics in hdf5 file and improve metric comp. time.

3a6e6f1

Add get_additional mechanism. Showcase it using FAISS.

97d6304

With this mechanism an implementation can provide additional information, for example the number of distance computations required to answer all queries. Add this as a plotting variant.

Introduced dummy object in the unit test to simulate caching.

a4eb030

Need to run plotting scripts as sudo because they update the hdf5 file.

724f15b

Change permission instead of running python as sudo in travis.

5e7da2d

erikbern reviewed Mar 6, 2019

maumueller added 2 commits March 7, 2019 08:50

FLANN needs explicit installation of libraries.

ce74df9

flann-lib/flann#399

Made loop simpler. Removed break statement.

cdda15b

I think we don't enforce that answers to queries should be sorted in some way. The break statement would be the only place where this would be visible.

erikbern merged commit 37a70ca into erikbern:master Mar 7, 2019

erikbern added a commit that referenced this pull request Apr 14, 2023

Merge pull request #107 from maumueller/precompute_metrics

752bd85
