Help understanding ApplySortingLsh? #3
Thank you, that helps. Some more questions:
Thanks,
Thank you, @shigeki. In the document you link to, I assume that the relevant part is this:
Two questions: I assume that the "Chrome-operated server-side pipeline" has already been computed and the results are in the contents of "cluster_data" - is that correct? Also, the step "The 50-bit hashes start in two big cohorts" makes sense to me, but I don't see how the code in Given a
I'm having trouble seeing how this is the same as what the document describes.
To expand: the document seems to say that the algorithm should do something like this (the numbers are made up for illustration):
Right.
That step was performed on Google's internal servers, not in floc_simulator.
Hi,
Thank you for your package - it is very nicely written and very easy to use.
I am trying to understand how the ApplySortingLsh function works. In particular: what do the values in cluster_data represent? I see that there are 33872 values in the array, so I assume there is one per cohort. But most of the values seem to be 35, 34, 36, 98, or 99. What do these values represent in the real world, and why and how are they used?
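For readers with the same question, here is a minimal Python sketch of one way to read those byte values. It assumes the encoding I believe Chromium's sorting-LSH lookup uses: the low 6 bits of each byte are an exponent n, meaning the cohort covers the next 2^n SimHash values, and bit 6 flags the cohort as blocked. Under that assumption 35 is an unblocked range of 2^35 hashes, while 98 (64 + 34) and 99 (64 + 35) are blocked ranges. The names `decode_entry` and `apply_sorting_lsh` and the mask constants are hypothetical and are not taken from floc_simulator.

```python
# Assumed encoding of one cluster_data byte (not confirmed by this thread):
BLOCKED_MASK = 0b1000000  # bit 6 set: cohort is blocked
SIZE_MASK = 0b0111111     # low 6 bits: log2 of the number of hashes covered


def decode_entry(byte):
    """Split one cluster_data byte into (exponent, is_blocked)."""
    return byte & SIZE_MASK, bool(byte & BLOCKED_MASK)


def apply_sorting_lsh(sim_hash, cluster_data, max_bits=50):
    """Return the index of the range containing sim_hash, or None if blocked.

    Entries are read in order; entry i covers the next 2**exponent hash
    values, so the cohort ID is the index of the first entry whose
    cumulative range end exceeds sim_hash.
    """
    cumulative = 0
    for index, byte in enumerate(cluster_data):
        exponent, blocked = decode_entry(byte)
        cumulative += 1 << exponent
        if cumulative > (1 << max_bits):
            raise ValueError("cluster data covers more than the hash space")
        if cumulative > sim_hash:
            return None if blocked else index
    raise ValueError("sim_hash is not covered by the cluster data")


# Toy 4-bit hash space split into four ranges of 2**2 hashes each;
# the third range is marked blocked (64 + 2).
toy_clusters = bytes([2, 2, 64 + 2, 2])
cohort = apply_sorting_lsh(5, toy_clusters, max_bits=4)
```

In the toy data, hash 5 falls in the second range, so `cohort` is index 1, and a hash in the third range would return None because that range is blocked. If the assumption holds, the real array with 33872 entries is just a much longer list of such ranges that together tile the 50-bit hash space.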