Write vignette on dimension ordering #140

Open
charlie-gallagher opened this issue Apr 26, 2022 · 2 comments

Comments

@charlie-gallagher
Collaborator

Dimension ordering can make a dramatic difference in the time it takes to run a query, and we've included this point in the documentation.

However, I don't think we've mentioned how dimension ordering can change your results. We should outline in a vignette some guidelines for when dimension ordering matters, and when you can fiddle with it. At the very least, it deserves a comment in the documentation.

@gilliganondata
Collaborator

I'm thinking what you're calling out is that the top value at each level is going to return the "top x" at that level. So, if you have dim_a and dim_b and are reporting visits and top = c(5,5), then:

  • dimensions = c("dim_a", "dim_b") would return (up to) 25 rows based on the top 5 values of dim_a overall, and then the top 5 values of dim_b within each value of dim_a
  • dimensions = c("dim_b", "dim_a") may feel like it would return the same thing, just with the first two columns reversed, but it might not, because the dim_b values returned would be the top 5 values of dim_b overall, and then the top 5 values of dim_a within each value of dim_b
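
To make the asymmetry concrete, here is a toy simulation in Python. It is only a sketch of the nested "top x at each level" behavior described above, not the package's code or the API's actual ranking logic. The `nested_top` helper, the sample rows, and the tie-breaking rule (alphabetical) are all made up for illustration, and `top` is set to (1, 1) rather than c(5,5) to keep the data small.

```python
from collections import defaultdict

# Hypothetical visit counts for each (dim_a, dim_b) combination.
rows = [
    {"dim_a": "a1", "dim_b": "b1", "visits": 10},
    {"dim_a": "a1", "dim_b": "b2", "visits": 9},
    {"dim_a": "a2", "dim_b": "b2", "visits": 15},
]


def top_values(rows, dim, n):
    """Return the n values of `dim` with the most visits (ties broken alphabetically)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dim]] += row["visits"]
    ranked = sorted(totals, key=lambda value: (-totals[value], value))
    return [(value, totals[value]) for value in ranked[:n]]


def nested_top(rows, dimensions, top):
    """Take the top values of the first dimension overall, then the top
    values of the second dimension within each of those first-level values."""
    first, second = dimensions
    n_first, n_second = top
    result = []
    for first_value, _ in top_values(rows, first, n_first):
        subset = [row for row in rows if row[first] == first_value]
        for second_value, visits in top_values(subset, second, n_second):
            result.append({first: first_value, second: second_value, "visits": visits})
    return result


a_then_b = nested_top(rows, ("dim_a", "dim_b"), (1, 1))
b_then_a = nested_top(rows, ("dim_b", "dim_a"), (1, 1))

print(a_then_b)  # [{'dim_a': 'a1', 'dim_b': 'b1', 'visits': 10}]
print(b_then_a)  # [{'dim_b': 'b2', 'dim_a': 'a2', 'visits': 15}]
```

With dim_a first, a1 wins overall (19 visits) and b1 is its top dim_b value. With dim_b first, b2 wins overall (24 visits) and a2 is its top dim_a value. The two orderings return different combinations, not the same rows with the columns swapped.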

(Um...okay...I see what you mean. This warrants some thought on how best to explain in the documentation and warrants a vignette to illustrate! And...is this what you're referencing?)

Do you have a couple of standard dimensions in mind that would be illustrative? I'm thinking (relatively) higher cardinality dimensions—page and mobiledevicename?

@charlie-gallagher
Collaborator Author

Exactly, and that is a good start towards an explanation. It does deserve some thought to have a clear and simple example of when switching the order isn't the same. I don't have any standard dimensions in mind yet, but I'll think about it for a bit.
