Review: Ch 3 (wrangling) #103
Comments
Reviewer B:
ML: No changes needed here
Reviewer D
Reviewer A
Reviewer C
Adding Tiffany's comment from #92
these are all addressed or new issues
Reviewer E:
across(): https://dplyr.tidyverse.org/reference/summarise_all.html
select helpers (e.g. matches, starts_with) should be covered, since this can make coding much more efficient.
map in favour of summarize + across in wrangling (ch 3)? #193
Ch titles/subtitles: Section 3.6-3.7: Both of these use “iterate” in the title to describe the group_by+summarize workflow as well as the purrr workflow. This may lead to confusion about how the two methods differ. I would tend to call group_by+summarize something like “aggregation” or “subgroup analysis/summarization” to distinguish it from mapping. I think iteration is a better description for purrr, since it works more like traditional for-loop style iteration (same number of elements returned as inputted).
Explicitly try to define “wide” versus “long” before using the terms. It should be intuitive but might trip students up.
Explain why each situation might arise. Students may wonder why anyone would produce the “wrong” form of data, but discussion could explain, for example, that human-readable tables are often in “wide” format, but for visualization and analysis we want them “long”.
For the table that is “too long”, the important column (containing the future column names) is truncated. This makes it harder to understand the concept and why this is a bad format.
I love the repeated checking of the tidy criteria after each example, but I wonder if it would be more effective to explain, before the transformation, the shape we are trying to get the data into, so students know what the goal is from the outset.
Consider showing the “nested” alternative to the pipe [f(g(h(x)))] to demonstrate how the pipe improves readability and makes the order of operations clear.
Consider providing advice on when to break a pipe and assign to an intermediate variable. For example, it can be overwhelming to have 10 functions piped to one another. Additionally, one might want to save the data they have prepared before feeding it into a plotting or model function, so that they can alter the plotting or model function without having to continually redo all of their data transformations.
In 3.6: See 3 for content suggestions motivated by this part in particular
ML: this pertains to first comment in this issue
See 3 for content suggestions motivated by this part in particular
The NA issue with sum() could be explained better. The text reads as if this is a purrr issue, so it could be clarified that it is a sum() issue that can be addressed while using purrr.
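To make the sum()/NA point concrete, here is a minimal sketch of how it might be shown. The data frame `df` and its columns are hypothetical toy data, not taken from the chapter. It shows that plain sum() already returns NA without purrr involved, and that the fix lives in the call to sum().

```r
library(purrr)

# Hypothetical toy data (an assumption for illustration, not from the book).
df <- data.frame(a = c(1, 2, NA), b = c(4, 5, 6))

# sum() on its own returns NA when any element is missing; purrr is not involved.
plain_sum <- sum(df$a)

# map_dbl() applies sum() to each column and simply passes the NA along.
with_na <- map_dbl(df, sum)

# The fix belongs to sum(): drop missing values with na.rm = TRUE inside the mapped function.
without_na <- map_dbl(df, function(col) sum(col, na.rm = TRUE))
```

Framing it this way keeps the mapping step unchanged and isolates the behaviour to sum(), which may help students see where the NA actually comes from.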