Add some doc about transition from tidyverse? #130
Comments
This would indeed be nice, but given how rapidly the API is evolving, it would be better to wait a bit before preparing a document like this. It would ideally look something like this: https://dplyr.tidyverse.org/articles/base.html
Yes, exactly. I didn't know about this table; it's super useful.
One question is whether we want to mimic most/all of the important functions. While
I would say no: we simply don't have the manpower to develop and maintain a full alternative to the data-wrangling abilities of dplyr, especially since they basically developed a whole new architecture to support their group_by -> summarize pipeline. Plus, I don't think we have any legitimate reason (aside from our own hubris ^^) to present a full alternative to the tidyverse. Most users will have tidyverse installed anyway, so the dependency argument doesn't really hold for regular users. Coexistence ftw. Also, the scope of easystats and that of tidyverse are arguably somewhat different; in particular, datawizard could be framed more explicitly as "data preprocessing / cleaning" (implied: before doing stats) rather than pure data wrangling.
I agree with Dom. Trying to mimic dplyr/tidyr would expand the scope of easystats beyond what we currently have in mind. The existing data wrangling functions have organically materialized out of our "0-external-dependency principle", and we should continue to operate the same way, adding only those data wrangling functions that are needed in the ecosystem, without being concerned about whether the suite of functionality is comparable to that provided by tidyverse. In the near future, I think other developers might also be interested in using datawizard to keep their dependencies to a minimum, and if they request or implement features that mimic tidyverse, then that will be a welcome addition! But, for now, our modus operandi should be to develop only what we need for the ecosystem.
Ok, agreed
As a matter of fact, datawizard could benefit from some of the functionality that janitor has, and from other features (missing value imputation, etc.).
I still use the tidyverse a lot because I know the main functions and how they work. I think one thing missing from the docs is an article/table that shows equivalent functions between the tidyverse and easystats ecosystems. This would do two things: 1) reduce the time spent looking for the equivalent of a function we use a lot with tidyverse, and 2) highlight which easystats functions don't have an equivalent in tidyverse. I don't have anything complicated in mind, maybe simply a table like:

| tidyverse | easystats |
| --- | --- |
| pull | data_extract |
| rename | data_rename |
| replace_na | convert_na_to |

To go further, this could be accompanied by some examples showing how to convert tidyverse workflows (just a few functions separated by a pipe) to easystats workflows. What do you think?
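
For instance, here is a minimal sketch of what such a side-by-side conversion might look like, using only the three function pairs from the table above. The toy data frame and the column names (`score`, `points`) are made up for illustration, and the datawizard argument names (`select`, `replace_num`) are assumed from recent releases; since the API is still changing, they may differ in the installed version. The native pipe `|>` needs R >= 4.1.

```r
# Hypothetical toy data: one numeric column with a missing value.
df <- data.frame(id = 1:3, score = c(1, NA, 3))

# tidyverse workflow: fill NAs, rename the column, extract it as a vector.
tidy_result <- df |>
  tidyr::replace_na(list(score = 0)) |>
  dplyr::rename(points = score) |>
  dplyr::pull(points)

# easystats (datawizard) workflow with the equivalents from the table.
# Argument names are assumptions based on recent datawizard releases.
easy_result <- df |>
  datawizard::convert_na_to(select = "score", replace_num = 0) |>
  datawizard::data_rename("score", "points") |>
  datawizard::data_extract("points")
```

Both pipelines are meant to return the `points` column as a plain numeric vector with the missing value replaced by 0. The calls are namespace-qualified so it stays obvious which package each step comes from.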