Add some doc about transition from tidyverse? #130

etiennebacher · 2022-03-21T09:10:21Z

I still use the tidyverse a lot because I know the main functions and how they work. I think one thing missing from the docs is an article/table that shows equivalent functions between the tidyverse and easystats ecosystems. This would do two things: 1) reduce the time spent looking for the equivalent of a function we use a lot with tidyverse, 2) highlight which easystats functions don't have an equivalent in tidyverse. I don't have anything complicated in mind, maybe simply a table like:

| tidyverse | easystats |
| --- | --- |
| `pull` | `data_extract` |
| `rename` | `data_rename` |
| `replace_na` | `convert_na_to` |
| ... | ... |

To go further, this could be accompanied by some examples showing how to convert tidyverse workflows (just a few functions separated by a pipe) to easystats workflows. What do you think?

IndrajeetPatil · 2022-03-21T09:45:57Z

This would indeed be nice, but given how rapidly the API is evolving and changing, it would be better to wait for a bit before preparing a document like this.

It would ideally look something like this: https://dplyr.tidyverse.org/articles/base.html

etiennebacher · 2022-03-21T09:48:53Z

> It would ideally look something like this: https://dplyr.tidyverse.org/articles/base.html

Yes, exactly. I didn't know about this table, it's super useful.

strengejacke · 2022-06-09T20:25:58Z

One question is whether we want to mimic most/all important functions. While mutate() can be replaced somehow by transform(), a summarise() equivalent (that also works on grouped data frames) is missing, and aggregate() is by design only a poor substitute, as you can only apply one function... So, do we also want something like summarise() in datawizard, or do we promote co-existence with dplyr / tidyr?
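To make the comparison concrete, here is a minimal base R sketch of the two substitutes mentioned above. It uses only base R and the built-in `mtcars` data; the column name `kpl` and the conversion factor are illustrative assumptions, not anything taken from datawizard.

```r
# transform() as a rough stand-in for dplyr::mutate():
# add a derived column (approximate miles/gallon to km/litre conversion).
cars <- transform(mtcars, kpl = mpg * 0.425)

# aggregate() as a rough stand-in for group_by() + summarise():
# it takes a single FUN per call.
mean_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# One common workaround is a FUN that returns several values,
# but the result is then stored as a matrix column, not as separate columns.
stats_by_cyl <- aggregate(
  mpg ~ cyl,
  data = cars,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

print(mean_by_cyl)
print(is.matrix(stats_by_cyl$mpg))
```

The matrix column in `stats_by_cyl` is part of what makes `aggregate()` awkward compared with `summarise()`, which returns one regular column per summary.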

DominiqueMakowski · 2022-06-10T03:51:26Z

I would say no, we simply don't have the manpower to develop and maintain a full alternative to the data-wrangling abilities of dplyr, especially since they basically developed a whole new architecture to support their group_by -> summarize pipeline.

Plus, I don't think we have any legitimate reason (aside from our own hubris ^^) to present a full alternative to the tidyverse. Most users will have tidyverse installed anyway, so the dependency argument doesn't really hold for regular users. Coexistence ftw

And the scope of easystats and tidyverse in general is arguably somewhat different, and in particular datawizard could be more explicitly framed as "data preprocessing / cleaning" (implied: before doing stats) than pure data wrangling

IndrajeetPatil · 2022-06-10T07:19:45Z

I agree with Dom. Trying to mimic dplyr/tidyr will expand the scope of easystats beyond what we currently have in mind.

The existing data wrangling functions have organically materialized out of our "0-external-dependency principle", and we should continue to operate the same way, adding only those data wrangling functions which are needed in the ecosystem without being concerned whether the suite of functionality is comparable to that provided by tidyverse.

In the near future, I think other developers might also be interested in using datawizard to keep their dependencies to a minimum and if they request or implement features that mimic tidyverse, then that will be a welcome addition!

But, for now, our modus operandi should be to develop only what we need for the ecosystem.

strengejacke · 2022-06-10T07:48:11Z

Ok, agreed

DominiqueMakowski · 2022-06-10T09:02:20Z

As a matter of fact, datawizard could benefit from some functionality like what janitor has, and from other features (missing value imputation etc.)

IndrajeetPatil added the Docs 📚 label (Improvements or additions to documentation) Mar 21, 2022

IndrajeetPatil linked a pull request Jul 1, 2022 that will close this issue

Add a vignette about equivalence with tidyverse #183

Merged

etiennebacher mentioned this issue Jul 5, 2022

add a data_arrange()? #193

Closed

IndrajeetPatil closed this as completed in #183 Jul 26, 2022
