Add docs for forcats (#53)
* Specify the calling env for verbs internally, for 1) better performance and 2) ensuring that `pipda.options.assume_all_piping` works thoroughly.

* Add forcats

* Fix linting

* Add tests for forcats

* Fix tests where the factor gets lost in the `fct_count()` result for pandas < 1.3

* Update docs
pwwang authored Sep 3, 2021
1 parent abdb098 commit a73bddd
Showing 9 changed files with 160 additions and 2 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -16,6 +16,8 @@ Unlike other similar packages in python that just mimic the piping syntax, `data

```shell
pip install -U datar
# to make sure dependencies are up to date
# pip install -U varname pipda datar
```

`datar` requires Python 3.7.1+ and uses `pandas` (1.2+) as its backend.
2 changes: 1 addition & 1 deletion datar/base/table.py
@@ -91,7 +91,7 @@ def tabulate(
bin: Union[ArrayLikeType, Categorical],
nbins: int = None,
) -> numpy.ndarray:
- """Takes the integer-valued vector bin and counts the
+ """Takes the integer-valued vector `bin` and counts the
number of times each integer occurs in it.
Args:
2 changes: 1 addition & 1 deletion datar/base/verbs.py
@@ -420,7 +420,7 @@ def complete_cases(_data: DataFrame) -> Iterable[bool]:
def proportions(
x: DataFrame, margin: Union[int, tuple, list] = None
) -> DataFrame:
- """Returns conditional proportions given margins (alias: prop_table)
+ """Returns conditional proportions given `margins` (alias: prop_table)
Args:
x: A numeric table
18 changes: 18 additions & 0 deletions docs/CHANGELOG.md
@@ -1,3 +1,21 @@
## 0.5.0

Added:

- Added `forcats` (#51)
- Added `base.is_ordered()`, `base.nlevels()`, `base.ordered()`, `base.rank()`, `base.order()`, `base.sort()`, `base.tabulate()`, `base.append()`, `base.prop_table()` and `base.proportions()`
- Added `gss_cat` dataset

Fixed:

- Fixed an issue with `Collection` handling `numpy.int_`

Enhanced:

- Added `base0_` argument for `datar.get()`
- Passed `__calling_env` to registered functions/verbs when used internally (this keeps the library robust across different environments)


## 0.4.4

- Adopt `varname` `v0.8.0`
2 changes: 2 additions & 0 deletions docs/reference-maps/ALL.md
@@ -8,6 +8,7 @@
|`dplyr`|APIs ported from `tidyverse/dplyr`|[:octicons-cross-reference-16:][2]|
|`tidyr`|APIs ported from `tidyverse/tidyr`|[:octicons-cross-reference-16:][4]|
|`tibble`|APIs ported from `tidyverse/tibble`|[:octicons-cross-reference-16:][1]|
|`forcats`|APIs ported from `tidyverse/forcats`|[:octicons-cross-reference-16:][9]|
|#|#|#|
|`datasets`|Datasets collected from `tidyverse` or other related packages|[:octicons-cross-reference-16:][3]|
|#|#|#|
@@ -21,3 +22,4 @@
[6]: ../datar
[7]: ../stats
[8]: ../utils
[9]: ../forcats
18 changes: 18 additions & 0 deletions docs/reference-maps/base.md
@@ -119,6 +119,9 @@
|[`levels`][44]|Get levels of factors|[:material-notebook:][4]|
|[`is_factor`][45] [`is_categorical`][45]|Test if data is factor|[:material-notebook:][4]|
|[`as_factor`][46] [`as_categorical`][46]|Cast data to factor|[:material-notebook:][4]|
|[`is_ordered`][140]|Check if a factor is ordered||
|[`nlevels`][141]|Get number of levels of a factor||
|[`ordered`][142]|Create an ordered factor||

### Logical/Boolean values

@@ -173,6 +176,9 @@
|[`sample`][64]|Sample the elements from sequence|[:material-notebook:][4]|
|[`length`][65]|Get the length of data|[:material-notebook:][4]|
|[`match`][129]|match returns a vector of the positions of (first) matches of its first argument in its second.||
|[`rank`][143]|Returns the sample ranks of the values in a vector.||
|[`order`][144]|Returns a permutation which rearranges its first argument into ascending or descending order||
|[`sort`][145]|Sorting or Ordering Vectors||
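
The three rows added above (`rank`, `order`, `sort`) mirror R's base functions. As a rough illustration of how they relate, here is a plain-Python sketch. It is not `datar`'s implementation: it returns 0-based positions (R's `order` is 1-based) and assumes average ranks for ties, which is R's default `ties.method`; whether `datar` uses the same defaults is not stated in this commit.

```python
def order(values, decreasing=False):
    """Return the 0-based indices that would sort `values` (stable sort)."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=decreasing)


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    idx = order(values)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(idx):
        # Extend `end` over the run of values tied with the one at `start`.
        end = start
        while end + 1 < len(idx) and values[idx[end + 1]] == values[idx[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[idx[k]] = mean_rank
        start = end + 1
    return ranks


x = [30, 10, 20, 10]
print(order(x))                  # [1, 3, 2, 0]
print(rank(x))                   # [4.0, 1.5, 3.0, 1.5]
print([x[i] for i in order(x)])  # [10, 10, 20, 30], i.e. the sorted vector
```

Indexing a vector with its `order` gives the sorted vector, which is why the last line doubles as a sketch of `sort`.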

### Special functions

@@ -220,6 +226,7 @@
|API|Description|Notebook example|
|---|---|---:|
|[`table`][91]|Cross Tabulation and Table Creation|[:material-notebook:][4]|
|[`tabulate`][146]|Takes the integer-valued vector `bin` and counts the number of times each integer occurs in it.||
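
To make the `tabulate` description concrete, the following plain-Python sketch counts how often each integer occurs. It follows R's convention (bins are the integers `1..nbins`, values outside that range are ignored), which is an assumption here; the actual `datar.base.tabulate` returns a `numpy.ndarray` and may differ in details such as index base.

```python
def tabulate(bins, nbins=None):
    """Count occurrences of each integer 1..nbins in `bins` (R-style sketch)."""
    if nbins is None:
        # R defaults to max(1, bin), so an empty input still yields one bin.
        nbins = max([1, *bins])
    counts = [0] * nbins
    for b in bins:
        if 1 <= b <= nbins:  # silently ignore out-of-range values
            counts[b - 1] += 1
    return counts


print(tabulate([2, 3, 3, 5]))           # [0, 1, 2, 0, 1]
print(tabulate([2, 3, 3, 5], nbins=3))  # [0, 1, 2]
```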

### Testing value types

@@ -267,7 +274,9 @@
|[`identity`][114]|Identity Function|[:material-notebook:][4]|
|[`expandgrid`][115]|Create a Data Frame from All Combinations of Factor Variables|[:material-notebook:][4]|
|[`max_col`][136]|Find the maximum position for each row of a matrix||
|[`append`][147]|Add elements to a vector.||
|[`complete_cases`][137]|Get a bool array indicating whether the values of rows are complete in a data frame.||
|[`proportions`][148], [`prop_table`][148]|Returns conditional proportions given `margins`||
|[`make_names`][138]|Make names available as columns and can be accessed by `df.<name>`||
|[`make_unique`][139]|Make the names unique, alias of `make_names(names, unique=True)`||
|[**`data_context`**][116]|Mimic R's `with`|[:material-notebook:][4]|
@@ -412,3 +421,12 @@
[137]: ../../api/datar.base.verbs/#datar.base.verbs.complete_cases
[138]: ../../api/datar.base.funs/#datar.base.funs.make_names
[139]: ../../api/datar.base.funs/#datar.base.funs.make_unique
[140]: ../../api/datar.base.factor/#datar.base.factor.is_ordered
[141]: ../../api/datar.base.factor/#datar.base.factor.nlevels
[142]: ../../api/datar.base.factor/#datar.base.factor.ordered
[143]: ../../api/datar.base.funs/#datar.base.funs.rank
[144]: ../../api/datar.base.seq/#datar.base.seq.order
[145]: ../../api/datar.base.seq/#datar.base.seq.sort
[146]: ../../api/datar.base.table/#datar.base.table.tabulate
[147]: ../../api/datar.base.verbs/#datar.base.verbs.append
[148]: ../../api/datar.base.verbs/#datar.base.verbs.proportions
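
The `proportions` / `prop_table` entry in this reference map is described as returning conditional proportions given `margins`. A minimal plain-Python sketch of that idea follows, using R's convention as an assumption: no margin divides by the grand total, `margin=1` by row sums and `margin=2` by column sums. The real function takes a `DataFrame`, and its margin indexing may differ.

```python
def proportions(table, margin=None):
    """Conditional proportions of a 2-D table given as a list of rows."""
    if margin is None:
        total = sum(sum(row) for row in table)
        return [[v / total for v in row] for row in table]
    if margin == 1:  # each row sums to 1
        return [[v / sum(row) for v in row] for row in table]
    if margin == 2:  # each column sums to 1
        col_sums = [sum(col) for col in zip(*table)]
        return [[v / s for v, s in zip(row, col_sums)] for row in table]
    raise ValueError("margin must be None, 1 or 2")


t = [[1, 3], [1, 3]]
print(proportions(t))     # [[0.125, 0.375], [0.125, 0.375]]
print(proportions(t, 1))  # [[0.25, 0.75], [0.25, 0.75]]
print(proportions(t, 2))  # [[0.5, 0.5], [0.5, 0.5]]
```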
3 changes: 3 additions & 0 deletions docs/reference-maps/datasets.md
@@ -65,6 +65,8 @@
|`seals`|Vector field of seal movements|[`r-ggplot2-seals`][28]|
|`txhousing`|Housing sales in TX|[`r-ggplot2-txhousing`][29]|
|`luv_colours`|`colors()` in Luv space|[`r-ggplot2-luv_colours`][30]|
|#|#|#|
|`gss_cat`|A sample of categorical variables from the General Social Survey|[`r-forcats-gss_cat`][32]|

[1]: https://github.com/tidyverse/nycflights13
[2]: https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/airquality
@@ -97,3 +99,4 @@
[29]: https://ggplot2.tidyverse.org/reference/txhousing.html
[30]: https://ggplot2.tidyverse.org/reference/luv_colours.html
[31]: https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/faithfulZZ
[32]: https://forcats.tidyverse.org/reference/gss_cat.html
114 changes: 114 additions & 0 deletions docs/reference-maps/forcats.md
@@ -0,0 +1,114 @@
<style>
.md-typeset__table {
min-width: 100%;
}

.md-typeset table:not([class]) {
display: table;
max-width: 80%;
}
</style>

## Reference of `datar.forcats`

Reference map of `r-tidyverse-forcats` can be found [here][1].

<u>**Legend:**</u>

|Sample|Status|
|---|---|
|[normal]()|API that is regularly ported|
|<s>[strike-through]()</s>|API that is not ported, or not an API originally|
|[**bold**]()|API that is unique in `datar`|
|[_italic_]()|Work in progress|

### Change order of levels

|API|Description|Notebook example|
|---|---|---:|
|[fct_relevel()][2]|Reorder factor levels by hand|[:material-notebook:][3]|
|[fct_inorder()][4] [fct_infreq()][5] [fct_inseq()][6]|Reorder factor levels by first appearance, frequency, or numeric order|[:material-notebook:][3]|
|[fct_reorder()][7] [fct_reorder2()][8] [last2()][9] [first2()][10]|Reorder factor levels by sorting along another variable|[:material-notebook:][3]|
|[fct_shuffle()][11]|Randomly permute factor levels|[:material-notebook:][3]|
|[fct_rev()][12]|Reverse order of factor levels|[:material-notebook:][3]|
|[fct_shift()][13]|Shift factor levels to left or right, wrapping around at end|[:material-notebook:][3]|
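
The reordering rules in the table above can be illustrated without the library. In this plain-Python sketch a factor is just a list of values plus a list of levels, and the helper names (`levels_in_order`, `levels_by_freq`, `levels_shifted`) are hypothetical, chosen so they are not mistaken for the `datar.forcats` functions. Tie handling in `levels_by_freq` (ties keep their existing level order) is an assumption of this sketch.

```python
from collections import Counter


def levels_in_order(values):
    """Levels ordered by first appearance in the data (cf. fct_inorder)."""
    return list(dict.fromkeys(values))


def levels_by_freq(values, levels):
    """Levels ordered by descending frequency (cf. fct_infreq)."""
    counts = Counter(values)
    # sorted() is stable, so tied levels keep their existing relative order.
    return sorted(levels, key=lambda lv: -counts[lv])


def levels_shifted(levels, n=1):
    """Shift levels left by n, wrapping around (cf. fct_shift)."""
    if not levels:
        return []
    n %= len(levels)
    return levels[n:] + levels[:n]


values = ["b", "a", "c", "b", "c", "c"]
levels = ["a", "b", "c"]
print(levels_in_order(values))         # ['b', 'a', 'c']
print(levels_by_freq(values, levels))  # ['c', 'b', 'a']
print(levels[::-1])                    # ['c', 'b', 'a'], reversed (cf. fct_rev)
print(levels_shifted(levels))          # ['b', 'c', 'a']
```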

### Change value of levels

|API|Description|Notebook example|
|---|---|---:|
|[fct_anon()][15]|Anonymise factor levels|[:material-notebook:][14]|
|[fct_collapse()][16]|Collapse factor levels into manually defined groups|[:material-notebook:][14]|
|[fct_lump()][17] [fct_lump_min()][18] [fct_lump_prop()][19] [fct_lump_n()][20] [fct_lump_lowfreq()][41]|Lump together factor levels into "other"|[:material-notebook:][14]|
|[fct_other()][21]|Replace levels with "other"|[:material-notebook:][14]|
|[fct_recode()][22]|Change factor levels by hand|[:material-notebook:][14]|
|[fct_relabel()][23]|Automatically relabel factor levels, collapse as necessary|[:material-notebook:][14]|
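
As a rough picture of what "collapse" and "replace with other" mean in the table above, the sketch below works on a plain `(values, levels)` pair. The helpers `collapse_levels` and `keep_levels` are hypothetical names for illustration, not the `datar.forcats` API, and the resulting level order (merged levels stay at their first position, `"Other"` goes last) is an assumption modelled on R's forcats.

```python
def collapse_levels(values, levels, groups):
    """Merge old levels into new ones; `groups` maps new level -> old levels."""
    mapping = {old: new for new, olds in groups.items() for old in olds}
    new_values = [mapping.get(v, v) for v in values]
    # Rename levels in place, then drop the duplicates the merge created.
    new_levels = list(dict.fromkeys(mapping.get(lv, lv) for lv in levels))
    return new_values, new_levels


def keep_levels(values, levels, keep, other="Other"):
    """Replace every level not in `keep` with a single `other` level."""
    new_values = [v if v in keep else other for v in values]
    new_levels = [lv for lv in levels if lv in keep]
    if any(lv not in keep for lv in levels):
        new_levels.append(other)  # the catch-all level goes last
    return new_values, new_levels


values = ["a", "b", "c", "d", "a"]
levels = ["a", "b", "c", "d"]
print(collapse_levels(values, levels, {"x": ["a", "c"]}))
# (['x', 'b', 'x', 'd', 'x'], ['x', 'b', 'd'])
print(keep_levels(values, levels, keep=["a", "b"]))
# (['a', 'b', 'Other', 'Other', 'a'], ['a', 'b', 'Other'])
```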

### Add/remove levels

|API|Description|Notebook example|
|---|---|---:|
|[fct_expand()][25]|Add additional levels to a factor|[:material-notebook:][24]|
|[fct_explicit_na()][26]|Make missing values explicit|[:material-notebook:][24]|
|[fct_drop()][27]|Drop unused levels|[:material-notebook:][24]|
|[fct_unify()][28]|Unify the levels in a list of factors|[:material-notebook:][24]|
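
The add/remove operations listed above are simple to state on a plain `(values, levels)` pair. The sketch below uses hypothetical helper names and `None` as the missing value; the `"(Missing)"` label follows the default in R's forcats and is assumed, not confirmed, for `datar`.

```python
def drop_unused(values, levels):
    """Keep only levels that actually occur in the data (cf. fct_drop)."""
    present = set(values)
    return [lv for lv in levels if lv in present]


def expand_levels(levels, *new):
    """Append new levels, skipping ones already present (cf. fct_expand)."""
    return list(dict.fromkeys([*levels, *new]))


def explicit_na(values, levels, na_level="(Missing)"):
    """Turn missing values into a level of their own (cf. fct_explicit_na)."""
    if None not in values:
        return list(values), list(levels)
    new_values = [na_level if v is None else v for v in values]
    return new_values, [*levels, na_level]


print(drop_unused(["a", "c"], ["a", "b", "c"]))   # ['a', 'c']
print(expand_levels(["a", "b"], "b", "z"))        # ['a', 'b', 'z']
print(explicit_na(["a", None, "b"], ["a", "b"]))
# (['a', '(Missing)', 'b'], ['a', 'b', '(Missing)'])
```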

### Combine multiple factors

|API|Description|Notebook example|
|---|---|---:|
|[fct_c()][29]|Concatenate factors, combining levels|[:material-notebook:][31]|
|[fct_cross()][30]|Combine levels from two or more factors to create a new factor|[:material-notebook:][31]|
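
For the two combining operations above, a plain-Python sketch may help. `concat_factors` and `cross_values` are hypothetical helpers operating on `(values, levels)` pairs; the `":"` separator matches the default in R's forcats and is an assumption for `datar`. The level order of a crossed factor is deliberately left out, since this commit does not describe it.

```python
def concat_factors(*factors):
    """Concatenate (values, levels) pairs, unioning levels in order (cf. fct_c)."""
    values = [v for vals, _ in factors for v in vals]
    levels = list(dict.fromkeys(lv for _, lvls in factors for lv in lvls))
    return values, levels


def cross_values(values1, values2, sep=":"):
    """Element-wise combination of two factors' values (cf. fct_cross)."""
    return [f"{a}{sep}{b}" for a, b in zip(values1, values2)]


f1 = (["a", "b"], ["a", "b"])
f2 = (["c", "a"], ["a", "c"])
print(concat_factors(f1, f2))        # (['a', 'b', 'c', 'a'], ['a', 'b', 'c'])
print(cross_values(f1[0], f2[0]))    # ['a:c', 'b:a']
```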

### Other helpers

|API|Description|Notebook example|
|---|---|---:|
|[as_factor()][33]|Convert input to a factor|[:material-notebook:][32]|
|[fct_count()][34]|Count entries in a factor|[:material-notebook:][32]|
|[fct_match()][35]|Test for presence of levels in a factor|[:material-notebook:][32]|
|[fct_unique()][36]|Unique values of a factor|[:material-notebook:][32]|
|[lvls_reorder()][37] [lvls_revalue()][38] [lvls_expand()][39]|Low-level functions for manipulating levels|[:material-notebook:][32]|
|[lvls_union()][40]|Find all levels in a list of factors|[:material-notebook:][32]|
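
Two of the helpers above, counting entries and taking the union of levels, can be sketched in a few lines of plain Python. The names `count_levels` and `union_levels` are hypothetical; reporting zero counts for unused levels is how R's `fct_count()` behaves and is assumed here for illustration.

```python
from collections import Counter


def count_levels(values, levels):
    """Count entries per level, in level order, including unused levels."""
    counts = Counter(values)
    return [(lv, counts[lv]) for lv in levels]


def union_levels(*level_lists):
    """All levels across several factors, in order of first appearance."""
    return list(dict.fromkeys(lv for levels in level_lists for lv in levels))


print(count_levels(["a", "c", "a"], ["a", "b", "c"]))
# [('a', 2), ('b', 0), ('c', 1)]
print(union_levels(["a", "b"], ["c", "a"]))  # ['a', 'b', 'c']
```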

[1]: https://forcats.tidyverse.org/reference/index.html
[2]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_relevel
[3]: ../../notebooks/forcats_lvl_order
[4]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_inorder
[5]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_infreq
[6]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_inseq
[7]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_reorder
[8]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_reorder2
[9]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.last2
[10]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.first2
[11]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_shuffle
[12]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_rev
[13]: ../../api/datar.forcats.lvl_order/#datar.forcats.lvl_order.fct_shift
[14]: ../../notebooks/forcats_lvl_value
[15]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_anon
[16]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_collapse
[17]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_lump
[18]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_lump_min
[19]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_lump_prop
[20]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_lump_n
[21]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_other
[22]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_recode
[23]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_relabel
[24]: ../../notebooks/forcats_lvl_addrm
[25]: ../../api/datar.forcats.lvl_addrm/#datar.forcats.lvl_addrm.fct_expand
[26]: ../../api/datar.forcats.lvl_addrm/#datar.forcats.lvl_addrm.fct_explicit_na
[27]: ../../api/datar.forcats.lvl_addrm/#datar.forcats.lvl_addrm.fct_drop
[28]: ../../api/datar.forcats.lvl_addrm/#datar.forcats.lvl_addrm.fct_unify
[29]: ../../api/datar.forcats.fct_multi/#datar.forcats.fct_multi.fct_c
[30]: ../../api/datar.forcats.fct_multi/#datar.forcats.fct_multi.fct_cross
[31]: ../../notebooks/forcats_fct_multi
[32]: ../../notebooks/forcats_misc
[33]: ../../api/datar.forcats.misc/#datar.forcats.misc.as_factor
[34]: ../../api/datar.forcats.misc/#datar.forcats.misc.fct_count
[35]: ../../api/datar.forcats.misc/#datar.forcats.misc.fct_match
[36]: ../../api/datar.forcats.misc/#datar.forcats.misc.fct_unique
[37]: ../../api/datar.forcats.misc/#datar.forcats.misc.lvls_reorder
[38]: ../../api/datar.forcats.misc/#datar.forcats.misc.lvls_revalue
[39]: ../../api/datar.forcats.misc/#datar.forcats.misc.lvls_expand
[40]: ../../api/datar.forcats.misc/#datar.forcats.misc.lvls_union
[41]: ../../api/datar.forcats.lvl_value/#datar.forcats.lvl_value.fct_lump_lowfreq
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -41,6 +41,7 @@ nav:
'dplyr': 'reference-maps/dplyr.md'
'tibble': 'reference-maps/tibble.md'
'tidyr': 'reference-maps/tidyr.md'
'forcats': 'reference-maps/forcats.md'
'datasets': 'reference-maps/datasets.md'
'datar': 'reference-maps/datar.md'
- 'Porting rules': 'porting_rules.md'