Commit

Merge branch 'main' into patch-1
bundfussr authored Feb 6, 2025
2 parents 7068dd6 + a6feba3 commit 88fc1fe
Showing 6 changed files with 44 additions and 8 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: admiral
Title: ADaM in R Asset Library
-Version: 1.2.0.9005
+Version: 1.2.0.9006
Authors@R: c(
person("Ben", "Straub", , "[email protected]", role = c("aut", "cre")),
person("Stefan", "Bundfuss", role = "aut",
1 change: 1 addition & 0 deletions NEWS.md
@@ -4,6 +4,7 @@

## Updates of Existing Functions

+- The function `extract_duplicate_records()` was updated to consider all variables in the input dataset for the by group if the `by_vars` argument is omitted entirely. (#2644)
- In `slice_derivation`, previously the derivation is not called for empty subsets, however this can lead to issues when the input dataset is empty. Now the derivation is called for all subsets.

## Breaking Changes
16 changes: 13 additions & 3 deletions R/duplicates.R
@@ -39,6 +39,11 @@ get_duplicates_dataset <- function() {
#' @param by_vars Grouping variables
#'
#' Defines groups of records in which to look for duplicates.
+#' If omitted, all variables in the input dataset are used in the by group.
+#'
+#' **Note:** Omitting `by_vars` will increase the function's run-time, so it is
+#' recommended to specify the necessary grouping variables for large datasets
+#' whenever possible.
#'
#' `r roxygen_param_by_vars()`
#'
@@ -55,9 +60,14 @@ get_duplicates_dataset <- function() {
#' adsl <- rbind(admiral_adsl[1L, ], admiral_adsl)
#'
#' extract_duplicate_records(adsl, exprs(USUBJID))
-extract_duplicate_records <- function(dataset, by_vars) {
-  assert_expr_list(by_vars)
-  assert_data_frame(dataset, required_vars = extract_vars(by_vars), check_is_grouped = FALSE)
+extract_duplicate_records <- function(dataset, by_vars = NULL) {
+  if (is.null(by_vars)) {
+    assert_data_frame(dataset, check_is_grouped = FALSE)
+    by_vars <- exprs(!!!parse_exprs(names(dataset)))
+  } else {
+    assert_expr_list(by_vars)
+    assert_data_frame(dataset, required_vars = extract_vars(by_vars), check_is_grouped = FALSE)
+  }

data_by <- dataset %>%
ungroup() %>%
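Note on the `by_vars` default in `R/duplicates.R`: the new branch turns every column name into an expression via `parse_exprs()`. The sketch below is not part of the commit; it only illustrates what that call produces, using a made-up two-column data frame. The `syms()` variant is an alternative added here for comparison and is an assumption about what could be used, not what the commit does. It avoids parsing, so it would presumably also cope with non-syntactic column names such as `"my var"`.

```r
library(rlang)

# Hypothetical toy dataset; column names borrowed from the tests in this commit.
dataset <- data.frame(
  USUBJID = c("P01", "P04", "P04"),
  COUNTRY = c("GER", "BRA", "BRA")
)

# Approach used in the diff: parse each column name into an expression and
# splice the results into a list of expressions.
by_vars_parsed <- exprs(!!!parse_exprs(names(dataset)))

# Alternative sketch (NOT in the commit): create symbols directly from the
# column names, without going through the parser.
by_vars_syms <- syms(names(dataset))

# For syntactic column names, every element of both lists is a symbol.
all(vapply(by_vars_parsed, is.symbol, logical(1)))
all(vapply(by_vars_syms, is.symbol, logical(1)))
```

Either list can be handed to code that expects an `exprs()`-style list of grouping variables.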
7 changes: 6 additions & 1 deletion man/extract_duplicate_records.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion tests/testthat/_snaps/duplicates.md
@@ -1,4 +1,4 @@
-# signal_duplicate_records Test 2: dataset of duplicate records can be accessed using `get_duplicates_dataset()`
+# signal_duplicate_records Test 3: dataset of duplicate records can be accessed using `get_duplicates_dataset()`

Code
get_duplicates_dataset()
24 changes: 22 additions & 2 deletions tests/testthat/test-duplicates.R
@@ -18,9 +18,29 @@ test_that("extract_duplicate_records Test 1: duplicate records are extracted", {
)
})

+## Test 2: duplicate records for all variables ----
+test_that("extract_duplicate_records Test 2: duplicate records for all variables", {
+  input <- tibble::tribble(
+    ~USUBJID, ~COUNTRY, ~AAGE,
+    "P01", "GER", 22,
+    "P01", "JPN", 34,
+    "P02", "CZE", 41,
+    "P03", "AUS", 39,
+    "P04", "BRA", 21,
+    "P04", "BRA", 21
+  )
+  expected_ouput <- input[c(5:6), ]
+
+  expect_equal(
+    expected_ouput,
+    extract_duplicate_records(input)
+  )
+})


# signal_duplicate_records ----
-## Test 2: dataset of duplicate records can be accessed using `get_duplicates_dataset()` ----
-test_that("signal_duplicate_records Test 2: dataset of duplicate records can be accessed using `get_duplicates_dataset()`", { # nolint
+## Test 3: dataset of duplicate records can be accessed using `get_duplicates_dataset()` ----
+test_that("signal_duplicate_records Test 3: dataset of duplicate records can be accessed using `get_duplicates_dataset()`", { # nolint
input <- tibble::tribble(
~USUBJID, ~COUNTRY, ~AAGE,
"P01", "GER", 22,
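Illustration of the new default (not part of the commit): with `by_vars` omitted, the documented behavior is that every column of the input forms the by group, so only rows that are identical across all variables count as duplicates. The sketch below reproduces that idea with plain dplyr on the same data as Test 2. It is an assumption-laden stand-in for the behavior, not admiral's internal implementation.

```r
library(dplyr)

# Same input as "extract_duplicate_records Test 2" in this commit.
input <- tibble::tribble(
  ~USUBJID, ~COUNTRY, ~AAGE,
  "P01", "GER", 22,
  "P01", "JPN", 34,
  "P02", "CZE", 41,
  "P03", "AUS", 39,
  "P04", "BRA", 21,
  "P04", "BRA", 21
)

# Group by every column and keep only groups with more than one row.
# This mirrors the documented semantics, not admiral's actual code path.
duplicates <- input %>%
  group_by(across(everything())) %>%
  filter(n() > 1L) %>%
  ungroup()

duplicates
```

With an admiral version that includes this commit, `extract_duplicate_records(input)` is expected (per Test 2) to return the same two `"P04"` rows. Grouping on every column is also why the roxygen note warns about run-time on large datasets.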
