title change typo to R + white space formatting

UBC-DSCI · Dec 7, 2024 · 925c831
1 parent 9152eb2
commit 925c831
Showing 1 changed file with 46 additions and 46 deletions.
diff --git a/book/lectures/142-R-testing-example.qmd b/book/lectures/142-R-testing-example.qmd
@@ -1,54 +1,54 @@
 ---
-title: Python testing example
+title: R testing example
 ---
 
 ### Example of workflow for writing functions and tests for data science
 
-Let's say we want to write a function 
-for a task we repeatedly are performing in our data analysis. 
-For example, summarizing the number of observations in each class. 
-This is a common task performed for almost every classification problem 
-to examine how many classes there are to understand if we are facing a binary 
-or multi-class classification problem, 
-as well as to examine whether there are any class imbalances 
+Let's say we want to write a function
+for a task we repeatedly are performing in our data analysis.
+For example, summarizing the number of observations in each class.
+This is a common task performed for almost every classification problem
+to examine how many classes there are to understand if we are facing a binary
+or multi-class classification problem,
+as well as to examine whether there are any class imbalances
 that we may need to deal with before tuning our models.
 
 #### 1. Write the function specifications and documentation - but do not implement the function:
 The first thing we should do is write the function specifications and documentation. This can be effectively represented by an empty function and `roxygen2`-styled documentation in R as shown below:
 
 ```{r}
-#' Count class observations                                                    
-#'                                                                             
-#' Creates a new data frame with two columns, 
+#' Count class observations
+#'
+#' Creates a new data frame with two columns,
 #' listing the classes present in the input data frame,
 #' and the number of observations for each class.
 #'
 #' @param data_frame A data frame or data frame extension (e.g. a tibble).
 #' @param class_col unquoted column name of column containing class labels
 #'
-#' @return A data frame with two columns. 
+#' @return A data frame with two columns.
 #'   The first column (named class) lists the classes from the input data frame.
-#'   The second column (named count) lists the number of observations 
+#'   The second column (named count) lists the number of observations
 #'   for each class from the input data frame.
 #'   It will have one row for each class present in input data frame.
 #'
 #' @export
 #' @examples
 #' count_classes(mtcars, am)
-count_classes <- function(data_frame, class_col) {                           
+count_classes <- function(data_frame, class_col) {
   # returns a data frame with two columns: class and count
 }
 ```
 
 #### 2. Plan the test cases and document them:
 
-Next, we should plan out our test cases and start to document them. 
-At this point we can sketch out a skeleton for our test cases with code, 
-but we are not yet ready to write them, 
-as we first will need to reproducibly create test data 
-that is useful for assessing whether your function works as expected. 
-So considering our function specifications, 
-some kinds of input we might anticipate our function may receive, 
+Next, we should plan out our test cases and start to document them.
+At this point we can sketch out a skeleton for our test cases with code,
+but we are not yet ready to write them,
+as we first will need to reproducibly create test data
+that is useful for assessing whether your function works as expected.
+So considering our function specifications,
+some kinds of input we might anticipate our function may receive,
 and, correspondingly, what it should return are listed below:
 
 ##### Simple expected use test case #1
@@ -85,7 +85,7 @@ Dataframe (or tibble)
 
 ##### Simple expected use test case #2
 
-- Dataframe with 2 classes, with 2 observations for one class, 
+- Dataframe with 2 classes, with 2 observations for one class,
 and only one observation in the other
 
 *Function input:*
@@ -199,7 +199,7 @@ class_labels
 Error
 
 ```{r, eval=FALSE}
-Error : 
+Error :
   `data_frame` should be a dataframe or dataframe extension (e.g. a tibble)
 ```
 
@@ -215,21 +215,21 @@ With `testthat` we create a `test_that` statement for each related group of test
 
 library(testthat)
 
-test_that("`count_classes` should return a data frame, or tibble, 
-with the number of rows corresponding to the number of unique classes 
-in the `class_col` from the original dataframe. The new dataframe 
-will have a `class column` whose values are the unique classes, 
-and a `count` column, whose values will be the number of observations 
+test_that("`count_classes` should return a data frame, or tibble,
+with the number of rows corresponding to the number of unique classes
+in the `class_col` from the original dataframe. The new dataframe
+will have a `class column` whose values are the unique classes,
+and a `count` column, whose values will be the number of observations
 for each  class", {
   # "expected use cases" tests to be added here
 })
 
-test_that("`count_classes` should return an empty data frame, or tibble, 
+test_that("`count_classes` should return an empty data frame, or tibble,
 if the input to the function is an empty data frame", {
   # "edge cases" test to be added here
 })
 
-test_that("`count_classes` should throw an error when incorrect types 
+test_that("`count_classes` should throw an error when incorrect types
 are passed to the `data_frame` argument", {
   # "error" tests to be added here
 })
@@ -307,11 +307,11 @@ These are fall-back expectations that you can use when none of the other more sp
 ```{r}
 #| error: true
 
-test_that("`count_classes` should return a data frame, or tibble, 
-with the number of rows corresponding to the number of unique classes 
-in the `class_col` from the original dataframe. The new dataframe 
-will have a `class column` whose values are the unique classes, 
-and a `count` column, whose values will be the number of observations 
+test_that("`count_classes` should return a data frame, or tibble,
+with the number of rows corresponding to the number of unique classes
+in the `class_col` from the original dataframe. The new dataframe
+will have a `class column` whose values are the unique classes,
+and a `count` column, whose values will be the number of observations
 for each  class", {
   expect_s3_class(count_classes(two_classes_2_obs, class_labels),
                   "data.frame")
@@ -321,15 +321,15 @@ for each  class", {
                     two_classes_2_and_1_obs_output, ignore_attr = TRUE)
 })
 
-test_that("`count_classes` should return an empty data frame, or tibble, 
+test_that("`count_classes` should return an empty data frame, or tibble,
 if the input to the function is an empty data frame", {
   expect_equal(count_classes(one_class_2_obs, class_labels),
                     one_class_2_obs_output, ignore_attr = TRUE)
   expect_equal(count_classes(empty_df, class_labels),
                     empty_df_output, ignore_attr = TRUE)
 })
 
-test_that("`count_classes` should throw an error when incorrect types 
+test_that("`count_classes` should throw an error when incorrect types
 are passed to the `data_frame` argument", {
   expect_error(count_classes(two_classes_two_obs_as_list, class_labels))
 })
@@ -346,14 +346,14 @@ FINALLY!! We can write the function body for our function! And then call our tes
 ```{r}
 #' Count class observations
 #'
-#' Creates a new data frame with two columns, 
+#' Creates a new data frame with two columns,
 #' listing the classes present in the input data frame,
 #' and the number of observations for each class.
 #'
 #' @param data_frame A data frame or data frame extension (e.g. a tibble).
 #' @param class_col unquoted column name of column containing class labels
 #'
-#' @return A data frame with two columns. 
+#' @return A data frame with two columns.
 #'   The first column (named class) lists the classes from the input data frame.
 #'   The second column (named count) lists the number of observations for each class from the input data frame.
 #'   It will have one row for each class present in input data frame.
@@ -380,11 +380,11 @@ count_classes <- function(data_frame, class_col) {
 :::
 
 ```{r}
-test_that("`count_classes` should return a data frame, or tibble, 
-with the number of rows corresponding to the number of unique classes 
-in the `class_col` from the original dataframe. The new dataframe 
-will have a `class column` whose values are the unique classes, 
-and a `count` column, whose values will be the number of observations 
+test_that("`count_classes` should return a data frame, or tibble,
+with the number of rows corresponding to the number of unique classes
+in the `class_col` from the original dataframe. The new dataframe
+will have a `class column` whose values are the unique classes,
+and a `count` column, whose values will be the number of observations
 for each  class", {
   expect_s3_class(count_classes(two_classes_2_obs, class_labels),
                   "data.frame")
@@ -394,15 +394,15 @@ for each  class", {
                     two_classes_2_and_1_obs_output, ignore_attr = TRUE)
 })
 
-test_that("`count_classes` should return an empty data frame, or tibble, 
+test_that("`count_classes` should return an empty data frame, or tibble,
 if the input to the function is an empty data frame", {
   expect_equal(count_classes(one_class_2_obs, class_labels),
                     one_class_2_obs_output, ignore_attr = TRUE)
   expect_equal(count_classes(empty_df, class_labels),
                     empty_df_output, ignore_attr = TRUE)
 })
 
-test_that("`count_classes` should throw an error when incorrect types 
+test_that("`count_classes` should throw an error when incorrect types
 are passed to the `data_frame` argument", {
   expect_error(count_classes(two_classes_two_obs_as_list, class_labels))
 })
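
Note on the tests above: the hunks in this diff do not show the body of `count_classes`, so the behaviour being tested can be hard to picture. The following is a minimal base-R sketch of a function that would satisfy the documented specification, under the assumption that the input check comes first and that the result columns are named `class` and `count`. The name `count_classes_sketch` and the inline test input are hypothetical and are not taken from the lecture file; the actual implementation may well use a different approach.

```r
# Hypothetical base-R stand-in for `count_classes`;
# the real function body is not shown in this diff.
count_classes_sketch <- function(data_frame, class_col) {
  # Validate the input type first, so bad input fails with a clear message.
  if (!is.data.frame(data_frame)) {
    stop("`data_frame` should be a dataframe or dataframe extension (e.g. a tibble)")
  }
  # Capture the unquoted column name as a string without evaluating it.
  col_name <- deparse(substitute(class_col))
  # Tabulate the number of observations per class.
  counts <- table(data_frame[[col_name]])
  data.frame(
    class = as.character(names(counts)),
    count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Expected use: one row per class present in the column.
result <- count_classes_sketch(mtcars, am)

# Error case: capture the message so the check can confirm
# that the intended error was raised, not just any error.
msg <- tryCatch(
  count_classes_sketch(list(class_labels = c(1, 2)), class_labels),
  error = function(e) conditionMessage(e)
)
```

Checking the captured message is stricter than checking only that some error occurred. In `testthat`, a similar effect can be had by passing a pattern to `expect_error()` through its `regexp` argument, which guards against a test passing because of an unrelated failure.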