rtichoke

R-CMD-check | CRAN status | Lifecycle: experimental | Codecov test coverage

For reproducible examples, please visit the rtichoke blog.

Installation

You can install rtichoke from GitHub with:

# install.packages("devtools")
devtools::install_github("uriahf/rtichoke")

Overview:

  • rtichoke helps analysts explore performance metrics for predictions of a binary outcome, using interactive visualization.

Getting started

Predictions and Outcomes as input

To use rtichoke you need:

  • probs: estimated probabilities, used as predictions.
  • reals: binary outcomes.

There are three cases, and rtichoke expects a different kind of input for each:

Single Model:

Provide a list with one vector of predictions and a list with one vector of outcomes.

create_roc_curve(
  probs = list(example_dat$bad_model),
  reals = list(example_dat$outcome)
)
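
As a minimal sketch of where probs and reals might come from, the example below fits a logistic regression on the built-in mtcars data and extracts both vectors. The dataset, the model formula, and the variable names are assumptions made for illustration; they are not part of rtichoke. The rtichoke call is left commented out so the block runs without the package installed.

```r
# Hypothetical example: a logistic regression on the built-in mtcars data
# (chosen only for illustration, not part of rtichoke).
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# probs: estimated probabilities in [0, 1], one per observation
probs <- unname(predict(fit, type = "response"))

# reals: observed binary outcomes coded as 0 / 1
reals <- mtcars$am

# Sanity checks on the two inputs before plotting
stopifnot(
  length(probs) == length(reals),
  all(probs >= 0 & probs <= 1),
  all(reals %in% c(0, 1))
)

# Single-model case: wrap each vector in a list.
# Uncomment if rtichoke is installed:
# rtichoke::create_roc_curve(probs = list(probs), reals = list(reals))
```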

Models Comparison:

Why? To compare the performance of several models on the same population.

How? Provide a list with one vector of predictions per model and a list with a single vector of outcomes for the population.

create_roc_curve(
  probs = list(
    "Good Model" = example_dat$estimated_probabilities,
    "Bad Model" = example_dat$bad_model,
    "Random Guess" = example_dat$random_guess
  ),
  reals = list(example_dat$outcome)
)

Several Populations:

Why? To compare performance across different populations, such as a train/test split, or to check the fairness of an algorithm across groups.

How? Provide a list with one vector of predictions per population and a list with one vector of outcomes per population.

# The pipe (%>%) comes from magrittr; attach it (or dplyr) first.
library(magrittr)

create_roc_curve(
  # One vector of predictions per population
  probs = list(
    "Train" = example_dat %>%
      dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(estimated_probabilities),
    "Test" = example_dat %>%
      dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(estimated_probabilities)
  ),
  # One vector of outcomes per population, with the same names as probs
  reals = list(
    "Train" = example_dat %>%
      dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(outcome),
    "Test" = example_dat %>%
      dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(outcome)
  )
)
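
The same pair of named lists can also be built with base R's split(), which may be more compact when there are many populations. This is only a sketch: toy_dat is a hypothetical stand-in that borrows the column names of example_dat, and the resulting list names are the raw values of type_of_set ("test", "train") rather than the capitalized labels used above.

```r
# Hypothetical stand-in for example_dat, using the same column names
toy_dat <- data.frame(
  estimated_probabilities = c(0.9, 0.2, 0.7, 0.4, 0.8, 0.1),
  outcome                 = c(1, 0, 1, 0, 1, 0),
  type_of_set             = c("train", "train", "train", "test", "test", "test")
)

# split() returns a named list with one vector per value of type_of_set
probs_by_set <- split(toy_dat$estimated_probabilities, toy_dat$type_of_set)
reals_by_set <- split(toy_dat$outcome, toy_dat$type_of_set)

# Both lists carry the same names in the same order
stopifnot(identical(names(probs_by_set), names(reals_by_set)))

# Uncomment if rtichoke is installed:
# rtichoke::create_roc_curve(probs = probs_by_set, reals = reals_by_set)
```

Splitting both columns by the same variable keeps predictions and outcomes aligned within each population.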

Performance Data as input

For some outputs you can instead prepare performance data first and pass it as input: use plot_*_curve in place of create_*_curve, and render_performance_table in place of create_performance_table:

one_pop_one_model_as_a_vector %>%
  plot_roc_curve()
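
To give a rough idea of what per-threshold performance data contains, the sketch below computes confusion-matrix counts, sensitivity, and specificity over a grid of probability thresholds in base R. It is an illustration only: the toy inputs, the threshold grid, and the column names are assumptions, and the structure rtichoke actually uses may differ.

```r
# Hypothetical toy inputs
probs <- c(0.9, 0.8, 0.6, 0.4, 0.3, 0.1)
reals <- c(1, 1, 0, 1, 0, 0)

# Hypothetical grid of probability thresholds
thresholds <- seq(0, 1, by = 0.25)

# For each threshold, predict positive when prob > threshold,
# then count the four confusion-matrix cells.
perf <- do.call(rbind, lapply(thresholds, function(th) {
  predicted <- probs > th
  tp <- sum(predicted & reals == 1)
  fp <- sum(predicted & reals == 0)
  fn <- sum(!predicted & reals == 1)
  tn <- sum(!predicted & reals == 0)
  data.frame(
    threshold   = th,
    TP = tp, FP = fp, FN = fn, TN = tn,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp)
  )
}))

perf
```

A ROC curve plots sensitivity against 1 - specificity, one point per row of a table like this.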

Summary Report

To get all the supported rtichoke outputs in a single HTML file, call create_summary_report().

Getting help

If you encounter a bug, please file an issue with a minimal reproducible example; it will make it easier for me to help you and may help others in the future. Alternatively, you are welcome to contact me personally: [email protected]