08-plotcaching.Rmd

# Plot Caching

[Blog: Shiny 1.2.0: Plot caching - November 18, 2018](https://resources.rstudio.com/rstudio-blog/shiny-1-2-0-plot-caching)

The Shiny 1.2.0 package release introduced _Plot Caching_, an important new tool for improving performance and scalability in Shiny applications.

**In a nutshell:** The term "caching" means that when time-consuming operations are performed, we can save (cache) the results so that the next time that operation is requested, instead of re-running that calculation, we instead go fetch the previously cached result. When applied appropriately, this "fetch" operation should take less time that the original calculation and improve application performace (and user experience) overall.

Shiny's reactive expressions do some amount of caching by default, and you can use more explicit techniques to cache various data operations. Examples include use of the `memoise` package, or manually saving intermediate data frames to disk as CSV or RDS.

Plots are a very common and (potentially) expensive to compute type of output object in Shiny applications, which makes them a great candidate for caching. In theory, you could use `renderImage` to accomplish this, but because Shiny's `renderPlot` function contains a lot of complex infrastructure code, it's actually quite a difficult task. 

Shiny v1.2.0 introduces a new function, `renderCachedPlot`, that makes it much easiter to add plot caching to your application.

## When to use plot caching

A shiny app is a good candidate for plot caching if:

1. The app has plot outputs that are time-consuming to generate
2. These plots are a significant fraction of the total amount of time the app spends thinking
3. Most users are likely to request the same few plots

### Using `renderCachedPlot`

![renderChachedPlot()](imgs/plotcaching/rendercachedplot.png)

The following example is a simple, but computationally expensive, plot output:

```
output$plot <- renderPlot({
  ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
    geom_point()
})
```

The `diamonds` dataset has 53,940 rows. This plot might take roughly 1580 milliseconds (1.58 seconds) to generate depending on the resources available. For a high traffic Shiny application in production, 1.58 seconds is likely slower than ideal.

Plot caching can be implemented in two steps:

1. Change `renderPlot` to `renderCachedPlot`
2. Provide a suitable `cacheKeyExpr`. This is an expression that Shiny will use to determine which invocations of `renderCachedPlot` should be considered equivalent to each other. In the example case, two plots with different `input$color_by` values can't be considered the "same" plot, so the `cacheKeyExpr` needs to be `input$color_by`

The example plot code would be updated like this:

```
output$plot <- renderCachedPlot({
  ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
    geom_point()
}, cacheKeyExpr = { input$color_by })
```

With these code changes, the first time a plot with a particular `input$color_by` is requested, it will take the normal amount of time. But the next time it is requested, it will be almost instant, as the previously rendered plot will be reused.

## Activity: Plot cache benchmarking

**First: Update your app code to use `renderCachedPlot`**

**Discussion:** 
_caching + shinyloadtest_

- How will introducing `renderCachedPlot` affect our load test experience?

_What is the performance comparison between tests?_

**Deliverable: Re-run Load Test**

- Redeploy the version of the app with `renderCachedPlot`
- Re-run the load test and compare the outputs (continue to follow along with `runloadtest.R`)

## Extended Topics

- [Shiny Article: Plot Caching by Winston Chang](http://shiny.rstudio.com/articles/plot-caching.html)

**Plot Caching on RStudio Connect**

Shiny can store cached plots in memory, on disk, or with another backend like [Redis](https://redis.io/). There are also a number of options for limiting the size of the cache. Applications deployed on RStudio Connect should use a disk cache and specify a subdirectory of the application directory as the location for the cache. To do so, add this code to the top of your application:

```
library(shiny)
shinyOptions(cache = diskCache("./cache"))
```

This option ensures that cached plots will be saved and used across the multiple R processes that RStudio Connect runs in support of an application. In addition, this configuration results in the cache being deleted and reset when new versions of the application are deployed.