generated from Sage-Bionetworks/docker-rstudio
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsurvey-analysis.Rmd
41 lines (25 loc) · 1.31 KB
/
survey-analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
---
title: "Post Challenge Survey Analysis"
author: "a name like Dorothy Vaughan"
date: "`r Sys.Date()`"
output:
html_document:
df_print: paged
code_fold: hide
toc: true
toc_float: true
---
## Introduction
Here, we look at some of the model and submission metadata and how they correlate with performance. This is using survey data where we have cleaned up some of the responses (e.g. eliminating duplicate responses, matching strings or generalizing responses so that they fit into fewer categories).
First, load packages and get data, as well as define function to plot categorical variable barplot.
```{r echo=TRUE, message=FALSE, warning=FALSE}
```
## Barplots of survey responses
### Spearman correlation
Let's look at the following survey answers (`interesting vars`) and how they correlate to the challenge metric (e.g. Spearman correlation). On this plot, each bar is the Spearman correlation of a specific submission, and the color of the bar is the response for that survey question. We might also test the significance of these responses using a test like the [Kruskal-Wallis](https://en.wikipedia.org/wiki/Kruskal%E2%80%93Wallis_one-way_analysis_of_variance) test.
```{r echo=TRUE, message=FALSE, warning=FALSE}
```
### RMSE
And repeat for RMSE:
```{r echo=TRUE, message=FALSE, warning=FALSE}
```