-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathAONR_stats_tutorial.Rmd
205 lines (160 loc) · 9.54 KB
/
AONR_stats_tutorial.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
---
title: 'Estimating AONR and EONR with R: A tutorial and case study'
output:
pdf_document: default
html_document:
df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
\newcommand{\kgha}{$\mathrm{kg \; ha^{-1}}$}
# Preliminaries
First we need to load a few packages
```{r preliminaries}
library(ggplot2)
library(nlraa)
library(car)
library(mgcv)
library(magrittr)
## Load function to compute the EONR
source("eonr.R")
```
# Introduction
This tutorial and case study are intended to illustrate the process of estimating AONR and EONR with a real dataset. The data here were obtained from Table 1 from a publication by Francis et al. (2021). When estimating the EONR it is important that the units for fertilizer and yield match. For that reason, we use $kg \; ha^{-1}$ for yield and for the fertilizer.
```{r read-data}
red <- read.csv("red_clover.csv")
## Create column with yield in kg/ha
red$Yield_kgha <- red$Yield * 1e3
## Visualize the data
ggplot(data = red, aes(x = Nrate, y = Yield_kgha)) +
geom_point() +
ylab("Yield (kg/ha)") +
xlab("N rate (kg/ha)") +
theme_bw()
```
# Simple Analysis
The first approach will be to fit the linear-plateau and quadratic-plateau (with 3 parameters) models and assess the impact of the uncertainty in the model choice. In terms for model fit, both models have identical AIC values and number of parameters. The R-squared is not computed or included here but it would be identical.
```{r first-fit}
fm.QP <- nls(Yield_kgha ~ SSquadp3xs(Nrate, a, b, xs), data = red)
fm.LP <- nls(Yield_kgha ~ SSlinp(Nrate, a, b, xs), data = red)
(ictab <- IC_tab(fm.QP, fm.LP)) ## Identical fit
ndat <- data.frame(Nrate = seq(min(red$Nrate), max(red$Nrate), length.out = 50))
prds.QP <- predict_nls(fm.QP, newdata = ndat, interval = "conf")
ndatA.QP <- cbind(ndat, prds.QP)
prds.LP <- predict_nls(fm.LP, newdata = ndat, interval = "conf")
ndatA.LP <- cbind(ndat, prds.LP)
ggplot() +
geom_point(data = red, aes(x = Nrate, y = Yield_kgha)) +
geom_line(data = ndatA.QP, aes(Nrate, Estimate, color = "QP")) +
geom_line(data = ndatA.LP, aes(Nrate, Estimate, color = "LP")) +
ylab("Yield (kg/ha)") +
xlab("N rate (kg/ha)") +
guides(color = guide_legend(title = "Model")) +
theme_bw()
```
## Computing AONR and EONR
Given that both models have identical fit, AIC and R-squares it is reasonable to instead apply model averaging to resolve this conflict. The AONR estimate for each model is simply the break-point, which is a parameter from the model. For LP it is `r round(coef(fm.LP)[3])` and for QP it is `r round(coef(fm.QP)[3])` \kgha. In order to compute the confidence intervals we could use the profile method or bootstrap. We choose the bootstrap method as it will be more generally applicable to other model fits later. It is important to note that in the code below the AIC-derived weights are used to compute the **AONR.avg**. The object **ictab** in this case has a column calles **weights** that when matrix-multiplied by the AONR for QP and LP, provides the average AONR.
```{r aonr-eonr, cache = TRUE}
## Extract AONR for both models
AONR.LP <- round(coef(fm.LP)[3])
AONR.QP <- round(coef(fm.QP)[3])
### Confidence intervals for AONR estimates ###
fm.QP.bt <- boot_nls(fm.QP, R = 5e3, cores = 4)
fm.LP.bt <- boot_nls(fm.LP, R = 5e3, cores = 4)
AONR.QP.ci <- confint(fm.QP.bt)
AONR.LP.ci <- confint(fm.LP.bt)
## Model averaging
prds.avg <- predict_nls(fm.LP, fm.QP, newdata = ndat, interval = "conf")
ndatA.avg <- cbind(ndat, prds.avg)
AONR.avg <- round(ictab$weight %*% c(coef(fm.QP)[3], coef(fm.LP)[3]))
prds.avg <- predict_nls(fm.LP, fm.QP, newdata = ndat, interval = "conf")
ndatA.avg <- cbind(ndat, prds.avg)
AONR.avg.ci <- confint(c(fm.QP.bt, fm.LP.bt))
ggplot() +
geom_point(data = red, aes(x = Nrate, y = Yield_kgha)) +
geom_line(data = ndatA.QP, aes(Nrate, Estimate, color = "QP")) +
geom_line(data = ndatA.LP, aes(Nrate, Estimate, color = "LP")) +
geom_line(data = ndatA.avg, aes(Nrate, Estimate, color = "avg")) +
geom_point(aes(x = coef(fm.LP)[3], y = 5000, color = "LP"), size = 2) +
geom_errorbarh(aes(xmin = AONR.LP.ci[3,1], xmax = AONR.LP.ci[3,2], y = 5000,
color = "LP")) +
geom_point(aes(x = coef(fm.QP)[3], y = 4800, color = "QP"), size = 2) +
geom_errorbarh(aes(xmin = AONR.QP.ci[3,1], xmax = AONR.QP.ci[3,2], y = 4800,
color = "QP")) +
geom_point(aes(x = AONR.avg, y = 4600, color = "avg"), size = 2) +
geom_errorbarh(aes(xmin = AONR.avg.ci[3,1], xmax = AONR.avg.ci[3,2], y = 4600,
color = "avg")) +
guides(color = guide_legend(title = element_blank())) +
theme_bw() + xlab("N (kg/ha)") + ylab("Yield (kg/ha)") +
ggtitle("Model averaging method")
```
Even though the differences are small, using model averaging better accounts for the uncertainty in the choice between the LP and QP models. Similarly, by using bootstrap we can combine the uncertainty in the bootstrap samples from the two models and compute the 95% confidence interval for the AONR for the average model.
```{r ANOR-table}
aonrs <- data.frame(model = c("LP", "QP", "avg"),
AONR = c(AONR.LP, AONR.QP, AONR.avg),
lower = round(c(AONR.LP.ci[3, 1], AONR.QP.ci[3, 1], AONR.avg.ci[3, 1])),
upper = round(c(AONR.LP.ci[3, 2], AONR.QP.ci[3, 2], AONR.avg.ci[3, 2])))
knitr::kable(aonrs,
caption = "AONR and confidece intervals for LP, QP and average method") %>%
kableExtra::kable_styling(full_width = FALSE)
```
We need to be careful when computing the EONR. First, we need a fertilizer to grain price ratio and we use a value of 5.6 in this case. For the LP model, the EONR is determined by the slope and when this slope is greater than the fertilizer to grain price ratio, then the EONR is equivalent to the AONR. For the QP model, the EONR will be somewhat lower than the AONR. For this we can use the **eonr** function which we sourced at the beginning of this tutorial (this function is available in github: https://github.com/femiguez/optimum_fertilizer).
```{r QP-EONR}
EONR.LP <- coef(fm.LP)[3] ## LP EONR
EONR.QP <- eonr(fm.QP, max.rate = coef(fm.QP)[3]) ## QP EONR
## Taking the mean in this case is sensible because the AIC weights
## for both models were identical. Otherwise, consider using AIC weights
## as in the example above
EONR.avg <- mean(c(EONR.LP, EONR.QP))
eonrs <- data.frame(model = c("LP", "QP", "avg"),
EONR = round(c(EONR.LP, EONR.QP, EONR.avg)))
knitr::kable(eonrs, caption = "EONR for LP, QP and average method") %>%
kableExtra::kable_styling(full_width = FALSE)
```
\newpage
## What about GAMS?
Generalized Additive Models are gaining traction in the analysis of agronomic data. They are one of the best options for certain types of data. However, when used for estimating the AONR (or EONR) they can be fairly unstable as the typical sample size is not sufficiently large. Having less than 15-20 N rates means that the fit will vary substantially depending on the dimension of the basis used. This is illustrated below. Different values of $k$ result in fairly different fits and therefore in different estimates for the EONR. We do not estimate the AONR as it is not defined for GAM models given that the function does not necessarily reach a plateau.
```{r gams}
fm.G3.kgha <- gam(Yield_kgha ~ s(Nrate, k = 3), data = red)
fm.G4.kgha <- gam(Yield_kgha ~ s(Nrate, k = 4), data = red)
fm.G5.kgha <- gam(Yield_kgha ~ s(Nrate, k = 5), data = red)
fm.G6.kgha <- gam(Yield_kgha ~ s(Nrate, k = 6), data = red)
eonr.G3 <- eonr(fm.G3.kgha)
eonr.G4 <- eonr(fm.G4.kgha)
eonr.G5 <- eonr(fm.G5.kgha)
eonr.G6 <- eonr(fm.G6.kgha)
prds.G3 <- predict_gam(fm.G3.kgha, newdata = ndat, interval = "conf")
ndatA.G3 <- cbind(ndat, prds.G3)
prds.G4 <- predict_gam(fm.G4.kgha, newdata = ndat, interval = "conf")
ndatA.G4 <- cbind(ndat, prds.G4)
prds.G5 <- predict_gam(fm.G5.kgha, newdata = ndat, interval = "conf")
ndatA.G5 <- cbind(ndat, prds.G5)
prds.G6 <- predict_gam(fm.G6.kgha, newdata = ndat, interval = "conf")
ndatA.G6 <- cbind(ndat, prds.G6)
ggplot() +
geom_point(data = red, aes(x = Nrate, y = Yield_kgha)) +
geom_line(data = ndatA.G3, aes(Nrate, Estimate, color = "GAM k = 3")) +
geom_line(data = ndatA.G4, aes(Nrate, Estimate, color = "GAM k = 4")) +
geom_line(data = ndatA.G5, aes(Nrate, Estimate, color = "GAM k = 5")) +
geom_line(data = ndatA.G6, aes(Nrate, Estimate, color = "GAM k = 6")) +
geom_point(aes(x = eonr.G3, y = 4000, color = "GAM k = 3"), size = 2.5) +
geom_point(aes(x = eonr.G4, y = 4000, color = "GAM k = 4"), size = 2.5) +
geom_point(aes(x = eonr.G5, y = 4000, color = "GAM k = 5"), size = 2.5) +
geom_point(aes(x = eonr.G6, y = 4000, color = "GAM k = 6"), size = 2.5) +
geom_text(aes(x = 200, y = 4600, label = "EONR estimates")) +
guides(color = guide_legend(title = "Model")) +
ggtitle("GAMs are unstable with small sample sizes") +
theme_bw()
```
The estimates for the EONR for these different model choices are:
```{r eonr-table}
gam.eonrs <- data.frame(model = c("k = 3", "k = 4", "k = 5", "k = 6"),
EONR = round(c(eonr.G3, eonr.G4, eonr.G5, eonr.G6)))
knitr::kable(gam.eonrs, caption = "EONR for different GAMS") %>%
kableExtra::kable_styling(full_width = FALSE)
```
## Additional Details
More details about this analysis can be found at: https://github.com/femiguez/optimum_fertilizer.
# Reference
Francis, HR, Ma, TF, Ruark, MD. Toward a standardized statistical methodology comparing optimum nitrogen rates among management practices: A bootstrapping approach. Agric Environ Lett 2021; 6:e20045. https://doi.org/10.1002/ael2.20045