-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path03-background.Rmd
340 lines (238 loc) · 14.4 KB
/
03-background.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
# Spatial Models {#background}
This section contains some of the statistical background behind why spatial models are used and how they work. Understanding this section is not essential, but it is extremely helpful. This section relies on information introduced in \@ref(intro), so please make sure you have read that section if you are new to empirical variograms and spatial statistics.
General linear statistical models are commonly modeled as thus:
$$Y_i = \beta_0 + X_i\beta_1 + \epsilon_i$$
$\beta_1$ is a slope describing the relationship between a continuous variable and the dependent variable, $Y_i$. If $X_i$ is a categorical variable, such as a crop variety, then there will be $p-1$ slopes estimated, where p is the number of unique treatments levels in $X$.
The error terms, $\epsilon_i$ are assumed normally distributed with a mean of zero and a variance of $\sigma^2$ :
$$e_i ~\sim N(0, \sigma^2)$$
The error terms, or residuals, are assumed to be *identically* and *independently* distributed (sometimes abbreviated "iid"). This implies a constant variance of the error terms and zero covariance between residuals.
If N = 3, the expanded model looks like this:
$$\left[ {\begin{array}{ccc} Y_1\\ Y_2\\ Y_3 \end{array} } \right] = \beta_0 +
\left[ {\begin{array}{ccc} X_1\\ X_2\\ X_3 \end{array} } \right] \beta_1 +
\left[ {\begin{array}{ccc} \epsilon_1\\ \epsilon_2\\ \epsilon_3 \end{array} } \right] $$
$$e_i ~\sim N \Bigg( 0,
\left[ {\begin{array}{ccc} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0\\ 0 & 0 & \sigma^2\end{array} } \right] \Bigg) $$
If spatial variation is present, the off-diagonals of the variance-covariance matrix are not zero - hence the error terms are not independently distributed. As a result, hypotheses test and parameter estimates from uncorrected linear models will provide erroneous results.
## Correlated error models
### Distance-based correlation error models
There are mathematical tools for modelling how error terms are correlated with each other based on pairwise physical distance between observations. These models can be used to weight observations. Often, the data are assumed to be *isotropic*, where distance but not direction impacts the spatial error correlation.
There are several methods for estimating the semivariance as a direct function of distance.
#### Exponential
$$ \gamma (h)\left\{ {\begin{array}{cc} 0 & h = 0\\ C_0+C_1 \left [ 1-e^{-(\frac{h}{r})} \right] & h > 0 \end{array} } \right. $$
where
$$ C_0 = nugget \\ C_1 = partial \: sill \\ r = range$$
```{r Exp-CE-fig, echo=FALSE, fig.cap='Exponential Model', out.width='70%', fig.asp=0.75, fig.align='center'}
c0 <- 0.5
sill = 1.7
psill = sill - c0
range = 0.5
x <- seq(0,2, by = 0.01)
y <- c0 + psill*(1-exp(-1*x/range))
plot(x, y, xlim = c(0,2), ylim = c(0, 2), axes = F, pch = 21, bg = "black", ann = F, type = "l", col = "dodgerblue", lwd = 2)
arrows(0, 0, 0, 2, code = 2)
arrows(0, 0, 2, 0, code = 2)
mtext(expression(paste(gamma, "(h)")), at = 2, side = 2, las = 2)
mtext("h", at = 2, side = 1)
abline(h = sill, lty = 2)
abline(v = range*3, lty = 2)
points(0, c0, pch = 21, bg = "white")
mtext(expression('C'[0]), side = 2, at = c0, las = 2)
mtext(expression('r'[p]), side = 1, at = range*3)
mtext(expression('C'[0]+'C'[1]), side = 2, at = sill, las = 2)
```
$3r = r_p$ is the "practical range", which is 95% of the true value for $C_1$.
#### Gaussian
(a squared version of the exponential model)
$$ \gamma (h)\left\{ {\begin{array}{cc} 0 & h = 0\\ C_0+C_1 \left [ 1-e^{-(\frac{h}{r})^2} \right] & h > 0 \end{array} } \right. $$
where
$$ C_0 = nugget \\ C_1 = partial \: sill \\ r = range$$
```{r Gau-CE-fig, echo=FALSE, fig.cap='Gaussian Model', out.width='70%', fig.asp=0.75, fig.align='center'}
c0 <- 0.5
sill = 1.7
psill = sill - c0
range = 0.75
x <- seq(0,2, by = 0.01)
y <- c0 + psill*(1-exp(-1*(x/range)^2))
plot(x, y, xlim = c(0,2), ylim = c(0, 2), axes = F, pch = 21, bg = "black", ann = F, type = "l", col = "dodgerblue", lwd = 2)
arrows(0, 0, 0, 2, code = 2)
arrows(0, 0, 2, 0, code = 2)
mtext(expression(paste(gamma, "(h)")), at = 2, side = 2, las = 2)
mtext("h", at = 2, side = 1)
abline(h = sill, lty = 2)
abline(v = range*sqrt(3), lty = 2)
points(0, c0, pch = 21, bg = "white")
mtext(expression('C'[0]), side = 2, at = c0, las = 2)
mtext(expression('r'[p]), side = 1, at = range*sqrt(3))
mtext(expression('C'[0]+'C'[1]), side = 2, at = sill, las = 2)
```
$\sqrt 3r = r_p$ is the "practical range", which is 95% of the true value for $C_1$.
#### Spherical
$$ \gamma (h) = \left\{ {\begin{array}{cc} 0 & h = 0\\ C_0+C_1 \left[ \frac{3h}{2r}-0.5\bigg( \frac{h}{r}\bigg)^3 \right] & 0 <h \leq r \\
C_0 + C_1 & h > r \end{array} } \right. $$
where
$$ C_0 = nugget \\ C_1 = partial \: sill \\ r = range$$
```{r Sph-CE-fig, echo=FALSE, fig.cap='Spherical Model', out.width='70%', fig.asp=0.75, fig.align='center'}
c0 <- 0.5
sill <- 1.7
psill <- sill - c0
range = 1.3
x <- seq(0,2, by = 0.01)
y <- ifelse(x <= range, c0 + psill*((3*x)/(2*range) - 0.5*(x/range)^3),
sill)
plot(x, y, xlim = c(0,2), ylim = c(0, 2), axes = F, pch = 21, bg = "black", ann = F, type = "l", col = "dodgerblue", lwd = 2)
arrows(0, 0, 0, 2, code = 2)
arrows(0, 0, 2, 0, code = 2)
mtext(expression(paste(gamma, "(h)")), at = 2, side = 2, las = 2)
mtext("h", at = 2, side = 1)
abline(h = sill, lty = 2)
abline(v = range, lty = 2)
points(0, c0, pch = 21, bg = "white")
mtext(expression('C'[0]), side = 2, at = c0, las = 2)
mtext('r', side = 1, at = range)
mtext(expression('C'[0]+'C'[1]), side = 2, at = sill, las = 2)
```
#### Other correlated error distance models
There are many more models - Matérn, Cauchy, Logistic - that may describe spatial correlation in a data set.
There are two addition models that have no range or sill, the linear model and power model. If your data fits these, consider doing a trend analysis.
#### Linear
$$ \gamma (h)=\left\{ {\begin{array}{cc} 0 & h = 0\\ C_0+C_1h & h > 0 \end{array} } \right. $$
where
$$ C_0 = nugget \\ C_1 = slope $$
```{r linear-CE-fig, echo=FALSE, fig.cap='Linear Error Model', out.width='70%', fig.asp=0.75, fig.align='center'}
c0 <- 0.5
plot(0, c0, xlim = c(0,2), ylim = c(0, 2), axes = F, ann = F)
lines(c(0, 2),c(c0, 1.7), xlim = c(0,2), ylim = c(0, 2), col = "dodgerblue", lwd = 2)
arrows(0, 0, 0, 2, code = 2)
arrows(0, 0, 2, 0, code = 2)
mtext(expression(paste(gamma, "(h)")), at = 2, side = 2, las = 2)
points(0, c0, pch = 21, bg = "white")
mtext("h", at = 2, side = 1)
mtext(expression('C'[0]), side = 2, at = c0, las = 2)
```
There is no sill or range in the linear model, so the variance will continue to increase as a function of distance.
#### Power
$$ \gamma (h)=\left\{ {\begin{array}{cc} 0 & h = 0\\ C_0+C_1h^\lambda & h > 0 \end{array} } \right. $$
where
$$ 0 \leq \lambda <\leq 2 \\ C_0 = nugget \\ C_1 = scaling \: factor $$
```{r power-CE-fig, echo=FALSE, fig.cap='Power Model', out.width='70%', fig.asp=0.75, fig.align='center'}
c0 <- 0.5
c1 <- 1
lambda <- 0.5
x <- seq(0,2, by = 0.01)
y <- c0 + c1*(x^lambda)
plot(x, y, xlim = c(0,2), ylim = c(0, 2), axes = F, pch = 21, bg = "black", ann = F, type = "l", col = "dodgerblue", lwd = 2)
arrows(0, 0, 0, 2, code = 2)
arrows(0, 0, 2, 0, code = 2)
mtext(expression(paste(gamma, "(h)")), at = 2, side = 2, las = 2)
mtext("h", at = 2, side = 1)
points(0, c0, pch = 21, bg = "white")
mtext(expression('C'[0]), side = 2, at = c0, las = 2)
```
When $\lambda = 1$, that is equivalent to the linear model. Example above is when $\lambda = 0.5$ (i.e. a square-root transformation) and $C_1 = 1$. There is also no sill or range in the power model.
#### Matérn
$$\gamma(h) = c_0 + c_1 \bigg( 1- \frac{1}{2^{\kappa -1}\Gamma(\kappa)}
\Big( \frac{h}{\alpha} \Big) ^{\kappa} K_\kappa
\Big( \frac{h}{\alpha} \Big) \bigg)$$
Where
$$ C_0 = nugget \\ C_1 = partial \: scale \\
\alpha = smoothing \: factor \\ \kappa = covariance \: parameter
$$
$\Gamma(\kappa)$ is the gamma function:
$$\Gamma(\kappa) = (\kappa -1)!$$
and $\kappa(\alpha)$ is a modified bessel function:
$$ K_\kappa(t) = \frac{\Gamma(\alpha)}{2} \big( \frac{t}{2} \big) ^{-\kappa}$$
```{r eval=FALSE, include=FALSE}
c0 <- 0.5
c1 <- 1.7
alpha = 0.5
kappa = 2
x <- seq(0,2, by = 0.01)
y <- ifelse(x <= range, c0 + psill*((3*x)/(2*range) - 0.5*(x/range)^3),
sill)
plot(x, y, xlim = c(0,2), ylim = c(0, 2), axes = F, pch = 21, bg = "black", ann = F, type = "l", col = "dodgerblue", lwd = 2)
arrows(0, 0, 0, 2, code = 2)
arrows(0, 0, 2, 0, code = 2)
mtext(expression(paste(gamma, "(h)")), at = 2, side = 2, las = 2)
mtext("h", at = 2, side = 1)
abline(h = sill, lty = 2)
abline(v = range, lty = 2)
points(0, c0, pch = 21, bg = "white")
mtext(expression('C'[0]), side = 2, at = c0, las = 2)
mtext('r', side = 1, at = range)
mtext(expression('C'[0]+'C'[1]), side = 2, at = sill, las = 2)
```
### Correlated error model for gridded data
Planned field experiments often have the advantage of being arranged in regular grid pattern that can be adequately described using Euclidean space. This simplifies aspects of understanding how error terms are related by distance since the data occur in evenly spaced increments. Furthermore, in many agricultural trials, there may be no interest in spatial interpolation between units. Some of the following models work with irregularly-spaced data, but the models below are simplified forms when the experimental units are arranged in regular grid.
For example, imagine an experiment consisting of 8 plots (plot = the experiment units) arranged in 2 rows, each with 4 ranges with this layout:
```{r plot-grid-fig, echo=FALSE, fig.asp=0.35, message=FALSE, warning=FALSE}
# plot(1, col = "white", xlim = c(0,4), ylim = c(0,2), ann = F, xaxt = "n", yaxt = "n") # xlab = "row", ylab = "ranges")
# grid(6, 10, lty = 1, col = "black")
# mtext("ranges", side = 2, at = 1.5)
# mtext("rows", side = 1, at = 2)
library(ggplot2)
dt <- data.frame(plot = paste0("plot ",1:8), row = rep(1:2, each = 4), range = rep(1:4, 2))
ggplot(dt, aes(x = range, y = row)) +
geom_tile(col = "black", fill = "white") +
geom_text(aes(label = plot)) +
scale_y_continuous(breaks = c(1, 2)) +
theme_classic()
```
The statistical model for that experiment:
$$
\left[ {\begin{array} \ Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{array} } \right] = \beta_0 +
\left[ {\begin{array} \ X_1 \\ X_2 \\ \vdots \\ X_n \end{array} } \right] \beta_1 +
\left[ {\begin{array} \ \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{array} } \right]
$$
$X\beta$ refer to independent variable effects. This same model is also often presented in an abbreviated matrix form:
$$ \mathbf{Y = X\beta + \epsilon}$$
#### First order auto-regressive model (AR1)
This assumes that variance can be modelled as exponential function based on unit distance (e.g. row or range), either in a single direction or anistropic.
The AR1 structure across 2 rows is modeled as thus:
$$\mathbf { V_{AR(1)row}} = \sigma^2
\left[ {\begin{array}{cc}
1 & \rho \\
\rho & 1
\end{array} } \right] \otimes \mathbf{I_4}$$
This covariance model describes the correlation of observations in the *row direction only.*
And similarly the AR1 structure across 4 ranges is modeled as thus:
$$ \mathbf {V_{AR(1)range}} = \sigma^2
\left[ {\begin{array}{cccc}
1 & \rho & \rho^2 & \rho^3 \\
\rho & 1 & \rho & \rho^2 \\
\rho^2 & \rho & 1 & \rho \\
\rho^3 & \rho^2 & \rho & 1
\end{array} } \right] \otimes \mathbf{I_2} $$
This covariance model describes the correlation of observations in the *range direction only.*
In combined AR1xAR1 model, the parameter, $\rho$ may need to estimated separated across the row and column directions depending on the shape of the plots and site-specific field variation. Very rectangular plots are likely to require a separate estimate of the $\rho$ parameter for each direction.
\begin{equation}
\mathbf {V_{AR(1)row} \otimes V_{AR(1)range}} = \sigma^2
\left[ {\begin{array}{cccc | cccc}
1 & \rho_1 & \rho_1^2 & \rho_1^3 & \rho_2 & \rho_1\rho_2 & \rho_1^2\rho_2 & \rho_1^3\rho_2 \\
\rho_1 & 1 & \rho_1 & \rho_1^2 & \rho_1\rho_2 & \rho_2 & \rho_1\rho_2 & \rho_1^2\rho_2\\
\rho_1^2 & \rho_1 & 1 & \rho_1 & \rho_1^2\rho_2 & \rho_1\rho_2 & \rho_2 & \rho_1\rho_2\\
\rho_1^3 & \rho_1^2 & \rho_1 & 1 & \rho_1^3\rho_2 & \rho_1^2\rho_2 & \rho_1\rho_2 & \rho_2 \\
\hline
\rho_2 & \rho_1\rho_2 & \rho_1^2\rho_2 & \rho_1^3\rho_2 & 1 & \rho_1 & \rho_1^2 & \rho_1^3 \\
\rho_1\rho_2 & \rho_2 & \rho_1\rho_2 & \rho_1^2\rho_2 & \rho_1 & 1 & \rho_1 & \rho_1^2 \\
\rho_1^2\rho_2 & \rho_1\rho_2 & \rho_2 & \rho_1\rho_2 & \rho_1^2 & \rho_1 & 1 & \rho_1 \\
\rho_1^3\rho_2 & \rho_1^2\rho_2 & \rho_1\rho_2 & \rho_2 &\rho_1^3 & \rho_1^2 & \rho_1 & 1 \\
\end{array} } \right]
\end{equation}
**Note**: $\rho_1$ and $\rho_2$ are the parameter estimate for $V_{AR1(1) row}$ and $V_{AR(1)range}$, respectively.
Please note that these error models are for modelling localised variation based on physical proximity. It is assumed that eventually a distance in the experiment can be reached in which 2 observations can be treated independent.
If there are spatial trends that extend along the entire scope of an experiment (for instance, due to position on a slope), then an additional trend analysis should be conducted.
## Spatial Regression methods
These approaches look use information from adjacent plots to adjust for spatial auto-correlation.
### Spatial autoregressive (SAR)
Sometimes called a "lag" model, the SAR model uses correlations with neighboring plots dependent variable to predict Y. The auto-regressive model explicitly models correlations between neighboring points.
$$\mathbf{Y = \rho W Y + X\beta + \epsilon} $$
While this may look strange, $\mathbf{W}$ is an $n$ x $n$ matrix weighting the neighbors with a diagonal of zero so the value at $i=j$, that is $Y_{ijk}$ itself, is not used on the right-hand side to predict $Y_{ijk}$ on the left-hand side of the equation. The error terms are assumed iid.
**On Weights**
Setting weights of neighbors is dealt with in the next chapter \@ref(rcbd-r).
## Trend analysis
### Row and column trends
Experiment wide-trends should be modeled with directional trend models. These are comparatively simple models:
$$Y_{ijk} = \beta_0 + X_{i1}\beta_1 + Row_{j2}\beta_2 + Range_{k3}\beta_3 +\epsilon_{ijk}$$
If the assumption of independent, normal, and identical errors are met, then this model will suffice. If spatial variation is still present, additional measures will need to be taken.
### Splines
There is a rich field of research on using localised splines to model field heterogeneity. They are similar to trend models where the row and/or column trends are modelled. These models are complex and hence not described here.