-
Notifications
You must be signed in to change notification settings - Fork 11
/
5-move.Rmd
1639 lines (1391 loc) · 80.5 KB
/
5-move.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
# Sites and states {#dispersal}
## Introduction
In this fifth chapter, you will learn about the Arnason-Schwarz model that allows estimating transitions between sites and states based on capture-recapture data. You will also see how to deal with uncertainty in the assignment of states to individuals.
## The Arnason-Schwarz (AS) model {#ASmodel}
In Chapter \@ref(survival), we got acquainted with the Cormack-Jolly-Seber (CJS) model which accommodates transitions between the states alive and dead while accounting for imperfect detection. It is often the case that besides being alive, more detailed information is collected on the state of animals when they are detected. For example, if the study area is split into several discrete sites, you may record where an animal is detected, the state being now alive in this particular site. The Arnason-Schwarz (AS) model can be viewed as an extension to the CJS model in which we estimate movements between sites on top of survival. The AS model is named after the two statisticians -- Neil Arnason and Carl Schwarz -- who came up with the idea.
### Theory
Let's assume for now that we have two sites, say 1 and 2. The way we usually think of analyzing the data is to start from the detections and non-detections and infer the transitions between sites and movements. Schematically, when a animal is detected in site 1 or site 2, it obviously means that it is alive in that site, whereas when it is not detected, it may be dead or alive in either site. Schematically, we have:
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'non-detection'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'detection in site 1'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'detection in site 2'), nudge_x = -0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'alive in site 1'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'alive in site 2'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1, label = 'dead'), nudge_x = 0.5, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'Observations', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'States', size = 10) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 1.5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 1), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1.5, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1, xend = 2, yend = 1.5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
Observations and states are indeed closely related, but we do not have a perfect states to observations correspondence, and the HMM framework will help you make the distinction clear, which in turn will make the modelling easier. Usually, we focus our energy on the observations but what we'd really like is to spend time thinking of the ecological processes that we observed imperfectly. In the HMM framework, when we are to build a model, we think of the states and their dynamic over time, and these states emit the observations we're given to make. Going back to the our example, when an animal is alive in either site, it may get detected in that site or go undetected. When an animal is dead, then it goes undetected for sure. Schematically, we obtain:
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'non-detection'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'detection in site 1'), nudge_x = 0.6, size = 7) +
geom_text(aes(2, 1, label = 'detection in site 2'), nudge_x = 0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'alive in site 1'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'alive in site 2'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'dead'), nudge_x = -0.6, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'Observations', size = 10) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 1.5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1.5, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1.5, xend = 2, yend = 1), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
We have $z = 1$ for alive in site 1, $z = 2$ for alive in site 2 and $z = 3$ for dead. We will code $y = 1$ for non-detected, $y = 2$ for detected in site 1 and $y = 3$ for detected in site 2. The parameters are:
- $\pi^r$ is the probability the a newly encountered individual is in state $r$;
- $p_t^r$ is the probability of detection at $t$ for a bird in site $r$ at $t$;
- $\phi_t^r$ is the survival probability for birds in site $r$ at between $t$ and $t+1$;
- $\psi_t^{rs}$ is the probability of being in site $s$ at time $t+1$ for animals that were in site $r$ at $t$ and have survived to $t+1$, in short movement conditional on survival.
These parameters can be modelled as functions of covariates as in Section \@ref(covariates). But for now we will drop the time index for simplicity.
We follow the presentation of HMM as in Chapter \@ref(hmmcapturerecapture). Let's start with the vector of initial states. At first encounter, an animal may be alive in site 1 with probability $\pi^1$ or in site 2 with the complementary probability $1 - \pi^{1}$, but it cannot be dead. Therefore, we have:
$$\begin{matrix}
& \\
\mathbf{\delta} =
\left ( \vphantom{ \begin{matrix} 12 \end{matrix} } \right .
\end{matrix}
\hspace{-1.2em}
\begin{matrix}
z_t=1 & z_t=2 & z_t=3 \\ \hdashline
\pi^1 & 1 - \pi^{1} & 0\\
\end{matrix}
\hspace{-0.2em}
\begin{matrix}
& \\
\left . \vphantom{ \begin{matrix} 12 \end{matrix} } \right )
\begin{matrix}
\end{matrix}
\end{matrix}$$
We move on to the transition matrix which connects the states at $t-1$ in rows to the states at $t$ in columns. For example, the probability of moving from site 1 at $t-1$ to site $2$ at $t$ is the product of the survival probability in site 1 over that time interval, times the probability of moving from site 1 to 2 for those animals who survived. We have:
$$\begin{matrix}
& \\
\mathbf{\Gamma} =
\left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right .
\end{matrix}
\hspace{-1.2em}
\begin{matrix}
z_{t}=1 & z_{t}=2 & z_{t}=3 \\ \hdashline
\phi^1 (1-\psi^{12}) & \phi^1 \psi^{12} & 1 - \phi^1\\
\phi^2 \psi^{21} & \phi^2 (1-\psi^{21}) & 1 - \phi^2\\
0 & 0 & 1
\end{matrix}
\hspace{-0.2em}
\begin{matrix}
& \\
\left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right )
\begin{matrix}
z_{t-1}=1 \\ z_{t-1}=2 \\ z_{t-1}=3
\end{matrix}
\end{matrix}$$
Finally, the observation matrix relates the states an animal is in at $t$ in rows to the observations at $t$ in columns. If an animal is dead ($z_t=3$), it cannot be detected ($\Pr(y_t=1|z_t=3)=\Pr(y_t=2|z_t=3)=0$ and $\Pr(y_t=3|z_t=3)=1$), whereas when it is alive in either site, it can be detected or not. We have:
$$\begin{matrix}
& \\
\mathbf{\Omega} =
\left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right .
\end{matrix}
\hspace{-1.2em}
\begin{matrix}
y_t=1 & y_t=2 & y_t=3 \\ \hdashline
1 - p^1 & p^1 & 0\\
1 - p^2 & 0 & p^2\\
1 & 0 & 0
\end{matrix}
\hspace{-0.2em}
\begin{matrix}
& \\
\left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right )
\begin{matrix}
z_{t}=1 \\ z_{t}=2 \\ z_{t}=3
\end{matrix}
\end{matrix}$$
The vector of initial state probabilities sums up to one, as well as the rows of the transition and observation matrices.
## Fitting the AS model {#ASmodelfitting}
### Geese data
To introduce this chapter, we will use data on the Canada goose (*Branta canadensis*; geese hereafter) kindly provided by Jay Hestbeck (Figure \@ref(fig:pixdipper)).
```{r pixgeese, echo=FALSE, out.width="100%", fig.cap="Canada goose (Branta canadensis). Credit: Max McCarthy.", fig.align='center'}
knitr::include_graphics("images/goose.jpg")
```
In total, 21277 geese were captured, marked with coded neck bands and recaptured between 1984 and 1989 in their wintering locations. Specifically, geese were monitored in the Atlantic flyway, in large areas along the East coast of the USA, namely 3 sites in the mid--Atlantic (New York, Pennsylvania, New Jersey), Chesapeake (Delaware, Maryland, Virginia), and Carolinas (North and South Carolina). Birds were adults and sub-adults when banded.
You may see the data below:
```{r echo = TRUE}
y <- read_csv("dat/geese.csv") %>% as.matrix()
head(y)
```
The six columns are years in which the geese were captured, banded and recapture. A 0 stands for a non-detection, and detections were coded in the 3 wintering sites 1, 2 and 3 for mid--Atlantic, Chesapeake and Carolinas respectively. This is only a subsample of 500 individuals of the whole dataset that I will use for illustration here.
### NIMBLE implementation
To write the NIMBLE code corresponding to the AS model, we will make our life easier and start with 2 sites only -- we drop the Carolinas wintering site for now. We replace all 3's by 0's in the dataset:
```{r}
y[y==3] <- 0
```
Also we consider parameters constant over time.
<!-- Note: You may code non-detections as $y_t = 2$, and the first column in the observation matrix should go last. -->
<!-- Quick answer about the -1 and the important issue of coding states and obs. I did this on purpose, to have folks think about the difference between observations and states (non-detection obs should not be confused with state for dead). This becomes even more crucial when we get to multievent models where several observations may be generated by a single state. I get the intuition argument perfectly, but I’d like them to fight against it at first, then once they’re comfortable with the difference, they may code obs/states as they see fit. Let’s see how it goes. I agree that we should mention that during the multistate lecture, in the spirit of « you’re free to code states and jobs the way you like ». I’ll add something. -->
We start with comments to define the quantities -- parameters, states and observations -- we will use in the NIMBLE code:
```{r eval = FALSE}
multisite <- nimbleCode({
# -------------------------------------------------
# Parameters:
# phi1: survival probability site 1
# phi2: survival probability site 2
# psi12: movement probability from site 1 to site 2
# psi21: movement probability from site 2 to site 1
# p1: detection probability site 1
# p2: detection probability site 2
# -------------------------------------------------
# States (z):
# 1 alive at site 1
# 2 alive at site 2
# 3 dead
# Observations (y):
# 1 not seen
# 2 seen at site 1
# 3 seen at site 2
# -------------------------------------------------
...
```
The we specify priors for the survival, transition and detection probabilities:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# Priors
phi1 ~ dunif(0, 1)
phi2 ~ dunif(0, 1)
psi12 ~ dunif(0, 1)
psi21 ~ dunif(0, 1)
p1 ~ dunif(0, 1)
p2 ~ dunif(0, 1)
...
```
We now write the vector of initial state probabilities:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# initial state probabilities
delta[1] <- pi1 # Pr(alive in 1 at t = first)
delta[2] <- 1 - pi1 # Pr(alive in 2 at t = first)
delta[3] <- 0 # Pr(dead at t = first) = 0
...
```
Actually, the initial state is known exactly: It is alive at site of initial capture, and $\pi^1$ is the proportion of individuals first captured in site 1, there is no need to make it explicit in the model and estimate it. Therefore, in the likelihood, instead of `z[i,first[i]] ~ dcat(delta[1:3])`, you can use `z[i,first[i]] <- y[i,first[i]] - 1` and just forget about $\pi^1$, for now. Note that the same trick applies to the CJS model.
We write the transition matrix:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# probabilities of state z(t+1) given z(t)
# (read as gamma[z(t),z(t+1)] = gamma[fromState,toState])
gamma[1,1] <- phi1 * (1 - psi12)
gamma[1,2] <- phi1 * psi12
gamma[1,3] <- 1 - phi1
gamma[2,1] <- phi2 * psi21
gamma[2,2] <- phi2 * (1 - psi21)
gamma[2,3] <- 1 - phi2
gamma[3,1] <- 0
gamma[3,2] <- 0
gamma[3,3] <- 1
...
```
In the same way, the observation matrix is:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# probabilities of y(t) given z(t)
# (read as omega[y(t),z(t)] = omega[Observation,State])
omega[1,1] <- 1 - p1 # Pr(alive 1 t -> non-detected t)
omega[1,2] <- p1 # Pr(alive 1 t -> detected site 1 t)
omega[1,3] <- 0 # Pr(alive 1 t -> detected site 2 t)
omega[2,1] <- 1 - p2 # Pr(alive 2 t -> non-detected t)
omega[2,2] <- 0 # Pr(alive 2 t -> detected site 1 t)
omega[2,3] <- p2 # Pr(alive 2 t -> detected site 2 t)
omega[3,1] <- 1 # Pr(dead t -> non-detected t)
omega[3,2] <- 0 # Pr(dead t -> detected site 1 t)
omega[3,3] <- 0 # Pr(dead t -> detected site 2 t)
...
```
At last, we are ready to specify the likelihood which, and this this the magic of HMM, is the same as the likelihood of the CJS model, only the vector of initial states, the transition and observation matrices were changed:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# likelihood
for (i in 1:N){
# latent state at first capture
z[i,first[i]] <- y[i,first[i]] - 1
for (t in (first[i]+1):K){
# z(t) given z(t-1)
z[i,t] ~ dcat(gamma[z[i,t-1],1:3])
# y(t) given z(t)
y[i,t] ~ dcat(omega[z[i,t],1:3])
}
}
})
```
We need to provide NIMBLE with constants, data, initial values, some parameters to monitor and MCMC details:
```{r eval = FALSE}
# occasions of first capture
first <- apply(y, 1, function(x) min(which(x !=0)))
# constants
my.constants <- list(first = first,
K = ncol(y),
N = nrow(y))
# data
my.data <- list(y = y + 1)
# initial values
zinits <- y # say states are observations, detections in A or B are taken as alive in same sites
zinits[zinits==0] <- sample(c(1,2), sum(zinits==0), replace = TRUE) # non-detections become alive in site A or B
initial.values <- function(){list(phi1 = runif(1, 0, 1),
phi2 = runif(1, 0, 1),
psi12 = runif(1, 0, 1),
psi21 = runif(1, 0, 1),
p1 = runif(1, 0, 1),
p2 = runif(1, 0, 1),
pi1 = runif(1, 0, 1),
z = zinits)}
# parameters to monitor
parameters.to.save <- c("phi1", "phi2","psi12", "psi21", "p1", "p2", "pi1")
# MCMC details
n.iter <- 5000
n.burnin <- 1000
n.chains <- 2
```
Now we may run NIMBLE:
```{r eval = FALSE}
mcmc.multisite <- nimbleMCMC(code = multisite,
constants = my.constants,
data = my.data,
inits = initial.values,
monitors = parameters.to.save,
niter = n.iter,
nburnin = n.burnin,
nchains = n.chains)
```
We may have a look to the results via a caterpillar plot:
```{r, echo = FALSE}
load(here::here("dat","geese_2sites.RData"))
```
```{r}
MCMCplot(mcmc.multisite)
```
Remember mid--Atlantic is site 1, and Chesapeake site 2. Detection in mid--Atlantic (around 0.5) is higher than in Cheasapeake (around 0.4) although it comes with more uncertainty (wider credible interval). Survival in both sites are similar estimated at around 0.6--0.7. Note that by going multisite, we could make these parameters site-specific and differences might reflect habitat quality for example. Now the novelty lies in our capability to estimate movements from site 1 to site 2 and from site 2 to site 1 from a winter to the next. The annual probability of remaining in the same site for two successive winters, used as a measure of site fidelity, was lower in the mid--Atlantic (around 0.7) than in the Chesapeake (0.9). The estimated probability of moving to the Chesapeake from the mid--Atlantic was three times as high as the probability of moving in the opposite direction.
We may also have a look to numerical summaries, which confirm our ecological interpretation of the model parameter estimates:
```{r echo = FALSE}
load(here::here("dat","geese_2sites.RData"))
```
```{r}
MCMCsummary(mcmc.multisite, round = 2)
```
You could add time-dependence to the demographic parameters, e.g. survival and movements, and assess the effect of winter harshness with some temporal covariates. Individual covariates could also be considered.
Before we move to the next section, I will illustrate the use of `nimbleEcology` to fit the AS to the geese data with 2 sites (see Section \@ref(nimblemarginalization)). Using the function `dHMM()` which implements HMM with time-independent observation and transition matrices, we have:
```{r eval = FALSE}
# read in data
geese <- read_csv("geese.csv", col_names = TRUE)
y <- as.matrix(geese)
# drop Carolinas
y[y==3] <- 0 # act as if there was no detection in site 3 Carolinas
mask <- apply(y, 1, sum)
y <- y[mask!=0,] # remove rows w/ 0s only
# get occasions of first encounter
get.first <- function(x) min(which(x != 0))
first <- apply(y, 1, get.first)
# filter out individuals that are first captured at last occasion.
# These individuals do not contribute to parameter estimation,
# and also they cause problems with nimbleEcology
mask <- which(first!=ncol(y))
y <- y[mask, ] # keep only these
first <- first[mask]
# NIMBLE code
multisite.marginalized <- nimbleCode({
# -------------------------------------------------
# Parameters:
# phi1: survival probability site 1
# phi2: survival probability site 2
# psi12: movement probability from site 1 to site 2
# psi21: movement probability from site 2 to site 1
# p1: recapture probability site 1
# p2: recapture probability site 2
# pi1: prop of being in site 1 at initial capture
# -------------------------------------------------
# States (z):
# 1 alive at 1
# 2 alive at 2
# 3 dead
# Observations (y):
# 1 not seen
# 2 seen at site 1
# 3 seen at site 2
# -------------------------------------------------
# priors
phi1 ~ dunif(0, 1)
phi2 ~ dunif(0, 1)
psi12 ~ dunif(0, 1)
psi21 ~ dunif(0, 1)
p1 ~ dunif(0, 1)
p2 ~ dunif(0, 1)
pi1 ~ dunif(0, 1)
# probabilities of state z(t+1) given z(t)
gamma[1,1] <- phi1 * (1 - psi12)
gamma[1,2] <- phi1 * psi12
gamma[1,3] <- 1 - phi1
gamma[2,1] <- phi2 * psi21
gamma[2,2] <- phi2 * (1 - psi21)
gamma[2,3] <- 1 - phi2
gamma[3,1] <- 0
gamma[3,2] <- 0
gamma[3,3] <- 1
# probabilities of y(t) given z(t)
omega[1,1] <- 1 - p1 # Pr(alive 1 t -> non-detected t)
omega[1,2] <- p1 # Pr(alive 1 t -> detected 1 t)
omega[1,3] <- 0 # Pr(alive 1 t -> detected 2 t)
omega[2,1] <- 1 - p2 # Pr(alive 2 t -> non-detected t)
omega[2,2] <- 0 # Pr(alive 2 t -> detected 1 t)
omega[2,3] <- p2 # Pr(alive 2 t -> detected 2 t)
omega[3,1] <- 1 # Pr(dead t -> non-detected t)
omega[3,2] <- 0 # Pr(dead t -> detected 1 t)
omega[3,3] <- 0 # Pr(dead t -> detected 2 t)
# likelihood
# initial state probs
for(i in 1:N) {
init[i, 1:3] <- gamma[ y[i, first[i] ] - 1, 1:3 ] # first state propagation
}
for (i in 1:N){
y[i,(first[i]+1):K] ~ dHMM(init = init[i,1:3], # count data from first[i] + 1
probObs = omega[1:3,1:3], # observation matrix
probTrans = gamma[1:3,1:3], # transition matrix
len = K - first[i], # nb of occasions
checkRowSums = 0) # do not check whether elements in a row sum up to 1
}
})
# constants
my.constants <- list(first = first,
K = ncol(y),
N = nrow(y))
# data
my.data <- list(y = y + 1)
# initial values
initial.values <- function(){list(phi1 = runif(1, 0, 1),
phi2 = runif(1, 0, 1),
psi12 = runif(1, 0, 1),
psi21 = runif(1, 0, 1),
p1 = runif(1, 0, 1),
p2 = runif(1, 0, 1))}
# parameters to monitor
parameters.to.save <- c("phi1", "phi2","psi12", "psi21", "p1", "p2")
# MCMC details
n.iter <- 5000
n.burnin <- 1000
n.chains <- 2
# run NIMBLE
mcmc.multisite.marginalized <- nimbleMCMC(code = multisite.marginalized,
constants = my.constants,
data = my.data,
inits = initial.values,
monitors = parameters.to.save,
niter = n.iter,
nburnin = n.burnin,
nchains = n.chains)
```
We obtain:
```{r echo = FALSE}
load(here::here("dat","twositemarginalized.Rdata"))
```
```{r}
MCMCsummary(mcmc.multisite.marginalized, round = 2)
```
There are slight differences in these parameters estimates compared to those we obtained earlier. This is probably due to the effective sample sizes being much bigger here (by a factor 3) with the marginalized likelihood for the same number of MCMC iterations.
### Goodness of fit {#gofas}
Like for the CJS model, we need to ensure that the AS model provides a good fit to the data. To do so, both posterior predictive checks and classical procedures can be used. I illustrate the use of posterior predictive checks in Section \@ref(ppchecks). With regard to classical tests, the goodness-of-fit testing procedures we covered in Section \@ref(gof) for the CJS model have been extended to the AS model. These procedure are also available in the package `R2ucare`. While the transience and trap-dependence tests can be used on each site, it is worth mentionning a test that is specific to the AS model with several sites. The "where before where after" (WBWA) tests for memory, in other words it's a test for the Markovian property of the HMM in the particular context of capture-recapture. WBWA quantifies whether there is a relationship between where an animal has last been seen and where it will next be seen again. If a relationship exists, positive or negative, this suggests memory in the movements, and the probability for an animal of moving to a site at $t$, given it is present in a site at $t-1$ should be made dependent of where it was at $t-2$. This is called a second-order Markov process.
Going back to the geese data, the WBWA test is implemented as follows on the whole dataset:
```{r eval = TRUE}
library(R2ucare)
geese <- read_csv2("dat/allgeese.csv") %>% as.matrix()
y <- geese[,1:6]
size <- geese[,7]
wbwa <- test3Gwbwa(y, size)
wbwa$test3Gwbwa
```
There is clearly a strong (not to say significant) positive relationship judging by the value of the statistic. I will demonstrate how to account for this memory issue in an extension of the AS model in a case study in Section \@ref(memorymodel).
## What if more than 2 sites?
So far we have considered two sites only, for the sake of simplicity. Indeed when going for three sites (or more), a difficulty arises. While the movement probabilities still need to be between 0 and 1, the sum of all probabilities of moving from a site should also sum to one. This is because an individual alive in site 1 for example, has to stay in 1, move to 2 or move to 3, it has no other choice. This is no problem when you have only two sites because the probability of moving from 1 to 2 is always estimated between 0 and 1, and its complementary, the probability of moving from 2 to 1 will be too. When you have three sites, it might happen that the sum of the estimates for $\psi^{12}$ and $\psi^{13}$ is larger than one, even though they're both between 0 and 1, which would make $\psi^{11} = 1 - \psi^{12} - \psi^{13}$ negative.
There are basically two methods to fulfill both constraints, either to assign a Dirichlet (which is pronounced deer-eesh-lay) prior to the movement probabilities, or to use a multinomial logit link function.
### Dirichlet prior
The Dirichlet distribution extends the Beta distribution multivariate we have seen previously (Figure \@ref(fig:betadistribution)) for values between 0 and 1 that add up to 1. Going back to our example with 3 sites, we would like to come up with a prior for parameters of moving from 1, which are $\psi^{11}$, $\psi^{12}$ and $\psi^{13}$. The Dirichlet distribution of dimension 3 has a vector of parameters $(\alpha_1, \alpha_2, \alpha_3)$ that controls its shape. The sum of all $\alpha$'s can be interpreted as a measure of precision: The higher the sum, the more peaked is the distribution on the mean value, the mean value along each dimension being the ratio of the corresponding over the sum of all $\alpha$'s. When all $\alpha$'s are equal, the distribution is symmetric (Figure \@ref(fig:dirichletdistribution)). Its mean is $(1/3, 1/3, 1/3)$ in our example with 3 parameters, whatever the value of $\alpha$. When $\alpha_1 = \alpha_2 = \alpha_3 = 1$, we obtain a uniform marginal distribution for the $\psi$'s, while values below 1 ($\alpha_1 = \alpha_2 = \alpha_3 = 0.1$) result in the distribution concentrating in the corners (skewed U-shaped forms) and values above 1 ($\alpha_1 = \alpha_2 = \alpha_3 = 10$) result in unimodal marginal distributions.
<!-- The probability density function for dirichlet is $f(x) = \displaystyle{\frac{1}{\text{B}({\mathbf \alpha})} \prod_{i=1}^K{x_i^{\alpha_i-1}}}$ where $\text{B}({\mathbf \alpha}) = \displaystyle{\frac{\prod_{i=1}^K{\Gamma({\alpha_i})}}{\Gamma(\sum_{i=1}^K{\alpha_i})}}$ and ${\mathbf \alpha} = (\alpha_1, \ldots, \alpha_K)$ the concentration parameters and $K$ the dimension of the space where $x$ takes values (Figure \@ref(fig:dirichlet)). -->
<!-- <https://en.wikipedia.org/wiki/Dirichlet_distribution#Occurrence_and_applications> -->
<!-- <https://stats.stackexchange.com/questions/130248/what-is-a-dirichlet-prior> <https://mc-stan.org/docs/functions-reference/dirichlet-distribution.html> <https://www.cs.helsinki.fi/u/ahonkela/dippa/node95.html> <https://rpubs.com/JanpuHou/295096> <https://www.biorxiv.org/content/10.1101/711317v2.full.pdf> <https://www.statisticshowto.com/dirichlet-distribution/> <https://research.wu.ac.at/ws/portalfiles/portal/17761231/Report125.pdf> -->
(ref:captiondirichlet) The Dirichlet distribution as a prior for $(\psi^{11}, \psi^{12}, \psi^{13})$ with vector of parameters $\alpha$. Here all components of $\alpha$ are equal which makes the distribution symmetric, and its mean is $(1/3, 1/3, 1/3)$. When $\alpha = 1$, the prior for the $\psi$'s is uniform (middle panel), unimodal when $\alpha = 10$ (right panel) and concentrated in the corners (0 and 1) when $\alpha = 0.1$ (left panel).
```{r dirichletdistribution, echo = TRUE, message=FALSE, warning=FALSE, fig.cap='(ref:captiondirichlet)'}
library(gtools) # to make the rdirichlet() function available
library(ggtern) # to visually represent multidim prob distribution
set.seed(123) # for reproducibility
n <- 1000 # number of values drawn from Dirichlet distribution
alpha1 <- c(.1, .1, .1)
p1 <- rdirichlet(n, alpha1)
alpha2 <- c(1, 1, 1)
p2 <- rdirichlet(n, alpha2)
alpha3 <- c(10, 10, 10)
p3 <- rdirichlet(n, alpha3)
df <- cbind(rbind(p1, p2, p3), c(rep("alpha = c(0.1, 0.1, 0.1)", n),
rep("alpha = c(1, 1, 1)", n),
rep("alpha = c(10, 10, 10)", n))) %>%
as_tibble() %>%
mutate(x = as.numeric(V1),
y = as.numeric(V2),
z = as.numeric(V3),
alpha = V4)
df %>%
ggtern(aes(x = x, y = y, z = z)) +
stat_density_tern(aes(fill=..level.., alpha=..level..),
geom = 'polygon',
bdl = 0.005) + # a 2D kernel density estimation of the distribution
scale_fill_viridis_b() +
geom_point(alpha = 0.3, pch = "+") +
theme_showarrows() +
scale_T_continuous(breaks = seq(0, 1, by = 0.2),
labels = seq(0, 1, by = 0.2)) +
scale_L_continuous(breaks = seq(0, 1, by = 0.2),
labels = seq(0, 1, by = 0.2)) +
scale_R_continuous(breaks = seq(0, 1, by = 0.2),
labels = seq(0, 1, by = 0.2)) +
labs(x = "",
y = "",
z = "",
Tarrow = "psi11",
Larrow = "psi12",
Rarrow = "psi13") +
guides(color = "none", fill = "none", alpha = "none") +
facet_wrap(~alpha, ncol = 3)
```
Going back to our example, in NIMBLE, we consider a Dirichlet prior for each triplet of movement parameters, from site 1 ($\psi^{11}$, $\psi^{12}$ and $\psi^{13}$), from site 2 ($\psi^{21}$, $\psi^{22}$ and $\psi^{23}$) and from site 3 ($\psi^{31}$, $\psi^{32}$ and $\psi^{33}$).
We start by setting the scene with comments:
```{r eval=FALSE}
...
# -------------------------------------------------
# Parameters:
# phi1: survival probability site 1
# phi2: survival probability site 2
# phi3: survival probability site 3
# psi11 = psi1[1]: movement probability from site 1 to site 1 (reference)
# psi12 = psi1[2]: movement probability from site 1 to site 2
# psi13 = psi1[3]: movement probability from site 1 to site 3
# psi21 = psi2[1]: movement probability from site 2 to site 1
# psi22 = psi2[2]: movement probability from site 2 to site 2 (reference)
# psi23 = psi2[3]: movement probability from site 2 to site 3
# psi31 = psi3[1]: movement probability from site 3 to site 1
# psi32 = psi3[2]: movement probability from site 3 to site 2
# psi33 = psi3[3]: movement probability from site 3 to site 3 (reference)
# p1: recapture probability site 1
# p2: recapture probability site 2
# p3: recapture probability site 3
# -------------------------------------------------
# States (z):
# 1 alive at 1
# 2 alive at 2
# 2 alive at 3
# 3 dead
# Observations (y):
# 1 not seen
# 2 seen at 1
# 3 seen at 2
# 3 seen at 3
...
```
```{r eval = FALSE}
multisite <- nimbleCode({
...
# transitions: Dirichlet priors
psi1[1:3] ~ ddirch(alpha[1:3]) # psi11, psi12, psi13
psi2[1:3] ~ ddirch(alpha[1:3]) # psi21, psi22, psi23
psi3[1:3] ~ ddirch(alpha[1:3]) # psi31, psi32, psi33
...
```
Then we use these parameters (which now respect the constraints) to define the transition matrix:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# probabilities of state z(t+1) given z(t)
gamma[1,1] <- phi1 * psi1[1]
gamma[1,2] <- phi1 * psi1[2]
gamma[1,3] <- phi1 * psi1[3]
gamma[1,4] <- 1 - phi1
gamma[2,1] <- phi2 * psi2[1]
gamma[2,2] <- phi2 * psi2[2]
gamma[2,3] <- phi2 * psi2[3]
gamma[2,4] <- 1 - phi2
gamma[3,1] <- phi3 * psi3[1]
gamma[3,2] <- phi3 * psi3[2]
gamma[3,3] <- phi3 * psi3[3]
gamma[3,4] <- 1 - phi3
gamma[4,1] <- 0
gamma[4,2] <- 0
gamma[4,3] <- 0
gamma[4,4] <- 1
...
```
When we fit this model to the geese dataset with the detections in the Carolinas wintering site back in, and with `alpha <- c(1, 1, 1)` passed to the constants, we obtain the following results:
```{r echo = FALSE}
load(here::here("dat","geese_3sites_dirichlet.RData"))
```
```{r}
MCMCsummary(mcmc.multisite, round = 2)
```
Survival probabilities are similar among sites, but the detection probability in Carolinas seems much lower than in the two other wintering sites. The estimated probability of moving to the Chesapeake from the Carolinas is 2 times as high as the probability of moving in the opposite direction.
In theory, you could include covariates as in \@ref(covariates) through the $\alpha$ parameters and the use a log link function (to ensure $\alpha > 0$), e.g. $\log(\alpha) = \beta_1 + \beta_2 x$. However, NIMBLE does not allow that. Fortunately, there is another way to specify the Dirichlet distribution through a ratio of gamma distributions that allows to incorporate covariates.
The gamma distribution is continuous. It has two parameters $\alpha$ and $\theta$ that control its shape and scale (Figure \@ref(fig:gammadistribution)). Another parameterization considers its shape and rate which is the inverse of the scale.
(ref:captiongamma) The distribution gamma($\alpha$,$\theta$) for different values of $\alpha$ and $\theta$. The shape argument $\alpha$ determines the overall shape while the scale parameter $\theta$ only affects the scale values (compare the values on X- and Y-axes between the bottom and upper panels). The exponential and chi-square distributions are particular cases of the gamma distribution. If the parameter shape is close to zero, the gamma is very similar to the exponential (bottom and upper left panels). If the parameter shape is large, then the gamma is similar to the chi-squared distribution (bottom and upper right panels).
```{r gammadistribution, echo = FALSE, fig.cap='(ref:captiongamma)'}
# https://blog.revolutionanalytics.com/2015/10/parameters-and-percentiles-the-gamma-distribution.html
plotGamma <- function(shape=2, rate=0.5, to=0.99, p=c(0.1, 0.9), cex=1, ...){
to <- qgamma(p = to, shape = shape, rate = rate)
curve(dgamma(x, shape, rate), from = 0, to = to, n = 500, type="l",
main = sprintf("gamma(%1.1f, %1.1f)", shape, rate),
bty = "n", xaxs = "i", yaxs = "i", col = "blue", xlab = "", ylab = "",
las = 1, lwd = 2, cex = cex, cex.axis = cex, cex.main = cex, ...)
}
par(mfrow = c(2, 3))
plotGamma(1, 0.1)
plotGamma(3, 0.1)
plotGamma(6, 0.1)
plotGamma(1, 1)
plotGamma(3, 1)
plotGamma(6, 1)
```
If we have three independent random variables $Y_1, Y_2$ and $Y_3$ distributed as gamma distributions with shape parameters the $\alpha$'s and same scale parameter $\theta$, that is $Y_j \sim \text{Gamma}(\alpha_j, \theta)$, then it can be shown that the vector $(Y_1/V, Y_2/V, Y_3/V)$ is Dirichlet with vector of parameters the $\alpha$'s, where $V$ is the sum of the $Y$'s (which is also gamma distributed). In NIMBLE, this suggests using $\theta = 1$ to get a uniform prior:
```{r eval = FALSE}
...
# transitions: Dirichlet priors with Gamma formulation
for (s in 1:3){
lpsi1[s] ~ dgamma(alpha[s], 1) # Y1, Y2, Y3 for psi11, psi12, psi13
psi1[s] <- lpsi1[s]/V1 # psi11, psi12, psi13
lpsi2[s] ~ dgamma(alpha[s], 1) # Y'1, Y'2, Y'3 for psi21, psi22, psi23
psi2[s] <- lpsi2[s]/V2 # psi21, psi22, psi23
lpsi3[s] ~ dgamma(alpha[s], 1) # Y''1, Y''2, Y''3 for psi31, psi32, psi33
psi3[s] <- lpsi3[s]/V3 # psi31, psi32, psi33
}
V1 <- sum(lpsi1[1:3])
V2 <- sum(lpsi2[1:3])
V3 <- sum(lpsi3[1:3])
...
```
From there, we can express the shape parameter of the gamma distribution (precisely the $\alpha$'s here) as a function of covariates as in $\log(\alpha) = \beta_1 + \beta_2 x$ in the spirit of a generalized linear model with a gamma response.
<!-- <https://en.wikipedia.org/wiki/Dirichlet_distribution#Related_distributions>, <https://en.wikipedia.org/wiki/Dirichlet_distribution#Random_variate_generation>, <https://groups.google.com/g/nimble-users/c/4XrJbRLUOyY/m/iDLeAIovAQAJ> -->
### Multinomial logit {#multinomiallogit}
<!-- <https://socialwork.wayne.edu/research/pdf/multi-nomial-logistic-regression.pdf>, <https://www3.nd.edu/~rwilliam/stats3/Mlogit1.pdf>, <https://stats.oarc.ucla.edu/stata/dae/multinomiallogistic-regression/> and <https://en.wikipedia.org/wiki/Multinomial_logistic_regression>. -->
Another possibility to build a prior that ensures the movement probabilities are between 0 and 1 and sum up to 1 is to extend the logit link we used for the CJS model in Section \@ref(covariates). Remember we had $\text{logit}(\phi) = \beta$, we specified a prior on $\beta$ say $\beta \sim N(0,1.5)$ then got a prior on $\phi$ by back-transforming $\beta$ with $\phi = \text{logit}^{-1}(\beta)$.
Going back to our example with $3$ sites, and focusing on the movement probabilities say, from site 1, we first choose a reference (or pivot) site, say 1, then $\log\left(\displaystyle{\frac{\psi^{12}}{\psi^{11}}}\right) = \beta_2$ and $\log\left(\displaystyle{\frac{\psi^{13}}{\psi^{11}}}\right) = \beta_3$. Interestingly, when exponentiated, the $\beta$'s here can be interpreted as the increase in the odds of moving versus staying on site resulting from a one-unit increase in the covariate. Any of the sites can be chosen to be the reference, this will not change the likelihood and you will get the same results. Now we specify a normal prior distribution for the $\beta$'s. Eventually, to back-transform, we use $\psi^{12} = \displaystyle{\frac{\exp(\beta_2)}{1+\displaystyle{\exp(\beta_2)+\displaystyle{\exp(\beta_3)}}}}$ and $\psi^{13} = \displaystyle{\frac{\exp(\beta_3)}{1+\displaystyle{\exp(\beta_2)+\displaystyle{\exp(\beta_3)}}}}$. The reference parameter, here $\psi^{11}$, is calculated as $\psi^{11} = \displaystyle{\frac{1}{1 + \displaystyle{\exp(\beta_2)+\displaystyle{\exp(\beta_3)}}}}$, or simply as the complementary probability $\psi^{11} = 1 - \psi^{12} - \psi^{13}$.
Note that when there are only 2 sites instead of 3 or more, then the multinomial logit reduces to the logit link.
In NIMBLE, we write:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# transitions: multinomial logit
for (i in 1:2){
# normal priors on logit of all but one movement prob
beta1[i] ~ dnorm(0, sd = 1.5)
beta2[i] ~ dnorm(0, sd = 1.5)
beta3[i] ~ dnorm(0, sd = 1.5)
# constrain the transitions such that their sum is < 1
psi1[i] <- exp(beta1[i]) / (1 + exp(beta1[1]) + exp(beta1[2]))
psi2[i] <- exp(beta2[i]) / (1 + exp(beta2[1]) + exp(beta2[2]))
psi3[i] <- exp(beta3[i]) / (1 + exp(beta3[1]) + exp(beta3[2]))
}
# reference movement probability
psi1[3] <- 1 - psi1[1] - psi1[2]
psi2[3] <- 1 - psi2[1] - psi2[2]
psi3[3] <- 1 - psi3[1] - psi3[2]
...
```
Then we use these parameters (which now respect the constraints) to define the transition matrix:
```{r eval = FALSE}
multisite <- nimbleCode({
...
# probabilities of state z(t+1) given z(t)
gamma[1,1] <- phi1 * psi1[1]
gamma[1,2] <- phi1 * psi1[2]
gamma[1,3] <- phi1 * psi1[3]
gamma[1,4] <- 1 - phi1
gamma[2,1] <- phi2 * psi2[1]
gamma[2,2] <- phi2 * psi2[2]
gamma[2,3] <- phi2 * psi2[3]
gamma[2,4] <- 1 - phi2
gamma[3,1] <- phi3 * psi3[1]
gamma[3,2] <- phi3 * psi3[2]
gamma[3,3] <- phi3 * psi3[3]
gamma[3,4] <- 1 - phi3
gamma[4,1] <- 0
gamma[4,2] <- 0
gamma[4,3] <- 0
gamma[4,4] <- 1
...
```
You may check that the results are very similar to those we obtained with the Dirichlet prior:
```{r echo = FALSE}
load(here::here("dat","geese_3sites_logit.RData"))
```
```{r}
MCMCsummary(mcmc.multisite, round = 2)
```
Both the Dirichlet prior and the multinomial logit link give the same results, and there is not much difference in terms of runtime or quality of convergence, so you should use the option you feel the most comfortable with.
## Sites may be states {#states}
So far, we have considered geographical locations (or sites) to refine the alive information when an animal is detected. However, it was quickly realized that sites could actually be states defined by physiology or behavior, hence opening up an avenue for applications of capture-recapture models in many fields of ecology.
Examples of states include:
+ Epidemiological or disease states: sick/healthy, uninfected/infected/recovered;
+ Morphological states: small/medium/big, light/medium/heavy;
+ Life-history states: e.g. breeder/non-breeder, failed breeder, first-time breeder;
+ Developmental states: e.g. juvenile/subadult/adult;
+ Social states: e.g. solitary/group-living, subordinate/dominant;
+ Death states: e.g. alive, dead from harvest, dead from natural causes.
In brief, states are individual, time-specific discrete covariates.
### Titis data
To illustrate this section, we will consider data collected between 1942 and 1956 by Lance Richdale on the Sooty shearwaters (*Ardenna grisea*), also known as titis (Figure \@ref(fig:pixtiti)).
```{r pixtiti, echo=FALSE, out.width="100%", fig.cap="Sooty shearwater (*Ardenna grisea*). Credit: John Harrison.", fig.align='center'}
knitr::include_graphics("images/titi.jpg")
```
You may see the data below:
```{r echo = TRUE}
titis <- read_csv2("dat/titis.csv",
col_names = FALSE)
titis %>%
rename(year_1942 = X1,
year_1943 = X2,
year_1944 = X3,
year_1949 = X4,
year_1952 = X5,
year_1953 = X6,
year_1956 = X7)
```
<!-- ```{r echo = FALSE} -->
<!-- knitr::include_graphics("images/sooty.jpg") -->
<!-- ``` -->
In total, 1013 titis were captured, marked and recaptured on a small colony on Whero Island in southern New Zealand. These data were previously analyzed by Richard Scofield who kindly provided us with the data.
Following the way the data were collected, four states were originally considered: Alive breeder; Accompanied by another bird in a burrow; Alone in a burrow; On the surface; Dead. For simplicity, we pooled all alive states (except breeder) together in a non-breeder state (NB) that includes failed breeders (birds that had bred previously – skip reproduction or divorce) and pre-breeders (birds that had yet to breed). Because burrows were not checked before hatching, some birds in the category NB might have already failed. Therefore birds in the breeder state (B) should be seen as successful breeders, and those in the NB state as nonbreeders plus prebreeders and failed breeders.
In summary, we code the states as 1 for alive and breeding, 2 for alive and non-breeding and 3 for dead. To make the modelling process more self-explanatory, we will use letters B, NB and D but you should keep in mind that these are actually numbers. Observations are non-detected coded as 1, and detected as breeder or non-breeder coded as 2 and 3 respectively.
The aim here is to study life-history trade--offs. Specifically, we ask the questions: Does breeding affect survival? Does breeding in current year affect breeding next year?
### The AS model for states
Basically, the AS model with states is the same model as in the geese example with two sites, see Sections \@ref(ASmodel) and \@ref(ASmodelfitting).
If we let $\phi^B$ and $\phi^{NB}$ be the survival probabilities of breeders and non-breeders, $\psi^{NBB}$ the probability of becoming breeder for a non-breeder, and $\psi^{BNB}$ the probability of skipping reproduction, then the transition matrix is:
$$\begin{matrix}
& \\
\mathbf{\Gamma} =
\left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right .
\end{matrix}
\hspace{-1.2em}
\begin{matrix}
z_t=B & z_t=NB & z_t=D \\ \hdashline
\phi^B (1-\psi^{BNB}) & \phi^B \psi^{BNB} & 1 - \phi^B\\
\phi^{NB} \psi^{NBB} & \phi^{NB} (1-\psi^{NBB}) & 1 - \phi^{NB}\\
0 & 0 & 1
\end{matrix}
\hspace{-0.2em}
\begin{matrix}
& \\
\left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right )
\begin{matrix}
z_{t-1}=B \\ z_{t-1}=NB \\ z_{t-1}=D
\end{matrix}
\end{matrix}$$
The costs or reproduction would reflect in future reproduction if breeders have a lower probability of breed next year than non-breeders $\psi^{BB} = 1 - \psi^{BNB} < \psi^{NBB}$ or in survival if the survival of breeders is lower than that of non-breeders $\phi^B < \phi^{NB}$.
The observation matrix is:
$$\begin{matrix}
& \\
\mathbf{\Omega} =
\left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right .
\end{matrix}
\hspace{-1.2em}
\begin{matrix}
y_t=1 & y_t=2 & y_t=3 \\ \hdashline
1 - p^B & p^B & 0\\
1 - p^{NB} & 0 & p^{NB}\\
1 & 0 & 0
\end{matrix}
\hspace{-0.2em}
\begin{matrix}
& \\
\left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right )
\begin{matrix}
z_{t}=B \\ z_{t}=NB \\ z_{t}=D
\end{matrix}
\end{matrix}$$
where $p^B$ and $p^{NB}$ are the detection proabilities of non-breeders and breeders respectively.
### NIMBLE implementation
We first write the NIMBLE code, which is exactly the same as in the geese example with two sites, see Section \@ref(ASmodelfitting).
First some definitions which we have as comments in the code:
```{r eval = FALSE}
multistate <- nimbleCode({
# -------------------------------------------------
# Parameters:
# B is for breeder, NB for non-breeder
# phiB: survival probability state B
# phiNB: survival probability state NB
# psiBNB: transition probability from B to NB
# psiNBB: transition probability from NB to B
# pB: detection probability B
# pNB: detection probability NB
# -------------------------------------------------
# States (z):
# 1 alive B
# 2 alive NB
# 3 dead
# Observations (y):
# 1 not seen
# 2 seen as B
# 3 seen as NB
# -------------------------------------------------
...
```
Then the priors:
```{r eval = FALSE}
multistate <- nimbleCode({
...
# Priors
phiB ~ dunif(0, 1)
phiNB ~ dunif(0, 1)
psiBNB ~ dunif(0, 1)
psiNBB ~ dunif(0, 1)
pB ~ dunif(0, 1)
pNB ~ dunif(0, 1)
...
```
The transition matrix:
```{r eval = FALSE}
multistate <- nimbleCode({
...
# probabilities of state z(t+1) given z(t)
gamma[1,1] <- phiB * (1 - psiBNB)
gamma[1,2] <- phiB * psiBNB
gamma[1,3] <- 1 - phiB
gamma[2,1] <- phiNB * psiNBB
gamma[2,2] <- phiNB * (1 - psiNBB)
gamma[2,3] <- 1 - phiNB
gamma[3,1] <- 0
gamma[3,2] <- 0
gamma[3,3] <- 1
...
```
The observation matrix:
```{r eval = FALSE}
multistate <- nimbleCode({
...
# probabilities of y(t) given z(t)
omega[1,1] <- 1 - pB # Pr(alive B t -> non-detected t)
omega[1,2] <- pB # Pr(alive B t -> detected B t)
omega[1,3] <- 0 # Pr(alive B t -> detected NB t)
omega[2,1] <- 1 - pNB # Pr(alive NB t -> non-detected t)
omega[2,2] <- 0 # Pr(alive NB t -> detected B t)
omega[2,3] <- pNB # Pr(alive NB t -> detected NB t)
omega[3,1] <- 1 # Pr(dead t -> non-detected t)
omega[3,2] <- 0 # Pr(dead t -> detected N t)
omega[3,3] <- 0 # Pr(dead t -> detected NB t)
...
```
And the likelihood:
```{r eval = FALSE}
multistate <- nimbleCode({
...
# likelihood
for (i in 1:N){
# latent state at first capture
z[i,first[i]] <- y[i,first[i]] - 1
for (t in (first[i]+1):K){
# z(t) given z(t-1)
z[i,t] ~ dcat(gamma[z[i,t-1],1:3])
# y(t) given z(t)
y[i,t] ~ dcat(omega[z[i,t],1:3])
}
}
})
```
We run NIMBLE and get the following results:
```{r echo = FALSE}
load(here::here("dat","titis.RData"))
```
```{r}
MCMCsummary(mcmc.multistate, round = 2)
```
Non-breeder individuals seem to have a survival higher than breeder individuals, suggesting a trade-off between reproduction and survival. Let's compare graphically the survival of breeder and non-breeder individuals. First we gather the values generated for $\phi^B$ and $\phi^{NB}$ for the two chains:
```{r}
phiB <- c(mcmc.multistate$chain1[,"phiB"], mcmc.multistate$chain2[,"phiB"])
phiNB <- c(mcmc.multistate$chain1[,"phiNB"], mcmc.multistate$chain2[,"phiNB"])
df <- data.frame(param = c(rep("phiB", length(phiB)),
rep("phiNB", length(phiB))),
value = c(phiB, phiNB))
```
Then, we plot the two posterior distributions:
```{r}
df %>%
ggplot(aes(x = value, fill = param)) +
geom_density(color = "white", alpha = 0.6, position = 'identity') +
scale_fill_manual(values = c("#69b3a2", "#404080")) +
labs(fill = "", x = "survival")
```
There is little overlap between the two distributions, suggesting an actual trade--off. A formal test of the trade-off would consist in fitting the model with survival irrespective of the state, and compare its WAIC value to the model we just fitted.
What about a potential trade-off on reproduction?
```{r}
psiBNB <- c(mcmc.multistate$chain1[,"psiBNB"], mcmc.multistate$chain2[,"psiBNB"])
psiBB <- 1 - psiBNB
psiNBB <- c(mcmc.multistate$chain1[,"psiNBB"], mcmc.multistate$chain2[,"psiNBB"])
df <- data.frame(param = c(rep("psiBB", length(phiB)),
rep("psiNBB", length(phiB))),
value = c(psiBB, psiNBB))
df %>%
ggplot(aes(x = value, fill = param)) +
geom_density(color = "white", alpha = 0.6, position = 'identity') +
scale_fill_manual(values = c("#69b3a2", "#404080")) +
labs(fill = "", x = "breeding probabilities")
```
There is no overlap whatsoever, so the two transition probabilities are clearly different. Interestingly, breeder individuals do much better than non-breeder individuals. This failure at detecting a trade-off is probably due to individual heterogeneity that should be accounted for. We will do just that in a case study in Section \@ref(casestudytradeoff).
<!-- See M. Paquet suggestion: Pouvoir utiliser le modèle à la fois pour simuler des données, puis ensuite pour les ajuster au modèle (le tout sans avoir à réécrire le modèle!). Exemple: nodesToSim <- model$getDependencies(c("parameter_name1","parameter_name2"), self = F, downstream = T), # compile Cmodel <- compileNimble(model), #simulate Cmodel$simulate(nodesToSim) -->