-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathbv57_exploratory_density.qmd
275 lines (210 loc) · 10 KB
/
bv57_exploratory_density.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
---
title: "Analysis of UVP data along a transect"
---
# How does abundance change along the transect?
### 1 - We need to load in and parse through the data
We need to first import the data and select the core information that we need. The UVP project folder has all the casts we want and then some. We only want the BATS Validation cruise - bv57. We'll need to remove the hs casts.
#### Setting up the requisites
```{r echo = F, warning=F}
rm(list = ls())
library(EcotaxaTools)
library(tibble)
library(ggplot2)
```
#### Importing data and trimming list to keep only bv casts
```{r}
uvp_dat <- ecopart_import("~/BV_transect_export") #load in initial ecopart object
# we need to trim out the hs casts
# One way to do that would be to hard-code indices for those but that's not reliable
#So I'm using regex
#get index of hs data for both par and zoo files
hs_index_par <- grep("hs",names(uvp_dat$par_files))
hs_index_zoo <- grep('hs', names(uvp_dat$zoo_files))
uvp_dat$par_files <- uvp_dat$par_files[-hs_index_par]
uvp_dat$zoo_files <- uvp_dat$zoo_files[-hs_index_zoo]
```
### 2 - Identifying the meta-data
We want to now get a little idea of the data that we have. The main questions we have are how many locations are at each site, which are day vs night, and how deep does each cast go?
#### First, let's look at the time of day for these casts
All these data are in the meta file. Something to keep in mind is that UVP time is recorded in UTC - so you need to get sunrise/sunset times from [NOAA's Solar Calculator](https://gml.noaa.gov/grad/solcalc/sunrise.html). I'm not super precise with it, I just go with the month at longitude.
```{r}
# I'm writing a function for myself for later
get_time <- function(full_date) {
time_stamp <- sapply(strsplit(as.character(full_date)," "),
`[[`,2)
return(as.POSIXct(time_stamp, format = '%H:%M:%S'))
}
#my function above will just get the time and set it to todays date but that is fine
uvp_dat$meta$time_of_day <- sapply(get_time(uvp_dat$meta$sampledate), timeOfDay,'10:18:00','21:48:00',1) #this will add time of day to our metadata file
```
There's actually very few night casts, let's just go with the day casts then for this project.
#### Now we want to check the max-depth for all our casts
```{r}
# a quick funciton to find max depth
get_max_d <- function(zoo_file) {
max_d <- zoo_file$depth_including_offset[which.max(zoo_file$depth_including_offset)]
return(max_d)
}
max_d <- rep(NA, length(uvp_dat$zoo_files))
for(i in 1:length(uvp_dat$zoo_files)) {
max_d[i] <- get_max_d(uvp_dat$zoo_files[[i]])
}
depth_tibble <- tibble(profileid = names(uvp_dat$zoo_files),
max_d = max_d)
uvp_dat$meta <- as_tibble(merge(uvp_dat$meta,
depth_tibble))
tibble(Profile_id = uvp_dat$meta$profileid,
lat = uvp_dat$meta$latitude,
tod = uvp_dat$meta$time_of_day,
max_d = uvp_dat$meta$max_d)
```
Looking at this data, we have several casts going to at least 500m, some going to 1200m, and several going to 5000m. So what we can do is look at four ocean regions - epipelagic (0-200), upper mesopelagic (200-500), lower mesopelagic (500 - 1200), bathypelagic (1200 - 4000).
Next steps will be to calculate concentration for each profile, then look at integrated densities. But first we need to rename our data to core taxonomic groups.
#### First let's look at which taxonomic categories we have:
```{r}
list_taxa <- function(uvp_dat) {
tdf <- do.call(rbind, uvp_dat$zoo_files)
return(tdf$name)
} #should add name column option
cbind(names(table(list_taxa(uvp_dat))),
table(list_taxa(uvp_dat)) / length(list_taxa(uvp_dat)))
```
From this, let's look at: Rhizaria, Copepods, Trichodesmium, Chaetognath, Annelida, Eumalacostraca. One issue we want to correct is that darksphere is classified as living and we want to change that to be non-living
```{r}
names <- c('living', "Rhizaria","Copepoda","Trichodesmium",
"Chaetognatha",'Annelida',"Eumalacostraca",'not-living','darksphere')
uvp_dat <- uvp_dat %>%
add_zoo(names_to, 'name', names, suppress_print = T)
darksphere_switch <- function(zoo_file) {
new_name <- zoo_file$name
new_name[which(new_name == 'darksphere')] <- 'not-living'
return(new_name)
}
uvp_dat <- add_zoo(uvp_dat, darksphere_switch, 'name')
tibble(names(table(list_taxa(uvp_dat))),
table(list_taxa(uvp_dat)) / length(list_taxa(uvp_dat)))
```
### 3 - Calculating the concentration & Integrating
We now want to calculate the concentration of different taxa. We'll do this in 20m bins.
```{r}
conc_list <- vector('list',length(uvp_dat$zoo_files))
names(conc_list) <- names(uvp_dat$zoo_files)
for(i in 1:length(conc_list)) {
conc_list[[i]] <- uvp_conc(uvp_dat,names(conc_list)[i],seq(0,max(uvp_dat$zoo_files[[i]]$depth_including_offset),20))
conc_list[[i]]$mp <- get_bin_limtis(conc_list[[i]]$db)$mp
}
```
With this, we can now want the integrated abundance for our four ocean zones. We'll set up a holder shell first. Then fill it out with needed information about each cast which goes into it. We'll reference the meta data folder a lot for this:
```{r}
intg_abund <- vector('list', 4)
names(intg_abund) <- c('euphotic','upper_meso',
'lower_meso','bathy')
#a function I might use later
#' @param max_d a vector of max_depths corresponding to casts
#' @param depth_breaks numeric vector with zone limits
#' @param zone_labels character vector of names for lower limits
assign_zones <- function(max_d, depth_breaks, zone_labels) {
if(length(depth_breaks) != length(zone_labels)) {
stop('depth_breaks and zone_labels must be equal length')
}
d_match <- sapply(max_d, nearest, depth_breaks)
zones <- zone_labels[match(d_match, depth_breaks)]
return(zones)
}
uvp_dat$meta$zone <- sapply(uvp_dat$meta$max_d, assign_zones,
c(500,1200,4000),
c(2,3,4))
for(i in 1:length(intg_abund)) {
intg_abund[[i]]$d_lim <- c(0,200,500,1200,4000)[c(i, i+1)]
intg_abund[[i]] <- c(intg_abund[[i]], vector('list', 5))
names(intg_abund[[i]]) <- c('d_lim','Trichodesmium','Rhizaria',
'Eumalacostraca','Copepoda', 'Chaetognatha')
for(j in 2:length(intg_abund[[i]])) {
intg_abund[[i]][[j]]$profile_id <- uvp_dat$meta$profileid[which(uvp_dat$meta$zone >= i)]
}
}
```
Now we have a list shell set up to integrate abundance for each of our taxa of interest in each zone. This will be a fairly large loop since we're iterating the process many times over. This is definitely the quick and dirty way to get things done. In the future I might want to build out functions for each of these tasks then apply them to each individual taxa
```{r}
#ugly functions to avoid ridiculous looping
subset_taxa <- function(df, taxa) {
rdf <- df[df$taxa == taxa,]
return(rdf)
}
subset_d <- function(df, d_lower, d_upper) {
rdf <- df[df$mp > d_lower &
df$mp < d_upper, ]
return(rdf)
}
org_loop <- function(df, lim_list) {
intg <- trapz_integarate(df$mp, df$conc_m3,
lim_list$d_lim[1], lim_list$d_lim[2],
subdivisions = 500)
return(intg$value)
}
nrow_check <- function(df) {
if(nrow(df) == 0) {
return(FALSE)
} else {
return(TRUE)
}
}
for(i in 1:length(intg_abund)) {
for(j in 2:length(intg_abund[[i]])) {
temp_conc_list <- lapply(conc_list, subset_taxa, names(intg_abund[[i]])[j])
temp_conc_list <- lapply(temp_conc_list, subset_d, intg_abund[[i]]$d_lim[1], intg_abund[[i]]$d_lim[2])
temp_conc_list <- temp_conc_list[which(names(temp_conc_list) %in% intg_abund[[i]][[j]]$profile_id)]
if(any(!(sapply(temp_conc_list,nrow_check)))) {
drop <- which(!(sapply(temp_conc_list,nrow_check)))
temp_conc_list <- temp_conc_list[-drop]
intg_abund[[i]][[j]]$profile_id <- intg_abund[[i]][[j]]$profile_id[-drop]
}
intg_taxa <- sapply(temp_conc_list, org_loop, intg_abund[[i]])
intg_abund[[i]][[j]]$num_m2 <- as.numeric(intg_taxa[match(names(intg_taxa),intg_abund[[i]][[j]]$profile_id)])
}
}
```
Great! Now we have a big set of data with integrated abundances in each four depth zone. Now all we have to do is plot them according to their latitudes. Looking at the data however, it seems that we have really limited data for Eumalacostraca and Chaetognatha. So let's just look at Copepods, Rhizaria, and Trichodesmium.
### 4 - Plotting our data
Before we plot the data, it is nice to get everything into a clean plotting dataframe. I'm thinking that it'll be best to plot all three taxa on panels based on the depth
```{r}
#set the taxa we are intested in plotting
plot_taxa <- c('Trichodesmium', 'Rhizaria', 'Copepoda')
mush_into_df <- function(obj) {
rdf <- as.data.frame(obj[[1]])
rdf$taxa <- rep(names(obj), nrow(rdf))
return(rdf)
}
plot_list <- vector('list', 4)
names(plot_list) <- c('Euphotic', 'Upper_Meso','Lower_Meso','Bathy')
for(i in 1:length(plot_list)) {
for(j in 1:length(intg_abund[[i]][plot_taxa])) {
plot_list[[i]][[j]] <- mush_into_df(intg_abund[[i]][plot_taxa[j]])
}
plot_list[[i]] <- do.call(rbind, plot_list[[i]])
plot_list[[i]] <- merge(plot_list[[i]],
data.frame(profile_id = uvp_dat$meta$profileid,
lat = uvp_dat$meta$latitude))
}
```
Great! Now everything is all set up for us to plot our figures based on latitude
```{r}
plot_output <- vector('list', 4)
for(i in 1:length(plot_output)) {
plot_list[[i]] <- plot_list[[i]][!is.na(plot_list[[i]]$num_m2),]
plot_output[[i]] <- ggplot(plot_list[[i]],aes(y = num_m2, x = lat, col = taxa)) +
geom_point(size = 2) +
geom_smooth(se = F, method = 'loess',
size = 1, alpha = .05, span = 1)+
coord_flip()+
labs(x = 'Latitude', y = 'Integrated Abudance [Number/m2]',
subtitle = names(plot_list)[i],
col = "")+
theme_bw()
}
print(plot_output)
```
Cool we have some figures to start working with. I'm saving a bit of data to and providing here. It is an RDS so to load it you just type readRDS('filename.rds').
```{r}
saveRDS(plot_list, './item01_integrated-abundances-by-zone')
```