-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path2022_colourlog.Rmd
402 lines (341 loc) · 12.9 KB
/
2022_colourlog.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
---
title: "colourlog: generate plots of colourlogs from core photographs"
author:
- name : Ilja J. Kocken
affiliation: Utrecht University
email: [email protected]
orcid: 0000-0003-2196-8718
output: html_document
<!-- bibliography: TODO.bib -->
<!-- csl: style.cls -->
---
```{r setup, include=FALSE}
options(
crayon.enabled = FALSE
)
knitr::opts_chunk$set(
comment = '#', fig.width = 6, fig.height = 6, strip.white = TRUE
)
```
Here we try to generate an image of the colourlog that you have generated based
on core photographs!
# assumed pre-work
This script assumes that you have already successfully applied the DeCrack
routine in Matlab, which gets colour information from the core photograph after
removing cracks [@cite:Zeeden2015].
In short, the procedure is to first clean up the core photographs manually (in
Photoshop or similar).
1. download the core photographs from the [janus database](http://iodp.tamu.edu/janusweb/imaging/photo.shtml)
2. split them into sections that you manually align, so that the image
distortion is minimized
3. clean up each section, by colouring bioturbation signs and water sample foam
black
4. run the DeCrack matlab script in matlab
# load necessary libraries
```{r}
library(tidyverse)
library(readxl)
library(plotly)
library(ggtextures)
library(patchwork)
```
# read in the decracked data
The output of the matlab script is stored in .dat files where
1 is unaltered data
2 is after removed cracks data
## create a list of column names
Column names are described in table 1 in [@cite:Zeeden2015].
```{r}
names <- c("distance_from_core_top1",
"mean_greyscale_values1",
"standard_deviation_of_greyscale_values1",
"median_of_greyscale_values1",
"mean_red_values1",
"standard_deviation_of_red_values1",
"median_of_red_values1",
"mean_green_values1",
"standard_deviation_of_green_values1",
"median_of_green_values1",
"mean_blue_values1",
"standard_deviation_of_blue_values1",
"median_of_blue_values1",
"distance_from_core_top2",
"mean_greyscale_values2",
"standard_deviation_of_greyscale_values2",
"median_of_greyscale_values2",
"mean_red_values2",
"standard_deviation_of_red_values2",
"median_of_red_values2",
"mean_green_values2",
"standard_deviation_of_green_values2",
"median_of_green_values2",
"mean_blue_values2",
"standard_deviation_of_blue_values2",
"median_of_blue_values2")
```
## read in a single output file to see if it works with these column names
```{r}
dat1 <- read_delim("dat/decrackoutput/959A30X-1.dat",
delim = ' ',
trim_ws = TRUE,
col_names = names)
glimpse(dat1)
```
That seems to work!
## List all the datafiles
```{r}
path <- "dat/decrackoutput"
files <- list.files(path,
pattern = '.dat$',
full.names = FALSE)
```
## Get metadata from filename
We create a [tibble](https://tibble.tidyverse.org/)/dataframe with one file per
line, and extract all the relevant metadata from the filename into separate
columns.
```{r}
main <- tibble(file=files) |>
mutate(site = str_extract(file, "^\\d+") |>
parse_integer(),
hole = str_extract(file, "[A-D]"),
type = str_extract(file, "X"),
core = str_extract(file, "\\d+X-") |>
str_replace("X-", "") |>
parse_integer(),
section = str_extract(file, "-\\d+") |>
str_replace("-", "") |>
parse_integer())
main
```
## Read colour data
This now uses the above tibble and reads in the file like we did for `dat1`
above. Except now each row is a [nested tibble](https://tidyr.tidyverse.org/articles/nest.html).
```{r}
main <- main |>
mutate(data = map(file,
~ read_delim(paste0(path, '/', .x),
delim = " ",
trim_ws = TRUE,
col_names = names)))
```
We unnest the data so that we go back to a flat tibble, where all the files are
listed below each other. Note that the rows that were unique to each file are
now repeated as many times as there are data in the file.
```{r}
flat <- main |> unnest(cols=data)
flat
```
Let's take a quick look at what we have now:
```{r}
flat |>
ggplot(aes(x = mean_greyscale_values2, y = distance_from_core_top2)) +
geom_path() +
scale_y_reverse() +
facet_grid(cols = vars(section))
```
Good! We have the colour information for each section!
# load the core images
First we list all the cropped section images, and get their metadata from the filenames.
```{r}
imgs <- tibble(file = list.files("dat/croppedsections", pattern = "*.png")) |>
# this is the second time we do this, should probably make it a function
mutate(site=str_extract(file, "^\\d+") |>
parse_integer(),
hole=str_extract(file, "[A-D]"),
type=str_extract(file, "X"),
core=str_extract(file, "\\d+X-") |>
str_replace("X-", "") |>
parse_integer(),
section=str_extract(file, "-\\d+") |>
str_replace("-", "") |>
parse_integer())
```
Note that there are just a few section images here, so we can easily plot them
all, but if you use more than this you should shrink them first before trying
to plot them, otherwise your R session might crash! See [my corepics repository](https://github.com/japhir/corepics).
```{r fig.width = 3, fig.height = 8}
imgs |>
## slice(1) |> # try it out for a single image
ggplot(aes(xmin = 0, xmax = 1, ymin = 0, ymax = 1.50)) +
# here we turn the filename into the filepath, relative to this project
geom_textured_rect(aes(image = paste0("dat/croppedsections/", file))) +
scale_y_reverse() +
facet_grid(cols = vars(section), rows = vars(core))
```
Currently we do not have the information for the height/depth of these images, however.
# load section depths
We need to now link the colour log data to depth information, and ultimately to age information.
So we first read in the metadata, which we downloaded from [Janus database](https://web.iodp.tamu.edu/janusweb/coring_summaries/coresumm.cgi).
```{r}
coresumm <- read_tsv("dat/coresumm.tsv")
coresumm
```
<!-- This still lacks the newest depth scale, however, so it's better to use a custom file. -->
<!-- ```{r} -->
<!-- coresumm <- read_csv("dat/SectionSummary.csv") -->
<!-- ``` -->
Now that we have the metadata, we can attach it to the images.
```{r}
imgs <- imgs |>
tidylog::left_join(coresumm |>
mutate(Sc = ifelse(str_detect(Sc, "CC"), "8", Sc) |>
parse_integer()),
by = c('site' = 'Site',
'hole' = 'H',
'core' = 'Cor',
'type' = 'T',
'section' = 'Sc'))
```
and make proper plots of our sections
```{r fig.width = 2}
corepics <- imgs |>
## slice(1) |> # try it out for a single image
ggplot(aes(xmin = 0, xmax = 1,
ymin = `Top(mbsf)` + `LL(m)`,
ymax = `Top(mbsf)`)) +
scale_y_reverse() +
# here we turn the filename into the filepath, relative to this project
geom_textured_rect(aes(image = paste0("dat/croppedsections/", file)),
# below settings are essential so that section images get
# stretched!
nrow = 1, ncol = 1,
img_width = unit(1, "null"),
img_height = unit(1, "null"),
interpolate = FALSE) +
facet_grid(cols = vars(hole)) +
labs(y = "Depth (mbsf)")
corepics +
# show the core and section number
geom_text(aes(x = 1.2, label = paste0(core, "-", section),
y = `Top(mbsf)` + 0.5 * `LL(m)`),
size = 2, hjust = 1)
```
# agemodel
But if we want to plot it against age, we need to apply an agemodel.
Read in the agemodel and plot it.
Unfortunately the agemodel isn't published yet, so here's a very low-res fake agemodel.
```{r}
## agem <- read_excel("dat/Agemodel.xlsx")
agem <- tribble( ~ depth_mbsf, ~ Age_GTS12,
260, 15,
290, 20,
315, 22,
323, 22.7,
330, 24,
350, 25)
agem |>
ggplot(aes(x = Age_GTS12, y = depth_mbsf)) +
## geom_smooth(span = .05) +
geom_line() +
geom_point() +
# annotate cores we have pics for
geom_rug(aes(y = `Top(mbsf)`),
colour = "grey",
inherit.aes = FALSE,
data = imgs) +
scale_y_reverse()
```
Apply the agemodel. I'm assuming that the top_age and bot_age are in mbsf,
because the rmcd column in the agemodel is empty!
```{r}
imgs <- imgs |>
mutate(top_age = approx(x = agem$depth_mbsf, y = agem$Age_GTS12,
xout = `Top(mbsf)`)$y,
bot_age = approx(x = agem$depth_mbsf, y = agem$Age_GTS12,
xout = `Top(mbsf)` + `LL(m)`)$y)
```
Add the age information to the colourlog
```{r}
newflat <- flat |>
# tidylog is nice because it prints some messages about how the join went.
tidylog::left_join(coresumm |>
mutate(Sc = ifelse(str_detect(Sc, "CC"), "8", Sc) |>
parse_integer()),
by = c('site' = 'Site',
'hole' = 'H',
'core' = 'Cor',
'type' = 'T',
'section' = 'Sc'))
glimpse(newflat)
```
## calculate depth
The above metadata gives us the depth of each top of the section, but our
"samples" are obviously continuously spaced throughout each section.
```{r}
depthcalc <- newflat %>%
# add the section depth (in mbsf) and the top depth (in cm).
mutate(depth = `Top(mbsf)` + distance_from_core_top2 * 0.01)
# inspect the new depth
depthcalc %>%
select(`Top(mbsf)`, distance_from_core_top2, depth) |>
tail()
```
We now have our entire colourlog and the associated core depth!
## apply the agemodel
Make sure you know which column holds the age (is it in ka or in Ma?) and which one has the depth.
```{r}
depthcalc <- depthcalc |>
mutate(age = approx(agem$depth_mbsf, agem$Age_GTS12,
xout = depth)$y)
```
# calculate colour
The colourlog output is not a colour that R knows how to plot yet, so we have
to convert the separate red, green, and blue (RGB) values into a single
character that defines the colour (as you may recognize from Photoshop).
```{r}
colourlog <- depthcalc |>
# play around with the maxColorValue here! I think it should be 255, from memory.
mutate(colour = rgb(mean_red_values2,
mean_green_values2,
mean_blue_values2,
maxColorValue = 255))
```
Create a plot to show data
```{r fig.width = 4}
colourplot <- colourlog |>
## arrange(depth) |>
ggplot(aes(y = depth,
x = mean_greyscale_values2,
colour= colour,
core = core, section = section)) +
# there are many nice ways to plot this data!
# this will just draw the familiar line
## geom_line(colour = "black", orientation = "y") +
## geom_point() +
# this makes it easier to see
## geom_area(colour = "black", orientation = "y") +
# the rug is like a cleaned-up average colour log
geom_rug(aes(y = depth, colour = colour), sides = 'l') +
# make the colours extend to the mean grayscale value
# this is a bit hacky because it's drawing segments for each step
# if you zoom in very far, you should increase the segment thickness
geom_segment(aes(x = 0, xend = mean_greyscale_values2,
y = depth, yend = depth)) +
scale_colour_identity() +
scale_y_reverse()
colourplot
```
Save plot and data
```{r fig.width = 4.5}
# make sure that corepics and colourplot have the same depth scale
(corepics +
coord_cartesian(ylim = c(312, 274)) +
labs(y = "Depth (mbsf)") +
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
panel.grid = element_blank())) +
(colourplot +
labs(x = "Mean grayscale value") +
coord_cartesian(ylim = c(312, 274)) +
# remove the axis from the right part
theme(axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank())) +
plot_layout(ncol = 2, widths = c(.1, .9))
ggsave('imgs/colouranalysis.pdf', width = 10, height = 25, units = c('cm'))
write_csv(colourlog, file = "out/colourdata.csv")
```
To do the same against depth, we've applied the agemodel already to both the core photographs and to the data, so this shouldn't be too much trouble!
# references
Zeeden, C., Hilgen, F., Röhl, U., Seelos, K., & Lourens, L. (2015). Sediment color as a tool in cyclostratigraphy – a new application for improved data acquisition and correction from drill cores. Newsletters on Stratigraphy, 48(3), 277–285. https://doi.org/10.1127/nos/2015/0064