---
title: "WSIM-GLDAS Dataset Exploration and Visualizations"
author: "Joshua Brinks & Elaine Fatumi"
date: "December, 6 2023"
---
## TO DO
* Write the actual code and narrative.
* Determine the region and time period of focus to draw in our use cases/human-focused stories.
* Determine the method of exploration.
    + Mimic our process?
        * 12-month integration panels of the CONUSA from 2000-2014 to identify areas of interest.
        * Zoom in to locations of interest and switch to 1-month integration for the years identified in the previous step.
## Introduction
* Raster/vector visualization background?
    + General
    + Water resource specific
* Package background
    + Basic plotting with stars/sf
    + More advanced plotting with ggplot/ggmap
## Setup
## Load Data
We'll start again with the WSIM-GLDAS 12-month integration anomaly file and quickly subset it to the continental United States.
```{r warning=FALSE}
#| code-fold: true
wsim_gldas <- stars::read_stars("composite_anom_12mo.nc", proxy = FALSE)

# generate a vector of dates for subsetting
keeps <- seq(lubridate::ymd("2000-12-01"),
             lubridate::ymd("2014-12-01"),
             by = "year")
# filter the time dimension using that vector
wsim_gldas <- dplyr::filter(wsim_gldas, time %in% keeps)

# you may want to clear your memory if your system is limited
gc()

# keep only the deficit attribute
wsim_deficit <- wsim_gldas['deficit']

# download the USA ADM1 boundaries from geoBoundaries
usa <- httr::GET("https://www.geoboundaries.org/api/current/gbOpen/USA/ADM1/")
usa <- httr::content(usa)
usa <- sf::st_read(usa$gjDownloadURL)

# drop states and territories outside the continental United States
drops <-
  c("Alaska", "Hawaii",
    "American Samoa",
    "Puerto Rico",
    "Commonwealth of the Northern Mariana Islands",
    "Guam",
    "United States Virgin Islands")
usa <- usa[!(usa$shapeName %in% drops), ]

# crop the raster to the CONUSA boundaries
wsim_deficit_usa <- wsim_deficit[usa]
```
Now we'll verify this with `print()` and `plot()`.
```{r}
print(wsim_deficit_usa)
```
The output shows that we've selected a single attribute ('deficit') and 15 time-steps in the 'time' dimension.
```{r warning=FALSE, message=FALSE}
wsim_deficit_usa |>
  dplyr::slice(index = 15, along = "time") |>
  plot(reset = FALSE, breaks = c(-50, -40, -20, -10, -5, 0))

plot(sf::st_geometry(usa),
     add = TRUE,
     lwd = 3,
     col = NA,
     border = 'purple')
```
## Exploratory Histogram
Create a histogram of the raster's deficit values for a single time step by extracting the values from the raster and plotting their distribution.
```{r}
# slice out the first time step in the file
usa1 <-
  wsim_deficit_usa |> dplyr::slice(along = "time", index = 1)
# extract the values into a data.frame
usa1 <- as.data.frame(as.numeric(usa1$deficit))
# name the column (the attribute name is lost in the conversion)
names(usa1) <- "Deficit"
```
Check the values.
```{r}
ggplot2::ggplot(usa1, ggplot2::aes(Deficit)) +
  ggplot2::geom_histogram(na.rm = TRUE)
```
There are some extreme outliers, so we can zoom in on the majority of values by setting x-axis limits.
```{r}
ggplot2::ggplot(usa1, ggplot2::aes(Deficit)) +
  ggplot2::geom_histogram(na.rm = TRUE) +
  ggplot2::xlim(c(-10, 0))
```
Extreme values or other items of note might require additional visualization or other data exploration.
## Multi-Panel Time Series
Create a multi-panel time series of the 12-month integration CONUSA data, similar to what we used to identify our case studies. Each panel will represent one year.
Load in a CONUSA GeoJSON from geoBoundaries (copy the methods from the previous vignette).
```{r warning = FALSE}
# legend colors for the WSIM-GLDAS deficit palette
leg_colors <- c(
  '#9B0039',
  # -50 to -40
  '#D44135',
  # -40 to -20
  '#FF8D43',
  # -20 to -10
  '#FFC754',
  # -10 to -5
  '#FFEDA3',
  # -5 to -3
  '#FFFFFF')

wsim_deficit_usa |>
  plot(reset = FALSE, col = leg_colors)
```
Once hot spots are identified, pick a region of interest to zoom in on using the 1-month integration dataset.
Load in the 1-month integration dataset and subset/index it to the region of interest (copy the code from the previous vignette). Use `dplyr::slice()` or another method to pull out just the 12 months from the year of interest. Code demonstrating these techniques appears in the previous vignette.
```{r}
wsim_gldas_1mo <- stars::read_stars("composite_anom_1mo.nc", proxy = FALSE)
print(wsim_gldas_1mo)
```
Create a multi-panel figure, with each panel representing one month, to identify the most intense months of drought or flooding. Starting with this figure, we may switch to `ggplot2` for a nicer palette, legend, and panel headings. This will likely require some form of faceting to produce the individual panels, as sketched below.
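A possible sketch, assuming the year of interest identified above is 2014 (a placeholder choice), that the 1-month file's time stamps fall on the first of each month, and reusing the `usa` boundaries from earlier. Faceting on the time dimension with `stars::geom_stars()` should produce one panel per month:

```r
# subset the 1-month file to the 12 months of the (hypothetical) year of interest
months_keep <- seq(lubridate::ymd("2014-01-01"),
                   lubridate::ymd("2014-12-01"),
                   by = "month")
wsim_1mo_yr <- dplyr::filter(wsim_gldas_1mo, time %in% months_keep)

# keep the deficit attribute and crop to the CONUSA boundaries
wsim_1mo_yr <- wsim_1mo_yr['deficit'][usa]

# facet on the time dimension to get one panel per month
ggplot2::ggplot() +
  stars::geom_stars(data = wsim_1mo_yr) +
  ggplot2::facet_wrap(~time, ncol = 4) +
  ggplot2::scale_fill_viridis_c(na.value = "transparent") +
  ggplot2::coord_equal() +
  ggplot2::theme_minimal() +
  ggplot2::labs(fill = "Deficit")
```

So faceting does appear possible: `facet_wrap()` on the time dimension builds the individual panels.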
## Point Location Time Series
Visualize an individual cell with particularly extreme or volatile values. Use Google Maps to identify the latitude/longitude of a place of interest, perhaps an urban center or another important location in the region that suffered from the extreme event.
Create a vector with the point location.
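A minimal sketch; the coordinates below are placeholders, not a vetted location:

```r
# hypothetical point of interest (longitude/latitude placeholders)
pt <- sf::st_sfc(sf::st_point(c(-96.80, 32.78)), crs = 4326)
```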
Use `stars::st_extract()` to extract raster values in the stack at the point location.
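A sketch using the 1-month object and the point from above:

```r
# extract the deficit time series at the point location
pt_ts <- stars::st_extract(wsim_gldas_1mo['deficit'], pt)
pt_ts
```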
The resulting data frame of time series values should be inspected. It may also need to be converted from wide format to long format so it can be plotted in `ggplot`. Use either `pivot_longer()`/`pivot_wider()` from `tidyr` or `melt()`/`dcast()` from `data.table`.
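In a quick sketch, the `st_extract()` output converts to a data frame that is already long (one row per time step); the commented reshape shows the fallback if a wide table appears instead:

```r
# convert to a data.frame with one row per time step
pt_df <- as.data.frame(pt_ts)
head(pt_df)

# if the table comes out wide instead, reshape it, e.g.:
# pt_df <- tidyr::pivot_longer(pt_df, cols = -time,
#                              names_to = "variable", values_to = "deficit")
```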
Once in the proper format, plot using `ggplot`.
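A minimal line plot of the extracted series (the column names assume the conversion above):

```r
ggplot2::ggplot(pt_df, ggplot2::aes(x = time, y = deficit)) +
  ggplot2::geom_line() +
  ggplot2::geom_point() +
  ggplot2::labs(x = NULL, y = "Deficit anomaly",
                title = "Monthly WSIM-GLDAS deficit at the point of interest")
```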
## Population Exposure Plot
Use Gridded Population of the World and `exactextractr` to determine the number of people exposed to a given anomaly for each month of the year.
Background info on GPW would be appropriate. Same with `exactextractr` and zonal statistics.
Load in the GPW data and the `exactextractr` package.
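A sketch assuming a local GeoTIFF of GPW v4 population counts at a resolution matching WSIM-GLDAS (the filename is hypothetical):

```r
library(exactextractr)

# GPW v4 population counts; the filename is a placeholder
pop <- terra::rast("gpw_v4_population_count_2015_15min.tif")
```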
Perform the time series zonal summary. This might be a bit tricky; it's been a while, so we'll have to look up the proper code. Dan has good examples on the `exactextractr` package website.
The resulting data.frame will probably need to be transformed to long format (just like before) so it can be plotted.
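A hedged sketch: flag cells at or below a deficit threshold each month, then use `exact_extract()` with population weights to sum the people living in flagged cells within each state. The threshold of -5 and the `subds` read are assumptions to check against Dan's examples:

```r
# read the deficit variable with terra for use with exactextractr
deficit_rast <- terra::rast("composite_anom_1mo.nc", subds = "deficit")

# flag cells in significant deficit (threshold is an assumption)
drought <- deficit_rast <= -5

# population-weighted sum of flagged cells: people exposed, per state per month
exposed <- exactextractr::exact_extract(drought, usa, 'weighted_sum',
                                        weights = pop)
exposed$shapeName <- usa$shapeName

# transform to long format for plotting
exposed_long <- tidyr::pivot_longer(exposed, cols = -shapeName,
                                    names_to = "month",
                                    values_to = "pop_exposed")
```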
Now plot the data in `ggplot`. I have some existing code I can pull to help with the plotting, or at least make it fancy.
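A minimal version of the exposure plot using the long data frame from above; the fancier existing code can replace this later:

```r
# total CONUSA population exposed, by month
totals <- aggregate(pop_exposed ~ month, data = exposed_long, FUN = sum)

ggplot2::ggplot(totals, ggplot2::aes(x = month, y = pop_exposed)) +
  ggplot2::geom_col() +
  ggplot2::labs(x = NULL, y = "People exposed",
                title = "Population exposed to deficit anomalies, by month")
```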