# Image data
*This chapter has substantially benefitted from contributions by Michael Tüting ([MTueting](https://github.com/MTueting))*.
With the advent of broadband internet and ongoing efforts to digitize analogue archives, a large part of the world's available data is stored in the form of images (and moving images). Subsequent advances in computer vision (machine learning techniques focused on images and videos) have made the use of images for economic and social science purposes accessible to researchers outside computer science. Examples of image data in economic research include the quantification of urban appearance (based on the analysis of street-level images in cities, @naik_etal2016), the digitization of old text documents via optical character recognition (@cesarini_etal2016), and the measurement of economic development/wealth via nighttime light intensity in satellite image data (@hodler_raschky2014). The following sections first cover the most common image formats and how the data behind digital images is structured and stored, and then show how to work with such data in R.
## Image data structure and storage
There are two important variants of how digital images are stored: raster images (for example, jpg files), and vector-based images (for example, eps files). In terms of data structure, the two formats differ quite substantially:
- The data structure behind raster images defines an image as a matrix of pixels together with the color of each pixel. Screen colors are combinations of the three base colors red, green, and blue (RGB). Technically speaking, a raster image thus consists of an array of three matrices with the same dimensions (one for each base color). The value of each array element indicates the intensity of the respective base color for the corresponding pixel. For example, a black pixel is encoded as (0,0,0). Raster images play a major role in essentially all modern computer vision applications related to social science/economic research. Photos, videos, satellite imagery, and scans of old documents are all stored as raster images.
- Vector images are essentially text files that store the coordinates of points on a surface and how these points are connected (or not) by lines. Some of the commonly used formats to store vector images are based on XML (for example, SVG files). In a data analytics/research context, vector image files are most commonly encountered as a way of storing map data. Streets, borders, rivers, etc. can be defined as polygons (consisting of individual lines/vectors).
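The contrast between the two formats can be made concrete with a few lines of base R. The following is only a minimal sketch: the object names (`raster_img`, `corners`, `svg_text`) and the triangle coordinates are made up for illustration. It builds a tiny raster image as a numeric array and a tiny vector image as a piece of SVG text.

```{r}
# Raster: a 2x2 RGB image is a 2 x 2 x 3 numeric array
# (rows, columns, color layers).
raster_img <- array(0, dim = c(2, 2, 3))
raster_img[1, 1, ] <- c(255, 0, 0)   # upper-left pixel: pure red
raster_img[2, 2, ] <- c(0, 0, 255)   # lower-right pixel: pure blue
dim(raster_img)

# Vector: the image is described as text that defines shapes.
# Here, a triangle is defined by the coordinates of its three corner points
# (coordinates chosen arbitrarily for illustration).
corners <- matrix(c(10, 10,
                    90, 10,
                    50, 80),
                  ncol = 2, byrow = TRUE)
points_attr <- paste(apply(corners, 1, paste, collapse = ","), collapse = " ")
svg_text <- sprintf(
  '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100"><polygon points="%s" fill="red"/></svg>',
  points_attr
)
cat(svg_text)
```

The raster object grows with the number of pixels, whereas the vector description only grows with the number of shapes and points it contains.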
Given the fundamental differences in how the data behind these two types of images is structured, practically handling such data in R differs quite substantially between the two formats with regard to the R packages used and the representation of the image object in RAM.
## Raster images in R
This section showcases some of the most frequent tasks related to raster images in R.
```{r}
# Load the raster package
library(raster)
```
### Basic data structure
Recall that raster images are stored as arrays ($X\times Y \times Z$). $X$ and $Y$ define the number of pixels per row and per column (width and height) of the image, and $Z$ the number of layers. Greyscale images usually have only one layer, whereas most color images have 3 layers (such as in the case of RGB images). The following figure illustrates this point based on a $3 \times 3$ pixels bitmap image.
```{r rgb, echo=FALSE, out.width = "75%", fig.align='center', fig.cap= "(ref:rgb)", purl=FALSE}
include_graphics("img/rgb_structure.svg")
```
(ref:rgb) Illustration of a bitmap file data structure (with RGB schema). Panel A shows a 3x3 pixels bitmap image as it is shown on screen. Panel B illustrates the corresponding bitmap file's data structure: a three-dimensional array with 3x3 elements represents the image matrix in the three base screen colors red, green, and blue. As a reading example, consider the highlighted upper left pixel in A with the correspondingly highlighted values in the RGB array in B. The color of this pixel is RGB code 255 (full red), 51 (some green), and 51 (some blue): (255,51,51).
To get a better feeling for the corresponding data structure in R, we start by generating RGB images step-by-step in R. First, we generate three matrices (one for each base color), then arrange these matrices in an array, and finally plot the image and save it to disk.
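As a warm-up, the reading example from the figure can be reproduced with a plain base R array. This is a minimal sketch: only the upper-left pixel (255,51,51) is taken from the figure, while all other pixels are set to white as an assumption, purely for illustration.

```{r}
# A 3x3 RGB bitmap as a 3 x 3 x 3 array: rows, columns, color layers.
# Assumption: every pixel except the upper-left one is white (255,255,255).
bitmap <- array(255, dim = c(3, 3, 3))
bitmap[1, 1, ] <- c(255, 51, 51)

# Each layer is an ordinary matrix
bitmap[, , 1]  # red layer
bitmap[, , 2]  # green layer

# RGB triple of the upper-left pixel and its hexadecimal color code
upper_left <- bitmap[1, 1, ]
upper_left
rgb(upper_left[1], upper_left[2], upper_left[3], maxColorValue = 255)
```

Indexing the third dimension selects a color layer, while indexing the first two dimensions selects a pixel.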
#### Example 1: Generating a red image (RGB code: 255,0,0)
```{r}
# Step 1: Define the width and height of the image
width = 300;
height = 300
# Step 2: Define the number of layers (RGB = 3)
layers = 3
# Step 3: Generate three matrices corresponding to Red, Green, and Blue values
red = matrix(255, nrow = height, ncol = width)
green = matrix(0, nrow = height, ncol = width)
blue = matrix(0, nrow = height, ncol = width)
# Step 4: Generate an array by combining the three matrices
# (rows = height, columns = width, matching the matrices above)
image.array = array(c(red, green, blue), dim = c(height, width, layers))
dim(image.array)
# Step 5: Create RasterBrick
image = brick(image.array)
print(image)
# Step 6: Plot RGB
plotRGB(image)
# Step 7: (Optional) Save to disk
png(filename = "red.png", width = width, height = height, units = "px")
plotRGB(image)
dev.off()
```
#### Example 2: Generating a green image (RGB code: 0, 255, 0)
```{r}
# Step 1: Define the width and height of the image
width = 300;
height = 300
# Step 2: Define the number of layers (RGB = 3)
layers = 3
# Step 3: Generate three matrices corresponding to Red, Green, and Blue values
red = matrix(0, nrow = height, ncol = width)
green = matrix(255, nrow = height, ncol = width)
blue = matrix(0, nrow = height, ncol = width)
# Step 4: Generate an array by combining the three matrices
# (rows = height, columns = width, matching the matrices above)
image.array = array(c(red, green, blue), dim = c(height, width, layers))
dim(image.array)
# Step 5: Create RasterBrick
image = brick(image.array)
print(image)
# Step 6: Plot RGB
plotRGB(image)
# Step 7: (Optional) Save to disk
png(filename = "green.png", width = width, height = height, units = "px")
plotRGB(image)
dev.off()
```
#### Example 3: Generating a blue image (RGB code: 0, 0, 255)
```{r}
# Step 1: Define the width and height of the image
width = 300;
height = 300
# Step 2: Define the number of layers (RGB = 3)
layers = 3
# Step 3: Generate three matrices corresponding to Red, Green, and Blue values
red = matrix(0, nrow = height, ncol = width)
green = matrix(0, nrow = height, ncol = width)
blue = matrix(255, nrow = height, ncol = width)
# Step 4: Generate an array by combining the three matrices
# (rows = height, columns = width, matching the matrices above)
image.array = array(c(red, green, blue), dim = c(height, width, layers))
dim(image.array)
# Step 5: Create RasterBrick
image = brick(image.array)
print(image)
# Step 6: Plot RGB
plotRGB(image)
# Step 7: (Optional) Save to disk
png(filename = "blue.png", width = width, height = height, units = "px")
plotRGB(image)
dev.off()
```
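The three examples above differ only in the values filled into the red, green, and blue matrices. As a sketch of how this could be generalized, the following defines a hypothetical helper `solid_color_array()` (not part of any package) that builds the array for an arbitrary RGB code using base R only.

```{r}
# Hypothetical helper: build a height x width x 3 array in which
# every pixel has the same RGB code (r, g, b).
solid_color_array <- function(r, g, b, width = 300, height = 300) {
  red   <- matrix(r, nrow = height, ncol = width)
  green <- matrix(g, nrow = height, ncol = width)
  blue  <- matrix(b, nrow = height, ncol = width)
  array(c(red, green, blue), dim = c(height, width, 3))
}

# A small yellow image (RGB code: 255, 255, 0), 4 pixels wide and 2 pixels high
yellow <- solid_color_array(255, 255, 0, width = 4, height = 2)
dim(yellow)
yellow[1, 1, ]
```

With the `raster` package loaded, the resulting array could then be passed to `brick()` and `plotRGB()` in the same way as in the examples above.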
#### Example 4: Generating a random RGB image
```{r}
# Step 1: Define the width and height of the image
width = 300;
height = 300
# Step 2: Define the number of layers (RGB = 3)
layers = 3
# Step 3: Draw random color intensities from a standard-normal distribution
shades_of_red = rnorm(n = width*height, mean = 0, sd = 1)
shades_of_green = rnorm(n = width*height, mean = 0, sd = 1)
shades_of_blue = rnorm(n = width*height, mean = 0, sd = 1)
```
Color intensities must lie in the range 0 to 255. Our values, however, follow a
standard normal distribution centered at 0:
```{r}
plot(density(shades_of_red))
```
We first normalize them to a range of 0 to 1 using the formula:
$z_i = \frac{x_i - \min(x)}{\max(x)-\min(x)}$
and subsequently multiply by 255:
```{r}
# Step 4: Normalize to 0,255 range of values
shades_of_red = (shades_of_red - min(shades_of_red))/(max(shades_of_red)-min(shades_of_red))*255
shades_of_green = (shades_of_green - min(shades_of_green))/(max(shades_of_green)-min(shades_of_green))*255
shades_of_blue = (shades_of_blue - min(shades_of_blue))/(max(shades_of_blue)-min(shades_of_blue))*255
plot(density(shades_of_red))
# Step 5: Generate three matrices corresponding to Red, Green, and Blue values
red = matrix(shades_of_red, nrow = height, ncol = width)
green = matrix(shades_of_green, nrow = height, ncol = width)
blue = matrix(shades_of_blue, nrow = height, ncol = width)
# Step 6: Generate an array by combining the three matrices
# (rows = height, columns = width, matching the matrices above)
image.array = array(c(red, green, blue), dim = c(height, width, layers))
dim(image.array)
# Step 7: Create RasterBrick
image = brick(image.array)
print(image)
# Step 8: Plot RGB
plotRGB(image)
```
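The min-max rescaling applied in Step 4 is repeated three times, once per color. A minimal sketch of how it could be wrapped in a function is shown below; `rescale_minmax()` is a hypothetical helper name, and the seed is chosen arbitrarily for reproducibility.

```{r}
# Hypothetical helper: min-max rescaling of a numeric vector to [0, upper]
rescale_minmax <- function(x, upper = 255) {
  (x - min(x)) / (max(x) - min(x)) * upper
}

set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(10)
z <- rescale_minmax(x)
range(z)

# The ordering of the values is preserved; only the scale changes
all(order(x) == order(z))
```

Note that this sketch assumes the input contains at least two distinct values; otherwise the denominator is zero.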
The examples above show that generating or manipulating raster images in R basically means generating or manipulating matrices and arrays (i.e., structures very common to any data analytics task in R). Several additional R packages provide pre-defined functions for more advanced image manipulations.
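To illustrate that image manipulation boils down to array manipulation, the following base R sketch mirrors a tiny RGB image and converts it to greyscale. The 2x3 image and the object names are made up for illustration, and averaging the three layers is only one simple way to obtain grey values.

```{r}
# A small 2 x 3 RGB image: left column red, middle column green, right column blue
img <- array(0, dim = c(2, 3, 3))
img[, 1, 1] <- 255  # red layer, first column
img[, 2, 2] <- 255  # green layer, second column
img[, 3, 3] <- 255  # blue layer, third column

# Mirror left-right: reverse the column order in every layer
mirrored <- img[, ncol(img):1, ]
mirrored[1, 1, ]  # the first column is now blue

# Simple greyscale: average the three layers for each pixel
grey <- apply(img, c(1, 2), mean)
dim(grey)
grey[1, 1]
```

Dedicated image packages offer such operations as ready-made functions, but the underlying logic is the same kind of index arithmetic.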
### Advanced Image Manipulation using ImageMagick
```{r}
# load package
library(magick)
# We can import images from the web
frink <- image_read("https://jeroen.github.io/images/frink.png")
print(frink)
# Rotate by 45 degrees
image_rotate(frink, 45)
# Flip (mirror vertically, top to bottom)
image_flip(frink)
# Flop (mirror horizontally, left to right)
image_flop(frink)
```
<!-- Working with GIFs -->
<!-- ```{r } -->
<!-- # Download earth gif -->
<!-- earth <- image_read("https://jeroen.github.io/images/earth.gif") -->
<!-- # Retrieve information -->
<!-- length(earth) # The gif is made from 44 individual images -->
<!-- # Working with pipelines -->
<!-- ## Let us make the earth spin the wrong way! -->
<!-- earth %>% -->
<!-- image_flop() -->
<!-- # Adding Frink to the GIF -->
<!-- frink.resize = frink %>% -->
<!-- image_resize('400x400!') %>% -->
<!-- image_animate() -->
<!-- new_gif = image_animate(c(frink.resize, earth)) -->
<!-- length(new_gif) -->
<!-- new_gif -->
<!-- ``` -->
### Optical Character Recognition
A common context in which image data is encountered in empirical economic research is the digitization of old texts. There, optical character recognition (OCR) is used to extract text from scanned images, for example, when an archive of old newspapers is digitized in order to analyze historical news reports. R provides a straightforward approach to OCR in which the input is an image file (e.g., a png-file) and the output is a character string.
```{r, message=FALSE, warning=FALSE}
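# For Optical Character Recognition (image_ocr() relies on the tesseract package)
library(tesseract)
# Import a scanned newspaper front page
img <- image_read("https://s3.amazonaws.com/libapps/accounts/30502/images/new_york_times.png")
print(img)
# Crop the headline: keep an area 806 pixels wide and 180 pixels high
headline <- image_crop(image = img, geometry = '806x180')
headline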
# Extract text
headline_text <- image_ocr(headline)
cat(headline_text)
```
```{r echo=FALSE, message=FALSE, warning=FALSE}
# clean up
detach("package:raster", unload=TRUE)
```
## Vector Images in R
An alternative to storing figures in matrix/bitmap-based image files are vector-based graphics. Vector-based formats are not stored in the form of arrays/matrices that contain the color information of each pixel of an image. Instead, they define the shapes, colors, and coordinates of the objects shown in an image. Typically, such vector-based images are computer drawings, plots, maps, and blueprints of technical infrastructure (and not, for example, photos). There are different specific file formats to store vector-based graphics, but typically they use a nesting structure and basic syntax that is similar to or a version of XML. That is, in order to work with the basic data contained in such files, we can often use familiar functions like `read_xml()` (from the `xml2` package). To translate the basic vector-image data into images and to modify these images, additional packages such as `magick` are available. The following code example demonstrates these points. We first import the raw vector-image data of the R logo (stored as an SVG file) as an XML document into R. This allows us to have a closer look at the underlying data structure of such files. Then, we import it as an actual vector-based graphic via `image_read_svg()`.
```{r}
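# Common packages for vector files
library(xml2)   # provides read_xml() and xml_structure()
library(magick) # image_read_svg() additionally requires the rsvg package to be installed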
# Download and read svg image from url
URL <- "https://upload.wikimedia.org/wikipedia/commons/1/1b/R_logo.svg"
Rlogo_xml <- read_xml(URL)
# Data structure
Rlogo_xml
xml_structure(Rlogo_xml)
# Raw data
Rlogo_text <- as.character(Rlogo_xml)
# Plot
svg_img = image_read_svg(Rlogo_text)
image_info(svg_img)
```
```{r eval=FALSE}
svg_img
```
```{r rlogo, echo=FALSE, out.width = "15%", fig.align='center', purl=FALSE}
include_graphics("img/R_logo.svg.png")
```
<!-- ### Example: Manipulating Output width and height of SVGs -->
<!-- ```{r} -->
<!-- # Find information about output width and height in original source code -->
<!-- # Take a look at the first line of the source code: -->
<!-- Rlogo_raw[1] -->
<!-- # We want to have an image of size 400 x 800 instead: -->
<!-- Rlogo_raw[1] <- "<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" preserveAspectRatio=\"xMidYMid\" width=\"400\" height=\"800\" viewBox=\"0 0 724 561\">" -->
<!-- # Save as magick object -->
<!-- svg_new <- image_read_svg(paste0(Rlogo_raw, collapse ="")) -->
<!-- image_info(svg_new) -->
<!-- svg_new -->
<!-- ``` -->
<!-- ### Example: Maps -->