diff --git a/docs/examples/image_features/assets/boxes.jpg b/docs/examples/image_features/assets/boxes.jpg
new file mode 100644
index 00000000..28940395
Binary files /dev/null and b/docs/examples/image_features/assets/boxes.jpg differ
diff --git a/docs/examples/image_features/assets/hog.gif b/docs/examples/image_features/assets/hog.gif
new file mode 100644
index 00000000..960f49f5
Binary files /dev/null and b/docs/examples/image_features/assets/hog.gif differ
diff --git a/docs/examples/image_features/assets/human1.png b/docs/examples/image_features/assets/human1.png
new file mode 100644
index 00000000..d0719604
Binary files /dev/null and b/docs/examples/image_features/assets/human1.png differ
diff --git a/docs/examples/image_features/assets/human2.png b/docs/examples/image_features/assets/human2.png
new file mode 100644
index 00000000..7ae713f4
Binary files /dev/null and b/docs/examples/image_features/assets/human2.png differ
diff --git a/docs/examples/image_features/assets/human3.png b/docs/examples/image_features/assets/human3.png
new file mode 100644
index 00000000..6f3b9dc8
Binary files /dev/null and b/docs/examples/image_features/assets/human3.png differ
diff --git a/docs/examples/image_features/assets/humans.jpg b/docs/examples/image_features/assets/humans.jpg
new file mode 100644
index 00000000..660addc4
Binary files /dev/null and b/docs/examples/image_features/assets/humans.jpg differ
diff --git a/docs/examples/image_features/assets/not-human1.jpg b/docs/examples/image_features/assets/not-human1.jpg
new file mode 100644
index 00000000..ad41b5e4
Binary files /dev/null and b/docs/examples/image_features/assets/not-human1.jpg differ
diff --git a/docs/examples/image_features/assets/scores.png b/docs/examples/image_features/assets/scores.png
new file mode 100644
index 00000000..d1021133
Binary files /dev/null and b/docs/examples/image_features/assets/scores.png differ
diff --git a/docs/examples/image_features/brief.jl b/docs/examples/image_features/brief.jl
new file mode 100644
index 00000000..cec89f96
--- /dev/null
+++ b/docs/examples/image_features/brief.jl
@@ -0,0 +1,84 @@
+# ---
+# cover: assets/brief.gif
+# title: BRIEF Descriptor
+# description: This demo shows the BRIEF descriptor
+# author: Anchit Navelkar; Ashwani Rathee
+# date: 2021-07-12
+# ---
+
+# `BRIEF` (Binary Robust Independent Elementary Features) is an efficient feature point descriptor.
+# It is highly discriminative even when using relatively few bits and is computed using simple
+# intensity difference tests. BRIEF has no fixed sampling pattern, so pairs can be chosen
+# at any point on the `SxS` patch.
+
+# To build a BRIEF descriptor of length `n`, we need to determine `n` pairs `(Xi, Yi)`.
+# Denote by `X` and `Y` the vectors of points `Xi` and `Yi`, respectively.
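+
+# To make this concrete, here is a minimal sketch (not the ImageFeatures.jl
+# implementation) of the intensity test behind each bit: bit `i` of the descriptor
+# is set when the intensity at `Xi` is smaller than the intensity at `Yi`. The
+# `patch`, `X` and `Y` below are made-up illustrative values.
+
+patch = [0.1 0.9; 0.4 0.3]                                 # a tiny 2x2 "patch" of intensities
+X = [CartesianIndex(1, 1), CartesianIndex(2, 2)]           # first points of the sampling pairs
+Y = [CartesianIndex(1, 2), CartesianIndex(2, 1)]           # second points of the sampling pairs
+bits = [patch[X[i]] < patch[Y[i]] for i in eachindex(X)]   # one bit per pair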
+
+# In ImageFeatures.jl we have five methods to determine the vectors `X` and `Y`:
+
+# - `random_uniform` : `X` and `Y` are randomly uniformly sampled
+# - `gaussian` : `X` and `Y` are randomly sampled using a Gaussian distribution, meaning that locations closer to the center of the patch are preferred
+# - `gaussian_local` : `X` and `Y` are randomly sampled using a Gaussian distribution, where first `X` is sampled with a standard deviation of `0.04*S^2` and then each `Yi` is sampled using a Gaussian distribution with mean `Xi` and standard deviation `0.01 * S^2`
+# - `random_coarse` : `X` and `Y` are randomly sampled from discrete locations of a coarse polar grid
+# - `center_sample` : For each `i`, `Xi` is `(0, 0)` and `Yi` takes all possible values on a coarse polar grid
+
+# As with all binary descriptors, BRIEF's distance measure is the number of
+# bits that differ between two binary strings, which can also be computed as the sum
+# of the XOR operation between the strings.
+
+# BRIEF is a very simple feature descriptor and does not provide scale or rotation
+# invariance (only translation invariance). To achieve those, see the ORB, BRISK and
+# FREAK examples.
+
+# ## Example
+
+# Let us take a look at a simple example where the BRIEF descriptor is used to match
+# two images where one has been translated by `(50, 50)` pixels. We will use the
+# `sudoku` image from the [TestImages](https://github.com/timholy/TestImages.jl)
+# package for this example.
+
+# Now, let us create the two images we will match using BRIEF.
+
+using ImageFeatures, TestImages, Images, ImageDraw, CoordinateTransformations
+
+img = testimage("sudoku")
+img1 = Gray.(img)
+trans = Translation(-50, -50)
+img2 = warp(img1, trans, axes(img1))
+
+# To calculate the descriptors, we first need to get the keypoints. For this tutorial,
+# we will use the FAST corners to generate keypoints (see `fastcorners`).
+
+keypoints_1 = Keypoints(fastcorners(img1, 12, 0.4))
+keypoints_2 = Keypoints(fastcorners(img2, 12, 0.4))
+
+# To create the BRIEF descriptor, we first need to define the parameters by calling
+# the `BRIEF` constructor.
+
+brief_params = BRIEF(size = 256, window = 10, seed = 123)
+
+# Now pass the image with the keypoints and the parameters to the
+# `create_descriptor` function.
+
+desc_1, ret_keypoints_1 = create_descriptor(img1, keypoints_1, brief_params)
+desc_2, ret_keypoints_2 = create_descriptor(img2, keypoints_2, brief_params)
+
+# The obtained descriptors can be used to find the matches between the two images
+# using the `match_keypoints` function.
+
+matches = match_keypoints(ret_keypoints_1, ret_keypoints_2, desc_1, desc_2, 0.1)
+
+# We can use the [ImageDraw.jl](https://github.com/JuliaImages/ImageDraw.jl) package
+# to view the results.
+
+grid = hcat(img1, img2)
+offset = CartesianIndex(0, size(img1, 2))
+map(m -> draw!(grid, LineSegment(m[2] + offset, m[1])), matches)
+grid
+
+# `grid` shows the two images side by side with the matched keypoints joined by lines.
+
+save("assets/brief.gif", cat(img1, img2, grid[1:512,1:512], grid[1:512,513:1024]; dims=3); fps=1) #src
diff --git a/docs/examples/image_features/brisk.jl b/docs/examples/image_features/brisk.jl
new file mode 100644
index 00000000..0a1ef85b
--- /dev/null
+++ b/docs/examples/image_features/brisk.jl
@@ -0,0 +1,89 @@
+# ---
+# cover: assets/brisk.gif
+# title: BRISK Descriptor
+# description: This demo shows the BRISK descriptor
+# author: Anchit Navelkar; Ashwani Rathee
+# date: 2021-07-12
+# ---
+
+# The *BRISK* (Binary Robust Invariant Scalable Keypoints) descriptor has a predefined
+# sampling pattern, unlike `BRIEF` or `ORB`.
+# Pixels are sampled over concentric rings. For each sampling point, a small patch
+# is considered around it. Before starting the algorithm, the patch is smoothed
+# using Gaussian smoothing.
+
+# Two types of pairs are used for sampling, short and long pairs.
+# Short pairs are those where the distance between the points is below a set threshold
+# `d_max`, while long pairs have distance above `d_min`. Long pairs are used for orientation
+# and short pairs are used for calculating the descriptor by comparing intensities.
+
+# BRISK achieves rotation invariance by trying to measure the orientation of the keypoint
+# and rotating the sampling pattern by that orientation. This is done by first
+# calculating the local gradient `g(pi, pj)` for each sampling pair `(pi, pj)`, where
+# `I(pj, σj)` is the intensity at point `pj` after Gaussian smoothing with standard deviation `σj`:
+
+# `g(pi, pj) = (pj - pi) * (I(pj, σj) - I(pi, σi)) / ||pj - pi||^2`
+
+# All local gradients between long pairs are summed, and the `arctangent(gy/gx)`
+# of the `y` and `x` components of the sum is taken as the angle of the keypoint.
+# Now, we only need to rotate the short pairs by that angle to help the descriptor
+# become more invariant to rotation.
+# The descriptor is built using intensity comparisons. For each short pair, if the
+# first point has greater intensity than the second, then 1 is written, else 0 is
+# written to the corresponding bit of the descriptor.
+
+# ## Example
+
+# Let us take a look at a simple example where the BRISK descriptor is used to
+# match two images where one has been translated by `(50, 40)` pixels and then
+# rotated by an angle of 150 degrees. We will use the `lake_color` image from the
+# [TestImages](https://github.com/timholy/TestImages.jl) package for this example.
+
+# First, let us create the two images we will match using BRISK.
+
+using ImageFeatures, TestImages, Images, ImageDraw, CoordinateTransformations, Rotations
+using MosaicViews
+
+img = testimage("lake_color")
+
+# Original Image
+
+img1 = Gray.(img)
+rot = recenter(RotMatrix(5pi/6), [size(img1)...] .÷ 2) # a rotation around the center
+tform = rot ∘ Translation(-50, -40)
+img2 = warp(img1, tform, axes(img1))
+mosaicview(img, img1, img2; nrow=1)
+
+# To calculate the descriptors, we first need to get the keypoints. For this
+# tutorial, we will use the FAST corners to generate keypoints (see `fastcorners`).
+
+features_1 = Features(fastcorners(img1, 12, 0.35))
+features_2 = Features(fastcorners(img2, 12, 0.35))
+
+# To create the BRISK descriptor, we first need to define the parameters by
+# calling the `BRISK` constructor.
+
+brisk_params = BRISK()
+
+# Now pass the image with the keypoints and the parameters to the
+# `create_descriptor` function.
+
+desc_1, ret_features_1 = create_descriptor(img1, features_1, brisk_params)
+desc_2, ret_features_2 = create_descriptor(img2, features_2, brisk_params)
+
+# The obtained descriptors can be used to find the matches between the two
+# images using the `match_keypoints` function.
+
+matches = match_keypoints(Keypoints(ret_features_1), Keypoints(ret_features_2), desc_1, desc_2, 0.1)
+
+# We can use the [ImageDraw.jl](https://github.com/JuliaImages/ImageDraw.jl) package to view the results.
+
+grid = hcat(img1, img2)
+offset = CartesianIndex(0, size(img1, 2))
+map(m -> draw!(grid, LineSegment(m[1], m[2] + offset)), matches)
+grid
+
+save("assets/brisk.gif", cat(img, img2, grid[1:512,1:512], grid[1:512,513:1024]; dims=3); fps=1) #src
diff --git a/docs/examples/image_features/freak.jl b/docs/examples/image_features/freak.jl
new file mode 100644
index 00000000..2371d475
--- /dev/null
+++ b/docs/examples/image_features/freak.jl
@@ -0,0 +1,76 @@
+# ---
+# cover: assets/freak.gif
+# title: FREAK Descriptor
+# description: This demo shows the FREAK descriptor
+# author: Anchit Navelkar; Ashwani Rathee
+# date: 2021-07-12
+# ---
+
+# `FREAK` (Fast Retina Keypoint) has a defined sampling pattern like `BRISK`.
+# It uses a retinal sampling grid with a higher density of points near the centre,
+# the density decreasing exponentially with distance from the centre.
+
+# FREAK's measure of orientation is similar to `BRISK`, but instead of using
+# long pairs, it uses a predefined set of 45 symmetric sampling pairs. The set of
+# sampling pairs is determined using a method similar to `ORB`, by finding
+# sampling pairs over keypoints in standard datasets and then extracting the most
+# discriminative pairs. The orientation weights over these pairs are summed and the
+# sampling window is rotated by this orientation to some canonical orientation to
+# achieve rotation invariance.
+
+# The descriptor is built using intensity comparisons of a predetermined set of 512
+# sampling pairs. This set is also obtained using a method similar to the one described
+# above. For each pair, if the first point has greater intensity than the second,
+# then 1 is written, else 0 is written to the corresponding bit of the descriptor.
+
+# ## Example
+
+# Let us take a look at a simple example where the FREAK descriptor is used to
+# match two images where one has been translated by `(50, 40)` pixels and then
+# rotated by an angle of 150 degrees. We will use the `peppers_color` image from
+# the [TestImages](https://github.com/timholy/TestImages.jl) package for this example.
+
+# First, let us create the two images we will match using FREAK.
+
+using ImageFeatures, TestImages, Images, ImageDraw, CoordinateTransformations, Rotations
+
+img = testimage("peppers_color")
+
+# Original
+
+img1 = Gray.(img)
+rot = recenter(RotMatrix(5pi/6), [size(img1)...] .÷ 2) # a rotation around the center
+tform = rot ∘ Translation(-50, -40)
+img2 = warp(img1, tform, axes(img1))
+
+# To calculate the descriptors, we first need to get the keypoints. For this
+# tutorial, we will use the FAST corners to generate keypoints (see `fastcorners`).
+
+keypoints_1 = Keypoints(fastcorners(img1, 12, 0.35))
+keypoints_2 = Keypoints(fastcorners(img2, 12, 0.35))
+
+# To create the FREAK descriptor, we first need to define the parameters
+# by calling the `FREAK` constructor.
+
+freak_params = FREAK()
+
+# Now pass the image with the keypoints and the parameters to the `create_descriptor` function.
+
+desc_1, ret_keypoints_1 = create_descriptor(img1, keypoints_1, freak_params)
+desc_2, ret_keypoints_2 = create_descriptor(img2, keypoints_2, freak_params)
+
+# The obtained descriptors can be used to find the matches between the two
+# images using the `match_keypoints` function.
+
+matches = match_keypoints(ret_keypoints_1, ret_keypoints_2, desc_1, desc_2, 0.1)
+
+# We can use the [ImageDraw.jl](https://github.com/JuliaImages/ImageDraw.jl)
+# package to view the results.
+
+grid = hcat(img1, img2)
+offset = CartesianIndex(0, size(img1, 2))
+map(m -> draw!(grid, LineSegment(m[1], m[2] + offset)), matches)
+grid
+
+save("assets/freak.gif", cat(img, img2, grid[1:512,1:512], grid[1:512,513:1024]; dims=3); fps=1) #src
diff --git a/docs/examples/image_features/hog.jl b/docs/examples/image_features/hog.jl
new file mode 100644
index 00000000..32ad84bc
--- /dev/null
+++ b/docs/examples/image_features/hog.jl
@@ -0,0 +1,155 @@
+# ---
+# cover: assets/hog.gif
+# title: Object Detection using HOG
+# description: This demo shows the HOG descriptor
+# author: Anchit Navelkar; Ashwani Rathee
+# date: 2021-07-12
+# ---
+
+# In this tutorial, we will use the Histogram of Oriented Gradients (HOG) feature
+# descriptor and a linear SVM to create a person detector. We will first create
+# a person classifier and then use this classifier with a sliding window to
+# identify and localize people in an image.
+
+# The key challenge in creating a classifier is that it needs to work with
+# variations in illumination, pose and occlusions in the image. To achieve this,
+# we will train the classifier on an intermediate representation of the image
+# instead of the pixel-based representation. Our ideal representation (commonly
+# called a feature vector) captures information which is useful for classification
+# but is invariant to small changes in illumination and occlusions. The HOG descriptor
+# is a gradient-based representation which is invariant to local geometric and
+# photometric changes (i.e. shape and illumination changes) and so is a good
+# choice for our problem. In fact, HOG descriptors are widely used for object detection.
+
+# Download the script to get the training data [here](https://drive.google.com/file/d/11G_9zh9N-0veQ2EL5WDGsnxRpihsqLX5/view?usp=sharing).
+# Download tutorial.zip, decompress it and run get_data.bash. (Change the
+# variable `path_to_tutorial` in preprocess.jl and the path to the julia executable
+# in get_data.bash). This script will download the required datasets. We will
+# start by loading the data and computing HOG features of all the images.
+
+# ```julia
+# using Images, ImageFeatures
+
+# path_to_tutorial = "" # specify this path
+# pos_examples = "$path_to_tutorial/tutorial/humans/"
+# neg_examples = "$path_to_tutorial/tutorial/not_humans/"
+
+# n_pos = length(readdir(pos_examples)) # number of positive training examples
+# n_neg = length(readdir(neg_examples)) # number of negative training examples
+# n = n_pos + n_neg                     # number of training examples
+# data = Array{Float64}(undef, 3780, n) # array to store the HOG descriptor of each image; each 128x64 image in our training data yields a 3780-element descriptor
+# labels = Vector{Int}(undef, n)        # vector to store the label (1 = human, 0 = not human) of each image
+
+# for (i, file) in enumerate([readdir(pos_examples); readdir(neg_examples)])
+#     filename = "$(i <= n_pos ? pos_examples : neg_examples)/$file"
+#     img = load(filename)
+#     data[:, i] = create_descriptor(img, HOG())
+#     labels[i] = (i <= n_pos ? 1 : 0)
+# end
+# ```
+
+# We now have an encoded version of the images in our training data.
+# This encoding captures useful information but discards extraneous information
+# (illumination changes, pose variations etc). We will train a linear SVM on this data.
+
+# ```julia
+# using LIBSVM
+# using Random: randperm # randperm lives in the Random standard library
+
+# # Split the dataset into train and test sets. Train set = 2500 images, test set = 294 images.
+# random_perm = randperm(n)
+# train_ind = random_perm[1:2500]
+# test_ind = random_perm[2501:end]
+
+# model = svmtrain(data[:, train_ind], labels[train_ind]);
+# ```
+
+# Now let's test this classifier on some images.
+
+# ```julia
+# using Printf, Statistics # for @printf and mean
+
+# img = load("$pos_examples/per00003.ppm")
+# descriptor = Array{Float64}(undef, 3780, 1)
+# descriptor[:, 1] = create_descriptor(img, HOG())
+
+# predicted_label, _ = svmpredict(model, descriptor);
+# print(predicted_label) # 1 = human, 0 = not human
+
+# # Get the test accuracy of our model
+# predicted_labels, decision_values = svmpredict(model, data[:, test_ind]);
+# @printf "Accuracy: %.2f%%\n" mean(predicted_labels .== labels[test_ind]) * 100 # test accuracy should be > 98%
+# ```
+
+# Try testing our trained model on more images. You can see that it performs quite well.
+
+# | ![Original](assets/human1.png) | ![Original](assets/human2.png) |
+# |:------:|:---:|
+# | predicted_label = 1 | predicted_label = 1 |
+
+# | ![Original](assets/human3.png) | ![Original](assets/not-human1.jpg) |
+# |:------:|:---:|
+# | predicted_label = 1 | predicted_label = 0 |
+
+# Next we will use our trained classifier with a sliding window to localize persons in an image.
+
+# ![Original](assets/humans.jpg)
+
+# ```julia
+# img = load("$path_to_tutorial/tutorial/humans.jpg")
+# rows, cols = size(img)
+
+# scores = Array{Float64}(undef, 22, 45)
+# descriptor = Array{Float64}(undef, 3780, 1)
+
+# # Apply the classifier using a sliding window approach and store the classification score for not-human at every location in the scores array
+# for j = 32:10:cols-32
+#     for i = 64:10:rows-64
+#         box = img[i-63:i+64, j-31:j+32]
+#         descriptor[:, 1] = create_descriptor(box, HOG())
+#         predicted_label, s = svmpredict(model, descriptor)
+#         scores[Int((i - 64) / 10)+1, Int((j - 32) / 10)+1] = s[1]
+#     end
+# end
+# ```
+
+# ![](assets/scores.png)
+
+# You can see that the classifier gave a low score to the not-human class (i.e.
+# a high score to the human class) at positions corresponding to humans in
+# the original image.
+# Below we threshold the scores and suppress non-minimal values to get
+# the human locations. We then plot the bounding boxes using `ImageDraw`.
+
+# ```julia
+# using ImageDraw, ImageView
+
+# scores[scores .> 0] .= 0
+# object_locations = findlocalminima(scores)
+
+# rectangles = [
+#     [
+#         ((i[2] - 1) * 10 + 1, (i[1] - 1) * 10 + 1),
+#         ((i[2] - 1) * 10 + 64, (i[1] - 1) * 10 + 1),
+#         ((i[2] - 1) * 10 + 64, (i[1] - 1) * 10 + 128),
+#         ((i[2] - 1) * 10 + 1, (i[1] - 1) * 10 + 128),
+#     ] for i in object_locations
+# ];
+
+# for rec in rectangles
+#     draw!(img, Polygon(rec), RGB{N0f8}(0, 0, 1.0))
+# end
+# imshow(img)
+# ```
+
+# ![](assets/boxes.jpg)
+
+# In our example we were lucky that the persons in our image had roughly
+# the same size (128x64) as the examples in our training set. In general we will
+# need to take bounding boxes across multiple scales (and multiple
+# aspect ratios for some object classes), as sketched below.
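+
+# As a sketch of how such a multi-scale scan could look (this is illustrative and not
+# part of the tutorial's dataset or scripts), one can repeatedly shrink the image with
+# `imresize` and rerun the same 128x64 sliding window, mapping detections back to the
+# original image through the scale factor. It reuses `img`, `descriptor` and `model`
+# from the code above; the chosen scales are arbitrary.
+
+# ```julia
+# detections = Tuple{Int,Int,Float64}[] # (top-left row, top-left col, scale)
+# for scale in (1.0, 0.8, 0.6)
+#     small = imresize(img, ratio = scale)
+#     rows_s, cols_s = size(small)
+#     for j = 32:10:cols_s-32, i = 64:10:rows_s-64
+#         box = small[i-63:i+64, j-31:j+32]
+#         descriptor[:, 1] = create_descriptor(box, HOG())
+#         predicted_label, _ = svmpredict(model, descriptor)
+#         if predicted_label[1] == 1 # window classified as human
+#             push!(detections, (round(Int, (i - 63) / scale), round(Int, (j - 31) / scale), scale))
+#         end
+#     end
+# end
+# ```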
+
+using FileIO #src
+img1 = load("assets/humans.jpg") #src
+img2 = load("assets/boxes.jpg") #src
+save("assets/hog.gif", cat(img1[1:342,1:342], img2[1:342,1:342]; dims=3); fps=1) #src
+
diff --git a/docs/examples/image_features/orb.jl b/docs/examples/image_features/orb.jl
new file mode 100644
index 00000000..5fe210a7
--- /dev/null
+++ b/docs/examples/image_features/orb.jl
@@ -0,0 +1,74 @@
+# ---
+# cover: assets/orb.gif
+# title: ORB Descriptor
+# description: This demo shows the ORB descriptor
+# author: Anchit Navelkar; Ashwani Rathee
+# date: 2021-07-12
+# ---
+
+# The `ORB` (Oriented FAST and Rotated BRIEF) descriptor is somewhat similar to `BRIEF`.
+# It doesn't have an elaborate sampling pattern like `BRISK` or `FREAK`.
+
+# However, there are two main differences between ORB and BRIEF:
+
+# - ORB uses an orientation compensation mechanism, making it rotation invariant.
+# - ORB learns the optimal sampling pairs, whereas BRIEF uses randomly chosen sampling pairs.
+
+# The ORB descriptor uses the intensity centroid as a measure of orientation.
+# To calculate the centroid, we first need to find the moments of a patch, which are given by
+# `Mpq = Σ_{x,y} x^p * y^q * I(x, y)`. The centroid, or 'centre of mass', is then given by
+# `C = (M10/M00, M01/M00)`.
+
+# The vector from the corner's center to the centroid gives the orientation of the patch.
+# Now, the patch can be rotated to some predefined canonical orientation before calculating
+# the descriptor, thus achieving rotation invariance. (A small standalone sketch of this
+# orientation computation is given at the end of this example.)
+
+# ORB tries to take sampling pairs which are uncorrelated so that each new pair brings
+# new information to the descriptor, thus maximizing the amount of information the descriptor
+# carries. We also want the pairs to have high variance, which makes a feature more discriminative,
+# since it responds differently to different inputs. To do this, we consider the sampling pairs over
+# keypoints in standard datasets and then do a greedy evaluation of all the pairs in order
+# of distance from the mean until the desired number of pairs, i.e. the size of the descriptor, is obtained.
+
+# The descriptor is built using intensity comparisons of the pairs. For each pair, if the
+# first point has greater intensity than the second, then 1 is written, else 0 is written
+# to the corresponding bit of the descriptor.
+
+# ## Example
+
+# Let us take a look at a simple example where the ORB descriptor is used to match two
+# images where one has been translated by `(50, 40)` pixels and then rotated by an angle
+# of 150 degrees. We will use the `cameraman` image from the [TestImages](https://github.com/JuliaImages/TestImages.jl) package for this example.
+
+# First, let us create the two images we will match using ORB.
+
+using ImageFeatures, TestImages, Images, ImageDraw, CoordinateTransformations, Rotations
+
+img = testimage("cameraman")
+img1 = Gray.(img)
+rot = recenter(RotMatrix(5pi/6), [size(img1)...] .÷ 2) # a rotation around the center
+tform = rot ∘ Translation(-50, -40)
+img2 = warp(img1, tform, axes(img1))
+
+# The ORB descriptor calculates the keypoints as well as the descriptor, unlike `BRIEF`.
+# To create the ORB descriptor, we first need to define the parameters by calling the `ORB` constructor.
+
+orb_params = ORB(num_keypoints = 1000)
+
+# Now pass the image with the parameters to the `create_descriptor` function.
+
+desc_1, ret_keypoints_1 = create_descriptor(img1, orb_params)
+desc_2, ret_keypoints_2 = create_descriptor(img2, orb_params)
+
+# The obtained descriptors can be used to find the matches between the two
+# images using the `match_keypoints` function.
+
+matches = match_keypoints(ret_keypoints_1, ret_keypoints_2, desc_1, desc_2, 0.2)
+
+# We can use the [ImageDraw.jl](https://github.com/JuliaImages/ImageDraw.jl) package to view the results.
+
+grid = hcat(img1, img2)
+offset = CartesianIndex(0, size(img1, 2))
+map(m -> draw!(grid, LineSegment(m[1], m[2] + offset)), matches)
+grid
+
+save("assets/orb.gif", cat(img1, img2, grid[1:512,1:512], grid[1:512,513:1024]; dims=3); fps=1) #src
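+
+# As mentioned earlier, here is a minimal sketch (not the ImageFeatures.jl
+# implementation) of the intensity-centroid orientation computation described above.
+# The patch location is arbitrary; any small window around a keypoint would do.
+
+patch = Float64.(img1[200:231, 200:231]) # an arbitrary 32x32 patch of the grayscale image
+h, w = size(patch)
+cy, cx = (h + 1) / 2, (w + 1) / 2        # patch centre (row, column)
+m00 = sum(patch)                         # zeroth moment M00
+m10 = sum(patch[y, x] * (x - cx) for y in 1:h, x in 1:w) # M10, x-moment about the centre
+m01 = sum(patch[y, x] * (y - cy) for y in 1:h, x in 1:w) # M01, y-moment about the centre
+orientation = atan(m01, m10)             # angle of the vector from the centre to the centroid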