AlbumentationsXL is a Python package for image augmentation. It is a fork of the Albumentations package, altered to accommodate large images using the libvips and pyvips libraries.
Although Albumentations covers a wide range of augmentations in a user-friendly format, it is not well suited to large image inputs. Because image operations are performed with NumPy, OpenCV, and/or Pillow, images are loaded fully into memory, which can cause out-of-memory errors on large inputs and transformations.
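
To make the memory problem concrete, here is a back-of-the-envelope sketch of how much RAM a fully decoded image needs. The helper `in_memory_bytes` and the 100,000 x 100,000 pixel image size are illustrative assumptions, not part of either package.

```python
def in_memory_bytes(width, height, channels=3, bytes_per_sample=1):
    """Size of a fully decoded, uncompressed image held in RAM."""
    return width * height * channels * bytes_per_sample

# Hypothetical large image: 100,000 x 100,000 pixels, 8-bit RGB.
size_uint8 = in_memory_bytes(100_000, 100_000)
print(f"{size_uint8 / 1e9:.0f} GB")  # 30 GB

# A float32 copy of the same image (4 bytes per sample) is four times larger.
size_float32 = in_memory_bytes(100_000, 100_000, bytes_per_sample=4)
print(f"{size_float32 / 1e9:.0f} GB")  # 120 GB
```

Even before any intermediate copies made by a transform, a single array of this size exceeds the memory of most workstations.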
To solve this, AlbumentationsXL uses libvips and its Python wrapper pyvips as the image handler. libvips is designed to process large images with a low memory footprint while remaining fast in execution. As a result, AlbumentationsXL keeps the ease of use of Albumentations while being capable of handling inputs of any size. Fields that often deal with large images include:
- Digital pathology
- Remote sensing (satellite imagery)
Images in these domains are often stored as .tif, .mrxs, .svs, etc. pyvips can read most of these formats without any problem, as long as libvips is built with the proper dependencies.
AlbumentationsXL requires Python 3.10 or higher. To install the latest version from PyPI:
pip install -U albumentationsxl
Additionally, libvips 8.14.5 or higher and its Python bindings, pyvips 2.2.1 or higher, are required, because libvips is the core image handler. More information on how to build and install libvips and pyvips can be found in their respective documentation.
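
The minimum versions above can be checked from Python before running a pipeline. The following is a minimal sketch, not part of AlbumentationsXL: `parse_version` and `check_environment` are hypothetical helper names, and the check relies on `pyvips.version()` and `pyvips.__version__` to report the installed libvips and pyvips versions.

```python
import re
import sys

MIN_PYTHON = (3, 10)
MIN_LIBVIPS = (8, 14, 5)
MIN_PYVIPS = (2, 2, 1)

def parse_version(text):
    """Turn '8.14.5' into (8, 14, 5) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])

def check_environment():
    """Return a list of problems; an empty list means all minimums are met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.10 or higher is required")
    try:
        import pyvips
    except (ImportError, OSError):
        # Assumption: a missing package or missing libvips library fails here.
        problems.append("pyvips (or the libvips library it loads) is not available")
        return problems
    # pyvips.version(0..2) gives the libvips major, minor, and micro numbers.
    libvips_version = tuple(pyvips.version(i) for i in range(3))
    if libvips_version < MIN_LIBVIPS:
        problems.append(f"libvips {libvips_version} is older than {MIN_LIBVIPS}")
    if parse_version(pyvips.__version__) < MIN_PYVIPS:
        problems.append(f"pyvips {pyvips.__version__} is older than {MIN_PYVIPS}")
    return problems

for problem in check_environment():
    print(problem)
```

Versions are compared as integer tuples rather than strings, so that, for example, 8.9.0 is correctly treated as older than 8.14.5.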
The full documentation is available at https://albumentations.ai/docs/.
import albumentationsxl as A
import pyvips
# Declare an augmentation pipeline
transform = A.Compose([
    A.RandomCrop(width=256, height=256),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])
# Read an image with pyvips
image = pyvips.Image.new_from_file("image.jpg")
# Augment the image
transformed = transform(image=image)
# pyvips pipelines are lazy: nothing is computed until the result is written
# to a target, here a NumPy array
transformed_image = transformed["image"].numpy()
Pixel-level transforms change only the input image and leave any additional targets, such as masks, bounding boxes, and keypoints, unchanged. The supported pixel-level transforms are:
- AdvancedBlur
- Blur
- ChannelDropout
- ColorJitter
- Emboss
- FromFloat
- GaussNoise
- GaussianBlur
- HueSaturationValue
- MedianBlur
- MotionBlur
- Normalize
- RandomBrightnessContrast
- RandomGamma
- Sharpen
Spatial-level transforms change both the input image and additional targets such as masks, bounding boxes, and keypoints. The following table shows which additional targets each transform supports.
Transform | Image | Masks | BBoxes | Keypoints |
---|---|---|---|---|
Affine | ✓ | ✓ | ✓ | ✓ |
CenterCrop | ✓ | ✓ | ✓ | ✓ |
CoarseDropout | ✓ | ✓ | ✓ | |
Crop | ✓ | ✓ | ✓ | ✓ |
ElasticTransform | ✓ | ✓ | ✓ | |
Flip | ✓ | ✓ | ✓ | ✓ |
GridDropout | ✓ | ✓ | | |
HorizontalFlip | ✓ | ✓ | ✓ | ✓ |
Lambda | ✓ | ✓ | ✓ | ✓ |
LongestMaxSize | ✓ | ✓ | ✓ | ✓ |
NoOp | ✓ | ✓ | ✓ | ✓ |
PadIfNeeded | ✓ | ✓ | ✓ | ✓ |
RandomCrop | ✓ | ✓ | ✓ | ✓ |
RandomRotate90 | ✓ | ✓ | ✓ | ✓ |
RandomScale | ✓ | ✓ | ✓ | ✓ |
Resize | ✓ | ✓ | ✓ | ✓ |
Rotate | ✓ | ✓ | ✓ | ✓ |
SmallestMaxSize | ✓ | ✓ | ✓ | ✓ |
Transpose | ✓ | ✓ | ✓ | ✓ |
VerticalFlip | ✓ | ✓ | ✓ | ✓ |
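
The difference between pixel-level and spatial-level transforms can be illustrated with a toy sketch. Plain nested lists stand in for images here, and `brighten` and `horizontal_flip` are hypothetical helpers written for this illustration. They are not the AlbumentationsXL implementation, which operates on pyvips images.

```python
# Toy illustration only: nested lists stand in for an image and its mask.

def brighten(image, delta):
    """Pixel-level: changes intensity values, so it applies to the image only."""
    return [[min(255, max(0, px + delta)) for px in row] for row in image]

def horizontal_flip(array):
    """Spatial-level: moves pixels, so image and mask must both be flipped."""
    return [row[::-1] for row in array]

image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[0, 0, 1],
        [0, 1, 1]]  # class labels aligned with the image pixels

# Pixel-level transform: the mask is passed through untouched.
image_px, mask_px = brighten(image, 100), mask

# Spatial-level transform: the same geometric change is applied to both,
# which keeps every label on top of the pixel it describes.
image_sp, mask_sp = horizontal_flip(image), horizontal_flip(mask)

print(image_px)  # [[110, 120, 130], [140, 150, 160]]
print(mask_sp)   # [[1, 0, 0], [1, 1, 0]]
```

Applying a brightness change to the mask would corrupt its class labels, while skipping the flip on the mask would misalign it with the image. That is why the two groups of transforms treat additional targets differently.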