This is a dynamic codec plugin for Blosc2 that lets you compress and decompress images using the High Throughput JPEG 2000 (HTJ2K) standard. The HT version offers better performance than traditional JPEG 2000. For details, check the HTJ2K whitepaper.
To provide this feature, the plugin uses the OpenHTJ2K library.
The Blosc2 OpenHTJ2K plugin is distributed as a Python wheel:
pip install blosc2-openhtj2k
There are wheels for Linux, macOS and Windows. As of now, only the x86-64 architecture is supported.
The examples are not distributed with the wheel, but you can just clone the project:
git clone https://github.com/Blosc/blosc2_openhtj2k.git
cd blosc2_openhtj2k
The examples folder contains the compress and decompress scripts. Both take two required arguments, the input file and the output file:
- compress.py takes as its first argument the path to an image, and as its second argument the path to the output file, which should end with the .b2nd extension.
- decompress.py takes as its first argument the path to the Blosc2 file generated by compress.py, and as its second argument the path to the output image.
To try out these scripts, first install the required software:
pip install blosc2-openhtj2k
pip install Pillow
Then run the scripts from the examples folder, for example:
cd examples
python compress.py kodim23.png /tmp/kodim23.b2nd
python decompress.py /tmp/kodim23.b2nd /tmp/kodim23.png
Note that the examples cannot be run from the project's root, because importing blosc2_openhtj2k will fail there, since the project's root contains a directory with that name.
For details on the arguments these commands accept, call them with the --help option.
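As a rough idea of how a script with two required positional arguments and a --help option can be put together, here is a minimal argparse sketch. It is not the code from the examples folder: the inputfile name mirrors the args.inputfile used later in this document, while outputfile, build_parser and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser; the real scripts may define more options.
    parser = argparse.ArgumentParser(
        description="Compress an image into a Blosc2 file (illustrative sketch)."
    )
    # Positional arguments are required by default in argparse.
    parser.add_argument("inputfile", help="path to the input image")
    parser.add_argument("outputfile", help="path to the output file, e.g. ending in .b2nd")
    return parser


# Parse an explicit argument list instead of sys.argv, to keep the sketch self-contained.
args = build_parser().parse_args(["kodim23.png", "/tmp/kodim23.b2nd"])
print(args.inputfile, args.outputfile)
```

argparse generates the --help output automatically from the description and help strings.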
Below are more detailed docs on how to accomplish the compression and decompression.
To compress an image, we first need to load it and convert it to a NumPy array; Blosc2 will then compress that array.
For loading the image and getting the NumPy array, we are going to use the Pillow library:
import numpy as np
from PIL import Image
im = Image.open(args.inputfile)
np_array = np.asarray(im)
Before feeding this array to Blosc2, we need to massage it a bit, because its structure is different from what the OpenHTJ2K plugin expects. As can be seen in the compress.py script, these are the required transformations, with comments:
# Transpose the array so the channel (color) comes first
# Change from (height, width, channel) to (channel, width, height)
np_array = np.transpose(np_array, (2, 1, 0))
# Make the array C-contiguous
np_array = np_array.copy()
# The library expects 4 bytes per color (xx 00 00 00), so change the type
np_array = np_array.astype('uint32')
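The effect of these three steps can be checked on a small synthetic array, without any image file. The sketch below assumes a little-endian layout (spelled out as '<u4' so the byte check is deterministic; the script itself uses 'uint32', which is little-endian on x86-64). The to_plugin_layout helper and the 2x3 test image are made up for illustration.

```python
import numpy as np


def to_plugin_layout(image):
    """Hypothetical helper wrapping the three steps shown above."""
    # (height, width, channel) -> (channel, width, height)
    planes = np.transpose(image, (2, 1, 0))
    # Make the array C-contiguous
    planes = planes.copy()
    # 4 bytes per sample, little-endian (assumption, see lead-in)
    return planes.astype("<u4")


# Synthetic RGB image with height=2, width=3 and sample values 0..17
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
planes = to_plugin_layout(image)

print(planes.shape)                  # (3, 3, 2)
print(planes.dtype.itemsize)         # 4
print(planes.flags["C_CONTIGUOUS"])  # True
print(planes[1, 0, :1].tobytes())    # b'\x01\x00\x00\x00'
```

The last line shows the "xx 00 00 00" pattern: the sample value 1 occupies the first byte, followed by three zero bytes. Note that the array handed to Blosc2 is therefore four times larger in memory than the original 8-bit image.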
The OpenHTJ2K plugin can be configured with a number of options; this step is optional. For example:
import blosc2_openhtj2k
blosc2_openhtj2k.set_params_defaults(
    transformation=0,  # 0:lossy 1:lossless (default is 1)
)
Once set, these options remain in effect for all future calls, until they are changed again.
Note that:
- We must tell Blosc2 to use the OpenHTJ2K codec, passing its corresponding id BLOSC_CODEC_OPENHTJ2K.
- At this time the plugin does not support multithreading, so the number of threads must be explicitly set to 1.
- OpenHTJ2K expects to work with images, so this plugin won't work well when combined with regular Blosc2 filters (SHUFFLE, BITSHUFFLE, BYTEDELTA...), nor with split mode, because they change the image completely; so these must be reset.
With that, we typically define the compression and decompression parameters like this:
import blosc2
nthreads = 1
cparams = {
    'codec': blosc2.Codec.OPENHTJ2K,
    'nthreads': nthreads,
    'filters': [],
    'splitmode': blosc2.SplitMode.NEVER_SPLIT,
}
dparams = {'nthreads': nthreads}
Now we can call the command that will compress the image using Blosc2 and the OpenHTJ2K plugin:
bl_array = blosc2.asarray(
    np_array,
    chunks=np_array.shape,
    blocks=np_array.shape,
    cparams=cparams,
    dparams=dparams,
)
Note that:
- We set the chunk and block shape to match the image size, as this is well tested.
- If you pass urlpath, the Blosc2 array will be saved to the given path, for example urlpath='/tmp/image.b2nd'.
If the Blosc2 array was saved to a file with a different program, we will need to read it first:
array = blosc2.open(args.inputfile)
Decompressing it is now easy, and gives us a NumPy array:
np_array = array[:]
But before obtaining the image, we must undo the transformations done when compressing:
# Get back 1 byte per color, change dtype from uint32 to uint8
np_array = np_array.astype('uint8')
# Get back the original shape: height, width, channel
np_array = np.transpose(np_array, (2, 1, 0))
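Because every 8-bit sample fits in a uint32 and the (2, 1, 0) permutation is its own inverse, these two steps exactly undo the compression-side transformations. Below is a minimal check on random data with no codec involved, so it only exercises the layout conversion; the helper names, the image size and the seed are assumptions for illustration.

```python
import numpy as np


def to_plugin_layout(image):
    # (height, width, channel) uint8 -> (channel, width, height) uint32
    return np.transpose(image, (2, 1, 0)).copy().astype("uint32")


def from_plugin_layout(planes):
    # (channel, width, height) uint32 -> (height, width, channel) uint8
    return np.transpose(planes.astype("uint8"), (2, 1, 0))


# Random 4x6 RGB image standing in for a real picture
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

restored = from_plugin_layout(to_plugin_layout(image))
print(restored.shape, restored.dtype)   # (4, 6, 3) uint8
print(np.array_equal(restored, image))  # True
```

With the plugin in lossy mode, the decompressed samples would generally differ from the originals, so an exact comparison like this one applies to the layout conversion only.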
Now we can get the Pillow image from the NumPy array:
im = Image.fromarray(np_array)
Which can be saved or displayed:
im.save(...)
im.show()
This plugin can also be called from a regular C program. For an example, look at: src/test_j2k.c