Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

load_image now accepts file objects that support being read #1423

Merged
merged 9 commits into from
Jan 10, 2025

Conversation

PyWoody
Copy link
Contributor

@PyWoody PyWoody commented Jan 9, 2025

What has been done

With this PR, deepface.commons.image_utils.load_image now accepts file objects that support being read.

Objects that support being read but not seeked are handled by reading the data into a new io.BytesIO object which is automatically closed before returning the data or raising an exception.

Examples of new functionalities are...

PIL Image

Below is an example of opening a PIL.Image in memory, rotating the image, writing the modified image to an in-memory io.BytesIO object, and then performing a DeepFace.represent calculation on the in-memory object.

Modifications of this ilk could be converting images to greyscale, resizing, cropping, etc., all done in memory while not having to constantly write to disk.

from deepface import DeepFace
from PIL import Image

representations = []
with Image.open(r'/path/to/image.png') as img:
    for rotation in range(4):
        pil_io = io.BytesIO()
        rotated_img = img.rotate(rotation * 90)
        rotated_img.save(pil_io, format='PNG')
        rp = DeepFace.represent(pil_io, enforce_detection=False)
        representations.append(rp)
        pil_io.close()

Streaming object

I know deepface.commons.image_utils.load_image already accepts a URL as a string but it can be more convenient to control how the request is made yourself than by the library, i.e., allowing redirects, controlling timeout, handling tokens with Sessions, etc.

import requests

from deepface import DeepFace

r = requests.get('https://raw.githubusercontent.com/serengil/deepface/refs/heads/master/tests/dataset/img1.jpg', stream=True)
r.raise_for_status()
representations = DeepFace.represent(r.raw)

As a File Object

with open(r'/path/to/image.png', 'rb') as f:
    representation = DeepFace.represent(f)

or

representation = DeepFace.represent(open(r'/path/to/image.png', 'rb'))

This functionality, while already provided by passing the filepath as a string, closer mirrors the functionality of other Python libraries, such as how json.load[0], pickle.load[1] and PIL.Image.open[2] operate.

From a ZipArchive

from zipfile import ZipFile

with ZipFile('some_archive.zip') as z:
    with z.open(r'/path/to/image.png') as f:
        representation = DeepFace.represent(f)

[0] https://docs.python.org/3/library/json.html#json.load
[1] https://docs.python.org/3/library/pickle.html#pickle.load
[2] https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.open

How to test

A new test at deepface.tests.test_represent.test_standard_represent_with_io_object has been created to ensure representations loaded with the new deepface.commons.image_utils.load_image_from_io_object are identical to the default implementation. The new test also ensures objects that do not support seek, e.g., streaming objects, are also properly handled and supported.

make lint && make test

tests/test_represent.py Outdated Show resolved Hide resolved
def test_standard_represent_with_io_object():
img_path = "dataset/img1.jpg"
defualt_embedding_objs = DeepFace.represent(img_path)
io_embedding_objs = DeepFace.represent(open(img_path, 'rb'))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if i pass a text file as

io_obj = io.BytesIO(open("requirements.txt", 'rb').read())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that'll work as io.BytesIO supports .read.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not okay then, we should be able to pass only image files to deepface functionalities.

Copy link
Contributor Author

@PyWoody PyWoody Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm sorry, I misunderstood what you were asking.

An error would be raised by np.frombuffer(obj.read(), np.uint8) or cv2.imdecode(nparr, cv2.IMREAD_COLOR), just as if you were trying to pass any other non-supported filetype as a string/filepath to load_image.

I only meant it would pass the hasattr(img, "read") and callable(img.read)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to call load_image_from_io_object with requirements.txt but it didn't throw any exception

Copy link
Contributor Author

@PyWoody PyWoody Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I had been following how load_image_from_web handled malformed images, but, if you want to raise the error in the function, I could update load_image_from_io_object to raise a ValueError if cv2.imdecode returns None like how load_image_from_file_storage is currently implemented.

It does appear modules in deepface.modules all already do None checks when loading the img_path, which load_image_from_io_object does follow in convention.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please raise an error if it is not an image?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load_image_from_io_object now raises a ValueException if cv2.imdecode returns None for objects that aren't images, which is in line with how load_image_from_file_storage handles non-image objects.

@serengil
Copy link
Owner

serengil commented Jan 9, 2025

TBH, i am not sure about this functionality.

Would you please create an issue first and let us to discuss first?

@PyWoody
Copy link
Contributor Author

PyWoody commented Jan 9, 2025

Sure, I have to step out for a minute but I can create the issue when I get back.

@serengil
Copy link
Owner

serengil commented Jan 9, 2025

Sure, I have to step out for a minute but I can create the issue when I get back.

No, not for this, before creating a PR. Otherwise, not merging comprehensive PR makes me sad.

@PyWoody
Copy link
Contributor Author

PyWoody commented Jan 9, 2025

No worries. I'll make sure to do that next time.

It's no problem if you don't want to merge the PR. This is a feature I need for my personal use case anyways. I thought I should share in case others needed it, so there's no wasted effort.


# Confirm non-image io objects raise exceptions
with pytest.raises(ValueError, match='Failed to decode image'):
DeepFace.represent(io.BytesIO(open(__file__, 'rb').read()))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of passing file, can we send the path of the requirements.txt here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, should be all set now.

@serengil
Copy link
Owner

@PyWoody
Copy link
Contributor Author

PyWoody commented Jan 10, 2025

Nope, that's how PIL does it and the others I checked. I just wanted to double-check first. These should be all set now.

@serengil
Copy link
Owner

Would you please the types in docstring, too?

e.g. https://github.com/serengil/deepface/blob/master/deepface/DeepFace.py#L87

@PyWoody
Copy link
Contributor Author

PyWoody commented Jan 10, 2025

DeepFace.py and modules/detection.py have had their respective docstrings updated.

@serengil
Copy link
Owner

Failed at linting stage:

************* Module deepface.DeepFace
deepface/DeepFace.py:179:0: C0301: Line too long (108/100) (line-too-long)
deepface/DeepFace.py:284:0: C0301: Line too long (108/100) (line-too-long)
deepface/DeepFace.py:386:0: C0301: Line too long (108/100) (line-too-long)

-----------------------------------
Your code has been rated at 9.99/10

Besides, do you mind to add description in docsting? For instance, the description of the argument here - https://github.com/serengil/deepface/blob/master/deepface/DeepFace.py#L88

@PyWoody
Copy link
Contributor Author

PyWoody commented Jan 10, 2025

Ok, I'll update the descriptions. The current commit should fix the docstring issue. I didn't notice it regressed on my local during the make lint stage.

@serengil serengil merged commit 29200f4 into serengil:master Jan 10, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants