load_image now accepts file objects that support being read #1423

Merged on Jan 10, 2025 (9 commits)
40 changes: 36 additions & 4 deletions deepface/commons/image_utils.py
@@ -1,7 +1,7 @@
 # built-in dependencies
 import os
 import io
-from typing import Generator, List, Union, Tuple
+from typing import IO, Generator, List, Union, Tuple
 import hashlib
 import base64
 from pathlib import Path
@@ -77,11 +77,11 @@ def find_image_hash(file_path: str) -> str:
     return hasher.hexdigest()


-def load_image(img: Union[str, np.ndarray]) -> Tuple[np.ndarray, str]:
+def load_image(img: Union[str, np.ndarray, IO[bytes]]) -> Tuple[np.ndarray, str]:
     """
-    Load image from path, url, base64 or numpy array.
+    Load image from path, url, file object, base64 or numpy array.
     Args:
-        img: a path, url, base64 or numpy array.
+        img: a path, url, file object, base64 or numpy array.
     Returns:
         image (numpy array): the loaded image in BGR format
         image name (str): image name itself
@@ -91,6 +91,14 @@ def load_image(img: Union[str, np.ndarray]) -> Tuple[np.ndarray, str]:
     if isinstance(img, np.ndarray):
         return img, "numpy array"

+    # The image is an object that supports `.read`
+    if hasattr(img, 'read') and callable(img.read):
+        if isinstance(img, io.StringIO):
+            raise ValueError(
+                'img requires bytes and cannot be an io.StringIO object.'
+            )
+        return load_image_from_io_object(img), 'io object'
+
     if isinstance(img, Path):
         img = str(img)

@@ -120,6 +128,30 @@ def load_image(img: Union[str, np.ndarray]) -> Tuple[np.ndarray, str]:
     return img_obj_bgr, img


+def load_image_from_io_object(obj: IO[bytes]) -> np.ndarray:
+    """
+    Load image from an object that supports being read
+    Args:
+        obj: a file like object.
+    Returns:
+        img (np.ndarray): The decoded image as a numpy array (OpenCV format).
+    """
+    try:
+        _ = obj.seek(0)
+    except (AttributeError, TypeError, io.UnsupportedOperation):
+        seekable = False
+        obj = io.BytesIO(obj.read())
+    else:
+        seekable = True
+    try:
+        nparr = np.frombuffer(obj.read(), np.uint8)
+        img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
+        return img
+    finally:
+        if not seekable:
+            obj.close()
+
+
 def load_image_from_base64(uri: str) -> np.ndarray:
     """
     Load image from base64 string.
16 changes: 16 additions & 0 deletions tests/test_represent.py
@@ -1,4 +1,5 @@
 # built-in dependencies
+import io
 import cv2

 # project dependencies
@@ -18,6 +19,21 @@ def test_standard_represent():
     logger.info("✅ test standard represent function done")


+def test_standard_represent_with_io_object():
+    img_path = "dataset/img1.jpg"
+    defualt_embedding_objs = DeepFace.represent(img_path)
serengil marked this conversation as resolved.
+    io_embedding_objs = DeepFace.represent(open(img_path, 'rb'))
Owner

What if I pass a text file, such as:

io_obj = io.BytesIO(open("requirements.txt", 'rb').read())

Contributor Author

Yep, that'll work as io.BytesIO supports .read.

Owner

This is not okay, then. We should only be able to pass image files to deepface functions.

Contributor Author (@PyWoody, Jan 9, 2025)

Oh, I'm sorry, I misunderstood what you were asking.

An error would be raised by np.frombuffer(obj.read(), np.uint8) or cv2.imdecode(nparr, cv2.IMREAD_COLOR), just as if you were trying to pass any other non-supported filetype as a string/filepath to load_image.

I only meant that it would pass the hasattr(img, "read") and callable(img.read) check.
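
To make the distinction concrete, here is a minimal sketch of the duck-typing check on its own. The helper name `looks_like_file_object` is hypothetical and not part of deepface; it only mirrors the `hasattr`/`callable` condition from the diff and says nothing about whether the bytes form a valid image.

```python
import io


def looks_like_file_object(img) -> bool:
    """Hypothetical helper: True if `img` exposes a callable `.read`."""
    return hasattr(img, "read") and callable(img.read)


# Any readable object passes, regardless of what its bytes contain.
print(looks_like_file_object(io.BytesIO(b"not an image")))  # True
# io.StringIO also passes, which is why load_image rejects it separately.
print(looks_like_file_object(io.StringIO("text")))  # True
# Plain strings (paths, URLs, base64) have no `.read` and fall through.
print(looks_like_file_object("dataset/img1.jpg"))  # False
```

Validation of the content therefore has to happen later, at decode time.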

Owner

I tried to call load_image_from_io_object with requirements.txt but it didn't throw any exception

Contributor Author (@PyWoody, Jan 9, 2025)

I see. I had been following how load_image_from_web handled malformed images, but, if you want to raise the error in the function, I could update load_image_from_io_object to raise a ValueError if cv2.imdecode returns None like how load_image_from_file_storage is currently implemented.

It does appear that the modules in deepface.modules already do None checks when loading img_path, so load_image_from_io_object currently follows that convention.

Owner

Would you please raise an error if it is not an image?

Contributor Author

load_image_from_io_object now raises a ValueError if cv2.imdecode returns None for objects that aren't images, which is in line with how load_image_from_file_storage handles non-image objects.
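
A rough sketch of that kind of check follows. It is not the code merged in this PR: the names `decode_or_raise` and `cv2_decode` and the error message are assumptions made for illustration, and the decoder is passed in as a parameter so the `None` handling can be tried without OpenCV installed.

```python
from typing import IO, Any, Callable, Optional


def cv2_decode(data: bytes) -> Optional[Any]:
    """Decode bytes with OpenCV; imdecode yields None for data it cannot decode."""
    # Imported lazily so the rest of the sketch runs without OpenCV.
    import cv2
    import numpy as np

    return cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)


def decode_or_raise(
    obj: IO[bytes],
    decode: Callable[[bytes], Optional[Any]] = cv2_decode,
) -> Any:
    """Read `obj` fully and decode it, raising instead of returning None."""
    img = decode(obj.read())
    if img is None:
        # Hypothetical message; the real wording in deepface may differ.
        raise ValueError("Failed to decode image from the given io object")
    return img
```

With the default decoder, passing an `io.BytesIO` built from `requirements.txt` would be expected to raise `ValueError` rather than hand `None` back to the caller.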

+    assert defualt_embedding_objs == io_embedding_objs
+
+    # Confirm non-seekable io objects are handled properly
+    io_obj = io.BytesIO(open(img_path, 'rb').read())
+    io_obj.seek = None
+    no_seek_io_embedding_objs = DeepFace.represent(io_obj)
+    assert defualt_embedding_objs == no_seek_io_embedding_objs
+
+    logger.info("✅ test standard represent with io object function done")
+
+
 def test_represent_for_skipped_detector_backend_with_image_path():
     face_img = "dataset/img5.jpg"
     img_objs = DeepFace.represent(img_path=face_img, detector_backend="skip")
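
The seek-or-fall-back handling exercised by the non-seekable case in test_standard_represent_with_io_object can be sketched with the standard library alone. `read_from_start` and `ReadOnlyStream` are hypothetical names used only for this illustration; the sketch returns raw bytes and leaves decoding out.

```python
import io
from typing import IO


def read_from_start(obj: IO[bytes]) -> bytes:
    """Return the object's bytes, rewinding first when the object allows it."""
    try:
        obj.seek(0)
    except (AttributeError, TypeError, io.UnsupportedOperation):
        # Not seekable: read whatever remains from the current position.
        pass
    return obj.read()


class ReadOnlyStream:
    """Hypothetical stand-in for a stream that supports only `.read`."""

    def __init__(self, data: bytes) -> None:
        self._buffer = io.BytesIO(data)

    def read(self, size: int = -1) -> bytes:
        return self._buffer.read(size)


seekable = io.BytesIO(b"image-bytes")
seekable.read(5)  # advance the position, as an earlier consumer might
print(read_from_start(seekable))  # b'image-bytes'
print(read_from_start(ReadOnlyStream(b"image-bytes")))  # b'image-bytes'
```

Rewinding matters because a caller may hand over a file object that has already been partially read; without `seek(0)` only the tail of the image would reach the decoder.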