Extreme memory usage loading OBB dataset #1762

Open
patel-zeel opened this issue Jan 1, 2025 · 0 comments
Labels
bug Something isn't working

patel-zeel commented Jan 1, 2025

Search before asking

  • I have searched the Supervision issues and found no similar bug report.

Bug

I tried loading the DOTAv1 dataset processed by Ultralytics from this URL: https://github.com/ultralytics/assets/releases/download/v0.0.0/DOTAv1.zip with sv.DetectionDataset.from_yolo, but it was unsuccessful on Google Colab. When I loaded a subset of 100 instances on our compute server, it consumed nearly 36 GB of memory. A free Google Colab instance comes with only ~12 GB of memory, so it was bound to fail there.

I investigated further and found the following reason for the extreme memory usage:

  1. The _with_mask function is called at the following position, and it sets the with_masks argument to True for OBB annotations:
    with_masks = _with_mask(lines=lines)
  2. The with_masks argument is then passed to the following function along with other arguments:
    def yolo_annotations_to_detections(
    lines: List[str],
    resolution_wh: Tuple[int, int],
    with_masks: bool,
    is_obb: bool = False,
    ) -> Detections:
  3. Because with_masks is True, the following lines are executed:
    mask = _polygons_to_masks(polygons=polygons, resolution_wh=resolution_wh)
    return Detections(class_id=class_id, xyxy=xyxy, data=data, mask=mask)
  4. Each 2D mask has the same (h, w) shape as the image resolution. If an image has N detections, the mask array has shape (N, h, w), which is not memory-friendly for high-resolution images.
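A back-of-the-envelope estimate makes the problem concrete (a sketch; the 4000×4000 image size is a representative DOTAv1 resolution I am assuming, not a figure from the report):

```python
import numpy as np

# A boolean mask occupies 1 byte per pixel in NumPy.
w, h = 4000, 4000      # assumed representative DOTAv1 image resolution
n_detections = 100     # detections in a single image

# Stacked mask array of shape (N, h, w), as built by the loader.
mask = np.zeros((n_detections, h, w), dtype=bool)
gib = mask.nbytes / 1024**3
print(f"{gib:.1f} GiB for one image's masks")  # ≈ 1.5 GiB
```

At roughly 1.5 GiB per densely annotated image, even a small subset of the dataset quickly exhausts the ~12 GB available on a free Colab instance.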

The size of this mask array is the main reason for the extreme memory usage.

Environment

Google Colab:

  • Supervision: 0.25.1
  • Python: 3.10.12
  • Ubuntu: 22.04

Minimal Reproducible Example

Colab to reproduce the results.

Additional

I have not taken a deeper look, but a naive solution could be to generate masks only when needed and then discard them for efficient memory usage.
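The lazy approach could look something like the sketch below: keep only the (small) polygon coordinates and rasterize a mask when one is actually requested. All names here (LazyMasks, the bounding-box fill) are hypothetical; supervision's real rasterization uses cv2.fillPoly, which I replace with a placeholder so the example stays self-contained:

```python
import numpy as np
from typing import List, Tuple

class LazyMasks:
    """Hypothetical on-demand mask container: stores polygons instead of
    an (N, h, w) boolean array, rasterizing one mask per access."""

    def __init__(self, polygons: List[np.ndarray], resolution_wh: Tuple[int, int]):
        self.polygons = polygons          # each polygon: (num_points, 2) int array
        self.resolution_wh = resolution_wh

    def __len__(self) -> int:
        return len(self.polygons)

    def __getitem__(self, i: int) -> np.ndarray:
        # Rasterize a single mask on demand; a bounding-box fill stands in
        # for proper polygon rasterization (cv2.fillPoly in supervision).
        w, h = self.resolution_wh
        mask = np.zeros((h, w), dtype=bool)
        xs, ys = self.polygons[i][:, 0], self.polygons[i][:, 1]
        mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = True
        return mask

# Usage: one 4-point polygon in an 8x6 image; only this one mask is
# materialized, and it can be garbage-collected after use.
polygons = [np.array([[1, 1], [4, 1], [4, 3], [1, 3]])]
masks = LazyMasks(polygons, resolution_wh=(8, 6))
print(masks[0].sum())  # 12 pixels filled for the 4x3 placeholder box
```

Memory then scales with the number of polygon vertices rather than with N × image area, at the cost of re-rasterizing a mask each time it is accessed.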

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@patel-zeel patel-zeel added the bug Something isn't working label Jan 1, 2025