
[WIP] Add Panoptic Quality (PQ) #408

Draft · wants to merge 43 commits into main

Conversation

@NielsRogge (Contributor) commented Jan 30, 2023

This PR adds the panoptic quality (PQ) metric, based on the original implementation.
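For reference, panoptic quality (as defined in the original paper) matches predicted and ground truth segments of the same class when their IoU is strictly greater than 0.5, and is then computed as

$$\mathrm{PQ} = \frac{\sum_{(p, g) \in TP} \mathrm{IoU}(p, g)}{|TP| + \tfrac{1}{2}|FP| + \tfrac{1}{2}|FN|}$$

i.e. the average IoU of matched segment pairs, penalized by unmatched predictions (FP) and unmatched ground truth segments (FN).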

Unlike most metrics, which only require two inputs to the add_batch method (the predictions and the references), the panoptic quality metric requires two additional inputs: the predicted segments_info and the ground truth segments_info, which carry extra information about the predicted and ground truth segmentation maps, respectively.
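Roughly, usage would look like the sketch below; the loading path and the exact keyword names for the two extra inputs are placeholders, not the final API:

```python
import evaluate

# Hypothetical loading path; the final name/location of the metric is still to be decided.
metric = evaluate.load("panoptic_quality")

# Placeholders: each segmentation map is an (h, w) array of segment ids, and each
# segments_info is a list of dicts describing the segments in the corresponding map.
metric.add_batch(
    predictions=predicted_segmentation_maps,
    references=ground_truth_segmentation_maps,
    predicted_segments_info=predicted_segments_info,
    reference_segments_info=ground_truth_segments_info,
)
results = metric.compute()
```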

Refer to this notebook for evaluating Mask2Former on the COCO panoptic validation set using this metric.

To do:

  • Decide on the API: which keys should be in the ground truth and predicted segments_info? Ideally both should contain the same keys.
  • Support multiprocessing, which currently doesn't work.

@NielsRogge requested a review from alaradirik on January 30, 2023 14:47
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@NielsRogge marked this pull request as draft on January 30, 2023 14:53
@alaradirik

@NielsRogge thank you for working on this! I think it'll really make it easier to use the segmentation models as well.

I'm guessing researchers will evaluate their models on public datasets such as COCO and ADE20K, and other users will use their custom datasets and will want to evaluate using a minimal setup. My proposal is as follows:

  • Ground truth annotations include id, category_id, iscrowd, and optionally area (set to None by default) keys. If bbox is included in the dictionary, we simply ignore it.
  • The iscrowd key is very specific to the COCO dataset; perhaps we can look for a was_fused key and map the iscrowd key to it if the was_fused key is not included in the ground truth annotations.
  • If there is no area key in the ground truth or prediction annotations, we perform Connected Components Analysis on each (h, w) panoptic segmentation map to compute the area of each instance (fused or unfused). cv2.connectedComponentsWithStats returns the area of each instance; see the sketch after this list.
  • Predicted segments include the id, category_id, was_fused and optionally the area key.
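A rough sketch of what that area fallback could look like (the helper name and the per-segment loop are my own illustration; only cv2.connectedComponentsWithStats is from the proposal above):

```python
import cv2
import numpy as np

def compute_segment_areas(panoptic_map: np.ndarray) -> dict:
    """Pixel area of every segment id in an (h, w) panoptic segmentation map."""
    areas = {}
    for segment_id in np.unique(panoptic_map):
        mask = (panoptic_map == segment_id).astype(np.uint8)
        # connectedComponentsWithStats returns (num_labels, labels, stats, centroids);
        # stats[i, cv2.CC_STAT_AREA] is the pixel area of component i (row 0 is the background).
        _, _, stats, _ = cv2.connectedComponentsWithStats(mask)
        areas[int(segment_id)] = int(stats[1:, cv2.CC_STAT_AREA].sum())
    return areas
```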

I'm in favor of changing label_id to category_id in our post-processing methods and keeping was_fused as it is, since iscrowd is not a descriptive attribute name and users would have a harder time preparing their custom datasets for evaluation.

cc @amyeroberts

@NielsRogge (Contributor Author) commented Feb 1, 2023

A few remarks:

> Ground truth annotations include id, category_id, iscrowd, and optionally area (set to None by default) keys. If bbox is included in the dictionary, we simply ignore it.

I'm not sure it's possible to define optional features for the inputs of add_batch, cc @lvwerra. Here, we'd like to have an optional key in the reference_annotations.

> If there is no area key in the ground truth or prediction annotations, we perform Connected Components Analysis on each (h, w) panoptic segmentation map to compute the area of each instance (fused or unfused). cv2.connectedComponentsWithStats returns the area of each instance.

I would definitely avoid adding cv2 as a dependency, as this library is pretty painful to install.
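For what it's worth, the same per-segment areas can be computed with plain numpy, so the fallback wouldn't strictly need cv2 (a minimal sketch, not part of this PR):

```python
import numpy as np

def compute_segment_areas(panoptic_map: np.ndarray) -> dict:
    """Pixel area of every segment id in an (h, w) panoptic segmentation map, without cv2."""
    segment_ids, counts = np.unique(panoptic_map, return_counts=True)
    return {int(segment_id): int(count) for segment_id, count in zip(segment_ids, counts)}
```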

> Predicted segments include the id, category_id, was_fused and optionally the area key.

The was_fused key isn't used by the metric either, for the moment.

Ideally the same keys should be present in the ground truth and predicted segments_info (it's a bit weird to have different keys in both).

@lvwerra (Member) commented Feb 1, 2023

> I'm not sure it's possible to define optional features for the inputs of add_batch, cc @lvwerra. Here, we'd like to have an optional key in the reference_annotations.

Yes, you can! The features can be a list of different formats and evaluate figures out which one matches the provided data (check out bleu for example). You will just need to be consistent across all batches.
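A minimal sketch of how that could look for this metric, following the same list-of-Features pattern as bleu (the concrete feature types below are assumptions, not the agreed-upon API):

```python
import datasets

# Two accepted formats for the reference segments_info: one with the optional "area"
# key and one without. evaluate picks whichever matches the data passed in, as long
# as the format stays consistent across batches. predicted_segments_info would be
# defined analogously.
segments_info_with_area = datasets.Sequence(
    {
        "id": datasets.Value("int64"),
        "category_id": datasets.Value("int64"),
        "area": datasets.Value("int64"),
    }
)
segments_info_without_area = datasets.Sequence(
    {
        "id": datasets.Value("int64"),
        "category_id": datasets.Value("int64"),
    }
)

features = [
    datasets.Features(
        {
            "predictions": datasets.Image(),
            "references": datasets.Image(),
            "reference_segments_info": segments_info_with_area,
        }
    ),
    datasets.Features(
        {
            "predictions": datasets.Image(),
            "references": datasets.Image(),
            "reference_segments_info": segments_info_without_area,
        }
    ),
]
```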

@lvwerra (Member) commented Mar 14, 2023

Let me know if you need another review on this @NielsRogge.

@NielsRogge (Contributor Author)

The metric is actually in a ready state; the final API just needs to be decided (i.e. which keys need to be in the predicted vs. ground truth annotations). cc @alaradirik

@alaradirik commented Mar 15, 2023

> The metric is actually in a ready state; the final API just needs to be decided (i.e. which keys need to be in the predicted vs. ground truth annotations). cc @alaradirik

Sorry about that, I thought I replied to your remarks!

I'm in favor of excluding keys that are not used for the metric computation - iscrowd/was_fused, area, bbox. Users can still include these in the ground truth annotations for the sake of convenience and we can drop them during the actual computation.

So the ground truth annotations would have the id, category_id, and optionally the iscrowd, area and bbox keys, and the predictions would have the id and category_id keys. We can exclude the optional / unused keys from the documentation and simply add a remark that additional metadata will be ignored.
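Concretely, the annotations would then look something like this (illustrative values only):

```python
ground_truth_segments_info = [
    # iscrowd, area and bbox may be present but would be dropped before computation
    {"id": 1, "category_id": 17, "iscrowd": 0, "area": 53408},
    {"id": 2, "category_id": 25},
]
predicted_segments_info = [
    {"id": 1, "category_id": 17},
    {"id": 2, "category_id": 25},
]
```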

What do you think? @NielsRogge @lvwerra @amyeroberts
