APT results #11

Open · wants to merge 8 commits into main

Conversation

@mkabra commented Jan 23, 2024

Hi,

We have benchmarked three multi-animal networks from APT on the DeepLabCut benchmark datasets. We tested our submission by running "python -m benchmark", and we get the following output, which we believe suggests that the submission is working:

benchmark  method                     version  RMSE  mAP
trimouse   DLCRNet_ms4 (30k)          2.3.8    NaN   NaN
           EfficientNet B7_s4 (30k)   2.3.8    NaN   NaN
           ResNet50 (30k)             2.3.8    NaN   NaN
parenting  DLCRNet_ms4 (30k)          2.3.8    NaN   NaN
           EfficientNet B7 (30k)      2.3.8    NaN   NaN
           EfficientNet B7_s4 (30k)   2.3.8    NaN   NaN
marmosets  DLCRNet (200k)             2.3.8    NaN   NaN
           DLCRNet_ms4 (200k)         2.3.8    NaN   NaN
           EfficientNet B7 (200k)     2.3.8    NaN   NaN
           EfficientNet B7_s4 (200k)  2.3.8    NaN   NaN
fish       DLCRNet_ms4 (30k)          2.3.8    NaN   NaN
           EfficientNet B7_s4 (30k)   2.3.8    NaN   NaN
           ResNet50_s4 (30k)          2.3.8    NaN   NaN
trimouse   GRONe                      2.3.8    NaN   NaN
           MMPose-CiD                 2.3.8    NaN   NaN
           DeTR+GRONe                 2.3.8    NaN   NaN
marmosets  GRONe                      2.3.8    NaN   NaN
           MMPose-CiD                 2.3.8    NaN   NaN
           DeTR+GRONe                 2.3.8    NaN   NaN
parenting  GRONe                      2.3.8    NaN   NaN
           MMPose-CiD                 2.3.8    NaN   NaN
           DeTR+GRONe                 2.3.8    NaN   NaN
fish       GRONe                      2.3.8    NaN   NaN
           MMPose-CiD                 2.3.8    NaN   NaN
           DeTR+GRONe                 2.3.8    NaN   NaN

The documentation for creating the submission is slightly out of date, though. We had to make the following changes to get the test to work:

  • Add __init__.py to benchmark/submissions
  • Run python -m benchmark from the DEEPLABCUT conda environment. This is probably obvious, but it could help other users if it were mentioned explicitly.
  • Change the imports in the .py file to
from deeplabcut import benchmark
from deeplabcut.benchmark.benchmarks import TriMouseBenchmark, MarmosetBenchmark, ParentingMouseBenchmark, FishBenchmark

instead of

import benchmark
from benchmark.benchmarks import TriMouseBenchmark
  • In the class definition of the submissions, we did not need to inherit from DLCBenchMixin (see the sketch below)
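
For illustration, here is a rough sketch of a submission file with these changes applied. The class name is made up, and the @benchmark.register decorator and the get_predictions method name are our best guesses at the expected interface, so please double-check them against the benchmark base classes:

    # Rough sketch of a submission file after the changes above.
    # NOTE: the registration decorator and the method name are assumptions
    # about the expected interface, not taken from the benchmark docs.
    from deeplabcut import benchmark
    from deeplabcut.benchmark.benchmarks import TriMouseBenchmark


    @benchmark.register
    class ExampleTriMouseSubmission(TriMouseBenchmark):  # no DLCBenchMixin needed
        """Illustrative submission that returns pre-computed predictions."""

        def get_predictions(self):
            # Load pre-computed predictions (e.g. from a results file stored
            # in benchmark/submissions/) and return them in the format the
            # benchmark expects.
            ...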

Hope everything else is as expected.
Best,
Mayank

@stes (Collaborator) commented Jan 23, 2024

Hi @mkabra, thanks for flagging!

I will look into the proposed updates for the docs and confirm here once I've checked whether your submission is working.

@stes (Collaborator) commented Feb 15, 2024

Hi @mkabra, thanks for adding the most recent comments. I'm looking into this over the next few days and will try to get back to you by early next week.

Thanks again for the contribution!

@MMathisLab (Member)

bump @stes

@n-poulsen (Collaborator)

@mkabra sorry for the really slow response time - there are still a few issues in your code that I've fixed on my end, and I'll push those changes.

Meanwhile, do you have confidence scores available for your predictions? I've noticed our docs were incorrect and stated that the results should be given in the format

      return {
         "path/to/image.png" : (
            # animal 1
            {
               "snout" : (0, 1),
               "leftear" : (2, 3),
               ...
            },
            # animal 2
            {
               "snout" : (0, 1),
               "leftear" : (2, 3),
               ...
            },
         ),
         ...
      }

when they should be given in the format

      return {
         "path/to/image.png" : (
            # animal 1
            {
               "pose": {
                 "snout" : (12, 17),
                 "leftear" : (15, 13),
                 ...
               },
               "score": 0.9172,
            },
            ...
         ),
         ...
      }

The model confidence is very important for computing the evaluation metrics. I've been able to evaluate your model with random scores, but to get the true performance I would also need the score for each individual.
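
To make that structure concrete, here is a small sketch of how a submission could assemble its return value; the helper name and the layout of raw_detections are made up for illustration and are not part of the benchmark code:

    # Illustrative helper (not benchmark code): converts hypothetical raw
    # detections into the expected {"pose": ..., "score": ...} format.
    def build_results(raw_detections):
        """raw_detections is assumed to map image paths to a list of animals,
        each with keypoint coordinates and a detection confidence."""
        results = {}
        for image_path, animals in raw_detections.items():
            results[image_path] = tuple(
                {
                    "pose": {
                        bodypart: (float(x), float(y))
                        for bodypart, (x, y) in animal["keypoints"].items()
                    },
                    # per-individual confidence, needed to rank detections
                    # when computing the evaluation metrics
                    "score": float(animal["confidence"]),
                }
                for animal in animals
            )
        return results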

@n-poulsen (Collaborator)

@mkabra I just pushed the updated code (in 413b8e5), which changes the JSON format to include "pose" and "score" keys for each individual, and the paths to the data files (which need to be relative to the root of the repository).

The score for each prediction is currently generated by giving the first prediction for each image the highest score, the second the second highest, and so on (updating the JSON files to include real per-individual scores will make this workaround unnecessary and produce the correct evaluation results).
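
For clarity, the rank-based placeholder scoring works roughly like the sketch below (my illustration, not the exact code that was pushed):

    # Illustration of rank-based placeholder scores: the first individual
    # listed for an image gets the highest score, the last the lowest.
    def add_rank_based_scores(predictions):
        """predictions maps image paths to a sequence of {"pose": {...}} dicts
        in prediction order; returns the same structure with "score" added."""
        scored = {}
        for image_path, individuals in predictions.items():
            n = len(individuals)
            scored[image_path] = tuple(
                {**individual, "score": (n - rank) / n}  # 1.0, ..., 1/n
                for rank, individual in enumerate(individuals)
            )
        return scored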
