ML Pipeline v2 #684

mihow · 2025-01-21T05:14:22Z

Summary

Bringing our ML pipeline out of beta.

List of Major Changes

New Processing Services section. Allows users to add an endpoint themselves and serve machine learning pipelines from their own compute infrastructure or local machine. When a Service is added, the pipelines it has available are registered as options that the project's users can choose for processing images.
Begin storing the label lists / category maps that the algorithms use. This model is called AlgorithmCategoryMap and maps each class index from the model's last layer to the category it represents. Each category has both a simple text label ("Species name") and a metadata object with an optional GBIF key, taxon rank, etc. These are used to create taxon entries in the Antenna database and to show the top 3, 5, 10, or N predictions from a model, rather than just the top 1, which is saved as the occurrence's determination.
Move job logs to their own field on the job model to reduce DB writes and to avoid overwriting the status field when writing logs.
Job status and progress fields have been fixed so that the status & progress (# images done) no longer toggle between states or end on a strange value.
Update what is considered a job failure: a job now fails if more than 50% of its images fail, or if any result saving fails.
Add "task_type" field to algorithms so we can determine if it's a classification model vs. a detection model and so on.
Handle results from multiple algorithms. The best score from all (non-intermediate) algorithms is used to determine the species determination of an occurrence (not just the latest prediction). Identifiers can now see and Agree with the result from any classifier, including an intermediate classifier like Moth/Non-Moth.
Faster & more stable saving of results. Results from the ML backend are now saved more efficiently after each batch of images and error handling is greatly improved.
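
A minimal sketch of how the category map and the multi-algorithm determination described above could fit together. It is plain Python rather than the actual Django models, and `Category`, `Prediction`, `top_n_predictions`, and `best_determination` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One entry of a category map: a text label plus optional metadata."""
    label: str
    metadata: dict = field(default_factory=dict)  # e.g. GBIF key, taxon rank


@dataclass
class Prediction:
    """A single classifier result (hypothetical shape, not the real model)."""
    algorithm: str
    label: str
    score: float
    intermediate: bool = False  # e.g. a Moth/Non-Moth binary classifier


def top_n_predictions(scores, categories, n=3):
    """Return the n highest-scoring (label, score) pairs.

    The position of a score in `scores` is the class index, which is looked
    up in `categories` (the category map).
    """
    if len(scores) != len(categories):
        raise ValueError("scores and categories must align index-for-index")
    ranked = sorted(zip(categories, scores), key=lambda pair: pair[1], reverse=True)
    return [(category.label, score) for category, score in ranked[:n]]


def best_determination(predictions):
    """Pick the best-scoring prediction among non-intermediate algorithms."""
    terminal = [p for p in predictions if not p.intermediate]
    return max(terminal, key=lambda p: p.score, default=None)
```

In this sketch the length check guards against scores and labels drifting out of alignment, and `best_determination` returns `None` when only intermediate classifiers have produced results (an assumption about how that case might be handled).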

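The job failure rule in the list above can be read as a small predicate. This is a sketch under assumptions, not the code in this PR: the function name, its parameters, and the handling of a job with zero images are all hypothetical.

```python
def job_failed(total_images, failed_images, save_errors=0, threshold=0.5):
    """Return True if a job should be marked as failed.

    A job fails if any result saving failed, or if MORE than `threshold`
    (50% by default) of its images failed to process.
    """
    if save_errors > 0:
        return True
    if total_images == 0:
        # Assumption: an empty job is not treated as a failure.
        return False
    return failed_images / total_images > threshold
```

With this reading, a job where exactly half of the images fail is not a failure, since the rule says "more than 50%".
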
Related Issues

Closes #607
Closes #310

Unblocks several upcoming tickets, including:

Using the Moth/non-moth intermediate model as a solo / terminal classifier (Make a pipeline for only the binary classifier #621)
Allow reprocessing same detections with new classifier (without running detector again)
Can apply post-processing filters of species by region (filter and re-weight the scores)
Can do genus & higher taxon roll-ups
Can use confidence score algorithms other than softmax on the Antenna side (e.g. temperature calibrated)
Can better send current results back to backend to determine reprocessing needs
Ready for detection tracking across frames!
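
As an illustration of the "confidence score algorithms other than softmax" item, here is a minimal sketch of temperature-scaled softmax in plain Python. It assumes the ML backend returns raw logits, and the function name is hypothetical. In practice the temperature would be fitted on a validation set, not hard-coded.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature.

    temperature == 1.0 is ordinary softmax; values above 1.0 flatten the
    distribution, which lowers the confidence of the top prediction.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Because dividing by a positive temperature does not reorder the logits, the top-ranked class stays the same and only the reported confidence changes.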

Screenshots

Deployment Notes

Needs to be deployed with corresponding ML backend changes here:
RolnickLab/ami-data-companion#67

Checklist

I have tested these changes appropriately.
I have added and/or modified relevant tests.
I updated relevant documentation or comments.
I have verified that this PR follows the project's coding standards.
Any dependent changes have already been merged to main.

netlify · 2025-01-21T05:15:34Z

✅ Deploy Preview for ami-dev canceled.

Name	Link
🔨 Latest commit	`59b4867`
🔍 Latest deploy log	https://app.netlify.com/sites/ami-dev/deploys/67964e1675c7480008670562

sentry-io · 2025-01-26T19:03:03Z

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

‼️ ConnectionError: Error while reading from redis:6379 : (104, 'Connection reset by peer') ami.jobs.tasks.run_job View Issue
‼️ OperationalError: the connection is closed ami.jobs.tasks.run_job View Issue
‼️ ConnectionError: Error while reading from redis:6379 : (104, 'Connection reset by peer') ami.jobs.tasks.run_job View Issue
‼️ OperationalError: the connection is closed ami.jobs.tasks.run_job View Issue

Did you find this useful? React with a 👍 or 👎

vanessavmac and others added 30 commits November 6, 2024 18:11

Create backend model

caedaaa

Create backend status endpoint

49bea0b

Return server status and available pipelines

a7116d5

Use pipeline slug

27d4792

Fix .gitignore

a8bb218

Update backend status endpoint, test pipeline process images

a1dfcc0

fix: missing import in ml models

a66ab33

Add Backend to admin, update pipeline/backend model, register_pipelines action

c0ca4a3

Fix type checking

d5f43f5

Add backend id to test pipeline processing

74825fe

Constant and Random pipeline processing

fa6579a

Add test fixture

8ad8b57

Don't use same project id for all tests

18f9b9e

Added Backend created_at and updated_at serializer fields

d2a9acd

Update models and display backends last checked

6fe4267

Resolve merge conflicts

81c415d

Merge branches 'feat/ml-pipeline-registry' and 'main' of https://github.com/RolnickLab/ami-platform into feat/ml-pipeline-registry

149f4cf

Remove unused variables

a70a325

Remove unused file

81d92fe

Register pipelines via frontend

20ca912

Add missing fields to backend, fix migration error after merging with main

6909f33

Add backend details dialog

53d08f0

Display backend details

6fbdc3e

Fix backend details displayed values

780c6df

Select first backend associated with pipeline

08eb418

Fix linting errors

565e107

Remove backend_id

8f44493

Remove version/version name, fix adding project, make endpoint required

b54e91f

Merge branch 'main' of https://github.com/RolnickLab/ami-platform into feat/ml-pipeline-registry

3a3cb74

Use ErrorState component

f09c04a

mihow added 5 commits January 20, 2025 22:18

fix: selection of existing pipelines & algorithms

3718088

fix: undefined variable

c4b6835

feat: increase batch size per request

1418787

chore: use old name for processing_service for now

94510b1

Merge branch 'main' of github.com:RolnickLab/antenna into feat/ml-pipeline-v2

a5193e8

mihow added backend infrastructure labels Jan 22, 2025

mihow added 4 commits January 24, 2025 19:54

fix: update tests

71e4e46

fix: use old endpoint name for now

2a6ac07

Merge branch 'main' of github.com:RolnickLab/antenna into feat/ml-pipeline-v2

bd24ee7

fix: conflicting migrations

aca567f

mihow marked this pull request as ready for review January 25, 2025 05:01

vanessavmac and others added 6 commits January 25, 2025 00:07

Fix processing service error handling

75911d9

Merge branch 'feat/ml-pipeline-registry' of github.com:RolnickLab/antenna into feat/ml-pipeline-v2

738979f

feat: show algorithm key/slug in UI

674ef91

fix: don't rename existing algorithms if not necessary

54d2df5

feat: update existing matching algorithms

9e76b4d

fix: ensure scores and labels align

a345436

This was referenced Jan 26, 2025

Enable users to register Processing Services & Pipelines #632

Merged

ML Pipeline v2 RolnickLab/ami-data-companion#67

Merged

Merge branch 'main' of github.com:RolnickLab/antenna into feat/ml-pipeline-v2

a72aa0c

mihow mentioned this pull request Jan 26, 2025

Refactor and enhancements to saving of ML pipeline results #635

Closed

17 tasks

fix: missing env var during build time

59b4867

mihow merged commit b5d6885 into main Jan 26, 2025
6 checks passed

mihow deleted the feat/ml-pipeline-v2 branch January 26, 2025 15:32

This was referenced Jan 27, 2025

hotfix: snafu from renaming ml_backends to processing_services #697

Merged

Make a pipeline for only the binary classifier #621

Closed

Support for reprocessing detections and skipping detector #706

Open
