
maestro Florence-2 fine-tuning #33

Merged · 76 commits · Sep 11, 2024

Conversation

@SkalskiP (Collaborator) commented Sep 4, 2024

  • README.md update
  • maestro CLI with train and evaluate commands
  • Florence-2 fine-tuning
  • MeanAveragePrecisionMetric
  • saving best and latest checkpoints
  • tracking and saving metrics
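The "best and latest checkpoints" bullet above can be sketched as a small bookkeeping helper. The class name `CheckpointTracker` and its API are hypothetical illustrations, not maestro's actual implementation:

```python
# Hypothetical sketch: track the most recent checkpoint and the best one by a
# validation metric (e.g. mAP, where higher is better). Illustrative only.

class CheckpointTracker:
    """Keep the latest checkpoint path and the best one seen so far."""

    def __init__(self, higher_is_better: bool = True):
        self.higher_is_better = higher_is_better
        self.best_metric = float("-inf") if higher_is_better else float("inf")
        self.latest = None
        self.best = None

    def update(self, checkpoint_path: str, metric: float) -> bool:
        """Record a new checkpoint; return True if it became the new best."""
        self.latest = checkpoint_path
        improved = (metric > self.best_metric) if self.higher_is_better \
            else (metric < self.best_metric)
        if improved:
            self.best_metric = metric
            self.best = checkpoint_path
        return improved
```

In a training loop this would be called once per epoch after evaluation, saving the model weights under `latest` unconditionally and copying them to `best` only when `update` returns `True`.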

@SangbumChoi left a comment

Overall it looks well prepared. Since this is an on-going project, I left some general questions here; I'm still learning the overall pipeline and code style that the Roboflow team has established. (Luckily, it seems very similar to the transformers pipeline.)

  1. Is training.py in florence_2 missing or still in progress? (paligemma has a training.py.)
  2. Definitely consider multi-GPU setups when thinking about real user scenarios.

This is my on-going Zero-shot Object detection pipeline in HuggingFace.
huggingface/transformers#32483

num_workers=config.num_workers,
test_loaders_workers=config.val_num_workers,
)
peft_model = prepare_peft_model(
@SangbumChoi commented Sep 11, 2024

Approaching this with PEFT is also a good way to start. FYI, I have tried three different techniques:

  1. Full fine-tuning
  2. Partial fine-tuning (freezing the encoder-like part)
  3. PEFT

It turns out 2 and 3 are robust across hyperparameter options, while I couldn't find any stable configuration for 1.

@SkalskiP (Collaborator, Author) commented:

We don't have 1/2 yet. I'm just wondering how to solve 2. In theory, users might want to freeze larger or smaller parts of the graph. Do you think such flexibility would be useful, or can we just offer a pre-defined freeze?
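The flexible-freeze idea above could be a thin layer over a pre-defined default. A minimal sketch, assuming a model exposing `named_parameters()` (as `torch.nn.Module` does); `freeze_by_prefix` and the `"vision_tower"` default are illustrative names, not maestro's API:

```python
# Sketch: freeze parameters whose names start with any of the given prefixes.
# A pre-defined default covers the common case (freezing the encoder-like
# part); power users can pass their own prefixes. Names are hypothetical.

DEFAULT_FROZEN_PREFIXES = ("vision_tower",)  # assumed encoder-like submodule

def freeze_by_prefix(model, prefixes=DEFAULT_FROZEN_PREFIXES):
    """Disable gradients for matching parameters; return how many were frozen."""
    frozen = 0
    for name, param in model.named_parameters():
        if name.startswith(tuple(prefixes)):
            param.requires_grad = False
            frozen += 1
    return frozen
```

Because it only depends on `named_parameters()`, the same helper works for any HF-style model, and the "pre-defined freeze" becomes just the default argument.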

@SangbumChoi commented Sep 12, 2024

I think supporting only PEFT would be enough for now (i.e., this is not the highest priority), since the project is just starting to grow. If there is retention or other user inquiries, we can add support at that point.

@SkalskiP (Collaborator, Author) commented:

I think so too! For the time being, we have to remember that such an option may arise at some point.

# Postprocess prediction for mean average precision calculation
prediction = processor.post_process_generation(generated_text, task="<OD>", image_size=image.size)
prediction = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, prediction, resolution_wh=image.size)
prediction = prediction[np.isin(prediction["class_name"], classes)]

@SangbumChoi commented:

I also agree that text-based output is one option for computing traditional OD metrics. However, some predictions are very close to the class names but np.isin will not catch them, e.g. prediction: apple, ground truth: apples.

I also considered calculating the distance between vectorized text embeddings, or other heuristic methods such as CIDEr, to make this more robust. It would be great to consider VLM metrics, e.g. CIDEr, BLEU, etc.
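A lightweight middle ground between exact `np.isin` matching and full embedding distances is fuzzy string matching with the stdlib. The helper below is a hypothetical sketch (name and cutoff are assumptions), which would catch the "apple vs. apples" case but not semantic matches like "car vs. automobile":

```python
from difflib import get_close_matches

# Sketch: map near-miss predicted class names onto the closest known class
# instead of dropping them with an exact membership check. Embedding-based
# distances or CIDEr, as suggested above, would be more robust; this only
# handles small spelling variants such as plurals.

def match_class_names(predicted, classes, cutoff=0.8):
    """Return each predicted name mapped to its closest class, or None."""
    lowered = [c.lower() for c in classes]
    mapped = []
    for name in predicted:
        hits = get_close_matches(name.lower(), lowered, n=1, cutoff=cutoff)
        mapped.append(classes[lowered.index(hits[0])] if hits else None)
    return mapped
```

Predictions mapped to `None` could then be filtered out exactly as the current `np.isin` line does, while near misses are recovered before the mAP calculation.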

@SkalskiP (Collaborator, Author) commented:

> I also agree that text-based output is one option for computing traditional OD metrics. However, some predictions are very close to the class names but np.isin will not catch them, e.g. prediction: apple, ground truth: apples.

Good catch! I've experienced that myself. I don't have the time to address it right now, but I'll add a task for it. Maybe one of the external contributors would like to implement this feature.

@SkalskiP (Collaborator, Author) commented:

> I also considered calculating the distance between vectorized text embeddings, or other heuristic methods such as CIDEr, to make this more robust. It would be great to consider VLM metrics, e.g. CIDEr, BLEU, etc.

Do you have any resources (papers) where I could read about alternative metrics?


with torch.amp.autocast(device.type, torch.float16):
lora_layers = filter(lambda p: p.requires_grad, peft_model.parameters())
optimizer = optim.SGD(lora_layers, lr=learning_rate)
scheduler = optim.lr_scheduler.CosineAnnealingLR(

@SangbumChoi commented:

Any reason for using CosineAnnealingLR?
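For reference, `torch.optim.lr_scheduler.CosineAnnealingLR` implements the cosine annealing schedule from the SGDR paper (Loshchilov & Hutter): the learning rate decays smoothly from the base LR to `eta_min` over `T_max` steps along a half cosine, avoiding the abrupt drops of step schedules. The closed form can be sketched in pure Python:

```python
import math

# The closed form behind torch.optim.lr_scheduler.CosineAnnealingLR:
#   eta_t = eta_min + (base_lr - eta_min) * (1 + cos(pi * t / T_max)) / 2
# Pure-Python sketch for intuition, not a replacement for the scheduler.

def cosine_annealing_lr(step, t_max, base_lr, eta_min=0.0):
    """Learning rate at `step` under cosine annealing over `t_max` steps."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max)) / 2
```

At step 0 this returns the base LR, at `t_max` it reaches `eta_min`, and halfway through it sits exactly at the midpoint; whether that is preferable here to, say, a linear or constant schedule is an empirical question.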

@SkalskiP (Collaborator, Author) commented:

> Overall it looks well prepared. Since this is an on-going project, I left some general questions here; I'm still learning the overall pipeline and code style that the Roboflow team has established. (Luckily, it seems very similar to the transformers pipeline.)
>
>   1. Is training.py in florence_2 missing or still in progress? (paligemma has a training.py.)
>   2. Definitely consider multi-GPU setups when thinking about real user scenarios.
>
> This is my on-going Zero-shot Object detection pipeline in HuggingFace. huggingface/transformers#32483

Hi @SangbumChoi 👋🏻 First of all, thank you so much for taking the time to look at the code.

  1. Originally, we planned to deliver recipes for two foundational models - Florence-2 and PaliGemma. However, during the process we realized that PaliGemma is harder to fine-tune. The Florence-2 codebase is definitely more mature, so if you see any differences between Florence-2 and PaliGemma, you can be almost certain that we'll ultimately do it the Florence-2 way.

  2. Do you have any experience setting up training in transformers on multiple GPUs?
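For context on the multi-GPU question: the usual way to scale a transformers-style training script is to launch it under `torchrun` or Hugging Face `accelerate`, each of which spawns one process per GPU. The commands below are illustrative; `train.py` is a placeholder, not a script in this repo:

```shell
# torchrun ships with PyTorch and spawns one process per GPU on this node:
torchrun --nproc_per_node=4 train.py

# Hugging Face accelerate does the same (after running `accelerate config`):
accelerate launch --num_processes 4 train.py
```

The training code itself then needs to wrap the model for distributed data parallelism and shard the dataloader across ranks, which `accelerate` can handle with minimal code changes.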

@SkalskiP SkalskiP marked this pull request as ready for review September 11, 2024 20:32
@SkalskiP SkalskiP changed the title WIP: foundations of training maestro Florence-2 fine-tuning Sep 11, 2024
@SkalskiP SkalskiP merged commit ccd268c into develop Sep 11, 2024
1 check passed
@SangbumChoi commented Sep 12, 2024

@SkalskiP

  1. I think it is also good to have a similar codebase for PaliGemma and Florence-2 (it might not be possible, but let me brainstorm it).
  2. Yes I have, and I always use multiple GPUs for traditional OD tasks as well (never a single GPU).

Since this PR is merged, let me run the repo and continue the discussion in Slack!
