Merge pull request #32 from roboflow/feature/foundations_of_training_readme

updated project `README` to showcase new project profile
SkalskiP authored Sep 4, 2024
2 parents aa0708c + 47aef06 commit 50b4876
Showing 1 changed file: README.md (13 additions, 120 deletions).

<div align="center">

<h1>maestro</h1>

<br>

[![version](https://badge.fury.io/py/maestro.svg)](https://badge.fury.io/py/maestro)
[![license](https://img.shields.io/pypi/l/maestro)](https://github.com/roboflow/multimodal-maestro/blob/main/LICENSE)
[![python-version](https://img.shields.io/pypi/pyversions/maestro)](https://badge.fury.io/py/maestro)
[![Gradio](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Roboflow/SoM)
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow/multimodal-maestro/blob/develop/cookbooks/multimodal_maestro_gpt_4_vision.ipynb)
<p>coming: when it's ready...</p>

</div>

## 👋 hello

Multimodal-Maestro gives you more control over large multimodal models to get the
outputs you want. With more effective prompting tactics, you can get multimodal models
to do tasks you didn't know (or think!) were possible. Curious how it works? Try our
[HF space](https://huggingface.co/spaces/Roboflow/SoM)!
**maestro** is a tool designed to streamline and accelerate the fine-tuning process for
multimodal models. It provides ready-to-use recipes for fine-tuning popular
vision-language models (VLMs) such as **Florence-2**, **PaliGemma**, and
**Phi-3.5 Vision** on downstream vision-language tasks.
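
For a sense of the models involved, here is a minimal sketch that loads **Florence-2** with Hugging Face `transformers`, following its public model card. It illustrates the kind of VLM maestro targets; it is not maestro's own recipe API, which is still being redesigned (see below).

```python
# Sketch only — not maestro's API. Loads a Florence-2 checkpoint as described
# on its Hugging Face model card; the model ships custom code, hence
# trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoProcessor

CHECKPOINT = "microsoft/Florence-2-base-ft"

model = AutoModelForCausalLM.from_pretrained(CHECKPOINT, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(CHECKPOINT, trust_remote_code=True)

# Florence-2 frames downstream tasks as text prompts, e.g. "<OD>" for
# object detection; fine-tuning teaches it new prompt-to-output mappings.
```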

## 💻 install

⚠️ Our package has been renamed to `maestro`. Pip install the `maestro` package in a
[**Python>=3.8**](https://www.python.org/) environment.

```bash
pip install maestro
```
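
To verify the install, you can query the package metadata with the standard library (nothing maestro-specific is assumed here):

```python
from importlib.metadata import version

# prints the installed maestro version string
print(version("maestro"))
```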

## 🔌 API

🚧 The project is still under construction. The redesigned API is coming soon.

![maestro-docs-Snap](https://github.com/roboflow/multimodal-maestro/assets/26109316/a787b7c0-527e-465a-9ca9-d46f4d63ea53)

## 🧑‍🍳 prompting cookbooks

| Description | Colab |
|:----------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| Prompt LMMs with Multimodal Maestro | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow/multimodal-maestro/blob/develop/cookbooks/multimodal_maestro_gpt_4_vision.ipynb) |
| Manually annotate ONE image and let GPT-4V annotate ALL of them | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow/multimodal-maestro/blob/develop/cookbooks/grounding_dino_and_gpt4_vision.ipynb) |


## 🚀 example

```
Find dog.
>>> The dog is prominently featured in the center of the image with the label [9].
```

<details close>
<summary>👉 read more</summary>

<br>

- **load image**

```python
import cv2

# load the source image with OpenCV (path elided in the original)
image = cv2.imread("...")
```

- **create and refine marks**

```python
import maestro

# segment the image with SAM; every resulting mask becomes a numbered mark
generator = maestro.SegmentAnythingMarkGenerator(device='cuda')
marks = generator.generate(image=image)
# clean up the raw masks (e.g. drop small or redundant segments)
marks = maestro.refine_marks(marks=marks)
```

- **visualize marks**

```python
# draw the numbered marks on top of the original image
mark_visualizer = maestro.MarkVisualizer()
marked_image = mark_visualizer.visualize(image=image, marks=marks)
```
![image-vs-marked-image](https://github.com/roboflow/multimodal-maestro/assets/26109316/92951ed2-65c0-475a-9279-6fd344757092)

- **prompt**

```python
prompt = "Find dog."

# api_key is an OpenAI API key; the marked image is sent to GPT-4 Vision
response = maestro.prompt_image(api_key=api_key, image=marked_image, prompt=prompt)
```

```
>>> "The dog is prominently featured in the center of the image with the label [9]."
```

- **extract related marks**

```python
# pull out the masks whose mark labels are mentioned in the response
masks = maestro.extract_relevant_masks(text=response, detections=marks)
```

```
>>> {'9': array([
... [False, False, False, ..., False, False, False],
... [False, False, False, ..., False, False, False],
... [False, False, False, ..., False, False, False],
... ...,
... [ True, True, True, ..., False, False, False],
... [ True, True, True, ..., False, False, False],
... [ True, True, True, ..., False, False, False]])
... }
```
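
Each returned mask is a boolean array aligned with the input image, so it can be consumed directly with NumPy. A small follow-up sketch, assuming `image` and `masks` from the steps above:

```python
import numpy as np

for label, mask in masks.items():
    area = int(mask.sum())                        # pixels covered by this mark
    cutout = np.where(mask[..., None], image, 0)  # keep only the masked region
    print(f"mark {label} covers {area} px")
```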

</details>

![multimodal-maestro](https://github.com/roboflow/multimodal-maestro/assets/26109316/c04f2b18-2a1d-4535-9582-e5d3ec0a926e)
Documentation and Florence-2 fine-tuning examples for object detection and VQA coming
soon.
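
Until those examples land, this rough sketch shows how Florence-2 handles an object-detection prompt with `transformers`. It assumes `model` and `processor` loaded as in the earlier sketch and a hypothetical local image `dog.jpeg`; it is not the forthcoming maestro example.

```python
from PIL import Image

image = Image.open("dog.jpeg")  # hypothetical local image

inputs = processor(text="<OD>", images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=256,
)
raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

# parse the raw generation into boxes and labels for the "<OD>" task
detections = processor.post_process_generation(raw, task="<OD>", image_size=image.size)
print(detections)
```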

## 🚧 roadmap

- [ ] Rewrite the `maestro` API.
- [ ] Update the [HF space](https://huggingface.co/spaces/Roboflow/SoM).
- [ ] Add a documentation page.
- [ ] Add a GroundingDINO prompting strategy.
- [ ] Add a CogVLM demo.
- [ ] Add a Qwen-VL demo.
- [ ] Release a CLI for predefined fine-tuning recipes.
- [ ] Add multi-GPU fine-tuning support.
- [ ] Allow multi-dataset fine-tuning and support multiple tasks at the same time.

## 💜 acknowledgement

- [Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V](https://arxiv.org/abs/2310.11441) by Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao.
- [The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)](https://arxiv.org/abs/2309.17421) by Zhengyuan Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Chung-Ching Lin, Zicheng Liu, Lijuan Wang.

## 🦸 contribution

We would love your help in making this repository even better! If you noticed any bug,
or if you have any suggestions for improvement, feel free to open an
[issue](https://github.com/roboflow/multimodal-maestro/issues) or submit a
[pull request](https://github.com/roboflow/multimodal-maestro/pulls).
