Which prompt should I use to generate the masks? #9
Comments
Thanks for your question! The "Detect XXX" prompt is mainly trained on single-object detection scenarios. One way to include multiple similar objects is to use "Can you describe the image and detect objects?". However, this prompt is meant for whole-image description and is weak at referring to specific given objects. If you want to achieve that, I think further fine-tuning on that kind of data is required. We are sorry that the model cannot do this yet.
Thanks for the reply. Is there a prompt that will tell the model that I need a mask?
If there is a box, there will be a corresponding mask. However, I filter out some low-quality masks using a predicted IoU threshold (iou_thres). You can modify the threshold; see NExT-Chat/mllm/demo/demo_util.py, Line 267 in ea67b83.
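For readers who want to see what this kind of filtering looks like, here is a minimal sketch, not the actual code in demo_util.py. It assumes each prediction carries a box, a mask, and a predicted IoU score; the names `Prediction`, `pred_iou`, and `filter_low_quality_masks` are hypothetical and chosen only for illustration. The box is always kept, and the mask is dropped when its predicted IoU falls below `iou_thres`.

```python
from dataclasses import dataclass, replace
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class Prediction:
    """Hypothetical container for one detection (not a NExT-Chat type)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    mask: Optional[Any]                     # mask data, or None if filtered out
    pred_iou: float                         # model's own estimate of mask quality


def filter_low_quality_masks(preds: List[Prediction], iou_thres: float) -> List[Prediction]:
    """Keep every box, but drop masks whose predicted IoU is below iou_thres."""
    filtered = []
    for p in preds:
        if p.pred_iou >= iou_thres:
            filtered.append(p)
        else:
            # Low-confidence mask: keep the box, discard the mask.
            filtered.append(replace(p, mask=None))
    return filtered
```

Under these assumptions, lowering `iou_thres` lets more masks through at the cost of quality, which matches the suggestion above to adjust the threshold if masks are missing.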
Is there some kind of template to explain to the model that a mask is needed?
Also, the model almost never gives multiple bboxes for the same class when asked something like: "Detect XXX. Please include several object locations". Most often it returns a single bbox that merges all matching objects, but I would like them to be separate.