This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

predicting oriented bounding box with either (x1,y1,x2,y2,x3,y3,x4,y4) or (cx,cy,w,h,angle) #241

Closed
arasharchor opened this issue Mar 7, 2018 · 4 comments

Comments


arasharchor commented Mar 7, 2018

Expected results

What did you expect to see? Rotated bounding boxes

Actual results

What did you observe instead? No bounding boxes were detected.

I have provided the network with ground-truth boxes in either (x1,y1,x2,y2,x3,y3,x4,y4) or (cx,cy,w,h,angle) format, but the network produces no bounding boxes and raises no errors. I have searched extensively for similar issues but could not find any helpful information. How can I predict oriented bounding boxes with R-FCN and RPN?

System information

  • Operating system: Ubuntu 16.04
  • CUDA version: 8.0
  • cuDNN version: 6.0
  • GPU models (for all devices if they are not all the same): Titan X Pascal
  • python --version output: 2.7

[Screenshot from 2018-03-06 23-06-17]


vsd550 commented Jun 11, 2018

Please explain how to modify the bounding-box regression part so that it regresses oriented bounding boxes.
I am having the same problem.
Thanks.

Contributor

gadcam commented Jun 11, 2018

@smajida, @vsd550 as far as I know, this is not currently supported. The internal box format is specified here:

"""Box manipulation functions. The internal Detectron box format is
[x1, y1, x2, y2] where (x1, y1) specify the top-left box corner and (x2, y2)
specify the bottom-right box corner. Boxes from external sources, e.g.,
datasets, may be in other formats (such as [x, y, w, h]) and require conversion.

This module uses a convention that may seem strange at first: the width of a box
is computed as x2 - x1 + 1 (likewise for height). The "+ 1" dates back to old
object detection days when the coordinates were integer pixel indices, rather
than floating point coordinates in a subpixel coordinate frame. A box with x2 =
x1 and y2 = y1 was taken to include a single pixel, having a width of 1, and
hence requiring the "+ 1". Now, most datasets will likely provide boxes with
floating point coordinates and the width should be more reasonably computed as
x2 - x1.

In practice, as long as a model is trained and tested with a consistent
convention either decision seems to be ok (at least in our experience on COCO).
Since we have a long history of training models with the "+ 1" convention, we
are reluctant to change it even if our modern tastes prefer not to use it.
"""
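
To make the "+ 1" convention described in that docstring concrete, here is a minimal pure-Python sketch of converting between [x, y, w, h] and [x1, y1, x2, y2]. The function names are chosen for illustration and this is not a copy of Detectron's own code; it only assumes the convention stated above, where width is x2 - x1 + 1.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, w, h] to [x1, y1, x2, y2] under the "+ 1" convention."""
    x, y, w, h = box
    # A box of width w starting at x covers pixels x .. x + w - 1,
    # so the bottom-right corner is offset by (w - 1, h - 1).
    return [x, y, x + w - 1, y + h - 1]


def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to [x, y, w, h] under the "+ 1" convention."""
    x1, y1, x2, y2 = box
    # Width and height include both end pixels, hence the "+ 1".
    return [x1, y1, x2 - x1 + 1, y2 - y1 + 1]
```

With this convention a box whose two corners coincide, such as [3, 4, 3, 4], has width and height 1 rather than 0.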

It looks like https://arxiv.org/ftp/arxiv/papers/1706/1706.09579.pdf or https://arxiv.org/pdf/1712.02294.pdf could help with the RPN part.

EDIT: see also rbgirshick/py-faster-rcnn#432
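
For reference, the two annotation formats from the question are related by a simple rotation, and either one can be collapsed into the axis-aligned [x1, y1, x2, y2] box that the internal format can represent. The sketch below is only an illustration under stated assumptions: the angle is taken in degrees, the rotation follows the standard mathematical convention about the box center (which appears clockwise in image coordinates, where y points down), and w and h are treated as continuous extents without the "+ 1". The function names are hypothetical. Collapsing to the enclosing box discards the orientation, so this does not give oriented predictions; it only shows how such annotations map onto the supported format.

```python
import math


def rotated_box_to_corners(cx, cy, w, h, angle_deg):
    """Return the four (x, y) corners of a box rotated by angle_deg about its center."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    corners = []
    # Corner offsets of the unrotated box, relative to its center.
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        # Rotate the offset, then translate back to the center.
        corners.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return corners


def corners_to_enclosing_xyxy(corners):
    """Return the axis-aligned [x1, y1, x2, y2] box that encloses the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return [min(xs), min(ys), max(xs), max(ys)]
```

For example, `corners_to_enclosing_xyxy(rotated_box_to_corners(10, 10, 4, 2, 0))` recovers the unrotated box, while a 90 degree rotation swaps the enclosing width and height.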

Contributor

ir413 commented Jun 11, 2018

As @gadcam noted, this is not supported at this time. Sorry for the inconvenience.

ir413 closed this as completed Jun 11, 2018
@HappyKerry

@vsd550 @smajida I have the same problem. How can Faster R-CNN be modified to do this?
