Is there a way to create new landmark point definitions? #55

Closed
jjzazuet opened this issue Mar 21, 2022 · 6 comments

Comments

@jjzazuet

Hi. I'm trying to implement webcam-based facial motion capture with this library.

The current set of face and retina landmarks is quite good, but I'm wondering if it's possible to extend the landmark regions that the PICO process can identify.

Specifically:

Option 1: adding new face landmarks

I'm assuming that new facial landmark locations would require defining new cascade files. Did the paper authors mention any recommendations on collecting and tagging example training data, or on the training process itself?

[Image: facial landmarks]

Option 2: adding simple paint-based landmarks

Perhaps a simpler option would be to define cascades for recognizing simple shapes, such as dot or cross marks placed on an actor's face where the missing facial landmark points are needed.

[Image: paint marks]

Let me know if I missed anything.

Thanks!

Resources

AutoDesk SoftImage - The Motion Capture Process

@esimov
Owner

esimov commented Mar 21, 2022

To define new landmark points, they need to be trained using some kind of neural network. I obtained the cascade files directly from one of the authors of the paper. This project covers only the computer vision part, not the convolutional neural network part. The papers cited on the Pigo README page also make no reference to the process used for training the cascade files, and I'm not sure whether this information is available. I contacted Nenad Markus (one of the authors) a few years back and discussed a few aspects of the project with him, but we never talked about the neural network training. I might eventually contact him again to ask for help with the neural network part, because I admit it would be great to extend the facial landmark zones with more points of interest.

However, my future interest is to develop this project into a computer vision library that supports feature and object detection, as mentioned on the README page. This is quite a big task, but I consider it doable. For reference, there is a separate ticket covering this requirement: #38.

@jjzazuet
Author

@esimov got it, makes sense. Should I close this ticket and continue the conversation in #38 then? Thanks!

@esimov
Owner

esimov commented Mar 22, 2022

You should close it. I might reopen it in case there is more progress regarding new feature points.

@jjzazuet
Author

Got it. Thanks!

@jjzazuet
Author

Hi @esimov. I kept pondering this issue over the last few days, and I'd like to run an idea by you.

It looks like the OpenCV project already has some level of tooling available in order to generate training data used to create new Haar cascades:

https://docs.opencv.org/3.4/dc/d88/tutorial_traincascade.html
https://amin-ahmadi.com/cascade-trainer-gui/

Here are some examples of the training process outputs:

https://github.com/opencv/opencv/tree/master/data/haarcascades

So the question is: is there a way to convert an OpenCV XML cascade file into the in-memory format used by Pigo? In other words, is it possible to extend the input cascade file format to read from OpenCV XML files instead of the legacy binary format defined by Nenad Markus?

If this is possible, I think it would be useful to convert XML cascades into the face cascade format since it can be applied to the whole input image.

I realized this when I started creating a Java port for Pigo:

https://github.com/vaccovecrana/kimaris

Let me know what you think.

Thanks!

@esimov
Owner

esimov commented Apr 2, 2022

@jjzazuet I'm not convinced this approach would work, for various reasons. First, the XML structure is quite different, but that's not the main bottleneck. The main culprit, from my perspective, is that the algorithm itself has been adapted to the binary tree cascade structure. I found similarities between the OpenCV XML cascade tree structure and the in-memory, binary one used by Pigo, but I also found differences. Overall they resemble each other in many respects (the leaf nodes are present, and there is also a threshold and a weak classifier counter), but the tree depth and the tree codes are missing from the XML cascade files. These are key parts of the algorithm, so adapting it to the XML-based cascade files would mean rewriting the whole code.
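To illustrate the fields named above, the following Go sketch parses a binary cascade under an assumed layout: an 8-byte header, a little-endian tree depth and tree count, then for each tree the node codes (4 signed bytes per internal node, encoding two pixel offsets), one float32 prediction per leaf, and one float32 threshold. This layout is an assumption based on the pico binary format, and the `Cascade` type and `ParseCascade` function are hypothetical, not Pigo's API. The input is a synthetic buffer, not a real cascade file.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// Cascade holds the fields discussed above (assumed layout, hypothetical type).
type Cascade struct {
	TreeDepth  uint32
	NumTrees   uint32
	Codes      [][]int8    // per tree: 4 bytes per internal node
	Preds      [][]float32 // per tree: one prediction per leaf (2^depth)
	Thresholds []float32   // per tree: rejection threshold
}

// ParseCascade decodes data under the assumed layout described in the lead-in.
func ParseCascade(data []byte) (*Cascade, error) {
	const headerLen = 8 // assumption: the first 8 bytes are skipped
	if len(data) < headerLen+8 {
		return nil, errors.New("cascade data too short")
	}
	pos := headerLen
	c := &Cascade{}
	c.TreeDepth = binary.LittleEndian.Uint32(data[pos:])
	pos += 4
	c.NumTrees = binary.LittleEndian.Uint32(data[pos:])
	pos += 4
	if c.TreeDepth == 0 || c.TreeDepth > 16 {
		return nil, errors.New("implausible tree depth")
	}
	leaves := 1 << c.TreeDepth
	codeLen := 4 * (leaves - 1) // a full binary tree has 2^depth - 1 internal nodes
	perTree := codeLen + 4*leaves + 4
	if uint64(len(data)-pos) < uint64(c.NumTrees)*uint64(perTree) {
		return nil, errors.New("cascade data truncated")
	}
	for t := uint32(0); t < c.NumTrees; t++ {
		codes := make([]int8, codeLen)
		for i := range codes {
			codes[i] = int8(data[pos+i])
		}
		pos += codeLen
		preds := make([]float32, leaves)
		for i := range preds {
			preds[i] = math.Float32frombits(binary.LittleEndian.Uint32(data[pos:]))
			pos += 4
		}
		thr := math.Float32frombits(binary.LittleEndian.Uint32(data[pos:]))
		pos += 4
		c.Codes = append(c.Codes, codes)
		c.Preds = append(c.Preds, preds)
		c.Thresholds = append(c.Thresholds, thr)
	}
	return c, nil
}

// buildSample writes a synthetic one-tree cascade of depth 2.
func buildSample() []byte {
	var buf bytes.Buffer
	buf.Write(make([]byte, 8)) // placeholder header
	binary.Write(&buf, binary.LittleEndian, uint32(2))
	binary.Write(&buf, binary.LittleEndian, uint32(1))
	binary.Write(&buf, binary.LittleEndian, []int8{-3, 4, 5, -6, 1, 2, 3, 4, -1, -2, -3, -4})
	binary.Write(&buf, binary.LittleEndian, []float32{-1, 0.5, 0.25, 2})
	binary.Write(&buf, binary.LittleEndian, float32(-0.75))
	return buf.Bytes()
}

func main() {
	c, err := ParseCascade(buildSample())
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("depth=%d trees=%d code bytes=%d leaves=%d threshold=%v\n",
		c.TreeDepth, c.NumTrees, len(c.Codes[0]), len(c.Preds[0]), c.Thresholds[0])
}
```

Under this layout every tree has the same fixed depth and stores its pixel-comparison codes inline, which is the information reported as missing from the XML cascades.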

I will try to contact Nenad.

Btw: thumbs up for your Java port!

2 participants