-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why is this permutation necessary here (classification task) ? #205
Comments
@paolo-fraccaro any idea? |
I cannot figure out either, but also have a very vague memory that @CarlosGomes98 told me there was some factory that was needing data flipped. But cannot really remember the specifics. |
@paolo-fraccaro from your findings it seems that NHWC and NCHW are both present in TT correct? depending on the model/factory... should we set a project wide standard? we could as we control all the factories....if yes, what would happen if we don't do it? I've read somewhere that NCHW is the standard and better (performance wise) for nVidia In case we don't set the standard, there is no risk in getting things wrong as the shapes won't match and we will come to know through an error, correct? We could introduce this as a parameter (with NCHW as default) to the factory(ies) |
Yes I think this would be helpful. Sticking to one convention would help. I am happy with NCHW (NLC for transformers). Our classification head at the moment expects NHWC/NCL instead. |
ok @paolo-fraccaro what would u suggest? change the CH or add a transform neck and have the factories add the neck to the model per default? |
this is a way! We have the |
This code in the head strikes me as probably transformer specific... this transpose bit should be done by a neck instead. I think we can remove this line and try with a transform. As for the convention, NCHW is the norm, but transformers have N T C (t for tokens) internally. This is needed so that internally they can efficiently perform attention (e.g. applying the same mlp to all tokens) and is why we have the necks that perform permutation if needed (see the permute neck and the neck that goes from tokens to image) |
terratorch/terratorch/models/heads/classification_head.py
Line 49 in efd84db
Any comments @CarlosGomes98 ?
The text was updated successfully, but these errors were encountered: