
Unify how we scale modules #352

Open
fmartiescofet opened this issue Jan 7, 2025 · 1 comment

Comments

@fmartiescofet
Contributor

When using the UperNet decoder with a plain ViT, decoder_scale_modules should be set to True so that the layers are upscaled to simulate a "hierarchical output". The same can also be done with the LearnedInterpolateToPyramidal neck; in the tests for the Unet, which also needs this hierarchical output, this neck is used.
For this reason, I suggest deprecating scale_modules for the UperNetDecoder and recommending the neck instead.

We should also reconsider the architecture of this neck: it currently only supports 4 layers, and the last one is downscaled using max pooling. I am not sure this is the best option, so it might be worth looking into.
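For reference, the current behaviour can be sketched roughly as follows: four same-resolution ViT feature maps are turned into a pyramid, with the first layers upscaled via transposed convolutions and the last one downscaled via max pooling. This is only an illustrative sketch, not TerraTorch's actual implementation; class and variable names are made up here.

```python
import torch
import torch.nn as nn


class PyramidalNeckSketch(nn.Module):
    """Illustrative sketch: map 4 equal-resolution feature maps (B, C, H, W)
    to a pyramid at 4x, 2x, 1x, and 0.5x of the input resolution."""

    def __init__(self, channels: int):
        super().__init__()
        # Each ConvTranspose2d with kernel_size=2, stride=2 doubles H and W.
        self.up4 = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2),
            nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2),
        )
        self.up2 = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        # The last layer is downscaled with max pooling (the point questioned above).
        self.down2 = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, feats):
        f1, f2, f3, f4 = feats  # hard-coded to exactly 4 layers
        return [self.up4(f1), self.up2(f2), f3, self.down2(f4)]


feats = [torch.randn(1, 64, 16, 16) for _ in range(4)]
pyramid = PyramidalNeckSketch(64)(feats)
print([f.shape[-1] for f in pyramid])  # [64, 32, 16, 8]
```

The hard-coded unpacking into exactly four layers is what makes the neck inflexible, and the max-pool on the last layer is the design choice questioned above.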

CC: @blumenstiel

@blumenstiel
Collaborator

Thanks @fmartiescofet!

I agree, it is more in line with our architecture design to use necks instead of decoder-specific settings. As you mentioned, the neck should also generalize better by making the scaling configurable (e.g. scaling only 3 or 5 layers).

Also, the last layers should be scaled while the first layers are passed through unchanged. E.g. with 5 input layers and 4 scaling settings, the first layer is passed through as-is. As Francesc explained to me, the smp Unet requires an additional first input layer that gets ignored.
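The generalized behaviour described above could look roughly like this: apply the configured scale factors to the last N feature maps and pass any earlier layers through untouched. This is a hedged sketch with made-up names, not a proposal for the final API.

```python
import torch
import torch.nn.functional as F


def scale_last_n(feats, scale_factors):
    """Scale the last len(scale_factors) feature maps with bilinear
    interpolation; pass any leading extra layers through unchanged."""
    n_pass = len(feats) - len(scale_factors)
    passed = list(feats[:n_pass])  # e.g. the extra first layer the smp Unet needs
    scaled = [
        F.interpolate(f, scale_factor=s, mode="bilinear", align_corners=False)
        for f, s in zip(feats[n_pass:], scale_factors)
    ]
    return passed + scaled


# 5 input layers, 4 scaling settings: the first layer is passed through.
feats = [torch.randn(1, 8, 16, 16) for _ in range(5)]
out = scale_last_n(feats, [4.0, 2.0, 1.0, 0.5])
print([f.shape[-1] for f in out])  # [16, 64, 32, 16, 8]
```

With this shape, the number of scaling settings no longer needs to match the number of input layers, which covers both the 3/5-layer case and the extra ignored first layer.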
