
Unify how we scale modules #352

Open
fmartiescofet opened this issue Jan 7, 2025 · 1 comment

Comments

@fmartiescofet
Contributor

When using the UperNet decoder with a plain ViT, decoder_scale_modules should be set to True so that the layers are upscaled to simulate a "hierarchical output". The same can also be done with the LearnedInterpolateToPyramidal neck; in the tests for the Unet, which also needs this hierarchical output, this neck is used.
For this reason, I suggest deprecating scale_modules for the UperNetDecoder and recommending the neck instead.

We should also reconsider the architecture of this neck: it currently only supports 4 layers, and the last one is downscaled using max pooling. I am not sure this is the best option, so it might be worth looking into.
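For reference, the current behaviour can be sketched roughly as follows: four same-resolution ViT feature maps are turned into a pyramid, with the first layers upscaled via transposed convolutions and the last one downscaled via max pooling. This is only an illustrative sketch, not TerraTorch's actual implementation; class and variable names are made up here.

```python
import torch
import torch.nn as nn


class PyramidalNeckSketch(nn.Module):
    """Illustrative sketch: map 4 equal-resolution feature maps (B, C, H, W)
    to a pyramid at 4x, 2x, 1x, and 0.5x of the input resolution."""

    def __init__(self, channels: int):
        super().__init__()
        # Each ConvTranspose2d with kernel_size=2, stride=2 doubles H and W.
        self.up4 = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2),
            nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2),
        )
        self.up2 = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        # The last layer is downscaled with max pooling (the point questioned above).
        self.down2 = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, feats):
        f1, f2, f3, f4 = feats  # hard-coded to exactly 4 layers
        return [self.up4(f1), self.up2(f2), f3, self.down2(f4)]


feats = [torch.randn(1, 64, 16, 16) for _ in range(4)]
pyramid = PyramidalNeckSketch(64)(feats)
print([f.shape[-1] for f in pyramid])  # [64, 32, 16, 8]
```

The hard-coded unpacking into exactly four layers is what makes the neck inflexible, and the max-pool on the last layer is the design choice questioned above.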

CC: @blumenstiel

@blumenstiel
Collaborator

Thanks @fmartiescofet!

I agree, it is more in line with our architecture design to use necks instead of decoder-specific settings. As you mentioned, the neck should also generalize better by making the scaling configurable (e.g. scaling only 3 or 5 layers).

Also, the last layers should be scaled while the first layers are passed through unchanged. E.g. with 5 input layers and 4 scaling settings, the first layer is passed through as-is. As Francesc explained to me, the smp Unet requires an additional first input layer that gets ignored.
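The generalized behaviour described above could look roughly like this: apply the configured scale factors to the last N feature maps and pass any earlier layers through untouched. This is a hedged sketch with made-up names, not a proposal for the final API.

```python
import torch
import torch.nn.functional as F


def scale_last_n(feats, scale_factors):
    """Scale the last len(scale_factors) feature maps with bilinear
    interpolation; pass any leading extra layers through unchanged."""
    n_pass = len(feats) - len(scale_factors)
    passed = list(feats[:n_pass])  # e.g. the extra first layer the smp Unet needs
    scaled = [
        F.interpolate(f, scale_factor=s, mode="bilinear", align_corners=False)
        for f, s in zip(feats[n_pass:], scale_factors)
    ]
    return passed + scaled


# 5 input layers, 4 scaling settings: the first layer is passed through.
feats = [torch.randn(1, 8, 16, 16) for _ in range(5)]
out = scale_last_n(feats, [4.0, 2.0, 1.0, 0.5])
print([f.shape[-1] for f in out])  # [16, 64, 32, 16, 8]
```

With this shape, the number of scaling settings no longer needs to match the number of input layers, which covers both the 3/5-layer case and the extra ignored first layer.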
