This repository has been archived by the owner on Mar 15, 2024. It is now read-only.
What batch sizes other than 1024 have been tried when training a DeiT or ViT model? In the DeiT paper (https://arxiv.org/abs/2012.12877), the authors used a batch size of 1024 and mentioned that the learning rate should be scaled according to the batch size.
However, I was wondering if anyone has experience with, or has successfully trained, a DeiT model with a batch size smaller than 512? If so, what accuracy did you achieve?
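For context, the learning-rate scaling the DeiT paper describes is the linear rule lr = base_lr × batch_size / 512. A minimal sketch (the function name is illustrative, not from the paper's code):

```python
def scaled_lr(batch_size: int, base_lr: float = 5e-4, base_batch: int = 512) -> float:
    """Linear learning-rate scaling rule used in DeiT:
    lr = base_lr * batch_size / base_batch."""
    return base_lr * batch_size / base_batch

# For a batch size of 256, the learning rate halves relative to base_batch=512:
print(scaled_lr(256))   # 0.00025
print(scaled_lr(1024))  # 0.001
```

So if you train with a batch size below 512, this rule would reduce the learning rate proportionally rather than keeping the 1024-batch value.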
Phuoc-Hoan-Le changed the title from "What batch size number other than 1024 have been tried when training a DeiT or ViT model?" to "What batch size number other than 1024 have been tried when training a DeiT model?" on Jan 8, 2023.