I am curious why the FAN-B-H network, despite having fewer parameters and a lower computational cost than ViT-B, has an inference time roughly four times longer. In my tests, FAN-B-H took 20.8 ms per 100 runs and ViT-B took 4.8 ms. Training FAN-B-H is also significantly slower. Could this be because some computations in FAN-B-H are not parallelizable?
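For anyone trying to reproduce the comparison, here is a minimal sketch of a timing harness, not the script used for the numbers above. The `benchmark` function and its `warmup`, `runs`, and `sync` parameters are hypothetical names chosen for illustration. It assumes that warm-up iterations and explicit synchronization matter for the measurement: if the models run in PyTorch on a CUDA GPU, kernels launch asynchronously, so `torch.cuda.synchronize` could be passed as `sync` to make sure each timed call has actually finished.

```python
import statistics
import time
from typing import Callable, Optional


def benchmark(fn: Callable[[], object],
              warmup: int = 10,
              runs: int = 100,
              sync: Optional[Callable[[], None]] = None) -> dict:
    """Time `fn` over `runs` calls and report per-call latency in ms.

    `sync` is an optional callback that blocks until pending work is done
    (assumption: on a CUDA GPU this would be torch.cuda.synchronize).
    """
    # Warm-up calls are not timed; they absorb one-off costs such as
    # lazy initialization and cache population.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous work before stopping the clock
        samples_ms.append((time.perf_counter() - start) * 1e3)

    return {
        "runs": runs,
        "mean_ms": statistics.fmean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "total_ms": sum(samples_ms),
    }


if __name__ == "__main__":
    # Stand-in CPU workload; replace with a model forward pass.
    print(benchmark(lambda: sum(i * i for i in range(10_000))))
```

Reporting both the per-call mean and the total over all runs also makes it unambiguous whether a figure such as "20.8 ms per 100 runs" refers to a single forward pass or to the whole loop.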