The current implementation works with the 15M-parameter version of tinyllamas. Simply dropping in the next larger one (42M) flashes fine but freezes at runtime.
I'd need to look into what's happening here. It could be that the model weights plus the run state exceed the available RAM (63.5MB), or I might have overlooked something about the memory layout. If it's the former, there may be a way to optimize memory usage to fit everything.
Another option would be to train a model between 15M and 42M parameters that just barely fits without any further optimizations.
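
As a quick sanity check on the first hypothesis, the sizes of the weights and of the run state can be computed directly from the checkpoint's `Config` and compared against the 63.5MB budget. The sketch below mirrors the stock llama2.c structures (`Config`, `TransformerWeights`, `RunState`) and assumes float32 weights with a shared classifier; this port may lay memory out differently, and the 42M config values are filled in from memory of the tinyllamas model card, so both should be verified against the actual checkpoint header.

```c
/* Rough footprint estimate for a llama2.c-style checkpoint.
 * Sketch only: assumes the stock llama2.c layout, float32 everywhere,
 * and a classifier shared with the embedding table. */
#include <stdio.h>

typedef struct {
    long dim;        /* transformer dimension */
    long hidden_dim; /* FFN hidden dimension */
    long n_layers;   /* number of transformer layers */
    long n_heads;    /* number of query heads */
    long n_kv_heads; /* number of key/value heads */
    long vocab_size; /* vocabulary size */
    long seq_len;    /* maximum sequence length */
} Config;

/* Float count of the weights (classifier shared with the embedding). */
static long weight_floats(const Config *c) {
    long head_size = c->dim / c->n_heads;
    long kv_dim = c->n_kv_heads * head_size;
    long n = 0;
    n += c->vocab_size * c->dim;                   /* token embedding table */
    n += c->n_layers * c->dim * 2 + c->dim;        /* rmsnorm weights */
    n += c->n_layers * c->dim * c->dim * 2;        /* wq, wo */
    n += c->n_layers * c->dim * kv_dim * 2;        /* wk, wv */
    n += c->n_layers * c->dim * c->hidden_dim * 3; /* w1, w2, w3 */
    return n;
}

/* Float count of the run state (activations plus KV cache). */
static long runstate_floats(const Config *c) {
    long head_size = c->dim / c->n_heads;
    long kv_dim = c->n_kv_heads * head_size;
    long n = 0;
    n += c->dim * 4;                            /* x, xb, xb2, q */
    n += c->hidden_dim * 2;                     /* hb, hb2 */
    n += c->n_layers * c->seq_len * kv_dim * 2; /* key + value cache */
    n += c->n_heads * c->seq_len;               /* attention scores */
    n += c->vocab_size;                         /* logits */
    return n;
}

int main(void) {
    /* 42M config as I recall it from the tinyllamas model card --
     * worth double-checking against the actual checkpoint header. */
    Config c = { .dim = 512, .hidden_dim = 1376, .n_layers = 8,
                 .n_heads = 8, .n_kv_heads = 8,
                 .vocab_size = 32000, .seq_len = 256 };
    const double MB = 1024.0 * 1024.0;
    printf("weights:   %.1f MB\n", weight_floats(&c) * 4 / MB);
    printf("run state: %.1f MB\n", runstate_floats(&c) * 4 / MB);
    return 0;
}
```

Running the same arithmetic on the 15M config and comparing both totals against the 63.5MB budget should show quickly whether it's a plain out-of-memory problem or something else in the layout.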