
Try larger models 💪 #3

Open
maxbbraun opened this issue Nov 15, 2023 · 0 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)


@maxbbraun (Owner)

The current implementation works with the 15M-parameter version of tinyllamas. Just dropping in the next larger one (42M) flashes fine but freezes at runtime.

I'd need to look into what's happening here. It could be that the model weights plus the run state exceed the available RAM (63.5MB). I might also have overlooked something about the memory layout. If it's the former, there may be a way to optimize memory usage so everything fits.
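As a back-of-the-envelope check, here's a sketch that estimates the weight and run-state footprint from the model config, assuming a llama2.c-style `Config`/`RunState` layout (exact buffers vary slightly between llama2.c versions). The 42M config values in `main` are illustrative assumptions; the real values come from the checkpoint header.

```c
#include <stdio.h>
#include <stddef.h>

typedef struct {
    int dim;        // transformer dimension
    int hidden_dim; // FFN hidden dimension
    int n_layers;   // number of layers
    int n_heads;    // number of query heads
    int n_kv_heads; // number of key/value heads
    int vocab_size; // vocabulary size
    int seq_len;    // maximum sequence length
} Config;

// Number of weight parameters, assuming the classifier is tied to the
// token embedding (as in the tinyllamas checkpoints).
size_t weight_params(const Config *c) {
    size_t dim = c->dim, hid = c->hidden_dim;
    size_t kv_dim = dim * c->n_kv_heads / c->n_heads;
    size_t per_layer =
        dim +              // rms_att_weight
        dim * dim +        // wq
        2 * dim * kv_dim + // wk, wv
        dim * dim +        // wo
        dim +              // rms_ffn_weight
        2 * dim * hid +    // w1, w3
        hid * dim;         // w2
    return (size_t)c->vocab_size * dim       // token embedding (+ tied classifier)
         + (size_t)c->n_layers * per_layer
         + dim;                               // final rmsnorm
}

// Bytes for the RunState activation buffers; the KV cache dominates.
size_t run_state_bytes(const Config *c) {
    size_t dim = c->dim;
    size_t kv_dim = dim * c->n_kv_heads / c->n_heads;
    size_t floats =
        3 * dim                                           // x, xb, xb2
        + 2 * (size_t)c->hidden_dim                       // hb, hb2
        + dim + 2 * kv_dim                                // q, k, v
        + (size_t)c->n_heads * c->seq_len                 // attention scores
        + (size_t)c->vocab_size                           // logits
        + 2 * (size_t)c->n_layers * c->seq_len * kv_dim;  // key/value cache
    return floats * sizeof(float);
}

int main(void) {
    // Illustrative config in the ballpark of the 42M tinyllamas model.
    Config c42 = {512, 1376, 8, 8, 8, 32000, 1024};
    size_t bytes_per_param = 4; // 4 for fp32; ~1 for int8-quantized weights
    printf("weights: %zu MB, run state: %zu MB\n",
           weight_params(&c42) * bytes_per_param >> 20,
           run_state_bytes(&c42) >> 20);
    return 0;
}
```

With these illustrative numbers the run state alone comes to roughly 32MB, dominated by the KV cache (2 × n_layers × seq_len × kv_dim floats), so capping seq_len would be one lever if RAM turns out to be the bottleneck.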

Another option would be to train a model sized between 15M and 42M parameters that just barely fits without any further optimizations.
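For scale, and depending on how the weights are stored: at fp32 (4 bytes per parameter) the 42M weights alone are roughly 168MB, far over budget, and even int8-quantized (about 1 byte per parameter, ~42MB) the weights plus a ~32MB KV cache would already exceed 63.5MB. The estimator sketched above could help pick a parameter count, or a shorter seq_len, that just barely fits.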

@maxbbraun added the enhancement and good first issue labels on Nov 15, 2023
@maxbbraun changed the title from "Try larger models" to "Try larger models 💪" on Dec 6, 2023