
fix crash on non-AVX systems dynamically loading GGML CPU backends #11780

Open · wants to merge 1 commit into master

Conversation

jmorganca (Contributor) commented:
Thanks for the awesome work by @slaren in #10469 (and a few follow-up PRs) to enable dynamic GGML backend loading. This made supporting different CPU instruction sets in GGML much, much easier.

I noticed a small hitch with the llamafile code: a machine with a non-AVX CPU would crash when trying to dlopen CPU backend libraries built with GGML_LLAMAFILE=ON. This moves the AVX-dependent code into a member variable, fixing the crash on dlopen. I'm not sure how sgemm.cpp is vendored, so let me know the best way/place to suggest a change.
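For context, here is a minimal sketch of the failure mode and the shape of the fix. All names are hypothetical stand-ins (the real change is to a lookup table in sgemm.cpp), and the snippet assumes the translation unit is compiled with AVX flags enabled:

```cpp
#include <cstdint>
#include <immintrin.h>

static const std::int8_t kvalues[16] = {1, 2,  3,  4,  5,  6,  7,  8,
                                        9, 10, 11, 12, 13, 14, 15, 16};

// BEFORE: a namespace-scope global with a dynamic initializer. Because this
// file is built with AVX enabled, the compiler may emit VEX-encoded
// instructions for the load. The initializer runs as part of dlopen() on
// the shared library, i.e. before ggml can check whether the CPU supports
// AVX, so a non-AVX machine gets an illegal-instruction fault at load time.
static const __m128i g_table = _mm_loadu_si128((const __m128i *)kvalues);

// AFTER: the same table as a member variable, initialized in the
// constructor. The constructor only runs when a kernel object is actually
// created, which happens after the backend has been selected for this CPU.
class tinyBLAS_sketch {
  public:
    tinyBLAS_sketch() : table(_mm_loadu_si128((const __m128i *)kvalues)) {}
  private:
    const __m128i table;
};
```

This is what makes the bug subtle: ggml probes each CPU backend variant at runtime, but global constructors in a shared library execute inside dlopen itself, before the probe gets a chance to reject the variant.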

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Feb 10, 2025
slaren (Collaborator) commented:


Thanks, I missed this global. The fix looks ok, but if the code is not inlined it may add some overhead to the other types. I will leave this open for a while in case someone knowledgeable about llamafile/tinyblas wants to propose a better solution.
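For intuition on the overhead concern: as a member, the table is reloaded in every constructor call and accessed through `this`, whereas the old global could be resolved to a single address. One alternative shape (an assumption on my part, not something proposed in this PR) would be a function-local static, which defers the AVX load to first use without per-object cost, at the price of a thread-safe initialization guard on each call:

```cpp
// Hypothetical alternative: lazy, shared initialization. The AVX load runs
// on first call rather than at dlopen() time, so a non-AVX CPU never
// executes it as long as the AVX code paths are never entered. Each call
// still pays the (usually well-predicted) init-guard branch.
static inline __m128i lookup_table() {
    static const __m128i t = _mm_loadu_si128((const __m128i *)kvalues);
    return t;
}
```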
