-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhancements #55
base: main
Are you sure you want to change the base?
Enhancements #55
Conversation
allows to set verbosity, low_memory flags etc Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
now the layer_ids arg can be either a list of int (the index of the layers), or "all" for all layers, "middle" for the middle half, "only_middle" for only one layer that is at the middle, or ranges in the form "0.3-0.7" (=layers from 30% of the depth to 70%) Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
should close vgel#51 Signed-off-by: thiswillbeyourgithub <[email protected]>
…r other) norm Signed-off-by: thiswillbeyourgithub <[email protected]>
'autocorrect' meaning that we check that each message is indeed present in the output after applying the chat template. This was needed as there's a lot of cases where models (including official releases from GAFAM!) have sketchy implementation that drop system prompts silently etc Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
…storing the arrays before computing directions Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
…on, avoiding OOM, etc Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
Signed-off-by: thiswillbeyourgithub <[email protected]>
This is really cool! My first thought is there's probably too much going on here for one PR, but let me look it over more closely and figure out how we can split this up. I've also been considering adding a utils file for a similar reason as you--I might make a PR for that soon-ish and tag you in. |
I'm also currently migrating the library from poetry to uv, so no rush on adding dependencies! |
Glad you lile it! I take this opportunity to thank you once again for making repeng, it's been super helpful to get me started on many ideas I had. I'm fine with splitting this up but I do care about my contribution being credited as I'm a self taught medical student (soon psychiatry resident!) so I want my interests and contributions to appear on my github for exposition :) In any case I keep breaking and unbreaking my fork as I'm continuously experimenting (since yesterday I've been playing around with filtering samples that are retrievable after a 2 cluster kmeans on the umap projection and I feel good about this one!). Appart from the read_representation method I think the code is mostly stable. I'm sometimes unsure about my chat templating thingy but that might be because of errors in the original templates of their repo too so would appreciate a second look on this if you have the bandwith. Also I couldn't run the tests on my machine, I'm thinking it would really help to know if my modifications are stable and with no side effects edit: actually I was able to run the tests. But as I made extensive changes I'll keep them so far in my fork. |
definitely, will absolutely make sure you're credited on anything that gets merged :-) |
here, a good initial thing to split off would be the ability to pass a string for
then in the docstring, we should suggest people use "middle" as the default. i don't think we should support None, because people should be explicit about what layers are being controlled. and i think "only_middle" and the range stuff are marginal enough that we don't need to support them by default--people can always calculate it themselves if they need it, or we can add it later. if you could split that off into a new PR, i can help make sure the tests are passing etc so we can get it merged, if that sounds good? |
Sounds good to me, thanks! I'll do that when I have some time in the next week |
Hi!
I'm the dev who made a research fork of repeng to explore some questions about what's possible with repeng and repeng-like techniques. I first mentioned my ideas in these messages
The project is still ongoing but as I kept adding more and more little enhancements it felt a good idea to try to upstream some of them.
Notably, added a utils.py file with function make datasets, joblib caching for the activations, support for chat templates (with auto correction), l1/l2 norm, documentation, more controls on which layer to trigger, numpy v2 compatibility.
Note that I couldn't run the test file because of import issues with flash attention.
I also did not add any modules to the requirements yet.
I mainly wanted your feedback on this.
Also thanks a lot for making repeng, it's really nice and really allowed me to get my hands dirty!
Commits: