[Feature] ISS-60: Implement Self Extend #431
base: dev
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
@jan-wassenberg would need your CR here, it's in alpha stage right now. Let's go back and forth on this. Thanks |
4bfe885
to
5a2a7ee
Ooh nice :) Please note that we can only take pull requests on the dev branch. That code has just changed to replace template arguments with a runtime argument. Would you mind updating/rebasing your code to that? |
7bb4e0b
to
8cf3966
My bad, let me do the needful! Thanks for the pointer though. |
Haha I took so long to understand how main branch worked now I have to re-do it with this new base branch |
Note to self: Was able to compile gemma dev branch by commenting out Had to do this because the compiler has a strict check against passing non-trivial args to a variadic method. Maybe I could've turned off that feature in clang using ^^ This issue should be resolved now |
Sorry about the code change. We are moving toward an "all in one file" model. Thanks for rebasing!
gemma/configs.h
Outdated
@@ -127,6 +127,10 @@ struct LayerConfig {
  size_t conv1d_width = 0;
  bool ff_biases = false;
  bool softmax_attn_output_biases = false;
  bool self_extend = false;
  size_t ngb_size = 0;
Is this n-gram block? Maybe expand to block_size for more clarity? We can also move these three new fields into a section (just a newline before them) with a // Self-extension comment.
@jan-wassenberg Sorry, I didn't understand. I put it here because LayerConfig gets accessed during the attention mechanism.
Sorry to be unclear, I was suggesting considering renaming this to ngram_block_size.
And it would be good to add a newline plus "// Self-extension" comment for visual separation from the other fields in this struct.
Oh, ngb is short for neighbour, I see the point of confusion now.
Ah :) Generally it's good to write out words.
Understood, I'll make the change.
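To make the naming suggestion in this thread concrete, here is a minimal sketch of how the new fields might look once the abbreviation is written out and the fields are grouped under a section comment. The struct name `LayerConfigSketch`, the name `neighbor_size`, and the third field `group_size` are assumptions for illustration; the diff above only shows `self_extend` and `ngb_size`.

```cpp
#include <cstddef>

// Hypothetical excerpt: the real LayerConfig in gemma/configs.h has many
// more fields. Only the three fields before the blank line come from the diff.
struct LayerConfigSketch {
  std::size_t conv1d_width = 0;
  bool ff_biases = false;
  bool softmax_attn_output_biases = false;

  // Self-extension
  bool self_extend = false;
  std::size_t neighbor_size = 0;  // written out; was `ngb_size` in the diff
  std::size_t group_size = 1;     // assumed third field, not shown in the diff
};
```

The blank line plus comment gives the visual separation the reviewer asked for, and the written-out name avoids the "n-gram block" misreading.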
Sorry to reply late :)
Nit: I noticed when I ran I suppose the team has some internal flow to run clang-format on the entire repo before merging, which I'm not aware of, so I'll leave that decision to you. |
No worries, both the result of clang-format, or leaving it unchanged, are fine. |
Hm looks like there's |
hm, the build error I'm currently seeing seems to be due to an extra/unnecessary & in line 309, "const hwy::Divisor&". We want to just construct one instance, not a reference. |
EDIT (IGNORE BELOW): Took longer than expected but here's a tech doc, will be easier to collaborate here. I pasted the same comment there, let me know if I got your email wrong?
@jan-wassenberg Let me highlight an issue here:
Summary: I want to mutate
Background: Let's say I want to modify
My proposal:
Approach 1: Modification of
A basic approach would allow modification of
class ModelConfig {
  int MutateModelConfig(const RuntimeConfig& runtime_config) {
    // do some validation
    // consume self_extend params, and modify model config with some validation
    this->layer_configs = runtime_config.layer_configs;
  }
};
Pros:
Cons:
Approach 2: Create member variables for Gemma
Pros:
Cons:
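As a rough illustration of Approach 1, the sketch below fills in the "do some validation" step with a guessed rule and reports success through the return value. Everything here is a stand-in: the `LayerConfig` and `RuntimeConfig` fields, the `bool` return type, and the rule that both sizes must be nonzero are assumptions, not the actual gemma.cpp types or the author's final design.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real gemma.cpp types.
struct LayerConfig {
  bool self_extend = false;
  std::size_t se_group_size = 1;
  std::size_t se_neighbor_size = 0;
};

struct RuntimeConfig {
  bool self_extend = false;
  std::size_t se_group_size = 1;
  std::size_t se_neighbor_size = 0;
};

struct ModelConfig {
  std::vector<LayerConfig> layer_configs;

  // Returns true on success. On invalid input, returns false and leaves
  // layer_configs untouched.
  bool MutateModelConfig(const RuntimeConfig& runtime_config) {
    if (!runtime_config.self_extend) return true;  // nothing to apply

    // Assumed validation rule: both sizes must be nonzero.
    if (runtime_config.se_group_size == 0 ||
        runtime_config.se_neighbor_size == 0) {
      return false;
    }

    for (LayerConfig& layer_config : layer_configs) {
      layer_config.self_extend = true;
      layer_config.se_group_size = runtime_config.se_group_size;
      layer_config.se_neighbor_size = runtime_config.se_neighbor_size;
    }
    return true;
  }
};
```

Validating before touching any layer keeps the config all-or-nothing, which is one way to address the concern about mutating a model config after it has been created.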
hm, I understand. FYI ModelConfig has some larger changes coming up in order to allow serializing it to disk. |
Hi @jan-wassenberg, thanks for the prompt reply. I suppose you mean we do something similar to how
// gemma.h
ModelConfig& GetMutableModelConfig() const { return model_.MutableConfig(); }
// weights.h
ModelConfig& MutableConfig() const { return config_; }
In LayerConfig, define it as
// configs.h
class LayerConfig {
  /**
   * Self-extend
   * Jin, Hongye, et al. "Llm maybe longlm: Self-extend llm context window without tuning." arXiv preprint arXiv:2401.01325 (2024).
   */
  bool self_extend = false;
  // Self-extend neighbor size
  size_t se_neighbor_size = std::numeric_limits<size_t>::max();
  // Self-extend group window size
  size_t se_group_size = 1;
In our app we should allow storing of self extend params inside it, something like this
struct LoaderArgs {
  // Self-extend
  Tristate self_extend;
  size_t se_group_size;
  size_t se_neighbor_size;
}
Finally, in
// run.cc
void ApplySelfExtendIfGiven(Gemma& model, const LoaderArgs& loader) {
  ModelConfig& config = model.GetMutableConfig();
  if (loader.self_extend != TriState::kTrue) {
    return;
  }
  // Modify layer config in-place
  auto& layer_configs = std::move(config.layer_configs);
  std::transform(layer_configs.begin(), layer_configs.end(), layer_configs.begin(), []() {
    layer_config.self_extend = loader.self_extend;
    layer_config.se_group_size = loader.se_group_size;
    layer_config.se_neighbor_size = loader.se_neighbor_size;
  });
}
void Run(LoaderArgs& loader, InferenceArgs& inference, AppArgs& app) {
  // post CreateGemma
  Gemma model = CreateGemma(loader, pools);
  ModelConfig& mutable_model_config = model.GetMutableConfig();
  ApplySelfExtendIfGiven(model, loader);
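For readers unfamiliar with the two parameters, the following sketch shows how a neighbor size and group size could determine the relative position used by attention, based on my reading of the cited Jin et al. paper. The function name `SelfExtendRelativePosition` is hypothetical and this is not the code from the PR; it only illustrates why the proposed defaults (neighbor size at the maximum `size_t`, group size 1) would leave ordinary attention unchanged.

```cpp
#include <cstddef>

// Relative position seen by attention for a query at q_pos and a key at
// k_pos, assuming k_pos <= q_pos and group_size >= 1.
// Keys closer than neighbor_size keep their exact distance; keys further
// away use floor-divided (grouped) positions, shifted so the grouped range
// starts where the neighbor window ends.
std::size_t SelfExtendRelativePosition(std::size_t q_pos, std::size_t k_pos,
                                       std::size_t group_size,
                                       std::size_t neighbor_size) {
  const std::size_t distance = q_pos - k_pos;
  if (distance < neighbor_size) return distance;  // normal attention

  const std::size_t shift = neighbor_size - neighbor_size / group_size;
  return q_pos / group_size + shift - k_pos / group_size;  // grouped attention
}
```

With a group size of 4 and a neighbor size of 8, a key 100 tokens back maps to a relative position of 31, which is how the method keeps positions within the range the model saw during training.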
Nice, this looks good to me, except that the lambda's layer_config argument seems to have been omitted? LayerWeightsPtrs.layer_config is used by Reshape in weights.h. |
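A minimal sketch of the proposal with the lambda argument restored, under stated assumptions: `Tristate`, `LayerConfig`, `ModelConfig`, `LoaderArgs`, and `Gemma` below are simplified stand-ins rather than the real gemma.cpp types, and the mutable accessor is made non-const so it can legally return a non-const reference. It also drops the `std::move` from the original snippet, since the layers are modified in place and `std::for_each` fits that better than `std::transform`.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Simplified stand-ins for the real gemma.cpp types (assumptions).
enum class Tristate { kFalse, kTrue, kDefault };

struct LayerConfig {
  bool self_extend = false;
  std::size_t se_neighbor_size = std::numeric_limits<std::size_t>::max();
  std::size_t se_group_size = 1;
};

struct ModelConfig {
  std::vector<LayerConfig> layer_configs;
};

struct LoaderArgs {
  Tristate self_extend = Tristate::kDefault;
  std::size_t se_group_size = 1;
  std::size_t se_neighbor_size = std::numeric_limits<std::size_t>::max();
};

class Gemma {
 public:
  explicit Gemma(std::size_t num_layers) {
    config_.layer_configs.resize(num_layers);
  }
  // Non-const: callers that mutate the config need a non-const model.
  ModelConfig& GetMutableConfig() { return config_; }
  const ModelConfig& GetConfig() const { return config_; }

 private:
  ModelConfig config_;
};

void ApplySelfExtendIfGiven(Gemma& model, const LoaderArgs& loader) {
  if (loader.self_extend != Tristate::kTrue) return;

  ModelConfig& config = model.GetMutableConfig();
  // The lambda takes each layer by reference and captures the loader args.
  std::for_each(config.layer_configs.begin(), config.layer_configs.end(),
                [&loader](LayerConfig& layer_config) {
                  layer_config.self_extend = true;
                  layer_config.se_group_size = loader.se_group_size;
                  layer_config.se_neighbor_size = loader.se_neighbor_size;
                });
}
```

Note that any other component holding its own copy of a layer config would not see this update; that is a separate concern from the lambda fix shown here.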
Woah, that's an interesting insight :) and yes I see the use of |
Okay! Done with the changes, and it's working with and without the runtime config 🍾 Tried it on a sample, and it completely fails 😄. I highly doubt it has to do with the Moving on: Unfortunately, I'm unable to run LongLM because I have a Mac and it has some issues with |
Example:
Output prompt:
self_extend: true |
Nice, your code looks good to me! |
#60