Add RMSNorm #629
Conversation
Doc nit aside, this LGTM! Thank you for tidying this up.
I'll merge this into the new `dev` branch shortly.
[this paper](https://browse.arxiv.org/abs/2307.14995). `\beta` is an optional bias
term.

??? cite
Nit: this will need a new line here in order to render correctly in the docs.
i.e.
??? cite
foo
Also, whilst we're here -- maybe worth emphasising that `use_{weight,bias}=False` is the default.
(Should we keep it as the default? I appreciate that's maybe more useful, but it's inconsistent with the other normalisation layers. WDYT?)
Thanks for raising the point about the defaults. I feel it's good to maintain consistency, especially with `LayerNorm`. Changed defaults to on.
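For reference, here's a minimal sketch of what the layer computes with both toggles on (the new `use_weight=True` / `use_bias=True` defaults), assuming the RMSNorm formulation from the linked paper; the function and variable names below are illustrative, not the PR's actual implementation:

```python
import jax.numpy as jnp

def rms_norm(x, weight=None, bias=None, eps=1e-6):
    # Scale by the root-mean-square of the input; unlike LayerNorm there is
    # no mean subtraction.
    rms = jnp.sqrt(jnp.mean(x**2) + eps)
    out = x / rms
    if weight is not None:  # corresponds to use_weight=True
        out = out * weight
    if bias is not None:    # corresponds to use_bias=True (the optional `\beta` term)
        out = out + bias
    return out

x = jnp.array([1.0, 2.0, 3.0])
y = rms_norm(x, weight=jnp.ones_like(x), bias=jnp.zeros_like(x))
```

With `use_weight=False` and `use_bias=False` this reduces to the plain normalisation `x / rms`.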
Alright, LGTM! Thank you for tidying this one up.
Branched from #545, addressing reviewers' comments.