
Support constant lr with cooldown #35453

Open · LoserCheems wants to merge 51 commits into main

Conversation


@LoserCheems LoserCheems commented Dec 29, 2024

What does this PR do?

Fixes #35449
Adds the 'warmup_stable_cooldown' learning rate scheduler. The schedule has three phases: a linear warmup, a stable (constant) phase, and a cooldown, where the cooldown can follow a linear, cosine, or 1-sqrt curve. A minimal sketch of the schedule shape is given below.
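For illustration, here is a minimal sketch of the three-phase shape described above, written as a per-step multiplier in the style of a torch `LambdaLR` factor; the function and argument names are illustrative and may differ from the ones in the PR:

```python
import math

def wsc_lambda(current_step, num_warmup_steps, num_stable_steps, num_cooldown_steps,
               min_lr_ratio=0.0, cooldown_type="linear"):
    """Return the LR multiplier for a warmup -> stable -> cooldown schedule (illustrative only)."""
    if current_step < num_warmup_steps:
        # Phase 1: linear warmup from 0 up to the peak learning rate.
        return current_step / max(1, num_warmup_steps)
    if current_step < num_warmup_steps + num_stable_steps:
        # Phase 2: stable phase, learning rate held constant at its peak.
        return 1.0
    # Phase 3: cooldown, progress runs from 0 to 1 over num_cooldown_steps.
    progress = min(1.0, (current_step - num_warmup_steps - num_stable_steps) / max(1, num_cooldown_steps))
    if cooldown_type == "linear":
        factor = 1.0 - progress
    elif cooldown_type == "cosine":
        factor = 0.5 * (1.0 + math.cos(math.pi * progress))
    elif cooldown_type == "1-sqrt":
        factor = 1.0 - math.sqrt(progress)
    else:
        raise ValueError(f"Unknown cooldown type: {cooldown_type}")
    # Interpolate toward min_lr_ratio instead of decaying all the way to zero.
    return factor * (1.0 - min_lr_ratio) + min_lr_ratio
```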

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@muellerzr and @SunMarc

@SunMarc SunMarc (Member) commented Jan 2, 2025

Thanks for the PR @LoserCheems, we already have a warmup_stable_decay scheduler with get_wsd_schedule that does practically the same thing as the new scheduler you proposed. I think what would be good is to add more options to the decay stage of the wsd scheduler, as you did in this PR (e.g. linear / 1-sqrt). WDYT?

@LoserCheems LoserCheems (Author) commented Jan 3, 2025

Oh, thanks for your suggestion. I intend to integrate the different cooldown methods and a minimum learning rate into get_wsd_schedule.
In addition, would it be possible to rename get_wsd_schedule to get_wsc_schedule, which is more in line with the three phases of warmup, stable, and cooldown? A hypothetical usage sketch of the extended scheduler is shown below.
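A hypothetical call to the extended scheduler might look like the sketch below; the keyword names (num_stable_steps, num_decay_steps, min_lr_ratio, decay_type) and the decay-type values are assumptions based on this discussion, not necessarily the final merged API:

```python
import torch
from transformers.optimization import get_wsd_schedule  # warmup-stable-decay scheduler

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Hypothetical usage once the extra cooldown options are merged; argument names are illustrative.
scheduler = get_wsd_schedule(
    optimizer,
    num_warmup_steps=100,    # linear warmup to the peak LR
    num_stable_steps=800,    # constant LR
    num_decay_steps=100,     # cooldown phase
    min_lr_ratio=0.1,        # cool down to 10% of the peak LR instead of 0
    decay_type="1-sqrt",     # or "linear" / "cosine"
)

for _ in range(1000):
    optimizer.step()
    scheduler.step()
```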

@SunMarc SunMarc (Member) commented Jan 3, 2025

Thanks! We can't really rename it to wsc; wsd is a real term that appears in the MiniCPM paper.

@LoserCheems LoserCheems (Author) commented

Thank you, the renaming of the function has been reverted.

@SunMarc SunMarc (Member) left a comment

Thanks for the PR! Left a comment.

(Review threads on src/transformers/optimization.py, now resolved)
@SunMarc SunMarc (Member) left a comment

Nice! Just a few nits and we should be able to merge this soon. Thanks for being reactive!

(Review threads on src/transformers/optimization.py, now resolved)
@LoserCheems LoserCheems (Author) commented

Everything is ready!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc SunMarc (Member) left a comment

LGTM! Thanks for iterating! Just a few nits.

(Review threads on src/transformers/optimization.py, now resolved)
@LoserCheems LoserCheems (Author) commented Jan 10, 2025

It's done!

@LoserCheems LoserCheems requested a review from SunMarc January 15, 2025 13:14
@SunMarc SunMarc (Member) left a comment

LGTM!

@LoserCheems LoserCheems (Author) commented Jan 15, 2025

@SunMarc Thank you, this PR is ready for merge.

Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues: Support Constant Learning Rate with Cooldown
3 participants