For supervised training loops, we provide a convenience output transform which ensures only the model state is returned from the training loop. This means the rest of the training state is always discarded, even though it might be of interest later on.
I propose instead that we return a tuple:
{loop_state, transformed_state}

which always returns the whole state, as well as a transformed version. That way you never accidentally lose the entire state.
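As a rough sketch of how this could look from the caller's side (Axon.Loop.trainer/3 and Axon.Loop.run/4 are the existing entry points; the tuple return is the *proposed* behavior, not what the library does today, and `train_data` is a placeholder stream):

```elixir
# Sketch only: the {loop_state, transformed_state} tuple is the proposal,
# not current Axon behavior.
model =
  Axon.input("features", shape: {nil, 8})
  |> Axon.dense(1, activation: :sigmoid)

loop = Axon.Loop.trainer(model, :binary_cross_entropy, :adam)

# Proposed: run returns both the full loop state and the transformed output,
# so only keeping the model state no longer throws the rest away.
{loop_state, model_state} = Axon.Loop.run(loop, train_data, %{}, epochs: 5)
```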
My suggestion is to get rid of output_transform altogether. After all, anyone can transform the output by piping an operation after it.
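For example, if run returned the full loop state, the current convenience is a one-liner in user code (assuming, as today, that the trainer keeps the model parameters under `step_state[:model_state]`):

```elixir
# Without output_transform: extract just the model state when that's all you need.
# The step_state[:model_state] key is an assumption about the trainer's layout.
model_state =
  loop
  |> Axon.Loop.run(train_data, %{}, epochs: 5)
  |> then(& &1.step_state[:model_state])
```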
👍 for the trainer returning {model_state, loop_state} though. The user can even access other metadata inside state.step_state. If you want, you can add more structure by defining a TrainerStep struct which you then place as the step_state.
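Something along these lines, purely illustrative (the module name, field names, and tuple return are assumptions, not an existing Axon API):

```elixir
# Illustrative: a possible TrainerStep struct used as the loop's step_state,
# giving the trainer's step output an explicit shape instead of a bare map.
defmodule Axon.Loop.TrainerStep do
  defstruct [:model_state, :optimizer_state, :loss]
end

# Accessing metadata after a run (field names and return tuple are assumed):
{model_state, loop_state} = Axon.Loop.run(loop, train_data, %{}, epochs: 5)
%Axon.Loop.TrainerStep{loss: loss} = loop_state.step_state
```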
Perhaps it is best to make these changes sooner rather than later, since they are breaking?