[onert-micro] Introduce OMTraining entities #13145
This PR introduces OMTrainingInterpreter, OMTrainingContext and OMTrainingRuntimeModule entities.
ONE-DCO-1.0-Signed-off-by: Artem Balyshev <a.balyshev@samsung.com>
uint32_t num_of_train_layers = 0;
OMTrainOptimizer optimizer = SGD;
OMLoss loss = MSE;
float learning_rate = 0.f;
How about a default value other than 0?
For learning_rate, right? Yes, I agree, I will use 0.001
Done
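To illustrate why a zero default is a problem here: with plain SGD, a learning rate of 0 leaves every weight unchanged, so a caller who forgets to set it gets a run that silently does nothing. A minimal sketch, assuming a hypothetical `TrainingDefaults` struct and `sgdStep` helper (neither is from this PR; the enums are stand-ins for the ones in the diff):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the enums referenced in the diff.
enum OMTrainOptimizer { SGD, ADAM };
enum OMLoss { MSE };

// Hypothetical config struct mirroring the fields under review,
// with the non-zero learning-rate default agreed on in this thread.
struct TrainingDefaults
{
  std::uint32_t num_of_train_layers = 0;
  OMTrainOptimizer optimizer = SGD;
  OMLoss loss = MSE;
  float learning_rate = 0.001f;
};

// One plain SGD update for a single weight: w <- w - lr * grad.
float sgdStep(float weight, float grad, float learning_rate)
{
  assert(learning_rate >= 0.f);
  return weight - learning_rate * grad;
}
```

With `learning_rate == 0.f`, `sgdStep` returns its input weight unchanged for any gradient; with the 0.001 default the weight moves against the gradient.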
// Close file
out_file.close();
#else
assert(fasle && "Not supported");
Suggested change:
- assert(fasle && "Not supported");
+ assert(false && "Not supported");
Thank you, fixed it
{
OMStatus status = Ok;
uint32_t batch_size = config.training_context.batch_size;
config.training_context.num_step += 1;
Thank you.
The only problem is where to reset this value to 0. For example, when a new epoch starts, this value would need to be reset to 0 again.
AFAIK, this num_step is used only for ADAM. Then it does not need to be reset at new epoch.
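For context, in the standard Adam formulation the step count enters only through the bias-correction terms, which compensate for the moment estimates starting at zero. Since those estimates carry over between epochs, the counter is normally kept running as well. A minimal single-weight sketch of that formulation follows; `AdamState` and `adamUpdate` are hypothetical names, the hyperparameter defaults are common choices rather than values from this PR, and the onert-micro implementation may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical per-weight Adam state; not the onert-micro data layout.
struct AdamState
{
  float m = 0.f; // first-moment (mean) estimate of the gradient
  float v = 0.f; // second-moment (uncentered variance) estimate
};

// Applies one Adam update to a single weight and returns the new weight.
// num_step is the 1-based number of updates since the moments were zeroed.
float adamUpdate(float weight, float grad, AdamState &state, std::uint32_t num_step,
                 float lr = 0.001f, float beta1 = 0.9f, float beta2 = 0.999f,
                 float eps = 1e-7f)
{
  assert(num_step >= 1);
  state.m = beta1 * state.m + (1.f - beta1) * grad;
  state.v = beta2 * state.v + (1.f - beta2) * grad * grad;
  // Bias correction: the only place the step count is used.
  const float t = static_cast<float>(num_step);
  const float m_hat = state.m / (1.f - std::pow(beta1, t));
  const float v_hat = state.v / (1.f - std::pow(beta2, t));
  return weight - lr * m_hat / (std::sqrt(v_hat) + eps);
}
```

Resetting `num_step` to 0 at an epoch boundary while keeping `m` and `v` would apply the large early-step correction to moments that are no longer near zero, which inflates the update.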
/*
 * OMMetrics - enum to store metrics supported by training with onert-micro
 */
enum OMMetrics
@BalyshevArtem Note that the nnfw API does not expose metric evaluation. That is, the nnfw API will expose a loss evaluation function.
LGTM. Thank you
LGTM
return UnknownError;

// Write data
out_file.write(config.model_ptr, config.model_size);
Maybe it would be better to catch and handle failures of the write function and, if any occur, return an Error status (the same applies to saveCheckpoint). Just check the ios_base::badbit state.
I will add this in the next PR, thank you.
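One possible shape for the check suggested above is to test the stream state after `write` and again after `close`, and map any failure to an error status. This is only a sketch under assumptions: `SaveStatus` and `saveBuffer` are hypothetical names standing in for `OMStatus` and the save routine in this PR, and it uses state checks rather than stream exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical status enum standing in for onert-micro's OMStatus.
enum class SaveStatus { Ok, UnknownError };

// Writes `size` bytes from `data` to `path` and reports stream failures
// as a status code instead of silently ignoring them.
SaveStatus saveBuffer(const std::string &path, const char *data, std::size_t size)
{
  assert(data != nullptr || size == 0);

  std::ofstream out_file(path, std::ios::out | std::ios::binary);
  if (!out_file.is_open())
    return SaveStatus::UnknownError;

  // Write data
  out_file.write(data, static_cast<std::streamsize>(size));
  // fail() is true when either failbit or badbit is set on the stream.
  if (out_file.fail())
    return SaveStatus::UnknownError;

  // Close file; close() sets failbit if the underlying close/flush fails.
  out_file.close();
  return out_file.fail() ? SaveStatus::UnknownError : SaveStatus::Ok;
}
```

Checking after `close()` as well matters because buffered data may only reach the file when the stream is flushed.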
This pr introduces OMTrainingInterpreter, OMTrainingContext and OMTrainingRuntimeModule entities.
for issue #12873
from draft: #13107
ONE-DCO-1.0-Signed-off-by: Artem Balyshev <a.balyshev@samsung.com>