[DRAFT][onert-micro] Onert-micro training PoCv3 #13107
Conversation
printf("MAE_ERROR TEST = %f\n", mae_result);
// Save training result
saveModel(output_trained_file_path, circle_model);
This may be exposed via an API:
NNFW_STATUS nnfw_train_export_circle(nnfw_session *session, const char *path);
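For illustration, a minimal sketch of how the example above might call this API instead of saveModel, assuming the session has already been prepared and trained (the helper name and error handling below are not part of the PR):

```cpp
// Minimal sketch (not the PR's code): export the trained model through the
// proposed nnfw API instead of calling saveModel() directly in the example app.
#include <cstdio>
#include "nnfw.h"

// Assumes `session` has already been created, prepared and trained.
void export_trained_model(nnfw_session *session, const char *output_path)
{
  NNFW_STATUS status = nnfw_train_export_circle(session, output_path);
  if (status != NNFW_STATUS_NO_ERROR)
    std::printf("Failed to export trained circle model to %s\n", output_path);
}
```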
So, we should move this function into OMTrainingInterpreter, right?
auto data = reinterpret_cast<float *>(interpreter.readOutputTensor(0));
// Temporary buffer to read input data from file using BATCH_SIZE
float training_input[BATCH_SIZE * INPUT_SIZE];
float training_target[BATCH_SIZE * OUTPUT_SIZE];
Label data for CCE is one-hot encoded. Will you feed int label data by converting it into float?
Yes, I think it is one of the preprocessing steps. For the cross-entropy task we already receive preprocessed one-hot encoded float data as input. Do you think it is better to move it into the runtime?
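As a rough sketch of that preprocessing step, assuming integer class labels are converted on the application side before being fed as the training target (the helper below is illustrative, not part of the PoC):

```cpp
#include <cstdint>
#include <cstring>

// Convert integer class labels into one-hot encoded float targets.
// labels:      batch_size integer class indices
// one_hot_out: batch_size * num_classes floats written by this function
void labels_to_one_hot(const int32_t *labels, float *one_hot_out,
                       uint32_t batch_size, uint32_t num_classes)
{
  std::memset(one_hot_out, 0, sizeof(float) * batch_size * num_classes);
  for (uint32_t b = 0; b < batch_size; ++b)
    one_hot_out[b * num_classes + labels[b]] = 1.0f;
}
```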
// Averaged result
{
  float *f_metric_val = reinterpret_cast<float *>(metric_val);
  *f_metric_val /= test_size;
Is it reasonable to insert an assert checking that test_size is not equal to zero?
Yes, thank you, I will add an assert here.
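A small sketch of the guarded version, reusing the names from the snippet above:

```cpp
#include <cassert>

// Averaged result: test_size must be non-zero before normalizing the metric
assert(test_size != 0);
{
  float *f_metric_val = reinterpret_cast<float *>(metric_val);
  *f_metric_val /= test_size;
}
```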
for (uint32_t i = 0; i < flat_size; ++i)
{
  result_value += std::pow((calculated_data[i] - target_data[i]), 2);
Here std::pow() can be replaced by multiplication.
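For reference, a sketch of the suggested change with the same loop shape as the diff above:

```cpp
for (uint32_t i = 0; i < flat_size; ++i)
{
  // Squaring via multiplication avoids the cost of a std::pow call
  const float diff = calculated_data[i] - target_data[i];
  result_value += diff * diff;
}
```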
@chunseoklee, I added a checkpoint saving and loading API:
It is similar to #12997 (comment).
To check the format of the checkpoint file, please see #13037 (comment).
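A hypothetical usage sketch of such a checkpoint round trip; the actual function names and file format are defined by the PR and the linked comments, so the saveCheckpoint/loadCheckpoint methods below are only assumed names for illustration:

```cpp
// Hypothetical sketch; method names are assumptions, not the confirmed API.
onert_micro::OMTrainingInterpreter interpreter;
onert_micro::OMConfig config;

// ... importTrainModel, setInput, setTarget, some training steps ...

// Persist trained weights, optimizer state and the current step
interpreter.saveCheckpoint(config, "/path/to/checkpoint.bin");

// Later: restore the state and continue training from the saved step
interpreter.loadCheckpoint(config, "/path/to/checkpoint.bin");
```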
cur_batch_size = std::max(1u, cur_batch_size);

config.training_context.batch_size = cur_batch_size;
config.training_context.num_step = i + 1;
AFAIU, num_step for the ADAM optimizer should not be reset at each epoch.
- config.training_context.num_step = i + 1;
+ config.training_context.num_step++;
or
- config.training_context.num_step = i + 1;
+ config.training_context.adam_step++;
I see, I will change it
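A sketch of what the fixed loop could look like on the caller side, with the step counter carried across epochs rather than derived from the batch index (variable names follow the snippet above; the loop bounds and batch bookkeeping are placeholders):

```cpp
// Sketch: num_step is never reset, so ADAM's bias correction keeps advancing.
for (uint32_t epoch = 0; epoch < num_epochs; ++epoch)
{
  for (uint32_t i = 0; i < num_batches; ++i)
  {
    // cur_batch_size holds however many samples remain in this batch
    cur_batch_size = std::max(1u, cur_batch_size);
    config.training_context.batch_size = cur_batch_size;

    // Increment instead of `= i + 1`, which would restart every epoch
    config.training_context.num_step++;

    // ... copy batch i into the input/target buffers and run one train step ...
  }
}
```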
@chunseoklee, can I start splitting this draft and merging it?
Sure.
@BalyshevArtem I am trying to make an onert-micro-dev module, which implements the nnfw API, on https://github.com/chunseoklee/ONE/commits/v3/. After drafting this, I am going to apply it to your TizenRT internal commit.
 * Warning: before using trainSingleStep call: 1) importTrainModel; 2) setInput; 3) setTarget
 */
OMStatus OMTrainingRuntimeModule::trainSingleStep(const OMConfig &config)
{
  {
+ config.training_context.num_step++;
We need this to update num_step for ADAM.
Yes, and to save this value in checkpoint files (if the corresponding method is called).
I mean we need somewhere to keep this num_step value
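A sketch of keeping the counter inside the runtime, as discussed: trainSingleStep bumps num_step itself, so ADAM and checkpointing always see the same value. Note that the config parameter would then need to be non-const; the body below is only a placeholder, not the PR's implementation:

```cpp
OMStatus OMTrainingRuntimeModule::trainSingleStep(OMConfig &config)
{
  // ADAM bias correction needs a monotonically increasing global step,
  // so the runtime owns the counter instead of the example application.
  config.training_context.num_step++;

  // ... forward pass, backward pass, optimizer update ...

  return Ok;
}
```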
 * num_of_train_layers - number of trainable last layers (Note: 0 - all layers will be trained)
 * optimizer - the optimizer that onert-micro training will use (Note: SGD is the default)
 * loss - the loss that onert-micro training will use (Note: CROSS_ENTROPY is the default)
 * lambda - used by all optimizers
Is this the learning rate? Then how about using lr or learning_rate?
Yes, sure
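A small sketch of filling these fields from user code, assuming the rename from lambda to learning_rate is applied; the enum value spellings below are assumptions for illustration, not taken from the PR:

```cpp
onert_micro::OMConfig config;

// Train only the last two layers (0 would mean all layers are trainable)
config.training_context.num_of_train_layers = 2;

// Optimizer and loss as documented above (SGD and CROSS_ENTROPY are the defaults);
// the exact enum names here are assumed.
config.training_context.optimizer = onert_micro::ADAM;
config.training_context.loss = onert_micro::CROSS_ENTROPY;

// Renamed from `lambda` per this review: the learning rate used by all optimizers
config.training_context.learning_rate = 0.001f;
```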
Force-pushed from ea703dd to 97976ac
TEST_F(BostonHousingTaskTest, ADAM_MSE_P)
{
  // Create BostonHousing data handler
  BostonHousingTask<float> bostonTask;
Added tests
This draft introduces the second version of the onert-micro training runtime. This is for one-stage training (without generating the backprop graph part). Also this draft introduces the weight divider tool. ONE-DCO-1.0-Signed-off-by: Artem Balyshev <[email protected]>
Force-pushed from 97976ac to 130afb4
Everything is merged, closing it.
This draft introduces the third version of the onert-micro training runtime. This is for one-stage training (without generating the backprop graph part) and without the weight divider.
For issue: #12873
ONE-DCO-1.0-Signed-off-by: Artem Balyshev [email protected]