[onert] Optimize Bias Grad Computation #12673
Conversation
Training Time for mobilenet_v2

Before this commit:
== training parameter ==
- learning_rate = 0.001
- batch_size = 32
- loss_info = {loss = mean squared error, reduction = sum over batch size}
- optimizer = adam
========================
Epoch 1/5 - time: 4430.210ms/step - loss: [0] 30090.0898
Epoch 2/5 - time: 4607.027ms/step - loss: [0] 30090.0840
Epoch 3/5 - time: 4669.991ms/step - loss: [0] 30090.0840
Epoch 4/5 - time: 4705.184ms/step - loss: [0] 30090.0840
Epoch 5/5 - time: 4659.798ms/step - loss: [0] 30090.0840
===================================
MODEL_LOAD takes 5.4760 ms
PREPARE takes 786.9770 ms
EXECUTE takes 207755.0330 ms
- Epoch 1 takes 39871.8920 ms
- Epoch 2 takes 41463.2420 ms
- Epoch 3 takes 42029.9200 ms
- Epoch 4 takes 42346.6520 ms
- Epoch 5 takes 41938.1790 ms
===================================

After this commit:
== training parameter ==
- learning_rate = 0.001
- batch_size = 32
- loss_info = {loss = mean squared error, reduction = sum over batch size}
- optimizer = adam
========================
Epoch 1/5 - time: 2800.052ms/step - loss: [0] 30090.0977
Epoch 2/5 - time: 2580.089ms/step - loss: [0] 30090.0938
Epoch 3/5 - time: 2654.771ms/step - loss: [0] 30090.0938
Epoch 4/5 - time: 2647.186ms/step - loss: [0] 30090.0938
Epoch 5/5 - time: 2648.930ms/step - loss: [0] 30090.0938
===================================
MODEL_LOAD takes 5.3360 ms
PREPARE takes 775.7780 ms
EXECUTE takes 120092.1480 ms
- Epoch 1 takes 25200.4690 ms
- Epoch 2 takes 23220.8040 ms
- Epoch 3 takes 23892.9350 ms
- Epoch 4 takes 23824.6720 ms
- Epoch 5 takes 23840.3740 ms
===================================
@@ -170,17 +169,8 @@ void DepthwiseConvolutionLayer::backwardFloat32()
  // Calculate gradient for bias
  if (_bias)
  {
    // TODO Use optimized kernel
This commit resolves this TODO item.
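For anyone skimming the diff: the bias gradient that both convolution layers need is simply a per-channel sum of the incoming back-propagated gradient over every other dimension. A minimal single-threaded sketch of that reduction, with illustrative names and an assumed channels-last layout (this is not the removed onert code):

```cpp
#include <cstdint>

// Illustrative single-threaded bias-gradient reduction: grad_out is viewed as
// [outer, channels], with batch and spatial dimensions flattened into `outer`.
void naiveBiasGrad(const float *grad_out, std::int64_t outer, std::int64_t channels,
                   float *bias_grad)
{
  for (std::int64_t c = 0; c < channels; ++c)
    bias_grad[c] = 0.0f;
  for (std::int64_t o = 0; o < outer; ++o)
    for (std::int64_t c = 0; c < channels; ++c)
      bias_grad[c] += grad_out[o * channels + c];
}
```

The optimized path replaces this kind of loop with a multithreaded reduction kernel, which is where the training-time improvement above comes from.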
@@ -196,20 +195,8 @@ void ConvolutionLayer::backwardFloat32()
  // Calculate gradient for bias
  if (_bias)
  {
    // TODO Use optimized kernel
This commit resolves this TODO item.
In runtime/onert/backend, nothing uses Eigen directly. By convention, onert does not directly use a specific kernel library such as Eigen, AFAIK. How about hiding it?

- AS-IS: in `biasGrad()`, we can see `Eigen`.
- TO-BE: in `biasGradIO`, there is no `Eigen` (see the sketch below).

I'd like to get feedback from others.
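To make the AS-IS/TO-BE contrast concrete, the two interfaces would differ roughly as follows; these signatures are illustrative sketches, not the actual code in this PR:

```cpp
// AS-IS (illustrative): Eigen types are visible in the backend-facing interface.
void biasGrad(const Eigen::ThreadPoolDevice &device, Eigen::Index outer_dim,
              Eigen::Index inner_dim, const float *input_backprop, float *bias_grad);

// TO-BE (illustrative): only raw buffers and cker Shapes cross the boundary,
// so Eigen stays an implementation detail inside cker.
void biasGradIO(const float *input_backprop, const Shape &input_backprop_shape,
                float *bias_grad, const Shape &bias_grad_shape);
```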
@Aeren1564 Would you like to look at a different file structure? I ask because this PR doesn't fit the existing code design. I think this
float *bias_grad_buffer = getBuffer<float>(bias_grad);
nnfw::cker::Tensor bias_grad_t{bias_grad_shape, bias_grad_buffer};

Eigen::Index outer = input_backprop_shape.Dims(input_backprop_shape.DimensionsCount() - 1);
Can't these `Eigen` classes be hidden? Until now, onert has hidden such things, AFAIK.

ONE/runtime/onert/backend (master ✗) grep -rni "eigen"
train/TensorManager.h:38: // _align is equal to EIGEN_MAX_ALIGN_BYTES, though including Eigen headers
I hid `Eigen` from onert, but `redux_functor` now accepts the following arguments:

void operator()(const Device &device, const Eigen::Index &outer_dim, const Eigen::Index &inner_dim, const Tensor &input, Tensor *output) const

Please let me know if you have a better way to handle this.
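For context on the `Device` parameter: it is what carries the multithreading, since an Eigen thread-pool device evaluates the reduction across worker threads. A self-contained sketch of that general pattern (generic Eigen usage, not the cker `redux_functor` implementation):

```cpp
#define EIGEN_USE_THREADS
#include <unsupported/Eigen/CXX11/Tensor>
#include <unsupported/Eigen/CXX11/ThreadPool>

// Generic sketch: sum a [outer, inner] row-major buffer over its outer
// dimension using an Eigen thread-pool device.
void threadedColumnSum(const float *input, Eigen::Index outer, Eigen::Index inner,
                       float *output, int num_threads)
{
  Eigen::ThreadPool pool(num_threads);
  Eigen::ThreadPoolDevice device(&pool, num_threads);

  Eigen::TensorMap<Eigen::Tensor<const float, 2, Eigen::RowMajor>> in(input, outer, inner);
  Eigen::TensorMap<Eigen::Tensor<float, 1, Eigen::RowMajor>> out(output, inner);

  const Eigen::array<Eigen::Index, 1> reduce_dims = {0}; // reduce over `outer`
  out.device(device) = in.sum(reduce_dims);
}
```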
A built-in type can be passed in and converted to `Eigen::Index` inside the `redux_functor` function.
> A built-in type can be passed in and converted to `Eigen::Index` inside the `redux_functor` function.

I implemented @YongseopKim's suggestion for now 👀
I was thinking along these lines, @Aeren1564.
This commit optimizes bias grad computation with multithreading. ONE-DCO-1.0-Signed-off-by: YongHyun An <[email protected]>
void biasReductionHelper(float *input_backprop_buffer, const Shape &input_backprop_shape,
                         float *bias_grad_buffer, const Shape &bias_grad_shape)
This is from @YongseopKim's suggestion.
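For illustration, a call site in a layer's backwardFloat32() would then look roughly like this; the tensor and shape variable names are assumptions, not the actual onert code:

```cpp
// Hypothetical call site (sketch): the backend layer hands raw buffers and
// cker Shapes to the helper, so no Eigen type appears at this level.
if (_bias)
{
  float *input_backprop_buffer = getBuffer<float>(back_prop_out); // assumed tensor name
  float *bias_grad_buffer = getBuffer<float>(bias_grad);
  biasReductionHelper(input_backprop_buffer, input_backprop_shape, // assumed shape variables
                      bias_grad_buffer, bias_grad_shape);
}
```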
LGTM 👍
LGTM, Thank you!!!
This commit optimizes bias grad computation with multithreading.
ONE-DCO-1.0-Signed-off-by: YongHyun An [email protected]
From draft #12571, #12666
Relevant issue: #12569