
[onert] Optimize Bias Grad Computation #12673

Merged — 1 commit merged into Samsung:master on Feb 27, 2024
Conversation

@Aeren1564 (Contributor) commented Feb 21, 2024

This commit optimizes bias grad computation with multithreading.

ONE-DCO-1.0-Signed-off-by: YongHyun An [email protected]


From draft #12571, #12666
Relevant issue: #12569
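The idea behind the change can be sketched as follows: the bias gradient is the sum of `input_backprop` over every dimension except the last (channel) axis. This is a hypothetical stand-in, not the merged code — the PR delegates the reduction to Eigen's multithreaded redux functor, while this sketch uses `std::thread`; the names `biasGradSketch` and `num_threads` are invented for illustration.

```cpp
// Sketch of a multithreaded bias-grad reduction over a row-major
// [outer x inner] buffer, where `inner` is the channel count.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

inline void biasGradSketch(const float *input_backprop, std::size_t outer,
                           std::size_t inner, float *bias_grad,
                           unsigned num_threads = std::thread::hardware_concurrency())
{
  if (num_threads == 0)
    num_threads = 1;
  // Each worker accumulates a private partial sum over its slice of rows,
  // so no synchronization is needed until the final combine step.
  std::vector<std::vector<float>> partial(num_threads, std::vector<float>(inner, 0.f));
  std::vector<std::thread> workers;
  const std::size_t chunk = (outer + num_threads - 1) / num_threads;
  for (unsigned t = 0; t < num_threads; ++t)
  {
    workers.emplace_back([&, t] {
      const std::size_t begin = t * chunk;
      const std::size_t end = std::min(outer, begin + chunk);
      for (std::size_t row = begin; row < end; ++row)
        for (std::size_t c = 0; c < inner; ++c)
          partial[t][c] += input_backprop[row * inner + c];
    });
  }
  for (auto &w : workers)
    w.join();
  // Combine the per-thread partial sums into the final bias gradient.
  for (std::size_t c = 0; c < inner; ++c)
  {
    bias_grad[c] = 0.f;
    for (unsigned t = 0; t < num_threads; ++t)
      bias_grad[c] += partial[t][c];
  }
}
```

The per-thread accumulators avoid contention on the output buffer, which is the same reason a parallel reduction library like Eigen splits the outer dimension across its thread pool.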

@Aeren1564 (Contributor, Author) commented:

Training Time for mobilenet_v2

Before this commit

== training parameter ==
- learning_rate   = 0.001
- batch_size      = 32
- loss_info       = {loss = mean squared error, reduction = sum over batch size}
- optimizer       = adam
========================
Epoch 1/5 - time: 4430.210ms/step - loss: [0] 30090.0898
Epoch 2/5 - time: 4607.027ms/step - loss: [0] 30090.0840
Epoch 3/5 - time: 4669.991ms/step - loss: [0] 30090.0840
Epoch 4/5 - time: 4705.184ms/step - loss: [0] 30090.0840
Epoch 5/5 - time: 4659.798ms/step - loss: [0] 30090.0840
===================================
MODEL_LOAD   takes 5.4760 ms
PREPARE      takes 786.9770 ms
EXECUTE      takes 207755.0330 ms
- Epoch 1      takes 39871.8920 ms
- Epoch 2      takes 41463.2420 ms
- Epoch 3      takes 42029.9200 ms
- Epoch 4      takes 42346.6520 ms
- Epoch 5      takes 41938.1790 ms
===================================

After this commit

== training parameter ==
- learning_rate   = 0.001
- batch_size      = 32
- loss_info       = {loss = mean squared error, reduction = sum over batch size}
- optimizer       = adam
========================
Epoch 1/5 - time: 2800.052ms/step - loss: [0] 30090.0977
Epoch 2/5 - time: 2580.089ms/step - loss: [0] 30090.0938
Epoch 3/5 - time: 2654.771ms/step - loss: [0] 30090.0938
Epoch 4/5 - time: 2647.186ms/step - loss: [0] 30090.0938
Epoch 5/5 - time: 2648.930ms/step - loss: [0] 30090.0938
===================================
MODEL_LOAD   takes 5.3360 ms
PREPARE      takes 775.7780 ms
EXECUTE      takes 120092.1480 ms
- Epoch 1      takes 25200.4690 ms
- Epoch 2      takes 23220.8040 ms
- Epoch 3      takes 23892.9350 ms
- Epoch 4      takes 23824.6720 ms
- Epoch 5      takes 23840.3740 ms
===================================

@@ -170,17 +169,8 @@ void DepthwiseConvolutionLayer::backwardFloat32()
// Calculate gradient for bias
if (_bias)
{
// TODO Use optimized kernel
@Aeren1564 (Contributor, Author):

This commit resolves this TODO item.

@@ -196,20 +195,8 @@ void ConvolutionLayer::backwardFloat32()
// Calculate gradient for bias
if (_bias)
{
// TODO Use optimized kernel
@Aeren1564 (Contributor, Author):

This commit resolves this TODO item.

@Aeren1564 Aeren1564 marked this pull request as ready for review February 21, 2024 09:35
@Aeren1564 Aeren1564 added the 'approval: 2 (Require at least 2 approvals)' and 'PR/ready for review' labels Feb 21, 2024
@Aeren1564 Aeren1564 requested a review from a team February 21, 2024 09:37
@Aeren1564 Aeren1564 marked this pull request as draft February 22, 2024 04:41
@Aeren1564 Aeren1564 removed the 'PR/ready for review' label Feb 22, 2024
@Aeren1564 Aeren1564 marked this pull request as ready for review February 22, 2024 05:24
@Aeren1564 Aeren1564 added the 'PR/ready for review' label Feb 22, 2024
@YongseopKim (Contributor) left a comment:

In runtime/onert/backend, nothing uses Eigen directly. By convention, onert does not use a specific kernel library such as Eigen directly, AFAIK. How about hiding this?

  • AS-IS: in biasGrad(), we can see Eigen.
  • TO-BE: in biasGradIO, there is no Eigen.

I'd like to get feedback from others.

@jyoungyun (Contributor) commented:

@Aeren1564 Would you take a look at another file structure? This PR doesn't fit the existing code design. I think this reduxFunctor.h should be located in compute/cker/include/cker/eigen/. Also, when copying code from another repository, it is recommended to modify it as little as possible, so it would be better to keep the original file name. :)

@Aeren1564 Aeren1564 removed the 'PR/ready for review' label Feb 26, 2024
@Aeren1564 Aeren1564 marked this pull request as draft February 26, 2024 01:07
@Aeren1564 Aeren1564 marked this pull request as ready for review February 26, 2024 02:32
@Aeren1564 Aeren1564 added the 'PR/ready for review' label Feb 26, 2024
@Aeren1564 Aeren1564 requested a review from a team February 26, 2024 02:57
float *bias_grad_buffer = getBuffer<float>(bias_grad);
nnfw::cker::Tensor bias_grad_t{bias_grad_shape, bias_grad_buffer};

Eigen::Index outer = input_backprop_shape.Dims(input_backprop_shape.DimensionsCount() - 1);
@YongseopKim (Contributor) commented Feb 26, 2024:

Can these Eigen classes be hidden?

Until now, onert has hidden these things AFAIK.

ONE/runtime/onert/backend (master ✗) grep -rni "eigen"
train/TensorManager.h:38:  // _align is equal to EIGEN_MAX_ALIGN_BYTES, though including Eigen headers

@Aeren1564 (Contributor, Author):

I hid "Eigen" from onert, but redux_functor now accepts the following arguments:

void operator()(const Device &device, const Eigen::Index &outer_dim, const Eigen::Index &inner_dim, const Tensor &input, Tensor *output) const

Please let me know if you have some better way to handle this.

@YongseopKim (Contributor):

A built-in type can be passed in and converted to Eigen::Index inside the redux_functor function.

@Aeren1564 (Contributor, Author):

I implemented @YongseopKim's suggestion for now 👀

@YongseopKim (Contributor) commented:

I had something like this in mind, @Aeren1564:

This commit optimizes bias grad computation with multithreading.

ONE-DCO-1.0-Signed-off-by: YongHyun An <[email protected]>
Comment on lines +193 to +194
void biasReductionHelper(float *input_backprop_buffer, const Shape &input_backprop_shape,
float *bias_grad_buffer, const Shape &bias_grad_shape)
@Aeren1564 (Contributor, Author):

This is from @YongseopKim's suggestion.
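The interface-hiding pattern agreed on above — a public helper that takes only built-in types and converts to the kernel's index type internally — can be illustrated roughly like this. This is a hypothetical sketch, not the merged code: `KernelIndex` stands in for `Eigen::Index` so it compiles without Eigen, the serial loop stands in for the multithreaded redux functor, and the `int`/pointer signature is a simplification of the actual helper, which takes `Shape` arguments.

```cpp
// Sketch: keep the kernel library's index type out of the public interface.
#include <cstddef>

namespace sketch
{
using KernelIndex = std::ptrdiff_t; // stand-in for Eigen::Index

// Internal kernel-facing reduction; only this layer sees KernelIndex.
inline void reduxFunctor(KernelIndex outer, KernelIndex inner, const float *in, float *out)
{
  for (KernelIndex c = 0; c < inner; ++c)
    out[c] = 0.f;
  for (KernelIndex r = 0; r < outer; ++r)
    for (KernelIndex c = 0; c < inner; ++c)
      out[c] += in[r * inner + c];
}

// Public helper: built-in types only in the signature, so callers never
// need the kernel library's headers. The index conversion is confined here.
inline void biasReductionHelper(const float *input_backprop, int outer, int inner,
                                float *bias_grad)
{
  reduxFunctor(static_cast<KernelIndex>(outer), static_cast<KernelIndex>(inner),
               input_backprop, bias_grad);
}
} // namespace sketch
```

Confining the conversion to the implementation is what lets the backend code stay free of `Eigen` mentions, matching the convention @YongseopKim describes.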

@YongseopKim (Contributor) left a comment:

LGTM 👍

@Aeren1564 Aeren1564 requested a review from jyoungyun February 26, 2024 23:32
@jyoungyun (Contributor) left a comment:

LGTM, Thank you!!!

@hseok-oh hseok-oh merged commit b4444ee into Samsung:master Feb 27, 2024
9 checks passed
@Aeren1564 Aeren1564 deleted the optimize-bias branch February 27, 2024 12:05
Labels: approval: 2 (Require at least 2 approvals), PR/ready for review

4 participants