
[cker/train] Introduce Pad op in train #12522

Merged — 5 commits merged into Samsung:master on Jan 26, 2024

Conversation

@YongseopKim (Contributor) commented on Jan 24, 2024

This commit introduces the Pad op in cker/train.

ONE-DCO-1.0-Signed-off-by: Yongseop Kim <[email protected]>

for: #12413
draft: #12499
Comment on lines 38 to 45
template <typename T>
inline void Pad(const int32_t *padding_data, int32_t pad_rank, const Shape &input_shape,
const T *input_data, const Shape &output_shape, T *output_data,
const T *constant_value_data)
{
// Use nnfw::cker::Pad directly
nnfw::cker::Pad<T>(padding_data, pad_rank, input_shape, input_data, output_shape, output_data, constant_value_data);
}
@jyoungyun (Contributor) commented:

IMHO, there is no need to re-define this function. We can use nnfw::cker::Pad directly in the train backend. The wrapper adds an unnecessary function call, we would have to maintain two sets of code (Pad in train/operation and Pad in cpu/operation), and it also makes the binary size bigger. (Of course, it's a very trivial issue. 😄)

* -> [C,C,C,C]
*/
template <typename T>
inline void Pad(const int32_t *padding_data, int32_t pad_rank, const Shape &input_shape,
@YongseopKim (Contributor Author) commented:

My new patch overwrote @jyoungyun's comment, so I'm reposting it here on @jyoungyun's behalf.

#12522 (comment)

IMHO, there is no need to re-define this function. We can use nnfw::cker::Pad directly in the train backend. The wrapper adds an unnecessary function call, we would have to maintain two sets of code (Pad in train/operation and Pad in cpu/operation), and it also makes the binary size bigger. (Of course, it's a very trivial issue. 😄)

@YongseopKim (Contributor Author) commented:

I agree with @jyoungyun. I'm going to use nnfw::cker::Pad in the train backend.
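A minimal sketch (not code from this PR) of what calling nnfw::cker::Pad directly from a train backend kernel could look like; the padForward name and the include path are assumptions for illustration:

#include <cker/operation/Pad.h> // assumed include path for nnfw::cker::Pad and nnfw::cker::Shape

template <typename T>
void padForward(const int32_t *padding_data, int32_t pad_rank,
                const nnfw::cker::Shape &input_shape, const T *input_data,
                const nnfw::cker::Shape &output_shape, T *output_data,
                const T *constant_value_data)
{
  // Reuse the existing kernel as-is instead of re-defining it in cker/train:
  // no duplicate implementation, no extra call layer, no extra binary size.
  nnfw::cker::Pad<T>(padding_data, pad_rank, input_shape, input_data, output_shape, output_data,
                     constant_value_data);
}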

@YongseopKim requested review from jyoungyun and a team on January 24, 2024, 23:55
@YongseopKim (Contributor Author) commented:

PTAL @jyoungyun and @Samsung/one_onert

@jyoungyun (Contributor) previously approved these changes on Jan 25, 2024 and left a comment:

LGTM

Comment on lines +400 to +404
struct PadParams
{
int32_t data[8];
int32_t rank;
};
@jyoungyun (Contributor) commented:

👍

It would be good to use this param in the cpu backend too.
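As a hedged illustration (the helper below is hypothetical, not code from this PR or the cpu backend), a backend could fill PadParams from the op's flattened padding pairs before invoking the kernel:

#include <cstdint>
#include <cstring>

// Same shape as the struct in the diff above: data presumably holds up to
// four (before, after) padding pairs, i.e. at most 8 values.
struct PadParams
{
  int32_t data[8];
  int32_t rank;
};

// Hypothetical helper: copy the flattened (before, after) pairs into PadParams.
inline PadParams makePadParams(const int32_t *padding_data, int32_t pad_rank)
{
  PadParams params;
  params.rank = pad_rank;
  // Each padded dimension contributes a (before, after) pair -> 2 * rank values.
  std::memcpy(params.data, padding_data, sizeof(int32_t) * 2 * pad_rank);
  return params;
}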

@jyoungyun requested a review from a team on January 25, 2024, 01:04

@YongseopKim (Contributor Author) commented:

PTAL

Comment on lines 147 to 152
const auto in_offset = (c + padding_cube) * in_cube_size +
(d + padding_depth) * in_plain_size +
(h + padding_top) * in_width + (padding_left);
const auto out_offset = (c * out_cube_size) + (d * out_plain_size) + (h * out_width);
// copy a row of input data to output data
std::memcpy(output_data + out_offset, input_data + in_offset, out_width * sizeof(T));
@zetwhite (Contributor) commented on Jan 25, 2024:

I'm a bit afraid that I might be wrong.

But this offset calculation looks like NCHW. 🤔

const auto in_offset = (c + padding_cube) * in_cube_size +    // n * CHW
                       (d + padding_depth) * in_plain_size +  // c * HW 
                       (h + padding_top) * in_width           // h * W 
                       + padding_left                         // w 

I referred to this: https://oneapi-src.github.io/oneDNN/dev_guide_understanding_memory_formats.html#nchw

@YongseopKim (Contributor Author) commented:

Don't be afraid 😄. My code could be wrong. I'll take a closer look. Thank you!

@YongseopKim (Contributor Author) commented:

IMHO, in cker, the logic here doesn't depend on NCHW vs. NHWC. From the Pad kernel's point of view, it just copies data. If it does turn out to be NCHW, the variable names will be changed.

If I'm wrong, I'll leave a comment.

@zetwhite (Contributor) commented on Jan 25, 2024:

IMHO, in cker, the logic here doesn't depend on NCHW vs. NHWC.

I somewhat agree.

IMHO, the layout in onert (ir::Layout::NHWC) just means nnfw::cker::Shape's element order. At the tensor-memory level, the offset calculation can still differ.

BTW, it mostly looks good. Let's just clarify this layout question and move on 😄

@YongseopKim (Contributor Author) commented:

Okay, now I'm fairly sure.

At the train backend level, tensors' layout is assumed to be NHWC.

// backend/train/Config.cc
ir::Layout Config::supportLayout(const ir::IOperation &, ir::Layout) { return ir::Layout::NHWC; }

// backend/train/BackendContext.cc
backend::train::ITensorRegistry *BackendContext::genTrainingTensors()
{
  const ir::train::TrainableGraph &tgraph = *trainable_graph();
  auto tensor_builder = _tensor_builder;

  tgraph.operands().iterate([&](const ir::OperandIndex &ind, const ir::Operand &obj) {
    if (external_operands().contains(ind))
      return;
    // NOTE Assuming there is no layout changes (Always assume NHWC or UNKNOWN)
    assert(tgraph.layout() != ir::Layout::NCHW);

    // TODO Different shape of back propagation tensor
    ir::OperandInfo backend_info{obj.shape(), obj.typeInfo(), obj.info().memAllocType(),
                                 obj.isConstant()};
    tensor_builder->registerBackwardTensorInfo(ind, backend_info, ir::Layout::NHWC);
  });

  // TODO Plan tensor builds to reduce peak memory usage
  tgraph.operands().iterate([&](const ir::OperandIndex &ind, const ir::Operand &) {
    if (tensor_builder->isRegisteredBackward(ind))
      tensor_builder->notifyBackwardFirstUse(ind);
  });

  tensor_builder->allocateBackward();

  return _tensor_registry.get();
}

Therefore, a rank-4 tensor in cker is handled as NHWC.

I changed the wording from the specific width/height/depth to the more abstract (less specific) row/col/plain. This reflects that the CHW/HWC distinction is not the important point in cker.
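To illustrate the point with a self-contained check (not PR code): for a contiguously stored rank-4 tensor, the offset formula above is plain row-major linearization, and the N/C/H/W labels only change variable names, not the copy itself:

#include <cassert>
#include <cstdint>

// dims[0..3] are the four axis sizes, whatever we choose to call them.
inline int64_t rowMajorOffset(const int32_t dims[4], int32_t i0, int32_t i1, int32_t i2,
                              int32_t i3)
{
  const int64_t width = dims[3];                  // innermost axis length
  const int64_t plain_size = dims[2] * width;     // one rank-2 plane
  const int64_t cube_size = dims[1] * plain_size; // one rank-3 block
  return i0 * cube_size + i1 * plain_size + i2 * width + i3;
}

int main()
{
  const int32_t dims[4] = {2, 3, 4, 5};
  // Consecutive innermost indices are adjacent in memory, which is all the
  // Pad kernel relies on when it memcpy's one row of input at a time.
  assert(rowMajorOffset(dims, 1, 2, 3, 4) == rowMajorOffset(dims, 1, 2, 3, 3) + 1);
  return 0;
}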

@zetwhite (Contributor) commented:

Aha, I think I also got it.
Thank you 👍

@YongseopKim (Contributor Author) commented:

PTAL @jyoungyun and @zetwhite

@zetwhite (Contributor) left a comment:

LGTM 👍

@YongseopKim (Contributor Author) commented:

PTAL @jyoungyun

@jyoungyun (Contributor) left a comment:

LGTM

@YongseopKim (Contributor Author) commented:

ping @Samsung/one_onert

@hseok-oh merged commit 3e8d9f9 into Samsung:master on Jan 26, 2024 (9 checks passed).
@YongseopKim deleted the pr/cker/train/pad branch on January 29, 2024, 09:03.