[cker/train] Introduce Pad op in train #12522
Conversation
This commit introduces Pad op in train.
ONE-DCO-1.0-Signed-off-by: Yongseop Kim <[email protected]>
template <typename T>
inline void Pad(const int32_t *padding_data, int32_t pad_rank, const Shape &input_shape,
                const T *input_data, const Shape &output_shape, T *output_data,
                const T *constant_value_data)
{
  // Use nnfw::cker::Pad directly
  nnfw::cker::Pad<T>(padding_data, pad_rank, input_shape, input_data, output_shape, output_data,
                     constant_value_data);
}
IMHO, there is no need to re-define this function. We can use nnfw::cker::Pad directly in the train backend. The wrapper adds an unnecessary function call, and we have to manage two sets of code (Pad in train/operation and Pad in cpu/operation). It also makes the binary size bigger. (Of course, it's a very trivial issue. 😄)
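For illustration, a minimal sketch of that suggestion; the function name, include paths, and parameter names here are my assumptions, not the actual onert code, and only nnfw::cker::Pad and its signature come from the snippet above:

#include <cker/Shape.h>
#include <cker/operation/Pad.h>

// Sketch only: a train-backend call site invoking the shared kernel directly,
// with no nnfw::cker::train::Pad re-export in between.
template <typename T>
void runPad(const int32_t *padding_data, int32_t pad_rank, const nnfw::cker::Shape &input_shape,
            const T *input_data, const nnfw::cker::Shape &output_shape, T *output_data,
            const T *constant_value_data)
{
  // Reuse the existing kernel as-is so there is a single Pad implementation to maintain.
  nnfw::cker::Pad<T>(padding_data, pad_rank, input_shape, input_data, output_shape, output_data,
                     constant_value_data);
}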
 * -> [C,C,C,C]
 */
template <typename T>
inline void Pad(const int32_t *padding_data, int32_t pad_rank, const Shape &input_shape,
My new patch overwrote @jyoungyun's comment, so I'm writing it here again on @jyoungyun's behalf:

IMHO, there is no need to re-define this function. We can use nnfw::cker::Pad directly in the train backend. The wrapper adds an unnecessary function call, and we have to manage two sets of code (Pad in train/operation and Pad in cpu/operation). It also makes the binary size bigger. (Of course, it's a very trivial issue. 😄)
I agree with @jyoungyun. I'm going to use nnfw::cker::Pad in the train backend.
PTAL @jyoungyun and @Samsung/one_onert
LGTM
struct PadParams
{
  int32_t data[8];
  int32_t rank;
};
👍
It would be good to use this param in the cpu backend too.
PTAL
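As a rough sketch of that idea (the helper name makePadParams and the exact meaning of the data[] entries are my assumptions, not taken from the PR):

#include <cstdint>
#include <cstring>

// Mirrors the PadParams struct shown above.
struct PadParams
{
  int32_t data[8]; // presumably {before, after} padding pairs, one pair per dimension
  int32_t rank;    // number of padded dimensions (up to 4)
};

// Hypothetical helper: pack the Pad op's padding tensor (rank x 2 int32 values)
// into the fixed-size parameter block, so cpu and train backends could share it.
inline PadParams makePadParams(const int32_t *padding_tensor, int32_t rank)
{
  PadParams params{};
  params.rank = rank;
  std::memcpy(params.data, padding_tensor, static_cast<size_t>(rank) * 2 * sizeof(int32_t));
  return params;
}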
const auto in_offset = (c + padding_cube) * in_cube_size +
                       (d + padding_depth) * in_plain_size +
                       (h + padding_top) * in_width + (padding_left);
const auto out_offset = (c * out_cube_size) + (d * out_plain_size) + (h * out_width);
// copy a row of input data to output data
std::memcpy(output_data + out_offset, input_data + in_offset, out_width * sizeof(T));
I'm a bit afraid that I might be wrong, but this offset calculation looks like NCHW. 🤔

const auto in_offset = (c + padding_cube) * in_cube_size +  // n * CHW
                       (d + padding_depth) * in_plain_size + // c * HW
                       (h + padding_top) * in_width +        // h * W
                       padding_left;                         // w

I referred to this: https://oneapi-src.github.io/oneDNN/dev_guide_understanding_memory_formats.html#nchw
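For what it's worth, a tiny self-contained check of the stride pattern described above, with made-up dimension sizes; it only shows that the formula is plain row-major flat indexing, whatever the four dims are named:

#include <cassert>
#include <cstdint>

int main()
{
  // Inner dims of a rank-4 row-major tensor; the outermost dim does not affect the strides.
  const int32_t D1 = 3, D2 = 4, D3 = 5;
  const int32_t cube_size = D1 * D2 * D3; // stride of index 0
  const int32_t plain_size = D2 * D3;     // stride of index 1
  const int32_t width = D3;               // stride of index 2

  // An arbitrary element (i0, i1, i2, i3).
  const int32_t i0 = 1, i1 = 2, i2 = 3, i3 = 4;
  const int32_t offset = i0 * cube_size + i1 * plain_size + i2 * width + i3;

  // Same value as nested row-major indexing, whatever the dims are called.
  assert(offset == ((i0 * D1 + i1) * D2 + i2) * D3 + i3);
  return 0;
}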
Be not afraid 😄. My code could be wrong. I'll take a closer look. Thank you!
IMHO, in cker, the logic here doesn't depend on NCHW/NHWC. From the Pad kernel's point of view, it just copies data. If it turns out to be NCHW, the variable names in the code will be changed.
If I'm wrong, I'll leave a comment.
IMHO, in cker, the logic here doesn't depend on NCHW/NHWC.

I somewhat agree. IMHO, the layout in onert (ir::Layout::NHWC) just describes the element order of nnfw::cker::Shape. At the tensor memory level, the offset calculation can still be different.
BTW, it mostly looks good.
Let's just clarify this layout issue and then proceed. 😄
Okay, now I'm fairly sure.
At the train backend level, tensors' layout is assumed to be NHWC.
// backend/train/Config.cc
ir::Layout Config::supportLayout(const ir::IOperation &, ir::Layout) { return ir::Layout::NHWC; }

// backend/train/BackendContext.cc
backend::train::ITensorRegistry *BackendContext::genTrainingTensors()
{
  const ir::train::TrainableGraph &tgraph = *trainable_graph();
  auto tensor_builder = _tensor_builder;

  tgraph.operands().iterate([&](const ir::OperandIndex &ind, const ir::Operand &obj) {
    if (external_operands().contains(ind))
      return;
    // NOTE Assuming there is no layout changes (Always assume NHWC or UNKNOWN)
    assert(tgraph.layout() != ir::Layout::NCHW);

    // TODO Different shape of back propagation tensor
    ir::OperandInfo backend_info{obj.shape(), obj.typeInfo(), obj.info().memAllocType(),
                                 obj.isConstant()};
    tensor_builder->registerBackwardTensorInfo(ind, backend_info, ir::Layout::NHWC);
  });

  // TODO Plan tensor builds to reduce peak memory usage
  tgraph.operands().iterate([&](const ir::OperandIndex &ind, const ir::Operand &) {
    if (tensor_builder->isRegisteredBackward(ind))
      tensor_builder->notifyBackwardFirstUse(ind);
  });

  tensor_builder->allocateBackward();

  return _tensor_registry.get();
}
Therefore, a rank-4 tensor in cker is handled as NHWC.
I also changed the wording from the layout-specific width/height/depth to the more abstract (less-specific) row/col/plain. The point is that CHW vs. HWC is not what matters in cker.
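To illustrate that point, here is a self-contained, layout-agnostic sketch of a rank-4 constant pad written purely in terms of dim0..dim3. This is not the cker implementation, only an illustration that the kernel just shifts and copies contiguous rows, so the semantic layout of the four dims doesn't change the logic:

#include <algorithm>
#include <cstdint>
#include <cstring>

// Sketch, not the cker code: pad a rank-4 row-major tensor with a constant value.
// The four dims are abstract (dim0..dim3); NCHW vs NHWC only changes the names.
template <typename T>
void padRank4(const T *in, const int32_t in_dims[4], const int32_t before[4],
              const int32_t out_dims[4], T pad_value, T *out)
{
  // Pre-fill the whole output with the padding value.
  const int64_t out_size =
    static_cast<int64_t>(out_dims[0]) * out_dims[1] * out_dims[2] * out_dims[3];
  std::fill(out, out + out_size, pad_value);

  // Row-major strides of the input and output buffers.
  const int64_t in_row = in_dims[3], in_plain = in_dims[2] * in_row, in_cube = in_dims[1] * in_plain;
  const int64_t out_row = out_dims[3], out_plain = out_dims[2] * out_row, out_cube = out_dims[1] * out_plain;

  for (int32_t i0 = 0; i0 < in_dims[0]; ++i0)
    for (int32_t i1 = 0; i1 < in_dims[1]; ++i1)
      for (int32_t i2 = 0; i2 < in_dims[2]; ++i2)
      {
        // Copy one contiguous innermost row into its shifted position in the output.
        const int64_t src = i0 * in_cube + i1 * in_plain + i2 * in_row;
        const int64_t dst = (i0 + before[0]) * out_cube + (i1 + before[1]) * out_plain +
                            (i2 + before[2]) * out_row + before[3];
        std::memcpy(out + dst, in + src, in_row * sizeof(T));
      }
}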
Aha, I think I also got it.
Thank you 👍
PTAL @jyoungyun and @zetwhite
LGTM 👍
PTAL @jyoungyun
LGTM
ping @Samsung/one_onert
This commit introduces Pad op in cker/train.
ONE-DCO-1.0-Signed-off-by: Yongseop Kim [email protected]
for: #12413
draft: #12499