[onert] Introduce ExtraTensorRequest #13604
Conversation
const ir::OperandInfo &info() const { return _info; }
ExtraTensorLifeTime lifetime() const { return _lifetime; }
I intentionally added only getters for these fields.
Once an ExtraTensor's info and lifetime are set, they never need to change.
{
  BACKWARD,            // alive during backward()
  FORWARD_TO_BACKWARD, // alive from forward() to backward()
};
First of all, I may be asking the wrong question due to my limited understanding of this feature, so please bear with me.
What does lifetime mean? Is memory dynamically allocated/deallocated at the start/end of the section (backward, forward_to_backward)?
And what does "Extra" mean in ExtraTensor?
First of all, I may be asking the wrong question due to my limited understanding of this feature, so please bear with me.
Thanks for your review and interest! I'm happy to explain.
What does "Extra" mean in ExtraTensor?
ExtraTensor means a tensor that is additionally requested while doing backward (backpropagation).
For example, while backwarding FullyConnected, we need to calculate X^T and W^T.
Since X^T and W^T are 'additionally' needed because of the mathematical properties of FC, I used the word 'Extra' to mark a tensor that is additionally required by a specific layer.
What does lifetime mean?
Lifetime means the period during which the additional tensors must remain alive.
In the example above, W^T and X^T are used only while backwarding the FC layer.
So the lifetime of these tensors is limited to backward() (marked BACKWARD).
Some ExtraTensors have to stay alive through both forwarding and backwarding.
I'd like to mark those as FORWARD_TO_BACKWARD.
Is memory dynamically allocated/deallocated at the start/end of the section (backward, forward_to_backward)?
With the lifetime information, we know in advance when and how much tensor memory will be needed.
So my plan is to (statically) pre-allocate the maximum required memory before the training process starts.
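As a rough illustration of that plan (the helper names here are hypothetical, not onert's actual planner), the lifetime tags could be folded into a single up-front memory bound like this:

#include <cstddef>
#include <vector>

enum class ExtraTensorLifeTime
{
  BACKWARD,            // alive during backward()
  FORWARD_TO_BACKWARD, // alive from forward() to backward()
};

struct ExtraTensorPlan
{
  std::size_t size_in_bytes;    // how much memory the tensor needs
  ExtraTensorLifeTime lifetime; // when that memory must stay valid
};

// Upper bound on extra-tensor memory needed at any point in one training step.
// FORWARD_TO_BACKWARD tensors are alive for the whole step and BACKWARD tensors
// only during backward(), so the peak occurs during backward() when both
// groups coexist.
std::size_t peakExtraTensorBytes(const std::vector<ExtraTensorPlan> &plans)
{
  std::size_t backward_only = 0;
  std::size_t forward_to_backward = 0;
  for (const auto &p : plans)
  {
    if (p.lifetime == ExtraTensorLifeTime::BACKWARD)
      backward_only += p.size_in_bytes;
    else
      forward_to_backward += p.size_in_bytes;
  }
  return forward_to_backward + backward_only;
}

A real planner could tighten this further (for example by overlapping BACKWARD-only tensors whose layers never run at the same time), but even this naive sum shows how the lifetime tag enables static pre-allocation.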
I think my explanation is somewhat verbose and a bit hard to follow 😢
If you're interested, I could explain the details offline!
1.
When I suggested LayerScopeTensor in #13605 (comment), it was based on your comment:
ExtraTensor is a tensor that is accessed within one operation layer. In other words, the scope of the extra tensor is confined to one specific layer.
But it seems misleading to say the scope is one specific layer.
It may be accessed in that specific layer only (though that is true for all other tensors as well), yet its lifetime is longer than that layer.
I expected it to be like a local variable, which is released when it goes out of scope.
2.
Why do we need ExtraTensor to get the scope of a Tensor?
Is it possible to use the use-def chain to get the scope of a Tensor?
It would be better not to introduce yet another tensor type if possible.
About 2.
Is it possible to use the use-def chain to get the scope of a Tensor?
As I understand it, using the use-def chain is not easy for this problem (at least for now).
The current use-def chain has the structure below.
// UseDefChain.h
class UseDefChain
{
  const Operand &_operand;
  std::set<TrainingOperationIndex> _uses;
  std::set<TrainingOperationIndex> _defs;
};
This assumes that _operand (a tensor) is defined and used by operations in the graph.
Since an ExtraTensor's definition and usage depend on each layer's implementation,
it does NOT appear in the graph. So it is hard to express an ExtraTensor's use-def with the current use-def chains.
About 1.
But it seems misleading to say the scope is one specific layer.
I thought the word 'scope' meant 'where the tensor can be accessed' and didn't imply the lifetime.
So, to me, the word 'scope' looked appropriate in this case.
It may be accessed in that specific layer only (though that is true for all other tensors as well)
Aha, I agree with your point. Weight tensors, for instance, are also accessed by only one specific layer.
I need to reconsider the tensor naming and its properties carefully.
{
public:
  ExtraTensorRequest(ir::OperandInfo info, ExtraTensorLifeTime lt, ExtraTensor **addr)
Suggested change:
- ExtraTensorRequest(ir::OperandInfo info, ExtraTensorLifeTime lt, ExtraTensor **addr)
+ ExtraTensorRequest(const ir::OperandInfo &info, ExtraTensorLifeTime lt, ExtraTensor **addr)
About ExtraTensor **addr: can't it be implemented without using a double pointer? If you do want to use a double pointer, please add const to the variable properly.
Aha, thank you for pointing this out. I'll try to find another way 🤔
Hmm.. I've rethought this.
The easy options are:
- use ExtraTensor** const
- use ExtraTensor*&
Otherwise, shared_ptr<ExtraTensor>& also looks possible.
About adding const: ExtraTensor** just becomes ExtraTensor** const.
- ExtraTensor should be mutable from the view of each Layer.
- ExtraTensor* should be mutable; it has to be updated after the extra tensor is registered.
- ExtraTensor** can be constant.
(See the comparison sketch below.)
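For clarity, here is a minimal sketch (not the PR's code; ExtraTensor is only a stand-in type here) of the three parameter shapes under discussion and where const lands for each:

#include <memory>

struct ExtraTensor; // forward declaration, stand-in for the real tensor type

// (1) Double pointer with a top-level const: the pointer-to-pointer itself
//     cannot be reseated, but *addr and **addr stay mutable for the layer.
void requestWithDoublePointer(ExtraTensor **const addr);

// (2) Reference to pointer: the registry can still write the allocated tensor
//     back into the layer's slot, and there is no null double pointer to check.
void requestWithPointerRef(ExtraTensor *&addr);

// (3) Reference to shared_ptr: ownership can be shared between the layer and
//     the TensorRegistry by assigning into the referenced slot.
void requestWithSharedPtrRef(std::shared_ptr<ExtraTensor> &addr);

The rest of the thread leans toward option (3).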
@jyoungyun Could you share your opinion?
Do you think ExtraTensor*& is safe enough, or should we find another way?
@zetwhite I prefer to use a smart pointer instead of a raw pointer. Smart pointers handle memory management automatically and lessen the chances of common pointer-related errors, enhancing code reliability. If you can implement this code using shared_ptr, how about trying it?
@ragmani Could you share your opinion about this? Is it fine to add shared_ptr to the TensorRegistry?
I couldn't find a better way to solve this issue than using a shared_ptr or a double pointer. And I think it's better to use shared_ptr& instead of a raw double pointer, as in https://github.com/Samsung/ONE/pull/13604/files#r1708816555.
@ragmani @jyoungyun I'll update the code to use shared_ptr and re-request reviews soon. Thank you for the help!
As I understand it, the double pointer was there because it is an output parameter. Is the ownership shared? If not, we don't need to use shared_ptr.
In fact, I am not sure ExtraTensor is necessary.
As I understand it, the double pointer was there because it is an output parameter. Is the ownership shared? If not, we don't need to use shared_ptr.
The ownership is shared with the TensorRegistry if each layer also holds ownership.
In fact, I am not sure ExtraTensor is necessary.
If managing and planning ExtraTensors works the same way as for other tensor types, those tensor types can be unified. But I'm still not sure whether that's the case. I think DisposableTensor and GradientTensor are candidates, but ExtraTensor may be required in both forwarding and backwarding nodes in some cases.
Is the ownership shared?
About this.. IMO there is no single concrete answer:
- One could see it as only the TensorRegistry taking ownership, with the Layer borrowing (not owning) the tensors.
- Alternatively, one could see both the TensorRegistry and the Layer as owning the tensors.
In fact, I am not sure ExtraTensor is necessary.
I thought we somehow need to manage the (Extra or LayerScope) tensors in core to plan the memory usage. So I tried defining the tensor as 'extra' and managing it through the TensorManager.
Could you share your opinion (or your concerns) about this?
Since this work is not urgent, I would like to reflect your opinion.
(+) Ah, I found the details here: #13604 (comment)
private:
  ir::OperandInfo _info;
  ExtraTensorLifeTime _lifetime;
  backend::train::ExtraTensor **_address;
Suggested change:
- backend::train::ExtraTensor **_address;
+ std::shared_ptr<backend::train::ExtraTensor> &_address;
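To make the suggested shape concrete, here is a minimal sketch assuming the names from the PR (ExtraTensorRequest, ExtraTensorLifeTime) with hypothetical surrounding code; it is illustrative, not the actual onert implementation:

#include <memory>
#include <utility>
#include <vector>

struct OperandInfo { /* shape, data type, ... (stand-in for ir::OperandInfo) */ };
struct ExtraTensor { /* buffer, shape, ... (stand-in for backend::train::ExtraTensor) */ };

enum class ExtraTensorLifeTime { BACKWARD, FORWARD_TO_BACKWARD };

class ExtraTensorRequest
{
public:
  ExtraTensorRequest(OperandInfo info, ExtraTensorLifeTime lt, std::shared_ptr<ExtraTensor> &addr)
    : _info{std::move(info)}, _lifetime{lt}, _address{addr}
  {
  }

  // Called by the tensor manager after planning/allocation: the created tensor
  // is written back into the layer-owned slot referenced by _address, so the
  // layer and the registry end up sharing ownership.
  void fulfill(std::shared_ptr<ExtraTensor> tensor) { _address = std::move(tensor); }

  const OperandInfo &info() const { return _info; }
  ExtraTensorLifeTime lifetime() const { return _lifetime; }

private:
  OperandInfo _info;
  ExtraTensorLifeTime _lifetime;
  std::shared_ptr<ExtraTensor> &_address; // layer-owned slot, filled in later
};

// Hypothetical layer side: an FC layer keeps slots for W^T and X^T and hands
// out requests that refer to those slots.
class FullyConnectedLayer
{
public:
  std::vector<ExtraTensorRequest> requestExtraTensors()
  {
    std::vector<ExtraTensorRequest> requests;
    requests.emplace_back(OperandInfo{}, ExtraTensorLifeTime::BACKWARD, _transposed_weights);
    requests.emplace_back(OperandInfo{}, ExtraTensorLifeTime::BACKWARD, _transposed_input);
    return requests;
  }

private:
  std::shared_ptr<ExtraTensor> _transposed_weights;
  std::shared_ptr<ExtraTensor> _transposed_input;
};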
Using …
Because each Layer could generate its own tensor and share the …
So, I'll close this PR and move to the next PR.
This PR introduces ExtraTensorRequest.
Through this class, each TrainableFunction can request extra tensors to be pre-allocated.
ONE-DCO-1.0-Signed-off-by: seunghui youn [email protected]
draft : #13486
related : #13282