
[CharTensor] Enable memory data to store scale factors based on quantization schemes #2844

Merged: 1 commit merged into nnstreamer:main from update/char_tensor/qinfo on Jan 10, 2025

Conversation

@djeong20 (Contributor) commented:

This pull request modifies the existing codebase so that the memory data of CharTensor can store scale factors based on different quantization schemes.
Additionally, this change allows the Tensor class to specify the desired quantization scheme when creating a new CharTensor instance.
The scale factors are determined either during the quantization process by a specific quantizer, or they can be initialized manually when both the quantized data and the corresponding scale factors are provided as inputs.
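
(For readers unfamiliar with the mechanism, here is a minimal sketch of how a per-tensor scale factor can be derived during quantization. This is an illustrative, simplified symmetric scheme, not the PR's actual quantizer code; `QuantizedBlob` and `quantize_per_tensor` are hypothetical names.)

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical illustration: a simple symmetric per-tensor scheme in which
// the scale factor is determined during quantization from the input range,
// then kept alongside the quantized int8 data (as CharTensor now does).
struct QuantizedBlob {
  std::vector<int8_t> data;
  std::vector<float> scales; // a single scale for a per-tensor scheme
};

QuantizedBlob quantize_per_tensor(const std::vector<float> &in) {
  float max_abs = 0.0f;
  for (float v : in)
    max_abs = std::max(max_abs, std::fabs(v));
  float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;

  QuantizedBlob out;
  out.scales = {scale};
  out.data.reserve(in.size());
  for (float v : in)
    out.data.push_back(static_cast<int8_t>(std::lround(v / scale)));
  return out;
}
```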

Self-evaluation:

  1. Build test: [X] Passed [ ] Failed [ ] Skipped
  2. Run test:   [X] Passed [ ] Failed [ ] Skipped

@EunjuYang (Contributor) left a comment:

Good work :) Please check my review below:

```diff
 MemoryData *mem_data =
-  new MemoryData((void *)(new int8_t[dim.getDataLen()]()));
+  new MemoryData((void *)(new int8_t[dim.getDataLen() + 4 * scale_size()]()));
```
@EunjuYang (Contributor):

What about updating the code like:

Suggested change:

```diff
-new MemoryData((void *)(new int8_t[dim.getDataLen() + 4 * scale_size()]()));
+new MemoryData((void *)(new int8_t[dim.getDataLen() + sizeof(float) * scale_size()]()));
```

If you have a plan to support different types of scales, it would be better to update it.

@djeong20 (Contributor, Author):

thank you for sharing your thoughts! sizeof(float) makes more sense :)
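
(As an aside, a minimal standalone sketch of the agreed-upon allocation, assuming a contiguous block with the float scales appended after the int8 data; `allocate_char_tensor_block` is a hypothetical helper, not nntrainer code:)

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper sketching the layout discussed above: one contiguous
// block holding data_len bytes of int8 quantized data, followed by
// scale_count float scale factors. Using sizeof(float) instead of the magic
// constant 4 keeps the arithmetic correct if the scale type ever changes.
int8_t *allocate_char_tensor_block(size_t data_len, size_t scale_count) {
  // [ int8 data: data_len bytes | float scales: scale_count * sizeof(float) ]
  return new int8_t[data_len + sizeof(float) * scale_count]();
}
```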

```cpp
return nullptr;

data->validate();
return ((int8_t *)getScale()) + idx;
```
@EunjuYang (Contributor):

Isn't it `((float *)getScale()) + idx`?

@djeong20 (Contributor, Author):

yes, you're right! thank you 👍
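
(To spell out why the cast matters, a small sketch; `scale_region` stands in for whatever `getScale()` returns:)

```cpp
#include <cstdint>

// Pointer arithmetic advances by the pointee size, so the cast determines
// the stride. With int8_t*, "+ idx" moves idx bytes and lands inside a
// float for idx > 0; with float*, it moves idx * sizeof(float) bytes and
// reaches the idx-th scale factor, as intended.
float *scale_at(void *scale_region, unsigned int idx) {
  // wrong: return (int8_t *)scale_region + idx;  // byte offset
  return (float *)scale_region + idx;             // element offset
}
```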

@djeong20 force-pushed the update/char_tensor/qinfo branch from c55db72 to a5ec4e6 (December 30, 2024 01:37)
@EunjuYang self-assigned this on Dec 31, 2024
@EunjuYang removed their assignment on Dec 31, 2024
@EunjuYang (Contributor) left a comment:

LGTM!

@skykongkong8 (Member) left a comment:

LGTM

Commit: [CharTensor] Enable memory data to store scale factors based on quantization schemes


Signed-off-by: Donghyeon Jeong <[email protected]>
@djeong20 force-pushed the update/char_tensor/qinfo branch from a5ec4e6 to 44fbaf6 (January 6, 2025 01:16)
@jijoongmoon (Collaborator) left a comment:

LGTM

@jijoongmoon merged commit d6d02c8 into nnstreamer:main on Jan 10, 2025. 18 checks passed.