
[CharTensor] Enable memory data to store scale factors based on quantization schemes #2844

Merged: 1 commit merged into nnstreamer:main from update/char_tensor/qinfo on Jan 10, 2025

Conversation

@djeong20 (Contributor) commented:

This pull request modifies the existing codebase so that the memory data of CharTensor can store scale factors based on different quantization schemes.
Additionally, this change allows the Tensor class to specify the desired quantization scheme when creating a new CharTensor instance.
The scale factors are determined either during the quantization process by a specific quantizer, or they can be initialized manually when both the quantized data and the corresponding scale factors are provided as inputs.
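
(For readers unfamiliar with the mechanism, here is a minimal sketch of how a per-tensor scale factor can be derived during quantization. This is an illustrative, simplified symmetric scheme, not the PR's actual quantizer code; `QuantizedBlob` and `quantize_per_tensor` are hypothetical names.)

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical illustration: a simple symmetric per-tensor scheme in which
// the scale factor is determined during quantization from the input range,
// then kept alongside the quantized int8 data (as CharTensor now does).
struct QuantizedBlob {
  std::vector<int8_t> data;
  std::vector<float> scales; // a single scale for a per-tensor scheme
};

QuantizedBlob quantize_per_tensor(const std::vector<float> &in) {
  float max_abs = 0.0f;
  for (float v : in)
    max_abs = std::max(max_abs, std::fabs(v));
  float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;

  QuantizedBlob out;
  out.scales = {scale};
  out.data.reserve(in.size());
  for (float v : in)
    out.data.push_back(static_cast<int8_t>(std::lround(v / scale)));
  return out;
}
```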

Self-evaluation:

  1. Build test: [X] Passed [ ] Failed [ ] Skipped
  2. Run test:   [X] Passed [ ] Failed [ ] Skipped

@EunjuYang (Contributor) left a comment:

Good work :) Please check my review below:

```diff
 MemoryData *mem_data =
-  new MemoryData((void *)(new int8_t[dim.getDataLen()]()));
+  new MemoryData((void *)(new int8_t[dim.getDataLen() + 4 * scale_size()]()));
```
@EunjuYang (Contributor):

What about updating the code like:

Suggested change:

```diff
-new MemoryData((void *)(new int8_t[dim.getDataLen() + 4 * scale_size()]()));
+new MemoryData((void *)(new int8_t[dim.getDataLen() + sizeof(float) * scale_size()]()));
```

If you have a plan to support different types of scales, it would be better to update it.

@djeong20 (Contributor, Author):

thank you for sharing your thoughts! sizeof(float) makes more sense :)
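
(As an aside, a minimal standalone sketch of the agreed-upon allocation, assuming a contiguous block with the float scales appended after the int8 data; `allocate_char_tensor_block` is a hypothetical helper, not nntrainer code:)

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper sketching the layout discussed above: one contiguous
// block holding data_len bytes of int8 quantized data, followed by
// scale_count float scale factors. Using sizeof(float) instead of the magic
// constant 4 keeps the arithmetic correct if the scale type ever changes.
int8_t *allocate_char_tensor_block(size_t data_len, size_t scale_count) {
  // [ int8 data: data_len bytes | float scales: scale_count * sizeof(float) ]
  return new int8_t[data_len + sizeof(float) * scale_count]();
}
```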

```cpp
return nullptr;

data->validate();
return ((int8_t *)getScale()) + idx;
```
@EunjuYang (Contributor):

Isn't it `((float *)getScale()) + idx`?

@djeong20 (Contributor, Author):

yes, you're right! thank you 👍
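
(To spell out why the cast matters, a small sketch; `scale_region` stands in for whatever `getScale()` returns:)

```cpp
#include <cstdint>

// Pointer arithmetic advances by the pointee size, so the cast determines
// the stride. With int8_t*, "+ idx" moves idx bytes and lands inside a
// float for idx > 0; with float*, it moves idx * sizeof(float) bytes and
// reaches the idx-th scale factor, as intended.
float *scale_at(void *scale_region, unsigned int idx) {
  // wrong: return (int8_t *)scale_region + idx;  // byte offset
  return (float *)scale_region + idx;             // element offset
}
```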

@djeong20 force-pushed the update/char_tensor/qinfo branch from c55db72 to a5ec4e6 (December 30, 2024 01:37)
@EunjuYang self-assigned this on Dec 31, 2024
@EunjuYang removed their assignment on Dec 31, 2024
@EunjuYang (Contributor) left a comment:

LGTM!

@skykongkong8 (Member) left a comment:

LGTM

Commit: [CharTensor] Enable memory data to store scale factors based on quantization schemes


Signed-off-by: Donghyeon Jeong <[email protected]>
@djeong20 force-pushed the update/char_tensor/qinfo branch from a5ec4e6 to 44fbaf6 (January 6, 2025 01:16)
@jijoongmoon (Collaborator) left a comment:

LGTM

@jijoongmoon merged commit d6d02c8 into nnstreamer:main on Jan 10, 2025. 18 checks passed.