-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Contraction f16, bf16, f32_f16, f32_bf16, f64_f32 #158
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CongMa13
changed the title
Contraction f16 bf16
Contraction f16, bf16, f32_f16, f32_bf16, f64_f32
Nov 30, 2023
CongMa13
force-pushed
the
contraction_f16_bf16
branch
3 times, most recently
from
December 6, 2023 01:45
259d0ae
to
c58123a
Compare
CongMa13
requested review from
cgmillette,
bragadeesh,
mkarunan,
dlangbe and
afanfa
as code owners
December 6, 2023 01:47
CongMa13
commented
Dec 6, 2023
CongMa13
commented
Dec 6, 2023
- Support _Float16 - Support hip_bfloat16 - Add unit test of _Float16 and hip_bfloat16 - Add sample of _Float16 and hip_bfloat16
- Support ABCD data type f32 and compute type f16, bf16 - Support ABCD data type f64 and compute type f32 - Fixed bug: alpha, beta were passed in as wrong data type in unit test of contraction - Create sample template of contraction
Solution unique_ids of Actor Critic are have not been ready yet, but we put some placeholders in the new Actor Critic to make the unit tests be able to pass.
Update contraction device instances since CK has updated them.
1. Initiate the data with 0.01, 0.02, ... by default 2. Print C
When logger level is set to HIPTENSOR_LOG_LEVEL_PERF_TRACE, we make CK instances measure the running time. The problem is that CK internally will run the contraction 10 times by default. This leads to an issues: 1. It returns wrong result for C = alpha A x B + beta C Set StreamConfig.nrepeat_ = 1, the contraction will be run once
1. ck::bhalf_t cannot cast to float or double by static_cast. Use ck::type_convert() to fix it. 2. epsilon() is not good value to measure the relative difference of data. It is too small for double (eps < 10e-13).
The pattern of contraction sameple file is - bilinear: simple_bilinear_contraction_<A>_<B>_<C>_<D>_compute_<compute>.cpp - scale : simple_scale_contraction_<A>_<B>_<C>_compute_<compute>.cpp
The relative difference between contraction result and CPU reference is less than 0.1% after the improvement.
1. Revert the default threshold of relative difference to (100 * std::numeric_limits<T>::epsilon()) 2. Update CPU reference to make the difference between CPU reference and output of contraction instance is less than (100 * std::numeric_limits<T>::epsilon()).
CongMa13
force-pushed
the
contraction_f16_bf16
branch
from
December 8, 2023 20:53
67fbad0
to
67aebeb
Compare
cgmillette
reviewed
Dec 8, 2023
cgmillette
previously approved these changes
Dec 8, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me - let's wait until CK merges and then we will have to update CI to the latest
CongMa13
force-pushed
the
contraction_f16_bf16
branch
3 times, most recently
from
December 11, 2023 16:25
c1e92bf
to
5ba83c1
Compare
cgmillette
previously approved these changes
Dec 11, 2023
CongMa13
force-pushed
the
contraction_f16_bf16
branch
from
December 11, 2023 17:35
5ba83c1
to
b21fe0b
Compare
cgmillette
approved these changes
Dec 11, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.