#16871: Unify conditional typecasting in BinaryNg #17148

Merged: 4 commits merged into main from anasuya/unify_typecast_in_binary_ng on Feb 1, 2025

mcw-anasuya (Contributor) commented Jan 27, 2025

Ticket

#16871

Problem description

The conditional typecasting logic currently lives in the in-place binary_ng path. Move it into binary_ng and have in-place binary_ng delegate to the binary_ng struct.

What's changed

Unified the logic for binary_ng and inplace binary_ng.

Explanation

bfloat8_b and bfloat4_b tensors need to be typecast to bfloat16 for binary_ng.
In this unified logic, the in-place operation is delegated to BinaryNg by passing input_tensor_a as the optional_output_tensor. The combined logic handles the following cases:
1. Operation with no optional output provided:
E.g. output_tensor = ttnn.experimental.sub(input_tensor_a, input_tensor_b). If neither input requires typecasting, the result is returned directly, with the dtype of input_tensor_a. If input_tensor_a alone or both input tensors require typecasting, the output tensor is typecast back to the original dtype before it is returned.


2. Operation with optional output tensor provided:
E.g. ttnn.experimental.sub(input_tensor_a, input_tensor_b, output_tensor=out_tt), where out_tt is a tensor with the provided output shape. If neither input requires typecasting, the result is returned directly, with the dtype of out_tt. If out_tt is of dtype bfloat8_b or bfloat4_b, it is typecast to bfloat16 before being passed to the invoke function. The result is then typecast back to the original dtype and copied into out_tt, which is returned.

3. In-place operation:
E.g. ttnn.experimental.sub_(input_tensor_a, input_tensor_b). Here the optional output tensor is input_tensor_a. If neither input requires typecasting, the result is returned with the dtype of input_tensor_a. If input_tensor_a alone or both input tensors require typecasting, the result is typecast from bfloat16 back to input_tensor_a's dtype.
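
For readers who want to see the three cases side by side, here is a minimal pure-Python sketch of the control flow described above. It is not the actual C++ BinaryNg implementation and does not call ttnn. `FakeTensor`, `needs_typecast`, `typecast`, `invoke`, `binary_ng` and `binary_ng_inplace` are hypothetical names made up for illustration, and dtypes are modelled as plain strings. The sketch also assumes that the final dtype is taken from the output tensor when one is given and from input_tensor_a otherwise, which matches the cases listed but is not spelled out for every input combination.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Dtypes the description says must go through bfloat16 for binary_ng.
BLOCK_FLOAT_DTYPES = {"bfloat8_b", "bfloat4_b"}


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a device tensor: values plus a dtype tag."""
    data: List[float]
    dtype: str


def needs_typecast(t: FakeTensor) -> bool:
    return t.dtype in BLOCK_FLOAT_DTYPES


def typecast(t: FakeTensor, dtype: str) -> FakeTensor:
    # Only the dtype tag changes in this model; a real typecast would requantize.
    return FakeTensor(list(t.data), dtype)


def invoke(op: Callable[[float, float], float], a: FakeTensor, b: FakeTensor,
           out_dtype: str) -> FakeTensor:
    """Stand-in for the kernel, which in this model never sees block-float dtypes."""
    assert not needs_typecast(a) and not needs_typecast(b)
    assert out_dtype not in BLOCK_FLOAT_DTYPES
    return FakeTensor([op(x, y) for x, y in zip(a.data, b.data)], out_dtype)


def binary_ng(op: Callable[[float, float], float], a: FakeTensor, b: FakeTensor,
              output_tensor: Optional[FakeTensor] = None) -> FakeTensor:
    # Assumption: the final dtype comes from the output tensor if given, else from a.
    out_dtype = output_tensor.dtype if output_tensor is not None else a.dtype

    # Cast block-float operands up to bfloat16; other dtypes pass through unchanged.
    a_in = typecast(a, "bfloat16") if needs_typecast(a) else a
    b_in = typecast(b, "bfloat16") if needs_typecast(b) else b
    compute_dtype = "bfloat16" if out_dtype in BLOCK_FLOAT_DTYPES else out_dtype

    result = invoke(op, a_in, b_in, compute_dtype)

    # Cast back only when the computation ran in a different dtype than requested.
    if result.dtype != out_dtype:
        result = typecast(result, out_dtype)

    if output_tensor is None:
        return result  # case 1: fresh tensor

    # Cases 2 and 3: copy the result into the caller's tensor and return that tensor.
    output_tensor.data = result.data
    return output_tensor


def binary_ng_inplace(op: Callable[[float, float], float], a: FakeTensor,
                      b: FakeTensor) -> FakeTensor:
    # Case 3: the in-place variant just passes a as the optional output tensor.
    return binary_ng(op, a, b, output_tensor=a)
```

The point of the sketch is that the in-place variant carries no typecast logic of its own: once input_tensor_a is passed as the output tensor, the same cast-up, compute, cast-back path covers all three call forms.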

Checklist

@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch from 5b8870c to 19cb5d8 Compare January 27, 2025 19:19
@KalaivaniMCW KalaivaniMCW force-pushed the anasuya/unify_typecast_in_binary_ng branch 2 times, most recently from f31e04e to 01adab6 Compare January 28, 2025 03:49
@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch 2 times, most recently from 2e6a288 to 5de8ec1 Compare January 28, 2025 14:23
@KalaivaniMCW KalaivaniMCW force-pushed the anasuya/unify_typecast_in_binary_ng branch 2 times, most recently from 373803a to 51535a1 Compare January 28, 2025 16:00
@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch from 51535a1 to 271df47 Compare January 29, 2025 11:37
@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch from 271df47 to 7f95220 Compare January 29, 2025 17:11
@mcw-anasuya mcw-anasuya marked this pull request as ready for review January 29, 2025 17:11
@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch 2 times, most recently from 6039cf2 to 4675ee3 Compare January 30, 2025 11:04
@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch 2 times, most recently from 37a278c to 51386b9 Compare February 1, 2025 06:04
@mcw-anasuya mcw-anasuya force-pushed the anasuya/unify_typecast_in_binary_ng branch from 51386b9 to 50a3cb2 Compare February 1, 2025 06:29
@mcw-anasuya mcw-anasuya merged commit 55dd492 into main Feb 1, 2025
8 checks passed
@mcw-anasuya mcw-anasuya deleted the anasuya/unify_typecast_in_binary_ng branch February 1, 2025 06:36
nikileshx pushed a commit to nikileshx/tt-metal that referenced this pull request Feb 3, 2025
…rent#17148)

### Checklist
- [ ] Post commit CI passes
https://github.com/tenstorrent/tt-metal/actions/runs/13072107561
- [x] Blackhole Post commit tests
https://github.com/tenstorrent/tt-metal/actions/runs/13072118352
- [ ] (Single-card) Tests for new models
https://github.com/tenstorrent/tt-metal/actions/runs/13072147663
- [x] New/Existing tests provide coverage for changes

---------

Co-authored-by: KalaivaniMCW <[email protected]>
hschoi4448 pushed a commit that referenced this pull request Feb 20, 2025
6 participants