[GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers with latest GPU pipeline changes @open sesame 10/04 17:23 #2745

Closed
wants to merge 2 commits into from

Conversation

niket-agarwal
Contributor

Updated the **SwiGLU**, **Reshape**, and **Concat** layers with the new shared_ptr flow.
Replaced `clCreateKernel` with `registerClKernel` for all of these layers.


**Self evaluation:**

    Build test: [X]Passed [ ]Failed [ ]Skipped
    Run test: [X]Passed [ ]Failed [ ]Skipped
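
For context, a minimal sketch of the new registration flow is shown below. The identifiers (`cl_context_ref`, `copy_cl_kernel_`, the `"copy_cl"` kernel name) follow the review snippets later in this thread; the wrapper function itself is hypothetical and only illustrates the pattern.

```cpp
// Illustrative sketch only, not code from this PR: registerClKernel replaces
// the old context.clCreateKernel call and returns a shared_ptr-managed kernel
// handle instead of a raw cl_kernel. Repository headers are assumed.
bool registerCopyKernel(ClContext &cl_context_ref,
                        const std::string &copy_cl_kernel_) {
  do {
    ClContext::SharedPtrClKernel kernel_copy_ptr =
      cl_context_ref.registerClKernel(copy_cl_kernel_, "copy_cl");
    if (!kernel_copy_ptr) {
      break; // registration failed
    }
    return true;
  } while (false);
  return false;
}
```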

…es with OpenCL ops

Added a naive OpenCL implementation of the Transpose function using blas.
Incorporated kernels for the ops used.
Added unit tests for transpose across different axes.

Signed-off-by: Niket Agarwal <[email protected]>
Updated the swiglu, reshape, and concat layers with the new shared_ptr flow.
Replaced clCreateKernel with registerClKernel for all these layers.

Self evaluation:

    Build test: [X]Passed [ ]Failed [ ]Skipped
    Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Niket Agarwal <[email protected]>
@taos-ci

taos-ci commented Oct 4, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2745. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before the review by reviewers can start. If you are a new member joining this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

@niket-agarwal niket-agarwal changed the title [GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers with shared_ptr flow [GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers with latest GPU pipeline changes Oct 4, 2024
@jijoongmoon jijoongmoon changed the title [GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers with latest GPU pipeline changes [GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers with latest GPU pipeline changes @open sesame 10/04 17:23 Oct 4, 2024
@taos-ci taos-ci left a comment

@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.

Comment on lines 138 to 144
  do {
-   result = context.clCreateKernel(copy_cl_kernel_, context.LayerKernel::COPY,
-                                   ReshapeLayerCl::kernel_copy);
-   if (!result) {
+   ClContext::SharedPtrClKernel kernel_copy_ptr =
+     cl_context_ref.registerClKernel(copy_cl_kernel_fp16_, "copy_cl_fp16");
+   if (!kernel_copy_ptr) {
      break;
    }

Contributor

What about making this part a separate function and letting the function be called when the clLayer is registered? This opinion comes from #2723. Could you please think about this and share your opinion?
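
A rough sketch of that idea, with every name hypothetical, could look like the following: kernel registration is hoisted into a helper that runs once when the CL layer type is registered, instead of living in the layer's runtime path.

```cpp
// Hypothetical sketch of the suggestion above; all names are illustrative.
// copy_cl_kernel_ is assumed to be the file-scope kernel source string.
static bool registerReshapeClKernels(ClContext &cl_context_ref) {
  auto kernel_copy_ptr =
    cl_context_ref.registerClKernel(copy_cl_kernel_, "copy_cl");
  return kernel_copy_ptr != nullptr;
}

// Called once at layer-registration time rather than on every forward pass:
// if (!registerReshapeClKernels(cl_context_ref)) { /* report failure */ }
```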

Contributor

@EunjuYang EunjuYang Oct 7, 2024

About this comment, we may need additional discussion and can apply it in another PR. As I understand it, this PR aims to update the kernels to work with the recent commit. You can ignore this comment for this PR for now :)

@@ -148,15 +149,13 @@ class ConcatLayerCl : public Layer {
* @param[in] input1_height represents the height of the input tensor
* @param[in] input1_width represents the width of the input tensor A
* @param[in] input2_width represents the width of the input tensor X
* @param[in] context RunLayerContext reference
*/
void concat_cl_axis3_fp16(const __fp16 *matAdata, const __fp16 *vecXdata,
Contributor

Please add an ENABLE_FP16 condition to this header.

Suggested change
- void concat_cl_axis3_fp16(const __fp16 *matAdata, const __fp16 *vecXdata,
+ #ifdef ENABLE_FP16
+ void concat_cl_axis3_fp16(const __fp16 *matAdata, const __fp16 *vecXdata,

Besides this function prototype, all concat_cl_axis*_fp16 prototypes should be included only when ENABLE_FP16 is defined.
Please apply the same conditional compilation to the cpp file as well.
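
Concretely, the guard on the header declaration could look roughly like this (a sketch only; the remaining parameters are elided rather than reproduced here):

```cpp
// Sketch of the requested guard in the header; parameter list abbreviated.
#ifdef ENABLE_FP16
void concat_cl_axis3_fp16(const __fp16 *matAdata, const __fp16 *vecXdata
                          /* , ...remaining parameters as declared today... */);
#endif
// In the cpp file, wrap the corresponding definitions (and any fp16-only
// kernel registration) in the same #ifdef ENABLE_FP16 / #endif block.
```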

@s-debadri
Contributor

s-debadri commented Oct 7, 2024

Please rebase it so that it's in sync with PR #2738 (merged). Make sure to add the kernel strings in blas_kernel_strings.h.
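
For reference, that centralization could look roughly like the sketch below; the variable name and kernel body are placeholders, not the actual contents of blas_kernel_strings.h.

```cpp
// blas_kernel_strings.h (placeholder sketch only): kernel sources are kept as
// string constants in one shared header instead of per-layer definitions.
#include <string>

static const std::string copy_cl_kernel_ =
  R"(__kernel void copy_cl(__global const float *input,
                           __global float *output) {
      int i = get_global_id(0);
      output[i] = input[i];
    })";
```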
