
Add QNN EP HTP shared memory allocator #23136

Merged · 61 commits merged into main on Jan 14, 2025
Conversation

@edgchen1 (Contributor) commented Dec 18, 2024

Description

Adds QNN EP HTP shared memory allocator.

The HTP shared memory allocator (HtpSharedMemoryAllocator) calls the rpcmem shared library (libcdsprpc.so/dll) to allocate and free memory that can be shared between HTP and CPU.
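To illustrate the dependency, here is a minimal sketch of probing for the rpcmem library at runtime with ctypes. The library names come from the description above; the `rpcmem_alloc`/`rpcmem_free` entry points and their signatures follow Qualcomm's rpcmem API, but treat the exact signatures here as assumptions for illustration:

```python
import ctypes

def load_rpcmem():
    """Probe for the FastRPC shared memory library (names from this PR)."""
    for name in ("libcdsprpc.so", "libcdsprpc.dll"):
        try:
            lib = ctypes.CDLL(name)
        except OSError:
            continue
        # Declare the rpcmem entry points; signatures assumed to match
        # Qualcomm's rpcmem API: void* rpcmem_alloc(int heapid,
        # uint32 flags, int size) and void rpcmem_free(void* po).
        lib.rpcmem_alloc.restype = ctypes.c_void_p
        lib.rpcmem_alloc.argtypes = (ctypes.c_int, ctypes.c_uint32, ctypes.c_int)
        lib.rpcmem_free.restype = None
        lib.rpcmem_free.argtypes = (ctypes.c_void_p,)
        return lib
    return None  # not a Qualcomm device, or driver library not installed

rpcmem = load_rpcmem()
print("rpcmem available:", rpcmem is not None)
```

On devices without the Hexagon driver stack the probe simply returns `None`, which mirrors why the allocator in this PR is opt-in rather than default.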

The allocator can be enabled by setting QNN EP option enable_htp_shared_memory_allocator to 1. QNNExecutionProvider::CreatePreferredAllocators() will then return an instance of HtpSharedMemoryAllocator.
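From the Python API, enabling the option looks roughly like the sketch below. The option name `enable_htp_shared_memory_allocator` is from this PR; `backend_path` is a standard QNN EP option, but its value here is illustrative and depends on the target device:

```python
# Sketch: enable the HTP shared memory allocator when creating a session.
# Requires an onnxruntime build with the QNN EP on a device with HTP.
qnn_options = {
    "backend_path": "libQnnHtp.so",  # illustrative; device-specific path
    "enable_htp_shared_memory_allocator": "1",
}
providers = [("QNNExecutionProvider", qnn_options)]

# On a QNN-capable build, the session would then be created as:
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```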

For each QNN context, we also need to register and unregister memory handles in order to use the HTP shared memory. This memory handle management is added to QnnBackendManager, which also manages the QNN context handles.

For more information about using HTP shared memory with QNN, see: https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_shared_buffer_tutorial.html#shared-buffer-tutorial

Limitations:

  • HTP shared memory is only supported for graph inputs and outputs; intermediate values are not supported.
  • Each allocation is backed by its own shared memory buffer; the allocator does not pool multiple allocations into a single shared memory buffer.

Motivation and Context

Improve performance by using HTP shared memory to avoid overhead from copying data between CPU and NPU.

edgchen1 and others added 30 commits November 5, 2024 15:12
@yuslepukhin (Member) commented:

Can this be used for LoRA support when the model is modified to have optional inputs, and the data can be fed to override default initializers?

@yuslepukhin (Member) left a comment:

🕐

@edgchen1 (Contributor, Author) replied:

> Can this be used for LoRA support when the model is modified to have optional inputs, and the data can be fed to override default initializers?

I'm not too familiar with the scenario. If that can be done using OrtValues, an input OrtValue can use this new allocator.

@edgchen1 edgchen1 requested a review from yuslepukhin January 13, 2025 23:03
@yuslepukhin (Member) left a comment:

:shipit:

@edgchen1 (Contributor, Author) commented:

/azp run Windows GPU WebGPU CI Pipeline

Azure Pipelines successfully started running 1 pipeline(s).

@edgchen1 edgchen1 merged commit 04030f6 into main Jan 14, 2025
98 checks passed
@edgchen1 edgchen1 deleted the edgchen1/qnn_ep_rpcmem branch January 14, 2025 19:09
Labels
ep:QNN issues related to QNN execution provider

6 participants