Add MIGraphX + ROCm Onnxruntime Execution Provider support to Onnxruntime Backend #231

Closed
Changes from all commits (33 commits):

- a1f50c0: Initial changes for MIGraphX/ROCm EP additiona to build (Dec 4, 2023)
- 55051a4: More pieces for run with migraphx/rocm (Dec 5, 2023)
- 20bd228: Fix more hooks. Generate image with (Dec 6, 2023)
- b4bdfea: Update generator to create valid dockerfile for MIGraphX/ROCm EPs (Dec 6, 2023)
- e5f7ab3: additional changes to generate migraphx ORT dockerfile (Dec 20, 2023)
- 3e5907b: Additional apt/pip libs and changes to onnxruntime for migarphx/rocm … (Dec 22, 2023)
- f4b66dd: Fix last step for MIGraphX specific headers (Dec 22, 2023)
- ec4dd6a: Merge branch 'main' into add_migraphx_rocm_onnxrt_eps (TedThemistokleous, Jan 9, 2024)
- aa89843: Add CMake Hooks for enabling MIGraphX/ROCm in triton server build (Jan 10, 2024)
- 3df7f34: Update cmake for generator script (Jan 10, 2024)
- dad5041: Fix link to hiprtc (Jan 12, 2024)
- 61b6650: Fixes to scripts (Feb 21, 2024)
- 062d0a2: Fix warning in CMakeList (Feb 21, 2024)
- 1fd0f37: Remove hiptoolkit package for now (Feb 21, 2024)
- c087741: Use hip::host instead of hiprtc::hiprtc (Feb 21, 2024)
- 7e7ec55: fixup! Use hip::host instead of hiprtc::hiprtc (Feb 21, 2024)
- ef5de15: Allow flowthrough of base container for ROCm builds (Feb 21, 2024)
- 16565e2: Hard code ROCm container for now in build. Parameterize later (Feb 22, 2024)
- 4bb23b2: fixup! Hard code ROCm container for now in build. Parameterize later (Feb 22, 2024)
- 06a952e: Add docker commands to build to view process (Feb 22, 2024)
- e665bbd: Add rocm hip include for onnxruntime.cc (Feb 22, 2024)
- 57cdc67: fixup! Add docker commands to build to view process (Feb 22, 2024)
- 3d83dda: Show dockerfile during ORT build (Feb 22, 2024)
- f8302e9: Dockerfile.ort (Feb 22, 2024)
- cc9320b: bit more cleanup on gen_ort_dockerfile.py script (Feb 23, 2024)
- 80ae1db: Fix issue where we weren't output Dockerfile correctly (Feb 23, 2024)
- 5160558: fix dir for MIGraphX dockerfile (Feb 23, 2024)
- 779f017: Update migraphx build arg (Feb 24, 2024)
- 3386bf2: Remove workspace dir and work off root for Onnxruntime (Feb 24, 2024)
- 6dfd219: Revert "Remove workspace dir and work off root for Onnxruntime" (Feb 26, 2024)
- 4e3bf74: Fix WORKDIR for onnruntime. errornously removed when adding back olde… (Feb 26, 2024)
- e24b320: fixup! Fix WORKDIR for onnruntime. errornously removed when adding ba… (Feb 27, 2024)
- e769b82: Fix dir for migraphx_provider_factory.h (Feb 27, 2024)
96 changes: 92 additions & 4 deletions CMakeLists.txt
@@ -82,15 +82,36 @@ project(tritononnxruntimebackend LANGUAGES C CXX)
# igpu. If not set, the current platform will be used. If building on
# Jetpack, always set to igpu to avoid misdetection.
#
# - If you want ROCm support set
# TRITON_ENABLE_ONNXRUNTIME_ROCM=ON and set
# TRITON_BUILD_ONNXRUNTIME_ROCM_VERSION to the ROCm stack
# version that is compatible with the specified version of ONNX
# Runtime.
#
# - If you want MIGraphX support set
# TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX=ON and set
# TRITON_BUILD_ONNXRUNTIME_MIGRAPHX_VERSION to the MIGraphX
# version that is compatible with the specified version of ONNX
# Runtime. Requires that ROCm support (TRITON_ENABLE_ROCM=ON) also be enabled.
#
# - If you want to disable GPU usage, set TRITON_ENABLE_GPU=OFF.
# This will cause builds with CUDA and TensorRT flags to fail.
#
option(TRITON_ENABLE_GPU "Enable GPU support in backend" ON)
option(TRITON_ENABLE_ROCM "Enable AMD GPU support in backend" ON)
option(TRITON_ENABLE_STATS "Include statistics collections in backend" ON)
option(TRITON_ENABLE_ONNXRUNTIME_TENSORRT
"Enable TensorRT execution provider for ONNXRuntime backend in server" OFF)
option(TRITON_ENABLE_ONNXRUNTIME_ROCM
"Enable ROCm execution provider for ONNXRuntime backend in server" OFF)
option(TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX
"Enable MIGraphX execution provider for ONNXRuntime backend in server" OFF)
option(TRITON_ENABLE_ONNXRUNTIME_OPENVINO
"Enable OpenVINO execution provider for ONNXRuntime backend in server" OFF)
set(TRITON_BUILD_ROCM_VERSION "" CACHE STRING "Version of ROCm install")
set(TRITON_BUILD_ROCM_HOME "" CACHE PATH "Path to ROCm install")
set(TRITON_BUILD_MIGRAPHX_VERSION "" CACHE STRING "Version of MIGraphX install")
set(TRITON_BUILD_MIGRAPHX_HOME "" CACHE PATH "Path to MIGraphX install")
set(TRITON_BUILD_CONTAINER "" CACHE STRING "Triton container to use a base for build")
set(TRITON_BUILD_CONTAINER_VERSION "" CACHE STRING "Triton container version to target")
set(TRITON_BUILD_ONNXRUNTIME_VERSION "" CACHE STRING "ONNXRuntime version to build")
@@ -122,6 +143,12 @@ if (NOT TRITON_ENABLE_GPU)
endif() # TRITON_ENABLE_ONNXRUNTIME_TENSORRT
endif() # NOT TRITON_ENABLE_GPU

if (NOT TRITON_ENABLE_ROCM)
if (TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX)
message(FATAL_ERROR "TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX=ON requires TRITON_ENABLE_ROCM=ON")
endif() # TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX
endif() # NOT TRITON_ENABLE_ROCM

if(NOT CMAKE_BUILD_TYPE)
set(CMAKE_BUILD_TYPE Release)
endif()
@@ -149,7 +176,11 @@ else()
endif()

if(NOT TRITON_BUILD_CONTAINER)
set(TRITON_BUILD_CONTAINER "nvcr.io/nvidia/tritonserver:${TRITON_BUILD_CONTAINER_VERSION}-py3-min")
if (TRITON_ENABLE_ROCM)
set(TRITON_BUILD_CONTAINER "rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2")
else()
set(TRITON_BUILD_CONTAINER "nvcr.io/nvidia/tritonserver:${TRITON_BUILD_CONTAINER_VERSION}-py3-min")
endif()
endif()

set(TRITON_ONNXRUNTIME_DOCKER_IMAGE "tritonserver_onnxruntime")
@@ -201,6 +232,13 @@ if(${TRITON_ENABLE_GPU})
find_package(CUDAToolkit REQUIRED)
endif() # TRITON_ENABLE_GPU

#
# ROCM
#
if(${TRITON_ENABLE_ROCM})
find_package(hip REQUIRED)
endif() # TRITON_ENABLE_ROCM

#
# Shared library implementing the Triton Backend API
#
@@ -234,6 +272,13 @@ target_compile_options(
$<$<CXX_COMPILER_ID:MSVC>:/Wall /D_WIN32_WINNT=0x0A00 /EHsc /Zc:preprocessor>
)

if(${TRITON_ENABLE_ROCM})
target_compile_definitions(
triton-onnxruntime-backend
PRIVATE TRITON_ENABLE_ROCM=1
)
endif() # TRITON_ENABLE_ROCM

if(${TRITON_ENABLE_GPU})
target_compile_definitions(
triton-onnxruntime-backend
@@ -253,6 +298,20 @@ if(${TRITON_ENABLE_ONNXRUNTIME_OPENVINO})
)
endif() # TRITON_ENABLE_ONNXRUNTIME_OPENVINO

if(${TRITON_ENABLE_ONNXRUNTIME_ROCM})
target_compile_definitions(
triton-onnxruntime-backend
PRIVATE TRITON_ENABLE_ONNXRUNTIME_ROCM=1
)
endif() # TRITON_ENABLE_ONNXRUNTIME_ROCM

if(${TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX})
target_compile_definitions(
triton-onnxruntime-backend
PRIVATE TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX=1
)
endif() # TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX

if (WIN32)
set_target_properties(
triton-onnxruntime-backend
@@ -305,6 +364,14 @@ if(${TRITON_ENABLE_GPU})
)
endif() # TRITON_ENABLE_GPU

if(${TRITON_ENABLE_ROCM})
target_link_libraries(
triton-onnxruntime-backend
PRIVATE
hip::host
)
endif() # TRITON_ENABLE_ROCM

if(${TRITON_ENABLE_ONNXRUNTIME_OPENVINO})
target_link_libraries(
triton-onnxruntime-backend
@@ -334,11 +401,29 @@ if(TRITON_ONNXRUNTIME_DOCKER_BUILD)
set(_GEN_FLAGS ${_GEN_FLAGS} "--tensorrt-home=${TRITON_BUILD_TENSORRT_HOME}")
endif() # TRITON_BUILD_TENSORRT_HOME
if(${TRITON_ENABLE_ONNXRUNTIME_TENSORRT})
set(_GEN_FLAGS ${_GEN_FLAGS} "--ort-tensorrt")
set(_GEN_FLAGS ${_GEN_FLAGS} "--ort-tensorrt --trt-version=${TRT_VERSION} --onnx-tensorrt-tag=${TRITON_ONNX_TENSORRT_REPO_TAG}")
endif() # TRITON_ENABLE_ONNXRUNTIME_TENSORRT
if(${TRITON_ENABLE_ONNXRUNTIME_OPENVINO})
set(_GEN_FLAGS ${_GEN_FLAGS} "--ort-openvino=${TRITON_BUILD_ONNXRUNTIME_OPENVINO_VERSION}")
endif() # TRITON_ENABLE_ONNXRUNTIME_OPENVINO
if(NOT ${TRITON_BUILD_ROCM_VERSION} STREQUAL "")
set(_GEN_FLAGS ${_GEN_FLAGS} "--rocm-version=${TRITON_BUILD_ROCM_VERSION}")
endif() # TRITON_BUILD_ROCM_VERSION
if(NOT ${TRITON_BUILD_ROCM_HOME} STREQUAL "")
set(_GEN_FLAGS ${_GEN_FLAGS} "--rocm-home=${TRITON_BUILD_ROCM_HOME}")
endif() # TRITON_BUILD_ROCM_HOME
if(${TRITON_ENABLE_ONNXRUNTIME_ROCM})
set(_GEN_FLAGS ${_GEN_FLAGS} "--enable-rocm")
endif() # TRITON_ENABLE_ONNXRUNTIME_ROCM
if(NOT ${TRITON_BUILD_MIGRAPHX_VERSION} STREQUAL "")
set(_GEN_FLAGS ${_GEN_FLAGS} "--migraphx-version=${TRITON_BUILD_MIGRAPHX_VERSION}")
endif() # TRITON_BUILD_MIGRAPHX_VERSION
if(NOT ${TRITON_BUILD_MIGRAPHX_HOME} STREQUAL "")
set(_GEN_FLAGS ${_GEN_FLAGS} "--migraphx-home=${TRITON_BUILD_MIGRAPHX_HOME}")
endif() # TRITON_BUILD_MIGRAPHX_HOME
if(${TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX})
set(_GEN_FLAGS ${_GEN_FLAGS} "--ort-migraphx")
endif() # TRITON_ENABLE_ONNXRUNTIME_MIGRAPHX

set(ENABLE_GPU_EXTRA_ARGS "")
if(${TRITON_ENABLE_GPU})
@@ -349,7 +434,7 @@ if(TRITON_ONNXRUNTIME_DOCKER_BUILD)
add_custom_command(
OUTPUT
onnxruntime/lib/${ONNXRUNTIME_LIBRARY}
COMMAND python3 ${CMAKE_CURRENT_SOURCE_DIR}/tools/gen_ort_dockerfile.py --triton-container="${TRITON_BUILD_CONTAINER}" --ort-version="${TRITON_BUILD_ONNXRUNTIME_VERSION}" --trt-version="${TRT_VERSION}" --onnx-tensorrt-tag="${TRITON_ONNX_TENSORRT_REPO_TAG}" ${_GEN_FLAGS} --output=Dockerfile.ort ${ENABLE_GPU_EXTRA_ARGS}
COMMAND python3 ${CMAKE_CURRENT_SOURCE_DIR}/tools/gen_ort_dockerfile.py --triton-container="${TRITON_BUILD_CONTAINER}" --ort-version="${TRITON_BUILD_ONNXRUNTIME_VERSION}" ${_GEN_FLAGS} --output=Dockerfile.ort ${ENABLE_GPU_EXTRA_ARGS}
COMMAND docker build --memory ${TRITON_ONNXRUNTIME_DOCKER_MEMORY} --cache-from=${TRITON_ONNXRUNTIME_DOCKER_IMAGE} --cache-from=${TRITON_ONNXRUNTIME_DOCKER_IMAGE}_cache0 --cache-from=${TRITON_ONNXRUNTIME_DOCKER_IMAGE}_cache1 -t ${TRITON_ONNXRUNTIME_DOCKER_IMAGE} -f ./Dockerfile.ort ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND powershell.exe -noprofile -c "docker rm onnxruntime_backend_ort > $null 2>&1; if ($LASTEXITCODE) { 'error ignored...' }; exit 0"
COMMAND docker create --name onnxruntime_backend_ort ${TRITON_ONNXRUNTIME_DOCKER_IMAGE}
Expand All @@ -362,11 +447,14 @@ if(TRITON_ONNXRUNTIME_DOCKER_BUILD)
add_custom_command(
OUTPUT
onnxruntime/lib/${ONNXRUNTIME_LIBRARY}
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/tools/gen_ort_dockerfile.py --ort-build-config="${CMAKE_BUILD_TYPE}" --triton-container="${TRITON_BUILD_CONTAINER}" --ort-version="${TRITON_BUILD_ONNXRUNTIME_VERSION}" --trt-version="${TRT_VERSION}" --onnx-tensorrt-tag="${TRITON_ONNX_TENSORRT_REPO_TAG}" ${_GEN_FLAGS} --output=Dockerfile.ort ${ENABLE_GPU_EXTRA_ARGS}
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/tools/gen_ort_dockerfile.py --ort-build-config="${CMAKE_BUILD_TYPE}" --triton-container="${TRITON_BUILD_CONTAINER}" --ort-version="${TRITON_BUILD_ONNXRUNTIME_VERSION}" ${_GEN_FLAGS} --output=Dockerfile.ort ${ENABLE_GPU_EXTRA_ARGS}
COMMAND cat Dockerfile.ort
COMMAND docker build --cache-from=${TRITON_ONNXRUNTIME_DOCKER_IMAGE} --cache-from=${TRITON_ONNXRUNTIME_DOCKER_IMAGE}_cache0 --cache-from=${TRITON_ONNXRUNTIME_DOCKER_IMAGE}_cache1 -t ${TRITON_ONNXRUNTIME_DOCKER_IMAGE} -f ./Dockerfile.ort ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND docker rm onnxruntime_backend_ort || echo 'error ignored...' || true
COMMAND docker create --name onnxruntime_backend_ort ${TRITON_ONNXRUNTIME_DOCKER_IMAGE}
COMMAND rm -fr onnxruntime
COMMAND docker image list
COMMAND docker ps
COMMAND docker cp onnxruntime_backend_ort:/opt/onnxruntime onnxruntime
COMMAND docker rm onnxruntime_backend_ort
COMMENT "Building ONNX Runtime"
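Taken together, the options added in the CMakeLists.txt diff above could be exercised with a configure command along these lines (a minimal sketch, not a command from this PR; the version values and ONNX Runtime version shown are illustrative placeholders):

```shell
# Hypothetical configure step for a ROCm + MIGraphX build of the backend.
# MIGraphX support requires TRITON_ENABLE_ROCM=ON per the new CMake check.
cmake -S . -B build \
  -DTRITON_ENABLE_ROCM=ON \
  -DTRITON_ENABLE_ONNXRUNTIME_ROCM=ON \
  -DTRITON_ENABLE_ONNXRUNTIME_MIGRAPHX=ON \
  -DTRITON_BUILD_ROCM_VERSION=6.0.2 \
  -DTRITON_BUILD_MIGRAPHX_VERSION=2.8 \
  -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.17.0
cmake --build build
```

With TRITON_BUILD_CONTAINER left unset, the new default-container logic would select the hard-coded `rocm/pytorch` image because TRITON_ENABLE_ROCM is ON.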
5 changes: 5 additions & 0 deletions src/onnxruntime.cc
@@ -43,6 +43,11 @@
#include <cuda_runtime_api.h>
#endif // TRITON_ENABLE_GPU

#ifdef TRITON_ENABLE_ROCM
#include <hip/hip_runtime_api.h>
#endif // TRITON_ENABLE_ROCM


//
// ONNX Runtime Backend that implements the TRITONBACKEND API.
//