Triton Model Navigator v0.9.0
kacper-kleczewski released this 07 May 18:21
Updates:
- new: TensorRT Timing Tactics Cache Management, using timing tactics cache files to improve optimization performance
- new: Added throughput saturation verification in `nav.profile()` (enabled by default)
- new: Allow overriding the Inplace cache dir through the `MODEL_NAVIGATOR_DEFAULT_CACHE_DIR` env variable
- new: inplace `nav.Module` can now receive a function name to be used instead of `__call__` in modules/submodules, which allows customizing modules with non-standard calls
- fix: torch dynamo export and torch dynamo onnx export
- fix: measurement stabilization in `nav.profile()`
- fix: inplace inference through Torch
- fix: trt_profiles argument handling in ONNX to TRT conversion
- fix: optimal shape configuration for batch size in Inplace API
- change: Disable TensorRT profile builder
- change: `nav.optimize()` does not override module configuration
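The cache directory override listed above can be sketched as follows. This is a minimal illustration, not the library's documented procedure: it assumes the variable is read when Model Navigator initializes its Inplace configuration, so it is set before the library is imported. The directory name `model_navigator_cache` is hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location; any writable directory should work.
cache_dir = Path(tempfile.gettempdir()) / "model_navigator_cache"
cache_dir.mkdir(parents=True, exist_ok=True)

# Variable name taken from the release notes. Setting it before the import
# is an assumption about when the library reads it.
os.environ["MODEL_NAVIGATOR_DEFAULT_CACHE_DIR"] = str(cache_dir)

# import model_navigator as nav  # import only after the variable is set
```

The same effect can be had by exporting the variable in the shell that launches the process, which avoids any dependence on import order.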
Known issues and limitations
- DistilBERT ONNX dynamo export does not support dynamic shapes
Versions of external components used during testing:
- PyTorch 2.3.0a0+6ddf5cf85e
- TensorFlow 2.15.0
- TensorRT 8.6.3
- Torch-TensorRT 2.0.0.dev0
- ONNX Runtime 1.17.1
- Polygraphy 0.49.4
- GraphSurgeon 0.4.6
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used. See the framework containers support matrix for a detailed summary.