Release Triton Model Navigator v0.9.0 · triton-inference-server/model_navigator

Updates:
- new: TensorRT Timing Tactics Cache Management - use timing tactics cache files to improve optimization performance
- new: Added throughput saturation verification in nav.profile() (enabled by default)
- new: Allow overriding the Inplace cache dir through the MODEL_NAVIGATOR_DEFAULT_CACHE_DIR env variable
- new: inplace nav.Module can now receive a function name to be used instead of __call__ in modules/submodules, which allows customizing modules with non-standard calls
- fix: torch dynamo export and torch dynamo onnx export
- fix: measurement stabilization in nav.profile()
- fix: inplace inference through Torch
- fix: trt_profiles argument handling in ONNX to TRT conversion
- fix: optimal shape configuration for batch size in Inplace API
- change: Disable TensorRT profile builder
- change: nav.optimize() does not override module configuration
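
As a minimal sketch of the cache dir override listed above: the variable name MODEL_NAVIGATOR_DEFAULT_CACHE_DIR comes from this changelog, while the directory path and the idea of setting the variable before importing the library are assumptions made for illustration, not documented behavior.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location; any writable directory would do.
cache_dir = Path(tempfile.gettempdir()) / "model_navigator_cache"
cache_dir.mkdir(parents=True, exist_ok=True)

# Variable name taken from the changelog entry above.
# Assumption: the variable is read when model_navigator is imported or first
# used, so it is set here before any import of the library.
os.environ["MODEL_NAVIGATOR_DEFAULT_CACHE_DIR"] = str(cache_dir)

# import model_navigator as nav  # would follow once the variable is set
```

Exporting the variable in the shell before starting Python should have the same effect and avoids any dependence on import order.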
Known issues and limitations:
- DistilBERT ONNX dynamo export does not support dynamic shapes
Versions of external components used during testing:
- PyTorch 2.3.0a0+6ddf5cf85e
- TensorFlow 2.15.0
- TensorRT 8.6.3
- Torch-TensorRT 2.0.0.dev0
- ONNX Runtime 1.17.1
- Polygraphy 0.49.4
- GraphSurgeon 0.4.6
- tf2onnx v1.16.1
- Other component versions depend on the versions of the framework containers used. See the support matrix for a detailed summary.
