Releases · triton-inference-server/model_navigator

03 Jan 11:52

kacper-kleczewski

v0.13.1

eca4003

Triton Model Navigator v0.13.1 (Latest)

Updates:
- fix: Add AutocastType to public API

Versions of external components used during testing:
- PyTorch 2.6.0a0+df5bbc0
- TensorFlow 2.16.1
- TensorRT 10.6.0.26
- Torch-TensorRT 2.6.0a0
- ONNX Runtime 1.19.2
- Polygraphy 0.49.13
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


06 Dec 14:35

knowicki-nvidia

v0.13.0

cbb0892

Triton Model Navigator v0.13.0

Updates:
- new: Introduced custom_args in TensorConfig for use by custom runners, which
  allows setting up dynamic shapes for Torch-TensorRT compilation
- new: autocast_dtype added to the Torch runner configuration to set the dtype for autocast
- new: ONNX Runtime updated to 1.20 for Python >= 3.10
- new: Use torch.compile path in heuristic search for max batch size
- change: Removed TensorFlow dependencies for nav.jax.optimize
- change: Removed PyTorch dependencies from nav.profile
- change: Collect all Python packages in status instead of filtered list
- change: Use default throughput cutoff threshold for max batch size heuristic when None provided in configuration
- change: Updated default ONNX opset to 20 for Torch >= 2.5
- fix: Exception is raised with Python >=3.11 due to wrong dataclass initialization
- fix: Removed the option from ExportOption that was removed in Torch 2.5
- fix: Improved preprocessing stage in Torch based runners
- fix: Warn when using autocast with bfloat16 in Torch
- fix: Pass runner configuration to runners in nav.profile
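
The max batch size heuristic and its throughput cutoff threshold, mentioned in the updates above, can be illustrated with a small stand-alone sketch. This is not Model Navigator's implementation: `find_max_batch_size`, `DEFAULT_THROUGHPUT_CUTOFF`, its value, and the synthetic `fake_throughput` function are all hypothetical names chosen for illustration. The sketch assumes a doubling search that stops once the relative throughput gain falls below the cutoff, and falls back to a default cutoff when `None` is passed.

```python
from typing import Callable, Optional

# Hypothetical default; the value Model Navigator actually uses is not stated in these notes.
DEFAULT_THROUGHPUT_CUTOFF = 0.05


def find_max_batch_size(
    measure_throughput: Callable[[int], float],
    max_candidate: int = 1024,
    throughput_cutoff: Optional[float] = None,
) -> int:
    """Double the batch size until the relative throughput gain drops below the cutoff."""
    if throughput_cutoff is None:
        # Fall back to the default threshold when the configuration leaves it unset.
        throughput_cutoff = DEFAULT_THROUGHPUT_CUTOFF

    best_batch = 1
    best_throughput = measure_throughput(best_batch)
    batch = 2
    while batch <= max_candidate:
        throughput = measure_throughput(batch)
        gain = (throughput - best_throughput) / best_throughput
        if gain < throughput_cutoff:
            break  # throughput has saturated; larger batches are not worth it
        best_batch, best_throughput = batch, throughput
        batch *= 2
    return best_batch


def fake_throughput(batch_size: int) -> float:
    """Synthetic measurement that saturates near 1000 samples per second."""
    return 1000.0 * batch_size / (batch_size + 8)


if __name__ == "__main__":
    print(find_max_batch_size(fake_throughput))
    print(find_max_batch_size(fake_throughput, throughput_cutoff=0.25))
```

A stricter cutoff stops the search earlier, so the chosen batch size trades a little throughput for lower memory use and latency.
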

Versions of external components used during testing:
- PyTorch 2.6.0a0+df5bbc0
- TensorFlow 2.16.1
- TensorRT 10.6.0.26
- Torch-TensorRT 2.6.0a0
- ONNX Runtime 1.19.2
- Polygraphy 0.49.13
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


10 Sep 12:27

kacper-kleczewski

v0.12.0

bce2922

Triton Model Navigator v0.12.0

Updates:
- new: simple and detailed reporting of the optimization process
- new: adjusted exporting TensorFlow SavedModel for Keras 3.x
- new: inform the user when a wrapped module is not called during optimize
- new: inform the user when a module uses a custom forward function
- new: support for dynamic shapes in Torch ExportedProgram
- new: use ExportedProgram for Torch-TensorRT conversion
- new: support back-off policy during profiling to avoid reporting local minimum
- new: automatically scale conversion batch size when modules have different batch sizes in scope of a single pipeline
- change: TensorRT conversion max batch size search relies on saturating throughput for base formats
- change: adjusted profiling configuration for throughput cutoff search
- change: include the optimized pipeline in the list of examined variants during nav.profile
- change: the performance command is not executed when correctness fails for a format and runtime
- change: verify command is not executed when verify function is not provided
- change: do not create a model copy before executing torch.compile
- fix: pipelines sometimes obtain model and tensors on different devices during nav.profile
- fix: extract graph from ExportedProgram for running inference
- fix: runner configuration not propagated to pre-processing steps
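
The back-off policy item above can be pictured with a minimal sketch. The release notes do not describe the actual policy, so this is only one plausible reading: the search tolerates a limited number of consecutive non-improving measurements before stopping, so a single dip does not end profiling early. The names `profile_with_backoff`, `patience`, and the `SAMPLES` table are hypothetical and not part of the Model Navigator API.

```python
from typing import Callable, Sequence


def profile_with_backoff(
    measure_throughput: Callable[[int], float],
    batch_sizes: Sequence[int],
    patience: int = 2,
) -> int:
    """Return the batch size with the best throughput, tolerating short dips."""
    best_batch = batch_sizes[0]
    best_throughput = measure_throughput(best_batch)
    misses = 0  # consecutive measurements that did not improve on the best
    for batch in batch_sizes[1:]:
        throughput = measure_throughput(batch)
        if throughput > best_throughput:
            best_batch, best_throughput = batch, throughput
            misses = 0
        else:
            misses += 1
            if misses > patience:
                break  # give up only after several non-improving steps in a row
    return best_batch


# Synthetic measurements with a dip at batch size 4.
SAMPLES = {1: 100.0, 2: 180.0, 4: 170.0, 8: 300.0, 16: 290.0, 32: 280.0, 64: 270.0}

if __name__ == "__main__":
    sizes = sorted(SAMPLES)
    print(profile_with_backoff(SAMPLES.__getitem__, sizes, patience=0))
    print(profile_with_backoff(SAMPLES.__getitem__, sizes, patience=2))
```

With `patience=0` the search stops at the first dip and reports the smaller batch size; with `patience=2` it continues past the dip and finds the better result.
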

Versions of external components used during testing:
- PyTorch 2.4.0a0+3bcc3cddb5
- TensorFlow 2.16.1
- TensorRT 10.3.0.26
- Torch-TensorRT 2.4.0.a0
- ONNX Runtime 1.18.1
- Polygraphy 0.49.12
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


05 Aug 12:44

kacper-kleczewski

v0.11.0

ef44a44

Triton Model Navigator v0.11.0

Updates:
- new: Python 3.12 support
- new: Improved logging
- new: optimized in-place module can be stored to Triton model repository
- new: multi-profile support for TensorRT model build and runtime
- new: measure duration of each command executed in optimization pipeline
- new: TensorRT-LLM model store generation for deployment on Triton Inference Server
- change: filter unsupported runners instead of raising an error when running optimize
- change: moved JAX support to the experimental module, with limited support
- change: use autocast=True for Torch based runners
- change: use torch.inference_mode or torch.no_grad context in nav.profile measurements
- change: use multiple strategies to select optimized runtime, defaults to [MaxThroughputAndMinLatencyStrategy, MinLatencyStrategy]
- change: trt_profiles are not set automatically for module when using nav.optimize
- fix: properly revert log level after torch onnx dynamo export
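
The idea of trying several runtime selection strategies in order, as in the strategies item above, can be sketched as a simple fallback chain. Everything below is a hypothetical stand-in: the `Result` record, the two strategy functions, and `select_runtime` are illustrative and are not the library's `MaxThroughputAndMinLatencyStrategy` or `MinLatencyStrategy` classes. The sketch assumes that the first strategy only succeeds when one result is best on both metrics, and that the next strategy is consulted otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Result:
    runtime: str
    throughput: float  # samples per second
    latency_ms: float


Strategy = Callable[[Sequence[Result]], Optional[Result]]


def max_throughput_and_min_latency(results: Sequence[Result]) -> Optional[Result]:
    """Pick a result only if it is best on throughput and latency at the same time."""
    if not results:
        return None
    best_tp = max(results, key=lambda r: r.throughput)
    best_lat = min(results, key=lambda r: r.latency_ms)
    return best_tp if best_tp == best_lat else None


def min_latency(results: Sequence[Result]) -> Optional[Result]:
    """Pick the result with the lowest latency."""
    return min(results, key=lambda r: r.latency_ms) if results else None


def select_runtime(results: Sequence[Result], strategies: Sequence[Strategy]) -> Optional[Result]:
    """Return the first non-empty choice made by the strategies, in order."""
    for strategy in strategies:
        chosen = strategy(results)
        if chosen is not None:
            return chosen
    return None


if __name__ == "__main__":
    measured = [Result("runtime_a", 100.0, 5.0), Result("runtime_b", 200.0, 10.0)]
    print(select_runtime(measured, [max_throughput_and_min_latency, min_latency]))
```

Ordering the strategies from strictest to most permissive means a stricter criterion wins when it can be satisfied, while a looser one still guarantees a selection.
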

Versions of external components used during testing:
- PyTorch 2.4.0a0+07cecf4
- TensorFlow 2.15.0
- TensorRT 10.0.1.6
- Torch-TensorRT 2.4.0.a0
- ONNX Runtime 1.18.1
- Polygraphy 0.49.10
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


26 Jun 19:23

kacper-kleczewski

v0.10.1

95e5a24

Triton Model Navigator v0.10.1

Updates:
- fix: Check if torch 2 is available before doing dynamo cleanup

Versions of external components used during testing:
- PyTorch 2.4.0a0+07cecf4
- TensorFlow 2.15.0
- TensorRT 10.0.1.6
- Torch-TensorRT 2.4.0.a0
- ONNX Runtime 1.18.0
- Polygraphy 0.49.10
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


24 Jun 12:49

piotr-bazan-nv

v0.10.0

b8c265a

Triton Model Navigator v0.10.0

Updates:
- new: inplace nav.Module accepts a batching flag, which overrides the config setting, and a precision argument, which allows setting the appropriate configuration for TensorRT
- new: Allow setting the device when loading optimized modules using nav.load_optimized()
- new: Add support for custom i/o names and dynamic shapes in Torch ONNX Dynamo path
- new: Added nav.bundle.save and nav.bundle.load to save and load optimized models from cache
- change: Improved optimize and profile status in inplace mode
- change: Improved handling defaults for ONNX Dynamo when executing nav.package.optimize
- fix: Maintaining modules device in nav.profile()
- fix: Add support for all precisions for TensorRT in nav.profile()
- fix: Forward method not passed to other inplace modules.

Versions of external components used during testing:
- PyTorch 2.4.0a0+07cecf4
- TensorFlow 2.15.0
- TensorRT 10.0.1.6
- Torch-TensorRT 2.4.0.a0
- ONNX Runtime 1.18.0
- Polygraphy 0.49.10
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


07 May 18:21

kacper-kleczewski

v0.9.0

ce00f21

Triton Model Navigator v0.9.0

Updates:
- new: TensorRT Timing Tactics Cache Management - using timing tactics cache files for optimization performance improvements
- new: Added throughput saturation verification in nav.profile() (enabled by default)
- new: Allow overriding the Inplace cache dir through the MODEL_NAVIGATOR_DEFAULT_CACHE_DIR env variable
- new: inplace nav.Module can now receive a function name to be used instead of __call__ in modules/submodules, which allows customizing modules with non-standard calls
- fix: torch dynamo export and torch dynamo onnx export
- fix: measurement stabilization in nav.profile()
- fix: inplace inference through Torch
- fix: trt_profiles argument handling in ONNX to TRT conversion
- fix: optimal shape configuration for batch size in Inplace API
- change: Disable TensorRT profile builder
- change: nav.optimize() does not override module configuration
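
The MODEL_NAVIGATOR_DEFAULT_CACHE_DIR override listed above follows a common pattern: read an environment variable and fall back to a built-in default when it is unset. The sketch below shows that pattern only. `resolve_cache_dir` and `FALLBACK_CACHE_DIR` are hypothetical, and the library's real default location is not stated in these notes.

```python
import os
import tempfile
from pathlib import Path

ENV_VAR = "MODEL_NAVIGATOR_DEFAULT_CACHE_DIR"

# Hypothetical fallback; the library's actual default cache location is not stated in these notes.
FALLBACK_CACHE_DIR = Path(tempfile.gettempdir()) / "example_navigator_cache"


def resolve_cache_dir() -> Path:
    """Return the cache directory from the environment, or the fallback if unset or empty."""
    value = os.environ.get(ENV_VAR)
    return Path(value) if value else FALLBACK_CACHE_DIR


if __name__ == "__main__":
    print(resolve_cache_dir())
```

Because the variable is read from the process environment, it would need to be set before the process that performs the optimization starts.
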

Known issues and limitations:
- DistilBERT ONNX dynamo export does not support dynamic shapes

Versions of external components used during testing:
- PyTorch 2.3.0a0+6ddf5cf85e
- TensorFlow 2.15.0
- TensorRT 8.6.3
- Torch-TensorRT 2.0.0.dev0
- ONNX Runtime 1.17.1
- Polygraphy 0.49.4
- GraphSurgeon 0.4.6
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


04 Apr 14:04

jkosek

v0.8.1

0c216e0

Triton Model Navigator v0.8.1

Updates:
- fix: Inference with TensorRT when model has input with empty shape
- fix: Using stabilized runners when model has no batching
- fix: Invalid dependencies for cuDNN - review known issues
- fix: Make ONNX Graph Surgeon produce artifacts within protobuf limit (2G)
- change: Remove TensorRTCUDAGraph from default runners
- change: updated ONNX package to 1.16.0

Versions of external components used during testing:
- PyTorch 2.3.0a0+40ec155e58
- TensorFlow 2.15.0
- TensorRT 8.6.3
- Torch-TensorRT 2.0.0.dev0
- ONNX Runtime 1.17.1
- Polygraphy 0.49.4
- GraphSurgeon 0.4.6
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


22 Mar 16:23

kacper-kleczewski

v0.8.0

a681e25

Triton Model Navigator v0.8.0

Updates:
- new: Allow selecting the device for the TensorRT runner
- new: Add device output buffers to the TensorRT runner
- new: nav.profile added for profiling any Python function
- change: API for Inplace optimization (breaking change)
- fix: Passing inputs for Torch to ONNX export
- fix: Parse args to kwargs in torchscript-trace export
- fix: Lower peak memory usage when loading Torch inplace optimized model

Versions of external components used during testing:
- PyTorch 2.3.0a0+ebedce2
- TensorFlow 2.15.0
- TensorRT 8.6.3
- Torch-TensorRT 2.0.0.dev0
- ONNX Runtime 1.17.1
- Polygraphy 0.49.4
- GraphSurgeon 0.4.6
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.


09 Feb 05:43

ptarasiewiczNV

v0.7.7

aa2d21b

Triton Model Navigator v0.7.7

Updates:
- change: Add input and output specs for Triton model repositories generated from packages

Versions of external components used during testing:
- PyTorch 2.2.0a0+81ea7a48
- TensorFlow 2.14.0
- TensorRT 8.6.1
- ONNX Runtime 1.16.2
- Polygraphy 0.49.0
- GraphSurgeon 0.3.27
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the framework containers support matrix for a detailed summary.
