docs: update for alpha 1
decahedron1 committed Nov 22, 2023
1 parent 9ed222f commit dc309c0
Showing 5 changed files with 226 additions and 35 deletions.
56 changes: 53 additions & 3 deletions docs/migrating/v2.mdx
@@ -3,6 +3,29 @@ title: Migrating from v1.x to v2
description: Here's what you need to know to update your application to use the latest & greatest version of ort.
---

## `Environment` no more
The `Environment` struct has been removed. Only one `Environment` was allowed per process, so representing it as a struct didn't make much sense.

To configure the environment, use the `ort::init` function instead, which returns the same `EnvironmentBuilder` as v1.x. Call `commit()` to commit the environment.

```rust
ort::init()
.with_execution_providers([CUDAExecutionProvider::default().build()])
.commit()?;
```

`commit()` must be called before any sessions are created for the configuration to take effect; otherwise, a default environment will be created, and it cannot be modified afterwards.

## Session creation
`SessionBuilder::new(&environment)` has been soft-replaced with `Session::builder()`:

```diff
-// v1.x
-let session = SessionBuilder::new(&environment)?.with_model_from_file("model.onnx")?;
+// v2
+let session = Session::builder()?.with_model_from_file("model.onnx")?;
```

## Session inputs

### `CowArray`/`IxDyn`/`ndarray` no longer required
@@ -17,7 +40,7 @@ One of the biggest usability changes is that the usual pattern of `CowArray::fro
-// v1.x
-let mut tokens = CowArray::from(Array1::from_iter(tokens.iter().cloned()).into_dyn());
+// v2
+let mut tokens = Array1::from_iter(tokens.iter().cloned());
```

Note that in some cases an array is cloned when converting into a `Value`, which may lead to a surprising performance hit. ONNX Runtime does not expose an API to specify the strides of a tensor, so if an array is reshaped before being converted into a `Value`, it must be cloned to make the data contiguous. Specifically:
@@ -62,7 +85,7 @@ In previous versions, `Value::from_array` took an allocator parameter. The alloc
```diff
-// v1.x
-let val = Value::from_array(session.allocator(), &array)?;
+// v2
+let val = Value::from_array(&array)?;
```

@@ -86,7 +109,7 @@ v2.0 adds the groundwork for supporting sequence and maps types in `Value`s, so
```diff
-// v1.x
-let output1 = outputs[0].try_extract::<f32>()?;
+// v2
+let output1 = outputs[0].extract_tensor::<f32>()?;
```

@@ -97,6 +120,33 @@ Just like how inputs can now be specified by name, you can now retrieve session
let l = outputs["latents"].extract_tensor::<f32>()?;
```

## Execution providers
Execution provider structs with public fields have been replaced with builder pattern structs. See the [API reference](https://docs.rs/ort/2.0.0-alpha.1/ort/index.html?search=ExecutionProvider) and the [execution providers reference](/perf/execution-providers) for more information.

```diff
-// v1.x
-builder = builder.with_execution_providers(ExecutionProvider::DirectML(DirectMLExecutionProvider {
- device_id: 1
-}))?;
+// v2
+builder = builder.with_execution_providers([
+ DirectMLExecutionProvider::default()
+ .with_device_id(1)
+ .build()
+])?;
```

## Updated dependencies & features

### `ort::sys` has been split into `ort-sys`
The `ort::sys` module has been split out into [its own `ort-sys` crate](https://crates.io/crates/ort-sys). If you don't use `ort::sys` in your application, this won't affect you.

### `ndarray` is now optional
The dependency on `ndarray` is now declared optional. If you use `ort` with `default-features = false`, you'll need to add the `ndarray` feature.
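For example, a `Cargo.toml` entry for a no-default-features setup that keeps `ndarray` support might look like this (the exact version string is illustrative):

```toml
[dependencies]
ort = { version = "2.0.0-alpha.1", default-features = false, features = [ "ndarray" ] }
```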

## The Flattening
All modules except `download` are no longer public. Exports have been flattened to the crate root, e.g. `ort::session::Session` becomes `ort::Session`.

## Renamed types
The following types have been renamed with no other changes.
- `NdArrayExtensions` -> `ArrayExtensions`
2 changes: 1 addition & 1 deletion docs/migrating/version-mapping.mdx
@@ -6,7 +6,7 @@ description: Information about `ort`'s versioning and relation to ONNX Runtime v
## A note on SemVer
`ort` versions pre-2.0 were not SemVer compatible. From v2.0 onwards, breaking API changes are accompanied by a **major version update**.

Updates to the version of ONNX Runtime used by `ort` may occur on **minor** version updates, i.e. 2.0 ships with ONNX Runtime 1.16.2, but 2.1 may ship with 1.17.0. ONNX Runtime is generally forward compatible, but in case you require a specific version of ONNX Runtime, you should pin the minor version in your `Cargo.toml` using a [tilde requirement](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#tilde-requirements):
```toml
[dependencies]
ort = { version = "~2.0", ... }
155 changes: 155 additions & 0 deletions docs/perf/execution-providers.mdx
@@ -2,3 +2,158 @@
title: Execution providers
description: Learn how to enable execution providers to leverage hardware acceleration.
---

Execution providers (EPs) enable ONNX Runtime to execute ONNX graphs with hardware acceleration. If you have specialized hardware like a GPU or NPU, execution providers can provide a massive performance boost to your `ort` applications. For more information on the intricacies of execution providers, see the [ONNX Runtime docs](https://onnxruntime.ai/docs/execution-providers/).

ONNX Runtime must be compiled with support for each execution provider. pyke provides precompiled binaries for some of the most common EPs. Below is a table showing available EPs, their support in `ort`, and their binary availability status.

| EP | Supported | Binaries | Static linking |
|:-------- |:------- |:------ |:------ |
| NVIDIA CUDA | 🟢 | 🟢 ||
| NVIDIA TensorRT | 🟢 | 🟢 ||
| Microsoft DirectML | 🟢 | 🟢 | 🟢 |
| Apple CoreML | 🟢 | 🟢 | 🟢 |
| AMD ROCm | 🟢 |||
| Intel OpenVINO | 🟢 |||
| Intel oneDNN | 🟢 |||
| XNNPACK | 🟢 | 🟢 | 🟢 |
| Qualcomm QNN | 🟢 |||
| Huawei CANN | 🟢 |||
| Android NNAPI | 🟢 |||
| Apache TVM | 🟢 |||
| Arm ACL | 🟢 |||
| ArmNN | 🟢 |||
| AMD MIGraphX ||||
| AMD Vitis AI ||||
| Microsoft Azure ||||
| Rockchip RKNPU ||||

Some EPs supported by ONNX Runtime are not supported by `ort` due to a lack of hardware for testing. If your preferred EP is missing support and you've got the hardware, please [open an issue](https://github.com/pykeio/ort/issues/new)!

## Registering execution providers
In order to use an execution provider, you must **register** it before creating an environment or session. You can do this via the `SessionBuilder::with_execution_providers` method. For example, to register the CUDA execution provider:

```rust
use ort::{CUDAExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
let session = Session::builder()?
.with_execution_providers([CUDAExecutionProvider::default().build()])?
.with_model_from_file("model.onnx")?;

Ok(())
}
```

You can, of course, specify multiple execution providers. `ort` will register all EPs specified, in order. Even if an EP registers successfully, `ort` will continue to register the rest of the EPs, so that ONNX Runtime can fall back to another EP when one does not support a certain operator.

```rust
use ort::{CoreMLExecutionProvider, CUDAExecutionProvider, DirectMLExecutionProvider, TensorRTExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
let session = Session::builder()?
.with_execution_providers([
// Prefer TensorRT over CUDA.
TensorRTExecutionProvider::default().build(),
CUDAExecutionProvider::default().build(),
// Use DirectML on Windows if NVIDIA EPs are not available
DirectMLExecutionProvider::default().build(),
// Or use ANE on Apple platforms
CoreMLExecutionProvider::default().build()
])?
.with_model_from_file("model.onnx")?;

Ok(())
}
```

## Configuring EPs
EPs have configuration options to control behavior or increase performance. Each `XXXExecutionProvider` struct returns a builder with configuration methods. See the [API reference](https://docs.rs/ort/2.0.0-alpha.1/ort/index.html?search=ExecutionProvider) for the EP structs for more information on which options are supported and what they do.

```rust
use ort::{CoreMLExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
let session = Session::builder()?
.with_execution_providers([
CoreMLExecutionProvider::default()
// this model uses control flow operators, so enable CoreML on subgraphs too
.with_subgraphs()
// only use the ANE as the CoreML CPU implementation is super slow for this model
.with_ane_only()
.build()
])?
.with_model_from_file("model.onnx")?;

Ok(())
}
```

## Fallback behavior
`ort` will silently fail and fall back to executing on the CPU if all execution providers fail to register. In many cases, though, you'll want to show the user an error message when an EP fails to register, or outright abort the process.

If you need to know the status of an EP registration, use `ExecutionProvider::register` on a `SessionBuilder`.

```rust
use ort::{CUDAExecutionProvider, ExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
let builder = Session::builder()?;

let cuda = CUDAExecutionProvider::default();
if cuda.register(&builder).is_err() {
eprintln!("Failed to register CUDA!");
std::process::exit(1);
}

let session = builder.with_model_from_file("model.onnx")?;

Ok(())
}
```

You can also check whether ONNX Runtime is even compiled with support for the execution provider with the `is_available` method.

```rust
use ort::{CoreMLExecutionProvider, ExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
let builder = Session::builder()?;

let coreml = CoreMLExecutionProvider::default();
if !coreml.is_available() {
eprintln!("Please compile ONNX Runtime with CoreML!");
std::process::exit(1);
}

let session = builder.with_model_from_file("model.onnx")?;

Ok(())
}
```

## Global defaults
You can configure `ort` to attempt to register a list of execution providers for all sessions created in an environment.

```rust
use ort::{CUDAExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
ort::init()
.with_execution_providers([CUDAExecutionProvider::default().build()])
.commit()?;

let session = Session::builder()?.with_model_from_file("model.onnx")?;
// The session will attempt to register the CUDA EP
// since we configured the environment default.

Ok(())
}
```

If a session is built `with_execution_providers` after configuring global defaults, the EPs provided by the session will be registered *after* the environment defaults. It does not override the environment EPs.
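The resulting registration order can be pictured as simple list concatenation — environment defaults first, then the session's own EPs (a conceptual sketch with illustrative strings, not `ort` API):

```rust
fn main() {
    // Conceptual sketch: the EPs ort attempts to register for a session are
    // the environment defaults followed by the session-specific EPs.
    let environment_defaults = vec!["CUDA"];
    let session_eps = vec!["DirectML"];

    let registration_order: Vec<&str> = environment_defaults
        .iter()
        .chain(session_eps.iter())
        .copied()
        .collect();

    // Environment defaults are attempted first; session EPs are appended
    // after them, never overriding them.
    assert_eq!(registration_order, ["CUDA", "DirectML"]);
}
```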

Note that `ort::init` must come before you create any sessions, otherwise the configuration will not take effect!

## Troubleshooting
If it seems like the execution provider is not registering properly, or you are not getting acceptable performance, see the [Troubleshooting: Performance](/troubleshooting/performance) page for more information on how to debug any EP issues.
40 changes: 16 additions & 24 deletions docs/setup/platforms.mdx
@@ -1,35 +1,27 @@
---
title: Platform support
description: ONNX Runtime, and by extension `ort`, supports a wide variety of platforms. For most desktop users, pre-built binaries are available, so setting up `ort` is as simple as adding it to your `Cargo.toml`!
---

ONNX Runtime, and by extension `ort`, supports a wide variety of platforms. For some platforms, pre-built binaries are available, so setting up `ort` is as simple as adding it to your `Cargo.toml`!

Binaries may be provided by Microsoft, pyke, or both. Binaries provided by Microsoft are downloaded directly from ONNX Runtime's CI pipeline, though they may include telemetry. Binaries provided by pyke do not include this telemetry.

Here are the supported platforms and binary availability status, as of v2.0.

* 🟢 - Supported. Dynamic & static binaries provided by pyke.
* 🔷 - Supported. Static binaries provided by pyke.
* 🟨 - Supported. Dynamic binaries provided by pyke.
* ⭕ - Supported. Precompiled binaries not available.
* ❌ - Not supported.

| Platform | x86 | x86-64 | ARMv7 | ARM64 | WASM32 |
|:-------- |:------- |:------ |:------ |:------ |:------ |
| **Windows** || 🟢 || 🔷 ||
| **Linux** || 🟢 || 🔷 ||
| **macOS** || 🔷 || 🔷 ||
| **iOS** ||||||
| **Android** ||||||
| **Web** ||||| 🔷 |

Remember that "Dynamic" linking means ONNX Runtime will come as a DLL or `.so` alongside your application. "Static" means ONNX Runtime is linked directly into your application in a single file. If available, `ort` will always prefer linking to static binaries over dynamic binaries.

Currently, the default behavior in `ort` is to only download Microsoft binaries. To enable pyke binaries, enable the `"pyke-parcel"` feature in your `Cargo.toml`.
```toml
[dependencies]
ort = { version = "2.0", features = [ "pyke-parcel" ] }
```

Certain execution providers may not have binaries available. Most EPs only work with dynamic linking. You can check EP binary support in the [execution providers](/perf/execution-providers) documentation.

For platforms marked as ⭕, you'll need to [compile ONNX Runtime from source](https://onnxruntime.ai/docs/build/).
8 changes: 1 addition & 7 deletions src/lib.rs
@@ -36,13 +36,7 @@ pub use self::environment::{init, EnvironmentBuilder};
#[cfg_attr(docsrs, doc(cfg(feature = "fetch-models")))]
pub use self::error::FetchModelError;
pub use self::error::{Error, ErrorInternal, Result};
pub use self::execution_providers::*;
pub use self::io_binding::IoBinding;
pub use self::memory::{AllocationDevice, Allocator, MemoryInfo};
pub use self::metadata::ModelMetadata;
