torch compile config standardization update (#3166)
* torch.compile config update

* torch.compile config update

* yaml test files

* yaml test files

* Fixed regression failure

* Fixed regression failure

* Fixed regression failure

* Workaround for regression failure

* Workaround for regression failure

* Workaround for regression failure

* skipping torchtext test

* Update test_example_torch_compile.py

* Update test_torch_compile.py

* Rename toy_model.py to model.py

* Update test_torch_compile.py

* Update test_torch_compile.py

* Addressed review comments

* Addressed review comments
agunapal authored Jun 11, 2024
1 parent 3d17a94 commit d29059f
Showing 14 changed files with 198 additions and 44 deletions.
6 changes: 5 additions & 1 deletion examples/image_classifier/resnet_18/README.md
@@ -23,7 +23,11 @@ Ex: `cd examples/image_classifier/resnet_18`
In this example, we use the following config:

```
echo "pt2 : {backend: inductor, mode: reduce-overhead}" > model-config.yaml
echo "pt2:
compile:
enable: True
backend: inductor
mode: reduce-overhead" > model-config.yaml
```

##### Sample commands to create a Resnet18 torch.compile model archive, register it on TorchServe and run image prediction
14 changes: 10 additions & 4 deletions examples/pt2/README.md
@@ -16,16 +16,22 @@ pip install torchserve-nightly torch-model-archiver-nightly

## torch.compile

PyTorch 2.x supports several compiler backends and you pick which one you want by passing in an optional file `model_config.yaml` during your model packaging
PyTorch 2.x supports several compiler backends and you pick which one you want by passing in an optional file `model_config.yaml` during your model packaging. The default backend with the minimum config below is `inductor`.

```yaml
pt2: "inductor"
pt2:
  compile:
    enable: True
```
You can also pass a dictionary with compile options if you need more control over torch.compile:
You can also pass various compile options if you need more control over torch.compile:
```yaml
pt2 : {backend: inductor, mode: reduce-overhead}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: reduce-overhead
```
An example of using `torch.compile` can be found [here](./torch_compile/README.md)
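For orientation, here is a minimal sketch of how a handler can act on this block, assuming the parsed yaml is exposed to the handler context as `ctx.model_yaml_config` (the helper name `maybe_compile` is hypothetical, for illustration only):

```python
import torch

def maybe_compile(model, ctx):
    # Read the standardized pt2.compile block from the model's yaml config
    compile_cfg = ctx.model_yaml_config.get("pt2", {}).get("compile", {})
    if not compile_cfg.get("enable", False):
        # Compilation is opt-in: without enable: True the model stays eager
        return model
    # Forward the remaining keys (backend, mode, ...) to torch.compile
    kwargs = {k: v for k, v in compile_cfg.items() if k != "enable"}
    return torch.compile(model, **kwargs)
```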
14 changes: 10 additions & 4 deletions examples/pt2/torch_compile/README.md
@@ -19,7 +19,9 @@ Ex: `cd examples/pt2/torch_compile`
In this example, we use the following config:

```
echo "pt2 : {backend: inductor, mode: reduce-overhead}" > model-config.yaml
echo "pt2:
compile:
enable: True" > model-config.yaml
```

### Create model archive
@@ -76,9 +78,13 @@ After a few iterations of warmup, we see the following
#### Measure inference time with `torch.compile`

```
echo "pt2: {backend: inductor, mode: reduce-overhead}" > model-config.yaml && \
echo "handler:" >> model-config.yaml && \
echo " profile: true" >> model-config.yaml
echo "pt2:
compile:
enable: True
backend: inductor
mode: reduce-overhead" > model-config.yaml && \
echo "handler:
profile: true" >> model-config.yaml
```

Once the `yaml` file is updated, create the model-archive, start TorchServe and run inference using the steps shown above.
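For reference, a typical archive/serve/predict sequence for this example might look like the following; the file and model names are assumed from the ResNet-18 based example and may differ from the collapsed steps above:

```bash
# Package the model together with the updated model-config.yaml (names assumed)
torch-model-archiver --model-name resnet-18 --version 1.0 \
  --model-file model.py --serialized-file resnet18-f37072fd.pth \
  --handler image_classifier --config-file model-config.yaml
mkdir -p model_store && mv resnet-18.mar model_store/

# Start TorchServe and run a sample prediction
torchserve --start --ncs --model-store model_store --models resnet-18.mar
curl http://127.0.0.1:8080/predictions/resnet-18 -T kitten.jpg
```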
6 changes: 5 additions & 1 deletion examples/pt2/torch_compile/model-config.yaml
@@ -1 +1,5 @@
pt2 : {backend: inductor, mode: reduce-overhead}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: reduce-overhead
21 changes: 17 additions & 4 deletions examples/pt2/torch_compile_openvino/README.md
@@ -36,15 +36,21 @@ In this example, we use the following config:
```bash
echo "minWorkers: 1
maxWorkers: 2
pt2: {backend: openvino}" > model-config.yaml
pt2:
  compile:
    enable: True
    backend: openvino" > model-config.yaml
```

If you want to measure the handler `preprocess`, `inference`, `postprocess` times, use the following config:

```bash
echo "minWorkers: 1
maxWorkers: 2
pt2: {backend: openvino}
pt2:
  compile:
    enable: True
    backend: openvino
handler:
  profile: true" > model-config.yaml
```
@@ -132,7 +138,11 @@ Update the model-config.yaml file to specify the Inductor backend:
```bash
echo "minWorkers: 1
maxWorkers: 2
pt2: {backend: inductor, mode: reduce-overhead}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: reduce-overhead
handler:
  profile: true" > model-config.yaml
```
@@ -153,7 +163,10 @@ Update the model-config.yaml file to specify the OpenVINO backend:
```bash
echo "minWorkers: 1
maxWorkers: 2
pt2: {backend: openvino}
pt2:
  compile:
    enable: True
    backend: openvino
handler:
  profile: true" > model-config.yaml
```
12 changes: 10 additions & 2 deletions examples/pt2/torch_inductor_caching/README.md
@@ -41,7 +41,11 @@ Ex: `cd examples/pt2/torch_inductor_caching`
In this example, we use the following config:

```yaml
pt2 : {backend: inductor, mode: max-autotune}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: max-autotune
```
### Create model archive
@@ -126,7 +130,11 @@ Ex: `cd examples/pt2/torch_inductor_caching`
In this example, we use the following config:

```yaml
pt2 : {backend: inductor, mode: max-autotune}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: max-autotune
```
### Create model archive
@@ -1,7 +1,11 @@
minWorkers: 4
maxWorkers: 4
responseTimeout: 600
pt2 : {backend: inductor, mode: max-autotune}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: max-autotune
handler:
  torch_inductor_caching:
    torch_inductor_cache_dir: "/home/ubuntu/serve/examples/pt2/torch_inductor_caching/cache"
@@ -1,7 +1,11 @@
minWorkers: 4
maxWorkers: 4
responseTimeout: 600
pt2 : {backend: inductor, mode: max-autotune}
pt2:
  compile:
    enable: True
    backend: inductor
    mode: max-autotune
handler:
  torch_inductor_caching:
    torch_inductor_fx_graph_cache: true
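As a rough sketch of what these handler keys map to, assuming the handler reads them from `ctx.model_yaml_config` (the helper below is hypothetical; `TORCHINDUCTOR_CACHE_DIR` and `torch._inductor.config.fx_graph_cache` are the underlying inductor knobs):

```python
import os
import torch

def apply_inductor_caching(ctx):
    # Hypothetical helper: map this example's yaml keys onto inductor's caches
    cfg = ctx.model_yaml_config.get("handler", {}).get("torch_inductor_caching", {})
    if "torch_inductor_cache_dir" in cfg:
        # Point inductor's on-disk cache at a persistent, shared directory
        os.environ["TORCHINDUCTOR_CACHE_DIR"] = cfg["torch_inductor_cache_dir"]
    if cfg.get("torch_inductor_fx_graph_cache", False):
        # Reuse previously compiled FX graphs across workers and restarts
        torch._inductor.config.fx_graph_cache = True
```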
3 changes: 3 additions & 0 deletions test/pytest/test_data/torch_compile/pt2_enable_default.yaml
@@ -0,0 +1,3 @@
pt2:
  compile:
    enable: True
5 changes: 5 additions & 0 deletions test/pytest/test_data/torch_compile/pt2_enable_false.yaml
@@ -0,0 +1,5 @@
pt2:
  compile:
    enable: False
    backend: inductor
    mode: reduce-overhead
5 changes: 5 additions & 0 deletions test/pytest/test_data/torch_compile/pt2_enable_true.yaml
@@ -0,0 +1,5 @@
pt2:
  compile:
    enable: True
    backend: inductor
    mode: reduce-overhead
41 changes: 22 additions & 19 deletions test/pytest/test_example_torch_compile.py
@@ -1,4 +1,5 @@
import os
import sys
from pathlib import Path

import pytest
@@ -31,34 +32,36 @@
EXPECTED_RESULTS = ["tabby", "tiger_cat", "Egyptian_cat", "lynx", "plastic_bag"]


@pytest.fixture
def custom_working_directory(tmp_path):
    # Set the custom working directory
    custom_dir = tmp_path / "model_dir"
    custom_dir.mkdir()
    os.chdir(custom_dir)
    yield custom_dir
    # Clean up and return to the original working directory
    os.chdir(tmp_path)
@pytest.fixture(scope="function")
def chdir_example(monkeypatch):
    # Change directory to example directory
    monkeypatch.chdir(EXAMPLE_ROOT_DIR)
    monkeypatch.syspath_prepend(EXAMPLE_ROOT_DIR)
    yield

    # Teardown
    monkeypatch.undo()

@pytest.mark.skipif(PT2_AVAILABLE == False, reason="torch version is < 2.0")
@pytest.mark.skip(reason="Skipping as its causing other testcases to fail")
def test_torch_compile_inference(monkeypatch, custom_working_directory):
    monkeypatch.syspath_prepend(EXAMPLE_ROOT_DIR)
    # Get the path to the custom working directory
    model_dir = custom_working_directory
    # Delete imported model
    model = MODEL_FILE.split(".")[0]
    if model in sys.modules:
        del sys.modules[model]

    try_and_handle(
        f"wget https://download.pytorch.org/models/{MODEL_PTH_FILE} -P {model_dir}"
    )

@pytest.mark.skipif(PT2_AVAILABLE == False, reason="torch version is < 2.0")
def test_torch_compile_inference(chdir_example):
    # Download weights
    if not os.path.isfile(EXAMPLE_ROOT_DIR.joinpath(MODEL_PTH_FILE)):
        try_and_handle(
            f"wget https://download.pytorch.org/models/{MODEL_PTH_FILE} -P {EXAMPLE_ROOT_DIR}"
        )

    # Handler for Image classification
    handler = ImageClassifier()

    # Context definition
    ctx = MockContext(
        model_pt_file=model_dir.joinpath(MODEL_PTH_FILE),
        model_pt_file=MODEL_PTH_FILE,
        model_dir=EXAMPLE_ROOT_DIR.as_posix(),
        model_file=MODEL_FILE,
        model_yaml_config_file=MODEL_YAML_CFG_FILE,
76 changes: 74 additions & 2 deletions test/pytest/test_torch_compile.py
@@ -3,12 +3,16 @@
import os
import platform
import subprocess
import sys
import time
from pathlib import Path

import pytest
import torch
from pkg_resources import packaging
from test_data.torch_compile.compile_handler import CompileHandler

from ts.torch_handler.unit_tests.test_utils.mock_context import MockContext

PT_2_AVAILABLE = (
    True
@@ -20,15 +24,42 @@
CURR_FILE_PATH = Path(__file__).parent
TEST_DATA_DIR = os.path.join(CURR_FILE_PATH, "test_data", "torch_compile")

MODEL_FILE = os.path.join(TEST_DATA_DIR, "model.py")
MODEL = "model.py"
MODEL_FILE = os.path.join(TEST_DATA_DIR, MODEL)
HANDLER_FILE = os.path.join(TEST_DATA_DIR, "compile_handler.py")
YAML_CONFIG_STR = os.path.join(TEST_DATA_DIR, "pt2.yaml")  # backend as string
YAML_CONFIG_DICT = os.path.join(TEST_DATA_DIR, "pt2_dict.yaml")  # arbitrary kwargs dict
YAML_CONFIG_ENABLE = os.path.join(
    TEST_DATA_DIR, "pt2_enable_true.yaml"
)  # compile enabled, with backend and mode
YAML_CONFIG_ENABLE_FALSE = os.path.join(
    TEST_DATA_DIR, "pt2_enable_false.yaml"
)  # compile explicitly disabled
YAML_CONFIG_ENABLE_DEFAULT = os.path.join(
    TEST_DATA_DIR, "pt2_enable_default.yaml"
)  # compile enabled with default options


SERIALIZED_FILE = os.path.join(TEST_DATA_DIR, "model.pt")
MODEL_STORE_DIR = os.path.join(TEST_DATA_DIR, "model_store")
MODEL_NAME = "half_plus_two"
EXPECTED_RESULT = 3.5


@pytest.fixture(scope="function")
def chdir_example(monkeypatch):
    # Change directory to example directory
    monkeypatch.chdir(TEST_DATA_DIR)
    monkeypatch.syspath_prepend(TEST_DATA_DIR)
    yield

    # Teardown
    monkeypatch.undo()

    # Delete imported model
    model = MODEL.split(".")[0]
    if model in sys.modules:
        del sys.modules[model]


@pytest.mark.skipif(
@@ -119,7 +150,6 @@ def _response_to_tuples(response_str):
        os.environ.get("TS_RUN_IN_DOCKER", False),
        reason="Test to be run outside docker",
    )
    @pytest.mark.skip(reason="Test failing on regression runner")
    def test_serve_inference(self):
        request_data = {"instances": [[1.0], [2.0], [3.0]]}
        request_json = json.dumps(request_data)
@@ -146,3 +176,45 @@ def test_serve_inference(self):
"Compiled model with backend inductor, mode reduce-overhead"
in model_log
)

    @pytest.mark.parametrize(
        ("compile"), ("disabled", "enabled", "enabled_reduce_overhead")
    )
    def test_compile_inference_enable_options(self, chdir_example, compile):
        # Reset dynamo
        torch._dynamo.reset()

        # Handler
        handler = CompileHandler()

        if compile == "enabled":
            model_yaml_config_file = YAML_CONFIG_ENABLE_DEFAULT
        elif compile == "disabled":
            model_yaml_config_file = YAML_CONFIG_ENABLE_FALSE
        elif compile == "enabled_reduce_overhead":
            model_yaml_config_file = YAML_CONFIG_ENABLE

        # Context definition
        ctx = MockContext(
            model_pt_file=SERIALIZED_FILE,
            model_dir=TEST_DATA_DIR,
            model_file=MODEL,
            model_yaml_config_file=model_yaml_config_file,
        )

        torch.manual_seed(42 * 42)
        handler.initialize(ctx)
        handler.context = ctx

        # Check that model is compiled using dynamo
        if compile == "enabled" or compile == "enabled_reduce_overhead":
            assert isinstance(handler.model, torch._dynamo.OptimizedModule)
        else:
            assert not isinstance(handler.model, torch._dynamo.OptimizedModule)

        # Data for testing
        data = {"body": {"instances": [[1.0], [2.0], [3.0]]}}

        result = handler.handle([data], ctx)

        assert result[0] == EXPECTED_RESULT
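To exercise just the new enable/disable matrix locally, a pytest keyword filter on the test name above should work:

```bash
pytest test/pytest/test_torch_compile.py -k test_compile_inference_enable_options
```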