Monitoring TorchServe custom metrics with the mtail metrics exporter and Prometheus

In this example, we show how to use a pre-trained custom MNIST model and export custom metrics using mtail and Prometheus.

We used the following PyTorch example to train the basic MNIST model for digit recognition: https://github.com/pytorch/examples/tree/master/mnist

Run the commands given in the following steps from the parent directory of the root of the repository. For example, if you cloned the repository into /home/my_path/serve, run the steps from /home/my_path.

Steps

  • Step 1: In this example we introduce a new custom metric, SizeOfImage, in the custom handler and export it using mtail.

    def preprocess(self, data):
        metrics = self.context.metrics
        input = data[0].get('body')
        # Record the request payload size in kilobytes as a custom metric.
        metrics.add_size('SizeOfImage', len(input) / 1024, None, 'kB')
        return ImageClassifier.preprocess(self, data)

    Refer: Custom Metrics, Custom Handler
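The size computation above can be exercised outside TorchServe with a minimal stand-in for the metrics object. FakeMetrics below is a hypothetical recorder for illustration only, not part of the TorchServe API:

```python
# Stand-in for TorchServe's Metrics object, so the preprocess() logic
# can be sanity-checked without a running server.
class FakeMetrics:
    def __init__(self):
        self.records = []

    def add_size(self, name, value, idx, unit):
        # TorchServe's real add_size writes a line to model_metrics.log;
        # here we just record the call.
        self.records.append((name, round(value, 2), unit))

payload = b"\x00" * 2048          # a hypothetical 2 KiB request body
metrics = FakeMetrics()
metrics.add_size("SizeOfImage", len(payload) / 1024, None, "kB")
print(metrics.records)            # [('SizeOfImage', 2.0, 'kB')]
```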

  • Step 2: Create a torch model archive using the torch-model-archiver utility to archive the above files.

    torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler examples/custom_metrics/mnist_handler.py
  • Step 3: Register the model on TorchServe using the above model archive file.

    mkdir model_store
    mv mnist.mar model_store/
    torchserve --start --model-store model_store --models mnist=mnist.mar
  • Step 4: Install mtail

    wget https://github.com/google/mtail/releases/download/v3.0.0-rc47/mtail_3.0.0-rc47_Linux_x86_64.tar.gz
    tar -xvzf mtail_3.0.0-rc47_Linux_x86_64.tar.gz
    chmod +x mtail
  • Step 5: Create an mtail program. In this example we use a program to export the default custom metrics.

    Refer: mtail Programming Guide.

  • Step 6: Start the mtail exporter by running the command below

    ./mtail --progs examples/custom_metrics/torchserve_custom.mtail --logs logs/model_metrics.log

    The mtail program parses the log file, extracts information by matching patterns, and presents it as JSON, Prometheus metrics, and other formats. See https://google.github.io/mtail/Interoperability.html
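As a rough illustration of the pattern matching mtail performs, the sketch below applies a regex to a hypothetical metric log line. The line format is an approximation; check your own model_metrics.log for the exact layout TorchServe emits:

```python
import re

# A hypothetical model_metrics.log line (format approximated).
line = "SizeOfImage.kB:3.25|#ModelName:mnist|#hostname:host01,1625140800"

# mtail programs match lines with patterns like this and export the
# captured value as a gauge or counter.
m = re.match(r"(?P<name>\w+)\.(?P<unit>\w+):(?P<value>[0-9.]+)", line)
print(m.group("name"), float(m.group("value")), m.group("unit"))
```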

  • Step 7: Make Inference request

    curl http://127.0.0.1:8080/predictions/mnist -T examples/image_classifier/mnist/test_data/0.png

    The inference request logs the time taken for the prediction to the model_metrics.log file. mtail parses the file and serves the extracted metrics on port 3903:

    http://localhost:3903
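What mtail serves on port 3903 is Prometheus text exposition format. The snippet below pulls one such line apart; the metric name and label shown are assumptions for illustration, not necessarily what this mtail program emits:

```python
# A hypothetical sample line in Prometheus text exposition format.
exposition = 'SizeOfImage{prog="torchserve_custom.mtail"} 3.25'

# Split "name{labels} value" into its parts.
name, rest = exposition.split("{", 1)
labels, value = rest.rsplit("} ", 1)
print(name, labels, float(value))
```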

  • Step 8: Start Prometheus with the mtail target added to the scrape config

    • Download Prometheus

    • Add the mtail target to the scrape config in the config file

    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: "prometheus"
    
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
    
        static_configs:
          - targets: ["localhost:9090", "localhost:3903"]
    • Start Prometheus with config file
    ./prometheus --config.file prometheus.yml

    The metrics exported by mtail on port 3903 are scraped by Prometheus.
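Once Prometheus is scraping, the metric can be checked through its HTTP query API. A minimal sketch of building such a query URL, assuming the exported metric name is SizeOfImage (it depends on the mtail program):

```python
from urllib.parse import urlencode

# Prometheus exposes instant queries at /api/v1/query with a `query` parameter.
base = "http://localhost:9090/api/v1/query"
url = base + "?" + urlencode({"query": "SizeOfImage"})
print(url)  # http://localhost:9090/api/v1/query?query=SizeOfImage
```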