Migrate to new Task and controller architecture
Issue: #136. Before the initial release we are consolidating the controller and CRD architectures. This commit also polishes the CLI.
Showing 113 changed files with 8,378 additions and 5,969 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
[workspace] | ||
members = ["controller", "service", "cli", "web", "lib"] | ||
members = ["controller", "service", "cli", "web", "lib" ] | ||
resolver = "2" | ||
|
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Welcome to MkDocs | ||
|
||
For full documentation visit [mkdocs.org](https://www.mkdocs.org). | ||
|
||
## Commands | ||
|
||
* `mkdocs new [dir-name]` - Create a new project. | ||
* `mkdocs serve` - Start the live-reloading docs server. | ||
* `mkdocs build` - Build the documentation site. | ||
* `mkdocs -h` - Print help message and exit. | ||
|
||
## Project layout | ||
|
||
    mkdocs.yml    # The configuration file. | ||
    docs/ | ||
        index.md  # The documentation homepage. | ||
        ...       # Other markdown pages, images and other files. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,172 @@ | ||
# Guides | ||
|
||
## From zero to live model | ||
|
||
**This guide is focused on using AME.** If you are looking for a deployment guide, go [here](todo). | ||
|
||
|
||
|
||
This guide walks through going from zero to having a model served through the [V2 inference protocol](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/v2-protocol.html). | ||
It is split into multiple sub-steps, each of which can be read in isolation if you are only looking for a guide to that specific step. | ||
|
||
Almost any Python project should be usable, but if you want to follow along with the exact project used in this guide, clone [this]() repo. | ||
|
||
### Set up the CLI | ||
|
||
Before we can initialise an AME project we need to install the AME [CLI](todo) and connect it to an AME instance. | ||
|
||
TODO describe installation | ||
|
||
### Initialising AME in your project | ||
|
||
The first step will be creating an `ame.yaml` file in the project directory. | ||
|
||
This is easiest to do with the AME [CLI]() by running `ame project init`. The [CLI]() will ask for a project name and then produce a file | ||
that looks like this: | ||
|
||
```yaml | ||
projectName: sklearn_logistic_regression | ||
``` | ||
### The first training | ||
Not very exciting, but it is a start. Next we want to set up our model to be run by AME. The most important thing here is the Task that will train the model, so | ||
let's start with that. | ||
Here we need to consider a few things: what command is used to train the model, how dependencies are managed in our project, what Python version we need and | ||
how many resources our model training requires. | ||
If you are using the [repo]() for this guide, you will want a task configured as below. | ||
```yaml | ||
|
||
projectid: sklearn_logistic_regression | ||
tasks: | ||
  - name: training | ||
    executor: | ||
      !poetry | ||
      pythonVersion: 3.11 | ||
      command: python train.py | ||
    resources: | ||
      memory: 10G | ||
      cpu: 4 | ||
      storage: 30G | ||
      nvidia.com/gpu: 1 | ||
``` | ||
## Your first Task | ||
[`Tasks`](TODO) are an important building block for AME. This guide will walk you through the basics of constructing and running a [`Task`](todo). | ||
|
||
We assume that the AME [CLI](todo) is set up and connected to an AME instance. If not, see this [guide](todo). | ||
|
||
Before we can run a task we must have a project set up. To initialise a project, run the commands shown below, replacing myproject with the | ||
path to your project. | ||
|
||
```sh | ||
cd myproject | ||
ame init | ||
``` | ||
|
||
Now you should have an AME file, `ame.yaml`, inside your project: | ||
```yaml | ||
name: myproject | ||
``` | ||
|
||
Not very exciting yet. Next we want to add a Task to this file so we can run it. | ||
Update your file to match the changes shown below. | ||
|
||
```yaml | ||
name: myproject | ||
tasks: | ||
  - name: training | ||
    executor: | ||
      !poetry | ||
      pythonVersion: 3.11 | ||
      command: python train.py | ||
    resources: | ||
      memory: 2G | ||
      cpu: 2 | ||
      storage: 10G | ||
``` | ||
|
||
Here we add a list of tasks for our project, containing a single `Task` called training. Let's look at the anatomy of training. | ||
|
||
First we set the name with `name: training`, which is standard YAML. Next we set the [executor](todo). This syntax might seem a bit confusing | ||
if you have not used this YAML feature before. `!poetry` adds a tag to the executor indicating the executor type. In this case we are | ||
using the poetry executor. It requires two fields to be set: the Python version and the command to run. This tells AME how to execute the [`Task`](todo). | ||
|
||
Finally, we set the required resources: 2G of RAM, 2 CPU threads and 10G of storage. | ||
|
||
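Editorial note, not part of this commit: the resource values above use short suffixes such as `G`. The commit does not show how AME interprets them. As a sketch only, assuming they follow the Kubernetes quantity convention (decimal `G` = 10^9, binary `Gi` = 2^30), a hypothetical helper `to_bytes` could convert them like this:

```python
import re

# Assumption: AME resource strings follow Kubernetes-style quantity suffixes.
# Decimal suffixes are powers of 1000, binary suffixes are powers of 1024.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}

_QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as "2G" or "512Mi" to a number of bytes."""
    match = _QUANTITY.match(quantity.strip())
    if match is None or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(float(match.group(1)) * _SUFFIXES[match.group(2)])


if __name__ == "__main__":
    # Values taken from the Task definition in this guide.
    for field, value in {"memory": "2G", "storage": "10G"}.items():
        print(f"{field}: {value} = {to_bytes(value)} bytes")
```

Under that assumption `2G` is slightly less than `2Gi`, which can matter when a training job sits close to its memory limit.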
To run the task we can use the CLI: | ||
```sh | ||
ame task run | ||
``` | ||
|
||
|
||
|
||
## Validating models before deployment | ||
|
||
To ensure that new model versions perform well before they are exposed, AME supports model validation. This is done by providing AME with a `Task` which | ||
will succeed if the model passes validation and fail if not. | ||
|
||
Example from [ame-demo](https://github.com/TeaInSpace/ame-demo): | ||
|
||
```yaml | ||
projectid: sklearn_logistic_regression | ||
models: | ||
  - name: logreg | ||
    type: mlflow | ||
    validationTask: # the validation task is set here. | ||
      taskRef: mlflow_validation | ||
    training: | ||
      task: | ||
        taskRef: training | ||
    deployment: | ||
      auto_train: true | ||
      deploy: true | ||
      enable_tls: false | ||
tasks: | ||
  - name: training | ||
    projectid: sklearn_logistic_regression | ||
    templateRef: shared-templates.logistic_reg_template | ||
    taskType: Mlflow | ||
  - name: mlflow_validation | ||
    projectid: sklearn_logistic_regression | ||
    runcommand: python validate.py | ||
``` | ||
|
||
This approach allows a lot of flexibility in how models are validated, at the cost of writing the validation yourself. In the future AME will also provide built-in options for common validation configurations, see the [roadmap](todo). | ||
|
||
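Editorial note, not part of this commit: in the example configuration, a model points at its tasks by name through `taskRef`. How AME resolves these references is not shown here. As a rough sketch, the hypothetical helper `unresolved_task_refs` below checks that every `taskRef` names a task defined in the same file. The dictionary mirrors the relevant parts of the YAML above.

```python
# Assumption: a taskRef must match the name of a task in the same AME file.
CONFIG = {
    "projectid": "sklearn_logistic_regression",
    "models": [
        {
            "name": "logreg",
            "validationTask": {"taskRef": "mlflow_validation"},
            "training": {"task": {"taskRef": "training"}},
        }
    ],
    "tasks": [
        {"name": "training"},
        {"name": "mlflow_validation"},
    ],
}


def unresolved_task_refs(config: dict) -> list:
    """List every model taskRef that does not match a defined task name."""
    task_names = {task["name"] for task in config.get("tasks", [])}
    missing = []
    for model in config.get("models", []):
        refs = {
            "validationTask": model.get("validationTask", {}).get("taskRef"),
            "training": model.get("training", {}).get("task", {}).get("taskRef"),
        }
        for field, ref in refs.items():
            if ref is not None and ref not in task_names:
                missing.append(f"{model['name']}.{field} -> {ref}")
    return missing


if __name__ == "__main__":
    problems = unresolved_task_refs(CONFIG)
    print(problems if problems else "all task references resolve")
```

A check like this catches a misspelled task name before the configuration is submitted, rather than when validation is first triggered.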
### Using MLflow metrics | ||
|
||
Here we will walk through how to validate a model based on metrics recorded in MLflow, using the [ame-demo](https://github.com/TeaInSpace/ame-demo) repository as an example. The model is a simple logistic regression; the training code looks like this: | ||
|
||
```python | ||
import numpy as np | ||
from sklearn.linear_model import LogisticRegression | ||
import mlflow | ||
import mlflow.sklearn | ||
import os | ||
if __name__ == "__main__": | ||
    X = np.array([-2, -1, 0, 1, 2, 1]).reshape(-1, 1) | ||
    y = np.array([0, 0, 1, 1, 1, 0]) | ||
    lr = LogisticRegression() | ||
    lr.fit(X, y) | ||
    score = lr.score(X, y) | ||
    print("Score: %s" % score) | ||
    mlflow.log_metric("score", score) | ||
    mlflow.sklearn.log_model(lr, "model", registered_model_name="logreg") | ||
    print("Model saved in run %s" % mlflow.active_run().info.run_uuid) | ||
``` | ||
|
||
Notice how the score is logged as a metric. We can use that in our validation. | ||
|
||
AME exposes the necessary environment variables to running tasks, so we can access the MLflow instance during validation just by using the MLflow library. | ||
|
||
```python | ||
TODO | ||
``` |
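Editorial note, not part of this commit: the validation example above is still a `TODO`. The following is only a sketch of what a `validate.py` could look like, not the project's implementation. It assumes that the task's exit status decides success or failure, that the tracking server is reachable through MLflow's standard `MLFLOW_TRACKING_URI` variable, and that a minimum score is supplied through a hypothetical `MIN_SCORE` variable. The model and metric names come from the training script.

```python
import os

MODEL_NAME = "logreg"   # registered_model_name used by the training script
METRIC_NAME = "score"   # metric logged by the training script
# Hypothetical knob: AME does not define a minimum score; 0.8 is illustrative.
MIN_SCORE = float(os.environ.get("MIN_SCORE", "0.8"))


def passes_validation(score, threshold):
    """Return True when a recorded score exists and meets the threshold."""
    return score is not None and score >= threshold


def latest_score(model_name, metric_name):
    """Fetch the metric from the run that produced the newest model version."""
    # Imported lazily so the threshold logic can be tested without MLflow.
    from mlflow.tracking import MlflowClient

    client = MlflowClient()  # resolves the tracking URI from the environment
    versions = client.search_model_versions(f"name='{model_name}'")
    if not versions:
        return None
    newest = max(versions, key=lambda v: int(v.version))
    return client.get_run(newest.run_id).data.metrics.get(metric_name)


def main():
    score = latest_score(MODEL_NAME, METRIC_NAME)
    print(f"{METRIC_NAME} for latest {MODEL_NAME} version: {score}")
    # Assumption: a non-zero exit status marks the validation Task as failed.
    return 0 if passes_validation(score, MIN_SCORE) else 1
```

To use it as the `runcommand` of the validation task, the file would end with `raise SystemExit(main())`. Keeping the threshold check in a separate function makes it testable without an MLflow server.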
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
<h1>Models</h1> | ||
<p>Models are one of AME's higher-level constructs; see what that means <a href="">here</a>. If you are configuring how a model should be trained, deployed, monitored or validated, this is the right place. | ||
Models exist in an AME file alongside Datasets, Tasks and Templates.</p> | ||
<h3>Model training</h3> | ||
<p>Model training is configured using a <a href="./task.html">Task</a>.</p> | ||
<p>AME can be deployed with an MLflow instance, which is exposed to the training Task, allowing for simple storage and retrieval of models and metrics.</p> | ||
<pre lang="yaml" style="background-color:#2b303b;"><code><span style="color:#65737e;"># main project ame.yml | ||
</span><span style="color:#bf616a;">project</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">xgboost_project | ||
</span><span style="color:#bf616a;">models</span><span style="color:#c0c5ce;">: | ||
</span><span style="color:#c0c5ce;"> - </span><span style="color:#bf616a;">name</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">product_recommendor | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">training</span><span style="color:#c0c5ce;">: | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">task</span><span style="color:#c0c5ce;">: | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">taskRef</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">train_my_model | ||
</span><span style="color:#bf616a;">tasks</span><span style="color:#c0c5ce;">: | ||
</span><span style="color:#c0c5ce;"> - </span><span style="color:#bf616a;">name</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">train_my_model | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">fromTemplate</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">shared_templates.xgboost_resources | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">executor</span><span style="color:#c0c5ce;">: | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#b48ead;">!poetry | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">pythonVersion</span><span style="color:#c0c5ce;">: </span><span style="color:#d08770;">3.11 | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">command</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">python train.py | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">resources</span><span style="color:#c0c5ce;">: | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">memory</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">10G | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">cpu</span><span style="color:#c0c5ce;">: </span><span style="color:#d08770;">4 | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">storage</span><span style="color:#c0c5ce;">: </span><span style="color:#a3be8c;">30G | ||
</span><span style="color:#c0c5ce;"> </span><span style="color:#bf616a;">nvidia.com/gpu</span><span style="color:#c0c5ce;">: </span><span style="color:#d08770;">1 | ||
</span></code></pre> | ||
<h3>Model deployment</h3> | ||
<h4>Model validation</h4> | ||
<h4>Model monitoring</h4> | ||
<h3>Batch inference</h3> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# Models | ||
|
||
Models are one of AME's higher-level constructs; see what that means [here](). If you are configuring how a model should be trained, deployed, monitored or validated, this is the right place. | ||
Models exist in an AME file alongside Datasets, Tasks and Templates. | ||
|
||
### Model training | ||
|
||
Model training is configured using a [Task](tasks.md). | ||
|
||
AME can be deployed with an MLflow instance, which is exposed to the training Task, allowing for simple storage and retrieval of models and metrics. | ||
|
||
|
||
```yaml | ||
# main project ame.yml | ||
project: xgboost_project | ||
models: | ||
  - name: product_recommendor | ||
    training: | ||
      task: | ||
        taskRef: train_my_model | ||
tasks: | ||
  - name: train_my_model | ||
    fromTemplate: shared_templates.xgboost_resources | ||
    executor: | ||
      !poetry | ||
      pythonVersion: 3.11 | ||
      command: python train.py | ||
    resources: | ||
      memory: 10G | ||
      cpu: 4 | ||
      storage: 30G | ||
      nvidia.com/gpu: 1 | ||
``` | ||
### Model deployment | ||
#### Model validation | ||
#### Model monitoring | ||
### Batch inference |