From f71d4e15dd85030513732122a1b9a740316b6124 Mon Sep 17 00:00:00 2001
From: Anish Shah
Date: Mon, 8 Jan 2024 13:45:28 -0500
Subject: [PATCH 1/7] Create openai-azure-fine-tuning.md

---
 .../other/openai-azure-fine-tuning.md | 94 +++++++++++++++++++
 1 file changed, 94 insertions(+)
 create mode 100644 docs/guides/integrations/other/openai-azure-fine-tuning.md

diff --git a/docs/guides/integrations/other/openai-azure-fine-tuning.md b/docs/guides/integrations/other/openai-azure-fine-tuning.md
new file mode 100644
index 000000000..483c6f4b6
--- /dev/null
+++ b/docs/guides/integrations/other/openai-azure-fine-tuning.md
@@ -0,0 +1,94 @@
+---
+slug: /guides/integrations/azure_openai
+description: How to Fine-Tune Azure OpenAI models using W&B.
+displayed_sidebar: default
+---
+
+# Azure OpenAI Fine-Tuning
+
+## Introduction
+Fine-tuning GPT-3.5 or GPT-4 models on Microsoft Azure using Weights & Biases allows for detailed tracking and analysis of model performance. This guide extends the concepts from the [OpenAI Fine-Tuning guide](/guides/integrations/openai) with specific steps and features for Azure OpenAI.
+
+![](/images/integrations/open_ai_auto_scan.png)
+
+:::info
+The Weights and Biases fine-tuning integration works with `openai >= 1.0`. Please install the latest version of `openai` by doing `pip install -U openai`.
+:::
+
+### Check out interactive examples
+
+* [Demo Colab](http://wandb.me/azure-openai-colab)
+
+## Prerequisites
+- Azure OpenAI service set up as per [official Azure documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/fine-tune).
+- Latest versions of `openai`, `wandb`, and other required libraries installed.
+
+## Sync Azure OpenAI Fine-Tuning Results in Weights & Biases
+### Setting Up
+- Ensure you have the Azure OpenAI endpoint and key configured in your environment.
+
+```python
+import os
+os.environ["AZURE_OPENAI_ENDPOINT"] = None # Replace with your endpoint
+os.environ["AZURE_OPENAI_KEY"] = None # Replace with your key
+```
+
+- Install necessary libraries (`openai`, `requests`, `tiktoken`, `wandb`).
+
+```python
+pip install openai requests tiktoken wandb
+```
+
+### Preparing Datasets
+- Create and validate your training and validation datasets in JSONL format, as demonstrated in the example datasets provided in the guide.
+
+```python
+# Example dataset creation
+# %%writefile training_set.jsonl
+# {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, ...]}
+```
+
+### Fine-Tuning on Azure
+1. **Upload Datasets:** Use Azure OpenAI SDK to upload your training and validation datasets.
+
+ ```python
+ from openai import AzureOpenAI
+ client = AzureOpenAI(azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), api_key=os.getenv("AZURE_OPENAI_KEY"))
+
+ training_response = client.files.create(file=open("training_set.jsonl", "rb"), purpose="fine-tune")
+ training_file_id = training_response.id
+```
+
+2. **Start Fine-Tuning:** Initiate the fine-tuning process on Azure with the desired base model, like `gpt-35-turbo-0613`.
+
+```python
+ response = client.fine_tuning.jobs.create(training_file=training_file_id, model="gpt-35-turbo-0613")
+ job_id = response.id
+```
+
+3. **Track the Fine-Tuning Job: Integrating with Weights & Biases**
+- Use the `WandbLogger` from `wandb.integration.openai.fine_tuning` just as in the [OpenAI Fine-Tuning guide](/guides/integrations/openai).
+- The `WandbLogger.sync` method takes the fine-tune job ID and other optional parameters to sync your fine-tuning results to Weights & Biases.
+- This integration will log training/validation metrics, datasets, model metadata, and establish data and model DAG lineage in Weights & Biases.
+
+```python
+from wandb.integration.openai.fine_tuning import WandbLogger
+WandbLogger.sync(fine_tune_job_id=job_id, openai_client=client, project="your_project_name")
+```
+
+## Visualization and Versioning in Weights & Biases
+- Utilize Weights & Biases for versioning and visualizing training and validation data as Tables.
+- The datasets and model metadata are versioned as W&B Artifacts, allowing for efficient tracking and version control.
+
+![](/images/integrations/openai_data_artifacts.png)
+
+![](/images/integrations/openai_data_visualization.png)
+
+## Retrieving the Fine-Tuned Model
+- The fine-tuned model ID is retrievable from Azure OpenAI and is logged as a part of model metadata in Weights & Biases.
+
+![](/images/integrations/openai_model_metadata.png)
+
+## Additional Resources
+- [OpenAI Fine-tuning Documentation](https://platform.openai.com/docs/guides/fine-tuning/)
+- [Demo Colab](http://wandb.me/azure-openai-colab)
\ No newline at end of file

From c0206767d7b3f735ca7c1fd58a4c49ad097b0e11 Mon Sep 17 00:00:00 2001
From: Anish Shah
Date: Mon, 8 Jan 2024 17:19:11 -0500
Subject: [PATCH 2/7] Update openai-azure-fine-tuning.md

fix spacing for code
---
 .../other/openai-azure-fine-tuning.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/guides/integrations/other/openai-azure-fine-tuning.md b/docs/guides/integrations/other/openai-azure-fine-tuning.md
index 483c6f4b6..c3a5d081f 100644
--- a/docs/guides/integrations/other/openai-azure-fine-tuning.md
+++ b/docs/guides/integrations/other/openai-azure-fine-tuning.md
@@ -49,21 +49,21 @@ pip install openai requests tiktoken wandb
 ```
 
 ### Fine-Tuning on Azure
+
 1. **Upload Datasets:** Use Azure OpenAI SDK to upload your training and validation datasets.
 
- ```python
- from openai import AzureOpenAI
- client = AzureOpenAI(azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), api_key=os.getenv("AZURE_OPENAI_KEY"))
-
- training_response = client.files.create(file=open("training_set.jsonl", "rb"), purpose="fine-tune")
- training_file_id = training_response.id
+```python
+from openai import AzureOpenAI
+client = AzureOpenAI(azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), api_key=os.getenv("AZURE_OPENAI_KEY"))
+training_response = client.files.create(file=open("training_set.jsonl", "rb"), purpose="fine-tune")
+training_file_id = training_response.id
 ```
 
 2. **Start Fine-Tuning:** Initiate the fine-tuning process on Azure with the desired base model, like `gpt-35-turbo-0613`.
 
 ```python
- response = client.fine_tuning.jobs.create(training_file=training_file_id, model="gpt-35-turbo-0613")
- job_id = response.id
+response = client.fine_tuning.jobs.create(training_file=training_file_id, model="gpt-35-turbo-0613")
+job_id = response.id
 ```
 
 3. **Track the Fine-Tuning Job: Integrating with Weights & Biases**

From bfa2e5c9596e6db02fc7c7ac8b7c214398ea280c Mon Sep 17 00:00:00 2001
From: Anish Shah
Date: Wed, 10 Jan 2024 09:19:39 -0500
Subject: [PATCH 3/7] add azure-openai finetuning to sidebar

---
 .../other/openai-azure-fine-tuning.md | 28 +++++++++++++------
 sidebars.js | 2 ++
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/docs/guides/integrations/other/openai-azure-fine-tuning.md b/docs/guides/integrations/other/openai-azure-fine-tuning.md
index c3a5d081f..2e718d2af 100644
--- a/docs/guides/integrations/other/openai-azure-fine-tuning.md
+++ b/docs/guides/integrations/other/openai-azure-fine-tuning.md
@@ -1,5 +1,5 @@
 ---
-slug: /guides/integrations/azure_openai
+slug: /guides/integrations/openai-azure-fine-tuning
 description: How to Fine-Tune Azure OpenAI models using W&B.
 displayed_sidebar: default
 ---
@@ -29,13 +29,14 @@ The Weights and Biases fine-tuning integration works with `openai >= 1.0`. Pleas
 
 ```python
 import os
-os.environ["AZURE_OPENAI_ENDPOINT"] = None # Replace with your endpoint
-os.environ["AZURE_OPENAI_KEY"] = None # Replace with your key
+
+os.environ["AZURE_OPENAI_ENDPOINT"] = None  # Replace with your endpoint
+os.environ["AZURE_OPENAI_KEY"] = None  # Replace with your key
 ```
 
 - Install necessary libraries (`openai`, `requests`, `tiktoken`, `wandb`).
 
-```python
+```shell-session
 pip install openai requests tiktoken wandb
 ```
 
@@ -54,15 +55,23 @@ pip install openai requests tiktoken wandb
 
 ```python
 from openai import AzureOpenAI
-client = AzureOpenAI(azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), api_key=os.getenv("AZURE_OPENAI_KEY"))
-training_response = client.files.create(file=open("training_set.jsonl", "rb"), purpose="fine-tune")
+
+client = AzureOpenAI(
+    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
+    api_key=os.getenv("AZURE_OPENAI_KEY"),
+)
+training_response = client.files.create(
+    file=open("training_set.jsonl", "rb"), purpose="fine-tune"
+)
 training_file_id = training_response.id
 ```
 
 2. **Start Fine-Tuning:** Initiate the fine-tuning process on Azure with the desired base model, like `gpt-35-turbo-0613`.
 
 ```python
-response = client.fine_tuning.jobs.create(training_file=training_file_id, model="gpt-35-turbo-0613")
+response = client.fine_tuning.jobs.create(
+    training_file=training_file_id, model="gpt-35-turbo-0613"
+)
 job_id = response.id
 ```
 
@@ -73,7 +82,10 @@ job_id = response.id
 
 ```python
 from wandb.integration.openai.fine_tuning import WandbLogger
-WandbLogger.sync(fine_tune_job_id=job_id, openai_client=client, project="your_project_name")
+
+WandbLogger.sync(
+    fine_tune_job_id=job_id, openai_client=client, project="your_project_name"
+)
 ```
 
 ## Visualization and Versioning in Weights & Biases
diff --git a/sidebars.js b/sidebars.js
index 54d5904af..fbf2a709a 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -426,6 +426,7 @@ const sidebars = {
  'guides/integrations/other/mmf',
  'guides/integrations/other/composer',
  'guides/integrations/other/openai-api',
+ 'guides/integrations/other/openai-azure-fine-tuning',
  'guides/integrations/other/openai-fine-tuning',
  'guides/integrations/other/openai-gym',
  'guides/integrations/other/paddledetection',
@@ -806,6 +807,7 @@ const sidebars = {
  'guides/integrations/other/mmf',
  'guides/integrations/other/composer',
  'guides/integrations/other/openai-api',
+ 'guides/integrations/other/openai-azure-fine-tuning',
  'guides/integrations/other/openai-fine-tuning',
  'guides/integrations/other/openai-gym',
  'guides/integrations/other/paddledetection',

From 3cf13009d8dd9bcf9be523609d94d6a7251d9b43 Mon Sep 17 00:00:00 2001
From: Anish Shah
Date: Fri, 12 Jan 2024 09:27:14 -0500
Subject: [PATCH 4/7] Rename and reduce azure oai docs for maintainability

---
 ...-tuning.md => azure-openai-fine-tuning.md} | 63 ++++---------------
 sidebars.js | 4 +-
 2 files changed, 13 insertions(+), 54 deletions(-)
 rename docs/guides/integrations/other/{openai-azure-fine-tuning.md => azure-openai-fine-tuning.md} (52%)

diff --git a/docs/guides/integrations/other/openai-azure-fine-tuning.md b/docs/guides/integrations/other/azure-openai-fine-tuning.md
similarity index 52%
rename from docs/guides/integrations/other/openai-azure-fine-tuning.md
rename to docs/guides/integrations/other/azure-openai-fine-tuning.md
index 2e718d2af..572f86c75 100644
--- a/docs/guides/integrations/other/openai-azure-fine-tuning.md
+++ b/docs/guides/integrations/other/azure-openai-fine-tuning.md
@@ -1,5 +1,5 @@
 ---
-slug: /guides/integrations/openai-azure-fine-tuning
+slug: /guides/integrations/azure-openai-fine-tuning
 description: How to Fine-Tune Azure OpenAI models using W&B.
 displayed_sidebar: default
 ---
@@ -15,79 +15,38 @@ Fine-tuning GPT-3.5 or GPT-4 models on Microsoft Azure using Weights & Biases al
 The Weights and Biases fine-tuning integration works with `openai >= 1.0`. Please install the latest version of `openai` by doing `pip install -U openai`.
 :::
 
-### Check out interactive examples
-
-* [Demo Colab](http://wandb.me/azure-openai-colab)
 
 ## Prerequisites
 - Azure OpenAI service set up as per [official Azure documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/fine-tune).
 - Latest versions of `openai`, `wandb`, and other required libraries installed.
 
-## Sync Azure OpenAI Fine-Tuning Results in Weights & Biases
-### Setting Up
-- Ensure you have the Azure OpenAI endpoint and key configured in your environment.
-
-```python
-import os
-
-os.environ["AZURE_OPENAI_ENDPOINT"] = None  # Replace with your endpoint
-os.environ["AZURE_OPENAI_KEY"] = None  # Replace with your key
-```
-
-- Install necessary libraries (`openai`, `requests`, `tiktoken`, `wandb`).
-
-```shell-session
-pip install openai requests tiktoken wandb
-```
-
-### Preparing Datasets
-- Create and validate your training and validation datasets in JSONL format, as demonstrated in the example datasets provided in the guide.
-
-```python
-# Example dataset creation
-# %%writefile training_set.jsonl
-# {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, ...]}
-```
-
-### Fine-Tuning on Azure
-
-1. **Upload Datasets:** Use Azure OpenAI SDK to upload your training and validation datasets.
+## Sync Azure OpenAI Fine-Tuning Results in Weights & Biases in 2 lines
 
 ```python
 from openai import AzureOpenAI
 
+# Connect to Azure OpenAI
 client = AzureOpenAI(
     azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
     api_key=os.getenv("AZURE_OPENAI_KEY"),
 )
-training_response = client.files.create(
-    file=open("training_set.jsonl", "rb"), purpose="fine-tune"
-)
-training_file_id = training_response.id
-```
-
-2. **Start Fine-Tuning:** Initiate the fine-tuning process on Azure with the desired base model, like `gpt-35-turbo-0613`.
-
-```python
-response = client.fine_tuning.jobs.create(
-    training_file=training_file_id, model="gpt-35-turbo-0613"
-)
-job_id = response.id
-```
 
-3. **Track the Fine-Tuning Job: Integrating with Weights & Biases**
-- Use the `WandbLogger` from `wandb.integration.openai.fine_tuning` just as in the [OpenAI Fine-Tuning guide](/guides/integrations/openai).
-- The `WandbLogger.sync` method takes the fine-tune job ID and other optional parameters to sync your fine-tuning results to Weights & Biases.
-- This integration will log training/validation metrics, datasets, model metadata, and establish data and model DAG lineage in Weights & Biases.
+# Create and validate your training and validation datasets in JSONL format,
+# upload them via the client,
+# and start a fine-tuning job.
 
-```python
 from wandb.integration.openai.fine_tuning import WandbLogger
 
+# Sync your fine-tuning results with W&B!
 WandbLogger.sync(
     fine_tune_job_id=job_id, openai_client=client, project="your_project_name"
 )
 ```
 
+### Check out interactive examples
+
+* [Demo Colab](http://wandb.me/azure-openai-colab)
+
 ## Visualization and Versioning in Weights & Biases
 - Utilize Weights & Biases for versioning and visualizing training and validation data as Tables.
 - The datasets and model metadata are versioned as W&B Artifacts, allowing for efficient tracking and version control.
diff --git a/sidebars.js b/sidebars.js
index 60ca15452..7093ca59a 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -397,6 +397,7 @@ const sidebars = {
  items: [
  // 'guides/integrations/intro',
  'guides/integrations/add-wandb-to-any-library',
+ 'guides/integrations/other/azure-openai-fine-tuning',
  'guides/integrations/other/catalyst',
  'guides/integrations/dagster',
  'guides/integrations/other/databricks',
@@ -427,7 +428,6 @@ const sidebars = {
  'guides/integrations/other/mmf',
  'guides/integrations/other/composer',
  'guides/integrations/other/openai-api',
- 'guides/integrations/other/openai-azure-fine-tuning',
  'guides/integrations/other/openai-fine-tuning',
  'guides/integrations/other/openai-gym',
  'guides/integrations/other/paddledetection',
@@ -784,6 +784,7 @@ const sidebars = {
  items: [
  'guides/integrations/add-wandb-to-any-library',
  'guides/integrations/other/catalyst',
+ 'guides/integrations/other/azure-openai-fine-tuning',
  'guides/integrations/dagster',
  'guides/integrations/other/databricks',
  'guides/integrations/other/deepchecks',
@@ -809,7 +810,6 @@ const sidebars = {
  'guides/integrations/other/mmf',
  'guides/integrations/other/composer',
  'guides/integrations/other/openai-api',
- 'guides/integrations/other/openai-azure-fine-tuning',
  'guides/integrations/other/openai-fine-tuning',
  'guides/integrations/other/openai-gym',
  'guides/integrations/other/paddledetection',

From 1548f2bd408329227792a8054c50bb3c6c8c6531 Mon Sep 17 00:00:00 2001
From: Noah Luna <15202580+ngrayluna@users.noreply.github.com>
Date: Fri, 12 Jan 2024 16:22:09 -0800
Subject: [PATCH 5/7] word smithing

---
 .../integrations/other/azure-openai-fine-tuning.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/guides/integrations/other/azure-openai-fine-tuning.md b/docs/guides/integrations/other/azure-openai-fine-tuning.md
index 572f86c75..029801b35 100644
--- a/docs/guides/integrations/other/azure-openai-fine-tuning.md
+++ b/docs/guides/integrations/other/azure-openai-fine-tuning.md
@@ -7,7 +7,7 @@ displayed_sidebar: default
 # Azure OpenAI Fine-Tuning
 
 ## Introduction
-Fine-tuning GPT-3.5 or GPT-4 models on Microsoft Azure using Weights & Biases allows for detailed tracking and analysis of model performance. This guide extends the concepts from the [OpenAI Fine-Tuning guide](/guides/integrations/openai) with specific steps and features for Azure OpenAI.
+Fine-tuning GPT-3.5 or GPT-4 models on Microsoft Azure using W&B allows for detailed tracking and analysis of model performance. This guide extends the concepts from the [OpenAI Fine-Tuning guide](/guides/integrations/openai) with specific steps and features for Azure OpenAI.
 
 ![](/images/integrations/open_ai_auto_scan.png)
 
@@ -20,7 +20,7 @@ The Weights and Biases fine-tuning integration works with `openai >= 1.0`. Pleas
 - Azure OpenAI service set up as per [official Azure documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/fine-tune).
 - Latest versions of `openai`, `wandb`, and other required libraries installed.
 
-## Sync Azure OpenAI Fine-Tuning Results in Weights & Biases in 2 lines
+## Sync Azure OpenAI fine-tuning results in W&B in 2 lines
 
 ```python
 from openai import AzureOpenAI
@@ -47,19 +47,19 @@ WandbLogger.sync(
 
 * [Demo Colab](http://wandb.me/azure-openai-colab)
 
-## Visualization and Versioning in Weights & Biases
-- Utilize Weights & Biases for versioning and visualizing training and validation data as Tables.
+## Visualization and versioning in W&B
+- Utilize W&B for versioning and visualizing training and validation data as Tables.
 - The datasets and model metadata are versioned as W&B Artifacts, allowing for efficient tracking and version control.
 
 ![](/images/integrations/openai_data_artifacts.png)
 
 ![](/images/integrations/openai_data_visualization.png)
 
-## Retrieving the Fine-Tuned Model
-- The fine-tuned model ID is retrievable from Azure OpenAI and is logged as a part of model metadata in Weights & Biases.
+## Retrieving the fine-tuned model
+- The fine-tuned model ID is retrievable from Azure OpenAI and is logged as a part of model metadata in W&B.
 
 ![](/images/integrations/openai_model_metadata.png)
 
-## Additional Resources
+## Additional resources
 - [OpenAI Fine-tuning Documentation](https://platform.openai.com/docs/guides/fine-tuning/)
 - [Demo Colab](http://wandb.me/azure-openai-colab)
\ No newline at end of file

From 2a777cb61fefc4da75e15aee4e59e853470d4621 Mon Sep 17 00:00:00 2001
From: Anish Shah
Date: Mon, 15 Jan 2024 14:46:51 -0500
Subject: [PATCH 6/7] fix page ordering

---
 sidebars.js | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sidebars.js b/sidebars.js
index 7093ca59a..4680eafeb 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -783,8 +783,8 @@ const sidebars = {
  link: { type: 'doc', id: 'guides/integrations/intro' },
  items: [
  'guides/integrations/add-wandb-to-any-library',
- 'guides/integrations/other/catalyst',
  'guides/integrations/other/azure-openai-fine-tuning',
+ 'guides/integrations/other/catalyst',
  'guides/integrations/dagster',
  'guides/integrations/other/databricks',
  'guides/integrations/other/deepchecks',

From aa52373f49c1eeec247c28c969272cfc13473afb Mon Sep 17 00:00:00 2001
From: Anish Shah
Date: Mon, 15 Jan 2024 14:50:16 -0500
Subject: [PATCH 7/7] Add azure finetuning docs

---
 docs/guides/integrations/other/azure-openai-fine-tuning.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/guides/integrations/other/azure-openai-fine-tuning.md b/docs/guides/integrations/other/azure-openai-fine-tuning.md
index 029801b35..6ab535918 100644
--- a/docs/guides/integrations/other/azure-openai-fine-tuning.md
+++ b/docs/guides/integrations/other/azure-openai-fine-tuning.md
@@ -62,4 +62,5 @@ WandbLogger.sync(
 
 ## Additional resources
 - [OpenAI Fine-tuning Documentation](https://platform.openai.com/docs/guides/fine-tuning/)
+- [Azure OpenAI Fine-tuning Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython&pivots=programming-language-python)
 - [Demo Colab](http://wandb.me/azure-openai-colab)
\ No newline at end of file
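
Note on the "Create and validate your training and validation datasets in JSONL format" step: the docs in this series mention validation but never show it. A minimal sketch of what such a check could look like follows, using only the standard library. The helper name `validate_chat_jsonl` and the `ALLOWED_ROLES` set are hypothetical and are not part of `openai` or `wandb`; the record shape is assumed from the `{"messages": [{"role": ..., "content": ...}]}` example in patch 1, and the real service may accept more roles or fields than this sketch does.

```python
import json

# Assumption: only these roles are expected in the chat-format examples.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_jsonl(lines):
    """Check chat-format JSONL lines (hypothetical helper).

    Returns a list of (line_number, problem) tuples.
    An empty list means no problems were found.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict):
                problems.append((number, "message is not an object"))
                continue
            role = message.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                problems.append((number, f"unexpected role: {role!r}"))
            if not isinstance(message.get("content"), str):
                problems.append((number, "content is not a string"))
    return problems


if __name__ == "__main__":
    # In-memory sample; a real run would pass an open file object instead.
    sample = [
        '{"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}]}',
        "not json",
    ]
    for line_number, problem in validate_chat_jsonl(sample):
        print(f"line {line_number}: {problem}")
```

Because the function accepts any iterable of strings, an open file handle for `training_set.jsonl` can be passed directly before the upload step. Collecting problems rather than raising on the first one is a design choice that lets a whole dataset be reviewed in one pass.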