
Error while implementing AzureChatOpenAI() using the documented code provided. Missing all values of Answer relevancy and some values of other metrics #1349

Closed
koushik9121 opened this issue Sep 23, 2024 · 4 comments

@koushik9121

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug

I am getting two kinds of errors:
Exception raised in Job[49]: TimeoutError()
Exception raised in Job[29]: BadRequestError(Unsupported data type)
There are around 20 such exceptions in total, each in this format.

All values for answer relevancy are missing, and some values for the other metrics are missing as well.
Ragas version: 0.1.19
Python version: 3.11
OpenAI version: 1.33.0

Code to Reproduce
from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from datasets import load_dataset
from ragas import evaluate, RunConfig
from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)

azure_model = AzureChatOpenAI(
    openai_api_version="2024-02-15-preview",
    azure_endpoint=base_url,
    azure_deployment="",
    model="gpt-4o",
    validate_base_url=False,
    api_key="",
)

azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version="2024-02-15-preview",
    azure_endpoint=base_url,
    azure_deployment="",
    model="textembeddingada002",
    api_key="",
)

metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
]

amnesty_qa = load_dataset("explodinggradients/amnesty_qa", "english_v2")
print(amnesty_qa)

run_config = RunConfig(timeout=300, log_tenacity=True)

result = evaluate(
    amnesty_qa["eval"],
    metrics=metrics,
    llm=azure_model,
    embeddings=azure_embeddings,
    run_config=run_config,
)
Error trace

Expected behavior
I expect this to run fully, since it is the sample code provided by the team. I want to test with my own data, but even with the provided sample code Ragas isn't working for me.

Additional context

@koushik9121 koushik9121 added the bug Something isn't working label Sep 23, 2024

dosubot bot commented Sep 23, 2024

Hey @koushik9121! I'm here to help you with any bugs, questions, or contributions. Let's work together to solve the issue you're facing.

To resolve the TimeoutError and BadRequestError (Unsupported data type) exceptions when implementing AzureChatOpenAI and evaluating metrics like answer relevancy, you can follow these steps:

  1. Ensure Correct Configuration: Double-check that your Azure OpenAI configurations are correct, including the base_url, model_deployment, model_name, embedding_deployment, and embedding_name.

  2. Handle TimeoutError: Increase the timeout settings for the API calls if possible. This can be done by setting a higher timeout value in the AzureChatOpenAI and AzureOpenAIEmbeddings initialization.

  3. Handle BadRequestError (Unsupported data type): Ensure that the data types being passed to the API are supported. This might involve converting data to the correct format before making API calls.

Here is the updated code with these considerations:

from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from ragas import evaluate
import os

# Ensure your Azure OpenAI key is set in the environment
os.environ["AZURE_OPENAI_API_KEY"] = "your-azure-openai-api-key"

# Azure OpenAI configurations
azure_configs = {
    "base_url": "https://<your-endpoint>.openai.azure.com/",
    "model_deployment": "your-deployment-name",
    "model_name": "your-model-name",
    "embedding_deployment": "your-deployment-name",
    "embedding_name": "text-embedding-ada-002",  # most likely
}

# Initialize the chat model with a higher timeout
azure_model = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["model_deployment"],
    model=azure_configs["model_name"],
    validate_base_url=False,
    request_timeout=60  # Increase the timeout to 60 seconds
)

# Initialize the embeddings with a higher timeout
azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["embedding_deployment"],
    model=azure_configs["embedding_name"],
    request_timeout=60  # Increase the timeout to 60 seconds
)

# Load the dataset
from datasets import load_dataset
amnesty_qa = load_dataset("explodinggradients/amnesty_qa", "english_v2")

# Import metrics
from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)
from ragas.metrics.critique import harmfulness

# List of metrics to use
metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    harmfulness,
]

# Evaluate the dataset
result = evaluate(
    amnesty_qa["eval"], metrics=metrics, llm=azure_model, embeddings=azure_embeddings
)

# Convert results to pandas DataFrame
df = result.to_pandas()
df.head()

By increasing the request_timeout parameter, you can mitigate TimeoutError issues. Ensure that the data passed to the API is in the correct format to avoid BadRequestError (Unsupported data type) [1].

Additionally, there are known issues or limitations with the ragas library version 0.1.19 that could affect the evaluation of metrics like answer relevancy. Specifically, the _calculate_score method has a potential issue where if all generated questions are empty, it logs a warning and sets the score to NaN. Similarly, the _ascore method has a check for None answers, which also results in a NaN score [2].
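
If you want to see exactly which rows were affected, one option (a minimal sketch, assuming the per-metric column names in the results DataFrame match the metric names used above) is to inspect the exported scores for NaN values:

# Export per-sample scores and count missing values per metric
df = result.to_pandas()
metric_cols = ["faithfulness", "answer_relevancy", "context_recall", "context_precision"]
print(df[metric_cols].isna().sum())

# Rows where answer relevancy could not be computed
print(df[df["answer_relevancy"].isna()])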

For additional logging and debugging, you can enable debug mode by setting the environment variable RAGAS_DEBUG to true and use the patch_logger function to enable debug logging for specific modules [3][4][5]:

# In your shell, before starting Python:
export RAGAS_DEBUG=true

# Then, in Python:
from ragas.utils import patch_logger
import logging

patch_logger('your_module_name', logging.DEBUG)

This setup will help you capture detailed information about the execution and diagnose issues with missing metric values.

To continue talking to Dosu, mention @dosu.

@koushik9121 koushik9121 changed the title from "Error while implementing AzureChatOpenAI() using the documented code provided missing all values of Answer relevancy and Other metrics" to "Error while implementing AzureChatOpenAI() using the documented code provided. Missing all values of Answer relevancy and some values of other metrics" Sep 23, 2024
@koushik9121
Author

koushik9121 commented Sep 23, 2024

I have checked the keys and the deployment names by making a test run with azure_model.invoke(), and the API returned a correct answer. But what you need to understand is that I am trying to run the code provided by the RAGAS team itself, and the errors are still there. Please help me with this issue.
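
For reference, a quick connectivity check along these lines (a sketch only, reusing the azure_model and azure_embeddings objects from the reproduction code above) might look like:

# Confirm that both the chat model and the embeddings respond before running a full evaluation
print(azure_model.invoke("Say hello in one word.").content)
print(len(azure_embeddings.embed_query("hello")))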


dosubot bot commented Sep 23, 2024

To resolve the TimeoutError and BadRequestError (Unsupported data type) exceptions when running the provided RAGAS team code using AzureChatOpenAI and evaluating metrics like answer relevancy, ensure the following:

  1. TimeoutError: Increase the timeout settings for the Azure OpenAI API calls. This can be done by setting the timeout parameter in the AzureChatOpenAI and AzureOpenAIEmbeddings instances.

  2. BadRequestError (Unsupported data type): Ensure that the data types being passed to the API are supported. This typically involves ensuring that the inputs are in the correct format (e.g., strings for text inputs).

Here is the updated code with the necessary changes:

from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from ragas import evaluate

# Configuration for Azure OpenAI
azure_configs = {
    "base_url": "https://<your-endpoint>.openai.azure.com/",
    "model_deployment": "your-deployment-name",
    "model_name": "your-model-name",
    "embedding_deployment": "your-deployment-name",
    "embedding_name": "text-embedding-ada-002",  # most likely
}

# Initialize the chat model with a timeout setting
azure_model = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["model_deployment"],
    model=azure_configs["model_name"],
    validate_base_url=False,
    timeout=60  # Set a timeout of 60 seconds
)

# Initialize the embeddings with a timeout setting
azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["embedding_deployment"],
    model=azure_configs["embedding_name"],
    timeout=60  # Set a timeout of 60 seconds
)

# Load the dataset
from datasets import load_dataset
amnesty_qa = load_dataset("explodinggradients/amnesty_qa", "english_v2")

# Import metrics
from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)
from ragas.metrics.critique import harmfulness

# List of metrics to use
metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    harmfulness,
]

# Initialize the metrics with LLMs and Embeddings
from ragas.run_config import RunConfig
from ragas.metrics.base import MetricWithLLM, MetricWithEmbeddings

def init_ragas_metrics(metrics, llm, embedding):
    for metric in metrics:
        if isinstance(metric, MetricWithLLM):
            metric.llm = llm
        if isinstance(metric, MetricWithEmbeddings):
            metric.embeddings = embedding
        run_config = RunConfig()
        metric.init(run_config)

# Wrappers for LLM and Embeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

init_ragas_metrics(
    metrics,
    llm=LangchainLLMWrapper(azure_model),
    embedding=LangchainEmbeddingsWrapper(azure_embeddings),
)

# Evaluate the dataset
result = evaluate(
    amnesty_qa["eval"], metrics=metrics, llm=azure_model, embeddings=azure_embeddings
)

# Convert results to pandas DataFrame
df = result.to_pandas()
df.head()

Ensure that the azure_configs dictionary is correctly populated with your Azure OpenAI endpoint details. The timeout parameter is set to 60 seconds, but you can adjust this value based on your needs. Additionally, the init_ragas_metrics function ensures that the metrics are properly initialized with the LLM and embeddings [1].

To continue talking to Dosu, mention @dosu.

@koushik9121
Author

Serializing the data might help
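
For illustration, one way "serializing the data" could look (a sketch only; the column names are assumed from the amnesty_qa english_v2 split, and the surrounding objects come from the code above) is to rebuild the evaluation split from plain Python types before calling evaluate():

from datasets import Dataset

eval_ds = amnesty_qa["eval"]

# Rebuild the split so every field is a plain string (or a list of strings for contexts)
clean_ds = Dataset.from_dict({
    "question": [str(q) for q in eval_ds["question"]],
    "answer": [str(a) for a in eval_ds["answer"]],
    "contexts": [[str(c) for c in ctx] for ctx in eval_ds["contexts"]],
    "ground_truth": [str(g) for g in eval_ds["ground_truth"]],
})

result = evaluate(
    clean_ds, metrics=metrics, llm=azure_model, embeddings=azure_embeddings, run_config=run_config
)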
