Implementation of Noise sensitivity metrics from RAGChecker (#1190)
Solves:
- #1185

- Took inspiration from the [noise sensitivity](https://github.com/amazon-science/RAGChecker/tree/main/ragchecker) metrics in RAGChecker from AWS.
- Tested locally; the metrics run and return results.

### Input
```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import noise_sensitivity_relevant, noise_sensitivity_irrelevant

# One-row sample: the first three contexts are relevant, the last one is noise.
data_sample = {
    "question": ["What is the Life Insurance Corporation of India (LIC) known for?"],
    "ground_truth": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments."],
    "answer": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country."],
    "contexts": [[
        "The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.",
        "LIC is the largest insurance company in India, with a vast network of policyholders and huge investments.",
        "As the largest institutional investor in India, LIC manages substantial funds, contributing to the financial stability of the country.",
        "The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing, etc.",
    ]],
}

dataset = Dataset.from_dict(data_sample)
metrics = [noise_sensitivity_relevant, noise_sensitivity_irrelevant]
score = evaluate(dataset, metrics=metrics)
score.to_pandas()
```

---------

Co-authored-by: Shahules786 <[email protected]>
sahusiddharth and shahules786 authored Aug 23, 2024
1 parent d5b60bb commit 8da231d
Showing 6 changed files with 379 additions and 3 deletions.
2 changes: 2 additions & 0 deletions docs/concepts/metrics/index.md
@@ -15,6 +15,7 @@ Just like in any machine learning system, the performance of individual componen
 - [Context precision](context_precision.md)
 - [Context utilization](context_utilization.md)
 - [Context entity recall](context_entities_recall.md)
+- [Noise Sensitivity](noise_sensitivity.md)
 - [Summarization Score](summarization_score.md)

```{toctree}
@@ -36,6 +37,7 @@ context_precision
 context_utilization
 context_recall
 context_entities_recall
+noise_sensitivity
 semantic_similarity
 answer_correctness
 critique
101 changes: 101 additions & 0 deletions docs/concepts/metrics/noise_sensitivity.md
@@ -0,0 +1,101 @@


# Noise Sensitivity

Noise sensitivity measures how often a system makes errors by providing incorrect responses when utilizing either relevant or irrelevant retrieved documents. The score ranges from 0 to 1, with lower values indicating better performance. Noise sensitivity is computed using the question, ground truth, answer, and the retrieved context.

To estimate noise sensitivity, each claim in the generated answer is examined to determine whether it is correct based on the ground truth and whether it can be attributed to the relevant (or irrelevant) retrieved context. Ideally, all claims in the answer should be supported by the relevant retrieved context.


```{math}
\text{noise sensitivity (relevant)} = {|\text{Incorrect claims in answer}| \over |\text{Claims in answer}|}
```
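
As a rough illustration of the ratio above, the sketch below assumes each claim in the answer has already been judged (for example by an LLM) on two points: whether the ground truth supports it, and whether it can be attributed to a relevant retrieved context. `ClaimJudgement` and `relevant_noise_ratio` are hypothetical names used only for this illustration; they are not part of the ragas API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClaimJudgement:
    """Hypothetical per-claim verdicts; in practice an LLM would produce these."""

    text: str
    correct: bool        # supported by the ground truth
    from_relevant: bool  # attributable to a relevant retrieved context


def relevant_noise_ratio(claims: List[ClaimJudgement]) -> float:
    """Share of answer claims that are incorrect yet traceable to relevant context."""
    if not claims:
        return 0.0  # assumption: an answer with no claims scores 0
    noisy = sum(1 for c in claims if c.from_relevant and not c.correct)
    return noisy / len(claims)


# Hand-labelled verdicts for the LIC answer used on this page.
claims = [
    ClaimJudgement("LIC is the largest insurance company in India", True, True),
    ClaimJudgement("LIC is known for its vast portfolio of investments", True, True),
    ClaimJudgement("LIC contributes to the financial stability of the country", False, True),
]
ratio = relevant_noise_ratio(claims)  # one of the three claims is incorrect
```

With these hand-labelled verdicts, one of the three claims is counted, so the ratio is 1/3.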

```{Hint}
Question: What is the Life Insurance Corporation of India (LIC) known for?
Ground truth: The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments.
Relevant Retrieval:
- The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.
- LIC is the largest insurance company in India, with a vast network of policyholders and a significant role in the financial sector.
- As the largest institutional investor in India, LIC manages a substantial life fund, contributing to the financial stability of the country.
Irrelevant Retrieval:
- The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing, etc.
```


## Example

```{code-block} python
:caption: Noise Sensitivity
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import noise_sensitivity_relevant, noise_sensitivity_irrelevant

# One-row sample: the first three contexts are relevant, the last one is noise.
data_sample = {
    "question": ["What is the Life Insurance Corporation of India (LIC) known for?"],
    "ground_truth": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments."],
    "answer": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country."],
    "contexts": [[
        "The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.",
        "LIC is the largest insurance company in India, with a vast network of policyholders and huge investments.",
        "As the largest institutional investor in India, LIC manages substantial funds, contributing to the financial stability of the country.",
        "The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing, etc.",
    ]],
}

# The metrics are LLM-based, so evaluate() needs a configured LLM.
dataset = Dataset.from_dict(data_sample)
metrics = [noise_sensitivity_relevant, noise_sensitivity_irrelevant]
score = evaluate(dataset, metrics=metrics)
score.to_pandas()
```

## Calculation

Let's examine how noise sensitivity in relevant context was calculated:

- **Step 1:** Identify the relevant contexts from which the ground truth can be inferred.

    - Ground Truth:
      The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments.

    - Contexts:
        - Context 1: `The Life Insurance Corporation of India (LIC) was established in 1956` following the nationalization of the insurance industry in India.
        - Context 2: `LIC is the largest insurance company in India`, with a vast network of policyholders and a significant role in the financial sector.
        - Context 3: `As the largest institutional investor in India, LIC manages substantial funds`, contributing to the financial stability of the country.

- **Step 2:** Verify whether the claims in the generated answer can be inferred from the relevant contexts.

    - Answer:
      The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country.

    - Contexts:
        - Context 1: The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.
        - Context 2: `LIC is the largest insurance company in India`, with a vast network of policyholders and a significant role in the financial sector.
        - Context 3: `As the largest institutional investor in India, LIC manages substantial funds`, `contributing to the financial stability of the country`.


- **Step 3:** Identify any incorrect claims in the answer (i.e., answer statements that are not supported by the ground truth).

    - Ground Truth:
      The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments.

    - Answer:
      The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. `LIC contributes to the financial stability of the country`.

    Explanation: The ground truth does not mention LIC contributing to the financial stability of the country, so this statement in the answer is counted as incorrect.

    - Incorrect claims: 1
    - Total claims: 3

- **Step 4:** Calculate noise sensitivity using the formula:

```{math}
\text{noise sensitivity (relevant)} = \frac{1}{3} \approx 0.333
```

This results in a noise sensitivity score of approximately 0.333, indicating that one out of three claims in the answer was incorrect.
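
The four steps above can be sketched in plain Python. The sketch assumes the entailment judgements are already available as hand-filled boolean tables (in ragas an LLM produces such verdicts). The function `noise_sensitivity_from_tables` and the table layout are assumptions made for this illustration, not the library's implementation. How the irrelevant variant is counted here (incorrect claims supported only by irrelevant contexts) is likewise an assumption.

```python
from typing import List


def noise_sensitivity_from_tables(
    context_supports_gt: List[List[bool]],
    context_supports_claim: List[List[bool]],
    claim_is_correct: List[bool],
    relevant: bool = True,
) -> float:
    """Hypothetical helper: rows are contexts, columns are statements or claims."""
    n_claims = len(claim_is_correct)
    if n_claims == 0:
        return 0.0  # assumption: an answer with no claims scores 0

    # Step 1: a context is relevant if it supports any ground-truth statement.
    context_is_relevant = [any(row) for row in context_supports_gt]

    # Step 2: is a claim supported by at least one context of the given kind?
    def supported_by(claim_idx: int, want_relevant: bool) -> bool:
        return any(
            row[claim_idx]
            for row, is_rel in zip(context_supports_claim, context_is_relevant)
            if is_rel == want_relevant
        )

    noisy = 0
    for j, correct in enumerate(claim_is_correct):
        if correct:
            continue  # Step 3: only incorrect claims can count as noise
        from_relevant = supported_by(j, True)
        if relevant:
            counted = from_relevant
        else:
            counted = supported_by(j, False) and not from_relevant
        noisy += int(counted)

    # Step 4: divide by the total number of claims in the answer.
    return noisy / n_claims


# Ground-truth statements: [largest insurer, established 1956, large portfolio]
context_supports_gt = [
    [False, True, False],   # Context 1
    [True, False, False],   # Context 2
    [False, False, True],   # Context 3
    [False, False, False],  # Context 4 (Indian economy)
]
# Answer claims: [largest insurer, vast portfolio, financial stability]
context_supports_claim = [
    [False, False, False],
    [True, False, False],
    [False, True, True],
    [False, False, False],
]
claim_is_correct = [True, True, False]

relevant_score = noise_sensitivity_from_tables(
    context_supports_gt, context_supports_claim, claim_is_correct, relevant=True
)
irrelevant_score = noise_sensitivity_from_tables(
    context_supports_gt, context_supports_claim, claim_is_correct, relevant=False
)
```

With these hand-filled tables, `relevant_score` is 1/3, matching Step 4, and `irrelevant_score` is 0 because the irrelevant context supports none of the answer's claims.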

Credits: Noise sensitivity was introduced in [RAGChecker](https://github.com/amazon-science/RAGChecker/tree/main/ragchecker).
1 change: 1 addition & 0 deletions src/ragas/async_utils.py
@@ -1,4 +1,5 @@
 """Async utils."""
+
 import asyncio
 from typing import Any, Coroutine, List

6 changes: 3 additions & 3 deletions src/ragas/integrations/langchain.py
@@ -48,9 +48,9 @@ def __init__(self, metric: Metric, **kwargs: t.Any):
             t.cast(MetricWithLLM, self.metric).llm = LangchainLLMWrapper(llm)
         if isinstance(self.metric, MetricWithEmbeddings):
             embeddings = get_or_init(kwargs, "embeddings", OpenAIEmbeddings)
-            t.cast(
-                MetricWithEmbeddings, self.metric
-            ).embeddings = LangchainEmbeddingsWrapper(embeddings)
+            t.cast(MetricWithEmbeddings, self.metric).embeddings = (
+                LangchainEmbeddingsWrapper(embeddings)
+            )
         self.metric.init(run_config)

     @property
8 changes: 8 additions & 0 deletions src/ragas/metrics/__init__.py
@@ -13,6 +13,11 @@
 )
 from ragas.metrics._context_recall import ContextRecall, context_recall
 from ragas.metrics._faithfulness import Faithfulness, FaithulnesswithHHEM, faithfulness
+from ragas.metrics._noise_sensitivity import (
+    NoiseSensitivity,
+    noise_sensitivity_irrelevant,
+    noise_sensitivity_relevant,
+)
 from ragas.metrics._rubrics_based import (
     LabelledRubricsScore,
     ReferenceFreeRubricsScore,
@@ -43,6 +48,9 @@
     "context_entity_recall",
     "SummarizationScore",
     "summarization_score",
+    "NoiseSensitivity",
+    "noise_sensitivity_irrelevant",
+    "noise_sensitivity_relevant",
     "labelled_rubrics_score",
     "reference_free_rubrics_score",
     "ReferenceFreeRubricsScore",