Commit
Implementation of Noise sensitivity metrics from RAGChecker (#1190)
Solves:
- #1185
- Took inspiration from the [noise sensitivity](https://github.com/amazon-science/RAGChecker/tree/main/ragchecker) metrics in RAGChecker from AWS.
- Tested locally; it works and returns results.

### Input

```python
from datasets import Dataset
from ragas.metrics import noise_sensitivity_relevant, noise_sensitivity_irrelevant
from ragas import evaluate

data_sample = {
    "question": ["What is the Life Insurance Corporation of India (LIC) known for?"],
    "ground_truth": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments."],
    "answer": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country."],
    "contexts": [[
        "The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.",
        "LIC is the largest insurance company in India, with a vast network of policyholders and huge investments.",
        "As the largest institutional investor in India, LIC manages substantial funds, contributing to the financial stability of the country.",
        "The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing etc."
    ]]
}

dataset = Dataset.from_dict(data_sample)
metrics = [noise_sensitivity_relevant, noise_sensitivity_irrelevant]
score = evaluate(dataset, metrics=metrics)
score.to_pandas()
```

---------

Co-authored-by: Shahules786 <[email protected]>
1 parent d5b60bb, commit 8da231d
Showing 6 changed files with 379 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
|
||
|
||
# Noise Sensitivity | ||
|
||
Noise sensitivity measures how often a system makes errors by providing incorrect responses when utilizing either relevant or irrelevant retrieved documents. The score ranges from 0 to 1, with lower values indicating better performance. Noise sensitivity is computed using the question, ground truth, answer, and the retrieved context. | ||
|
||
To estimate noise sensitivity, each claim in the generated answer is examined to determine whether it is correct based on the ground truth and whether it can be attributed to the relevant (or irrelevant) retrieved context. Ideally, all claims in the answer should be supported by the relevant retrieved context. | ||
|
||
|
||
```{math} | ||
\text{noise sensitivity (relevant)} = {|\text{incorrect claims in the answer}| \over |\text{claims in the answer}|} | ||
``` | ||
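
The ratio above can be sketched in a few lines of Python. This is only an illustration, not the ragas implementation: the `Claim` structure and the `noise_sensitivity` helper are hypothetical names, and the per-claim labels are assumed to come from an upstream judgment step (for example an LLM) that is not shown. Following the description above, a claim is counted only when it is incorrect with respect to the ground truth and can be attributed to the chosen kind of context.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    """One claim extracted from the generated answer (hypothetical structure)."""
    text: str
    correct: bool                # is the claim supported by the ground truth?
    in_relevant_context: bool    # can it be attributed to a relevant retrieved context?
    in_irrelevant_context: bool  # can it be attributed to an irrelevant retrieved context?


def noise_sensitivity(claims: List[Claim], relevant: bool = True) -> float:
    """Fraction of answer claims that are incorrect and attributable to the
    relevant (default) or irrelevant retrieved context."""
    if not claims:
        return 0.0  # assumption: an answer with no claims scores 0
    incorrect = sum(
        1
        for c in claims
        if not c.correct
        and (c.in_relevant_context if relevant else c.in_irrelevant_context)
    )
    return incorrect / len(claims)
```

Passing `relevant=False` gives the corresponding score for irrelevant context under the same assumptions.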
|
||
```{Hint} | ||
Question: What is the Life Insurance Corporation of India (LIC) known for? | ||
Ground truth: The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments. | ||
Relevant Retrieval: | ||
- The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India. | ||
- LIC is the largest insurance company in India, with a vast network of policyholders and a significant role in the financial sector. | ||
- As the largest institutional investor in India, LIC manages a substantial life fund, contributing to the financial stability of the country. | ||
Irrelevant Retrieval: | ||
- The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing etc. | ||
``` | ||
|
||
|
||
## Example | ||
|
||
```{code-block} python | ||
:caption: Noise Sensitivity | ||
from datasets import Dataset | ||
from ragas.metrics import noise_sensitivity_relevant, noise_sensitivity_irrelevant | ||
from ragas import evaluate | ||
# One evaluation row: question, ground truth, generated answer, and retrieved contexts. | ||
data_sample = { | ||
"question": ["What is the Life Insurance Corporation of India (LIC) known for?"], | ||
"ground_truth": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments."], | ||
"answer": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country."], | ||
"contexts": [[ | ||
"The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.", | ||
"LIC is the largest insurance company in India, with a vast network of policyholders and huge investments.", | ||
"As the largest institutional investor in India, LIC manages substantial funds, contributing to the financial stability of the country.", | ||
"The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing etc." | ||
]] | ||
} | ||
dataset = Dataset.from_dict(data_sample) | ||
# Score the row with both noise sensitivity variants. | ||
metrics = [noise_sensitivity_relevant, noise_sensitivity_irrelevant] | ||
score = evaluate(dataset, metrics=metrics) | ||
# Convert the per-row results to a pandas DataFrame. | ||
score.to_pandas() | ||
``` | ||
|
||
## Calculation | ||
|
||
Let's examine how noise sensitivity for the relevant context is calculated: | ||
|
||
- **Step 1:** Identify the relevant contexts from which the ground truth can be inferred. | ||
|
||
- Ground Truth: | ||
The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments. | ||
|
||
- Contexts: | ||
- Context 1: `The Life Insurance Corporation of India (LIC) was established in 1956` following the nationalization of the insurance industry in India. | ||
- Context 2: `LIC is the largest insurance company in India`, with a vast network of policyholders and a significant role in the financial sector. | ||
- Context 3: `As the largest institutional investor in India, LIC manages substantial funds`, contributing to the financial stability of the country. | ||
|
||
- **Step 2:** Verify if the claims in the generated answer can be inferred from the relevant context. | ||
|
||
- Answer: | ||
The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country. | ||
|
||
- Contexts: | ||
- Context 1: The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India. | ||
- Context 2: `LIC is the largest insurance company in India`, with a vast network of policyholders and a significant role in the financial sector. | ||
- Context 3: `As the largest institutional investor in India, LIC manages substantial funds`, `contributing to the financial stability of the country`. | ||
|
||
|
||
- **Step 3:** Identify any incorrect claims in the answer (i.e., answer statements that are not supported by the ground truth). | ||
|
||
- Ground Truth: | ||
The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments. | ||
|
||
- Answer: | ||
The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. `LIC contributes to the financial stability of the country`. | ||
|
||
Explanation: The ground truth does not mention anything about LIC contributing to the financial stability of the country. Therefore, this statement in the answer is incorrect. | ||
|
||
Incorrect claims: 1 | ||
Total claims: 3 | ||
|
||
- **Step 4:** Calculate noise sensitivity using the formula: | ||
```{math} | ||
\text{noise sensitivity (relevant)} = { 1 \over 3 } \approx 0.333 | ||
``` | ||
This results in a noise sensitivity score of 0.333, indicating that one out of three claims in the answer was incorrect. | ||
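
The arithmetic in Steps 2 to 4 can be reproduced with a short, self-contained sketch. The three claims and their labels below are assigned by hand from the walkthrough above; they are assumptions for illustration, not output produced by ragas.

```python
# Each entry: (claim text, supported by ground truth?, attributable to a relevant context?)
# Labels are hand-assigned from the walkthrough above, not produced by ragas.
claims = [
    ("LIC is the largest insurance company in India", True, True),
    ("LIC is known for its vast portfolio of investments", True, True),
    ("LIC contributes to the financial stability of the country", False, True),
]

# A claim counts against the score when it is incorrect and comes from relevant context.
incorrect = [
    text for text, correct, in_relevant in claims if not correct and in_relevant
]

score = len(incorrect) / len(claims)
print(f"{len(incorrect)} / {len(claims)} = {score:.3f}")  # prints "1 / 3 = 0.333"
```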
Credits: Noise sensitivity was introduced in [RAGChecker](https://github.com/amazon-science/RAGChecker/tree/main/ragchecker) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
"""Async utils.""" | ||
|
||
import asyncio | ||
from typing import Any, Coroutine, List | ||
|
||
|