fix: improved answer relevancy #346

Merged: 40 commits merged on Dec 8, 2023
Changes shown from 37 of 40 commits

Commits
8b8d1fe - add langchain loaders to docs (shahules786, Oct 19, 2023)
cd7f411 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Oct 20, 2023)
5b18325 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Oct 26, 2023)
bb8d984 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Oct 26, 2023)
9cbb57d - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Oct 29, 2023)
479e636 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 7, 2023)
3eeb7ea - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 12, 2023)
b09003f - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 17, 2023)
0d28d62 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 20, 2023)
110cc02 - reformat to json format (shahules786, Nov 20, 2023)
c4036f2 - add qcg to validate (shahules786, Nov 20, 2023)
cf14e39 - determinism experiments (shahules786, Nov 20, 2023)
7ba5f46 - json loader (shahules786, Nov 21, 2023)
1df47bf - replace with nanmean (shahules786, Nov 21, 2023)
d54ef72 - move json loader (shahules786, Nov 21, 2023)
3b1878d - move json loader (shahules786, Nov 21, 2023)
cc128c9 - fix type error (shahules786, Nov 22, 2023)
24b9e25 - Merge branch 'main' of https://github.com/explodinggradients/ragas in… (shahules786, Nov 22, 2023)
16821c4 - add error string (shahules786, Nov 22, 2023)
8e7c0c4 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 24, 2023)
14e7440 - Merge branch 'main' of https://github.com/explodinggradients/ragas in… (shahules786, Nov 24, 2023)
35fb0e6 - structured output (shahules786, Nov 24, 2023)
dd218e1 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 26, 2023)
14375b7 - Merge branch 'main' of https://github.com/explodinggradients/ragas in… (shahules786, Nov 26, 2023)
ca04c6d - remove defaults (shahules786, Nov 26, 2023)
497df5d - prompt improvements (shahules786, Nov 26, 2023)
18466a2 - default to None (shahules786, Nov 26, 2023)
452272d - fix nli for unrelated answering (shahules786, Nov 27, 2023)
a78cd49 - fix nli for unrelated answering (shahules786, Nov 27, 2023)
9ab43a6 - load output as json (shahules786, Nov 27, 2023)
1b78e13 - merge main (shahules786, Nov 27, 2023)
eab12df - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 27, 2023)
a63aa63 - Merge branch 'main' into dev#240 (shahules786, Nov 27, 2023)
f32ecf1 - improve answer relevancy (shahules786, Nov 29, 2023)
8fdcd26 - Merge branch 'main' of https://github.com/explodinggradients/ragas in… (shahules786, Nov 29, 2023)
a0f1b9b - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Nov 29, 2023)
96f559f - Merge branch 'main' into dev#240 (shahules786, Nov 29, 2023)
2487430 - Merge branch 'main' of https://github.com/explodinggradients/ragas (shahules786, Dec 1, 2023)
19f0a42 - resolve merge conflicts (shahules786, Dec 7, 2023)
aaa8717 - fix linting (shahules786, Dec 7, 2023)
67 changes: 53 additions & 14 deletions src/ragas/metrics/_answer_relevance.py
@@ -12,6 +12,7 @@
from ragas.embeddings.base import embedding_factory
from ragas.exceptions import OpenAIKeyNotFound
from ragas.metrics.base import EvaluationMode, MetricWithLLM
from ragas.utils import load_as_json

if t.TYPE_CHECKING:
from langchain.callbacks.manager import CallbackManager
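
The added import pulls in load_as_json from ragas.utils, which the reworked _score_batch below uses to parse the model's JSON replies. Its real implementation is not part of this diff; the following is only a hedged sketch of the general shape such a helper might take (assumed behavior: pull out the first JSON object in the reply and fall back to an empty dict on failure):

import json
import re

def load_as_json_sketch(text: str) -> dict:
    """Hypothetical stand-in for ragas.utils.load_as_json (not the library's code)."""
    # Grab the outermost {...} span so stray prose around the JSON does not break parsing.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return {}
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return {}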
@@ -21,13 +22,46 @@

QUESTION_GEN = HumanMessagePromptTemplate.from_template(
"""
Generate question for the given answer.
Answer:\nThe PSLV-C56 mission is scheduled to be launched on Sunday, 30 July 2023 at 06:30 IST / 01:00 UTC. It will be launched from the Satish Dhawan Space Centre, Sriharikota, Andhra Pradesh, India
Question: When is the scheduled launch date and time for the PSLV-C56 mission, and where will it be launched from?
Generate a question for the given answer and identify if the answer is noncommittal

Answer:{answer}
Question:
""" # noqa: E501
Answer:
Albert Einstein was born in Germany.
Context:
Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time
Output:
{{"question":"Where was Albert Einstein born?","noncommittal":false}}


Answer:
It can change its skin color based on the temperature of its environment.
Context:
A recent scientific study has discovered a new species of frog in the Amazon rainforest that has the unique ability to change its skin color based on the temperature of its environment.
Output:
{{"question":"What unique ability does the newly discovered species of frog have?","noncommittal":false}}


Answer:
Everest
Context:
The tallest mountain on Earth, measured from sea level, is a renowned peak located in the Himalayas.
Output:
{{"question":"What is the tallest mountain on Earth?","noncommittal":false}}


Answer:
I don't know about the groundbreaking feature of the smartphone invented in 2023 as I am unaware of information beyond 2022.
Context:
In 2023, a groundbreaking invention was announced: a smartphone with a battery life of one month, revolutionizing the way people use mobile technology.
Output:
{{"question":"What was the groundbreaking feature of the smartphone invented in 2023?", "noncommittal":true}}



Answer:
{answer}
Context:
{context}
Output:""" # noqa: E501
)
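
For illustration only, this is how the new template would be rendered and the kind of single JSON object the model is expected to return (the answer and context values below are invented):

example_prompt = QUESTION_GEN.format(
    answer="The Eiffel Tower is in Paris.",
    context="The Eiffel Tower, completed in 1889, is located in Paris, France.",
)
# Expected model reply, parsed downstream with load_as_json:
# {"question": "Where is the Eiffel Tower located?", "noncommittal": false}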


@@ -53,7 +87,7 @@ class AnswerRelevancy(MetricWithLLM):
"""

name: str = "answer_relevancy"
evaluation_mode: EvaluationMode = EvaluationMode.qa
evaluation_mode: EvaluationMode = EvaluationMode.qac
batch_size: int = 15
strictness: int = 3
embeddings: RagasEmbeddings = field(default_factory=embedding_factory)
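
Since evaluation_mode changes from qa to qac, datasets scored with this metric now need a contexts column next to question and answer. A hedged usage sketch (the rows are invented and the evaluate call assumes the usual ragas entry point):

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy

ds = Dataset.from_dict({
    "question": ["Where was Albert Einstein born?"],
    "answer": ["Albert Einstein was born in Germany."],
    "contexts": [["Albert Einstein was a German-born theoretical physicist."]],
})
result = evaluate(ds, metrics=[answer_relevancy])  # uses an LLM and embeddings under the hood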
@@ -71,26 +105,31 @@ def _score_batch(
callbacks: t.Optional[CallbackManager] = None,
callback_group_name: str = "batch",
) -> list[float]:
questions, answers = dataset["question"], dataset["answer"]
questions, answers, contexts = (
dataset["question"],
dataset["answer"],
dataset["contexts"],
)
with trace_as_chain_group(
callback_group_name, callback_manager=callbacks
) as batch_group:
prompts = []
for ans in answers:
human_prompt = QUESTION_GEN.format(answer=ans)
for ans, ctx in zip(answers, contexts):
human_prompt = QUESTION_GEN.format(answer=ans, context="\n".join(ctx))
prompts.append(ChatPromptTemplate.from_messages([human_prompt]))

results = self.llm.generate(
prompts,
n=self.strictness,
callbacks=batch_group,
)
results = [[i.text for i in r] for r in results.generations]

results = [[load_as_json(i.text) for i in r] for r in results.generations]
scores = []
for question, gen_questions in zip(questions, results):
for question, result in zip(questions, results):
gen_questions = [item.get("question", "") for item in result]
committal = np.any([item.get("noncommittal", False) for item in result])
cosine_sim = self.calculate_similarity(question, gen_questions)
scores.append(cosine_sim.mean())
scores.append(cosine_sim.mean() * int(not committal))

return scores

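Putting the new loop together: for each sample the LLM regenerates strictness questions from the answer and context, each reply is parsed as JSON, and the score is the mean cosine similarity between the original question and the regenerated ones, forced to zero if any reply was flagged noncommittal. A minimal numeric sketch of that last step (similarity values are made up):

import numpy as np

# Cosine similarities of 3 regenerated questions against the original question.
cosine_sim = np.array([0.91, 0.87, 0.89])
noncommittal_flags = [False, False, False]

# Mirrors the diff: the variable named `committal` actually holds "any generation was noncommittal".
committal = np.any(noncommittal_flags)
score = cosine_sim.mean() * int(not committal)  # ~0.89 here; 0.0 if any flag were True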