RAGAS without ground_truth #1379

Hi, is it possible to use the library without having the ground_truth for any metrics?
Hey @dasilvaviviane! I'm here to help you with any issues or questions you have while waiting for a human maintainer. I'm a bot 🤖 that can assist you with bug fixes, answer questions, and guide you on how to become a contributor. Yes, it is possible to use the RAGAS library without having the ground_truth for any metrics. Ragas can perform reference-free evaluations of various aspects of your RAG pipeline, meaning you don't need ground truths when running the evaluations and can run them on production traces that you've collected with Langfuse. The library supports both LLM-based and non-LLM-based metrics, allowing for flexible evaluation methods that do not necessarily require ground-truth data [1][2].
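To make this concrete, here is a minimal reference-free sketch. It assumes the ragas 0.1.x column names (`question`, `answer`, `contexts`) and an OpenAI key in the environment; the toy strings are made up for illustration.

```python
# Minimal sketch, assuming ragas 0.1.x: faithfulness and answer_relevancy are
# reference-free, so no ground_truth column is required.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

dataset = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["The capital of France is Paris."],
    "contexts": [["Paris is the capital and largest city of France."]],
})

result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)  # e.g. {'faithfulness': ..., 'answer_relevancy': ...}
```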
How do you evaluate Ragas metrics if any key (user input, contexts, answer, or reference) is not present or is null?
You can check out the General Purpose Metrics mentioned in the docs too: https://docs.ragas.io/en/latest/concepts/metrics (at the bottom).
If any key user input, context, answer, or reference is not present or is null, the evaluation is designed to degrade gracefully rather than crash. For example, when the answers are missing the score falls back to `np.nan` (`if answers is None: return np.nan`), and when no dataset is supplied at all a `ValueError("Provide dataset!")` is raised. The `raise_exceptions: bool = False` parameter ensures that the evaluation process can handle missing or null values gracefully, either by returning `np.nan` for the affected rows or by raising an exception, depending on how the flag is set.
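As a concrete illustration, a small sketch assuming the ragas 0.1.x `evaluate` signature: with `raise_exceptions=False`, rows that cannot be scored come back as NaN and can be inspected afterwards.

```python
# Sketch, assuming ragas 0.1.x: rows that fail to score (missing/null inputs,
# failed LLM calls) are reported as NaN instead of aborting the whole run.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy

dataset = Dataset.from_dict({
    "question": ["What is the capital of France?", "What is the capital of Spain?"],
    "answer": ["Paris is the capital of France.", "Madrid is the capital of Spain."],
    "contexts": [["Paris is the capital of France."], ["Madrid is the capital of Spain."]],
})

result = evaluate(dataset, metrics=[answer_relevancy], raise_exceptions=False)
df = result.to_pandas()
print(df[df["answer_relevancy"].isna()])  # any rows that could not be scored
```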
Thanks @jjmachan for the very helpful link you posted.
Also, what is the stable version of RAGAS? From PyPI we get 0.1.20, but in this version many functions will be deprecated.
Thanks. Following your suggestion, I tried to use https://docs.ragas.io/en/latest/concepts/metrics/available_metrics/answer_relevance/#response-relevancy, but I couldn't import SingleTurnSample. Do you have any idea why?
To evaluate RAGAS metrics using a golden dataset that has only question and answer columns, while retrieving the answers from an LLM and the contexts at runtime, you can first run the query engine over every question and then build the evaluation dataset from the responses. Here's a code snippet to guide you:

```python
from datasets import Dataset
from ragas.evaluation import evaluate as ragas_evaluate
from ragas.executor import Executor


def evaluate_ragas(query_engine, dataset, metrics, llm=None, embeddings=None, raise_exceptions=False):
    if dataset is None:
        raise ValueError("Provide dataset!")

    # Run every question through the (async) query engine.
    exec = Executor(
        desc="Running Query Engine",
        keep_progress_bar=True,
        raise_exceptions=raise_exceptions,
    )
    queries = dataset["question"]
    for i, q in enumerate(queries):
        exec.submit(query_engine.aquery, q, name=f"query-{i}")

    # Collect the generated answers and the retrieved contexts for each query.
    answers = []
    contexts = []
    try:
        results = exec.results()
        if results == []:
            raise Exception("No results returned")
    except Exception as e:
        raise e
    else:
        for r in results:
            answers.append(r.response)
            contexts.append([n.node.text for n in r.source_nodes])

    # Build the evaluation dataset expected by Ragas and score it.
    hf_dataset = Dataset.from_dict(
        {
            "question": queries,
            "contexts": contexts,
            "answer": answers,
        }
    )
    results = ragas_evaluate(
        dataset=hf_dataset,
        metrics=metrics,
        llm=llm,
        embeddings=embeddings,
        raise_exceptions=raise_exceptions,
    )
    return results
```

This function will handle the evaluation process, leveraging tools like LlamaIndex for building the query engine and RAGAS for the evaluation [1].
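For reference, a hedged usage sketch of the helper above; the LlamaIndex import path and the tiny in-memory index are assumptions for illustration, not part of the original answer.

```python
# Hypothetical usage sketch: build a tiny LlamaIndex index, then score a golden
# dataset that only has question/answer columns. Assumes recent llama-index
# import paths and an OpenAI key in the environment.
from datasets import Dataset
from llama_index.core import Document, VectorStoreIndex
from ragas.metrics import faithfulness, answer_relevancy

docs = [Document(text="New hires request VPN access through the IT portal.")]
query_engine = VectorStoreIndex.from_documents(docs).as_query_engine()

golden = Dataset.from_dict({
    "question": ["How do new hires get VPN access?"],
    "answer": ["They request it through the IT portal."],  # golden answer; unused by reference-free metrics
})

scores = evaluate_ragas(
    query_engine=query_engine,
    dataset=golden,
    metrics=[faithfulness, answer_relevancy],
)
print(scores)
```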
@veenuvinod the metrics that would be useful are:

Others are also present, but these would help you get started.

@dasilvaviviane you're using the latest version; try the stable version instead.
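For context on the SingleTurnSample question above: that class only exists on the newer 0.2-style API, which is why the import fails on 0.1.20. A hedged sketch of how the response-relevancy docs linked earlier are typically used, assuming ragas >= 0.2 and the LangChain OpenAI wrappers are installed:

```python
# Sketch assuming ragas >= 0.2; on ragas 0.1.x these imports do not exist,
# which would explain the failure to import SingleTurnSample.
import asyncio
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.dataset_schema import SingleTurnSample
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import ResponseRelevancy

sample = SingleTurnSample(
    user_input="When was the first Super Bowl?",
    response="The first Super Bowl was held on January 15, 1967.",
    retrieved_contexts=[
        "The First AFL-NFL World Championship Game was played on January 15, 1967."
    ],
)

scorer = ResponseRelevancy(
    llm=LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini")),
    embeddings=LangchainEmbeddingsWrapper(OpenAIEmbeddings()),
)
# Reference-free: no ground_truth/reference field is required for this metric.
print(asyncio.run(scorer.single_turn_ascore(sample)))
```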
We have something like that here: https://docs.ragas.io/en/stable/howtos/applications/_metrics_llm_calls/#export-llm-traces @veenuvinod. Could you check it out and let us know if that is what you were looking for?
To set verbose to True in RAGAS and see how the metrics are evaluated, you can pass the flag when calling evaluate:

```python
results = evaluate(
    dataset_name="MyDataset",
    llm_or_chain_factory=my_llm,
    experiment_name="experiment_1_with_vanila_rag",
    verbose=True,
)
print(results)
```

Setting `verbose=True` prints additional detail about the evaluation run while the metrics are being computed.
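A note on where that `evaluate` comes from: the `dataset_name`/`llm_or_chain_factory` keywords suggest the LangSmith integration helper rather than the core `ragas.evaluate`. If that assumption is right, the import would be:

```python
# Assumption: the snippet above uses the LangSmith integration helper, inferred
# from its keyword arguments; verify against your installed ragas version.
from ragas.integrations.langsmith import evaluate
```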