Index Error when trying to evaluate() a simple example #1710

Open
taledv opened this issue Nov 26, 2024 · 1 comment

taledv commented Nov 26, 2024

[ ] I checked the issues and your site, and couldn't find an answer to my question.

My understanding + question

As I understand it, you can pass your own LLM to evaluate() (or omit it, in which case a default GPT-4o model is used). The important point is that you don't need to pass an object that combines the LLM and the retriever, because the dataset you send to evaluate() was already produced by running retrieval and generation beforehand; you just want to evaluate it.
If my understanding is correct, I don't know why the following code fails with this error:

Token indices sequence length is longer than the specified maximum sequence length for this model (1921 > 1024). Running this sequence through the model will result in indexing errors
Exception raised in Job[0]: IndexError(index out of range in self)

As you will see in the code below, I have a very simple data example, probably a few dozen tokens, so why does it say the input somehow exceeds the model's maximum sequence length (which is 1024)? I don't understand how it gets to 1921.
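
For reference, a minimal sketch of how one could count the tokens gpt2 actually receives, using the Hugging Face tokenizer; prompt_text below is just a placeholder for whatever prompt ragas renders internally:

from transformers import AutoTokenizer

# gpt2's tokenizer carries the model's 1024-token limit and warns when a sequence exceeds it
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt_text = "..."  # placeholder for the full prompt ragas sends to the model
token_ids = tokenizer.encode(prompt_text)
print(len(token_ids))  # the warning above reports 1921 for the rendered prompt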

Ragas version: 0.2.6
Python version: 3.10.8

Code

from langchain.llms import HuggingFacePipeline
from ragas.llms import LangchainLLMWrapper
from ragas import EvaluationDataset
from ragas.metrics import Faithfulness
from ragas import evaluate
from datasets import Dataset, DatasetDict
import torch

torch.device('mps')  # note: this only constructs a device object; nothing is moved to MPS here

data = {
    'user_input': ["When was America founded?"],
    'retrieved_contexts':  [[
    "The United States has over 331 million people.",
    'The United States of America was founded in 1776']],
    'response': ["America was founded in 1776"],
    'reference': ["America was founded in 1776"]
}

custom_dataset = DatasetDict({"eval": Dataset.from_dict(data)})
eval_dataset_custom = EvaluationDataset.from_hf_dataset(custom_dataset['eval'])

# gpt2's maximum sequence length is 1024 tokens
llm = HuggingFacePipeline.from_model_id(
    model_id='gpt2',
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 50},
)

results = evaluate(dataset=eval_dataset_custom, metrics=[Faithfulness(llm=LangchainLLMWrapper(llm))])

Additional context

I have closed my other issue (#1700), since there I passed a RetrievalQA.from_chain_type(llm=llm, retriever=retriever) to evaluate(), which, as I understand it, is not needed; only the llm object is. Now I get the different error mentioned above, and I would really appreciate your help. I think I wrote the simplest example that should work.

@taledv added the question (Further information is requested) label Nov 26, 2024
@dosubot bot added the bug (Something isn't working) label Nov 26, 2024
sahusiddharth (Collaborator) commented

Hi @taledv

This is happening due to the prompt used in the faithfulness metric. You can check the prompt with the code below to see how it's affecting the token count:

from ragas.metrics import Faithfulness
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI

evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))
scorer = Faithfulness(llm=evaluator_llm)
scorer.get_prompts()  # returns the prompts the metric sends to the evaluator LLM

If you’d like to modify it, you can follow the instructions in our how to modify metric prompt docs.
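
As a rough sketch of that flow (assuming get_prompts()/set_prompts() behave as described in those docs; the key name below is hypothetical, so check the actual keys returned for your metric first):

# list the prompt objects the metric uses; keys vary by metric
prompts = scorer.get_prompts()
print(prompts.keys())

# pick one prompt by its key ("some_prompt_name" is a placeholder for a real key),
# shorten or rewrite its instruction, and set it back on the metric
prompt = prompts["some_prompt_name"]
prompt.instruction = "Your shorter instruction here."
scorer.set_prompts(**{"some_prompt_name": prompt})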

@sahusiddharth added the module-metrics (this is part of metrics module) and answered 🤖 (The question has been answered. Will be closed automatically if no new comments) labels Jan 11, 2025
@sahusiddharth self-assigned this Jan 11, 2025