Runner thread raised exception and testset function is not accessible #1333
Comments
Hello, @Sam-364! I'm here to assist you with resolving bugs, answering questions, and becoming a contributor. Let's work together to tackle this issue! There are a few similar issues that might help you resolve this problem:
Here is a modified version of your code:
import torch
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context
from langchain_ollama import ChatOllama
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from langchain.embeddings import HuggingFaceEmbeddings
from langchain_core.documents import Document as LCDocument
import pymupdf
import nest_asyncio
nest_asyncio.apply()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {"device": device}
generator_llm = ChatOllama(model="llama3", temperature=0.2)
critic_llm = ChatOllama(model="llama3.1", temperature=0.2)
embeddings = HuggingFaceEmbeddings(model_name=model_name, model_kwargs=model_kwargs)
def load_pdf_as_document(file_path):
    doc = pymupdf.open(file_path)
    text = ""
    for page in doc:
        text += page.get_text()
    return LCDocument(page_content=text)
pdf_path = "/content/1.pdf"
document = load_pdf_as_document(pdf_path)
documents = [document]
print(documents)
generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    embeddings,
)
testset = generator.generate_with_langchain_docs(
    documents=documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
    with_debugging_logs=True,
    is_async=True,
    raise_exceptions=False,
)
test_df = testset.to_pandas()
test_df.head()
Try running this updated script and see if it resolves the issue. If the problem persists, consider checking the detailed error trace for any specific issues related to your environment or dependencies.
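One way to get at the detailed error trace is to wrap the failing call and keep the full traceback as text, so it can be pasted into a report. This is only a minimal sketch: `run_and_capture` and `failing_step` are hypothetical names, `failing_step` stands in for the generation call, and it assumes the call actually raises rather than swallowing the error.

```python
import traceback


def run_and_capture(fn, *args, **kwargs):
    """Call fn and return (result, None), or (None, traceback text) if it raises."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        # format_exc() returns the same text Python would print for the error
        return None, traceback.format_exc()


def failing_step():
    """Hypothetical stand-in for the call that fails."""
    raise ValueError("simulated failure")


result, trace = run_and_capture(failing_step)
print(trace)
```

In the script above, the generation call could be passed in the same way, for example `run_and_capture(generator.generate_with_langchain_docs, documents=documents, test_size=10)`.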
I had previously used nest_asyncio.apply() but later realized that the "is_async=True" parameter in the testset generation call does the same thing without any recurrent effect.
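For context, nest_asyncio.apply() is usually added in notebooks because stock asyncio refuses to start a second event loop from inside one that is already running. The sketch below uses only the standard library to show that refusal. `fetch_answer` and `notebook_cell` are hypothetical names, no ragas code is involved, and whether "is_async=True" fully replaces nest_asyncio.apply() is the observation above, not something this snippet verifies.

```python
import asyncio


async def fetch_answer() -> int:
    """Hypothetical stand-in for an async LLM call."""
    await asyncio.sleep(0)
    return 42


async def notebook_cell() -> str:
    """Runs inside an already-running loop, as code in Jupyter or Colab does."""
    coro = fetch_answer()
    try:
        asyncio.run(coro)  # a nested run is rejected by stock asyncio
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


# From plain synchronous code, asyncio.run() works as expected.
outside_result = asyncio.run(fetch_answer())

# From inside a running loop, the same call raises RuntimeError.
inside_error = asyncio.run(notebook_cell())

print(outside_result)
print(inside_error)
```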
After downgrading the versions of the LangChain family of packages, the same issue was solved. My package versions are now:
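For anyone comparing environments, installed versions can be printed with the standard library as sketched below. The package names in `PACKAGES` are an assumption about which distributions matter here, not the commenter's actual list, and `installed_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed set of relevant distributions; adjust to your environment.
PACKAGES = [
    "ragas",
    "langchain",
    "langchain-core",
    "langchain-community",
    "langchain-ollama",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or 'not installed'."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver}")
```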
Yes, I forgot to comment: I also downgraded the versions of the packages and it worked for me. The only remaining issue is that it takes an eternity to generate the dataset. P.S. I hope to have a discussion on that issue later on! For now, I'm closing the issue.
I have checked both the documentation and ragas-langchain documentation and couldn't resolve my issue.
The bug
Whenever I try to generate a testset with the generator module's generate_with_langchain_docs function, I get a thread handling error and execution stops abruptly. I tried downgrading to ragas==0.1.7, the last version in which I had not seen this error, but that didn't help. The same error occurs with generate_with_llamaindex_docs, so I also tried combining the two frameworks (loading documents with the LangChain document loader and calling generate_with_llamaindex_docs), but the issue persisted. I have followed the documentation thoroughly without being able to fix the bug. Passing the prescribed "raise_exceptions=False" has no effect either. I am using local Llama models served through Ollama as generator_llm and critic_llm. I also checked the individual arguments, but it still does not work.
Ragas version: 0.1.20
Python version: 3.10.12
Here is my detailed code:
Error trace
While executing the following code block:
I am getting the following error:
Expected behavior
The script is intended to create a synthetic QnA dataset from my docs for evaluating a RAG pipeline, but execution halts abruptly because of the bug. All modules except the testset module are working.
Hoping for a quick fix, because I have seen many similar issues reported for the generator module.