
Testset Generation: Is going into continuous loop #662

Open
Vtej98 opened this issue Feb 27, 2024 · 19 comments
Labels
question Further information is requested

Comments

@Vtej98

Vtej98 commented Feb 27, 2024

Question
I'm not sure what's happening. The testset isn't being generated; it just goes into a continuous loop, exhausting my OpenAI tokens.

My Code
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context
from langchain.document_loaders import DirectoryLoader
import os

OPENAI_API_KEY = "sk-xxxxxx"
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

def load_docs(directory):
    loader = DirectoryLoader(directory)
    documents = loader.load()
    return documents

documents = load_docs("./source")
for document in documents:
    document.metadata['file_name'] = document.metadata['source']

generator = TestsetGenerator.with_openai()

testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
testset.to_pandas().to_excel('output_data.xlsx', index=False)

Additional context
I did explore the code and found that it has a retry mechanism: 15 retries with a wait time of up to 90 seconds. But even after waiting a long time, there was no sign of completion.
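For context on how long that retry loop can legitimately run before anything is reported, here is a rough standalone sketch (pure Python; the 15-retry and 90-second-cap figures come from the comment above, while the doubling schedule is an assumption for illustration):

```python
# Rough worst-case wall time for a capped exponential backoff:
# 15 retries, waits doubling from 1 s but capped at 90 s.
MAX_RETRIES = 15
WAIT_CAP_S = 90

waits = [min(2 ** attempt, WAIT_CAP_S) for attempt in range(MAX_RETRIES)]
total_wait_s = sum(waits)

print(waits)         # [1, 2, 4, 8, 16, 32, 64, 90, 90, ..., 90]
print(total_wait_s)  # 847 -> roughly 14 minutes of pure waiting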

@Vtej98 Vtej98 added the question Further information is requested label Feb 27, 2024
@shahules786
Member

Hey @Vtej98, can you try it with a limited number of documents first? Also set the with_debugging_logs argument to True so that we have better context on where it is getting stuck.
Also, which version of ragas are you using?
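The debug output that with_debugging_logs=True enables can also be persisted with the standard library, so the last entry before a hang survives the run (a generic sketch; the logger names are taken from the debug output quoted later in this thread):

```python
import logging

# Route ragas debug output to a file so that, after a hang, the last
# logged step shows where generation stopped. The logger names match the
# "ragas.testset.*" entries seen in the debug logs in this thread.
handler = logging.FileHandler("ragas_debug.log")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)

for name in ("ragas.testset.evolutions", "ragas.testset.filters"):
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
```

Tailing the file while generation runs shows whether the process is still cycling through retries or is truly blocked.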

@Vtej98
Author

Vtej98 commented Mar 1, 2024

Hello @shahules786,

The number of documents used is 1; it contains barely 9 pages.

Please find the debug log:

Filename and doc_id are the same for all nodes.
Generating: 0%| | 0/5 [00:00<?, ?it/s][ragas.testset.filters.DEBUG] node filter: {'score': 7.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['Cyberlife Browser', 'Ring alert', 'Correspondence queue', 'Confirmation letter', 'POA document']
[ragas.testset.filters.DEBUG] node filter: {'score': 7.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['Cyberlife Browser', 'Ring alert', 'Correspondence queue', 'Confirmation letter', 'POA document']
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 8.0}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['Cyberlife Browser', 'Ring alert', 'Correspondence queue', 'Confirmation letter', 'POA document']
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 7.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['Cyberlife Browser', 'Ring alert', 'Correspondence queue', 'Confirmation letter', 'POA document']
[ragas.testset.evolutions.INFO] seed question generated: What is the process for sending a confirmation letter after removing an AIF/POA or guardian/conservator from the system?
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] filtered question: {'reason': 'The question is specific and refers to a particular process, making it clear and answerable.', 'verdict': '1'}
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': 'Send a ring alert to the Correspondence queue per the Ring d.Alert Job Aid (OPE-007) in Related resources to have a confirmation letter sent.', 'verdict': '1'}
Generating: 20%|██████████████▍ | 1/5 [00:10<00:42, 10.75s/it][ragas.testset.filters.DEBUG] node filter: {'score': 7.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['Cyberlife Browser', 'Ring alert', 'Correspondence queue', 'Confirmation letter', 'POA document']
[ragas.testset.evolutions.INFO] seed question generated: What should be done after removing an AIF/POA or guardian/conservator from the system in order to have a confirmation letter sent?
[ragas.testset.filters.DEBUG] filtered question: {'reason': 'The question is clear and specific, referring to a particular process and desired outcome.', 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] [ReasoningEvolution] simple question generated: What should be done after removing an AIF/POA or guardian/conservator from the system in order to have a confirmation letter sent?
[ragas.testset.filters.DEBUG] filtered question: {'reason': 'The question is clear and specific, referring to a particular action in a system and asking for the subsequent steps.', 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] [ReasoningEvolution] question compressed: After removing an AIF/POA or guardian/conservator from the system, what action should be taken to ensure that a confirmation letter is sent?
[ragas.testset.filters.DEBUG] evolution filter: {'reason': 'Both questions ask for the same procedure to be followed after removing an AIF/POA or guardian/conservator from the system to ensure a confirmation letter is sent. They share the same depth, breadth, and requirements.', 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] evolution_filter failed, retrying with 1
[ragas.testset.evolutions.INFO] retrying evolution: 1 times
[ragas.testset.filters.DEBUG] node filter: {'score': 7.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['POA request', 'Address of record', 'Power of Attorney', 'Notarized signature', 'AIF/POA']
[ragas.testset.evolutions.INFO] seed question generated: What are the requirements for a Power of Attorney document to be considered valid?
[ragas.testset.filters.DEBUG] filtered question: {'reason': 'The question is clear and specific, referring to a particular legal document and its validity requirements.', 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] [ReasoningEvolution] simple question generated: What are the requirements for a Power of Attorney document to be considered valid?
[ragas.testset.filters.DEBUG] filtered question: {'reason': 'The question is clear and specific, outlining the exact information needed regarding a Power of Attorney document.', 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] [ReasoningEvolution] question compressed: What are the criteria for a Power of Attorney document to be considered valid, including the requirements for updating the address of record and the acceptable timeframe for the document's date?
[ragas.testset.filters.DEBUG] evolution filter: {'reason': 'The second question includes additional specific requirements (address updates and document date timeframe) that are not present in the first question, leading to a different depth of inquiry.', 'verdict': '0'}
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "XXXXX I just hidden the answer, but it has the answer correctly", 'verdict': '1'}
Generating: 40%|████████████████████████████▊ | 2/5 [00:40<01:06, 22.08s/it]

It freezes here and keeps draining OpenAI tokens.

Python version: 3.8
ragas==0.1.2

I also tried Python 3.12 and building ragas from source.

It's the same thing.

@Vtej98
Author

Vtej98 commented Mar 1, 2024

It's the same with the latest version of ragas as well.
I also tried ragas 0.1.3.

It just freezes after generating 1, 3, or 4 questions.

@asti009asti

The same thing happens when calling generate_with_llamaindex_docs(). It gets stuck forever and consumes tokens heavily in the background. There seems to be a threading lock problem.


@MatousEibich

+1, I've got the same issue.

It gets stuck at this and doesn't generate anything (but also doesn't consume tokens).

@omerfguzel

omerfguzel commented Mar 7, 2024

Same here, it gets stuck. Looking for a solution.

UPDATE
When I installed from source,

git clone https://github.com/explodinggradients/ragas && cd ragas
pip install -e .

the issue seems to be solved.

There is a new issue though: I couldn't generate a dataset from a single JSON file; I got a "Filename and doc_id are the same for all nodes." error. But that's okay, I wasn't going to generate a dataset from a single file anyway.

Hope it helps.

@Kelp710

Kelp710 commented Mar 15, 2024

I'm facing the same problem too.
I found that with documents that are short and in Japanese, the generation phase goes into a continuous loop, whereas with documents that are long and in English it works as expected.
Moreover, with documents that are long and in Japanese, it tends to loop more on the same question (using context_scoring_prompt), but it doesn't enter a continuous loop.

@shahules786
Copy link
Member

@Kelp710 did you do automatic language adaptation before using it with Japanese documents? https://docs.ragas.io/en/stable/howtos/applications/use_prompt_adaptation.html

@Kelp710

Kelp710 commented Mar 15, 2024

@shahules786
Thank you for the advice; I had not used automatic language adaptation.
I tried automatic language adaptation, but now it no longer takes the PDF document I use into account and generates irrelevant questions/answers.

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.evolutions import multi_context, reasoning, simple, conditional
from ragas.testset.generator import TestsetGenerator
import os
import uuid
from llama_index.core import SimpleDirectoryReader

unique_id = uuid.uuid4().hex[0:8]
os.environ["LANGCHAIN_PROJECT"] = f"Tracing Walkthrough - {unique_id}"

loader = SimpleDirectoryReader("./doc")
query_space = "large language models"
documents = loader.load_data()

# generator with openai models
generator_llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
critic_llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
embeddings = OpenAIEmbeddings()

generator = TestsetGenerator.from_langchain(generator_llm, critic_llm, embeddings)
generator.adapt("japanese", evolutions=[simple, reasoning, conditional, multi_context])
distributions = {simple: 0.2, multi_context: 0.4, reasoning: 0.1, conditional: 0.3}

# generate testset
testset = generator.generate_with_langchain_docs(documents, 3, distributions, with_debugging_logs=True)
testset = testset.to_pandas()

@satyaborg

Getting the same error (with llama-index, running both in a notebook and as a script); it keeps going even after setting the number of docs to 1:

..
Filename and doc_id are the same for all nodes.                   
Generating:   0%|          | 0/1 [00:00<?, ?it/s][ragas.testset.filters.DEBUG] node filter: {'score': 1.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 4.5}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 0.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 3.5}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 1.0}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times
[ragas.testset.filters.DEBUG] node filter: {'score': 3.5}
[ragas.testset.evolutions.INFO] retrying evolution: 0 times

As @asti009asti mentioned, this seems to be a threading issue. @shahules786, do you have any pointers as to what might be causing this? Happy to contribute.
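To test the threading hypothesis without attaching a debugger, the standard library's faulthandler can dump every thread's stack if the process is still running after a deadline (a generic diagnostic sketch, not ragas-specific; the 600-second deadline is an arbitrary assumption):

```python
import faulthandler
import sys

# If the process is still alive after 600 s, print the traceback of every
# thread to stderr, revealing which lock or network call the generation
# is blocked on. Cancel the timer once generation returns normally.
faulthandler.dump_traceback_later(600, file=sys.stderr)

# ... run generator.generate_with_langchain_docs(...) here ...

faulthandler.cancel_dump_traceback_later()
```

If the dump shows every worker waiting on the same lock or on a network read, that points to a deadlock or a stalled API call rather than productive retrying.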

@zhuweiji

zhuweiji commented Mar 24, 2024

Stuck on generating at 50%, and unable to stop the Python interpreter either.

Tried installing from source as @omerfguzel mentioned, and running on just a single English markdown document. Tried several times, with several different models, but all runs got stuck at 50%.

The document: https://github.com/awsdocs/aws-doc-sdk-examples/tree/main/python/cross_service/apigateway_covid-19_tracker

import os
from pathlib import Path

from langchain_community.document_loaders import DirectoryLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator

from get_secrets import project_secrets


os.environ["OPENAI_API_KEY"] = project_secrets['openai_token']

d = Path(__file__).parent / 'documents'

loader = DirectoryLoader(str(d))
documents = loader.load()

# generator with openai models
generator_llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
critic_llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
embeddings = OpenAIEmbeddings()

generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    embeddings
)

# generate testset
testset = generator.generate_with_langchain_docs(documents, test_size=3, distributions={
                                                 simple: 0.5, reasoning: 0.25, multi_context: 0.25})

testset.to_dataset().save_to_disk(d / 'testset')

@asti009asti

A couple of notes here, as they might help:

1. The generation process gets stuck if you don't adapt the prompts to your language. Calling the adapt() method on your TestsetGenerator object solves the issue:

    generator = TestsetGenerator.with_openai(
        generator_llm=llm,
        critic_llm=critic_llm,
        embeddings=embeddings_llm,
    )
    ...
    generator.adapt(language="whatever", evolutions=[simple, multi_context, conditional, reasoning])
    generator.save(evolutions=[simple, multi_context, reasoning, conditional])
    ...
    question_dataset = generator.generate_with_llamaindex_docs(
        run_config=run_config,
        documents=documents,
        test_size=number_of_questions,
        distributions=distributions,
        is_async=False,
        raise_exceptions=False,
    )

2. The last couple of releases solved many related problems; you can follow those conversations on Discord. I tried 0.1.5 this morning and it worked well. There seems to be another issue in evolutions.py (line 299), though: keyphase= is being passed as a named parameter instead of topic=, resulting in a ValueError for me. If I change it, the issue goes away.

@YikunHan42

I am not sure if the continuous loop is due to the local environment. When I switch to Google Colab, instead of running the Python file in Windows PowerShell, it works. Generation via the API is still slow, though.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label May 20, 2024
@eci-aashish

Any update on this issue? I am also facing it with the latest ragas library; it always gets stuck at 90% of test generation.

@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label May 23, 2024
@anand-kamble

I am trying to generate a testset by following the guide from Ragas, and it's stuck at 14% while embedding nodes.

Any update on this issue would be helpful.

@ghousethanedar

ghousethanedar commented Jul 1, 2024

Facing the same issue; any update on this?

When I run this code in a Jupyter notebook I am able to create the dataset, but when I run it through the CLI it goes into a continuous loop.

@jjmachan
Member

jjmachan commented Aug 8, 2024

Keeping this open for now, but we have made a lot of improvements to this in the latest releases. Could you folks check that out?

We are also doing a refactor in #1016 and will keep these points in mind.

Thanks for reporting the issues 🙂 and apologies for the hard time, but we'll keep ironing these out 💪🏽

@AmineDjeghri

> Facing the same issue; any update on this?
>
> When I run this code in a Jupyter notebook I am able to create the dataset, but when I run it through the CLI it goes into a continuous loop.

I have the same problem: it works in a notebook but not through the CLI.

@justinzweep

justinzweep commented Aug 24, 2024

I still experience this issue on ragas 0.1.14 and also 0.1.15 when using alternative LLMs or embeddings. For example, AzureChatOpenAI and CohereEmbeddings do not work.
