Faithfulness prompt update to avoid having single quotes in response #1874

@michaelromagne
Contributor

michaelromagne commented Jan 23, 2025

The error described in this comment is not resolved when computing Faithfulness.

After digging into the judge LLM's responses, I found that a JSON parse error occurs when parsing the judge's output. For instance, it happens for the answer below:

{
  "statements": [
    {
      "statement": "The Norwegian Dawn cruise ship was denied access to Mauritius.",
      "reason": "The context explicitly states that local authorities denied permission for the Norwegian Dawn ship to access the Mauritius capital of Port Louis.",
      "verdict": 1
    },
    {
      "statement": "The denial of access was due to potential health risks.",
      "reason": "The context directly mentions that the ship was denied access \"citing \\\"potential health risks.\\\"\"",
      "verdict": 1
    },
    {
      "statement": "The specific health risk was a potential cholera outbreak on the Norwegian Dawn cruise ship.",
      "reason": "While the context mentions fears of a potential cholera outbreak in the title, it does not explicitly state that cholera was the specific health risk on the Norwegian Dawn. The context only mentions \'stomach-related illness\' without specifying cholera.",
      "verdict": 0
    }
  ]
}

The error is that the generated output contains single quotes escaped as \', as you can see in the last reason. That escape sequence is not allowed in JSON, and it happens frequently with the current Faithfulness prompt, since the judge often cites elements from the retrieved context to explain its verdict.
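The failure can be reproduced with Python's standard json module. This is a minimal sketch, assuming a payload shortened from the answer above; parses is a hypothetical helper, and the parser Ragas actually goes through may report the error differently.

```python
import json

# Shortened judge output containing the escape sequence \' (backslash + quote),
# which is not one of the escapes JSON defines.
bad = r"""{"reason": "The context only mentions \'stomach-related illness\'.", "verdict": 0}"""

# Same content with plain single quotes, which are legal inside a JSON string.
good = r"""{"reason": "The context only mentions 'stomach-related illness'.", "verdict": 0}"""


def parses(text: str) -> bool:
    """Return True if text is valid JSON (hypothetical helper for this sketch)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(bad))   # False
print(parses(good))  # True
```

The plain single quote parses fine; only the backslash-escaped form is rejected.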

I tried to change the PydanticOutputParser logic in LangChain Core, which Ragas uses to parse JSON. I tried replacing single quotes with double quotes using a simple string replace, but it did not work.
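For illustration only, here is a sketch of a post-processing alternative. It is not what this PR does, and it does not claim to explain why the attempt above failed. It assumes the only problem is the \' escape; unescape_single_quotes is a hypothetical helper, not part of Ragas or LangChain. It also shows one way a blanket quote replacement can go wrong when the text contains an ordinary apostrophe.

```python
import json
import re


def unescape_single_quotes(text: str) -> str:
    # Replace backslash + single quote with a plain single quote, unless the
    # backslash is itself preceded by a backslash (hypothetical helper).
    return re.sub(r"(?<!\\)\\'", "'", text)


raw = r"""{"reason": "The context only mentions \'stomach-related illness\'.", "verdict": 0}"""
fixed = json.loads(unescape_single_quotes(raw))
print(fixed["reason"])

# A blanket replacement of every ' with " is fragile: an ordinary apostrophe
# becomes a double quote and terminates the JSON string early.
plain = """{"reason": "It's not stated.", "verdict": 0}"""
blanket = plain.replace("'", '"')
try:
    json.loads(blanket)
    blanket_ok = True
except json.JSONDecodeError:
    blanket_ok = False
print(blanket_ok)
```

A regex like this still misses edge cases (for example, longer runs of backslashes), which is one argument for fixing the prompt instead of patching the output.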

Thus, an immediate solution that worked for me was to explicitly ask the judge LLM to output only double quotes, not single quotes, and the error disappeared.

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Jan 23, 2025
@jjmachan
Member

hey @michaelromagne thank you so much for addressing this, but I was wondering if we could add this to the main pydantic object here instead

def _generate_output_signature(self, indent: int = 4) -> str:
    return (
        f"Please return the output in a JSON format that complies with the "
        f"following schema as specified in JSON Schema:\n"
        f"{self.output_model.model_json_schema()}"
    )

that way other metrics will also get the benefit - what do you think?
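A rough sketch of what moving the instruction into the shared output signature could look like. The function below is a standalone stand-in for the method quoted above, and the extra sentence is hypothetical wording, not the text that was merged. It takes the schema as a plain dict and serializes it with json.dumps so that the example needs only the standard library.

```python
import json


def generate_output_signature(schema: dict) -> str:
    """Build the output-format instruction shared by all prompts (sketch)."""
    return (
        "Please return the output in a JSON format that complies with the "
        "following schema as specified in JSON Schema:\n"
        f"{json.dumps(schema)}\n"
        # Hypothetical wording for the extra instruction discussed in this PR.
        "Use only double quotes in the response, never single quotes."
    )


# Hypothetical, heavily simplified schema for a faithfulness verdict.
example_schema = {
    "type": "object",
    "properties": {
        "statement": {"type": "string"},
        "reason": {"type": "string"},
        "verdict": {"type": "integer"},
    },
    "required": ["statement", "reason", "verdict"],
}

signature = generate_output_signature(example_schema)
print(signature)
```

Because every metric prompt would build its instruction through the same function, the quoting rule would apply to all of them rather than to Faithfulness alone.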

@jjmachan jjmachan mentioned this pull request Jan 23, 2025
@michaelromagne
Contributor Author

Yes indeed, that's perfect. I'll change it tomorrow, thanks 😄

@jjmachan
Member

that would be awesome - hopefully we can get this out in the next release (tues morning PT)

@michaelromagne michaelromagne force-pushed the feature/DoubleQuoteFaithfulnessPrompt branch from ef8a536 to 7923102 Compare January 27, 2025 08:09
@michaelromagne
Contributor Author

michaelromagne commented Jan 27, 2025

✅ The change works like a charm, thanks for the recommendation

@michaelromagne
Contributor Author

@jjmachan does this look good to you?

Member

@jjmachan jjmachan left a comment

sweet, this works great! thank you so much 🙂

@jjmachan jjmachan merged commit b0478c9 into explodinggradients:main Jan 27, 2025
15 of 16 checks passed