Fix bug in calculating score when output from LLM is broken #317

Yongtae723
Contributor

If the generated answer is nonsensical, the faithfulness evaluation breaks.

For example, if you change the fake result in cell 10 of this notebook to "I love christmass", which is a completely nonsensical answer, the faithfulness score will be 1. This is not good. You can try it out.

I debugged this and found that when the answer is nonsensical, the 1st and 2nd outputs from the LLM are:
There is no relevant statement that can be created from the given answer.
However, the ragas implementation assumes the output contains "verdict: yes" or "verdict: no".
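
To illustrate the failure mode, here is a minimal sketch. It is not the actual ragas code: `naive_faithfulness` and `guarded_faithfulness` are hypothetical helpers, and returning NaN for unparseable output is an assumption about one reasonable fix, not necessarily what this PR implements.

```python
import math


def naive_faithfulness(llm_output: str, n_statements: int) -> float:
    """Hypothetical scorer: 1 minus the fraction of 'verdict: no' lines.

    If the output contains no verdicts at all, nothing is counted as
    unsupported, so the score silently becomes 1.0.
    """
    text = llm_output.lower()
    unsupported = text.count("verdict: no")
    return 1 - unsupported / n_statements


def guarded_faithfulness(llm_output: str) -> float:
    """Hypothetical guarded scorer: refuse to score when no verdict is found."""
    text = llm_output.lower()
    supported = text.count("verdict: yes")
    unsupported = text.count("verdict: no")
    total = supported + unsupported
    if total == 0:
        # Assumption: NaN signals "could not evaluate" instead of a perfect score.
        return math.nan
    return supported / total


broken = "There is no relevant statement that can be created from the given answer."
print(naive_faithfulness(broken, n_statements=1))  # perfect score despite broken output
print(guarded_faithfulness(broken))                # not a number, flagged as unscorable
```

The guarded version only reports a score when at least one verdict was actually parsed, so a broken LLM response can no longer be mistaken for a fully faithful answer.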

There is more than one way to solve this issue. For example, slightly modifying the prompt template to handle nonsensical answers is one option, but it could affect the results.

So I opened this PR with my suggestion.

This PR may conflict with my PR #307. If so, I will fix the conflict!

Thanks

@shahules786
Member

Thanks @Yongtae723, this makes sense. I'll check this while I modify the faithfulness prompts + demonstrations as mentioned here.

@Yongtae723
Contributor Author

Alright, that makes sense. Forcing JSON output might solve this issue. I am looking forward to hearing your experiment results. And if possible, I would like to check that it works as we expect.

So if possible, let me know about your PR!
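
As a rough sketch of why forced JSON output could help: structured verdicts can be validated before scoring, so a free-text reply is detected rather than silently scored. The JSON shape (a list of objects with `statement` and `verdict` keys) and the helper `score_from_json` are assumptions for illustration, not the format ragas actually uses.

```python
import json
import math


def score_from_json(llm_output: str) -> float:
    """Hypothetical scorer for verdicts returned as a JSON list."""
    try:
        items = json.loads(llm_output)
    except json.JSONDecodeError:
        # The model ignored the JSON instruction (e.g. replied in free text).
        return math.nan
    if not isinstance(items, list):
        return math.nan

    # Keep only well-formed verdicts; anything else is ignored.
    verdicts = [
        str(item.get("verdict", "")).strip().lower()
        for item in items
        if isinstance(item, dict)
    ]
    valid = [v for v in verdicts if v in ("yes", "no")]
    if not valid:
        return math.nan
    return valid.count("yes") / len(valid)


good = '[{"statement": "a", "verdict": "yes"}, {"statement": "b", "verdict": "no"}]'
bad = "There is no relevant statement that can be created from the given answer."
print(score_from_json(good))
print(score_from_json(bad))
```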

jjmachan pushed a commit that referenced this pull request Nov 24, 2023
@Yongtae723
Contributor Author

Thanks! And sorry for my late response.

I checked the prompt and tested it.
I found a typo in the prompt, so I created a new PR (I guess) #338

@Yongtae723 Yongtae723 closed this Nov 27, 2023
2 participants