feat: custom evaluator and metric name to support llm evaluation #433 #459

Open
wants to merge 1 commit into
base: main
@fecet fecet commented May 15, 2023

I have started a project that aims to build a transparent, democratic, and reproducible framework for LLM evaluation: https://huggingface.co/spaces/SUSTech/llm-evaluate/. The goal is to let anyone use the datasets and metrics hosted on Hugging Face to evaluate their own LLMs and share their results, datasets, metrics, and more.

Currently, evaluate does not support custom evaluators. To work around this limitation, I integrated a custom subtask and evaluator into the existing code of the evaluation module: https://huggingface.co/spaces/SUSTech/llm-evaluate/blob/main/utils.py.

However, I ran into a problem when trying to use the metric config name to define the task. As the evaluator's source code shows,

def prepare_metric(self, metric: Union[str, EvaluationModule]):

evaluator.compute takes the metric only as a name or an EvaluationModule, with no argument for the metric config name. Consequently, if I want to pass the metric config name, I seem to have no option but to override the entire evaluator.compute function.

Overriding it would add unnecessary complexity and make my codebase harder to maintain as the evaluation module grows, so I created this PR to extend evaluate and remove this inconvenience.
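To make the intended change concrete, here is a minimal, self-contained sketch of the idea. It does not import evaluate: `EvaluationModule`, `load`, and `Evaluator` below are simplified stand-ins written for illustration, and the `config_name` / `metric_config_name` parameter names are assumptions rather than the names used in this PR's diff.

```python
from typing import Optional, Union


class EvaluationModule:
    """Simplified stand-in for a loaded metric (not the real evaluate class)."""

    def __init__(self, name: str, config_name: str = "default"):
        self.name = name
        self.config_name = config_name


def load(path: str, config_name: Optional[str] = None) -> EvaluationModule:
    """Stand-in loader; a real loader would fetch and build the metric."""
    return EvaluationModule(path, config_name or "default")


class Evaluator:
    """Stand-in evaluator showing how a config name could be threaded through."""

    def prepare_metric(
        self,
        metric: Union[str, EvaluationModule],
        config_name: Optional[str] = None,  # hypothetical extra argument
    ) -> EvaluationModule:
        # Only a metric given by name needs loading; forward the config name.
        if isinstance(metric, str):
            metric = load(metric, config_name=config_name)
        return metric

    def compute(
        self,
        metric: Union[str, EvaluationModule],
        metric_config_name: Optional[str] = None,  # hypothetical extra argument
    ) -> dict:
        # The real compute also takes a model and data; omitted here.
        module = self.prepare_metric(metric, config_name=metric_config_name)
        return {"metric": module.name, "config": module.config_name}


if __name__ == "__main__":
    evaluator = Evaluator()
    print(evaluator.compute("glue", metric_config_name="mrpc"))
```

With an optional argument like this, callers that pass only a metric name keep their current behavior, while callers that need a specific config no longer have to override compute.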

It would be a great pleasure for me to provide any assistance that could contribute to the enhancement of this repository and ensure cleaner code implementation.

I appreciate your time and consideration of my request. Please let me know if there is any additional information or clarification I can provide. I am eagerly looking forward to your valuable feedback and guidance.
