
Add curator to handle inference for the model being evaluated #51

Open · wants to merge 21 commits into base: main
Conversation

RyanMarten (Member)

No description provided.

@RyanMarten (Member, Author)

OK, we are going to fix two things in curator to simplify this (we can do another PR later once these fixes are released):

  1. Allow passing list[messages] directly to llm() instead of requiring the caller to build a dataset first.
  2. Fix the rate-limit issue with Anthropic models so we don't need manual if statements setting rate limits.
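To make fix (1) concrete, here is a minimal sketch of the input shape involved. The list-of-message-lists format and the "messages" column name are illustrative assumptions, not curator's confirmed API:

```python
# Shape of the input that fix (1) would let llm() accept directly
# (hypothetical): a list of chat conversations, each a list of
# OpenAI-style message dicts, with no dataset wrapper needed.
messages = [
    [{"role": "user", "content": "Translate 'bonjour' to English."}],
    [
        {"role": "system", "content": "Be concise."},
        {"role": "user", "content": "Name three prime numbers."},
    ],
]

# Today's workaround: wrap each conversation in a row dict so it can
# be turned into a dataset (the "messages" column name is illustrative).
rows = [{"messages": conv} for conv in messages]
```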

@RyanMarten (Member, Author)

Testing with

  python -m eval.eval \
        --model curator  \
        --tasks alpaca_eval \
        --model_name "gemini/gemini-1.5-flash" \
        --annotator_model "gpt-4o-mini-2024-07-18" \
        --apply_chat_template False \
        --model_args 'tokenized_requests=False' \
        --output_path logs

@RyanMarten (Member, Author) commented Jan 21, 2025

Testing with

 python -m eval.eval \
        --model curator  \
        --tasks alpaca_eval \
        --model_name "claude-3-5-haiku-20241022" \
        --annotator_model "gpt-4o-mini-2024-07-18" \
        --apply_chat_template False \
        --model_args 'tokenized_requests=False' \
        --debug \
        --output_path logs

Working!

@RyanMarten (Member, Author)

It would be better to pass backend_params (e.g. {"max_requests_per_minute": 2_000, "max_tokens_per_minute": 4_000_000}) via --model_args, but I couldn't figure out how to access these args in the class.

@jmercat have you done this before? I tried adding a

    @classmethod
    def create_from_arg_string(

https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/ibm_watsonx_ai.py#L72-L73

But I ran into bugs, so I just hardcoded the rate limits for Gemini in the class itself.
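For reference, a self-contained sketch of the create_from_arg_string pattern that lm-evaluation-harness models use to consume --model_args. The CuratorLM class name, its constructor fields, and the exact parsing rules here are illustrative assumptions, not the harness's actual implementation:

```python
# Sketch of the lm-evaluation-harness create_from_arg_string pattern:
# --model_args arrives as a "k1=v1,k2=v2" string, gets parsed into
# kwargs, and the class is instantiated from them. All names below
# (CuratorLM, field names) are hypothetical.
class CuratorLM:
    def __init__(self, tokenized_requests=True,
                 max_requests_per_minute=None,
                 max_tokens_per_minute=None, **kwargs):
        self.tokenized_requests = tokenized_requests
        # Rate limits collected here could be forwarded as
        # curator backend_params instead of being hardcoded.
        self.backend_params = {
            "max_requests_per_minute": max_requests_per_minute,
            "max_tokens_per_minute": max_tokens_per_minute,
        }

    @classmethod
    def create_from_arg_string(cls, arg_string, additional_config=None):
        kwargs = {}
        for pair in filter(None, arg_string.split(",")):
            key, _, raw = pair.partition("=")
            # Coerce booleans and ints; leave everything else a string.
            if raw in ("True", "False"):
                val = raw == "True"
            else:
                try:
                    val = int(raw)
                except ValueError:
                    val = raw
            kwargs[key.strip()] = val
        kwargs.update(additional_config or {})
        return cls(**kwargs)
```

With this in place, `--model_args 'tokenized_requests=False,max_requests_per_minute=2000'` would reach the constructor as real kwargs rather than needing per-model hardcoding.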
