Working toward a functional state
Signed-off-by: Dan McPherson <[email protected]>
danmcp committed Jun 19, 2024
1 parent ad020ef commit 897c93f
Showing 16 changed files with 333 additions and 1,173 deletions.
10 changes: 9 additions & 1 deletion .spellcheck-en-custom.txt
@@ -2,5 +2,13 @@
# make spellcheck-sort
# Please keep this file sorted:
# SPDX-License-Identifier: Apache-2.0
eval
Tatsu
TODO
eval
gpt
instructlab
jsonl
justfile
openai
vllm

38 changes: 37 additions & 1 deletion README.md
@@ -5,4 +5,40 @@
![Release](https://img.shields.io/github/v/release/instructlab/eval)
![License](https://img.shields.io/github/license/instructlab/eval)

Python library for Evaluation
Python Library for Evaluation


Check failure on line 10 in README.md (GitHub Actions / markdown-lint): README.md:10 MD012/no-multiple-blanks Multiple consecutive blank lines [Expected: 1; Actual: 2] https://github.com/DavidAnson/markdownlint/blob/v0.34.0/doc/md012.md

## MT-Bench Testing Steps

TODO: Determine the right vllm version to pin. The latest release fails with `openai.types` not found.

```shell
pip install vllm==0.3.3
```

Run the server with `--tensor-parallel-size <NUM GPUS>`, and raise `--max-model-len` if you need a longer context length.

```shell
python -m vllm.entrypoints.openai.api_server --model instructlab/granite-7b-lab
```
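
The server flags mentioned in this README can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named `build_vllm_command` that is not part of this repository. It only builds the argument list, using the flag names that appear in this README, and does not start the server itself.

```python
import sys


def build_vllm_command(model, num_gpus=1, max_model_len=None, served_model_name=None):
    """Build the argv list for vllm's OpenAI-compatible API server.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    cmd = [
        sys.executable, "-m", "vllm.entrypoints.openai.api_server",
        "--model", model,
        "--tensor-parallel-size", str(num_gpus),
    ]
    # Only pass the optional flags when the caller asked for them.
    if max_model_len is not None:
        cmd += ["--max-model-len", str(max_model_len)]
    if served_model_name is not None:
        cmd += ["--served-model-name", served_model_name]
    return cmd


# Example values (2 GPUs, 4096-token context) are assumptions, not recommendations.
cmd = build_vllm_command("instructlab/granite-7b-lab", num_gpus=2, max_model_len=4096)
```

The resulting list can be handed to `subprocess.Popen(cmd)` to launch the server in the background.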

```shell
OPENAI_API_KEY="NO_API_KEY" python3 test_gen_answers.py
```

Results are written to `data/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl`.
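
To inspect the generated answers, the JSONL file can be read one record per line. This is a minimal sketch: `load_jsonl` is a hypothetical helper, and it makes no assumption about which fields each record contains.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Return one parsed JSON object per non-empty line of a JSONL file."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records
```

For example, `len(load_jsonl("data/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl"))` gives the number of answer records written by the run.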

To run the judge model with vllm, pass `--served-model-name gpt-4` so that the server answers requests addressed to the model name `gpt-4`.

Here too, run the server with `--tensor-parallel-size <NUM GPUS>`, and raise `--max-model-len` if you need a longer context length.

```shell
python -m vllm.entrypoints.openai.api_server --model instructlab/granite-7b-lab --served-model-name gpt-4
```
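
Before running the judge step, it can help to confirm that the server really advertises the name `gpt-4`. The sketch below assumes the server listens on `http://localhost:8000` and exposes the OpenAI-style `/v1/models` endpoint. Both function names are hypothetical, and only the standard library is used.

```python
import json
import urllib.request


def served_model_ids(payload):
    """Extract model ids from an OpenAI-style `/v1/models` response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_served_model_ids(base_url="http://localhost:8000"):
    """Query the running server for its model list.

    The default base URL is an assumption; adjust it to match your server.
    """
    with urllib.request.urlopen(f"{base_url}/v1/models") as response:
        return served_model_ids(json.load(response))
```

With the judge server running, `"gpt-4" in fetch_served_model_ids()` should evaluate to `True`.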

```shell
OPENAI_API_KEY="NO_API_KEY" python3 test_judge_answers.py
```

Results are written to `data/mt_bench/model_judgment/gpt-4_single.jsonl`.
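
The judgments can be summarized as a mean score per turn. This is a sketch under stated assumptions: each record is assumed to carry a numeric `score` and a `turn` field, and a score of `-1` is assumed to mark a judgment that could not be parsed, following FastChat's single-answer judgment format. Check these field names against your own output before relying on the numbers. `mean_score_by_turn` is a hypothetical helper.

```python
import json
from collections import defaultdict


def mean_score_by_turn(lines):
    """Average the `score` field per `turn`, skipping records scored -1."""
    totals = defaultdict(lambda: [0.0, 0])  # turn -> [sum of scores, count]
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        score = record["score"]
        if score == -1:  # assumed marker for an unparsable judgment
            continue
        bucket = totals[record["turn"]]
        bucket[0] += score
        bucket[1] += 1
    return {turn: total / count for turn, (total, count) in totals.items()}
```

Because the function accepts any iterable of lines, an open file handle for `data/mt_bench/model_judgment/gpt-4_single.jsonl` can be passed to it directly.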

80 changes: 0 additions & 80 deletions data/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl

This file was deleted.

160 changes: 0 additions & 160 deletions data/mt_bench/model_judgment/gpt-4_single.jsonl

This file was deleted.

2 changes: 1 addition & 1 deletion requirements.txt
@@ -2,9 +2,9 @@
FastChat
shortuuid
openai<1.0.0
anthropic
psutil
torch
transformers
accelerate
pandas
pandas-stubs