[TEST Framework] Add query http requests & openai sdk tests (#83)
* first commit query_http
* fix format
* fix proxy
* add matrix
* add openai test case
* change to github ci
* add (×15)
* github ci
* only gpt2
* fix openai (×10)
* f##
* change to github ci (×5)
* change to dare&docker (×5)
* change to docker (×3)
* ls (×2)
* test ls (×4)
* Organize code
* update openai
* fix openai
* fix ci (×2)
* fix openai (×2)
* add
* fix key
* remove checkoutpath (×3)
* fix checkout
* fix bash -c
* update req
* reduce req
* reduce code
* fix review229 (×2)
* change os and req (×2)
* fix path
* docker python version (×4)
* change name
* after pr106 fix (×5)
* fix review
* fix lint (×2)
1 parent c5c076a · commit bad9cb9
Showing 9 changed files with 321 additions and 20 deletions.
@@ -0,0 +1,21 @@
port: 8000
name: gpt2
route_prefix: /gpt2
cpus_per_worker: 2
gpus_per_worker: 0
deepspeed: false
workers_per_group: 2
device: CPU
ipex:
  enabled: true
  precision: bf16
model_description:
  model_id_or_path: gpt2
  tokenizer_name_or_path: gpt2
  chat_processor: ChatModelGptJ
  gpt_base_model: true
  prompt:
    intro: ''
    human_id: ''
    bot_id: ''
    stop_words: []
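For context, the tests later in this commit launch the serving endpoint from this config via the llm_on_ray-serve CLI. A minimal sketch of that invocation from Python, assuming the file lives at .github/workflows/config/gpt2-ci.yaml, as the tests' relative lookup suggests:

import subprocess

# Start serving from the CI config, mirroring cmd_serve in the tests below.
# The path is inferred from the tests' "../../.github/workflows/config/<model>-ci.yaml" lookup.
result = subprocess.run(
    ["llm_on_ray-serve", "--config_file", ".github/workflows/config/gpt2-ci.yaml"],
    capture_output=True,
    text=True,
)
assert "Error" not in result.stderr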
@@ -0,0 +1,43 @@
# syntax=docker/dockerfile:1
FROM ubuntu:22.04

ARG python_v

ENV LANG C.UTF-8

WORKDIR /root/llm-on-ray

RUN --mount=type=cache,target=/var/cache/apt apt-get update -y \
    && apt-get install -y build-essential cmake wget curl git vim htop ssh net-tools \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

ENV CONDA_DIR /opt/conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
    /bin/bash ~/miniconda.sh -b -p /opt/conda
ENV PATH $CONDA_DIR/bin:$PATH

# setup env
SHELL ["/bin/bash", "--login", "-c"]

RUN --mount=type=cache,target=/opt/conda/pkgs conda init bash && \
    unset -f conda && \
    export PATH=$CONDA_DIR/bin/:${PATH} && \
    conda config --add channels intel && \
    conda install python==${python_v}

COPY ./pyproject.toml .
COPY ./MANIFEST.in .

# create an empty llm_on_ray package directory so the 'pip install -e' below succeeds without the full sources
RUN mkdir ./llm_on_ray

RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[cpu,deepspeed] --extra-index-url https://download.pytorch.org/whl/cpu \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

RUN ds_report

# Used to invalidate docker build cache with --build-arg CACHEBUST=$(date +%s)
ARG CACHEBUST=1
COPY ./dev/scripts/install-oneapi.sh /tmp
RUN /tmp/install-oneapi.sh
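The CACHEBUST comment above describes how to force a rebuild of the final oneAPI install step. A sketch of a matching build invocation, driven from Python for consistency with the tests; the image tag and the python_v value are illustrative assumptions:

import subprocess
import time

# Build the CI image. python_v pins the conda-installed interpreter; a fresh
# CACHEBUST value invalidates the cached layers after it, per the Dockerfile
# comment. The tag "llm-on-ray:ci" is hypothetical.
subprocess.run(
    [
        "docker", "build",
        "--build-arg", "python_v=3.9",  # assumed version
        "--build-arg", f"CACHEBUST={int(time.time())}",
        "-t", "llm-on-ray:ci",
        ".",
    ],
    check=True,
)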
@@ -0,0 +1,73 @@
import subprocess
import pytest
import os


def script_with_args(model_name, streaming_response, max_new_tokens, temperature, top_p):
    current_path = os.path.dirname(os.path.abspath(__file__))

    config_path = os.path.join(
        current_path, "../../.github/workflows/config/" + model_name + "-ci.yaml"
    )

    os.path.join(current_path, "../../inference/serve.py")  # NOTE: no-op; the result is unused

    cmd_serve = ["llm_on_ray-serve", "--config_file", config_path]

    result_serve = subprocess.run(cmd_serve, capture_output=True, text=True)

    # Print the subprocess result so its output can be inspected in the test logs
    print(result_serve)

    # Ensure there are no errors in the serve script execution
    assert "Error" not in result_serve.stderr

    example_http_path = os.path.join(
        current_path, "../../examples/inference/api_server_openai/query_http_requests.py"
    )

    cmd_http = [
        "python",
        example_http_path,
        "--model_name",
        model_name,
    ]

    if streaming_response:
        cmd_http.append("--streaming_response")

    if max_new_tokens is not None:
        cmd_http.extend(["--max_new_tokens", str(max_new_tokens)])

    if temperature is not None:
        cmd_http.extend(["--temperature", str(temperature)])

    if top_p is not None:
        cmd_http.extend(["--top_p", str(top_p)])

    result_http = subprocess.run(cmd_http, capture_output=True, text=True)

    # Print the subprocess result so its output can be inspected in the test logs
    print(result_http)

    # Ensure there are no errors in the http query script execution
    assert "Error" not in result_http.stderr

    assert isinstance(result_http.stdout, str)

    assert len(result_http.stdout) > 0


@pytest.mark.parametrize(
    "model_name,streaming_response,max_new_tokens,temperature,top_p",
    [
        (model_name, streaming_response, max_new_tokens, temperature, top_p)
        for model_name in ["gpt2"]
        for streaming_response in [False, True]
        for max_new_tokens in [None, 128]
        for temperature in [None, 0.8]
        for top_p in [None, 0.7]
    ],
)
def test_script(model_name, streaming_response, max_new_tokens, temperature, top_p):
    script_with_args(model_name, streaming_response, max_new_tokens, temperature, top_p)
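The parametrize block above expands to 16 cases: every combination of streaming_response, max_new_tokens, temperature, and top_p for the single gpt2 model. Any one case can be reproduced by hand against a running endpoint; a sketch, assuming the repository root as the working directory:

import subprocess

# Reproduce one parametrized case by invoking the example script directly.
# The flags mirror those the test assembles in cmd_http.
cmd = [
    "python",
    "examples/inference/api_server_openai/query_http_requests.py",
    "--model_name", "gpt2",
    "--streaming_response",
    "--max_new_tokens", "128",
    "--temperature", "0.8",
    "--top_p", "0.7",
]
result = subprocess.run(cmd, capture_output=True, text=True)
assert "Error" not in result.stderr
print(result.stdout)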
@@ -0,0 +1,85 @@
import subprocess
import pytest
import os

os.environ["no_proxy"] = "localhost,127.0.0.1"
os.environ["OPENAI_API_BASE"] = "http://localhost:8000/v1"
os.environ["OPENAI_API_KEY"] = "YOUR_OPEN_AI_KEY"
os.environ["OPENAI_BASE_URL"] = "http://localhost:8000/v1"


def script_with_args(api_base, model_name, streaming_response, max_new_tokens, temperature, top_p):
    # Point the OpenAI SDK at a non-default endpoint when one is given
    if api_base != "http://localhost:8000/v1":
        os.environ["OPENAI_API_BASE"] = api_base
        os.environ["OPENAI_BASE_URL"] = api_base

    current_path = os.path.dirname(os.path.abspath(__file__))

    config_path = os.path.join(
        current_path, "../../.github/workflows/config/" + model_name + "-ci.yaml"
    )

    os.path.join(current_path, "../../inference/serve.py")  # NOTE: no-op; the result is unused

    cmd_serve = ["llm_on_ray-serve", "--config_file", config_path]

    result_serve = subprocess.run(cmd_serve, capture_output=True, text=True)

    # Print the subprocess result so its output can be inspected in the test logs
    print(result_serve)

    # Ensure there are no errors in the serve script execution
    assert "Error" not in result_serve.stderr

    example_openai_path = os.path.join(
        current_path, "../../examples/inference/api_server_openai/query_openai_sdk.py"
    )

    cmd_openai = [
        "python",
        example_openai_path,
        "--model_name",
        model_name,
    ]

    if streaming_response:
        cmd_openai.append("--streaming_response")

    if max_new_tokens is not None:
        cmd_openai.extend(["--max_new_tokens", str(max_new_tokens)])

    if temperature is not None:
        cmd_openai.extend(["--temperature", str(temperature)])

    if top_p is not None:
        cmd_openai.extend(["--top_p", str(top_p)])

    result_openai = subprocess.run(cmd_openai, capture_output=True, text=True)

    # Print the subprocess result so its output can be inspected in the test logs
    print(result_openai)

    # Ensure there are no errors in the OpenAI API query script execution
    assert "Error" not in result_openai.stderr

    assert isinstance(result_openai.stdout, str)

    assert len(result_openai.stdout) > 0


# Parametrize the test function with different combinations of parameters
@pytest.mark.parametrize(
    "api_base,model_name,streaming_response,max_new_tokens,temperature,top_p",
    [
        (api_base, model_name, streaming_response, max_new_tokens, temperature, top_p)
        for api_base in ["http://localhost:8000/v1"]
        for model_name in ["gpt2"]
        for streaming_response in [False, True]
        for max_new_tokens in [None, 128]
        for temperature in [None, 0.8]
        for top_p in [None, 0.7]
    ],
)
def test_script(api_base, model_name, streaming_response, max_new_tokens, temperature, top_p):
    script_with_args(api_base, model_name, streaming_response, max_new_tokens, temperature, top_p)
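For reference, the request that query_openai_sdk.py presumably issues against the served endpoint looks roughly like the following with the openai Python package (1.x client API). The prompt text is illustrative, and mapping the script's --max_new_tokens onto the SDK's max_tokens parameter is an assumption:

import os
from openai import OpenAI

# Talk to the locally served model through the OpenAI-compatible API.
# Endpoint and key come from the same environment variables the test sets.
client = OpenAI(
    base_url=os.environ.get("OPENAI_BASE_URL", "http://localhost:8000/v1"),
    api_key=os.environ.get("OPENAI_API_KEY", "YOUR_OPEN_AI_KEY"),
)

response = client.chat.completions.create(
    model="gpt2",
    messages=[{"role": "user", "content": "Tell me a short story."}],  # illustrative prompt
    max_tokens=128,  # assumed equivalent of --max_new_tokens
    temperature=0.8,
    top_p=0.7,
)
print(response.choices[0].message.content)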
@@ -1,7 +1,3 @@
-pytest==7.4.4
-torch==2.1.0
-transformers==4.36.0
-starlette==0.36.2
-pydantic==1.10.13
-pydantic-yaml==1.2.0
-pydantic_core==2.14.5
+pytest
+openai
+async-timeout
@@ -1,7 +1,9 @@
 #!/bin/bash
 set -eo pipefail
 cd $(dirname $0)
 
 
 # Run pytest with the test file
-pytest -vs ./inference
+pytest -vv --capture=tee-sys --show-capture=all ./inference
+
+echo "Pytest finished running tests."