Skip to content

Procrastinating loaders - load the data only when actually needs to stream through Unitxt #1556

Procrastinating loaders - load the data only when actually needs to stream through Unitxt

Procrastinating loaders - load the data only when actually needs to stream through Unitxt #1556

Workflow file for this run

name: Test Performance
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
concurrency:
group: ${{ github.workflow }}-${{ github.event_name == 'pull_request' && github.event.pull_request.number || github.ref_name }}
cancel-in-progress: true
jobs:
performance:
runs-on: ubuntu-latest
env:
OS: ubuntu-latest
UNITXT_DEFAULT_VERBOSITY: error
UNITXT_MOCK_INFERENCE_MODE: "True"
DATASETS_VERBOSITY: error
HF_HUB_VERBOSITY: error
HF_DATASETS_DISABLE_PROGRESS_BARS: "True"
TQDM_DISABLE: "True"
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.10'
- name: Install Requirements
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install --system ".[tests,watsonx,inference-tests]"
uv pip install --system litellm
uv pip install --system diskcache
huggingface-cli login --token ${{ secrets.UNITXT_READ_HUGGINGFACE_HUB_FOR_TESTS }}
- name: Prepare the dirs for performance evaluation in main
run: |
mkdir -p performance_action
cp performance/bluebench_profiler.py performance_action/bluebench_profiler.py
cp performance/compare_benchmark_performance_results.py performance_action/compare_benchmark_performance_results.py
- name: Run performance on PR just to warm the cache, output will be overwritten
run : |
python performance_action/bluebench_profiler.py --output_file performance_action/pr_results.json
- name: Checkout main branch
uses: actions/checkout@v4
with:
ref: main
clean: false
- name: Run performance on main branch
run: |
python performance_action/bluebench_profiler.py --output_file performance_action/main_results.json
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ github.head_ref }}
clean: false
- name: Run performance on PR branch
run: |
python performance_action/bluebench_profiler.py --output_file performance_action/pr_results.json
- name: Compare main and PR performance results
run: |
python performance_action/compare_benchmark_performance_results.py performance_action/main_results.json performance_action/pr_results.json >> $GITHUB_STEP_SUMMARY