Commit

update docs. add version specifiers. use uv to publish.
rectalogic committed Oct 3, 2024
1 parent ffacac2 commit 5e29fa9
Showing 5 changed files with 78 additions and 86 deletions.
37 changes: 11 additions & 26 deletions .github/workflows/publish.yml
@@ -8,44 +8,29 @@ permissions:
contents: read

jobs:
test:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: pip
cache-dependency-path: pyproject.toml
- name: Install dependencies
run: |
pip install -e '.[test]'
- name: Run tests
run: |
python -m pytest
call-test:
uses: ./.github/workflows/test.yml
deploy:
runs-on: ubuntu-latest
needs: [test]
needs: [call-test]
environment: release
permissions:
id-token: write
steps:
- uses: actions/checkout@v4
- name: Set up Python
- name: Install uv
uses: astral-sh/setup-uv@v2
with:
enable-cache: true
cache-dependency-glob: "uv.lock"
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: "3.12"
cache: pip
cache-dependency-path: pyproject.toml
- name: Install dependencies
run: |
pip install setuptools wheel build
run: uv sync --all-extras --dev --python 3.12 --python-preference only-system
- name: Build
run: |
python -m build
uv build
- name: Publish
uses: pypa/gh-action-pypi-publish@release/v1
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -1,6 +1,6 @@
name: Test

on: [push, pull_request]
on: [push, pull_request, workflow_call]

permissions:
contents: read
71 changes: 39 additions & 32 deletions README.md
@@ -5,20 +5,33 @@
[![Tests](https://github.com/rectalogic/llm-transformers/actions/workflows/test.yml/badge.svg)](https://github.com/rectalogic/llm-transformers/actions/workflows/test.yml)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/rectalogic/llm-transformers/blob/main/LICENSE)

Plugin for llm adding support for [🤗 Hugging Face Transformers](https://huggingface.co/docs/transformers/index)
Plugin for [llm](https://llm.datasette.io/) adding support for 🤗 [Hugging Face Transformers](https://huggingface.co/docs/transformers/index) [pipeline](https://huggingface.co/docs/transformers/pipeline_tutorial) tasks.

## Installation

Install this plugin in the same environment as [LLM](https://llm.datasette.io/).
```bash
llm install llm-transformers
```
Some pipelines that accept audio/video inputs require the [ffmpeg](https://ffmpeg.org/) executable to be installed.
The [`document-question-answering`](#document-question-answering) pipeline uses `pytesseract` which requires the [tesseract](https://tesseract-ocr.github.io/) executable.

## Usage

XXX document `-o verbose True`
XXX HF_TOKEN/key usage
This plugin exposes 🤗 Hugging Face transformers pipelines. The "model" name is `transformers`, and the pipeline task and/or Hugging Face model are specified as model options, e.g.:
```sh-session
$ llm -m transformers -o task text-generation "A dog has"
$ llm -m transformers -o model facebook/musicgen-small "techno music"
```
If only `-o task <task>` is specified, the default model for that task will be used.
If only `-o model <model>` is specified, the task will be inferred from the model.
If both are specified, then the model must be compatible with the task.

Transformers logging is verbose and disabled by default.
Specify the `-o verbose True` model option to enable it.

Most models are freely accessible, some of them require accepting a license agreement and using a Hugging Face [API token](https://huggingface.co/settings/tokens) that has access to the model.
Most 🤗 Hugging Face models are freely accessible, but some of them require accepting a license agreement
and using a Hugging Face [API token](https://huggingface.co/settings/tokens) that has access to the model.
You can use `llm keys set huggingface`, or set the `HF_TOKEN` env var, or use the `--key` option to `llm`.

```sh-session
@@ -29,6 +42,15 @@ $ llm --key hf_******************** -m transformers -o model meta-llama/Llama-3.
A dog has been named as the killer of a woman who was found dead in her home.
```

Some pipelines generate binary (audio, image, video) output; this is written to a temporary file
and the path to the file is returned.
A specific output file can be requested with the `-o output <path.suffix>` model option.
The suffix specifies the file type (e.g. `.png` vs. `.jpg`).

Pipelines can be tuned by passing additional keyword arguments to the pipeline call.
These are specified as a JSON string in the `-o kwargs '<json>'` model option.
See the documentation for a specific pipeline for information on additional keyword arguments.
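
Because the `kwargs` value is a JSON string, quoting it by hand on the command line is easy to get wrong. The following is a minimal sketch, not part of the plugin, of assembling the `llm` arguments from Python so that `json.dumps` produces the JSON. The helper name `build_llm_command` is hypothetical; the option names (`task`, `model`, `kwargs`, `output`) are the ones described above.

```python
import json
import shlex


def build_llm_command(prompt, task=None, model=None, kwargs=None, output=None):
    """Build an argv list for the `llm` CLI with transformers model options.

    Hypothetical helper for illustration; it only assembles arguments and
    does not run `llm` or validate the task/model combination.
    """
    argv = ["llm", "-m", "transformers"]
    if task is not None:
        argv += ["-o", "task", task]
    if model is not None:
        argv += ["-o", "model", model]
    if kwargs is not None:
        # Serialize pipeline keyword arguments as the JSON string `-o kwargs` expects.
        argv += ["-o", "kwargs", json.dumps(kwargs)]
    if output is not None:
        # The file suffix selects the output file type.
        argv += ["-o", "output", output]
    argv.append(prompt)
    return argv


command = build_llm_command(
    "techno music",
    model="facebook/musicgen-small",
    kwargs={"generate_kwargs": {"max_new_tokens": 100}},
)
# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(command))
```

The resulting list can also be passed directly to `subprocess.run(command)`, which avoids shell quoting altogether.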

## Transformer Pipeline Tasks

### [audio-classification](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.AudioClassificationPipeline)
@@ -62,7 +84,6 @@ XXX embed image here?
### [document-question-answering](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.DocumentQuestionAnsweringPipeline)

The `document-question-answering` task requires a `context` option which is a file or URL to an image:

```sh-session
$ llm -m transformers -o task document-question-answering -o context https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png "What is the invoice number?"
us-001
@@ -74,7 +95,6 @@ Not supported.
### [fill-mask](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.FillMaskPipeline)

`fill-mask` requires a placeholder in the prompt; this is typically `<mask>` but differs between models:

```sh-session
$ llm -m transformers -o task fill-mask "My <mask> is about to explode"
My brain is about to explode (score=0.09140042215585709)
@@ -174,7 +194,6 @@ $ llm -m transformers -o task summarization -o kwargs '{"min_length": 2, "max_le
### [table-question-answering](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TableQuestionAnsweringPipeline)

`table-question-answering` takes a required `context` option - a path to a CSV file.

```sh-session
$ cat <<EOF > /tmp/t.csv
> Repository,Stars,Contributors,Programming language
@@ -205,7 +224,6 @@ POSITIVE (0.9997681975364685)
### [text-generation](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TextGenerationPipeline)

Some `text-generation` models can be chatted with.

```sh-session
$ llm -m transformers -o task text-generation "I am going to elect"
I am going to elect the president of Mexico and that president should vote for our president," he said. "That's not very popular. That's not the American way. I would not want voters to accept the fact that that guy's running a
@@ -224,12 +242,14 @@ Your question was: "What is the capital of France?"

### [text-to-audio](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TextToAudioPipeline)

`text-to-audio` generates audio, the response is the path to the audio file.
`text-to-audio` generates audio, the response is the path to the generated audio file.
```sh-session
$ llm -m transformers -o kwargs '{"generate_kwargs": {"max_new_tokens": 100}}' -o model facebook/musicgen-small "techno music"
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpoueh05y6.wav
$ llm -m transformers -o task text-to-audio "Hello world"
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpmpwhkd8p.wav
$ llm -m transformers -o task text-to-audio -o model facebook/mms-tts-eng -o output /tmp/speech.flac "Hello world"
/tmp/speech.flac
```

### [token-classification](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TokenClassificationPipeline)
@@ -243,7 +263,6 @@ London (I-LOC: 0.998397171497345)
### [translation_xx_to_yy](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TranslationPipeline)

Substitute the from and to language codes into the task name, e.g. from `en` to `fr` would use task `translation_en_to_fr`:

```sh-session
$ llm -m transformers -o task translation_en_to_fr "How old are you?"
quel âge êtes-vous?
@@ -252,7 +271,6 @@ $ llm -m transformers -o task translation_en_to_fr "How old are you?"
### [video-classification](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.VideoClassificationPipeline)

`video-classification` task expects a video path or URL as the prompt:

```sh-session
$ llm -m transformers -o task video-classification https://huggingface.co/datasets/Xuehai/MMWorld/resolve/main/Amazing%20street%20dance%20performance%20from%20Futunity%20UK%20-%20Move%20It%202013/Amazing%20street%20dance%20performance%20from%20Futunity%20UK%20-%20Move%20It%202013.mp4
dancing ballet (0.006608937866985798)
@@ -265,7 +283,6 @@ punching bag (0.00565463537350297)
### [visual-question-answering](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.VisualQuestionAnsweringPipeline)

`visual-question-answering` task requires a `context` option - a file or URL to an image:

```sh-session
$ llm -m transformers -o task visual-question-answering -o context https://huggingface.co/datasets/Narsil/image_dummy/raw/main/lena.png "What is she wearing?"
hat (0.9480269551277161)
@@ -278,7 +295,6 @@ nothing (0.0020962499547749758)
### [zero-shot-classification](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.ZeroShotClassificationPipeline)

`zero-shot-classification` requires a comma separated list of labels to be specified in the `context` model option:

```sh-session
$ llm -m transformers -o task zero-shot-classification -o context "urgent,not urgent,phone,tablet,computer" "I have a problem with my iphone that needs to be resolved asap!!"
urgent (0.5036348700523376)
@@ -291,7 +307,6 @@ tablet (0.0023087668232619762)
### [zero-shot-image-classification](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.ZeroShotImageClassificationPipeline)

`zero-shot-image-classification` requires a comma separated list of labels to be specified in the `context` model option. The prompt is a path or URL to an image:

```sh-session
$ llm -m transformers -o task zero-shot-image-classification -o context "black and white,photorealist,painting" https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
black and white (0.9736384749412537)
@@ -301,8 +316,8 @@ painting (0.004946451168507338)

### [zero-shot-audio-classification](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.ZeroShotAudioClassificationPipeline)

`zero-shot-audio-classification` requires a comma separated list of labels to be specified in the `context` model option. The prompt is a path or URL to an audio:

`zero-shot-audio-classification` requires a comma separated list of labels to be specified in the `context` model option.
The prompt is a path or URL to an audio file:
```sh-session
$ llm -m transformers -o task zero-shot-audio-classification -o context "Sound of a bird,Sound of a dog" https://huggingface.co/datasets/s3prl/Nonspeech/resolve/main/animal_sound/n52.wav
Sound of a bird (0.9998763799667358)
@@ -313,7 +328,6 @@ Sound of a dog (0.00012355657236184925)

`zero-shot-object-detection` requires a comma separated list of labels to be specified in the `context` model option. The prompt is a path or URL to an image.
The response is JSON and includes a bounding box for each label:

```sh-session
$ llm -m transformers -o task zero-shot-object-detection -o context "cat,couch" http://images.cocodataset.org/val2017/000000039769.jpg
[
@@ -350,20 +364,13 @@ $ llm -m transformers -o task zero-shot-object-detection -o context "cat,couch"
]
```


## Development

To set up this plugin locally, first checkout the code. Then create a new virtual environment:
```bash
cd llm-transformers
python -m venv venv
source venv/bin/activate
```
Now install the dependencies and test dependencies:
```bash
llm install -e '.[test]'
```
To run the tests:
```bash
python -m pytest
```
To set up this plugin locally, first check out the code and install [`uv`](https://docs.astral.sh/uv/).
Run `uv sync` to create a virtual environment and install the dependencies, then run the tests and lint checks:
```sh-session
$ uv sync --dev
$ uv run pytest
$ uv run ruff check
$ uv run ruff format --check
```
18 changes: 9 additions & 9 deletions pyproject.toml
@@ -14,14 +14,14 @@ classifiers = [
dependencies = [
"llm>=0.16",
"transformers>=4.45.1",
"torch",
"soundfile",
"pillow",
"pytesseract",
"timm",
"protobuf",
"pandas",
"av",
"torch>=2.4.1",
"soundfile>=0.12.1",
"pillow>=10.4.0",
"pytesseract>=0.3.13",
"timm>=1.0.9",
"protobuf>=5.28.2",
"pandas>=2.2.3",
"av>=13.0.0",
"numpy>=2.1.1",
]

@@ -84,4 +84,4 @@ markers = [
"llm1: marks a subset of llm tests into their own group",
"llm2: marks a subset of llm tests into their own group",
"llm3: marks a subset of llm tests into their own group",
]
]
36 changes: 18 additions & 18 deletions uv.lock

Some generated files are not rendered by default.
