Releases · oobabooga/text-generation-webui
snapshot-2023-11-05
What's Changed
- updated wiki link by @senadev42 in #4415
- Bump AutoAWQ to v0.1.5 by @casper-hansen in #4410
- Bump exllamav2 version to 0.0.7 by @Soefati in #4417
- Bugfix: Updating the shared settings object when loading a model by @ziadloo in #4425
- [Fix] OpenOrca-Platypus2 models should use correct instruction_template when matching against models/config.yaml by @deevis in #4435
- make torch.load a bit safer by @julien-c in #4448
- transformers: Add a flag to force load from safetensors by @julien-c in #4450
- Implement Min P as a sampler option in HF loaders by @kalomaze in #4449 (see the sketch after this list)
- Add temperature_last parameter by @oobabooga in #4472
- Bump AWQ to 0.1.6 by @casper-hansen in #4470
- fixed two links in the ui by @wvanderp in #4452
- add use_flash_attention_2 to param for Model loader Transformers by @fenglui in #4373
- Bump transformers to 4.35.* by @Soefati in #4474
- Merge dev branch by @oobabooga in #4475
- Merge dev branch by @oobabooga in #4476
- Fix openai extension not working because of absent new defaults by @kabachuha in #4477
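Min P (#4449 above) keeps only the tokens whose probability is at least min_p times that of the most likely token, and temperature_last (#4472) lets temperature be applied after such filters. A minimal sketch of the filtering idea, not the project's actual implementation (names are illustrative):

```python
import torch

def min_p_filter(logits: torch.Tensor, min_p: float = 0.05) -> torch.Tensor:
    """Mask out tokens whose probability falls below min_p * max_prob."""
    probs = torch.softmax(logits, dim=-1)
    threshold = probs.max(dim=-1, keepdim=True).values * min_p
    return logits.masked_fill(probs < threshold, float("-inf"))

# With temperature_last enabled, temperature is applied after the filter:
# final_logits = min_p_filter(logits, 0.05) / temperature
```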
New Contributors
- @senadev42 made their first contribution in #4415
- @Soefati made their first contribution in #4417
- @ziadloo made their first contribution in #4425
- @deevis made their first contribution in #4435
- @julien-c made their first contribution in #4448
- @wvanderp made their first contribution in #4452
- @fenglui made their first contribution in #4373
Full Changelog: snapshot-2023-10-29...snapshot-2023-11-05
snapshot-2023-10-29
What's Changed
- Add additive_repetition_penalty sampler setting. by @tdrussell in #3627
- Fix training.py tutorial url by @adrianfiedler in #4367
- Rename additive_repetition_penalty to presence_penalty, add frequency_penalty by @tdrussell in #4376
- Replace hashlib.sha256 with hashlib.file_digest, so we don't need to load entire files into RAM before hashing them. by @LightningDragon in #4383 (see the sketch after this list)
- Fix Gradio warning message regarding custom value by @GuizzyQC in #4391
- Intel Gpu support initialization by @abhilash1910 in #4340
- Update accelerate requirement from ==0.23.* to ==0.24.* by @dependabot in #4400
- Adding platform_system to autoawq by @jamesbraza in #4390
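hashlib.file_digest, adopted in #4383 above, streams a file through the hash in chunks instead of reading it into RAM first. A minimal sketch (the path is illustrative; file_digest requires Python 3.11+):

```python
import hashlib

# Stream the file through SHA-256 without loading it whole into memory.
with open("models/my-model.safetensors", "rb") as f:  # illustrative path
    digest = hashlib.file_digest(f, "sha256")  # Python 3.11+
print(digest.hexdigest())
```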
New Contributors
- @adrianfiedler made their first contribution in #4367
- @LightningDragon made their first contribution in #4383
- @abhilash1910 made their first contribution in #4340
Full Changelog: snapshot-2023-10-22...snapshot-2023-10-29
snapshot-2023-10-22
What's Changed
- Fix for using Torch with CUDA 11.8 by @sammcj in #4298
- openai: fix wrong models list on query present in /v1/models by @hronoas in #4139
- More silero languages by @missionfloyd in #3950
- ExLlamav2_HF: Convert logits to FP32 by @turboderp in #4310
- Support LLaVA v1.5 by @haotian-liu in #4305
- Structured requirements && Python 3.11 support by @mjbogusz in #4233
- Enable special token support for exllamav2 by @JohanAR in #4314
- Add flash-attention 2 for windows by @bdashore3 in #4235
- Docker: Remove explicit CUDA 11.8 Reference by @whiteadam in #4343
- Add a proper documentation by @oobabooga in #3885
- USE_CUDA118 from ENV remains null one_click.py + cuda-toolkit by @mongolu in #4352
- Training PRO a month worth of updates by @FartyPants in #4345
- Support LLaVA v1.5 7B by @cnut1648 in #4348
- Option to select/target additional linear modules/layers in LORA training by @computerman00 in #4178 (see the sketch after this list)
- Allow multiple loaded LoRAs to simultaneously influence the output by @Googulator in #3120
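Targeting extra linear layers in LoRA training (#4178 above) amounts to listing more module names for the adapter to attach to. A minimal sketch using the peft library; the module names are illustrative and depend on the model architecture:

```python
from peft import LoraConfig

# In addition to the usual attention projections, MLP layers such as
# gate_proj/up_proj/down_proj can also receive LoRA adapters.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # model-dependent
    task_type="CAUSAL_LM",
)
```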
New Contributors
- @hronoas made their first contribution in #4139
- @mjbogusz made their first contribution in #4233
- @whiteadam made their first contribution in #4343
- @mongolu made their first contribution in #4352
- @cnut1648 made their first contribution in #4348
- @computerman00 made their first contribution in #4178
- @Googulator made their first contribution in #3120
Full Changelog: snapshot-2023-10-15...snapshot-2023-10-22
snapshot-2023-10-15
Switching to a rolling release model with weekly snapshots.
What's Changed
- Flash attention fix redux. by @Ph0rk0z in #4247
- Bump safetensors from 0.3.2 to 0.4.0 by @dependabot in #4249
- Support LLaVA-LLaMA-2 by @haotian-liu in #3377
- Bump to latest gradio (3.47) by @oobabooga in #4258
- Add HTTPS Support to openai extension by @chuyqa in #4270
- Add ChatML support + Mistral-OpenOrca by @netrunnereve in #4275 (see the sketch after this list)
- Use Pytorch 2.1 exllama wheels by @jllllll in #4285
- Exllamav2 lora support by @Ph0rk0z in #4229
- Relax numpy version requirements by @JohanAR in #4291
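ChatML, supported as of #4275 above, wraps every message in <|im_start|>/<|im_end|> markers with a role tag. A minimal sketch of the prompt format used by models like Mistral-OpenOrca:

```python
def chatml_prompt(system: str, user: str) -> str:
    """Build a ChatML-formatted prompt string."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # the model completes from here
    )

print(chatml_prompt("You are a helpful assistant.", "Hello!"))
```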
New Contributors
- @haotian-liu made their first contribution in #3377
- @chuyqa made their first contribution in #4270
Full Changelog: v1.7...snapshot-2023-10-15
v1.7
What's Changed
- Check '--model-dir' for no models warning in one-click-installer by @jllllll in #4067
- Supercharging superbooga by @HideLord in #3272
- Fix old install migration for WSL installer by @jllllll in #4093
- Expand MacOS llama.cpp support in requirements.txt by @jllllll in #4094
- Bump exllamav2 to 0.0.4 and use pre-built wheels by @jllllll in #4095
- Enable NUMA feature for llama_cpp_python by @StoyanStAtanasov in #4040
- fix: add missing superboogav2 dep by @sammcj in #4099
- Delete extensions/Training_PRO/readme.md by @missionfloyd in #4112
- Bump llama-cpp-python to 0.2.7 by @jllllll in #4110
- fix: update superboogav2 requirements.txt by @wangcx18 in #4100
- Update one_click.py to initialize site_packages_path variable by @Psynbiotik in #4118
- Let model downloader download *.tiktoken as well by @happyme531 in #4121
- Bump llama-cpp-python to 0.2.11 by @jllllll in #4142
- Add grammar to transformers and _HF loaders by @oobabooga in #4091 (see the sketch after this list)
- Ignoring custom changes to CMD_FLAGS.txt on update. by @berkut1 in #4181
- Fix off-by-one error in exllama_hf caching logic by @tdrussell in #4145
- AutoAWQ: initial support by @cal066 in #3999
- Bump ExLlamaV2 to 0.0.5 by @turboderp in #4186
- Bump AutoAWQ to v0.1.4 by @casper-hansen in #4203
- Fix python wheels for avx requirements by @AG-w in #4189
- Bump to pytorch 11.8 by @oobabooga in #4209
- Use GPTQ wheels compatible with Pytorch 2.1 by @jllllll in #4210
- Fix CFG init with Llamacpp_HF by @bdashore3 in #4219
- Text Generation: Abort if EOS token is reached by @bdashore3 in #4213
- README for superboogav2 by @jamesbraza in #4212
- Move import in llama_attn_hijack.py by @Ph0rk0z in #4231
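The grammar support added in #4091 above constrains sampling so the output must match a GBNF grammar (the format popularized by llama.cpp). A minimal illustrative grammar; the parameter name grammar_string is an assumption about how the UI and API expose it:

```python
# GBNF grammar restricting the model to a yes/no answer (illustrative).
grammar_string = r"""
root ::= answer
answer ::= "yes" | "no"
"""
```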
New Contributors
- @StoyanStAtanasov made their first contribution in #4040
- @Psynbiotik made their first contribution in #4118
- @turboderp made their first contribution in #4186
- @casper-hansen made their first contribution in #4203
- @AG-w made their first contribution in #4189
- @bdashore3 made their first contribution in #4219
Full Changelog: 1.6.1...v1.7
1.6.1
What's Changed
- Use call for conda deactivate in Windows installer by @jllllll in #4042
- [extensions/openai] Fix error when preparing cache for embedding models by @wangcx18 in #3995
- Create alternative requirements.txt with AMD and Metal wheels by @oobabooga in #4052
- Add a grammar editor to the UI by @oobabooga in #4061
- Avoid importing torch in one-click-installer by @jllllll in #4064
Full Changelog: v1.6...1.6.1
v1.6
The one-click-installers have been merged into the repository. Migration instructions can be found here.
The updated one-click install features an installation size several GB smaller and a more reliable update procedure.
What's Changed
- sd_api_pictures: Widen sliders for image size minimum and maximum by @GuizzyQC in #3326
- Bump exllama module to 0.0.9 by @jllllll in #3338
- Add an extension that makes chat replies longer by @oobabooga in #3363
- add chat instruction config for BaiChuan-chat model by @CrazyShipOne in #3332
- [extensions/openai] +Array input (batched) , +Fixes by @matatonic in #3309
- Add a scrollbar to notebook/default textboxes, improve chat scrollbar style by @jparmstr in #3403
- Add auto_max_new_tokens parameter by @oobabooga in #3419
- Add the --cpu option for llama.cpp to prevent CUDA from being used by @oobabooga in #3432
- Use character settings from API properties if present by @rafa-9 in #3428
- Add standalone Dockerfile for NVIDIA Jetson by @toolboc in #3336
- More models: +StableBeluga2 by @matatonic in #3415
- [extensions/openai] include content-length for json replies by @matatonic in #3416
- Fix llama.cpp truncation by @jparmstr in #3400
- Remove unnecessary chat.js by @missionfloyd in #3445
- Add back silero preview by @missionfloyd by @oobabooga in #3446
- Add SSL certificate support by @oobabooga in #3453
- Bump bitsandbytes to 0.41.1 by @jllllll in #3457
- [Bug fix] Remove html tags form the Prompt sent to Stable Diffusion by @SodaPrettyCold in #3151
- Fix: Mirostat fails on models split across multiple GPUs. by @Ph0rk0z in #3465
- Bump exllama wheels to 0.0.10 by @jllllll in #3467
- Create logs dir if missing when saving history by @jllllll in #3462
- Fix chat message order by @missionfloyd in #3461
- Add Classifier Free Guidance (CFG) for Transformers/ExLlama by @oobabooga in #3325 (see the sketch at the end of this list)
- Refactor everything by @oobabooga in #3481
- Use chat_instruct_command in API by @jllllll in #3482
- Make dockerfile respect specified cuda version by @sammcj in #3474
- Fixed a typo where the llama.cpp model parameter display did not correctly show "rms_norm_eps" by @berkut1 in #3494
- Add option for named cloudflare tunnels by @Fredddi43 in #3364
- Fix superbooga when using regenerate by @oderwat in #3362
- Added the logic for starchat model series by @giprime in #3185
- Streamline GPTQ-for-LLaMa support by @jllllll in #3526
- Add Vicuna-v1.5 detection by @berkut1 in #3524
- ctransformers: another attempt by @cal066 in #3313
- Bump ctransformers wheel version by @jllllll in #3558
- ctransformers: move thread and seed parameters by @cal066 in #3543
- Unify the 3 interface modes by @oobabooga in #3554
- Various ctransformers fixes by @netrunnereve in #3556
- Add "save defaults to settings.yaml" button by @oobabooga in #3574
- Add the --disable_exllama option for AutoGPTQ by @clefever in #3545
- ctransformers: Fix up model_type name consistency by @cal066 in #3567
- Add a "Show controls" button to chat UI by @oobabooga in #3590
- Improved chat scrolling by @oobabooga in #3601
- fixes error when not specifying tunnel id by @ausboss in #3606
- Fix print CSS by @missionfloyd in #3608
- Bump llama-cpp-python by @oobabooga in #3610
- Bump llama_cpp_python_cuda to 0.1.78 by @jllllll in #3614
- Refactor the training tab by @oobabooga in #3619
- llama.cpp: make Stop button work with streaming disabled by @cebtenzzre in #3620
- Unescape last message by @missionfloyd in #3623
- Improve readability of download-model.py by @Thutmose3 in #3497
- Add probability dropdown to perplexity_colors extension by @SeanScripts in #3148
- Add a simple logit viewer by @oobabooga in #3636
- Fix whitespace formatting in perplexity_colors extension. by @tdrussell in #3643
- ctransformers: add mlock and no-mmap options by @cal066 in #3649
- Update requirements.txt by @tkbit in #3651
- Add missing extensions to Dockerfile by @sammcj in #3544
- Implement CFG for ExLlama_HF by @oobabooga in #3666
- Add CFG to llamacpp_HF (second attempt) by @oobabooga in #3678
- ctransformers: gguf support by @cal066 in #3685
- Fix ctransformers threads auto-detection by @jllllll in #3688
- Use separate llama-cpp-python packages for GGML support by @jllllll in #3697
- GGUF by @oobabooga in #3695
- Fix ctransformers model unload by @marella in #3711
- Add ffmpeg to the Docker image by @kelvie in #3664
- accept floating-point alpha value on the command line by @cebtenzzre in #3712
- Bump llama-cpp-python to 0.1.81 by @jllllll in #3716
- Make it possible to scroll during streaming by @oobabooga in #3721
- Bump llama-cpp-python to 0.1.82 by @jllllll in #3730
- Bump ctransformers to 0.2.25 by @jllllll in #3740
- Add max_tokens_second param by @oobabooga in #3533
- Update requirements.txt by @VishwasKukreti in #3725
- Update llama.cpp.md by @q5sys in #3702
- Bump llama-cpp-python to 0.1.83 by @jllllll in #3745
- Update download-model.py (Allow single file download) by @bet0x in #3732
- Allow downloading single file from UI by @missionfloyd in #3737
- Bump exllama to 0.0.14 by @jllllll in #3758
- Bump llama-cpp-python to 0.1.84 by @jllllll in #3854
- Update transformers requirement from ==4.32.* to ==4.33.* by @dependabot in #3865
- Bump exllama to 0.1.17 by @jllllll in #3847
- Exllama new rope settings by @Ph0rk0z in #3852
- fix lora training with alpaca_lora_4bit by @johnsmith...
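Classifier-Free Guidance, added for Transformers/ExLlama in #3325 earlier in this list, steers generation by extrapolating from an unconditional (or negative-prompt) distribution toward the conditional one. A minimal sketch of the core update, not the project's exact code:

```python
import torch.nn.functional as F

def cfg_logits(cond_logits, uncond_logits, guidance_scale=1.5):
    """Classifier-Free Guidance over log-probabilities."""
    cond = F.log_softmax(cond_logits, dim=-1)
    uncond = F.log_softmax(uncond_logits, dim=-1)
    # scale = 1.0 is a no-op; larger values push harder toward the prompt
    return uncond + guidance_scale * (cond - uncond)
```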
v1.5
What's Changed
- Add a detailed extension example and update the extension docs. The example can be found here: example/script.py.
- Introduce a new chat_input_modifier extension function and deprecate the old input_hijack (see the sketch after this list).
- Change rms_norm_eps to 5e-6 for all llama-2 models (previously just llama-2-70b GGML) -- this value reduces the perplexities of the models.
- Remove FlexGen support. It has been made obsolete by the lack of Llama support and the emergence of llama.cpp and 4-bit quantization. I can add it back if it ever gets updated.
- Use the dark theme by default.
- Set the correct instruction template for the model when switching from default/notebook modes to chat mode.
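A minimal sketch of the new chat_input_modifier hook mentioned above, following the pattern in the extension docs (the extension name is illustrative):

```python
# extensions/my_extension/script.py (illustrative location)

def chat_input_modifier(text, visible_text, state):
    """
    Modify the user input in chat mode.
    text is what gets sent to the model; visible_text is what the chat log shows.
    """
    text = text.strip()  # example transformation
    return text, visible_text
```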
Bug fixes
- [extensions/openai] Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. by @matatonic in #3122
- Fix typo in README.md by @eltociear in #3286
- README updates and improvements by @netrunnereve in #3198
- Ignore values in training.py which are not string by @Foxtr0t1337 in #3287
v1.4
What's Changed
- Add llama-2-70b GGML support by @oobabooga in #3285
- Bump bitsandbytes to 0.41.0 by @jllllll in #3258 -- faster speeds
- Bump exllama module to 0.0.8 by @jllllll in #3256 -- expanded LoRA support
Bug fixes
Extensions
- [extensions/openai] Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. by @matatonic in #3122