# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/), and this project adheres to [Semantic Versioning](https://semver.org/).
### Added

- Installation instructions for binary releases
- Warning if a non-customized PyTorch version is detected, which cannot calculate gradients for non-complex tensor types (see the detection sketch below)
- Development scripts for preparing binary releases

### Changed

- Updated development scripts for binary releases
- Adjusted rpaths in .so files (based on PyTorch's implemented solution)
- Changed the Docker base image to the manywheel builder image
- Updated build instructions to clarify torchvision installation
- Adapted `setup.py` logic for preparing binary releases

### Fixed

- Broken build process, by pinning the setuptools version
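For context on the warning above: stock PyTorch only tracks gradients for floating-point and complex tensors, so requesting gradients on, say, an integer tensor raises a `RuntimeError`. A minimal detection sketch along those lines; the probe below is an illustration, not necessarily the project's actual check:

```python
import torch

def supports_integer_grads() -> bool:
    # Stock PyTorch raises "Only Tensors of floating point and complex
    # dtype can require gradients" here; a customized build that supports
    # gradients for such tensor types would not. (Illustrative probe only.)
    try:
        torch.zeros(1, dtype=torch.int32, requires_grad=True)
        return True
    except RuntimeError:
        return False

if not supports_integer_grads():
    print("Non-customized PyTorch detected: no gradients for integer tensors.")
```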
### Changed

- Tuned the hyperparameters of the DiodeMix optimizer for SFT (supervised fine-tuning)
- Added SFT support for the classical GPTQ-style models
- Implemented the qzeros update in the fine-tuning process
- Extended the `pack_fp_weight` function (a toy packing illustration follows below)
- Improved the performance of the MPQLinearCUDA layer
- Improved the performance of the MBWQ linear layer for processing long sequences, addressing previous inefficiencies

### Fixed

- Various errors in the DiodeMix update function
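As background for the `pack_fp_weight` items, here is a toy sketch of what packing floating-point weights into low-bit storage involves. This is not the project's implementation: the symmetric 4-bit scheme, the two-nibbles-per-byte layout, and the name `toy_pack_fp_weight` are all assumptions for illustration.

```python
import torch

def toy_pack_fp_weight(w: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    # Toy stand-in for pack_fp_weight; assumes an even number of elements
    # and a symmetric 4-bit quantization scheme (both are assumptions).
    scale = w.abs().max() / 7.0                      # map weights into [-7, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7)   # signed 4-bit levels
    q = (q + 8).to(torch.uint8).flatten()            # shift into [0, 15]
    packed = (q[0::2] << 4) | q[1::2]                # two nibbles per byte
    return packed, scale

packed, scale = toy_pack_fp_weight(torch.randn(128, 64))
print(packed.shape)  # half as many bytes as there were weights
```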
### Added

- Building instructions (adding a section for cutlass)
- Checksums for custom torch builds (within docker; see the verification sketch below)

### Fixed

- An error in `pack_fp_weight`
- Broken links in README.md and index.rst
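A minimal sketch of how such a checksum could be verified before installing a custom torch build; the wheel name and expected digest are placeholders, and the project's docker tooling may do this differently:

```python
import hashlib

def sha256sum(path: str) -> str:
    # Stream the file in 1 MiB chunks so large wheels don't load into memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical usage; the wheel name and expected digest are placeholders:
# assert sha256sum("torch-custom.whl") == "<digest published with the build>"
```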
### Added

- Quantized layers with different acceleration options (a usage sketch follows at the end of this section):
  - QConv (binary, quantized) - CPU, Cutlass
  - QLinear (binary, quantized, mixed bit-width) - CUDA, Cutlass, MPS
  - QEmbedding (binary)
- Optimizer(s) for quantized layers:
  - Hybrid optimizer `diode_beta`, based on Diode v1 (binary) and AdamW (quantized), for memory-efficient training
  - Initial support for GaLore projection
- Examples:
  - MNIST training script, with and without PyTorch Lightning
The first release of basic functionality.
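To make the release's surface area concrete, here is a minimal sketch of how a quantized layer and the hybrid optimizer might slot into a training step. The import paths and constructor signatures are assumptions (commented out below); the runnable fallback uses plain PyTorch:

```python
import torch

# Hypothetical imports; the package and module names are placeholders:
# from <package>.layers import QLinear
# from <package>.optim import diode_beta

class TinyNet(torch.nn.Module):
    """Plain-PyTorch stand-in: swapping nn.Linear for QLinear (and AdamW for
    the diode_beta hybrid optimizer) would give the quantized variant."""
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(784, 128)  # QLinear(...) in the quantized variant
        self.fc2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x.flatten(1))))

model = TinyNet()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # diode_beta(...) in the quantized variant

# One training step on fake MNIST-shaped data:
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```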