AutoQNN is an efficient, end-to-end framework designed for the automatic quantization of various layers within Deep Neural Networks (DNNs), employing diverse schemes and bitwidths without any manual intervention. You can find our paper here. This repository contains the open-source implementation of AutoQNN in PyTorch.
AutoQNN aims to identify optimal quantization schemes and mixed-precision policies for mainstream DNN models in an efficient manner by incorporating techniques such as Quantization Scheme Search (QSS) and Quantizing Precision Learning (QPL). QSS introduces a diverse set of typical quantizers as candidates for the search process and employs the Differentiable Neural Architecture Search (DNAS) algorithm to identify the desired scheme for each layer or the entire model. QPL is an innovative approach that learns mixed-precision policies by reparameterizing the bitwidths associated with quantization schemes. It efficiently optimizes both the classification loss and the precision loss of DNNs, achieving a relatively optimal mixed-precision model while maintaining a constrained model size and memory footprint.
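To make QSS and QPL concrete, below is a minimal, self-contained sketch of both ideas, assuming a DNAS-style softmax relaxation over candidate quantizers and straight-through rounding of a continuous bitwidth; the class names are illustrative and this is not the framework's actual implementation.

import torch
from torch import nn
import torch.nn.functional as F

class SignQuant(nn.Module):
    def forward(self, w):
        # Toy binary (sign) candidate quantizer with a straight-through estimator.
        return w + (torch.sign(w) - w).detach()

class QuantizerSearchCell(nn.Module):
    """QSS sketch: softly mixes candidate quantizers with learnable logits."""
    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)
        # One learnable architecture logit per candidate quantizer.
        self.alpha = nn.Parameter(torch.zeros(len(candidates)))

    def forward(self, w):
        # Relax the discrete choice into a softmax-weighted mixture so the
        # logits receive gradients from the task loss during the search.
        probs = F.softmax(self.alpha, dim=0)
        return sum(p * q(w) for p, q in zip(probs, self.candidates))

    def chosen(self):
        # After the search, commit to the most probable candidate.
        return self.candidates[int(self.alpha.argmax())]

class LearnableBitwidth(nn.Module):
    """QPL sketch: a continuous bitwidth parameter, rounded to an integer
    with a straight-through estimator so it can be optimized jointly with
    the classification loss and a precision (model-size) loss."""
    def __init__(self, init_bits=8.0):
        super().__init__()
        self.b = nn.Parameter(torch.tensor(init_bits))

    def forward(self):
        b = self.b.clamp(2.0, 8.0)
        return b + (b.round() - b).detach()

# Example: search between keeping weights in float and binarizing them.
cell = QuantizerSearchCell([nn.Identity(), SignQuant()])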
We provide a Python package for installation; you can install AutoQNN with one of the following commands. You can also clone this repository to modify the source code.
# Install AutoQNN from Gitee
pip install git+https://gitee.com/GooCee/AutoQNN-pytorch.git
# Install AutoQNN from GitHub
pip install git+https://github.com/GongCheng1919/autoqnn-pytorch.git
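After installation, you can verify that the package imports correctly:
# Verify the installation
python -c "import autoqnn; print(autoqnn.__name__)"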
We offer a user-friendly quantization framework for PyTorch named AutoQNN-PyTorch, which includes many classic quantization algorithms such as TWNs, Fixed Quantizer, Residual Quantizer, VecQ, μL2Q, and ESB Quantizer. All of these quantization algorithms can be found in the directory here.
Moreover, you can refer to our quantization interface to easily define your own quantization algorithm. Below is an example of how to use this framework to quantize a PyTorch model.
import torch
import autoqnn

# Replace YourModel with your own torch.nn.Module subclass.
module = YourModel()

# Convert the float model into a quantized one: 4-bit fixed-point
# quantization for both weights and activations.
q_model = autoqnn.core.convert(
    module,
    quantize_config_dict={
        "weight_quant": autoqnn.quantizers.FixedQuant(bitwidth=4),
        "act_quant": autoqnn.quantizers.FixedQuantAct(bitwidth=4),
    },
)
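A custom quantizer can follow the same call pattern as the built-in ones above: a module that maps a float tensor to its quantized counterpart. The sketch below is a hypothetical example (symmetric uniform quantization with a straight-through estimator); AutoQNN's actual quantizer base class may require additional methods, so treat this as a starting point rather than the definitive interface.

import torch
from torch import nn

class SymmetricUniformQuant(nn.Module):
    """Hypothetical custom weight quantizer: symmetric uniform quantization
    with a straight-through estimator (STE)."""
    def __init__(self, bitwidth=4):
        super().__init__()
        self.bitwidth = bitwidth

    def forward(self, w):
        qmax = 2 ** (self.bitwidth - 1) - 1
        # Scale the largest magnitude onto the integer grid [-qmax, qmax].
        scale = w.detach().abs().max().clamp(min=1e-8) / qmax
        w_q = (w / scale).round().clamp(-qmax, qmax) * scale
        # STE: quantize in the forward pass, pass gradients through unchanged.
        return w + (w_q - w).detach()

You can then pass SymmetricUniformQuant(bitwidth=4) as the "weight_quant" entry of quantize_config_dict.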
The ESB quantizer is a novel nonlinear quantizer designed for Deep Neural Networks (DNNs). It offers an accurate and efficient quantization mapping along with a hardware-friendly quantizing format, enabling the simultaneous maintenance of both accuracy and hardware efficiency. You can find the associated paper here.
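As a toy illustration of the significant-bit idea (not the paper's exact quantization grid or encoding format), the following sketch rounds each value to a power-of-two exponent plus a short mantissa:

import torch

def round_to_significant_bits(w, sig_bits=2):
    # Keep only `sig_bits` significant bits of each value's mantissa.
    sign = torch.sign(w)
    mag = w.abs().clamp(min=1e-12)
    exp = torch.floor(torch.log2(mag))        # power-of-two exponent
    frac = mag / 2 ** exp                     # mantissa in [1, 2)
    step = 2.0 ** -(sig_bits - 1)             # mantissa resolution
    frac_q = torch.round(frac / step) * step  # round the mantissa
    return sign * frac_q * 2 ** exp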
We have implemented the ESB quantizer within the AutoQNN framework. To enable the ESB quantizer in your quantization process, you can use the following code snippet:
module = YourModel()

esb_quantizer = autoqnn.quantizers.ESB_PerChannel(
    bitwidth=4,
    linear_bits=1,     # number of significant bits
    per_channel=False,
    mean_shift=False,  # do not enable symmetric quantization
)

q_model = autoqnn.core.convert(
    module,
    quantize_config_dict={
        "weight_quant": esb_quantizer,
        "act_quant": autoqnn.quantizers.FixedQuantAct(bitwidth=4),
    },
)
The following figure shows a histogram comparing the floating-point weights with their quantized counterparts under the ESB quantizer:
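To reproduce a similar comparison for your own model, you can plot both distributions with matplotlib. The layer attribute below is a placeholder, and applying the quantizer module directly to a weight tensor is an assumption that mirrors the quantizer usage above:

import matplotlib.pyplot as plt

w = module.some_layer.weight.detach().flatten()  # placeholder: pick one of your layers
w_q = esb_quantizer(w).detach()                  # assumed quantizer call interface

plt.hist(w.numpy(), bins=100, alpha=0.5, label="float32")
plt.hist(w_q.numpy(), bins=100, alpha=0.5, label="ESB 4-bit")
plt.xlabel("weight value")
plt.ylabel("count")
plt.legend()
plt.show()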
If you find our work useful, please consider citing the following papers in your publications.
AutoQNN:
@article{gong2024autoqnn,
  title={AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks},
  author={Gong, Cheng and Lu, Ye and Dai, Su-Rong and Deng, Qian and Du, Cheng-Kun and Li, Tao},
  journal={Journal of Computer Science and Technology},
  volume={39},
  number={2},
  pages={401--420},
  year={2024},
  publisher={Springer}
}
ESB:
@article{gong2022elastic,
  title={Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks},
  author={Gong, Cheng and Lu, Ye and Xie, Kunpeng and Jin, Zongming and Li, Tao and Wang, Yanzhi},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  volume={33},
  number={11},
  pages={3178--3193},
  year={2022},
  publisher={IEEE Computer Society}
}