Model Quantization Benchmark

MQBench is a benchmark and framework for evaluating the quantization algorithms under real world hardware deployments. Integrated with the latest features of Pytorch, MQBench can automated trace a full precision model and convert it to quantized model. It provides numerous hardware & algorithms for researchers to benchmark the deployability and reproducibility for quantization. We open source the MQBench library to facilitate the community.

Overview

Reproducibility: MQBench unifies the training hyper-parameters and compare different algorithms fairly.
Deployability: MQBench sumarizes the quantization schemes of 5 deep learning acceleraters and align the quantization point by a flexible toolkit.
These two points are always nelegected by previous works. More detailed infomation see our benchmark paper, toolkit and documentation.

Key Features

Various Advanced Algorithms

Multiple Hardware Support

Unified PyTorch Toolkit

Available Leaderboards

TensorRT ACL TVM SNPE FBGEMM

TensorRT Leaderboard

ACL Leaderboard

TVM Leaderboard

SNPE Leaderboard

FBGEMM Leaderboard

Submission

MQBench is flexible to add support for new hardware or quantization algorithms. WELCOME to contribute and submit new results following the README instruction.

Citation

Consider citing our benchmark paper if you want to reference our leaderboard:

@article{2021MQBench,
    title={MQBench: Towards Reproducible and Deployable Model Quantization Benchmark},
    author={Yuhang Li* and Mingzhu Shen* and Jian Ma* and Yan Ren* and Mingxin Zhao* and Qi Zhang* and
                   Ruihao Gong and Fengwei Yu and Junjie Yan},
    journal={https://openreview.net/forum?id=TUplOmF8DsM},
    year={2021}
}

Contact Us

Toolchain Team
Sensetime Research

https://github.com/TheGreatCold/mqbench

Wechat Group Helper

QQ Group