TPU-MLIR Quick Start

- Address: Building 1, Zhongguancun Integrated Circuit Design Park (ICPARK), No. 9 Fenghao East Road, Haidian District, Beijing
- Postcode: 100094
- URL:
- Email:
- Tel: 010-57590723
| Version | Release date | Explanation |
|---|---|---|
| v1.18.0 | 2025.05.01 | YOLO series adds automatic mixed precision setting; added SmoothQuant option for run_calibration; new one-click compilation script for LLM |
| v1.17.0 | 2025.04.03 | Significant improvement in LLM model compilation speed; TPULang supports PPL operator integration; fixed random error issue with Trilu bf16 on Mars3 |
| v1.16.0 | 2025.03.03 | TPULang ROI_Extractor support; Einsum supports abcde,abfge->abcdfg pattern; LLMC adds Vila model support |
| v1.15.0 | 2025.02.05 | Added LLMC quantization support; address boundary check in codegen; fixed several comparison issues |
| v1.14.0 | 2025.01.02 | Added post-processing fusion for yolov8/v11; support for Conv3D stride > 15; improved FAttention accuracy |
| v1.13.0 | 2024.12.02 | Streamlined release package; performance optimization for the MaxPoolWithMask training operator; added support for large RoPE operators |
| v1.12.0 | 2024.11.06 | tpuv7-runtime cmodel integration; BM1690 multi-core LayerGroup optimization; support for PPL backend operator development |
| v1.11.0 | 2024.09.27 | Added soc mode for BM1688 tdb; bmodel supports fine-grained merging; fixed several performance degradation issues |
| v1.10.0 | 2024.08.15 | Added yolov10 support; new quantization tuning section; optimized tpu-perf log output |
| v1.9.0 | 2024.07.16 | BM1690 added 40 model regression tests; new quantization algorithms: octav, aciq_guas and aciq_laplace |
| v1.8.0 | 2024.05.30 | BM1690 supports the multi-core MatMul operator; TPULang supports input/output order specification; tpu-perf removes the patchelf dependency |
| v1.7.0 | 2024.05.15 | CV186X changed from dual-core to single-core; BM1690 testing process aligned with BM1684X; support for gemma/llama/qwen models |
| v1.6.0 | 2024.02.23 | Added PyPI release format; support for user-defined Global operators; added CV186X processor platform support |
| v1.5.0 | 2023.11.03 | More Global Layer support for multi-core parallel processing |
| v1.4.0 | 2023.09.27 | System dependencies upgraded to Ubuntu 22.04; added BM1684 Winograd support |
| v1.3.0 | 2023.07.27 | Added manual floating-point operation region specification; added a list of supported frontend framework operators; added NNTC vs. TPU-MLIR quantization comparison |
| v1.2.0 | 2023.06.14 | Adjusted mixed quantization examples |
| v1.1.0 | 2023.05.26 | Added post-processing using the intelligent deep learning processor |
| v1.0.0 | 2023.04.10 | PyTorch support; added section for PyTorch model conversion |
| v0.8.0 | 2023.02.28 | Added pre-processing using the intelligent deep learning processor |
| v0.6.0 | 2022.11.05 | Added section on the mixed precision workflow |
| v0.5.0 | 2022.10.20 | Added model-zoo specification to test all models within it |
| v0.4.0 | 2022.09.20 | Caffe support; added section for Caffe model conversion |
| v0.3.0 | 2022.08.24 | TFLite support; added section for TFLite model conversion |
| v0.2.0 | 2022.08.02 | Added chapter on running test samples in the SDK |
| v0.1.0 | 2022.07.29 | Initial release, supports |
Table of contents
- 1. TPU-MLIR Introduction
- 2. Environment Setup
- 3. Compile the ONNX Model
- 4. Compile the Torch Model
- 5. Compile the Caffe Model
- 6. Compile the TFLite Model
- 7. Quantization and Optimization
- 8. Use Tensor Computing Processor for Preprocessing
- 9. Use Tensor Computing Processor for Postprocessing
- 10. Compile LLM Model
- 11. Appendix.01: Reference for Converting Models to ONNX Format
- 12. Appendix.02: CV18xx Guidance
- 13. Appendix.03: BM168x Guidance
- 14. Appendix.04: Model-zoo Test
- 15. Appendix.05: TPU Profile Tool Guidance
- 16. Appendix.06: TDB Guidance
- 17. Appendix.07: Supported Operations