Keep up with the latest news from TPU-MLIR

Supporting Forward Operation Part of DragGAN Model on TPU-MLIR

2023/10/11 18:20:51

DragGAN Background: DragGAN originates from the paper ‘Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold,’ presented at SIGGRAPH 2023. The model takes an original image and user interactions (points) as input and allows for image modifications guid ...

Analyse TPU Performance with TPU Profile

2023/09/18 19:10:12

1. Software and Hardware Framework of TPU: As the following figure shows, a complete TPU application depends on the cooperation of software and hardware. For software, the host provides libsophon and the driver software packages. The driver abstracts the mechanisms of basic communication and resource management, defines func ...

Implementing LLM INT8 Quantization Deployment with TPU-MLIR

2023/09/18 18:52:30

1. Background: In July 2023, we completed the deployment of ChatGLM2-6B on the BM1684X processor using a static-graph approach, with F16 quantization mode and a 12GB model size. The average speed is 3 tokens/s (see ChatGLM2-6B Workflow Analysis and TPU-MLIR Deployment). In order to further enhance the model ...

Transforming and Deploying Stable Diffusion with TPU-MLIR

2023/08/18 12:47:40

The TPU-MLIR compiler can transform deep learning models that run on platforms such as GPUs into bmodels that can run on arithmetic capability chips. This document provides an overview of how to use TPU-MLIR to port the Stable Diffusion model running on GPUs to arithmetic capabilit ...

Introduction to sensitive layer search of TPU-MLIR

2023/08/18 12:42:40

Background: The TPU-MLIR compiler converts machine learning models into bmodels that can be run on Sophgo chips. Floating-point computation consumes more computing resources and storage space than fixed-point computation. Thus, the quantized model (also known as fixed-point model) is ...