User Interface

This chapter introduces the user interface.

Introduction

The basic procedure is to transform the model into an MLIR file with model_transform.py, and then transform the MLIR file into the corresponding model with model_deploy.py. Calibration is required to obtain an INT8 model. The general process is shown in the figure (User interface 1).

Other complex cases such as image input with preprocessing and multiple inputs are also supported, as shown in the figure (User interface 2).

TFLite model conversion is also supported, with the following command:

# TFLite conversion example
$ model_transform.py \
    --model_name resnet50_tf \
    --model_def  ../resnet50_int8.tflite \
    --input_shapes [[1,3,224,224]] \
    --mean 103.939,116.779,123.68 \
    --scale 1.0,1.0,1.0 \
    --pixel_format bgr \
    --test_input ../image/dog.jpg \
    --test_result resnet50_tf_top_outputs.npz \
    --mlir resnet50_tf.mlir
$ model_deploy.py \
    --mlir resnet50_tf.mlir \
    --quantize INT8 \
    --asymmetric \
    --chip bm1684x \
    --test_input resnet50_tf_in_f32.npz \
    --test_reference resnet50_tf_top_outputs.npz \
    --tolerance 0.95,0.85 \
    --model resnet50_tf_1684x.bmodel

Caffe model conversion is also supported, with the following commands:

# Caffe conversion example
$ model_transform.py \
    --model_name resnet18_cf \
    --model_def  ../resnet18.prototxt \
    --model_data ../resnet18.caffemodel \
    --input_shapes [[1,3,224,224]] \
    --mean 104,117,123 \
    --scale 1.0,1.0,1.0 \
    --pixel_format bgr \
    --test_input ../image/dog.jpg \
    --test_result resnet18_cf_top_outputs.npz \
    --mlir resnet18_cf.mlir
# The model_deploy.py call is the same as for the ONNX case
# ......

model_transform.py

Converts various neural network models into MLIR files. The supported parameters are shown below:

Function of model_transform parameters

Name                Required?   Explanation
model_name          Y           Model name
model_def           Y           Model definition file (e.g., '.onnx', '.tflite' or '.prototxt')
model_data          N           Model weight file; required for Caffe models (the '.caffemodel' file)
input_shapes        N           Shape of the inputs, e.g. [[1,3,640,640]] (a two-dimensional array); multiple inputs are supported
resize_dims         N           Size the original image is resized to; if not specified, the image is resized to the model's input size
keep_aspect_ratio   N           Whether to maintain the aspect ratio when resizing; false by default. When set, zeros are padded to the insufficient part
mean                N           Mean of each channel of the image; 0.0,0.0,0.0 by default
scale               N           Scale of each channel of the image; 1.0,1.0,1.0 by default
pixel_format        N           Image type; can be rgb, bgr, gray or rgbd
output_names        N           Names of the outputs; the model's own outputs are used if not specified
test_input          N           Input file for validation (image, npy or npz file); no validation is carried out if not specified
test_result         N           Output file to save the validation result
excepts             N           Names of network layers to exclude from validation, separated by commas
mlir                Y           Output MLIR file name (including path)
post_handle_type    N           Fuse the post-processing op into the bmodel; sets the type of post-processing op, e.g. yolo or ssd

After converting to an mlir file, a ${model_name}_in_f32.npz file will be generated, which is the input file for the subsequent models.
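If needed, this intermediate npz file can be inspected with plain numpy. A minimal sketch (the file and tensor names here are fabricated for illustration; the real file follows the ${model_name}_in_f32.npz naming and its tensor names come from the model):

```python
import numpy as np

# Fabricate a file in the same npz format for illustration only; the real
# file is produced by model_transform.py during validation.
np.savez("resnet50_tf_in_f32.npz",
         input=np.zeros((1, 3, 224, 224), dtype=np.float32))

# Inspect the file the way downstream tools consume it.
data = np.load("resnet50_tf_in_f32.npz")
for name in data.files:
    print(name, data[name].shape, data[name].dtype)
```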

run_calibration.py

Use a small number of samples for calibration to get the quantization table of the network (i.e., the threshold/min/max of each layer of op).

Supported parameters:

Function of run_calibration parameters

Name                Required?   Explanation
(None)              Y           MLIR file (positional argument)
dataset             N           Directory of input samples; images, npz or npy files are placed in this directory
data_list           N           Sample list file (cannot be used together with "dataset")
input_num           N           Number of input samples for calibration; all samples are used if it is 0
tune_num            N           Number of fine-tuning samples; 10 by default
histogram_bin_num   N           Number of histogram bins; 2048 by default
o                   Y           Name of the output calibration table file
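To make the idea of a per-layer threshold concrete, here is a toy sketch. It is not the algorithm run_calibration.py implements (which is histogram-based, hence histogram_bin_num, plus tuning); it only shows how a symmetric INT8 threshold and scale relate to observed activations:

```python
import numpy as np

def naive_threshold(activations: np.ndarray) -> float:
    # Toy rule: take the largest absolute activation value as the
    # symmetric quantization threshold for this layer.
    return float(np.max(np.abs(activations)))

# Simulated activations of one layer over a few calibration samples.
rng = np.random.default_rng(0)
samples = [rng.standard_normal((1, 3, 8, 8)) for _ in range(4)]
threshold = max(naive_threshold(s) for s in samples)
scale = threshold / 127.0  # INT8 symmetric quantization step
print(f"threshold={threshold:.4f}, scale={scale:.6f}")
```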

model_deploy.py

Converts the MLIR file into the corresponding model. The parameters are as follows:

Function of model_deploy parameters

Name                Required?   Explanation
mlir                Y           MLIR file
quantize            Y           Quantization type (F32/F16/BF16/INT8)
chip                Y           Target platform; bm1684x/bm1684/cv183x/cv182x/cv181x/cv180x are supported
calibration_table   N           Path of the quantization table; required for INT8 quantization
tolerance           N           Minimum similarity tolerance between the MLIR quantized and the MLIR fp32 inference results
test_input          N           Input file for validation (image, npy or npz file); no validation is carried out if not specified
test_reference      N           Reference data for validating the tolerance (npz format); the result of each operator
excepts             N           Names of network layers to exclude from validation, separated by commas
model               Y           Name of the output model file (including path)
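The tolerance check can be pictured as a similarity comparison between the quantized and fp32 results of each layer. The sketch below uses cosine similarity as an illustrative metric; the exact metrics model_deploy.py applies are not specified here:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for the fp32 reference and the quantized result of one layer.
rng = np.random.default_rng(0)
ref = np.linspace(-1.0, 1.0, 1000)
quant = ref + rng.normal(0.0, 0.01, ref.shape)  # small quantization noise

sim = cosine_similarity(ref, quant)
ok = sim >= 0.95  # compare against e.g. the first value of --tolerance 0.95,0.85
print(f"cosine similarity {sim:.4f}, passes: {ok}")
```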

Other Tools

model_runner.py

Runs model inference; bmodel/mlir/onnx/tflite are supported.

Example:

$ model_runner.py \
   --input sample_in_f32.npz \
   --model sample.bmodel \
   --output sample_output.npz

Supported parameters:

Function of model_runner parameters

Name                Required?   Explanation
input               Y           Input npz file
model               Y           Model file (bmodel/mlir/onnx/tflite)
dump_all_tensors    N           When specified, export all results, including intermediate tensors

npz_tool.py

npz files are widely used in the TPU-MLIR project, e.g. for saving input data and inference results. npz_tool.py is used to process npz files.

Example:

# Check the output data in sample_out.npz
$ npz_tool.py dump sample_out.npz output

Supported functions:

npz_tool functions

Function   Description
dump       Get all tensor information from an npz file
compare    Compare the difference between two npz files
to_dat     Export an npz file as a dat file (contiguous binary storage)
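For reference, the same operations can be reproduced with plain numpy; a short sketch (file and tensor names below are made up for the example):

```python
import numpy as np

# "dump": list every tensor in an npz file with its shape and dtype.
np.savez("sample_out.npz",
         output=np.arange(6, dtype=np.float32).reshape(2, 3))
data = np.load("sample_out.npz")
for name in data.files:
    print(name, data[name].shape, data[name].dtype)

# "to_dat": contiguous binary storage of a tensor, and reading it back.
data["output"].tofile("sample_out.dat")
restored = np.fromfile("sample_out.dat", dtype=np.float32).reshape(2, 3)
assert np.array_equal(restored, data["output"])
```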

visual.py

visual.py is a visual network/tensor comparison tool with a web-browser interface. If the accuracy of the quantized network is not as good as expected, this tool can be used to investigate the accuracy layer by layer.

Example:

# use TCP port 9999 in this example
$ visual.py --fp32_mlir f32.mlir --quant_mlir quant.mlir --input top_input_f32.npz --port 9999

Supported parameters:

Function of visual parameters

Name         Explanation
fp32_mlir    fp32 MLIR file
quant_mlir   Quantized MLIR file
input        Test input data for the networks; can be in jpeg or npz format
port         TCP port used for the UI; 10000 by default. The port should be mapped when starting docker
manual_run   When set, the network is only run manually; by default (false) inference runs automatically when the UI is opened