Using aiminify.minify in your Python code
To compress a model from a Python script, import the main(...) function from aiminify.minify and call it on your model, as shown in the example below:
from aiminify.minify import main
from your_project import model
# Compress with the default settings; the second return value holds feedback logs
compressed_model, _ = main(model)
Depending on the backend of the model you are compressing, it must be saved with the appropriate method. Since aiminify supports both PyTorch and TensorFlow as backends, the corresponding saving logic is implemented in the save_model() function.
#
# Using PyTorch as a backend
#
from aiminify.minify import main as minify
from aiminify import save_model
from your_project import model
# For quantized Torch models, the input shape is needed to save the model
input_shape = (1, 3, 224, 224)  # example shape; use your model's actual input shape
compressed_model, _ = minify(model)
save_model(compressed_model, "./compressed-pytorch-model.pt", input_shape)
# For non-quantized models, you can save without the input shape
compressed_model, _ = minify(model, quantization=False)
save_model(compressed_model, "./compressed-pytorch-model.pt")
#
# Using TensorFlow as a backend
#
from aiminify.minify import main as minify
from aiminify import save_model
from your_project import model
compressed_model, _ = minify(model)
save_model(compressed_model, "./compressed-tensorflow-model.keras")
aiminify.minify.main arguments:

| Name | Type and default value | Description |
| --- | --- | --- |
| model | Any, required | The model to compress |
| compression_strength | int(0, 5), default 3 | Compression strength used to define the compression strategy |
| training_generator | Any, default None | Training set used for fine-tuning |
| validation_generator | Any, default None | Validation set used for fine-tuning |
| loss_function | Any, default None | Loss function used for fine-tuning |
| verbose | int(0, 5), default 0 | Verbosity level |
| quantization | bool, default True | Quantize the model |
| fine_tune | bool, default True | Fine-tune the model after compression |
| smart_pruning | bool, default True | Use a strategy to determine the number of filters to prune instead of flat pruning x % of all filters |
| debug_mode | bool, default False | Debug logging mode |
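Putting a few of these arguments together, a more explicit call might look like the following minimal sketch; the keyword names are taken from the table above:

from aiminify.minify import main as minify
from your_project import model

# Compress more aggressively, skip quantization, and log extra detail
compressed_model, feedback = minify(
    model,
    compression_strength=4,
    quantization=False,
    verbose=2,
)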
The return value of main is (compressed_model, feedback_dictionary). compressed_model is, as the name suggests, your compressed model, using the same backend as the input (TensorFlow, PyTorch). feedback_dictionary contains logs and miscellaneous messages from the compression algorithm.
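If you want to review those messages instead of discarding them, keep the second return value. Treating feedback_dictionary as a plain dict is an assumption here, since its exact structure is not documented:

compressed_model, feedback_dictionary = minify(model)
# Assumed: feedback_dictionary behaves like a standard dict of log messages
for key, value in feedback_dictionary.items():
    print(key, value)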
training_generator, validation_generator and loss_function can be specific to the backend you're using. For example, when using PyTorch, training_generator and validation_generator need to be subclasses of torch.utils.data.Dataset, while for TensorFlow they can be subclasses of tensorflow.data.Dataset. Similarly, for PyTorch loss_function can be any member of torch.nn.modules.loss, and for TensorFlow it can be any implementation of tf.keras.losses.Loss, as in the sketch below.
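For illustration, a PyTorch fine-tuning setup could look like the following sketch. YourDataset and the dummy tensors are hypothetical stand-ins for your own data pipeline; only the keyword argument names come from the table above.

from aiminify.minify import main as minify
import torch
from your_project import model

# Hypothetical Dataset subclass standing in for your own data pipeline
class YourDataset(torch.utils.data.Dataset):
    def __init__(self, inputs, targets):
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        return self.inputs[idx], self.targets[idx]

# Dummy data purely for illustration
train_data = YourDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 10, (100,)))
val_data = YourDataset(torch.randn(20, 3, 32, 32), torch.randint(0, 10, (20,)))

compressed_model, _ = minify(
    model,
    training_generator=train_data,
    validation_generator=val_data,
    loss_function=torch.nn.CrossEntropyLoss(),  # any member of torch.nn.modules.loss
)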