Using AIminify

Using aiminify.minify in your Python code

To compress a model from a Python script, import main from aiminify.minify and call it, as shown in the example below:

from aiminify.minify import main
from your_project import model

compressed_model, _ = main(model)

Depending on the type of model you are compressing, it can be saved using the appropriate method. Since aiminify supports PyTorch and TensorFlow as backends, the backend-specific saving logic is implemented in the save_model() function.

#
# Using PyTorch as a backend
#
from aiminify.minify import main as minify
from aiminify import save_model
from your_project import model

# For quantized Torch models, the input shape is needed to save the model,
# e.g. (1, 3, 224, 224) for a single 224x224 RGB image
input_shape = (1, 3, 224, 224)
compressed_model, _ = minify(model)
save_model(compressed_model, "./compressed-pytorch-model.pt", input_shape)

# For non-quantized models, you can save without the input shape
compressed_model, _ = minify(model, quantization=False)
save_model(compressed_model, "./compressed-pytorch-model.pt")
#
# Using TensorFlow as a backend
#
from aiminify.minify import main as minify
from aiminify import save_model
from your_project import model

compressed_model, _ = minify(model)
save_model(compressed_model, "./compressed-tensorflow-model.keras")
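
To verify the saved artifact, you should be able to reload and inspect it with the regular Keras API. This is a minimal sketch, assuming save_model writes a standard .keras file; it is not part of aiminify itself:

import tensorflow as tf

# Assumes the .keras file produced by save_model is a standard Keras archive;
# pass custom_objects to load_model if your model uses custom layers
reloaded = tf.keras.models.load_model("./compressed-tensorflow-model.keras")
reloaded.summary()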

aiminify.minify.main arguments:

Name                  Type and default value   Description
model                 Any, required            The model to compress
compression_strength  int (0-5), default 3     Compression strength used to define the compression strategy
training_generator    Any, default None        Training set used for fine-tuning
validation_generator  Any, default None        Validation set used for fine-tuning
loss_function         Any, default None        Loss function used for fine-tuning
verbose               int (0-5), default 0     Verbosity level
quantization          bool, default True       Quantize the model
fine_tune             bool, default True       Fine-tune the model after compression
smart_pruning         bool, default True       Use a strategy to determine the number of filters to prune instead of flat-pruning x% of all filters
debug_mode            bool, default False      Debug logging mode
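
For example, a compression run that skips quantization, raises the verbosity, and fine-tunes on your own data could look like the sketch below. The keyword arguments come from the table above; the datasets and loss function are placeholders for your own objects:

from aiminify.minify import main as minify
from your_project import model, train_dataset, val_dataset, loss_fn  # placeholders

compressed_model, feedback = minify(
    model,
    compression_strength=4,          # more aggressive compression strategy
    training_generator=train_dataset,
    validation_generator=val_dataset,
    loss_function=loss_fn,
    quantization=False,              # skip quantization
    verbose=2,                       # more detailed logging
)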

The return value of main is (compressed_model, feedback_dictionary). compressed_model is, as the name suggests, your compressed model in the same backend as the input (TensorFlow or PyTorch). feedback_dictionary contains logs and miscellaneous messages from the compression algorithm.

training_generator, validation_generator and loss_function are specific to the backend you're using. For example, when using PyTorch, training_generator and validation_generator need to be subclasses of torch.utils.data.Dataset, while for TensorFlow they can be subclasses of tf.data.Dataset. Similarly for loss_function: for PyTorch this can be any member of torch.nn.modules.loss, and for TensorFlow any implementation of tf.keras.losses.Loss.
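
As a concrete PyTorch sketch, the fine-tuning inputs could be wired up as follows. The tensor shapes, dataset contents, and class name are illustrative, not part of the aiminify API:

import torch
from torch.utils.data import Dataset
from aiminify.minify import main as minify
from your_project import model

class MyDataset(Dataset):
    """Minimal torch.utils.data.Dataset wrapping in-memory tensors."""
    def __init__(self, inputs, targets):
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        return self.inputs[idx], self.targets[idx]

# Random stand-ins for real training/validation data
train_set = MyDataset(torch.randn(100, 3, 224, 224), torch.randint(0, 10, (100,)))
val_set = MyDataset(torch.randn(20, 3, 224, 224), torch.randint(0, 10, (20,)))
loss_fn = torch.nn.CrossEntropyLoss()  # any member of torch.nn.modules.loss

compressed_model, _ = minify(
    model,
    training_generator=train_set,
    validation_generator=val_set,
    loss_function=loss_fn,
)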
