1. Table of Contents

  • Accuracy graph under different quantization metrics: ✅
  • Max value within the layers: ✅

2. Accuracy graph under different quantization metrics:

weight

activation

As we can observe from both graphs, accuracy is clearly influenced more by activation quantization than by weight quantization.
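For reference, below is a minimal sketch of how this weight-vs-activation comparison could be reproduced with symmetric per-tensor fake quantization in plain PyTorch. The helper names (`fake_quant`, `quantize_weights`, `quantize_activations`) are illustrative, not the actual experiment code, which may rely on a calibration-based toolkit instead.

```python
import torch
import torch.nn as nn

def fake_quant(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric per-tensor fake quantization (quantize then dequantize)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.fake_quantize_per_tensor_affine(x, scale.item(), 0, -qmax - 1, qmax)

@torch.no_grad()
def quantize_weights(model: nn.Module, num_bits: int = 8) -> None:
    """Replace Conv/Linear weights in place with their fake-quantized version."""
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Conv3d, nn.Linear)):
            m.weight.copy_(fake_quant(m.weight, num_bits))

def quantize_activations(model: nn.Module, num_bits: int = 8) -> list:
    """Fake-quantize the inputs of Conv/Linear layers via forward pre-hooks."""
    handles = []
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Conv3d, nn.Linear)):
            handles.append(m.register_forward_pre_hook(
                lambda mod, inp: (fake_quant(inp[0], num_bits),) + inp[1:]))
    return handles  # call h.remove() on each handle to restore FP behavior
```

Evaluating the model once with only `quantize_weights` applied and once with only `quantize_activations` applied gives the two accuracy curves compared above.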

3. Max value within the layers

weight_amax

activation_amax

In the first graph, we can see that the max absolute value within the weights ranges from 0.1 to 2.94. In the second graph, the activation max values show an interesting pattern, ranging from 8 to 53.74. This much larger dynamic range also explains why activations are influenced more by quantization.
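A minimal sketch of how these per-layer amax statistics could be collected, assuming standard Conv/Linear modules and a small calibration loader (the helper name `collect_amax` and the layer selection are illustrative):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def collect_amax(model: nn.Module, loader, device: str = "cuda"):
    """Record per-layer max absolute values for weights and input activations."""
    quantizable = (nn.Conv2d, nn.Conv3d, nn.Linear)

    # Weight amax is static and can be read off directly.
    weight_amax = {
        name: m.weight.abs().max().item()
        for name, m in model.named_modules()
        if isinstance(m, quantizable)
    }

    # Activation amax is collected with forward hooks over calibration data.
    activation_amax = {}
    handles = []

    def make_hook(name):
        def hook(mod, inp, out):
            amax = inp[0].abs().max().item()
            activation_amax[name] = max(activation_amax.get(name, 0.0), amax)
        return hook

    for name, m in model.named_modules():
        if isinstance(m, quantizable):
            handles.append(m.register_forward_hook(make_hook(name)))

    model.eval().to(device)
    for batch in loader:            # a few calibration batches are enough
        model(batch.to(device))
    for h in handles:
        h.remove()
    return weight_amax, activation_amax
```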

4. What’s next?

  • Add a customized quantizer for the sparse conv3d layer.
  • Add a customized quantizer for the operations in SmoothQuant (see the sketch below).
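As background for the second item, here is a minimal sketch of the per-channel smoothing factors used in SmoothQuant (alpha = 0.5 is the paper's default; the helper names and the `[out_features, in_features]` weight layout are assumptions, and wiring this into the sparse conv3d layer is exactly the remaining work):

```python
import torch

def smooth_scales(act_amax: torch.Tensor, weight_amax: torch.Tensor,
                  alpha: float = 0.5) -> torch.Tensor:
    """Per-input-channel smoothing factors s_j = max|X_j|^alpha / max|W_j|^(1-alpha)."""
    return (act_amax.clamp(min=1e-5).pow(alpha) /
            weight_amax.clamp(min=1e-5).pow(1 - alpha))

def apply_smoothing(weight: torch.Tensor, scales: torch.Tensor) -> torch.Tensor:
    """Fold the scales into the weight (W' = W * diag(s)).

    The producer of the activations (e.g. the preceding LayerNorm) must be
    divided by the same scales so the overall computation is unchanged.
    """
    return weight * scales.unsqueeze(0)   # weight shape: [out_features, in_features]
```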