What has been done?

This week’s work:

  • Set up the nuScenes training and validation datasets for the MMDetection3D framework: ✅
  • Set up the Waymo training and validation datasets for the MMDetection3D framework: ✅
  • Wrote an API for PTQ on top of the pytorch-quantization framework (only the model and dataloader definitions are still needed): ✅
  • Completed a walkthrough of CAT-Seg, a SOTA Open Vocabulary Segmentation (OVS) model: ✅

What to discuss?

  • Is this quantization approach appropriate?
  • Any advice on modifying the model structure of CAT-Seg?

What’s next (in order of priority)

  • Modify the pytorch-quantization source code to implement weight-only quantization (below is one potential way):

    class _QuantConvNd(torch.nn.modules.conv._ConvNd, _utils.QuantMixin):
        ...

        def _quant(self, input):
            """Apply quantization on input and weight

            Called by the classes lower in the hierarchy; performs the
            quantization before the derived class runs its forward function.

            Arguments:
                input: in_features to quantize
            Returns:
                A tuple: (quant_input, quant_weight)
            """
            quant_input = self._input_quantizer(input)
            quant_weight = self._weight_quantizer(self.weight)

            return (quant_input, quant_weight)
    

    Modified to:

    class _QuantConvNd(torch.nn.modules.conv._ConvNd, _utils.QuantMixin):
        ...

        def _quant(self, input):
            """Apply quantization on input and weight

            Called by the classes lower in the hierarchy; performs the
            quantization before the derived class runs its forward function.

            Arguments:
                input: in_features to quantize
            Returns:
                A tuple: (quant_input, quant_weight)
            """
            # Weight-only mode: skip the input quantizer entirely.
            if self.quant_act:
                quant_input = self._input_quantizer(input)
            else:
                quant_input = input
            quant_weight = self._weight_quantizer(self.weight)

            return (quant_input, quant_weight)
    

    together with a matching change to the derived class's initialization (taking QuantConv2d as an example):

    class QuantConv2d(_QuantConvNd):
        """Quantized 2D conv"""

        default_quant_desc_weight = tensor_quant.QUANT_DESC_8BIT_CONV2D_WEIGHT_PER_CHANNEL

        def __init__(self,
                     in_channels,
                     out_channels,
                     kernel_size,
                     stride=1,
                     padding=0,
                     dilation=1,
                     groups=1,
                     bias=True,
                     quant_act=True,
                     padding_mode='zeros',
                     **kwargs):

            # Store the flag that _quant checks; quant_act=False gives
            # weight-only quantization.
            self.quant_act = quant_act
            kernel_size = _pair(kernel_size)
            stride = _pair(stride)
            padding = _pair(padding)
            dilation = _pair(dilation)

            quant_desc_input, quant_desc_weight = _utils.pop_quant_desc_in_kwargs(self.__class__, **kwargs)
            super(QuantConv2d, self).__init__(in_channels, out_channels, kernel_size, stride, padding, dilation, False,
                                              _pair(0), groups, bias, padding_mode,
                                              quant_desc_input=quant_desc_input, quant_desc_weight=quant_desc_weight)
    
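    To sanity-check the intended behavior of the quant_act branch, here is a minimal pure-Python sketch with no framework dependencies. fake_quant and quant_pre_forward are hypothetical stand-ins for the library's TensorQuantizer and the modified _quant, not the pytorch-quantization API:

```python
def fake_quant(x, num_bits=8):
    """Symmetric fake-quantization: snap values to an int grid, return floats."""
    amax = max(abs(v) for v in x) or 1.0
    scale = (2 ** (num_bits - 1) - 1) / amax
    return [round(v * scale) / scale for v in x]

def quant_pre_forward(inputs, weights, quant_act=True):
    """Mirror of the modified _quant: weights are always fake-quantized,
    inputs only when quant_act is True (weight-only PTQ otherwise)."""
    q_in = fake_quant(inputs) if quant_act else inputs
    q_w = fake_quant(weights)
    return q_in, q_w

# Weight-only mode: inputs pass through untouched, weights are snapped.
x, w = [0.31, -1.7, 0.05], [0.9, -0.4, 0.12]
qi, qw = quant_pre_forward(x, w, quant_act=False)
assert qi == x   # activations untouched
assert qw != w   # weights rounded onto the int8 grid
```

    The branch placement matters: keeping the weight quantizer unconditional means a single flag flip switches between the two PTQ settings without touching the forward path.
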
  • Get the CenterPoint model and its dataloader definition, plug them into the PTQ interfaces, and evaluate the model under weight-only and weight+activation quantization.
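    The planned comparison can be illustrated with a toy sketch: a dot product evaluated under weight-only versus weight+activation fake-quantization. All names here are illustrative stand-ins; the real evaluation would run CenterPoint through the PTQ interfaces above:

```python
def fake_quant(x, num_bits=8):
    """Symmetric fake-quantization: snap values to an int grid, return floats."""
    amax = max(abs(v) for v in x) or 1.0
    scale = (2 ** (num_bits - 1) - 1) / amax
    return [round(v * scale) / scale for v in x]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

x = [1.0, 0.5]          # toy activations
w = [1.0, 0.5]          # toy weights
reference = dot(x, w)   # full-precision output

# Weight-only: quantize only the weights.
err_weight_only = abs(dot(x, fake_quant(w)) - reference)
# Weight + activation: quantize both operands.
err_weight_act = abs(dot(fake_quant(x), fake_quant(w)) - reference)

# On this toy input, quantizing activations adds further error.
assert err_weight_only < err_weight_act
```
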

  • Set up Waymo and nuScenes for OpenPCDet.

  • Try the model-structure modification advice from the professor.