convlayer.h

The convlayer.h file describes the HLS implementation of the convolutional layer.

Functions

template<unsigned int ConvKernelDim, unsigned int IFMChannels, unsigned int IFMDim, unsigned int OFMChannels, unsigned int OFMDim, unsigned int SIMD, unsigned int PE, typename TSrcI = Identity, typename TDstI = Identity, typename TWeightI = Identity, int InStreamW, int OutStreamW, typename TW, typename TA, typename R>
void ConvLayer_Batch(hls::stream<ap_uint<InStreamW>> &in, hls::stream<ap_uint<OutStreamW>> &out, TW const &weights, TA const &activation, unsigned const reps, R const &r)

Convolutional layer implementation.

The function implements a generic convolutional layer. It is composed of the sliding window generator, which implements the im2col algorithm, and the Matrix_Vector_Activate_Batch function, which performs the computation.
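The two stages described above can be illustrated with a plain C++ software model. This is only a sketch, not the HLS code: the names im2col and matvec, the [row][col][channel] input layout, stride 1 and the absence of padding are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Software model of the sliding window generator (im2col).
// Assumptions (not taken from convlayer.h): input stored as
// [row][col][channel], stride 1, no padding, square kernel of size k.
std::vector<std::vector<int>> im2col(const std::vector<int>& ifm,
                                     std::size_t ifmDim,
                                     std::size_t ifmCh,
                                     std::size_t k) {
    const std::size_t ofmDim = ifmDim - k + 1;
    std::vector<std::vector<int>> cols;
    cols.reserve(ofmDim * ofmDim);
    for (std::size_t oy = 0; oy < ofmDim; ++oy) {
        for (std::size_t ox = 0; ox < ofmDim; ++ox) {
            // One window of k*k*ifmCh values per output pixel.
            std::vector<int> window;
            window.reserve(k * k * ifmCh);
            for (std::size_t ky = 0; ky < k; ++ky)
                for (std::size_t kx = 0; kx < k; ++kx)
                    for (std::size_t c = 0; c < ifmCh; ++c)
                        window.push_back(
                            ifm[((oy + ky) * ifmDim + (ox + kx)) * ifmCh + c]);
            cols.push_back(window);
        }
    }
    return cols;
}

// Software model of the matrix-vector stage (no activation applied).
// weights[o] holds the k*k*ifmCh weights of output channel o.
std::vector<int> matvec(const std::vector<std::vector<int>>& weights,
                        const std::vector<int>& window) {
    std::vector<int> out(weights.size(), 0);
    for (std::size_t o = 0; o < weights.size(); ++o)
        for (std::size_t i = 0; i < window.size(); ++i)
            out[o] += weights[o][i] * window[i];
    return out;
}
```

Each window produced by im2col is multiplied by the same weight matrix, so the convolution reduces to one matrix-vector product per output pixel.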

Template Parameters
  • ConvKernelDim: Dimension of the convolutional kernel (assumed square)

  • IFMChannels: Number of Input Feature Maps

  • IFMDim: Width and Height of the Input Feature Map (assumed square)

  • OFMChannels: Number of Output Feature Maps

  • OFMDim: Width and Height of the Output Feature Map (assumed square)

  • SIMD: Number of input columns computed in parallel

  • PE: Number of output rows computed in parallel

  • TSrcI: DataType of the input activation (as used in the MAC)

  • TDstI: DataType of the output activation (as generated by the activation)

  • TWeightI: DataType of the weights (as used in the MAC)

  • InStreamW: Width of the input stream

  • OutStreamW: Width of the output stream

  • TW: DataType of the weights matrix - safely deducible from the parameters

  • TA: DataType of the activation class (e.g. thresholds) - safely deducible from the parameters

  • R: DataType for the resource used for FPGA implementation of the MAC - safely deducible from the parameters

Parameters
  • in: Input stream

  • out: Output stream

  • weights: Weights matrix (currently supports BinaryWeights or FixedPointWeights)

  • activation: Activation class

  • reps: Number of times the function has to be repeatedly executed (e.g. number of images)

  • r: Resource type for the hardware implementation of the MAC block
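As a rough guide to choosing SIMD and PE, the sketch below computes how many steps each output pixel takes. It assumes the usual im2col lowering, in which the weight matrix has ConvKernelDim * ConvKernelDim * IFMChannels columns and OFMChannels rows, and it assumes SIMD and PE divide those sizes evenly. The struct ConvFolding and the helper conv_folding are hypothetical names, not part of convlayer.h.

```cpp
#include <cassert>

// Hypothetical helper type: sizes derived from the template parameters.
struct ConvFolding {
    unsigned matrixW;      // columns: ConvKernelDim^2 * IFMChannels (assumed)
    unsigned matrixH;      // rows: OFMChannels (assumed)
    unsigned synapseFold;  // column steps per row group: matrixW / SIMD
    unsigned neuronFold;   // row groups per output pixel: matrixH / PE
};

// Assumes simd divides matrixW and pe divides matrixH without remainder.
constexpr ConvFolding conv_folding(unsigned k, unsigned ifmCh, unsigned ofmCh,
                                   unsigned simd, unsigned pe) {
    return ConvFolding{k * k * ifmCh, ofmCh, (k * k * ifmCh) / simd, ofmCh / pe};
}
```

Under these assumptions, a 3x3 kernel with 64 input and 128 output channels, SIMD = 32 and PE = 16 gives 18 column steps and 8 row groups per output pixel.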

template<unsigned int ConvKernelDim, unsigned int IFMChannels, unsigned int IFMDim, unsigned int OFMChannels, unsigned int OFMDim, unsigned int SIMD, unsigned int PE, unsigned int NUM_RED, unsigned int REDF, unsigned int MAX_CH_WIDTH, typename TSrcI = Identity, typename TDstI = Identity, typename TWeightI = Identity, int InStreamW, int OutStreamW, typename TW, typename TA, typename R>
void ConvLayer_Batch_TMR(hls::stream<ap_uint<InStreamW>> &in, hls::stream<ap_uint<OutStreamW>> &out, TW const &weights, TA const &activation, unsigned const reps, R const &r, ap_uint<2> &errortype, ap_uint<OFMChannels> channel_mask, ap_uint<MAX_CH_WIDTH> red_cha_index[NUM_RED])

Convolutional layer implementation with STMR.

The function implements a generic convolutional layer. It is composed of the sliding window generator, which implements the im2col algorithm, and the Matrix_Vector_Activate_Batch function, which performs the computation. Additionally, a TMR checker function performs error checks and outputs valid data.
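The kind of check a TMR checker performs can be sketched as a majority vote over three redundant results. The code below is a plain C++ illustration, not the library's checker: VoteResult and tmr_vote are hypothetical names, and the returned codes follow the errortype encoding documented for this function (0 no faults, 1 one faulty copy, 2 all differ).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the voted value plus an error code.
struct VoteResult {
    std::uint32_t value;
    std::uint8_t errortype;  // 0: all agree, 1: one copy differs, 2: all differ
};

// Majority vote over three redundant copies of the same output.
inline VoteResult tmr_vote(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    if (a == b && b == c) return {a, 0};  // no fault detected
    if (a == b || a == c) return {a, 1};  // a is in the majority
    if (b == c) return {b, 1};            // a is the faulty copy
    return {a, 2};                        // no majority; value is not trustworthy
}
```

With a single faulty copy the vote still yields the correct value. When all three copies differ there is no majority, so the caller has to treat the output as invalid.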

Template Parameters
  • ConvKernelDim: Dimension of the convolutional kernel (assumed square)

  • IFMChannels: Number of Input Feature Maps

  • IFMDim: Width and Height of the Input Feature Map (assumed square)

  • OFMChannels: Number of Output Feature Maps

  • OFMDim: Width and Height of the Output Feature Map (assumed square)

  • SIMD: Number of input columns computed in parallel

  • PE: Number of output rows computed in parallel

  • NUM_RED: Number of redundancies (or triplicated channels)

  • REDF: Redundancy factor (3 to triplicate)

  • MAX_CH_WIDTH: Value to determine the precision of channel indexes

  • TSrcI: DataType of the input activation (as used in the MAC)

  • TDstI: DataType of the output activation (as generated by the activation)

  • TWeightI: DataType of the weights (as used in the MAC)

  • InStreamW: Width of the input stream

  • OutStreamW: Width of the output stream

  • TW: DataType of the weights matrix - safely deducible from the parameters

  • TA: DataType of the activation class (e.g. thresholds) - safely deducible from the parameters

  • R: DataType for the resource used for FPGA implementation of the MAC - safely deducible from the parameters

Parameters
  • in: Input stream

  • out: Output stream

  • weights: Weights matrix (currently supports BinaryWeights or FixedPointWeights)

  • activation: Activation class

  • reps: Number of times the function has to be repeatedly executed (e.g. number of images)

  • r: Resource type for the hardware implementation of the MAC block

  • errortype: Flag reporting the result of the redundancy check: 0 if no faults, 1 if one PE is faulty, 2 if all differ

  • channel_mask: Bit mask over the output channels (1 if the channel is triplicated, 0 otherwise)

  • red_cha_index: Array of redundant triplet indexes. Each position stores the index of the first triplicated channel of a triplet
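To show how channel_mask and red_cha_index relate, the sketch below builds a mask from a list of first-channel indexes. It rests on assumptions the documentation does not confirm: that the REDF copies of a triplet occupy consecutive channels, that bit i of the mask corresponds to channel i, and that there are at most 64 channels, so std::uint64_t can stand in for ap_uint<OFMChannels>. The helper build_channel_mask is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: derive a channel mask from the first index of each
// redundant group. Assumes the redf copies sit in consecutive channels,
// bit i maps to channel i, and every index + redf stays below 64.
std::uint64_t build_channel_mask(const std::vector<unsigned>& firstIdx,
                                 unsigned redf) {
    std::uint64_t mask = 0;
    for (unsigned first : firstIdx)
        for (unsigned j = 0; j < redf; ++j)
            mask |= (std::uint64_t{1} << (first + j));
    return mask;
}
```

For example, with REDF = 3 and triplets starting at channels 0 and 5, channels 0 to 2 and 5 to 7 are marked as triplicated, which gives the mask 0xE7.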

template<unsigned int ConvKernelDim, unsigned int IFMChannels, unsigned int IFMDim, unsigned int OFMChannels, unsigned int OFMDim, unsigned int STRIDE, unsigned int SIMD, unsigned int PE, unsigned int MMV, typename TSrcI = Identity, typename TDstI = Identity, typename TWeightI = Identity, int InStreamW, int OutStreamW, typename TW, typename TA, typename R>
void ConvLayer_Batch_MMV(hls::stream<ap_uint<InStreamW>> &in, hls::stream<ap_uint<OutStreamW>> &out, TW const &weights, TA const &activation, unsigned const reps, R const &r)

Convolutional layer implementation.

The function implements a generic convolutional layer. It is composed of the sliding window generator, which implements the im2col algorithm, and the Matrix_Vector_Activate_Batch function, which performs the computation.

Template Parameters
  • ConvKernelDim: Dimension of the convolutional kernel (assumed square)

  • IFMChannels: Number of Input Feature Maps

  • IFMDim: Width and Height of the Input Feature Map (assumed square)

  • OFMChannels: Number of Output Feature Maps

  • OFMDim: Width and Height of the Output Feature Map (assumed square)

  • STRIDE: Stride of the convolutional kernel

  • SIMD: Number of input columns computed in parallel

  • PE: Number of output rows computed in parallel

  • MMV: Number of output pixels computed in parallel

  • TSrcI: DataType of the input activation (as used in the MAC)

  • TDstI: DataType of the output activation (as generated by the activation)

  • TWeightI: DataType of the weights (as used in the MAC)

  • InStreamW: Width of the input stream

  • OutStreamW: Width of the output stream

  • TW: DataType of the weights matrix - safely deducible from the parameters

  • TA: DataType of the activation class (e.g. thresholds) - safely deducible from the parameters

  • R: DataType for the resource used for FPGA implementation of the MAC - safely deducible from the parameters

Parameters
  • in: Input stream

  • out: Output stream

  • weights: Weights matrix (currently supports BinaryWeights or FixedPointWeights)

  • activation: Activation class

  • reps: Number of times the function has to be repeatedly executed (e.g. number of images)

  • r: Resource type for the hardware implementation of the MAC block
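The interplay between IFMDim, ConvKernelDim, STRIDE, OFMDim and MMV can be checked with a small helper before instantiating the template. This is a sketch under stated assumptions: no padding is applied, and MMV divides the number of output pixels evenly. The functions ofm_dim and mmv_groups are hypothetical names, not part of convlayer.h.

```cpp
#include <cassert>

// Output feature map dimension for an unpadded, square convolution
// (assumption: the layer does not pad its input).
constexpr unsigned ofm_dim(unsigned ifmDim, unsigned k, unsigned stride) {
    return (ifmDim - k) / stride + 1;
}

// Number of pixel groups per image when mmv output pixels are produced
// in parallel (assumption: mmv divides ofmDim * ofmDim evenly).
constexpr unsigned mmv_groups(unsigned ofmDim, unsigned mmv) {
    return (ofmDim * ofmDim) / mmv;
}

// Compile-time sanity check of the kind that could guard an instantiation.
static_assert(ofm_dim(32, 3, 1) == 30, "unexpected OFMDim");
```

Under these assumptions, a 32x32 input with a 3x3 kernel gives OFMDim = 30 at stride 1 and OFMDim = 15 at stride 2.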