The batched matrix-matrix multiplication kernels are templated on:
- the characteristic dimensions of the multiplication: m, n, k
- launch parameters (including minblocks), depending on the algorithm.
The batched transpose kernels are templated on:
The input features for the predictive models are either 'raw' parameters (left-most column in the figure below) or hand-engineered features 'derived' from the raw parameters (matrix sizes, launch parameters and resource usage estimations).
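
As a minimal sketch of the raw/derived distinction, the snippet below computes a few derived features from a set of raw parameters. The class `RawParams`, the function `derived_features`, the `threads` field and the specific features chosen are all hypothetical and only illustrate the idea; they are not the project's actual feature set. It also assumes double-precision elements (8 bytes) and that `minblocks` denotes a minimum number of resident blocks per multiprocessor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawParams:
    """Raw kernel parameters (hypothetical container, for illustration only)."""
    m: int
    n: int
    k: int
    threads: int    # assumed launch parameter: threads per block
    minblocks: int  # assumed meaning: minimum resident blocks per multiprocessor


def derived_features(p: RawParams, bytes_per_element: int = 8) -> dict:
    """Compute a few illustrative derived features from the raw parameters."""
    size_a = p.m * p.k  # elements in A (m x k)
    size_b = p.k * p.n  # elements in B (k x n)
    size_c = p.m * p.n  # elements in C (m x n)
    return {
        # Matrix-size features
        "size_a": size_a,
        "size_b": size_b,
        "size_c": size_c,
        "mxnxk": p.m * p.n * p.k,
        # Rough resource-usage estimate: bytes to hold one (A, B, C) triple
        "bytes_per_multiplication": (size_a + size_b + size_c) * bytes_per_element,
        # Launch-parameter features
        "c_elements_per_thread": size_c / p.threads,
        "min_resident_threads": p.threads * p.minblocks,
    }


features = derived_features(RawParams(m=5, n=7, k=13, threads=64, minblocks=4))
```

A dictionary like `features` could then be passed to a predictive model as one row of its input, alongside or instead of the raw parameters.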