The batched matrix-matrix multiplication kernels are templated on the matrix sizes m, n, k and on the launch parameters (M, N, w, v, threads, grouping, minblocks), depending on the algorithm.

The batched transpose kernels are templated on m, n.

The input features for the predictive models can be 'raw' parameters (leftmost column in the figure below), or hand-engineered features 'derived' from the raw features (matrix sizes, launch parameters and resource usage estimates).
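
As a rough illustration of the difference between 'raw' and 'derived' features, the following Python sketch computes a few derived quantities from a set of raw kernel parameters. The function name `derive_features`, the particular derived features, and the warp size of 32 are assumptions made for this example; they are not taken from the project's actual feature set.

```python
import math


def derive_features(raw, warp_size=32):
    """Compute illustrative derived features from raw kernel parameters.

    Assumes C = A * B with A of shape (m, k), B of shape (k, n) and
    C of shape (m, n). The feature names are hypothetical.
    """
    m, n, k = raw["m"], raw["n"], raw["k"]
    threads = raw["threads"]
    return {
        # Matrix sizes, in number of elements
        "size_a": m * k,
        "size_b": k * n,
        "size_c": m * n,
        "mnk": m * n * k,
        # One multiply and one add per (m, n, k) triple
        "flops_per_product": 2 * m * n * k,
        # Launch-related estimate: warps needed to hold `threads` threads
        "warps_per_block": math.ceil(threads / warp_size),
    }


# Raw parameters for one hypothetical kernel instance
raw = {"m": 23, "n": 23, "k": 23, "threads": 96, "grouping": 16, "minblocks": 8}

# A model could be trained on the raw features, the derived ones, or both
features = {**raw, **derive_features(raw)}
```

Derived features of this kind encode relationships (such as products of dimensions) that a model would otherwise have to learn from the raw parameters alone.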