dbcsr_tas_mm Module

Matrix multiplication for tall-and-skinny matrices. This uses the k-split (non-recursive) CARMA algorithm, which is communication-optimal as long as the two smaller dimensions have the same size. Submatrices are obtained by splitting a dimension of the process grid. Multiplication of submatrices uses the DBCSR Cannon algorithm. Because the sparsity pattern of the result matrix is unknown, the parameters (group sizes and process grid dimensions) cannot be derived from the matrix dimensions and need to be set manually.
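The k-split idea can be illustrated with a small serial sketch in Python. It assumes dense matrices stored as nested lists, and the names `matmul` and `k_split_matmul` are made up for this illustration; they are not part of DBCSR. Each slice of the shared dimension stands in for one process group, and the partial products are summed at the end. The real module works on distributed block-sparse matrices, so this shows only the arithmetic, not the communication.

```python
def matmul(a, b):
    """Dense product of two row-major nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


def k_split_matmul(a, b, nsplit):
    """Compute A x B by cutting the shared dimension k into nsplit slices.

    Each slice stands in for one process group; in a distributed setting the
    per-slice product would itself be a parallel multiplication.
    """
    k = len(b)
    # Slice boundaries that spread k as evenly as possible over the groups.
    bounds = [k * g // nsplit for g in range(nsplit + 1)]
    c = [[0] * len(b[0]) for _ in a]
    for lo, hi in zip(bounds, bounds[1:]):
        if lo == hi:
            continue  # more groups than slices of k: this group has no work
        a_slice = [row[lo:hi] for row in a]  # m x (hi - lo)
        b_slice = b[lo:hi]                   # (hi - lo) x n
        partial = matmul(a_slice, b_slice)
        # Sum the partial results over all groups.
        for i, row in enumerate(partial):
            for j, value in enumerate(row):
                c[i][j] += value
    return c
```

For any `nsplit >= 1` the result equals the unsplit product; only the grouping of the work changes.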



Variables

Type, Visibility, Attributes :: Name = Initial
character(len=*), private, parameter :: moduleN = 'dbcsr_tas_mm'

Functions

private function dist_compatible(mat_a, mat_b, split_rc_a, split_rc_b, unit_nr)

Check whether matrices have same distribution and same split.

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(in) :: mat_a
type(dbcsr_tas_type), intent(in) :: mat_b
integer, intent(in) :: split_rc_a
integer, intent(in) :: split_rc_b
integer, intent(in), optional :: unit_nr

Return Value logical

private function split_factor_estimate(max_mm_dim, nze_a, nze_b, nze_c, numnodes) result(nsplit)

Estimate the optimal split factor for AxB=C from occupancies (number of non-zero elements). This estimate is based on minimizing the communication volume, taking into account both the CARMA n-split step and the Cannon multiplication of submatrices. Returns the estimated split factor.

Arguments

Type, Intent, Optional, Attributes :: Name
integer, intent(in) :: max_mm_dim
integer(kind=int_8), intent(in) :: nze_a

number of non-zeroes in A

integer(kind=int_8), intent(in) :: nze_b

number of non-zeroes in B

integer(kind=int_8), intent(in) :: nze_c

number of non-zeroes in C

integer, intent(in) :: numnodes

number of MPI ranks

Return Value integer
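The formula used by split_factor_estimate is not given on this page. As a rough illustration of what "minimizing communication volume" can look like, here is a Python sketch built on a toy cost model that is entirely an assumption: Cannon inside each group costs about sqrt(numnodes / s) times the non-zeroes of the two split matrices, and the unsplit matrix costs s times its non-zeroes. The function name `toy_split_factor` and both cost terms are hypothetical and should not be read as the library's estimate.

```python
import math


def toy_split_factor(nze_large_sum, nze_small, numnodes):
    """Pick the split factor s in 1..numnodes that minimizes a toy cost.

    Assumed cost model (hypothetical, not taken from the library):
      * Cannon inside each group of numnodes / s ranks moves the two split
        matrices about sqrt(numnodes / s) times in total;
      * the unsplit matrix is communicated once per group, i.e. s times.
    """
    best_s, best_cost = 1, float("inf")
    for s in range(1, numnodes + 1):
        cost = math.sqrt(numnodes / s) * nze_large_sum + s * nze_small
        if cost < best_cost:  # strict comparison keeps the smallest s on ties
            best_s, best_cost = s, cost
    return best_s
```

Under this model a large split factor pays off when the split matrices dominate the occupancy, and a split factor of 1 is preferred when the unsplit matrix dominates.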


Subroutines

public recursive subroutine dbcsr_tas_multiply(transa, transb, transc, alpha, matrix_a, matrix_b, beta, matrix_c, optimize_dist, split_opt, filter_eps, flop, move_data_a, move_data_b, retain_sparsity, simple_split, result_index, unit_nr, log_verbose)

Tall-and-skinny matrix-matrix multiplication. Undocumented dummy arguments are identical to the arguments of dbcsr_multiply (see dbcsr_mm, dbcsr_multiply_generic).

Arguments

Type, Intent, Optional, Attributes :: Name
character(len=1), intent(in) :: transa
character(len=1), intent(in) :: transb
character(len=1), intent(in) :: transc
type(dbcsr_scalar_type), intent(in) :: alpha
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_a
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_b
type(dbcsr_scalar_type), intent(in) :: beta
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_c
logical, intent(in), optional :: optimize_dist

Whether distribution should be optimized internally. In the current implementation this guarantees optimal parameters only for dense matrices.

type(dbcsr_tas_split_info), intent(out), optional :: split_opt

optionally return split info containing optimal grid and split parameters. This can be used to choose optimal process grids for subsequent matrix multiplications with matrices of similar shape and sparsity.

real(kind=real_8), intent(in), optional :: filter_eps
integer(kind=int_8), intent(out), optional :: flop
logical, intent(in), optional :: move_data_a

memory optimization: move data to matrix_c such that matrix_a is empty on return

logical, intent(in), optional :: move_data_b

memory optimization: move data to matrix_c such that matrix_b is empty on return

logical, intent(in), optional :: retain_sparsity
logical, intent(in), optional :: simple_split

for internal use only

integer(kind=int_8), intent(out), optional, DIMENSION(:, :), ALLOCATABLE :: result_index
integer, intent(in), optional :: unit_nr

unit number for logging output

logical, intent(in), optional :: log_verbose

only for testing: verbose output
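Since the scalar and transpose arguments follow dbcsr_multiply, the arithmetic can be sketched in Python under the usual convention C := alpha * op(A) * op(B) + beta * C. This is a dense, serial illustration only. The helper names `op` and `multiply` are invented here, only the flags 'N' and 'T' are handled, and transc, filter_eps and sparsity are left out.

```python
def transpose(m):
    """Transpose a row-major nested list."""
    return [list(col) for col in zip(*m)]


def op(m, trans):
    """Apply a transpose flag: 'N' leaves m unchanged, 'T' transposes it."""
    if trans == "N":
        return m
    if trans == "T":
        return transpose(m)
    raise ValueError(f"unsupported transpose flag: {trans!r}")


def multiply(transa, transb, alpha, a, b, beta, c):
    """Return alpha * op(A) * op(B) + beta * C for dense nested lists."""
    a_op, b_op = op(a, transa), op(b, transb)
    return [[alpha * sum(a_op[i][k] * b_op[k][j] for k in range(len(b_op)))
             + beta * c[i][j]
             for j in range(len(b_op[0]))]
            for i in range(len(a_op))]
```

Setting `beta` to zero overwrites the old content of C, while `beta = 1` accumulates the product into it.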

private subroutine redistribute_and_sum(matrix_in, matrix_out, local_copy, alpha)

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_type), intent(in) :: matrix_in
type(dbcsr_type), intent(inout) :: matrix_out
logical, intent(in), optional :: local_copy
type(dbcsr_scalar_type), intent(in), optional :: alpha

private subroutine reshape_mm_small(mp_comm, matrix_in, matrix_out, transposed, trans, nodata, move_data)

Make sure that the smallest matrix involved in a multiplication is not split, and bring it to the same process grid as the other two matrices.

Arguments

Type, Intent, Optional, Attributes :: Name
integer, intent(in) :: mp_comm

communicator that defines Cartesian topology

type(dbcsr_tas_type), intent(inout) :: matrix_in
type(dbcsr_tas_type), intent(out) :: matrix_out
logical, intent(in) :: transposed

Whether matrix_out should be transposed

character(len=1), intent(inout) :: trans

update transpose flag for DBCSR mm according to 'transposed' argument

logical, intent(in), optional :: nodata

Data of matrix_in should not be copied to matrix_out

logical, intent(in), optional :: move_data

memory optimization: move data such that matrix_in is empty on return.

private subroutine reshape_mm_compatible(matrix1_in, matrix2_in, matrix1_out, matrix2_out, new1, new2, trans1, trans2, optimize_dist, nsplit, opt_nsplit, split_rc_1, split_rc_2, nodata1, nodata2, move_data_1, move_data_2, comm_new, unit_nr)

Reshape either matrix1 or matrix2 to make sure that their process grids are compatible with the same split factor.

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout), TARGET :: matrix1_in
type(dbcsr_tas_type), intent(inout), TARGET :: matrix2_in
type(dbcsr_tas_type), intent(out), POINTER :: matrix1_out
type(dbcsr_tas_type), intent(out), POINTER :: matrix2_out
logical, intent(out) :: new1

Whether matrix1_out is a new matrix or simply pointing to matrix1_in

logical, intent(out) :: new2

Whether matrix2_out is a new matrix or simply pointing to matrix2_in

character(len=1), intent(inout) :: trans1

transpose flag of matrix1_in for multiplication

character(len=1), intent(inout) :: trans2

transpose flag of matrix2_in for multiplication

logical, intent(in), optional :: optimize_dist

experimental: optimize matrix splitting and distribution

integer, intent(in), optional :: nsplit

Optimal split factor (set to 0 if split factor should not be changed)

logical, intent(in), optional :: opt_nsplit
integer, intent(inout) :: split_rc_1

Whether to split rows or columns for matrix 1

integer, intent(inout) :: split_rc_2

Whether to split rows or columns for matrix 2

logical, intent(in), optional :: nodata1

Don't copy matrix data from matrix1_in to matrix1_out

logical, intent(in), optional :: nodata2

Don't copy matrix data from matrix2_in to matrix2_out

logical, intent(inout), optional :: move_data_1

memory optimization: move data such that matrix1_in may be empty on return.

logical, intent(inout), optional :: move_data_2

memory optimization: move data such that matrix2_in may be empty on return.

integer, intent(out), optional :: comm_new

new communicator; returned only if optimize_dist is set

integer, intent(in), optional :: unit_nr

output unit

private subroutine change_split(matrix_in, matrix_out, nsplit, split_rowcol, is_new, opt_nsplit, move_data, nodata)

Change split factor without redistribution

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_in
type(dbcsr_tas_type), intent(out), POINTER :: matrix_out
integer, intent(in) :: nsplit

new split factor, set to 0 to not change split of matrix_in

integer, intent(in) :: split_rowcol

split rows or columns

logical, intent(out) :: is_new

whether matrix_out is new or a pointer to matrix_in

logical, intent(in), optional :: opt_nsplit

whether nsplit should be optimized for current process grid

logical, intent(inout), optional :: move_data

memory optimization: move data such that matrix_in is empty on return.

logical, intent(in), optional :: nodata

Data of matrix_in should not be copied to matrix_out
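How a split factor maps onto a process grid dimension can be pictured with the Python sketch below. Everything in it is an assumption made for illustration: the flag values `ROWSPLIT` and `COLSPLIT`, the function names, and in particular the rule used when `opt_nsplit` is set (take the divisor of the grid dimension closest to the requested factor). The special case nsplit = 0 (keep the existing split) is not modeled.

```python
ROWSPLIT, COLSPLIT = 1, 2  # hypothetical flag values for split_rowcol


def adjust_nsplit(grid_dim, nsplit):
    """Return the divisor of grid_dim closest to nsplit (smaller one on ties)."""
    divisors = [d for d in range(1, grid_dim + 1) if grid_dim % d == 0]
    return min(divisors, key=lambda d: (abs(d - nsplit), d))


def split_grid(nprows, npcols, nsplit, split_rowcol, opt_nsplit=True):
    """Assign each grid coordinate along the split dimension to a group.

    Returns (groups, nsplit_used), where groups[i] is the group index of
    grid row (or column) i.
    """
    grid_dim = nprows if split_rowcol == ROWSPLIT else npcols
    if opt_nsplit:
        nsplit = adjust_nsplit(grid_dim, nsplit)
    # Never use fewer than one group or more groups than grid coordinates.
    nsplit = max(1, min(nsplit, grid_dim))
    groups = [i * nsplit // grid_dim for i in range(grid_dim)]
    return groups, nsplit
```

With the adjustment enabled, all groups end up with the same number of grid rows or columns; without it, group sizes may differ by one.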

private subroutine reshape_mm_template(template, matrix_in, matrix_out, trans, split_rc, nodata, move_data)

Reshape matrix_in such that it has the same process grid, distribution and split as template.

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(in) :: template
type(dbcsr_tas_type), intent(inout) :: matrix_in
type(dbcsr_tas_type), intent(out) :: matrix_out
character(len=1), intent(inout) :: trans
integer, intent(in) :: split_rc
logical, intent(in), optional :: nodata
logical, intent(in), optional :: move_data

public subroutine dbcsr_tas_result_index(transa, transb, transc, matrix_a, matrix_b, matrix_c, filter_eps, unit_nr, blk_ind, nze, retain_sparsity)

Estimate the sparsity pattern of C resulting from A x B = C by multiplying the block norms of A and B. Same dummy arguments as dbcsr_tas_multiply.

Arguments

Type, Intent, Optional, Attributes :: Name
character(len=1), intent(in) :: transa
character(len=1), intent(in) :: transb
character(len=1), intent(in) :: transc
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_a
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_b
type(dbcsr_tas_type), intent(inout), TARGET :: matrix_c
real(kind=real_8), intent(in), optional :: filter_eps
integer, intent(in), optional :: unit_nr
integer(kind=int_8), intent(out), optional, DIMENSION(:, :), ALLOCATABLE :: blk_ind
integer(kind=int_8), intent(out), optional :: nze
logical, intent(in), optional :: retain_sparsity
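The estimate rests on the fact that the norm of a product of two blocks is bounded by the product of their norms. A minimal Python sketch of the idea follows, assuming the block norms are already available as dictionaries keyed by block index. The function name `estimate_result_blocks` and the rule of keeping a block when the summed bound exceeds `filter_eps` are assumptions for illustration; the routine's actual filtering criterion is not stated on this page.

```python
def estimate_result_blocks(norms_a, norms_b, filter_eps=0.0):
    """Estimate which blocks of C = A x B are non-zero.

    norms_a maps (block_row, block_k) -> norm of that block of A, and
    norms_b maps (block_k, block_col) -> norm of that block of B.
    Block (i, j) of C is kept if the bound
        sum_k norm_a(i, k) * norm_b(k, j)
    exceeds filter_eps (assumed criterion).
    """
    # Index B's blocks by their row so matching k values are found quickly.
    b_by_row = {}
    for (k, j), norm in norms_b.items():
        b_by_row.setdefault(k, []).append((j, norm))

    bound = {}
    for (i, k), norm_a in norms_a.items():
        for j, norm_b in b_by_row.get(k, []):
            bound[(i, j)] = bound.get((i, j), 0.0) + norm_a * norm_b

    return sorted(idx for idx, value in bound.items() if value > filter_eps)
```

Only the small norm matrices are multiplied, so the block pattern of C and its number of non-zero elements can be estimated without touching the block data.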

private subroutine create_block_norms_matrix(matrix_in, matrix_out, nodata)

Create a matrix with block size one that contains the block norms of matrix_in.

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout) :: matrix_in
type(dbcsr_tas_type), intent(out) :: matrix_out
logical, intent(in), optional :: nodata
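As a small Python illustration of what such a norms matrix holds, the sketch below collapses every stored block to one number. It assumes blocks are given as a dictionary of dense nested lists and uses the Frobenius norm; the page does not say which norm the routine uses, and the name `block_norms` is invented here.

```python
import math


def block_norms(blocks):
    """Map each stored block to a single number, its Frobenius norm.

    blocks maps (block_row, block_col) -> dense block as nested lists.
    The result has the same sparsity pattern but "blocks" of size one.
    """
    return {
        index: math.sqrt(sum(x * x for row in block for x in row))
        for index, block in blocks.items()
    }
```

The output keeps one entry per stored block, so it is as sparse as the input but far smaller.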

private subroutine convert_to_new_pgrid(mp_comm_cart, matrix_in, matrix_out, move_data, nodata, optimize_pgrid)

Convert a DBCSR matrix to a new process grid

Arguments

Type, Intent, Optional, Attributes :: Name
integer, intent(in) :: mp_comm_cart

new process grid

type(dbcsr_type), intent(inout) :: matrix_in
type(dbcsr_type), intent(out) :: matrix_out
logical, intent(in), optional :: move_data

memory optimization: move data such that matrix_in is empty on return.

logical, intent(in), optional :: nodata

Data of matrix_in should not be copied to matrix_out

logical, intent(in), optional :: optimize_pgrid

Whether to change process grid

public subroutine dbcsr_tas_batched_mm_init(matrix)

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout) :: matrix

public subroutine dbcsr_tas_batched_mm_finalize(matrix)

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout) :: matrix

public subroutine dbcsr_tas_set_batched_state(matrix, state, opt_grid)

Set state flags during batched multiplication.

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout) :: matrix
integer, intent(in), optional :: state
logical, intent(in), optional :: opt_grid

whether process grid was already optimized and should not be changed

public subroutine dbcsr_tas_batched_mm_complete(matrix, warn)

Arguments

Type, Intent, Optional, Attributes :: Name
type(dbcsr_tas_type), intent(inout) :: matrix
logical, intent(in), optional :: warn