DBCSR performance tests:
To run all the tests, use:
make test
Or run individual tests from the build directory, as follows:
srun -N 1 --ntasks-per-core 2 --ntasks-per-node 12 --cpus-per-task 2 ./tests/dbcsr_unittest_1
Note that the tests of libsmm_acc (the GPU backend) do not use MPI, since libsmm_acc only operates on-node.
Note that if you are using an OpenMP build, you have to set the environment variable OMP_NESTED=false.
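As a minimal sketch of the OpenMP note above, the variable can be exported in the shell before launching a test. The binary path is the one from the srun example; the existence check is an addition for illustration, so that the snippet does nothing harmful when run outside the build directory.

```shell
# Disable nested OpenMP parallelism for everything started from this shell.
export OMP_NESTED=false

# Run one unit test from the build directory, if it has been built.
# (Guard added for illustration; on a cluster, prefix the command with srun as shown above.)
if [ -x ./tests/dbcsr_unittest_1 ]; then
    ./tests/dbcsr_unittest_1
else
    echo "tests/dbcsr_unittest_1 not found; run this from the build directory"
fi
```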
The test suite comes with a performance driver (dbcsr_performance_driver), which evaluates the performance of matrix-matrix multiplication in DBCSR.
Input matrices can be specified in an input file, which is passed to the executable as an argument, for example:
a) To run a pure MPI performance test using [n] MPI processes:
mpirun -np [n] ./build/tests/dbcsr_perf tests/input.perf 2>&1 | tee perf.log
b) To run a hybrid MPI/OpenMP performance test using [n] MPI processes, each spawning [t] threads:
export OMP_NUM_THREADS=[t]; mpirun -np [n] ./build/tests/dbcsr_perf tests/input.perf 2>&1 | tee perf.log
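The hybrid command above can be repeated for several thread counts to see how performance changes with threading. The following is only a sketch: RANKS, THREAD_COUNTS and the log file names are illustrative assumptions, not part of DBCSR. If the driver has not been built, the loop only prints the commands it would run.

```shell
# Sketch: sweep over OpenMP thread counts for a fixed number of MPI processes.
# RANKS, THREAD_COUNTS and the log names are assumptions for illustration.
RANKS="${RANKS:-4}"
THREAD_COUNTS="${THREAD_COUNTS:-1 2 4}"
DRIVER=./build/tests/dbcsr_perf
INPUT=tests/input.perf

for t in $THREAD_COUNTS; do
    log="perf_${RANKS}ranks_${t}threads.log"
    if [ -x "$DRIVER" ]; then
        # Same command as in b), with one log file per thread count.
        OMP_NUM_THREADS="$t" mpirun -np "$RANKS" "$DRIVER" "$INPUT" 2>&1 | tee "$log"
    else
        # Driver not built here: show the command instead of running it.
        echo "OMP_NUM_THREADS=$t mpirun -np $RANKS $DRIVER $INPUT 2>&1 | tee $log"
    fi
done
```

Writing one log per configuration keeps the results of different runs from overwriting each other, unlike the single perf.log used in the examples above.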
Examples of input files for different matrix sizes and block sizes can be found in tests/inputs.
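To run the driver over every example input in turn, a loop along the following lines could be used. It is a sketch under assumptions: RANKS and the log naming are hypothetical, and the loop simply takes every regular file in tests/inputs without assuming a particular file extension.

```shell
# Sketch: run the performance driver once per example input file.
# RANKS and the log file names are assumptions for illustration.
RANKS="${RANKS:-4}"
DRIVER=./build/tests/dbcsr_perf
count=0

for input in tests/inputs/*; do
    # Skip directories, and the unexpanded pattern when nothing matches.
    [ -f "$input" ] || continue
    log="perf_$(basename "$input").log"
    if [ -x "$DRIVER" ]; then
        mpirun -np "$RANKS" "$DRIVER" "$input" 2>&1 | tee "$log"
    else
        # Driver not built here: show the command instead of running it.
        echo "mpirun -np $RANKS $DRIVER $input 2>&1 | tee $log"
    fi
    count=$((count + 1))
done
echo "processed $count input file(s)"
```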
You can also write custom input files, using tests/input.perf as a template.