dbcsr_iterator_start Subroutine

public subroutine dbcsr_iterator_start(iterator, matrix, shared, dynamic, dynamic_byrows, contiguous_pointers, read_only)

Sets up an iterator

Contiguous pointers

Contiguous pointers may incur reallocation penalties but enable quick passing of arrays to routines with unspecified interfaces (i.e., direct calls to BLACS or MPI).
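
As an illustrative sketch (iter and matrix are assumed to be a previously declared TYPE(dbcsr_iterator) and TYPE(dbcsr_type)):

      ! Block data will be passed to routines with unspecified
      ! interfaces, so request contiguous block pointers; this may
      ! cost a copy for blocks that are not already contiguous.
      CALL dbcsr_iterator_start(iter, matrix, contiguous_pointers=.TRUE.)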

Threading

The TYPE(dbcsr_iterator) variable should be thread-private.

The iterator has several modes of operation when used with OpenMP. Two options can be set to influence the behavior.

Threading: shared vs. non-shared

The "shared" flag specifies that several threads will be iterating through the same matrix.

- Sharing is the default when called from an active parallel region. In shared mode no two threads receive the same block; i.e., the work is split among the threads.
- If each (or one) thread needs to iterate through all blocks, then shared should be set to .FALSE. (e.g., when called from an enclosing MASTER region or when each thread has its own matrix).
- It is safe to use an iterator in non-shared mode with only one thread; no thread synchronization constructs are used in this case.
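
A minimal shared-mode sketch, assuming the iterator's companion routines (dbcsr_iterator_blocks_left, dbcsr_iterator_next_block, dbcsr_iterator_stop) are available via the enclosing module's USE statements; traverse_blocks is a hypothetical wrapper:

   SUBROUTINE traverse_blocks(matrix)
      TYPE(dbcsr_type), INTENT(IN) :: matrix
      TYPE(dbcsr_iterator)         :: iter  ! thread-private
      INTEGER                      :: row, col, blk

!$OMP PARALLEL DEFAULT(NONE) SHARED(matrix) PRIVATE(iter, row, col, blk)
      ! Called from an active parallel region, so shared mode is the
      ! default and the blocks are split among the threads.
      CALL dbcsr_iterator_start(iter, matrix)
      DO WHILE (dbcsr_iterator_blocks_left(iter))
         CALL dbcsr_iterator_next_block(iter, row, col, blk)
         ! ... process block (row, col) ...
      END DO
      CALL dbcsr_iterator_stop(iter)
!$OMP END PARALLEL
   END SUBROUTINE traverse_blocks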

Threading in shared mode

When in shared mode there are three possibilities for how the blocks are distributed to the threads.

Thread distribution
The default is to use the thread distribution. The thread distribution statically maps rows to threads and should be used whenever retaining a consistent mapping across subsequent iterations is important.
Dynamic scheduling
If the dynamic flag is .TRUE., then blocks are given to threads dynamically. By default the assignment is grouped by rows (to minimize synchronization); however, if the dynamic_byrows flag is .FALSE., then every block is assigned dynamically. The three resulting calls are sketched below.
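
With iter and matrix as in the sketch above, the three choices map onto the optional flags as follows:

      ! 1. Static thread distribution (default): rows keep a consistent
      !    mapping to threads across subsequent iterations.
      CALL dbcsr_iterator_start(iter, matrix)

      ! 2. Dynamic scheduling, grouped by rows (the default grouping).
      CALL dbcsr_iterator_start(iter, matrix, dynamic=.TRUE.)

      ! 3. Fully dynamic scheduling: every block assigned on demand.
      CALL dbcsr_iterator_start(iter, matrix, dynamic=.TRUE., dynamic_byrows=.FALSE.)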

Arguments

type(dbcsr_iterator), intent(out) :: iterator

the iterator

type(dbcsr_type), intent(in) :: matrix

DBCSR matrix

logical, intent(in), optional :: shared

The matrix is shared between several iterators. Default is .TRUE.

logical, intent(in), optional :: dynamic

Threads are given blocks regardless of the thread distribution; default is .FALSE.

logical, intent(in), optional :: dynamic_byrows

Threads are given blocks regardless of the thread distribution, but still grouped by rows; default is .FALSE.

logical, intent(in), optional :: contiguous_pointers

Whether returned pointers need to be contiguous; default is .TRUE.

logical, intent(in), optional :: read_only

User promises not to change returned data; default is .FALSE.
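
As a usage sketch combining these flags, a single thread inside a parallel region can traverse all blocks without modifying the data (declarations as in the earlier sketch):

!$OMP MASTER
      ! One thread walks every block, so sharing must be disabled;
      ! read_only promises the returned data will not be changed.
      CALL dbcsr_iterator_start(iter, matrix, shared=.FALSE., read_only=.TRUE.)
      DO WHILE (dbcsr_iterator_blocks_left(iter))
         CALL dbcsr_iterator_next_block(iter, row, col, blk)
      END DO
      CALL dbcsr_iterator_stop(iter)
!$OMP END MASTER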


Source Code

   SUBROUTINE dbcsr_iterator_start(iterator, matrix, shared, dynamic, &
                                   dynamic_byrows, contiguous_pointers, read_only)
      !! Sets up an iterator
      !!
      !! Contiguous pointers
      !! Contiguous pointers may incur reallocation penalties but enable quick
      !! passing of arrays to routines with unspecified interfaces (i.e., direct
      !! calls to BLACS or MPI).
      !!
      !! Threading
      !! The TYPE(dbcsr_iterator) variable should be thread-private.
      !!
      !! The iterator has several modes of operation when used with
      !! OpenMP. Two options can be set to influence the behavior.
      !!
      !! Threading: shared vs. non-shared
      !! The "shared" flag specifies that several threads will be
      !! iterating through the same matrix.
      !! - Sharing is the default when called from an active parallel
      !! region. In the shared mode no two threads will receive the
      !! same block; i.e., the work is split among the threads.
      !! - If each (or one) thread needs to iterate through all blocks
      !! then shared should be set to .FALSE.. (E.g., when called
      !! from an enclosing MASTER region or when each thread has its
      !! own matrix.)
      !! - It is safe to use an iterator in non-shared mode with only
      !! one thread.  No thread synchronization constructs are used
      !! in this case.
      !!
      !! Threading in shared mode
      !! When in shared mode there are three possibilities to select
      !! how the blocks are distributed to the threads.
      !! <DL>
      !! <DT>Thread distribution</DT>
      !! <DD>The default is to use the thread distribution. The thread
      !! distribution statically maps rows to threads and should be
      !! used whenever retaining a consistent mapping among
      !! subsequent iterations is important.</DD>
      !! <DT>Dynamic scheduling</DT>
      !! <DD>If the dynamic flag is .TRUE., then blocks are given to
      !! threads dynamically. By default the assignment is grouped
      !! by rows (to minimize synchronization); however, if the
      !! dynamic_byrows flag is .FALSE. then every block is
      !! assigned dynamically.</DD></DL>

      TYPE(dbcsr_iterator), INTENT(OUT)                  :: iterator
         !! the iterator
      TYPE(dbcsr_type), INTENT(IN)                       :: matrix
         !! DBCSR matrix
      LOGICAL, INTENT(IN), OPTIONAL                      :: shared, dynamic, dynamic_byrows, &
                                                            contiguous_pointers, read_only
         !! The matrix is shared between several iterators. Default is .TRUE.
         !! Threads are given blocks regardless of the thread distribution; default is .FALSE.
         !! Threads are given blocks regardless of the thread distribution, but still grouped by rows; default is .FALSE.
         !! Whether returned pointers need to be contiguous; default is .TRUE.
         !! User promises not to change returned data; default is FALSE

      CHARACTER(len=*), PARAMETER :: routineN = 'dbcsr_iterator_start'

      INTEGER                                            :: error_handle
      TYPE(dbcsr_distribution_obj)                       :: dist

!   ---------------------------------------------------------------------------

      MARK_USED(dynamic) ! only used with OMP

      CALL timeset(routineN, error_handle)
      iterator%shared = .TRUE.
!$    iterator%shared = omp_in_parallel()
      IF (PRESENT(shared)) iterator%shared = shared
      iterator%dynamic = .TRUE.
!$    iterator%dynamic = .FALSE.
!$    IF (PRESENT(dynamic)) iterator%dynamic = dynamic
      IF (PRESENT(dynamic_byrows)) THEN
         iterator%dynamic_byrows = dynamic_byrows
         IF (iterator%dynamic_byrows) iterator%dynamic = .TRUE.
      ELSE
         iterator%dynamic_byrows = iterator%dynamic
!$       iterator%dynamic_byrows = iterator%dynamic
      END IF
!$    IF (.NOT. iterator%shared) THEN
!$       iterator%dynamic = .FALSE.
!$    END IF
      dist = dbcsr_distribution(matrix)
!$    IF (.NOT. dbcsr_distribution_has_threads(dist)) &
!$       DBCSR_WARN("Thread distribution should be defined for OpenMP.")
      IF (.NOT. iterator%dynamic .AND. .NOT. dbcsr_distribution_has_threads(dist)) &
         DBCSR_ABORT("Thread distribution must be defined for non-dynamic iterator.")
!$    IF (omp_in_parallel() .AND. omp_get_num_threads() /= dbcsr_distribution_num_threads(dist)) &
!$       CALL dbcsr_abort(__LOCATION__, &
!$                        "Number of threads has changed from "// &
!$                        stringify(dbcsr_distribution_num_threads(dist))// &
!$                        " to "//stringify(omp_get_num_threads())//"!")
      !Synchronize the positions
      NULLIFY (iterator%common_pos)
      IF (iterator%dynamic) THEN
         ! All threads point into the master thread's data space
         ! (temporarily using the common_int_pointer variable). This is
         ! not the nicest OpenMP way of doing this but it is also not
         ! explicitly forbidden.
         !
!$OMP        BARRIER
!$OMP        MASTER
         ALLOCATE (iterator%common_pos)
         common_int_pointer => iterator%common_pos
         common_int_pointer = 0
!$OMP        FLUSH (common_int_pointer)
!$OMP        END MASTER
!$OMP        BARRIER
         IF (.NOT. ASSOCIATED(iterator%common_pos)) THEN
            iterator%common_pos => common_int_pointer
         END IF
!$OMP        BARRIER
      END IF
      !
      IF (PRESENT(contiguous_pointers)) THEN
         iterator%contiguous_pointers = contiguous_pointers
      ELSE
         iterator%contiguous_pointers = .TRUE.
      END IF
      IF (PRESENT(read_only)) THEN
         iterator%read_only = read_only
      ELSE
         iterator%read_only = .FALSE.
      END IF
      iterator%row = 0
      iterator%pos = 0
      iterator%rbs => array_data(matrix%row_blk_size)
      iterator%cbs => array_data(matrix%col_blk_size)
      iterator%roff => array_data(matrix%row_blk_offset)
      iterator%coff => array_data(matrix%col_blk_offset)

      iterator%local_indexing = matrix%local_indexing
      !IF(iterator%local_indexing .AND. .NOT. iterator%dynamic) &
      !   DBCSR_ABORT("Locally-indexed matrices can only have a dynamic iterator.")
      IF (iterator%local_indexing .AND. .NOT. array_exists(matrix%local_rows)) &
         CALL dbcsr_abort(__LOCATION__, &
                          "Local rows mapping array should exist when local indexing is used.")
      IF (iterator%local_indexing .AND. .NOT. array_exists(matrix%global_rows)) &
         CALL dbcsr_abort(__LOCATION__, &
                          "Global rows mapping array should exist when local indexing is used.")
      iterator%global_rows => array_data(matrix%global_rows)
      iterator%local_rows => array_data(matrix%local_rows)

      iterator%transpose = .FALSE. !matrix%transpose
      iterator%nblks = matrix%nblks
      IF (iterator%transpose) THEN
         iterator%nblkrows_total = matrix%nblkcols_total
      ELSE
         iterator%nblkrows_total = matrix%nblkrows_total
      END IF

      iterator%row_p => matrix%row_p
      iterator%col_i => matrix%col_i
      iterator%blk_p => matrix%blk_p
!$OMP     CRITICAL (crit_data)
      iterator%data_area = matrix%data_area
      CALL dbcsr_data_hold(iterator%data_area)
!$OMP     END CRITICAL (crit_data)
      iterator%row_size = 0
      IF (.NOT. iterator%dynamic) THEN
         iterator%tdist => array_data(dbcsr_distribution_thread_dist(dist))
      ELSE
         NULLIFY (iterator%tdist)
      END IF
!$    IF (iterator%dynamic) THEN
!$OMP           SINGLE
!$       IF (iterator%dynamic_byrows) THEN
!$          iterator%common_pos = omp_get_num_threads()
!$       END IF
!$OMP           END SINGLE
!$       CALL dbcsr_iterator_seek(iterator, omp_get_thread_num() + 1)
!$    ELSE
         CALL dbcsr_iterator_seek(iterator, 1)
!$    END IF
      CALL timestop(error_handle)
   END SUBROUTINE dbcsr_iterator_start