[comp.sys.alliant] Local storage within concurrent loop

pmontgom@euphemia.math.ucla.edu (Peter Montgomery) (09/21/90)

C	My first posting of this used "*" rather than "C" on these initial
C	comments -- the news software deleted them all.

C	How do I allocate an array inside a concurrent loop?
C	In this toy example, each call to TRIG fills an array (sc)
C	that the caller uses only during that iteration of the loop.
C	If the loop in the main program is to execute concurrently,
C	each processor must have a private copy of the array,
C	but I want to avoid allocating a separate copy for each iteration
C	of the loop (i.e., on an 8-processor system I am willing
C	to have 8 copies of the temporary array, but not 100000 copies).
C	How should I declare this?  If I use "allocate", the compiler
C	declines to optimize the loop.

	program test
	implicit none
	integer IMAX, i
	parameter (IMAX = 100000)
	double precision sc1(2), sc2(2), sc3(2, IMAX), sc4(:)
	double precision sum1, sum2, sum3, sum4

	sum1 = 0
	do i = 1, IMAX					! Sequential loop
	    call TRIG(DBLE(i), sc1)
	    sum1 = sum1 + sc1(1)*sc1(2)**2
	end do

CVD$L	CNCALL
	sum2 = 0
	do i = 1, IMAX					! Executes incorrectly
	    call TRIG(DBLE(i), sc2)
	    sum2 = sum2 + sc2(1)*sc2(2)**2
	end do

CVD$L	CNCALL
	sum3 = 0
	do i = 1, IMAX					! sc3 array too big
	    call TRIG(DBLE(i), sc3(1,i))
	    sum3 = sum3 + sc3(1,i)*sc3(2,i)**2
	end do

CVD$L	CNCALL
	sum4 = 0
	do i = 1, IMAX					! allocate inhibits concurrency
	    allocate(sc4(1:2))
	    call TRIG(DBLE(i), sc4)
	    sum4 = sum4 + sc4(1)*sc4(2)**2
	    deallocate (sc4)
	end do

	print *, 'Sums are ', sum1, sum2, sum3, sum4	! Should be identical
	end

	recursive subroutine TRIG(angle, sc)
	implicit none
	double precision angle, sc(2)
	sc(1) = SIN(angle)
	sc(2) = COS(angle)
	end
--
        Peter L. Montgomery 
        pmontgom@MATH.UCLA.EDU 
        Department of Mathematics, UCLA, Los Angeles, CA 90024-1555
If I spent as much time on my dissertation as I do reading news, I'd graduate.

jac@Alliant.COM (Jim Chmura) (09/27/90)

	The trick here, which gives you 8 copies of the array (one per
processor), is to dimension the array (n,0:7) and then call the library routine
lib_processor_number to determine which processor a particular iteration
is running on.  Each processor then works in its own copy:

	       dimension array(n,0:7)
		   .
		   .
	cvd$ cncall
	       do i=1,100000
		   .
		   .
		 np=lib_processor_number()
		 call sub(...array(1,np)...)
		   .
		   .
	       enddo

	Be sure to include -lcommon in your link string to pick up the
lib_processor_number routine.

Regards,
Jim Chmura

Area Analyst Manager
Alliant - Marietta GA
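
Applied to the toy program in the question, the suggestion above might look
like the sketch below.  It is untested and rests on assumptions: an
8-processor system, and lib_processor_number being an integer function that
returns 0 through 7 (which the reply's (n,0:7) dimension implies).  The names
test5, NPROC, sc5 and sum5 are made up for illustration, and the thread does
not say whether any directive beyond CNCALL is needed.  Link with -lcommon as
noted above.

```fortran
C	Sketch only: per-processor scratch array, indexed by the
C	processor number, in place of a per-iteration array.
	program test5
	implicit none
	integer IMAX, NPROC, i, np
	parameter (IMAX = 100000, NPROC = 8)	! NPROC = 8 is an assumption
	integer lib_processor_number		! assumed to return 0..NPROC-1
	double precision sc5(2, 0:NPROC-1)	! one column per processor
	double precision sum5

CVD$L	CNCALL
	sum5 = 0
	do i = 1, IMAX
	    np = lib_processor_number()		! processor for this iteration
	    call TRIG(DBLE(i), sc5(1,np))	! fill this processor's column
	    sum5 = sum5 + sc5(1,np)*sc5(2,np)**2
	end do

	print *, 'Sum is ', sum5
	end

	recursive subroutine TRIG(angle, sc)
	implicit none
	double precision angle, sc(2)
	sc(1) = SIN(angle)
	sc(2) = COS(angle)
	end
```

The scratch storage is 2*NPROC doubles instead of 2*IMAX, which is the
trade-off the question asked for: 8 copies of the temporary, not 100000.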