[comp.lang.apl] Tim Budd's apl compiler : some fixes

jbww@ukc.ac.uk (J.B.W.Webber) (10/19/90)

In article <9014@latcs1.oz.au> burns@latcs1.lat.oz.au (Jonathan Burns) writes:
>
>I would very much appreciate the advice of someone who has experience with
>Tim Budd's APL compiler. 
>
>I have it compiled on a Sony NEWS machine under Unix. The Parse pass fails
>on the ulam.apl test file, though not if I cut that file back to one
>function definition. The subsequent pass fails on the output from that.
>
>I added a fflush(stdio) to pass.c before the normal exit, but no luck.
>
>This looks like something simple, like a system default on pipe buffer
>sizes, but I can't spot it. Any clues?
>
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


There have been some queries about upgrades to Tim Budd's apl compiler.
I did some work on a couple of versions, a few years ago, and I must say,
as much as Tim may warn of its limitations, I have found it most useful.

	It really is nice, when a research student needs a piece of code,
	to just be able to take a few apl routines, say aplc, and get
	something that is reasonably optimised. (If I can do better, why,
	just build the algorithm into the code generation, so all similar
	examples can use it.)

I don't know which version you have, or if the bug fixes I reported are
now included, but I will list them in case they are of use - I know Tim
is busy (I had to bother him a year or so later - the archive tape I had
saved sources onto had been reused as a regular backup ! I know that then
these fixes were not installed; fortunately I did have a copy of the
following, and was able to restore a fixed version.)

	(This version also failed on ulam, but now works fine)
	I have built the compiler on a number of hosts (Vax, Clipper, ST, ..,
	and have compiled output C code on more (even Transputers under Helios).
	The main requirement is a C compiler that gets structures right.
	A good optimiser to chuck away unwanted routines is nice.
	
 	J.B. Webber 	jbww@ukc.ac.uk
			Physics Lab, University of Kent at Canterbury, Kent, UK.

O.K. here follows my letter to Tim (about June 1987) :

___________________________________________________________________________

        Thanks for the latest copy of aplc.  
	.
	.
	.
        The code produced runs faster and is about twice as compact
        as that produced by the earlier version   that  I have been
        working with to-date.
        -      most impressive  (it also looks a lot cleaner)

It passes all the `snags' that I had problems with before (such as 
failing when catenating items of zero length).

I have found two small typos in axis.c 
     .
     .
Bug  finding and fixing has been proceeding so well,  that I  have 
just kept going.  Although the changes I have had to make are few, 
they  have  made  a  great difference to  the  actual  ability  to 
compile  and  run   my programs.   I have  mainly  looked  at  the 
`production' stages,  but the compiler really does now work rather 
well (with just two `holes' that I know of).
Two  things  that  I  have looked  briefly  at,  but  do  not  yet  
understand, are :

1)   looping at the top level does not appear to be allowed -  
     one is allowed a `simplestatement' but not an `lstatement'
     -    is this a bug or a feature ?  
     -    is there a conflict with direct definition ?

At  the  least  this should be handled by  something  rather  less 
cryptic than a core dump !              -  any ideas ?

[ in apl.h :
object:            simplestatement
                        {gstmts = addstmt(gstmts, newstate(NILCHAR, $1)); }
                |  dclstmt
                |  function
                |  fident COLON expression NL
                                {direct($1, $3);}
                |  uident COLON expression NL
                                {direct($1, $3);}
                ;
]


2)   arbitary axis catenate and laminate is not yet supported -
     the comma bar function does not appear to be supported
     -    is the missing code quite as widespread as the  comments 
          would suggest ?
     -    would it be easier to add comma bar as a first step ?

This  would appear to be the thing that is now the most  important 
item in need of completion. 

Optimisation.
 
The  way should now be clear to complete some final  optimisations 
of the production stage.  i.e.  - Some of the things that are  now 
fixed,  are  clearly intended to allow the elimiation of code  for 
loops which it is known will only be executed once.

        In gendrop in trans.c at line 364
        there is some optimisation code for a known loop count,
        that now works correctly.

        (This appears to be the only example of this particular
        optimisation -  for rkloop, but could also be used for `take',
        which makes the only other call to rkloop.)

Could a similar optimisation be worked into other loops
        -    such as izloop ?  ( 6 calls )    What were your plans ?
_____________________________________________________________________________

Thanks indeed for the latest copy. 
	.
	.
	.
                cheers
                             beau webber			jbww@UKC.AC.UK
                                        UKC, Canterbury, England.


P.S. Is Atari St disc a useful way of exchanging bug fixes,etc. (SS/DS ?) 
__________________________________________________________________


O.K.,    so here are the changes,    first in human  readable form,
                                     then as a shell script.

                I have taken care to mark all changes,  so  people 
                know who to blame,  if I have broken anything else.  
__________________________________________________________________

1) dyadic rotate :
        for :

                                d _ 5 5 .ro .io 25
                                V _ 1 2 2 2 3
                                
                                h _  V .rv d

        was getting :
                                i5 = (*(V.value.ip + i6) + i2 % i4;
        cause :

        In genrotate   in Axis.c at line 650 
        an  unpaired open parenthesis   lp()
                -       I just commented the line out, though I expect
                        inserting the matching one would be more correct.

axis.c
        genrotate -  generate code for dyadic rotate instruction
*** 647,653
  
                        /* add value to appropriate position of rhs */
                        ieq(node->ptr3);
! /*                    lp();                                   jbww UKC 6/87 */
                        ctgen(LEFT->values.c);
                        printf(" + ");
                        asubi(node, RIGHT, node->index, node->ptr1, node->ptr2);


2) subscripted indexing :
        for :                   (above example continued)

                                v _  V .rv[1] d         

        was getting :

                                i6 = (*(V.value.ip + i7) + (i2 / i4) i5;

        cause :

        In asubi  in  Axis.c at line 211

        in a printf statement :   
                        a   %     where there should be  %%   , of course

axis.c
        asubi -  generate code to produce the i'th element 
*** 208,214
        else if (node->info & LASTAXIS)
                printf("i%d %% i%d", index, s);
        else
!               printf("(i%d / i%d) %% i%d", index, e, s); /* jbww UKC 6/87 */
! /*            printf("(i%d / i%d) % i%d", index, e, s);                   */
  }

3) reduction,
         along a specified axis :
                        i.e.            +/[2] 4 5 6 .ro .io 120

        There were two bugs here -
        : firstly the new shape was being generated incorrectly
        : then the values were being selected in the wrong order  

        In  cpwo in  runmem.c at line 146
        :       both  pointers were being incremented together
                (or not), but always need the one to step on. 
        (I have gone for clarity, this may be written better.)  

        In genred in ops.c at line 500
        :        in the call to divmod, div and mod were swapped.  
        Took  me  a while to see what had happened -  on  the  way 
        realised that these two lines can be simplified quite a bit 
         - more on optimisation another time, just doing fixes now !
 
runmem.c
          cpwo - copy an integer vector without a specific element 
*** 141,149
  register int *to, *from;
  {     int i;
  
!       for (i = 0; i < size; i++) {
                if (i != wo)
!                       *to++ = *from;                  /* jbww UKC 6/87 */
!               from++;
!       }

ops.c 
                genred - generate code for reduction function
*** 496,506
                                printf("i%d = (i%d + 1) * i%d - 1;\n", 
                                   RIGHT->index, node->index, node->ptr3);
                        else {
  
!                               divmod(node->index, node->ptr2, node->ptr9,
!                                RIGHT->index );
!                               /* div, mod were swapped.     jbww ukc 6/87 */
                                printf("i%d += (i%d + 1) * i%d - i%d;\n", 
                                   RIGHT->index, node->ptr9,
                                   node->ptr8, node->ptr2);
  
                                }


4) drop, 
        with a positive value, and a vector argument

                        i.e                 5 .da .io 15

        was giving the same result as 
                                         .ng5 .da .io 15



There were two bugs here :

        In gendrop in trans.c at line 364
        in some optimisation code for a known left value,
        the test for the sign was the wrong way round :

        This then exposed a problem in the evaluation of the constant :

        In valvalue in  cutil.c at line 95
        this had clearly been edited from valshape, and was still
        returning the shape information.



trans.c
        gendrop - generate code for drop function
*** 360,368
                case COMBINE:
                        if (is_scalar(LEFT->right)) {
                                if (LEFT->info & VALUESKNOWN) {
                                        top = iconsts[valvalue(LEFT)];
!                                       if (top > 0) {     /*   jbww ukc 6/87 */
! /*                                    if (top < 0) {                        */
                                                printf("*mp%d.ip += %d;\n",
                                                        node->ptr2, top);
                                                }
                                       }

cutil.c
*** 86,99
/* valvalue - return the constant offset of a known value */
int valvalue(node)
struct node *node;
{       struct codetree *child;

        if (! (node->info & VALUESKNOWN))
                gcant_happen(113);
        if ((node->values.c)->cop != cnst)
                gcant_happen(10);
!       child = (node->values.c)->c0.cleft;     /*              jbww UKC 6/87 */
! /*    child = (node->shape.c)->c0.cleft;                                    */
        if (child->cop != icnst)
                gcant_happen(1112);
        return(child->c0.cindex);
}

5) looping :
        Three problems here, 

                .go   sel / label
        Compression  by  a  scalar was not  handled  the  same  as 
        compression by a vector with one element
        :       I just pretended they were the same.

        The `assignments to right of arrow not recognised' problem 
        :       move the `break' production from after the stmtno
                assignment to after the finish code
        :       ensure that code will `fall through' if no branch,
                by incrementing stmtno just before the test

        If  more  than one label is present,  each  one  that  was 
        declared in the c program was followed by a `}' 
        :       caused by a copied sectiom in gcdcls, that was  
                incompletely edited  


axis.c
        compshape -  generate shape information for compress
*** 306,312
  {
        printf("valloc(&mp%d, ", mpval);
  
!       if ( (is_vector(child) || is_scalar(child)) ){  /*      jbww UKC 6/87 */
! /*    if (is_vector(child)) {                                               */
                printf("1, INT);\n");
                setmptoi(mpval, sp);
                }


misc.c  
        gengo -  generate code for unconditional goto
*** 384,407
                case SHAPE:
                        adjdcls(node);
                        switchbox(RIGHT, SHAPE, 0);
                        getsize(node->ptr1, RIGHT);
+                       printf("stmtno++;\n");          /*      jbww UKC 6/87 */
                        printf("if (i%d > 0) {\n", node->ptr1);
                        savetree = looprank;
                        looprank = RIGHT->rank.c;

                        if (! (RIGHT->info & NOINDEX))
                                seticon(RIGHT->index, 0);
                        switchbox(RIGHT, VALUE, 0);
                        mkrktype(RIGHT, INT, node->ptr2);
                        printf("stmtno = ");
                        ctgen(RIGHT->values.c);
                        seminl();
! /*                    printf("break;\n");                                   */
                        rbr();
                        looprank = savetree;
                        break;
  
                case FINISH:
                        switchbox(RIGHT, FINISH, 0);
+                       printf("break;\n");             /*      jbww UKC 6/87 */
                        break;
        }
  }

dcls.c
       gcdcls - generate constant declarations 
*** 77,85
         if (lctop) {
                printf("int l_%s[%d] = {", funname, lctop);
                for (i = 0; i < lctop; i++) {
                        printf("%d%c ", lconsts[i].label_num, 
!                               ((i+1)<lctop?',':'}'));     /*  jbww UKC 6/87 */
! /*                            ((i+1)<rctop?',':'}'));                       */
                        if (i % 10 == 9) nl();
                        }
                seminl();
                }

__________________________________________________________________

Here is a shar  file. 	Place it in a directory with the source of aplc,
			and run it through  /bin/sh ,

			It will preserve copies of the original source,
			and insert the bugfixes where necessary.

__________________________________________________________________


#	This is a shell archive.
#	Remove everything above and including the cut line.
#	Then run the rest of the file through sh.
#-----cut here-----cut here-----cut here-----cut here-----
#!/bin/sh
# shar:	Shell Archiver
#	Run the following text with /bin/sh to create:
#	mod.sh
#	axis.ed
#	dcls.ed
#	misc.ed
#	ops.ed
#	runmem.ed
#	trans.ed
#	util.ed
# This archive created: Tue Jul  7 21:11:28 1987
echo shar: extracting mod.sh '(380 characters)'
sed 's/^XX//' << \SHAR_EOF > mod.sh
XX#! /bin/sh -e
XX#         preserves originals, then inserts bugfixes  dated 6/87
XX#         from j.b. webber at University of Kent, Canterbury, UK. 
XX
XXmv axis.c axis.c.orig
XXmv runmem.c runmem.c.orig
XXmv ops.c ops.c.orig
XXmv trans.c trans.c.orig
XXmv cutil.c cutil.c.orig
XXmv misc.c misc.c.orig
XXmv dcls.c dcls.c.orig
XX
XXforeach i (*.ed)
XX(cat $i ; echo '1,$p' ) | ed $i:r.c.orig > $i:r.c
XXend
XX
SHAR_EOF
if test 380 -ne "`wc -c mod.sh`"
then
echo shar: error transmitting mod.sh '(should have been 380 characters)'
fi
echo shar: extracting axis.ed '(423 characters)'
sed 's/^XX//' << \SHAR_EOF > axis.ed
XX650c
XX/*                      lp();                                   jbww UKC 6/87 */
XX.
XX309c
XX        if ( (is_vector(child) || is_scalar(child)) ){  /*      jbww UKC 6/87 */
XX/*      if (is_vector(child)) {                                               */
XX.
XX211c
XX                printf("(i%d / i%d) %% i%d", index, e, s); /* jbww UKC 6/87 */
XX/*              printf("(i%d / i%d) % i%d", index, e, s);                   */
XX.
XX
SHAR_EOF
if test 423 -ne "`wc -c axis.ed`"
then
echo shar: error transmitting axis.ed '(should have been 423 characters)'
fi
echo shar: extracting dcls.ed '(168 characters)'
sed 's/^XX//' << \SHAR_EOF > dcls.ed
XX81c
XX                                ((i+1)<lctop?',':'}'));     /*  jbww UKC 6/87 */
XX/*                              ((i+1)<rctop?',':'}'));                       */
XX.
SHAR_EOF
if test 168 -ne "`wc -c dcls.ed`"
then
echo shar: error transmitting dcls.ed '(should have been 168 characters)'
fi
echo shar: extracting misc.ed '(265 characters)'
sed 's/^XX//' << \SHAR_EOF > misc.ed
XX404a
XX                        printf("break;\n");             /*      jbww UKC 6/87 */
XX.
XX398c
XX/*                      printf("break;\n");                                   */
XX.
XX387a
XX                        printf("stmtno++;\n");          /*      jbww UKC 6/87 */
XX.
XX
SHAR_EOF
if test 265 -ne "`wc -c misc.ed`"
then
echo shar: error transmitting misc.ed '(should have been 265 characters)'
fi
echo shar: extracting ops.ed '(500 characters)'
sed 's/^XX//' << \SHAR_EOF > ops.ed
XX503a
XX */
XX.
XX499c
XX
XX                                divmod(node->index, node->ptr2, node->ptr9,
XX                                 RIGHT->index );
XX                                /* div, mod were swapped.     jbww ukc 6/87 */
XX                                printf("i%d += (i%d + 1) * i%d - i%d;\n", 
XX                                   RIGHT->index, node->ptr9,
XX                                   node->ptr8, node->ptr2);
XX
XX/*                              divmod(node->index, node->ptr2, RIGHT->index, 
XX.
XX
SHAR_EOF
if test 500 -ne "`wc -c ops.ed`"
then
echo shar: error transmitting ops.ed '(should have been 500 characters)'
fi
echo shar: extracting runmem.ed '(272 characters)'
sed 's/^XX//' << \SHAR_EOF > runmem.ed
XX146c
XX                        *to++ = *from;                  /* jbww UKC 6/87 */
XX                from++;
XX        }
XX/*      for (i = 0; i < size; i++) 
XX *              if (i != wo)
XX *                      *to++ = *from++;
XX */
XX.
XX144c
XX        for (i = 0; i < size; i++) {
XX.
XX
SHAR_EOF
if test 272 -ne "`wc -c runmem.ed`"
then
echo shar: error transmitting runmem.ed '(should have been 272 characters)'
fi
echo shar: extracting trans.ed '(170 characters)'
sed 's/^XX//' << \SHAR_EOF > trans.ed
XX364c
XX                                        if (top > 0) {     /*   jbww ukc 6/87 */
XX/*                                      if (top < 0) {                        */
XX.
XX
SHAR_EOF
if test 170 -ne "`wc -c trans.ed`"
then
echo shar: error transmitting trans.ed '(should have been 170 characters)'
fi
echo shar: extracting util.ed '(169 characters)'
sed 's/^XX//' << \SHAR_EOF > util.ed
XX95c
XX        child = (node->values.c)->c0.cleft;     /*              jbww UKC 6/87 */
XX/*      child = (node->shape.c)->c0.cleft;                                    */
XX.
XX
SHAR_EOF
if test 169 -ne "`wc -c util.ed`"
then
echo shar: error transmitting util.ed '(should have been 169 characters)'
fi
#	End of shell archive
echo
echo     preserving original aplc source,  inserting bugfixes 
/bin/csh mod.sh
exit 0