jbww@ukc.ac.uk (J.B.W.Webber) (10/19/90)
In article <9014@latcs1.oz.au> burns@latcs1.lat.oz.au (Jonathan Burns) writes: > >I would very much appreciate the advice of someone who has experience with >Tim Budd's APL compiler. > >I have it compiled on a Sony NEWS machine under Unix. The Parse pass fails >on the ulam.apl test file, though not if I cut that file back to one >function definition. The subsequent pass fails on the output from that. > >I added a fflush(stdio) to pass.c before the normal exit, but no luck. > >This looks like something simple, like a system default on pipe buffer >sizes, but I can't spot it. Any clues? > >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There have been some queries about upgrades to Tim Budd's apl compiler. I did some work on a couple of versions, a few years ago, and I must say, as much as Tim may warn of its limitations, I have found it most useful. It really is nice, when a research student needs a piece of code, to just be able to take a few apl routines, say aplc, and get something that is reasonably optimised. (If I can do better, why, just build the algorithm into the code generation, so all similar examples can use it.) I don't know which version you have, or if the bug fixes I reported are now included, but I will list them in case they are of use - I know Tim is busy (I had to bother him a year or so later - the archive tape I had saved sources onto had been reused as a regular backup ! I know that then these fixes were not installed; fortunately I did have a copy of the following, and was able to restore a fixed version.) (This version also failed on ulam, but now works fine) I have built the compiler on a number of hosts (Vax, Clipper, ST, .., and have compiled output C code on more (even Transputers under Helios). The main requirement is a C compiler that gets structures right. A good optimiser to chuck away unwanted routines is nice. J.B. Webber jbww@ukc.ac.uk Physics Lab, University of Kent at Canterbury, Kent, UK. O.K. here follows my letter to Tim (about June 1987) : ___________________________________________________________________________ Thanks for the latest copy of aplc. . . . The code produced runs faster and is about twice as compact as that produced by the earlier version that I have been working with to-date. - most impressive (it also looks a lot cleaner) It passes all the `snags' that I had problems with before (such as failing when catenating items of zero length). I have found two small typos in axis.c . . Bug finding and fixing has been proceeding so well, that I have just kept going. Although the changes I have had to make are few, they have made a great difference to the actual ability to compile and run my programs. I have mainly looked at the `production' stages, but the compiler really does now work rather well (with just two `holes' that I know of). Two things that I have looked briefly at, but do not yet understand, are : 1) looping at the top level does not appear to be allowed - one is allowed a `simplestatement' but not an `lstatement' - is this a bug or a feature ? - is there a conflict with direct definition ? At the least this should be handled by something rather less cryptic than a core dump ! - any ideas ? [ in apl.h : object: simplestatement {gstmts = addstmt(gstmts, newstate(NILCHAR, $1)); } | dclstmt | function | fident COLON expression NL {direct($1, $3);} | uident COLON expression NL {direct($1, $3);} ; ] 2) arbitary axis catenate and laminate is not yet supported - the comma bar function does not appear to be supported - is the missing code quite as widespread as the comments would suggest ? - would it be easier to add comma bar as a first step ? This would appear to be the thing that is now the most important item in need of completion. Optimisation. The way should now be clear to complete some final optimisations of the production stage. i.e. - Some of the things that are now fixed, are clearly intended to allow the elimiation of code for loops which it is known will only be executed once. In gendrop in trans.c at line 364 there is some optimisation code for a known loop count, that now works correctly. (This appears to be the only example of this particular optimisation - for rkloop, but could also be used for `take', which makes the only other call to rkloop.) Could a similar optimisation be worked into other loops - such as izloop ? ( 6 calls ) What were your plans ? _____________________________________________________________________________ Thanks indeed for the latest copy. . . . cheers beau webber jbww@UKC.AC.UK UKC, Canterbury, England. P.S. Is Atari St disc a useful way of exchanging bug fixes,etc. (SS/DS ?) __________________________________________________________________ O.K., so here are the changes, first in human readable form, then as a shell script. I have taken care to mark all changes, so people know who to blame, if I have broken anything else. __________________________________________________________________ 1) dyadic rotate : for : d _ 5 5 .ro .io 25 V _ 1 2 2 2 3 h _ V .rv d was getting : i5 = (*(V.value.ip + i6) + i2 % i4; cause : In genrotate in Axis.c at line 650 an unpaired open parenthesis lp() - I just commented the line out, though I expect inserting the matching one would be more correct. axis.c genrotate - generate code for dyadic rotate instruction *** 647,653 /* add value to appropriate position of rhs */ ieq(node->ptr3); ! /* lp(); jbww UKC 6/87 */ ctgen(LEFT->values.c); printf(" + "); asubi(node, RIGHT, node->index, node->ptr1, node->ptr2); 2) subscripted indexing : for : (above example continued) v _ V .rv[1] d was getting : i6 = (*(V.value.ip + i7) + (i2 / i4) i5; cause : In asubi in Axis.c at line 211 in a printf statement : a % where there should be %% , of course axis.c asubi - generate code to produce the i'th element *** 208,214 else if (node->info & LASTAXIS) printf("i%d %% i%d", index, s); else ! printf("(i%d / i%d) %% i%d", index, e, s); /* jbww UKC 6/87 */ ! /* printf("(i%d / i%d) % i%d", index, e, s); */ } 3) reduction, along a specified axis : i.e. +/[2] 4 5 6 .ro .io 120 There were two bugs here - : firstly the new shape was being generated incorrectly : then the values were being selected in the wrong order In cpwo in runmem.c at line 146 : both pointers were being incremented together (or not), but always need the one to step on. (I have gone for clarity, this may be written better.) In genred in ops.c at line 500 : in the call to divmod, div and mod were swapped. Took me a while to see what had happened - on the way realised that these two lines can be simplified quite a bit - more on optimisation another time, just doing fixes now ! runmem.c cpwo - copy an integer vector without a specific element *** 141,149 register int *to, *from; { int i; ! for (i = 0; i < size; i++) { if (i != wo) ! *to++ = *from; /* jbww UKC 6/87 */ ! from++; ! } ops.c genred - generate code for reduction function *** 496,506 printf("i%d = (i%d + 1) * i%d - 1;\n", RIGHT->index, node->index, node->ptr3); else { ! divmod(node->index, node->ptr2, node->ptr9, ! RIGHT->index ); ! /* div, mod were swapped. jbww ukc 6/87 */ printf("i%d += (i%d + 1) * i%d - i%d;\n", RIGHT->index, node->ptr9, node->ptr8, node->ptr2); } 4) drop, with a positive value, and a vector argument i.e 5 .da .io 15 was giving the same result as .ng5 .da .io 15 There were two bugs here : In gendrop in trans.c at line 364 in some optimisation code for a known left value, the test for the sign was the wrong way round : This then exposed a problem in the evaluation of the constant : In valvalue in cutil.c at line 95 this had clearly been edited from valshape, and was still returning the shape information. trans.c gendrop - generate code for drop function *** 360,368 case COMBINE: if (is_scalar(LEFT->right)) { if (LEFT->info & VALUESKNOWN) { top = iconsts[valvalue(LEFT)]; ! if (top > 0) { /* jbww ukc 6/87 */ ! /* if (top < 0) { */ printf("*mp%d.ip += %d;\n", node->ptr2, top); } } cutil.c *** 86,99 /* valvalue - return the constant offset of a known value */ int valvalue(node) struct node *node; { struct codetree *child; if (! (node->info & VALUESKNOWN)) gcant_happen(113); if ((node->values.c)->cop != cnst) gcant_happen(10); ! child = (node->values.c)->c0.cleft; /* jbww UKC 6/87 */ ! /* child = (node->shape.c)->c0.cleft; */ if (child->cop != icnst) gcant_happen(1112); return(child->c0.cindex); } 5) looping : Three problems here, .go sel / label Compression by a scalar was not handled the same as compression by a vector with one element : I just pretended they were the same. The `assignments to right of arrow not recognised' problem : move the `break' production from after the stmtno assignment to after the finish code : ensure that code will `fall through' if no branch, by incrementing stmtno just before the test If more than one label is present, each one that was declared in the c program was followed by a `}' : caused by a copied sectiom in gcdcls, that was incompletely edited axis.c compshape - generate shape information for compress *** 306,312 { printf("valloc(&mp%d, ", mpval); ! if ( (is_vector(child) || is_scalar(child)) ){ /* jbww UKC 6/87 */ ! /* if (is_vector(child)) { */ printf("1, INT);\n"); setmptoi(mpval, sp); } misc.c gengo - generate code for unconditional goto *** 384,407 case SHAPE: adjdcls(node); switchbox(RIGHT, SHAPE, 0); getsize(node->ptr1, RIGHT); + printf("stmtno++;\n"); /* jbww UKC 6/87 */ printf("if (i%d > 0) {\n", node->ptr1); savetree = looprank; looprank = RIGHT->rank.c; if (! (RIGHT->info & NOINDEX)) seticon(RIGHT->index, 0); switchbox(RIGHT, VALUE, 0); mkrktype(RIGHT, INT, node->ptr2); printf("stmtno = "); ctgen(RIGHT->values.c); seminl(); ! /* printf("break;\n"); */ rbr(); looprank = savetree; break; case FINISH: switchbox(RIGHT, FINISH, 0); + printf("break;\n"); /* jbww UKC 6/87 */ break; } } dcls.c gcdcls - generate constant declarations *** 77,85 if (lctop) { printf("int l_%s[%d] = {", funname, lctop); for (i = 0; i < lctop; i++) { printf("%d%c ", lconsts[i].label_num, ! ((i+1)<lctop?',':'}')); /* jbww UKC 6/87 */ ! /* ((i+1)<rctop?',':'}')); */ if (i % 10 == 9) nl(); } seminl(); } __________________________________________________________________ Here is a shar file. Place it in a directory with the source of aplc, and run it through /bin/sh , It will preserve copies of the original source, and insert the bugfixes where necessary. __________________________________________________________________ # This is a shell archive. # Remove everything above and including the cut line. # Then run the rest of the file through sh. #-----cut here-----cut here-----cut here-----cut here----- #!/bin/sh # shar: Shell Archiver # Run the following text with /bin/sh to create: # mod.sh # axis.ed # dcls.ed # misc.ed # ops.ed # runmem.ed # trans.ed # util.ed # This archive created: Tue Jul 7 21:11:28 1987 echo shar: extracting mod.sh '(380 characters)' sed 's/^XX//' << \SHAR_EOF > mod.sh XX#! /bin/sh -e XX# preserves originals, then inserts bugfixes dated 6/87 XX# from j.b. webber at University of Kent, Canterbury, UK. XX XXmv axis.c axis.c.orig XXmv runmem.c runmem.c.orig XXmv ops.c ops.c.orig XXmv trans.c trans.c.orig XXmv cutil.c cutil.c.orig XXmv misc.c misc.c.orig XXmv dcls.c dcls.c.orig XX XXforeach i (*.ed) XX(cat $i ; echo '1,$p' ) | ed $i:r.c.orig > $i:r.c XXend XX SHAR_EOF if test 380 -ne "`wc -c mod.sh`" then echo shar: error transmitting mod.sh '(should have been 380 characters)' fi echo shar: extracting axis.ed '(423 characters)' sed 's/^XX//' << \SHAR_EOF > axis.ed XX650c XX/* lp(); jbww UKC 6/87 */ XX. XX309c XX if ( (is_vector(child) || is_scalar(child)) ){ /* jbww UKC 6/87 */ XX/* if (is_vector(child)) { */ XX. XX211c XX printf("(i%d / i%d) %% i%d", index, e, s); /* jbww UKC 6/87 */ XX/* printf("(i%d / i%d) % i%d", index, e, s); */ XX. XX SHAR_EOF if test 423 -ne "`wc -c axis.ed`" then echo shar: error transmitting axis.ed '(should have been 423 characters)' fi echo shar: extracting dcls.ed '(168 characters)' sed 's/^XX//' << \SHAR_EOF > dcls.ed XX81c XX ((i+1)<lctop?',':'}')); /* jbww UKC 6/87 */ XX/* ((i+1)<rctop?',':'}')); */ XX. SHAR_EOF if test 168 -ne "`wc -c dcls.ed`" then echo shar: error transmitting dcls.ed '(should have been 168 characters)' fi echo shar: extracting misc.ed '(265 characters)' sed 's/^XX//' << \SHAR_EOF > misc.ed XX404a XX printf("break;\n"); /* jbww UKC 6/87 */ XX. XX398c XX/* printf("break;\n"); */ XX. XX387a XX printf("stmtno++;\n"); /* jbww UKC 6/87 */ XX. XX SHAR_EOF if test 265 -ne "`wc -c misc.ed`" then echo shar: error transmitting misc.ed '(should have been 265 characters)' fi echo shar: extracting ops.ed '(500 characters)' sed 's/^XX//' << \SHAR_EOF > ops.ed XX503a XX */ XX. XX499c XX XX divmod(node->index, node->ptr2, node->ptr9, XX RIGHT->index ); XX /* div, mod were swapped. jbww ukc 6/87 */ XX printf("i%d += (i%d + 1) * i%d - i%d;\n", XX RIGHT->index, node->ptr9, XX node->ptr8, node->ptr2); XX XX/* divmod(node->index, node->ptr2, RIGHT->index, XX. XX SHAR_EOF if test 500 -ne "`wc -c ops.ed`" then echo shar: error transmitting ops.ed '(should have been 500 characters)' fi echo shar: extracting runmem.ed '(272 characters)' sed 's/^XX//' << \SHAR_EOF > runmem.ed XX146c XX *to++ = *from; /* jbww UKC 6/87 */ XX from++; XX } XX/* for (i = 0; i < size; i++) XX * if (i != wo) XX * *to++ = *from++; XX */ XX. XX144c XX for (i = 0; i < size; i++) { XX. XX SHAR_EOF if test 272 -ne "`wc -c runmem.ed`" then echo shar: error transmitting runmem.ed '(should have been 272 characters)' fi echo shar: extracting trans.ed '(170 characters)' sed 's/^XX//' << \SHAR_EOF > trans.ed XX364c XX if (top > 0) { /* jbww ukc 6/87 */ XX/* if (top < 0) { */ XX. XX SHAR_EOF if test 170 -ne "`wc -c trans.ed`" then echo shar: error transmitting trans.ed '(should have been 170 characters)' fi echo shar: extracting util.ed '(169 characters)' sed 's/^XX//' << \SHAR_EOF > util.ed XX95c XX child = (node->values.c)->c0.cleft; /* jbww UKC 6/87 */ XX/* child = (node->shape.c)->c0.cleft; */ XX. XX SHAR_EOF if test 169 -ne "`wc -c util.ed`" then echo shar: error transmitting util.ed '(should have been 169 characters)' fi # End of shell archive echo echo preserving original aplc source, inserting bugfixes /bin/csh mod.sh exit 0