dgh%dgh@Sun.COM (David Hough) (02/20/88)
I have been wondering about a problem that's been discussed, often incorrectly, in comp.lang.c. The problem is easy enough to understand if you've ever written linear algebra software, but the correct "in the spirit of C" solution wasn't clear to me. Fortunately Richard O'Keefe has volunteered some ideas. I'd appreciate feedback on them. If you're simply opposed to making C suitable for scientific programming, the net has already heard from you, so you needn't repeat your views, but if you have ideas for better syntax or better ways to achieve the goal, please speak up. The following is one of my comments on the January 1988 X3J11 draft for ANSI C, followed by the proposed solution. Comment #24, Section 3.5.4.2: fix arrays I know no C translation that's as clear as the follow- ing Fortran code: SUBROUTINE MATMUL(X,LX,Y,LY,Z,LZ,NX,NY,NZ) REAL X(LX,*),Y(LY,*),Z(LZ,*) DO 1 I=1,NX DO 2 J=1,NZ SUM=0 DO 3 K=1,NY 3 SUM=SUM+X(I,K)*Y(K,J) 2 Z(I,J)=SUM 1 CONTINUE END Code like this is at the heart of most of the major portable Fortran libraries of mathematical software developed over the last twenty years. The declared leading dimensions of X, Y, and Z are not known until runtime. The Draft, like traditional C, disallows the equivalent void matmul(x,lx,cx,...) int lx, cx; double x[lx][cx] ; unless lx and cx are known at compile time. Equivalent functionality can be obtained by treating all arrays as one-dimensional and doing the subscripting "by hand", so that it's harder to get right and harder to optimize. Fix- ing C's handling of arrays would be a far worthier task for the X3J11 experts than (for instance) specifying details of the math library. Maybe there's some good fundamental rea- son why variables are disallowed as array dimensions; C's treatment of arrays has always been confusing. Note that the section numberings in the Draft and Ra- tionale are out of synch in the declarations subchapter. Recommendation: Fix, by language design if necessary, the treatment of arrays in C to properly accommodate multiply-dimensioned arrays whose dimensions vary at run time. The goal is not to duplicate Fortran's features ex- actly, but rather to insure that portable linear algebra li- braries are as easy to create in C as in Fortran. Array proposal ----- -------- Richard O'Keefe has kindly provided the following out- line of how an array facility might work in C: The smallest possible change would be "conformant arrays" much as in ISO Pascal or in Turing, where the param- eter declaration was something like void matmul(double a[p][q], b[q][r], c[p][r]) { int i, j, k; double t; for (i = p; --i >= 0; ) for (j = r; --j >= 0; ) { t = 0.0; for (k = q; --k >= 0; ) t *= a[i][j]*b[j][k]; c[i][j] = t; } } If p, q, and r are #defined to be constant expressions, this is already legal, so we need one more thing to indicate that p,q,r are being defined here, not used. Consider the fol- lowing: declarator: ... | declarator '[' subscript_spec ']' ; subscript_spec: '?' identifier | constant_expression | /* empty */ ; where /* empty */ is only allowed as the first subscript_spec, and ?id is only allowed in a function header. To avoid having to specify what happens if ?x ap- pears several times but the arguments don't agree with that, make it illegal, so the first suggestion would have to be written void matmul(double a[?ar][?ac], /* ar >= p, ac >= q */ double b[?br][?bc], /* br >= q, bc >= r */ double c[?cr][?cc], /* cr >= p, cc >= r */ int p, /* 0 <= p <= min(ar,cr) */ int q, /* 0 <= q <= min(ac,br) */ int r) /* 0 <= r <= min(bc,cc) */ { int i, j, k; double t; for (i = p; --i >= 0; ) for (j = r; --j >= 0; ) { t = 0.0; for (k = q; --k >= 0; ) t *= a[i][j]*b[j][k]; c[i][j] = t; } } A conformant array may be passed as a parameter to a func- tion provided the function's prototype was visible to con- firm that a conformant array parameter was intended. The simplest way of treating sizeof is to rule it out: if the description of 'a' involves any ?s, you can't apply sizeof to it. So given a parameter float fred[?f1][?f2][10]; sizeof fred and sizeof fred[1] would be illegal, but sizeof fred[1][2] and sizeof fred[1][2][3] would be legal. David Hough ARPA: dhough@sun.com UUCP: {ucbvax,decvax,decwrl,seismo}!sun!dhough
gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/20/88)
In article <42529@sun.uucp> dgh%dgh@Sun.COM (David Hough) writes: > The Draft, like traditional C, disallows the equivalent > void matmul(x,lx,cx,...) > int lx, cx; > double x[lx][cx] ; Yes, indeed, this is a pain. I discussed the possibility of allowing something like this several years ago with Dennis Ritchie and Steve Johnson, and the consensus was that it was "doable", although specifying it correctly is tricky. > void matmul(double a[?ar][?ac], /* ar >= p, ac >= q */ > double b[?br][?bc], /* br >= q, bc >= r */ > double c[?cr][?cc], /* cr >= p, cc >= r */ > int p, /* 0 <= p <= min(ar,cr) */ > int q, /* 0 <= q <= min(ac,br) */ > int r) /* 0 <= r <= min(bc,cc) */ I didn't see how this proposal would work. For correct code to be generated, all but one of the array dimensions has to be known via the calling sequence. There seem to be only two ways to achieve this: (1) rely on explicit dimension parameters, as in the first example; (2) automatically pass extra size information along with the array address. The second approach is incompatible with current treatment of arrays, unless the function prototype has parameters declared some special way (perhaps using yet another type qualifier), so that the compilers will pass array information specially for the particular function. In other words, to make method (2) work, more must be involved than just the function definition. Is this what was intended by your second example? (Since the body of the example didn't make use of ar, ac, etc., it's hard to be sure.)
roy@phri.UUCP (Roy Smith) (02/21/88)
I'm not sure I follow all the details of David's note, but I thought I would throw in my suggestion anyway. Some time ago I had need for variable-dimension arrays in C. What I ended up using was vectored arrays. I wrote an array allocator which called malloc to get memory for the main array and for the row address vector, and which initialized the vector to point to the right places in the main array. After that, you could pretend you had a variable dimensional array; i.e. you could pass a subroutine the pointer returned by arrayalloc and the dimensions you used. It was posted a few years back to mod.sources. -- Roy Smith, {allegra,cmcl2,philabs}!phri!roy System Administrator, Public Health Research Institute 455 First Avenue, New York, NY 10016
ok@quintus.UUCP (Richard A. O'Keefe) (02/22/88)
Background: David Hough sent me his comments on dpANS C. He wants a language which can do at least as much as Fortran could. One of his criteria was that you should be able to write subroutines that handle multi-dimensional arrays sensibly. I wrote back to him about some of his points. I started to argue that this criterion required much too big a change to the language, and ended up talking myself out of that position. David Hough sent (part of) my very sketchy proposal to comp.lang.c. I should point out that *no* part of the suggestion is original to me. Conformant arrays are part of ISO Pascal, and Turing has them. Doug Gwyn writes: > I didn't see how this proposal would work. For correct code > to be generated, all but one of the array dimensions has to > be known via the calling sequence. There seem to be only > two ways to achieve this: (1) rely on explicit dimension > parameters, as in the first example; (2) automatically pass > extra size information along with the array address. The > second approach is incompatible with current treatment of > arrays, unless the function prototype has parameters declared > some special way (perhaps using yet another type qualifier), > so that the compilers will pass array information specially > for the particular function. In other words, to make method > (2) work, more must be involved than just the function > definition. Is this what was intended by your second example? > (Since the body of the example didn't make use of ar, ac, etc., > it's hard to be sure.) [ there was meant to be an assert() checking the comments, but I forgot. Sorry about that, but it wasn't ready for "publication". ] The answer is that I wrote that it would not be legal to call a function with a conformant array parameter unless the actual function definition or a prototype were in scope, so that the compiler would know to pass the dimension information. This is easily checked by a tool like "lint": simply mark in the symbol table each function which is called through through an implicit or old-fashioned declaration, and complain about any marked function which has a conformation array parameter. Functions which take a variable number of parameters are already subject to just such a restriction. The declaration of conformant dimension parameters "? id" is precisely the "special way" of declaration which Doug Gwyn rightly claims to be necessary. And yes, conformant array parameters *are* incompatible with the current treatment of arrays. If you have p[D1]...[Dn] then p[E1]...[Ek] would be a pointer only if none of Dk+1,...,Dn were a conformant dimension parameter (?id) [that's "only if", not "if and only if"], and sizeof p[E1]...[Ek] would be legal if and only if none of Dk+1,...,Dn were a conformant dimension parameter. Oddly enough, a weak spot in C comes to our aid: Pascal has this nasty problem: what does procedure p(var a: array [la..ua: integer] of char; var b: array [lb..ub: integer] of char); begin a := b; end {p}; mean? C wouldn't have that problem, because it lacks array assignment. I have managed to convince myself that o David Hough is right: something like this IS necessary if one is to be able to write mathematical libraries in C with at least the same generality as Fortran (e.g. EISPACK, LINPACK, IMSL, NAG, ...) o conformant array parameters, though a *hack*, are at least a hack which is known to work o conformant array parameters can be added to C without major distortion, though they would be exceptions to "sizeof" and quiet-conversion-to-pointer I have NOT yet managed to convince myself that ? conformant array parameters should be added to C but I think one could make a considerably better argument for this addition than for 'noalias'.
chris@trantor.umd.edu (Chris Torek) (02/22/88)
In article <676@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: >Background: > David Hough sent me his comments on dpANS C. > He wants a language which can do at least as much as Fortran could. > > One of his criteria was that you should be able to write > subroutines that handle multi-dimensional arrays sensibly. I would claim that the `best' way to do this is to use the row-vector approach. To create a two-dimensional array, e.g., do the following: typedef struct dmat2 { int m2_nrows; int m2_ncols; double **m2_m; /* row vectors */ double *m2_data; } dmat2; dmat2 * create(rows, cols) register int rows, cols; { register dmat2 *m; register int r; m = (dmat2 *)malloc(sizeof(dmat2)); if (m == NULL) return (NULL); m->m2_rows = rows; m->m2_cols = cols; m->m2_m = (double **)malloc(rows * sizeof(double *)); if (m->m2_m == NULL) { free((char *)m); return (NULL); } m->m2_data = (double *)malloc(rows * cols * sizeof(double)); if (m->m2_data == NULL) { free((char *)m->m2_m); free((char *)m->m2_data); } for (r = 0; r < rows; r++) m->m2_m[r] = &m->m2_data[r * cols]; return (m); } These arrays can then be used as m->m2_m[r][c] = val; and parts of the array can be passed to functions with f(&m->m2_m[off], rows - off, cols); Note that a whole matrix can be passed (by reference) with f(m); If you dislike having to specify the range for subarrays, you could create a `make subarray' function: dmat2 * makesub(m, roff, coff, rsize, csize) dmat2 *m; int roff, coff, rsize, csize; { if (m->m2_rows - roff < rsize || m->m2_rows - coff < csize) /* it is not that big! */... etc. } This does, of course, require more memory to describe, but row vectors are generally faster anyway. Since the whole matrix is still there as a single block of memory, you can use existing routines as well (by passing m->m2_data). So why is this solution insufficient? -- In-Real-Life: Chris Torek, Univ of MD Computer Science, +1 301 454 7163 (hiding out on trantor.umd.edu until mimsy is reassembled in its new home) Domain: chris@mimsy.umd.edu Path: not easily reachable
gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/22/88)
In article <676@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
-I have NOT yet managed to convince myself that
- ? conformant array parameters should be added to C
-but I think one could make a considerably better argument for this
-addition than for 'noalias'.
Send this in as a formal public comment and I'll support it.
gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/22/88)
In article <2336@umd5.umd.edu> chris@trantor.umd.edu (Chris Torek) writes: >So why is this solution insufficient? Because it's a pain in the lower part of the anatomy! Or, in other words, just because it is theoretically sufficient does not mean that it is convenient. This is one of the limitations of C that Fortran programmers have difficulty with; they're used to being able to write reasonable array manipulation routines.
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (02/23/88)
> >So why is this solution insufficient? > > Because it's a pain in the lower part of the anatomy! > Or, in other words, just because it is theoretically sufficient > does not mean that it is convenient. > This is one of the limitations of C that Fortran programmers > have difficulty with; they're used to being able to write > reasonable array manipulation routines. How about matrix assignment (a = b). If you have matrices contiguously in memory you can do a "single" bcopy to move the data. The optional representation doesn't allow this. Matrix addition (a = b + c). while (--size >= 0) *a++ = *b++ + *c++; Questionable whether this is actually faster than a[i][j] method on the non-contiguous matrix. A pain that I'm faced with is transferring data between machines. If the machines are of the same type, I do some heavy optimizations on data transfer. In the case of the matrices, I write the whole matrix at once into a message. If I where to use the optional representation I couldn't perform this optimization. -- Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/23/88)
In article <929@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
- How about matrix assignment (a = b). If you have matrices
-contiguously in memory you can do a "single" bcopy to move the data.
-The optional representation doesn't allow this.
We weren't proposing to make [][] objects have non-contiguous storage.
ok@quintus.UUCP (Richard A. O'Keefe) (02/23/88)
In article <2336@umd5.umd.edu>, chris@trantor.umd.edu (Chris Torek) writes: > In article <676@cresswell.quintus.UUCP> ok@quintus.UUCP > (Richard A. O'Keefe) writes: > >Background: > > David Hough sent me his comments on dpANS C. > > He wants a language which can do at least as much as Fortran could. > > > > One of his criteria was that you should be able to write > > subroutines that handle multi-dimensional arrays sensibly. > > I would claim that the `best' way to do this is to use the row-vector > approach. To create a two-dimensional array, e.g., do the following: > This does, of course, require more memory to describe, but row vectors > are generally faster anyway. Since the whole matrix is still there > as a single block of memory, you can use existing routines as well > (by passing m->m2_data). > > So why is this solution insufficient? I like the approach Torek recommends, having met this approach on the B6700. But the sad fact of the matter is that C doesn't support it. C *will* let you do it yourself, but it *won't* do it for you. Torek's own example is a better illustration of how much *work* it is to use this approach in C than I could have come up with, and the result doesn't look much like ordinary array access in C. The point of the conformant array suggestion is that o it is a proven technique o for hacking around a basic design flaw in Pascal that is also present in C o it lets you use array parameters *directly* Torek is absolutely right that "row vectors" are a very sensible implementation technique, but having the machinery show through is not a sensible >language< technique. I'm sure that something better than conformant arrays can be done. But we're looking for something that makes it EASY to pass array parameters and EASY to use them, nearly as easy as passing fixed size arrays would be. We're also looking for something which will be the same for all C programmers: if I want to pass an array to a function I got from Fred, and to pass the same array to a function I got from Joe, I really do *not* want to find out that Fred used row pointers and Joe used column pointers.
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (02/23/88)
> > We weren't proposing to make [][] objects have non-contiguous storage. Excuse me Chris. I glanced over your post and ASS-u-ME-d that you were proposing the array of pointers to arrays solution without carefully looking at you code. On second glance I noticed that you allocate the whole memory block contiguously hence my conplaints are no valid. I'll go crawl back into my whole now. 8-) -- Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
ERICMC@USU.BITNET (Eric McQueen) (02/24/88)
Summary: Standard run-time function would be a better solution. Problems with conformant arrays: - Complicates what should be a simple compiler (we don't want ADA, please!) - Does not allow for allocation of matrices at run time (unless I mis- understand the proposal) -- why make C as bad as Fortran? I believe the addition of a couple of simple functions to the standard library would provide more power than conformant arrays, is more within the "spirit of C", and does not complicate the compiler (I can provide source for these routines written in portable C -- no extra effort required when developing the compiler for a new environment). In message <42529@sun.uucp> David Hough and/or Richard O'Keefe write: > I know no C translation that's as clear as the follow- >ing Fortran code: > SUBROUTINE MATMUL(X,LX,Y,LY,Z,LZ,NX,NY,NZ) > REAL X(LX,*),Y(LY,*),Z(LZ,*) >[arrays are accessed via "X(I,J)"] >[then proposes the following C syntax] > void matmul(double a[?ar][?ac], /* ar >= p, ac >= q */ > double b[?br][?bc], /* br >= q, bc >= r */ > double c[?cr][?cc], /* cr >= p, cc >= r */ > int p, /* 0 <= p <= min(ar,cr) */ > int q, /* 0 <= q <= min(ac,br) */ > int r) /* 0 <= r <= min(bc,cc) */ >[arrays are accessed via "a[i][j]"] >[mentions having to disallow sizeof() in some cases] As has been pointed out several times [recently in <2336@umd5.umd.edu> by Chris Torek (chris@trantor.umd.edu)], the following is already possible: void matmul( double **a, double **b, double **c, int p, int q, int r) { /* arrays are accessed via "a[i][j]" */ and is much simpler. The only problem I see with this is that it is difficult to declare such an array in the calling procedure. Several people have proposed methods to allocate "row vector" arrays. I feel that some such method should be standardized by the committee and routines to handle it be provided in the standard library. The allocation of arrays would be as simple as: float ***m; /* 3-Dimensional Matrix ALLOCation: */ m = d3malloc( sizeof(float), 2, 3, 4 ); or: float **m; int d[] = { 2, 3, 4 }; /* Multi-Dimensional Matrix ALLOCation: */ m = mdmalloc( sizeof(**m), 3, d ); Both of which would yield: m: &a &b 0 /* Pointers ("0" means "NULL"): */ a: &c &d &e 0 b: &f &g &h 0 c: 000 001 002 003 /* Elements (must be contiguous): */ d: 010 011 012 013 e: 020 021 022 023 f: 100 101 102 103 g: 110 111 112 113 h: 120 121 122 123 where "102" represents where "m[1][0][2]" is stored. We allocate room for 11 pointers in a contiguous block and for 24 numbers in a contiguous block. The space required for the pointers would on the order of that required if you increased the last array dimension by 1 (not too bad). The null pointers are included to allow efficient traversal of the entire array. Of course, this method doesn't provide for the Fortran trick of: real a(10,20), b(9,14), c(5,11) call matmult( a(2,3), 10, b(4,1), 9, c(2,5), 5, 2, 2, 2 ) for getting at strange sub-matrices. But I think you should do this via: void matmult( a, ar, b, br, c, cr, p, q, r ) float *a, *b, *c; int ar, br, cr, p, q, r; #define A(i,j) a[ar*i+j] which is what Fortran is really doing anyway. I plan to send a more detailed proposal along these lines to the committee. Any suggestions about my choice of allocation style would be appreciated. For the sake of definateness, here are the basic prototypes: void *mdmalloc( size_t siz /* Size of elements in the array. */ , int ndim /* Number of dimensions. */ , int d[] /* Maximum dimensions "arr[ d[0] ]...[ d[ndim-1] ]". */ ); void *d2malloc( size_t siz, int d1, int d2 ) { int d[2]; d[0] = d1; d[1] = d2; return( mdmalloc( siz, 2, d ); } void *d3malloc( size_t, siz, int d1, int d2, int d3 ) { int d[3]; d[0] = d1; d[1] = d2; d[2] = d3; return( mdmalloc( siz, 3, d ); } void mdmfree( void *arr, int ndim ); #define d2free( arr ) d2free( arr, 2 ) #define d3free( arr ) d2free( arr, 3 ) I think some strange systems would want to provide special versions that deal only with certain data types if, say, sizeof(float *) < sizeof(void *): float **d2falloc( int d1, int d2 ); double ***d3dalloc( int d1, int d2, int d3 ); But I think most systems would just use: #define d3dalloc(d1,d2,d3) (double ***) d3malloc( sizeof(double), d1, d2, d3 ) Is there any reason to: *Explicitly* provide for allocating multi-dimensional arrays of anything other than float, double, int, or long? Force non-strange systems to provide {md,d2,d3}{f,d,i,l}alloc()? Provide for the allocation of rows separately in case of fragmented memory? Make the inclusion of the null pointers optional or drop them altogether? Include the dimensions or number of dimensions in the array structure? Not include these routines in the standard? (no-cost added option?) Comments not of general interest to the net could be mailed directly to me. I will summarize significant comments. (Comments not of significant interest should just be shouted at your roommate/boss/dog.) --- Eric Tye McQueen Mathematics Department Also at (after some ericmc@usu.bitnet Utah State University time in March): (801) 753-4683 Logan, Utah 84322-3900 ericmc@usu.usu.edu UUCP: ...{uunet,psuvax1}!usu.bitnet!ericmc "Nothing is obvious CSNET: ericmc%usu.bitnet@relay.cs.net unless you're over- Arpa: ericmc%usu.bitnet@cunyvm.cuny.edu looking something."
bvs@light.uucp (Bakul Shah) (02/25/88)
Multi-dimensional arrays are a useful feature and David Hough / Richard O'Keefe's proposal of conformant arrays is usable & implementable. But the proposed conformant arrays are not elegant and, in my opinion, it is a *bad* idea to add them to C (atleast as bad as adding structure comparison or nested procedures or name-your-favorite-non-C-feature). Such things are best left for C++. No new features are needed to implement multi-dimensional arrays in C++, Chris Torek's ``row vectors'' would do just fine AND, and the machinery will be hidden (what O'Keefe wants in a >sensible< language). If they are added to C, C++ will be forced to carry them (thereby introducing an unneeded hack) OR, C++ will *not* be upward compatible to C. Since conformant arrays do not exist in C today, only new code will use this feature (if added). Why not use C++ or Chris Torek's idea for new code? C just doesn't have the bandwidth to do many sensible things. Attempts to increase this bandwidth in an elegant way is extremely difficult, if not impossible. My two cent's worth. -- Bakul Shah ..!{ucbvax,sun}!amdcad!light!bvs
kurtk@tekcae.TEK.COM (Kurt Krueger) (02/26/88)
Look at "The C Programming Language" by Kernighan & Ritchie on p. 110. If you set up your array like int *b[10] you can treat it as a conformal 2-d array without having to pass ANY bounds information (except for whatever the procedure logic requires).
henry@utzoo.uucp (Henry Spencer) (02/26/88)
> Recommendation: Fix, by language design if necessary, > the treatment of arrays in C to properly accommodate > multiply-dimensioned arrays whose dimensions vary... This is an interesting idea, and fortunately you are just in time to have some chance of getting it into the next revision of C, five years or so hence. Here's what you do: (a) implement it in a C compiler (b) use it, and have others use it, for a few years (c) write it up and submit it formally I guarantee that if you follow these steps, IN THE ABOVE ORDER, X3J11 will at least listen attentively when you propose it. Note that writing up the proposal and submitting it comes after you've tried it out, *in C*, not before. (Yes, X3J11 has done a few things that have not been based on prior experience. Conformant arrays are arguably a better idea than, say, noalias. This is not much of a recommendation, however, since noalias is increasingly looking like the biggest mistake X3J11 has ever made.) Alternatively you might consider doing it in C++, which probably won't require any compiler changes at all. X3J11 has never claimed that C is the answer to all mankind's problems. You will be listened to *more* attentively if you can explain why C++ does *not* solve your problem. -- Those who do not understand Unix are | Henry Spencer @ U of Toronto Zoology condemned to reinvent it, poorly. | {allegra,ihnp4,decvax,utai}!utzoo!henry
ok@quintus.UUCP (Richard A. O'Keefe) (02/26/88)
In article <1988Feb24.165307.4938@light.uucp>, bvs@light.uucp (Bakul Shah) writes: > Multi-dimensional arrays are a useful feature and David Hough / Richard > O'Keefe's proposal of conformant arrays is usable & implementable. But ... > Such things are best left for C++. ... > If they are added to C, C++ will be forced to carry them (thereby > introducing an unneeded hack) OR, C++ will *not* be upward compatible > to C. Since conformant arrays do not exist in C today, only new code > will use this feature (if added). Why not use C++ or Chris Torek's idea > for new code? I am sorry to blaspheme against anyone's religion, but IHMO, C++ itself is an unneeded hack. Well, add a smiley to "unneeded", but "hack" I am serious about. C++ is the Fortran 8X of object-oriented languages. X3J3, in trying to put together a language with all sorts of things which are in themselves good, yet which contains Fortran 77 as a subset, have created a monster. C++ is just such a monster. The things that Stroustrup added to C are in themselves fine things, the trouble was that more things should have been left out. If C++ is not upwards compatible with C, so much the better, and about time too! There is something distinctly odd about letting the C++ tail wag the C dog. C is not the only language in the world. It would be so nice to have intercallability between ANSI C and ISO Pascal. It is particularly important not to commit to a particular device (such as Chris Torek's idea) so that an implementor can provide something which can be connected in a sensible fashion to Pascal or Turing or PL/I or even (via a wee bit of macro expansion) to Fortran. If I go to the trouble of coding a simplex algorithm in C, and run it on an IBM mainframe, I would like the C vendor to have been able to select an implementation of conformant array parameters which is compatible with PL/I. Or if I find that there is a better one already there in PL/I, I don't want to have written my program using edge-vectors so that I am *guaranteed* incompatibility with everything in sight! It is only too credible that conformant array parameters may not be the best way of achieving the goals of o minimal change to the language and implementation o making it straightforward to write functions with multidimensional array parameters o avoiding over-specification, so that intercallability with other language is not impaired more than we can help But it is CERTAIN that Bakul Shah's suggested alternatives > Why not use C++ or Chris Torek's idea for new code? do not meet those goals.
ark@alice.UUCP (02/26/88)
In article <694@cresswell.quintus.UUCP>, ok@quintus.UUCP writes: > The things that Stroustrup > added to C are in themselves fine things, the trouble was that more things > should have been left out. A challenge: name five things that ``should have been left out.''
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (02/26/88)
> Look at "The C Programming Language" by Kernighan & Ritchie on p. 110. If you > set up your array like int *b[10] you can treat it as a conformal 2-d array > without having to pass ANY bounds information (except for whatever the > procedure logic requires). But you have to access the array as if it where a n by 10 matrix. This method does not provide for n by m by .... arrays. At most you may leave the first dimension (the rows) unspecified. -- Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (02/26/88)
> > C is not the only language in the world. It would be so nice to have > intercallability between ANSI C and ISO Pascal. It is particularly > important not to commit to a particular device (such as Chris Torek's > idea) so that an implementor can provide something which can be connected > in a sensible fashion to Pascal or Turing or PL/I or even (via a wee bit of > macro expansion) to Fortran. If I go to the trouble of coding a simplex > algorithm in C, and run it on an IBM mainframe, I would like the C vendor > to have been able to select an implementation of conformant array parameters > which is compatible with PL/I. Or if I find that there is a better one > already there in PL/I, I don't want to have written my program using > edge-vectors so that I am *guaranteed* incompatibility with everything in > sight! The issue of intercallability procedures is no simple matter. You have to deal with the mapping of data types from one language to the next. A mapping of a data type between languages need not be trivial or computational inexpensive. Unions in C to the language of your choice is one example of a non-trivial mapping. Union in C are indiscriminate - you need not have a tag fields explicitly determining the data type of the union. Because of this you can't determine the equivalent representation in another language. Another example would be mapping structures in C to Fortran. Fortran doesn't have a niece equivalent. I seem to recall that FORTRASH :-) (or at least some versions) uses column major form for storing arrays and C uses row major form. To move multi-dimensional arrays (tensor) from one language to the next you must preform a transpose. This could get expensive for large matrices. However, I didn't miss part of what you were advocating which was common representation for data types. Well, let me add why I don't think this is a good idea - at least not yet and here's an example why. I seem to remember Simula uses row-vector representation for matrices. One of the advantages of this is that the matrix my have dynamic bounds. A disadvantage it that indexing becomes more expensives (a couple levels of indirection per matrix access). The point being made it that the choice of data representations is dictated by the design goals of the language. In many cases there are tradeoffs between flexablity and efficiency. -- Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
reeder@ut-emx.UUCP (William P. Reeder) (02/27/88)
In article <1491@tekcae.TEK.COM>, kurtk@tekcae.TEK.COM (Kurt Krueger) writes: > Look at "The C Programming Language" by Kernighan & Ritchie on p. 110. If you > set up your array like int *b[10] you can treat it as a conformal 2-d array > without having to pass ANY bounds information (except for whatever the > procedure logic requires). Since no one else has had the gumption to say it, I will. Can this discussion continue without cross-posting to comp.lang.fortran? I know it started out as a discussion of what fortran could learn from C, but it has long since deviated to be a pure C discussion. Thanks.
karl@haddock.ISC.COM (Karl Heuer) (02/27/88)
In article <8802240725.AA22255@jade.berkeley.edu> ERICMC@USU.BITNET (Eric McQueen) writes: >Summary: Standard run-time function would be a better solution. > >As has been pointed out several times [recently in <2336@umd5.umd.edu> by >Chris Torek (chris@trantor.umd.edu)], the following is already possible: >void matmul( double **a, double **b, double **c, int p, int q, int r) {...} >and is much simpler. The only problem I see with this is that it is >difficult to declare such an array in the calling procedure. Yes. >[X3J11 should standardize a function to allocate such row-vector arrays, e.g. > float **a = d2malloc( sizeof(float), nrows, ncols ); ] Now we have a problem. How does the routine know that the row vectors should be "float *" rather than some other pointer type? This design implicitly assumes that all pointers look alike. Now, you can (and do) assert that only "strange systems" have this problem, but that doesn't make it go away. If you insist that an ANSI implementation must supply d2malloc(), and that strange systems must therefore be non-ANSI, you're going against the X3J11 charter to work on as many systems as possible. If you allow an implementation to omit d2malloc(), it kind of defeats the purpose of putting it into the standard. To put it another way, if I know of a system where your function can't possibly be implemented, I can't comfortably use it in my portable programs, even if it somehow gets put in the standard. What can we salvage? As you mentioned, you could make explicit functions for the most common datatypes (int, long, float, double). The implicit rule that all struct pointers are equivalent allows you to do it for generic structs, too; this ought to be useful. Alternately, you could petition X3J11 to add this with the syntax float **a = d2malloc( float, nrows, ncols ); where the first argument is the type. On non-strange systems this has the simple implementation you outlined; the first argument is used only as a parameter to sizeof(). On strange systems, this would require assistance from the compiler via a builtin. (Cf. <stdarg.h>) Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
ok@quintus.UUCP (Richard A. O'Keefe) (02/27/88)
In article <973@PT.CS.CMU.EDU>, edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes: > The issue of intercallability procedures is no simple matter. You > have to deal with the mapping of data types from one language to the > next. A mapping of a data type between languages need not be trivial > or computational inexpensive. > I seem to recall that FORTRASH :-) (or at least some versions) uses column > major form for storing arrays and C uses row major form. To move > multi-dimensional arrays (tensor) from one language to the next > you must preform a transpose. This could get expensive for large > matrices. > > However, I didn't miss part of what you were advocating which > was common representation for data types. I begin to appreciate the problems that X3J11 have faced. NO! I was not advocating common representation of data types! At least, not in the standard as such. What I was advocating was that the standard should not go out of its way to PRECLUDE intercallability, which is a different point. The point about Fortran using column-major and other languages (Algol 60, Pascal, ADA, C, ...) using row-major is a rather old red herring. So a Fortran subroutine might use MAT(I,J) and the C caller would identify the exact same element as mat[j][i]. So what? ADA[*] uses the same order as C, and nobody seems to think ADA can't interface to Fortran. Since one is passing a description of an array to the other language, the cost of transposing the array proper is not an issue: what matters is what it costs to pass the description of the transpose. (I am *not* suggesting that descriptor-passing should be *required* or that the need for transposition is genuine.) Similarly, C would use a fixed lower bound of zero, but Turing would use a constant expression, and Pascal would be given two bounds. So what? Any reasonable bijection will do, it doesn't have to be the identity. (By the way, my spinor calculus teacher would kill you for confusing arrays and tensors. Fortunately, he's not here. :-) The C standard cannot possibly require intercallibility, because it has no authority over other languages. All I claim for the conformant array parameter approach is that it avoids over-specification, and, since this technique is already in use in other languages, is unlikely to make intercallability worse, and if any other mechanism were to be adopted, not *precluding* intercallability is a good criterion. Someone suggested taking the auxiliary-array-of-pointers approach and requiring it in the standard library. That approach may be a wonderful implementation method, but it is too specific. (C is already too much like Lisp, no point in making it worse.) The worst thing about that approach is that it requires explicit memory management. It's great that C permits explicit memory management, but *requiring* it does not come under the heading of "acts of kindness". Why should I have to explicitly manage a block of pointers to do a trivial thing that ALGOL 60 did 28 years ago, PL/I does, Pascal does, ... and so on. Oh well, someone promised me a copy of GNU CC, I guess I'd just better wait for the other half to arrive, then start hacking. [*] ADA is a birthmark of the DuD UFO.
ok@quintus.UUCP (Richard A. O'Keefe) (02/29/88)
In article <7715@alice.UUCP>, ark@alice.UUCP writes: > In article <694@cresswell.quintus.UUCP>, ok@quintus.UUCP writes: > > The things that Stroustrup > > added to C are in themselves fine things, the trouble was that more things > > should have been left out. > > A challenge: name five things that ``should have been left out.'' The challenge is easily met. But we have to nail down an ambiguity in "should have been left out" first. Given that Stroustrup *wanted* a high-powered language which was upwards compatible with C, he didn't have any choices at all about what to leave out: *nothing* could be left out. But the greatest virtue of C is that it doesn't get in your way much. It doesn't have a lot of positive virtues. This is no reflection on the designers of C. C wasn't *supposed* to be an early ADA, it was supposed to be a fairly minimal sort of tool. So when I said that "more things should have been left out", the question I was addressing is "what should a successor to C look like, which tries to stay as comprehensible as C, but which tries to make it easier to write correct programs?" I repeat that this is not the problem that Stroustrup was trying to solve, so my claim that some things we'd be better off without is not a reflection on Stroustrup. 1. Integer types as 'char', 'short', 'long' &c. This is KNOWN to be a cause of portability problems. It's fine when you are programming a particular machine, and want to be certain that things are the size you think they are. It's terrible for writing portable programs. These days, there is no good reason for a programming language to let the machine dictate the range of integers. (The machine does dictate what range of integers is *efficient*, but that's another question.) SUN didn't let the fact that MC68010s have no 32-bit multiply instruction dictate to them that int==short, so why should the absence of 48-bit integer instructions mean that I can't have a variable capable of holding 48-bit integers? Yes, C++ will let me define such a data-type, and will let me define operations on it. But it's my job, and it should be the compiler's. 2. Identification of integer and boolean types. This is a case of over-specification. On some machines, it might be more efficient to use (< 0)=true,(>=0)=false. It is also a cause of confusion. Suppose that integer and boolean were distinct types. Then if (i = 0) ... would be a type error. Similarly, the quiet conversion of pointers and floating-point numbers to boolean (in each case, equality to 0 is falsehood and difference from 0 is truth) is something we'd be better off without. 3. Implicit 'int'. "register i;" is a legal C declaration, hence legal in C++. Things used as functions are implicitly "int foo(...);". In fact, to keep the number down, let's include old-fashioned function declaration syntax. 4. Untagged unions. Not only does C++ have C's unions, it goes out of its way to make it easier to trip over your own feet. I refer to anonymous unions. I could write a hymn of praise to tagged unions, discrimination case statements, and polymorphic types, and an execration text for untagged unions, but let's keep it down. 5. Identification of 'endcase' and 'break'. Possibly the most frequent reason for the presence of a label in my C code is that C uses the same symbol for "finish case" and "finish loop". This is particularly bewildering, because BCPL had different symbols for the two ('endcase' and 'break'), and it didn't make the BCPL compiler any more complicated. In fact, let's broaden this to the switch statement as a whole. A statement like switch (i) { int i; { int i; case 1: ...; } if (...) case 2: ...; else case 3: ...; } is in fact a legal C statement, and from my reading of the C++ book it is a legal C++ statement. (The book explicitly says that the controlled statement doesn't have to be compound, and *may* contain declarations.) 6. C-style arrays. I can go into more detail if anyone's interested. The problem is that the size of an array is part of its type, sort of. It's all very confusing, really. SUMMARY. Each of the things I have mentioned is an obstacle to the writing of reliable and portable programs. Number 6 in particular is responsible for the large number of UNIX utilities which die horribly in the input data exceed some undocumented limit (and the rather wierd limits that *are* documented, see xargs(1)). Each of them is something that Stroustrup had to include in C++ *if* it was to be upwards compatible with C. Which is why I want to use "D", not "C++".
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (02/29/88)
In article <708@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes: > > 4. Untagged unions. > > Not only does C++ have C's unions, it goes out of its way to make > it easier to trip over your own feet. I refer to anonymous unions. > > I could write a hymn of praise to tagged unions, discrimination > case statements, and polymorphic types, and an execration text > for untagged unions, but let's keep it down. > There are tradeoffs in requiring or not requiring tag fields. It's the space vs readablity problem again. In the system that I maintain, I use indiscriminate unions for space and time reasons. I use arrays of indiscriminate unions (TOKENs) to basically represent C structures. There are a number of fields predefined by the system for systems use. One of these fields is an index into a table that discribes the structure of the TOKEN. So the tag fields of the union can be view as external to the union that they are associated with in my example. This save a lot of space. Time savings comes into play when transferring these guys over the network. I only transfer the data object, and no tag field information. Again tables are used to discriminate the object coming across the net. BTW I not gunning for you Rich. I know that I some of my posts may seem to be attacking your views. They really are not, I'm just adding my .02 to the discussion. And I have to say, I agree with a lot of your posts. (P.S. I think I have to get out my differential geometry notes and review tensor :-)) -- Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
msb@sq.uucp (Mark Brader) (03/01/88)
Karl Heuer (karl@haddock.ima.isc.com) writes: > ... Alternately, you could petition X3J11 to add this with the syntax > float **a = d2malloc( float, nrows, ncols ); > where the first argument is the type. ... [in general] > ... this would require assistance from the compiler via a builtin. > > Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint Well, there is also this: float **a; d2malloc (a, float, nrows, ncols); with #define d2malloc(var,type,nr,nc) { \ int _i, _j = (nc);\ (var) = ((type) **) malloc ((_i = (nr)) * sizeof ((type) *));\ while (_i) (var)[--(_i)] = ((type) *) malloc (sizeof ((type)));\ } [Not tested; checks of return status of malloc() omitted for brevity] But I find this at least equally unsatisfactory. Certainly there is nothing in the Draft now that requires a macro that may expand to a statement, and the above can't be done without temporary variables, thus requiring {...}. And besides all THAT, what do you do when you want a *3*-dimensional array? To me, the choice is between the Fortran approach and no change. Mark Brader "Howeb45 9 qad no5 und8ly diturvrd v7 7jis dince SoftQuad Inc., Toronto 9 qas 8mtillihemt mot ikkfavpur4d 5esoyrdeful utzoo!sq!msb, msb@sq.com abd fill if condif3nce on myd3lf." -- Cica
cabo@tub.UUCP (Carsten Bormann) (03/01/88)
In article <694@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
() C++ is the Fortran 8X of object-oriented languages.
What a piece of nonsense.
[For the uninitiated: C++ is not an ``object-oriented language'', it is
a language that retains the spirit of C and that among other (often more
important) improvements on C facilitates object-oriented programming. If
you compare the position of C++ in the development of the C language to
the position of a FORTRAN dialect in the history of the FORTRAN language,
you're better off using FORTRAN-77 as a reference. Somehow, the recent
comp.lang.c discussions about ``D'' (e.g. about conformant arrays) seem
to have picked up the spirit of FORTRAN 8X.]
--
Carsten Bormann, <cabo@tub.UUCP> <cabo@db0tui6.BITNET> <cabo@tub.BITNET>
Communications and Operating Systems Research Group
Technical University of Berlin (West, of course...)
Path: ...!pyramid!tub!cabo from the world, ...!unido!tub!cabo from Europe only.
bvs@light.uucp (Bakul Shah) (03/01/88)
In an earlier article I wrote that multi-dimensional arrays are best left out of ANSI C and for C++. Richard A. O'Keefe <ok@quintus.UUCP> responded: >I am sorry to blaspheme against anyone's religion, but IHMO, C++ itself >is an unneeded hack. Well, add a smiley to "unneeded", but "hack" I am >serious about. C++ is the Fortran 8X of object-oriented languages. > ... >It is only too credible that conformant array parameters may not be the >best way of achieving the goals of > o minimal change to the language and implementation > o making it straightforward to write functions with multidimensional > array parameters > o avoiding over-specification, so that intercallability with other > language is not impaired more than we can help >But it is CERTAIN that Bakul Shah's suggested alternatives >> Why not use C++ or Chris Torek's idea for new code? >do not meet those goals. Somewhat different set of goals make sense to me. 1. Avoid any extensions to C at this point -- this would only delay the the standard. Conformant array parameters or any other means of making multi-dimensional arrays a first class object would be a major extension. Leave such extensions to a successor language (I used C++ in this sense, as an *example* -- I don't care to join a discussion of whether C++ is an abortion or an immaculate conception). 2. A successor language, call it NEXT(C), should first clean up C and provide a more regular set of features. 3. For NEXT(C) minimal change is not as critical as making all new features orthogonal (to get maximal use out of a minimal number of features). Conformant array parameters extension may be *minimal*, it certainly isn't as orthogonal as some other mechanisms. 4. Do not standardize a new feature until you have implemented and used it for a few years. Related topics of o what KIND of features should go in NEXT(C), o whether NEXT(C) should be upward compatible to C or a *subset* of C, o whether NEXT(C) should be a high level assembler (like C) or allow data abstraction facilities (like CLU does), and, o whether C++ fits the bill or not warrant further discussion, separate from this one. I bring up NEXT(c) mainly to point out that it will be a better language to add MD arrays (multi-dimensional arrays -- sorry, I can't think of a better name). In C there ARE some ways of implementing functions that take MD arrays. They may not be pretty but will get the job done. In my previous article I used Chris Torek's proposal as an *example*, not as the *right* way to implement MD arrays. Besides, in a modern language you would want more than one representation for MD arrays. BTW, MD arrays in C++ using a variation of Torek's idea is *not* an over- specification. It is just another piece of useful code, not an extension to the language. Now if someone proposed legislating it as the only way to do MD arrays in C++, I would be very surprised and definitely fight it. Intercallability is not an issue since an MD array can be mapped one-to- one from one language to another but you will need some glue in some cases; even Pascal and Fortran are not intercallable without some glue. But compatibility with Pascal's conformant array parameters can not be a desirable goal -- conformant arrays are a hack in *any* language! -- Bakul Shah ..!{ucbvax,sun,py
ok@quintus.UUCP (Richard A. O'Keefe) (03/01/88)
In article <996@PT.CS.CMU.EDU>, edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes: > In article <708@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes: > > I could write a hymn of praise to tagged unions, discrimination > > case statements, and polymorphic types, and an execration text > > for untagged unions, but let's keep it down. > > [Recall that this started with me saying that C/C++ compatibility didn't strike me as important because I thought C++ would be a better language without some of C's features, and having been challenged to name five such features, I named six. C's unions are what they are, and a good C programmer will learn to live with them. I do not advocate changing C. ] Wyatt writes: > There are tradeoffs in requiring or not requiring tag fields. > It's the space vs readablity problem again. I can't really agree. In Wyatt's own example, he makes it plain that the information as to what his unions are is held in a field, it's just that the field is somewhere else. If you are storing information about a union somewhere, what's so terrible about storing it as part of the union? There is something of a cultural conflict here. My basic preference is for languages like Simula and ML, where there is no such thing as an untagged union, and you don't have to debug using-the-wrong-union bugs because they just can't happen. This goes with a view of what a union *is*, which says that 'union' ("sum" types) is the categorial dual of 'struct' ("product" types), that the "fields" of a union are injection functions, and it goes with a pattern-matching programming style (the Simula INSPECT statement, for example). On the other hand, C has nothing but untagged union, which goes with a quite different view of what a union is, namely that it is a storage overlay. Pascal tried to compromise, which meant that it offered neither the security of sum types nor the storage efficiency of untagged unions. I think everyone will admit that sum types are theoretically clean, and that union types are riskier. Further, sum types are better news for optimising compilers. I think we can discount the "space saving" argument: if space saving were of paramount concern in C, it would have dynamically allocated arrays, not fixed-size arrays. As for transmitting things over networks, presumably there is some reason why SUN's XDR coding doesn't pack things down to byte, but works in multiples of 32 bits... C's unions are what they are, and there are legitimate uses of them. But are they a desirable feature in a successor to C? Perhaps we can steer this discussion back to something of more immediate use to C programmers: the six points I listed in the message Wyatt quoted: o do other C programmers experience them as nuisances? o how do people work around them?
pardo@june.cs.washington.edu (David Keppel) (03/02/88)
In article <1988Feb29.205138.28452@sq.uucp> msb@sq.UUCP (Mark Brader) writes: > >Karl Heuer (karl@haddock.ima.isc.com) writes: >> ... Alternately, you could petition X3J11 to add this with the syntax >> float **a = d2malloc( float, nrows, ncols ); >> where the first argument is the type. ... [in general] >> ... this would require assistance from the compiler via a builtin. > >Well, there is also this: > > float **a; > d2malloc (a, float, nrows, ncols); > >with > #define d2malloc(var,type,nr,nc) { \ > int _i, _j = (nc);\ > (var) = ((type) **) malloc ((_i = (nr)) * sizeof ((type) *));\ > while (_i) (var)[--(_i)] = ((type) *) malloc (sizeof ((type)));\ > } >[Not tested; checks of return status of malloc() omitted for brevity] >But I find this at least equally unsatisfactory. Or possibly: float **a = (float **) dmalloc( dims, dimsvals, ptrsizes ); where "dims" is an array of n+1 ints for n dimensions, zero terminated, sizebase is the sizes of the intermediate objects: static int dismvals[] = { 5, 6 } static int ptrsizes = { sizeof(float), sizeof(float *) } This assumes that the important thing about the intermediate type pointers is their size and not their format. While (I think) this is mostly portable, I can bet that it isn't entirely portable. I'm not sure that making it a builtin would make it any more portable. ;-D on (Just my $0.02 in real *copper* pennies) Pardo
karl@haddock.ISC.COM (Karl Heuer) (03/02/88)
In article <4340@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes: > >Karl Heuer (karl@haddock.ima.isc.com) writes: > >> ... Alternately, you could petition X3J11 to add this with the syntax > >> float **a = d2malloc( float, nrows, ncols ); > >> where the first argument is the type. ... [in general] > >> ... this would require assistance from the compiler via a builtin. > >Or possibly: [pass an array of { sizeof(float), sizeof(float *) }] >This assumes that the important thing about the intermediate type pointers >is their size and not their format. There are systems where (char *) and (int *) have the same size but different formats. If the type in question is (char[4]), I think the distinction would be lost completely in your model. >I'm not sure that making it a builtin would make it any more portable. Assuming that there are only a finite number of pointer formats, the compiler could have a builtin `__typeof(type)` which returns a magic number. The macro could be defined as #define d2malloc(type, nr, nc) _d2malloc(__typeof(type *), nr, nc) and the library routine could be written void *_d2malloc(int typecode, size_t nr, size_t nc) { switch (typecode) { case __typeof(char *): ...; case __typeof(int *): ...; } } This seems like a fairly clean solution, and it would make the usage portable. (Recall that we were assuming that X3J11 would require this to be supported somehow. The above shows that my model could be supported on a strange architecture, in contrast to the original proposal.) (I did assume that appending "*" yields a valid type; this restriction is similar to one already in place in <stdarg.h>. It's still possible without that restriction, but I find the implementation a bit less clean.) Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
throopw@xyzzy.UUCP (Wayne A. Throop) (03/04/88)
> karl@haddock.ISC.COM (Karl Heuer) >> pardo@uw-june.UUCP (David Keppel) >>Or possibly: [pass an array of { sizeof(float), sizeof(float *) }] > [...this is not sufficent because...] > There are systems where (char *) and (int *) have the same size but different > formats. If the type in question is (char[4]), I think the distinction would > be lost completely in your model. Right. Karl's proposed solution: > void *_d2malloc(int typecode, size_t nr, size_t nc) { > switch (typecode) { > case __typeof(char *): ...; > case __typeof(int *): ...; > } > } ... would work, and allow one to code functions to create such array structures for arbitrary levels of dimensions (albeit nonportably), but it is an interesting problem to come up with a compact way of defining these things to any depth of dimension with no extensions to current C. And try to get it coded portably. Hmmmm... (imagine Lurch-like sandpapery sound of fingertips above keyboard...) Howzabout this: A macro alloc_dimention( type, n, initial_value ), which returns a (type *) pointer to the initial element of an array of n elements of (type), each initialized to the given initial_value (the initial value being evaluated "by-name" for each element, of course). Thus, to get the two-D, N-by-M float array, one would say { float ** a = alloc_dimention( float *, N, alloc_dimention( float, M, 0.0 )); ... } and references to elements of the array are just a[i][j], and the whole shebang can be passed to subroutines, wherein references to a[i][j] would work just as well. To get a three-dimentional structure declared, just do the obvious: { float *** a = alloc_dimention( float **, I, alloc_dimention( float *, J, alloc_dimention( float, K, 0.0 ))); ... } Now, can this macro be implemented? Well... sort of. Consider: /* This include file defines the "alloc_dimension" macro, and associated support cruft. The macro takes a type, a dimension limit, and an initial value. An array n of type is allocated, each element of which is assigned the by-name value of the initial value supplied, and a pointer to the first element of the array is returned. This version assumes that setjmp.h is already included, and malloc already defined. */ typedef struct _lab_node_s { struct _lab_node_s * next; jmp_buf jbuf; void *p; int i; } _lab_node_t; static _lab_node_t *_lab_list = 0, *_cur_lab; static void *_cur_p; #define alloc_dimension( t, n, v ) \ (_cur_lab = (_lab_node_t *)malloc(sizeof(_lab_node_t)),\ _cur_lab->next = _lab_list, _lab_list = _cur_lab,\ _lab_list->p = malloc(sizeof(t)*(n)),\ _lab_list->i = 0,\ setjmp( _lab_list->jbuf ) < 2 ?(\ ((t*)_lab_list->p)[_lab_list->i] = (v),\ ++(_lab_list->i) < (n) ? longjmp(_lab_list->jbuf,1)\ : longjmp(_lab_list->jbuf,2),\ 0):0,\ _cur_lab = _lab_list,_lab_list = _lab_list->next,\ _cur_p = _cur_lab->p,free(_cur_lab),(t*)_cur_p) Of course, this has some unpleasant limitations - it isn't clear to me that setjmp and longjmp are guaranteed to work inside comma expressions like this (though I *think* they are), so this may not really be portable code - freeing the thing becomes a hassle, requires another macro - some errorchecking was omitted for brevity (whew) - insufficent error checking for values of dimention limits - the overkill of using setjmp - the tedium of using a dynamic frame mechanism when a simple static mechanism would do if only we had compile-time execution of code a-la lisp macros - the general cruftiness, slowness, and bulkiness of the code Now it seems to me that, allowing non-constant expressions in the bounds of formal array arguments is the minimal, conservative, non-dope-vector, covers-most-bases solution. That is: g(){ double a[I][J][K]; f( a, I, J, K ); } f(a,ilim,jlim,klim) double a[ilim][jlim][klim]; /*currently illegal*/ int ilim,jlim,klim; { int i,j,k; /* do something to each array element */ for( i=0; i<ilim; ++i ){ for( j=0; j<jlim; ++j ){ for( k=0; k<klim; ++k ){ ... a[i][j][k] ...; } } } } ... would be made legal. The problematic point that the type of the formal a isn't precisely known until runtime can be made a special case to lint, and be declared to have unknown behavior when the bounds of the actual passed don't match those of the formal at run-time. Probably the harshest problem is that pointer formats for arrays of arrays of... to any number of "arrays of" levels all be the same, which isn't true now. I'd say either leave it as it is and do such things with pointers, or take the above rather conservative step. But then, who listens to me? -- A LISP programmer knows the value of everything, but the cost of nothing. --- Alan J. Perlis -- Wayne Throop <the-known-world>!mcnc!rti!xyzzy!throopw
ok@quintus.UUCP (Richard A. O'Keefe) (03/04/88)
In article <662@xyzzy.UUCP>, throopw@xyzzy.UUCP (Wayne A. Throop) writes: > Now it seems to me that, allowing non-constant expressions in the bounds > of formal array arguments is the minimal, conservative, non-dope-vector, > covers-most-bases solution. That is: > > g(){ > double a[I][J][K]; > f( a, I, J, K ); > } > f(a,ilim,jlim,klim) > double a[ilim][jlim][klim]; /*currently illegal*/ > int ilim,jlim,klim; > { ... > } > > ... would be made legal. The problematic point that the type of the > formal a isn't precisely known until runtime can be made a special case (That's not a problematic point in my view; the bounds of an array *are* currently regarded as part of its type, *but* that is a major design error shared with Pascal.) What are the differences between this and the C.A.P. proposal, other than the fact that GNU CC is said to support it, so that this proposal *is* prior art, and the C.A.P. proposal isn't? (1) This is exactly the Fortran approach. There is lots of experience with it, and the numerical people for whose sake the C.A.P. proposal was made will already be familiar with it. (2) One of the benefits is that you can easily declare that several array parameters are the same size, which is not possible in the current version of the C.A.P. proposal. (3) A major problem is that you can get the arguments wrong. (4) A minor problem is that a C program may alter arguments like ilim, jlim, klim (though presumably lint would warn about this), so a compiler either has to be prepared to make a copy of them. Conformant array bound parameters are 'const int' by definition, so this problem cannot arise. The current version of the C.A.P. proposal replaces the rather ugly '?' marker by the alternative keywords 'auto' and 'register', so that one would declare f in the C.A.P. proposal as void f(double a[auto ilim][auto jlim][auto klim]) { ... } (Note that the type is implicitly 'int', and no explicit over-ride of this is possible.) I think conformant arrays are cleaner than the Fortran hack, but I'd be happy with either.
peter@sugar.UUCP (Peter da Silva) (03/06/88)
In article <42529@sun.uucp>, dgh%dgh@Sun.COM (David Hough) writes: > void matmul(double a[?ar][?ac], /* ar >= p, ac >= q */ > double b[?br][?bc], /* br >= q, bc >= r */ > double c[?cr][?cc], /* cr >= p, cc >= r */ > int p, /* 0 <= p <= min(ar,cr) */ > int q, /* 0 <= q <= min(ac,br) */ > int r) /* 0 <= r <= min(bc,cc) */ WOw. I wouldn't have believed it... I prefer Fortran's syntax on this one. void matmul(double a[ar][ac], int ar, int ac, double b[br][bc], int br, int bc, double c[cr][cc], int cr, int cc, int p, int q, int r) { } Or: void matmul(a, ar, ac, b, br, bc, c, cr, cc, p, q, r) int ar, ac, br, bc, cr, cc, p, q, r; double a[ar][ac]; double b[br][bc]; double c[cr][cc]; { } Doesn't require any new operators, doesn't require passing descriptors around on the stack (yech), and allows you to allocate some space and give it different dimensions in different places. -- -- Peter da Silva `-_-' ...!hoptoad!academ!uhnix1!sugar!peter -- Disclaimer: These U aren't mere opinions... these are *values*.
throopw@xyzzy.UUCP (Wayne A. Throop) (03/07/88)
> ok@quintus.UUCP (Richard A. O'Keefe) >> throopw@xyzzy.UUCP (Wayne A. Throop) >> Now it seems to me that, allowing non-constant expressions in the bounds >> of formal array arguments is the minimal, conservative, non-dope-vector, >> covers-most-bases solution. > What are the differences between this and the C.A.P. proposal, other than > the fact that GNU CC is said to support it, so that this proposal *is* > prior art, and the C.A.P. proposal isn't? > > (1) This is exactly the Fortran approach. There is lots of experience > with it, and the numerical people for whose sake the C.A.P. proposal > was made will already be familiar with it. This first point hits the nail directly on the head. At this stage of standardization, it seems a Bad Idea to build a paper-mache feature on top of a plaster-of-paris patch to a weak point in the original K&R version of the language. I mean, we are treading on ground related to array "promotion" to pointers to start with, and are tacking on dubious and unproven syntax in a newly-proposed portion of the language (prototypes), and an area for which alternative proposals are likely to be made (for handling passing arrays by-value, for example). This definitely seems the wrong time to be tinkering around in this particular area in this particular fashion. Allowing non-constant expressions in this context doesn't alter the syntax much, and seems a safer step to take. But of course, I'm *really* upset that nobody blanched at my demonstration of how to shoehorn a loop into the body of a value-returning macro, to get iterated by-name evaluation. I guess you folks have stronger stomachs that I gave y'all credit for... -- The LISP programmer knows the value of everything, but the cost of nothing. --- Alan J. Perlis -- Wayne Throop <the-known-world>!mcnc!rti!xyzzy!throopw