dougs@hcx2.SSD.HARRIS.COM (09/06/88)
Row and/or column major refers to the way arrays are stored in memory. This is
only relevant for arrays of two or more dimensions. Row major means that as you
move through *sequentially addressed* memory locations, the right-most subscript
of an array reference increases fastest. Column major means the left-most
subscript varies fastest. When you dealt with graphs or arrays in school, the
most usual way to envision them was probably row-major:

	a(1,1..4)   5 6 7 8
	a(2,1..4)   6 1 8 0
	a(3,1..4)   2 8 4 7
	...

It is interesting to note that almost all languages have array accessing defined
as row-major (C, Pascal, blah-blah-blah) but FORTRAN is column major. Why? ....

Douglas G. Scofield                     dougs@hcx2.ssd.harris.com
Harris Computer Systems Division
2101 W. Cypress Creek Rd.
Ft. Lauderdale, FL 33069
eugene@eos.UUCP (Eugene Miya) (09/09/88)
Still no one can answer why column-major over row? I suspect this had to do
with early implementations of compilers simply allocating space on the fly from
left to right. I'll ask Backus when I get the chance in a couple of weeks.

Another gross generalization from

  --eugene miya, NASA Ames Research Center, eugene@aurora.arc.nasa.gov
  resident cynic at the Rock of Ages Home for Retired Hackers:
  "Mailers?! HA!", "If my mail does not reach you, please accept my apology."
  {uunet,hplabs,ncar,decwrl,allegra,tektronix}!ames!aurora!eugene
  "Send mail, avoid follow-ups.  If enough, I'll summarize."
dik@cwi.nl (Dik T. Winter) (09/09/88)
In article <44600002@hcx2> dougs@hcx2.SSD.HARRIS.COM writes:
 > Row and/or column major refers to the way the arrays are stored in memory.
 > This is only relevant for 2D or greater dimensioned arrays.  Row major
 > means that as you move through *sequentially addressed* memory locations,
 > the right-most subscript of an array reference will increase fastest.
 > Column major means the left-most subscript varies fastest.
 ...
 > It is interesting to note that almost all languages have array accessing
 > defined as row-major (C, Pascal, blah-blah-blah) but FORTRAN is column
(There are many languages where it is *not* defined.)
 > major.  Why? ....

Well, Fortran is an old language, so compiler writers had to use older
technology. To see why people thought of column-major first, consider how an
array element is accessed. Given the declaration:

	DIMENSION X(15, 10)

to access X(I, J) the following address has to be computed:

	&X + I + J * 15 - 16

(Of course I and J may be expressions!) The simplest way to do this is:

	push(&X), push(16), push(I), push(15), push(J),
	R=pop()*pop(), R=R+pop(), R=R-pop(), R=R+pop()

So essentially the access code is in lexical order. To do this with row-major
order involves a non-lexical order or playing with the stack (which older
machines did not readily allow).

Pascal is row major because

	array [1..5,1..10] of integer;

is defined to be equivalent to

	array [1..5] of array [1..10] of integer;

Interestingly, for languages that use dope vectors for arrays (Algol 60,
Algol 68), it does not matter which way the array is stored. However, in
Algol 60, many implementations had facilities to do I/O on arrays, and it was
thought more pleasing to have the output row major. The Algol 68 compiler for
CDC Cybers has a flag to generate code for either row-major or column-major
order. And routines compiled one way could cooperate without problems with
routines compiled the other way! To see how this is done, let's first consider
dope vectors.
Assume in Algol 60 an array declared as

	real array a[l1:u1, l2:u2, l3:u3]

The address of a[i,j,k] is:

	&a + (i-l1)*(u2-l2+1)*(u3-l3+1) + (j-l2)*(u3-l3+1) + (k-l3)

The terms (u.-l.+1) are called strides. We can rearrange this to:

	&a + ((i-l1)*(u2-l2+1) + (j-l2))*(u3-l3+1) + (k-l3)

So we need the following values: &a, l1, (u2-l2+1), l2, (u3-l3+1) and l3. We
can do better (by semi-constant folding):

	&a - ((l1*(u2-l2+1) + l2)*(u3-l3+1) + l3)
	   + ((i*(u2-l2+1) + j)*(u3-l3+1) + k)

where the first part (&a minus the bounds term) is constant during the lifetime
of the array.

This must be adapted for Algol 68, because for arrays it is not necessary that
any dimension has stride 1. Given the declarations:

	# this is the long form declaration, it can be done shorter. #
	ref [,,] real a = loc [l1:u1, l2:u2, l3:u3] real;
	ref [,] real b = a[,,k];

Now b is a two-dimensional array that accesses part of a; and if we assume
row-major for Algol 68, no dimension of b has stride 1. The best way to deal
with this is to calculate the address of b[i,j] as follows:

	&b - (factor to compensate for lower bounds) + i*s1 + j*s2

where s1 and s2 are the strides for the first and second dimension. This is
better on machines that have more than one register available. But now it no
longer matters whether s2 evenly divides s1, and so the system may choose to
make s1 < s2. And this is only a function of the code that manipulates the
declaration; access remains the same, whichever way is chosen. It is a pity
that so modern a language as Ada did not go this way, and disallows the
selection of parts of multi-dimensional arrays.
--
Disclaimer: I am not a compiler writer.
--
dik t. winter, cwi, amsterdam, nederland
INTERNET : dik@cwi.nl
BITNET/EARN: dik@mcvax