physics@utcsstat.UUCP (08/09/83)
Here is a fragment of code based directly on p. 106-107 of Kernighan & Ritchie:

    #define SIZE   30000
    #define BLOCKS 15000

    main()
    {
        char *blkptr[BLOCKS];   /* Pointer to blocks of text, see f1 */
        int n, m;

        if ((n = f1(blkptr)) >= 0) {
            m = f2(blkptr, n);
            writestrings(blkptr, m);
        }
    }

    f1(blkptr)
    char *blkptr[];
    {
        char s[SIZE];           /* Here is where the text is */
        int i = 0;

        blkptr[0] = s;
        while ((s[i++] = getchar()) != EOF && i < (SIZE-1))
            ;
        s[i++] = '\0';
        return(i);
    }

    f2(blkptr, n)
    char *blkptr[];
    {
        int i, nblock = 0;

        for (i = 0; i < n; i++) {
            if (something) {
                *(blkptr[0] + i) = '\0';
                blkptr[++nblock] = blkptr[0] + (i + 1);
            }
        }
        return(nblock);
    }

This all works fine, except that it is limited to an input size of 30,000 chars, it eats a lot of memory, and the application doesn't require that the whole thing be accessed at once.

Obvious solution: set SIZE to a round number like 512, add appropriate flags to f1, and change the if in main() to a while. Except that, when one does that, on exit from f1 the array s gets garbage in it. Changing the declaration of s to

    static char s[SIZE];

fixes the problem. But the question is: what's the difference between an if and a while in main()? For an if, s does not get clobbered, but for a while it does. Why does the code in K & R, p. 106-107 work?

David Harrison
Dept. of Physics
Univ. of Toronto
...!linus!utzoo!utcsstat!physics

tom@rlgvax.UUCP (Tom Beres) (08/11/83)
Another small lesson in C for those who would like it. Others may skip.

The problem mentioned is the result of a subroutine which is supposed to: (a) return a pointer to something (in this case, a char array); and (b) allocate the space for that something itself. The trick, then, is how the subroutine allocates the space. Take the simplified example:

    main()
    {
        char *p;
        char *foo();

        if (something) {
            p = foo();
            printf("%s", p);
        }
    }

    char *foo()
    {
        int i;
        char buf[80+1];

        for (i = 0; i < 80; i++) {
            if ((buf[i] = getchar()) == EOF || buf[i] == '\n')
                break;
        }
        buf[i] = 0;
        return(buf);
    }

Note what happens because buf[] is declared to be an automatic variable. foo() returns the address of buf[], which is assigned to p. But upon exit from foo(), buf[] is de-allocated (i.e. it is popped off the stack), so p points to de-allocated space! As long as the de-allocated space is not re-allocated and re-used, p will point to the desired data. As soon as the de-allocated space is re-used (usually the result of another subroutine call), the data p points to will be trashed. So, the printf() in main() should NOT work, but it might, just out of sheer luck and the good graces of the O/S.

This is the setup in the question that was posed. What was seen as the difference between an IF and a WHILE was probably the difference between one and several subroutine calls. The de-allocated space survived after one call, but eventually got clobbered after several.

So, in foo() change the declaration to:

    static char buf[80+1];

Now, buf[] will not be de-allocated upon exit from foo(), and the above example will work fine. However, suppose main() is changed to:

    main()
    {
        char *p[10];
        char *foo();
        int i;

        for (i = 0; i < 10; i++)
            p[i] = foo();
        for (i = 0; i < 10; i++)
            printf("%s", p[i]);
    }

Because buf[] is static, it will be re-used every time foo() is called, so p[0], p[1], ..., p[9] ALL end up pointing to the same space (buf[]), and what will be there will be the characters from the last call to foo(). Not what we want.

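The re-use of a static buffer is easy to reproduce. Here is a minimal sketch, assuming a hypothetical fill_static() that copies a given string into its static buffer instead of reading stdin the way foo() does (the names fill_static, calls_share_storage and BUFSZ are made up for illustration):

```c
#include <assert.h>
#include <string.h>

#define BUFSZ 81   /* same size as buf[80+1] in foo() */

/* Hypothetical stand-in for foo(): fills a static buffer from src
   rather than from getchar(), so the effect is repeatable. */
char *fill_static(const char *src)
{
    static char buf[BUFSZ];      /* one buffer, shared by every call */

    assert(src != NULL);
    strncpy(buf, src, BUFSZ - 1);
    buf[BUFSZ - 1] = '\0';       /* strncpy may not terminate the string */
    return buf;
}

/* Returns 1 if two successive calls hand back the same storage and
   only the text from the second call survives. */
int calls_share_storage(void)
{
    char *a = fill_static("first");
    char *b = fill_static("second");

    return a == b && strcmp(a, "second") == 0;
}
```

Both pointers compare equal, and the text from the first call is gone once the second call returns.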
The resolution for this (and the most general solution) is what K & R did. They had their equivalent to foo() read the line, then call alloc() to allocate new space for the string, copy the string to the new space, and return the address of the new space. Therefore on every call, new space was allocated and there was no problem of re-use and reallocation of space.

- Tom Beres
{seismo, allegra, brl-bmd, mcnc, we13}!rlgvax!tom
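
The allocate-and-copy approach described above can be sketched as follows. This is only an illustration, not the book's code: it assumes the standard library's malloc() in place of K & R's alloc(), and save_string() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy s into freshly allocated storage and
   return its address. malloc() is used here in place of alloc(). */
char *save_string(const char *s)
{
    char *p;

    assert(s != NULL);
    p = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */
    if (p != NULL)
        strcpy(p, s);
    return p;                    /* NULL if the allocation failed */
}
```

A routine like foo() could read each line into its own local buffer and return save_string(buf) rather than buf itself. Every call then yields a distinct block that stays valid after the routine returns, and the caller releases it with free() when done.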