GBOPOLY1@NUSVM.BITNET (fclim) (05/25/89)
Hi,

This is a pretty long reply. My apologies to others.

In article <8905230413.AA23343@umix.cc.umich.edu>,
CORNELLC.cit.cornell.edu:Jacques_Gelinas@CMR001.BITNET writes
> First question: tar and very-very-long-filenames
>[stuff deleted]
> Can I get these files out of the tape ?
> (How did prof. Mackay get them on the tape ?)
>[stuff deleted]
>% tar xf /dev/rct8
>tar: ./tex82/README.WRITE-WHITE - cannot create
>tar: ./DVIware/laser-setters/dvi2adobe_fonts/
>     StoneInformal-SemiboldItalic.tfm - cannot create
> (10 similar lines deleted)

The maximum filename length on Domain/IX at SR9.7 is 32; see the MAXNAMLEN
macro in /usr/include/sys/dir.h. A file whose name is longer than that can't
be created. Even though strlen("README.WRITE-WHITE") is < 32, the length of
the name actually stored in the Domain VTOC (their version of the UNIX inode)
is > 32. Aegis at SR9.7 is case-insensitive (e.g. /COM/SH is no different from
/cOm/Sh), but Unix has always been case-sensitive. Their workaround is to map
each upper-case character to a ':' followed by the lower-case character. For
example, README.WRITE-WHITE is stored in the VTOC as
    :r:e:a:d:m:e.:w:r:i:t:e-:w:h:i:t:e
which is 34 characters long, and that is why this file can't be extracted.

There are two ways to extract those files:
(1) Hack John Gilmore's PD tar so that when the stored filename length is
    > MAXNAMLEN, it prompts for a new filename or truncates the filename.
(2) Get SR10 and use BSD4.3. MAXNAMLEN should be 1024 (I think).
    Furthermore, Aegis at SR10 is case-sensitive, so the ':' mapping is no
    longer needed.

Prof MacKay probably created the tape on a Sun box, which has MAXNAMLEN set
at 1024 (I think, again).

>(also shows that BSD4.2 at SR9.7 is compatible with other systems)

What'll you say now?

> Second question: Paranoia and (text) eof
>[stuff deleted]
> Can this be simplified on Apollo BSD4.2 systems ?
>[stuff deleted]
>testeof(iop)
>FILE *iop;
>{ register int c;
>  if (feof(iop))
>    return(TRUE);
>  else { /* check to see if next is EOF */
>    c = getc(iop);
>    if (c == EOF)
>      return(TRUE);
>    else {
>      (void) ungetc(c,iop);
>      return(FALSE);
>} } }

The simplest way is to delete everything but the body of the else. Then, if
the file has n bytes, there will be n fewer feof() tests. This should work
for Domain/IX at SR9.7, and most probably for BSD4.3 at SR10, or at least
once /lib/clib becomes ANSI-compatible. However, Harbison and Steele in
"C: A Reference Manual" say that feof() should be used to check for EOF.

>The 2nd ed. of the K.R. white book ...

The 2nd ed. describes the ANSI definition of C and its standard library.
Domain/C and /lib/clib are not ANSI-compatible at SR9.7. I suggest you refer
to the manuals provided by Apollo.

> Third question: eof and binary files.
>[stuff deleted]
> Could someone explain to me the line
> "fgetc returns EOF: Error 0" ?
> Why is the first use of fgetc different ?
> (By permuting the calls to getc and fgetc, you
> can get other results. This looks weird.)
>[stuff deleted]
>% cat fgetc.c==getc.c
>/* ------ is fgetc "like" getc ? -------- */
> main (){
># include <stdio.h>
> FILE * datf ;
> int c ;
>
> datf = fopen("fgetc.dat","w+") ;
>
># define BYTE 0377
> printf( "BYTE = %o, (int)(char)BYTE = %o\n",BYTE,(int)(char)BYTE);
> if(fputc( BYTE, datf)==EOF ) perror("fputc returns EOF") ;
> if( putc( BYTE, datf)==EOF ) perror(" putc returns EOF") ;
> c = fputc(BYTE, datf ) ; printf("fputc: c = %o\n", c ) ;
> c = putc(BYTE, datf ) ; printf(" putc: c = %o\n", c ) ;
>
> fseek( datf, 0L, 0) ;
>
> if( (c = fgetc(datf)) == EOF ) perror("fgetc returns EOF") ;
> printf("fgetc: c = %o\n", c) ;
> if( (c = getc(datf)) == EOF ) perror(" getc returns EOF") ;
> printf(" getc: c = %o\n", c) ;
> if( (c = fgetc(datf)) == EOF ) perror("fgetc returns EOF") ;
> printf("fgetc: c = %o\n", c) ;
> if( (c = getc(datf)) == EOF ) perror(" getc returns EOF") ;
> printf(" getc: c = %o\n", c) ;
>
> if(fclose(datf)) perror("fclose") ;
> system( "od -b fgetc.dat ; rm -i fgetc.dat" ) ;
> }
>% cc !*
>cc fgetc.c==getc.c
>
>% a.out
>BYTE = 377, (int)(char)BYTE = 37777777777
>fputc returns EOF: Error 0
> putc returns EOF: Error 0
>fputc: c = 37777777777
> putc: c = 37777777777
>fgetc: c = 377
> getc: c = 377
>fgetc returns EOF: Error 0
>fgetc: c = 37777777777
> getc: c = 377
>0000000 377 377 377 377
>0000004

Fgetc() is broken.

Getc() is a macro defined in /usr/include/stdio.h as

    #define getc(p) (--(p)->_cnt >= 0 ? *(p)->_ptr++ & 0377 : _filbuf(p))

Fgetc() and getc() are among the buffered I/O routines. p->_base points to
the buffer and p->_ptr points to the next byte to be read. Normally, getc()
returns an int whose value is the byte masked with 0377. In effect, this
returns the byte as an unsigned char, so the result is always in the range
0 to 0377. When the buffer is empty, an (undocumented) routine, _filbuf(),
is called to fill the buffer. After filling it, _filbuf() also returns the
next byte as an unsigned char, if there is a next byte. Otherwise, when
end-of-file has been reached, _filbuf() returns EOF, which is -1, i.e.
0xffffffff or octal 37777777777 as a 32-bit int (written 377...7 below).
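To make the role of the "& 0377" mask concrete, here is a minimal sketch in portable C. It is not Apollo's library source; the function names (widen_unsigned, widen_signed, mistaken_for_eof, widening_check) are made up for illustration. It contrasts widening a byte as an unsigned char, which is what the getc() macro effectively does, with widening it as a signed char, which is what makes a 0377 byte collide with EOF.

```c
#include <assert.h>
#include <stdio.h>

/* Widen a byte the way the getc() macro does: the value stays in 0..0377,
   so it can never be negative and can never equal EOF. */
int widen_unsigned(unsigned char byte)
{
    return (int)byte;
}

/* Widen a byte as a signed char. The conversion to signed char is
   implementation-defined in older C standards; on two's-complement
   machines 0377 becomes -1. */
int widen_signed(unsigned char byte)
{
    return (int)(signed char)byte;
}

/* Nonzero if a caller that compares against EOF would treat c as
   end-of-file. */
int mistaken_for_eof(int c)
{
    return c == EOF;
}

/* Small self-check of the property that holds on every implementation. */
void widening_check(void)
{
    assert(widen_unsigned(0377) == 0377);
    assert(!mistaken_for_eof(widen_unsigned(0377)));
}
```

On a system where EOF is -1 (the usual case, though the C standard only requires it to be negative), mistaken_for_eof(widen_signed(0377)) is true, which is the collision this thread is about.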
Fgetc() is similar to getc(), but it is a function and not a macro. The
first time it is called, it returns the value of _filbuf(), which is an
unsigned char value since end-of-file has not been reached. The next time
fgetc() is called, it should return an unsigned char value or EOF (this is
the ANSI definition of fgetc()). However, the Domain/IX SR9.7 fgetc() returns
the next char promoted to an int. In Domain/C, when a char is promoted to an
int, the sign is preserved. Therefore 377 (a char -1) is promoted to 377...7
(an int -1). This int value is indistinguishable from the EOF value, -1.
No error has occurred; this is what perror()'s output, "Error 0", indicates.

Fgetc() does behave consistently, except when it needs to call _filbuf() to
fill the buffer. Normally, it returns the next byte sign-extended to an int;
when _filbuf() is called, it returns the next byte as an unsigned char.

To illustrate this, let's

    cat fgetc.dat fgetc.dat fgetc.dat fgetc.dat > foo

Foo has 16 bytes of 377 (four copies of the 4-byte fgetc.dat). When we read
the first 12 of them with

    f = fopen("foo", "r");
    for (i = 0; i < 12; i++)
        printf("%o\n", fgetc(f));

we'll get

    377
    377...7 \
    377...7  |__ 11 times
    ...      |
    377...7 /

By default, I/O is buffered with a 1024-byte buffer. We can change this with

    char buf[4];
    f = fopen("foo", "r");
    (void) setbuffer(f, buf, sizeof(buf));
    for (i = 0; i < 12; i++)
        printf("%o\n", fgetc(f));

Now, we'll get

    377     \
    377...7  |__ pattern
    377...7  |
    377...7 /
    377
    377...7
    377...7
    377...7

that is, the pattern repeated 2 more times (the first repetition is shown).
Here, we have a 4-byte buffer, so _filbuf() is called every 4 bytes.

> Last question: default cc flags
>All the machines we have are DN3000 or DN4000. Why is it necessary
>to specify the -M3000 flag for the cc compiler? The RT/11 operating
>system permitted me -in 1979- to customize the compilers by setting
>some switches (like the number of lines per page for listings).
>Can this be done also at installation time for the Apollo system?

I don't know why -M3000 is needed. You can edit the Makefile and add -M3000
to CFLAGS.

Hope this helps.
:-)

fclim --- gbopoly1 % nusvm.bitnet @ cunyvm.cuny.edu
computer centre
singapore polytechnic
dover road
singapore 0513.
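The refill pattern described in the post above can be modelled in a few lines of C. This is only a sketch of the behaviour as the post describes it, not Apollo's library code, and the function name model_sr97_fgetc is hypothetical. It assumes that a byte returned on the refill path (via _filbuf()) comes back as 0..0377, while every other byte comes back sign-extended.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the SR9.7 fgetc() behaviour described in the
   thread. data holds the n bytes of the file, bufsize is the stdio buffer
   size, and out receives the n values fgetc() would have returned. */
void model_sr97_fgetc(const unsigned char *data, size_t n,
                      size_t bufsize, int *out)
{
    size_t i;

    assert(bufsize > 0);
    for (i = 0; i < n; i++) {
        if (i % bufsize == 0) {
            /* Buffer is empty: the value comes from _filbuf(), which
               returns the byte as an unsigned char (0..0377). */
            out[i] = (int)data[i];
        } else {
            /* Byte taken straight from the buffer: sign-extended, so
               0377 turns into -1 on two's-complement machines. */
            out[i] = (int)(signed char)data[i];
        }
    }
}
```

With 12 bytes of 0377 and bufsize 4, the model yields 0377 at positions 0, 4 and 8 and -1 everywhere else; with bufsize 1024 only position 0 yields 0377. That matches the two listings in the post, where -1 is printed in octal as 377...7.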
krowitz@RICHTER.MIT.EDU (David Krowitz) (05/25/89)
The -M3000 switch to /bin/cc is the same as the -CPU 3000 switch to /com/cc.
It tells the compiler that the code being generated will be run on a machine
with a Motorola 68020 or 68030 processor and a 68881 or 68882 floating point
chip (or an FPX or FPA floating point accelerator option running in 68881
emulation mode), and that the code does not have to be downwards compatible
with the older 68010-based machines (i.e. the DN300/320, the DSP80/80A, the
DN400/420/600, and the DN460/660).

By default, the compiler will generate code which can be run on all of the
Motorola-based Apollo nodes (i.e. everything except the DN10000). This means
using only 68010 integer instructions and performing all floating point
arithmetic using system calls (since some of the earlier machines had no
floating point processors).

Code compiled with the -M3000 / -CPU 3000 switch will run quite a bit faster,
especially if it does a lot of floating point arithmetic, and it will execute
on any of the following machines: DSP90, DN330, DN560/570/580/590,
DN570-T/580-T/590-T, DN3000/3500/4000/4500.

-- David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
krowitz@RICHTER.MIT.EDU (David Krowitz) (05/25/89)
Actually, the maximum file name length under SR10 is 255 characters (the
256th character is a null terminator), and the maximum length of the entire
pathname (i.e. including all of the subdirectory names and the '/'
characters) is 1023 characters (again, the 1024th character is a null
terminator).

-- David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
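For anyone who wants to check ahead of time which names on a tape will hit the SR9.7 limit discussed earlier in this thread, here is a rough sketch in C. It assumes only what the thread states: each upper-case letter is stored as ':' plus its lower-case form, and the limit is 32. The names sr97_stored_length, sr97_name_fits and SR97_MAXNAMLEN are made up for this example. Treating a stored length of exactly 32 as acceptable is also an assumption, as is ignoring any special handling of a literal ':' in a name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Limit quoted in the thread for Domain/IX at SR9.7 (assumed inclusive). */
#define SR97_MAXNAMLEN 32

/* Length of one pathname component after the SR9.7 case mapping: an
   upper-case letter costs two stored characters (':' plus the lower-case
   letter), anything else costs one. */
size_t sr97_stored_length(const char *name)
{
    size_t len = 0;

    assert(name != NULL);
    for (; *name != '\0'; name++)
        len += isupper((unsigned char)*name) ? 2 : 1;
    return len;
}

/* Nonzero if the component would fit under the SR9.7 limit. */
int sr97_name_fits(const char *name)
{
    return sr97_stored_length(name) <= SR97_MAXNAMLEN;
}
```

For README.WRITE-WHITE this gives a stored length of 34, and for StoneInformal-SemiboldItalic.tfm (32 characters as written) it gives 36, so both are rejected, consistent with the tar errors quoted at the top of the thread. The check applies to a single component, not to a whole path.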