[comp.sys.sun] disk crisis and inconsistent mail

ghoti@bourbaki.mit.edu (03/25/90)

I was writing a simple program to convert a rasterfile to hex. Basically,
it does what od -hv does except that there are no line numbers or spaces
and there are linefeeds every 64 digits. In retrospect, I should have used
od -hv and piped it through awk or something. Anyway, that is not the
point.

The program originally contained a while statement whose condition was

((c=fgetc(infile))!=EOF) { ....}

and where ... involved writing stuff to another file. I found that the
program only would copy the first 64 bits and that it got stopped by bytes
of the form \377, which is apparently EOF but which is not really the end
of the rasterfile, the latter being over a megabyte. To get around this
problem, I changed the while statement to

while (1) { c=fgetc(infile); ...}

and compiled and ran it. I got my command line prompt a little while later
and decided that the program had simply stopped when it got to the real
end of the file. This was corroborated by the fact that the output file
was the right size given the size of the input file. Then I logged out.
Later on, I logged in again and saw that I had mail. When I typed mail, I
got messages saying that the file system was full and the mail program
aborted. I checked and saw using du that although I had a little more disk
usage than usual, it was not enough to cause a disk crisis. Even so, I
deleted 2 megabytes and again tried to read my mail. Again, I was informed
that the file system was full. I typed df and saw that it was 100% full.
Then I typed ps -aux to see what was going on on the system and saw to my
horror that my C program was still running. So I killed the job and the
file system no longer was full. When I typed mail, I was informed that I
had no mail.  I logged out and logged back in again and got the message
that I had mail.  I typed mail and the machine said I had no mail. All of
this is taking place on the machine lom1.math.yale.edu. I rlogin'ed to
cantor.math.yale.edu and also got an announcement that I had mail. When I
typed mail, it said "Mail: skipping garbage at beginning of messages" and
then said I had no mail.  I should mention that lom1 is a sun3 and cantor
is a sparc and they share their user files and I can get my mail on either
machine in principle.
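
One likely explanation for the \377 symptom described above, assuming that char is a signed 8-bit type on these machines and that EOF is -1: fgetc returns an int, and storing that value in a char turns the data byte 0xFF into -1, which then compares equal to EOF. The sketch below only illustrates that comparison. The function names are made up for the illustration, and signed char is used explicitly so that the behavior does not depend on the compiler's default.

```c
#include <stdio.h>

/* Hypothetical helper: does a data byte look like EOF once it has been
   squeezed into a signed char?  Assumes 8-bit two's-complement chars and
   EOF == -1, which is the usual case but not guaranteed by C. */
int looks_like_eof_as_char(int byte)
{
    signed char c = (signed char) byte;   /* 0xFF typically becomes -1 */
    return c == EOF;
}

/* Hypothetical helper: the same test with the value kept in an int,
   which is what fgetc actually returns (0..255 for data, EOF otherwise). */
int looks_like_eof_as_int(int byte)
{
    int c = byte;
    return c == EOF;
}
```

Under those assumptions, looks_like_eof_as_char(0xFF) is true while looks_like_eof_as_int(0xFF) is false, so declaring c as int should let the original while condition tell a \377 byte apart from the real end of the file.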

So it appears that by running my badly written program, I somehow made the
mail malfunction. Now, a few weeks ago there was a problem with the SUNs
in the math department at MIT in which the file system became inconsistent
(I didn't cause that problem!) and during that period I noticed that
this same funny business with the mail happened whenever I logged in to my
account at MIT. The funny business ceased when the consistency of the file
system was restored. So I am naively inclined to believe that the funny
business with the mail is a symptom of a file system which has become
inconsistent. At MIT this was described as a very serious problem and it
took them weeks to fix it, during which time they warned people not to
create new files. When the problem was fixed, they warned people about
running big jobs in the background, so I guess that might have been a
cause of the inconsistency. This makes it plausible that my little program
might have caused the file system at lom1 to become inconsistent.

I'm somewhat astonished that an operating system has been designed in such
a way that one incompetent user with no special privileges can screw up
the system so thoroughly. But be that as it may, the question is: what can
one do to fix a problem like this?

Although I subscribe to this mailing list, I do so from lom1 and right now
I have no confidence in the mail there. So please reply to me directly at
my mit account which is: ghoti@laurent.mit.edu

I enclose a copy of the C program I wrote. Any comments on the problem of
the inconsistency are welcome.

Allan Adler
ghoti@laurent.mit.edu

===========================================================================

/* Convert a binary file to ASCII characters representing the hex digits */
#include <stdio.h>

#define FILENAMESIZE 20

FILE * infile, * outfile ;
char c;

void getfilenames()
{ char in[FILENAMESIZE],out[FILENAMESIZE];
  printf("\nPrint name of source file:  ");
  scanf("%s",in);
  infile=fopen(in,"r");
  if (infile==NULL) 
    { printf("can't open %s, aborting\n",in);
      abort();
    }
  printf("\nPrint name of target file:  ");
  scanf("%s",out);
  outfile=fopen(out,"w");
  if (outfile==NULL) 
    { printf("can't open %s, aborting\n",out);
      abort();
    }
}
/* Take ASCII char as input, take upper nibble and write it in hex. */
char top(j)
unsigned short j;
{ int k;
  k= j/16;
  if (k<10) return((char) (k+48));
return( (char) (k+87));
}
/* Take ASCII char as input, take lower nibble and write it in hex. */
char bottom(j)
unsigned short j;
{ int k;
  k= j%16;
  if (k<10) return((char) (k+48));
  return( (char) (k+87));
}

void transfer_data()
{ int i=0;
  unsigned short j;
   while (1){c=fgetc(infile);
     j=(c+256)%256;
    fprintf(outfile,"%c%c", top(j),bottom(j));
    i+=2;
    if (i==64) { fprintf(outfile,"\n"); i=0; }
  }
}

main()
{
  getfilenames();
  transfer_data();
}
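
For comparison, here is a minimal sketch of how the copy loop in transfer_data() might be written so that it stops at end of file instead of writing until the disk fills. It is not the program as posted: the name hex_copy, the stream parameters, and the return convention are all assumptions made for the illustration. The character is held in an int so that EOF stays distinct from the byte \377, and each write is checked so that a failed write ends the loop.

```c
#include <stdio.h>

/* Hypothetical replacement for transfer_data(): copy `in` to `out` as
   lowercase hex digits, with a linefeed after every 64 digits.
   Returns 0 on success and -1 on a read or write error. */
int hex_copy(FILE *in, FILE *out)
{
    static const char digits[] = "0123456789abcdef";
    int c;            /* int, not char: EOF must differ from byte 0xFF */
    int column = 0;   /* hex digits written on the current line */

    while ((c = fgetc(in)) != EOF) {
        /* c is in 0..255 here, so both nibbles index into digits[] */
        if (fputc(digits[c / 16], out) == EOF ||
            fputc(digits[c % 16], out) == EOF)
            return -1;
        column += 2;
        if (column == 64) {
            if (fputc('\n', out) == EOF)
                return -1;
            column = 0;
        }
    }
    /* fgetc returns EOF for both end of file and read errors */
    return ferror(in) ? -1 : 0;
}
```

Because stdio buffers its output, a full file system may only be reported when the buffer is flushed, so a caller would probably also want to check the result of fclose(out) or fflush(out).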