[comp.mail.elm] elm just ate my mailbox .......

bronson@mfci.UUCP (04/28/88)

    Just 2 minutes ago, when I used the resynchronize command, I watched
as elm printed out 'seek error ??? while reading mailbox' (or something
like this) and then suddenly 60+ messages in my mailbox were gone!
(I'd gotten some of these messages earlier, but without losing mail).
Any idea why I suddenly started getting seek errors today?
(too much junk in my mailbox?)
    Until this happened I liked to keep a large set of files
in /usr/spool/mail as a reminder. How do people
use mbox, as another folder?
    
    I really like elm, but I'm getting nervous ....
Tan Bronson
Multiflow Computer Inc  UUCP(work): {yale,uunet}!mfci!bronson 
175 N Main St 		UUCP(home): {yale,mfci}!bronson!tan 
Branford, Ct 06405	Phone(work):(203)-488-6090 x228

zentrale@rmi.UUCP (RMI Net) (04/29/88)

In article <372@m3.mfci.UUCP> bronson@mfci.UUCP () writes:
: 
:     Just 2 minutes ago, when I used the resynchronize command, I watched
: as elm printed out 'seek error ??? while reading mailbox' (or something
: like this) and then suddenly 60+ messages in my mailbox were gone!
:     I really like elm, but I'm getting nervous ....
: Tan Bronson
: Multiflow Computer Inc  UUCP(work): {yale,uunet}!mfci!bronson 
: 175 N Main St 		UUCP(home): {yale,mfci}!bronson!tan 
: Branford, Ct 06405	Phone(work):(203)-488-6090 x228



Just a short hint, in case it helps:

You should find your mail in /tmp/mbox.logname (with your login name in
place of "logname"). Resync never worked here...

(I still have 1.2a.)

Regards,
Rupert

*****************************************************************
* addresses:  uucp   zentrale@rmi.de      cis    72446,415      *
*             bix    rmiaachen            bitnet rmohr@unido    *
*****************************************************************

chip@ateng.UUCP (Chip Salzenberg) (05/02/88)

In article <372@m3.mfci.UUCP> bronson@mfci.UUCP () writes:
>
>    Just 2 minutes ago, when I used the resynchronize command, I watched
>as elm printed out 'seek error ??? while reading mailbox' (or something
>like this) and then suddenly 60+ messages in my mailbox were gone!

We once had trouble with the resync command.  I think that our troubles
were caused by the mailbox locking, which was not appropriate for our system.
Xenix systems lock with "/tmp/basename.mlk", where 'basename' is the file
name of the mailbox in question.  For /usr/spool/mail/foo, then, the lock
file should be "/tmp/foo.mlk".
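
A minimal sketch of that naming rule, assuming only what is stated above
(the lock lives in /tmp and is named after the mailbox's base name). The
function name xenix_lock_name is made up for illustration; it is not an
Elm or Xenix routine.

```c
#include <stdio.h>
#include <string.h>

/* Build the Xenix-style lock file name "/tmp/<basename>.mlk" for a
 * mailbox path.  xenix_lock_name is a hypothetical helper.
 * Returns 0 on success, -1 if the result does not fit in buf. */
int xenix_lock_name(const char *mailbox, char *buf, size_t buflen)
{
    const char *base = strrchr(mailbox, '/');
    int n;

    base = (base != NULL) ? base + 1 : mailbox;  /* strip directory part */
    n = snprintf(buf, buflen, "/tmp/%s.mlk", base);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Calling xenix_lock_name("/usr/spool/mail/foo", buf, sizeof buf) fills buf
with "/tmp/foo.mlk". Comparing that against the name your delivery agent
really uses is one quick way to check that the two lock alike.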

Be sure that your Elm and your mail delivery program are locking alike.
-- 
Chip Salzenberg                "chip@ateng.UU.NET" or "codas!ateng!chip"
A T Engineering                My employer may or may not agree with me.
  "I must create a system or be enslaved by another man's." -- Blake

tkr@praxis.co.uk (Tim Rylance) (05/11/88)

In article <372@m3.mfci.UUCP> bronson@mfci.UUCP () writes:
    
        Just 2 minutes ago, when I used the resynchronize command, I watched
    as elm printed out 'seek error ??? while reading mailbox' (or something
    like this) and then suddenly 60+ messages in my mailbox were gone!
    (I'd gotten some of these messages earlier, but without losing mail).
    Any idea why I suddenly started getting seek errors today?
    (too much junk in my mailbox?)
        Until this happened I've liked to maintain a large set of files
    in /usr/spool/mail, and keep them as a reminder. 
        
We had a major outbreak of this last year.  I also like to keep reminders
in /usr/spool/mail, and when Elm ate over 200 of them I became strongly
motivated to solve the problem.

It happens when /tmp is full. This occurs frequently with the pathetic
nearly-full-from-the-start 7Mb root filesystem in SunOS 3.X (fixed in
Sys4-3.2 and 4.0.)  When Elm starts up it copies /usr/spool/mail/foo to
/tmp/mbox.foo, building a table of headers and message offsets as it goes.
If /tmp happens to be full you may notice the subliminal "write failed:
file system full" message flash by.  If you don't, you will not realise
anything is amiss because Elm extracts the messages you read from
/usr/spool/mail/foo.  But when you quit/resynchronize/change mailbox
Elm copies /tmp/mbox.foo back to /usr/spool/mail/foo, skipping deleted
messages.  At which point it discovers that /tmp/mbox.foo is not as
large as it should be (hence the "seek failed...") and collapses in
a heap, having destroyed your mailbox.
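
The failure mode above comes down to a copy loop that never looks at the
result of its writes. As a rough sketch (copy_checked is a hypothetical
name, not a routine from the Elm sources), a line-by-line copy that
notices a full file system might look like this. Note that stdio buffers
output, so a short write may only show up at fflush() time, which is why
the flush is checked as well.

```c
#include <stdio.h>

/* Copy every line of `from` to `to`, checking each write.
 * Returns 0 on success, 1 on any read or write error.
 * copy_checked is an illustrative name, not an Elm routine. */
int copy_checked(FILE *from, FILE *to)
{
    char buffer[1024];

    while (fgets(buffer, sizeof buffer, from) != NULL) {
        if (fputs(buffer, to) == EOF)
            return 1;               /* e.g. file system full */
    }
    if (ferror(from))
        return 1;                   /* read error, not end of file */
    if (fflush(to) == EOF)
        return 1;                   /* buffered data could not be written */
    return 0;
}
```

A caller that gets a non-zero result knows the copy is incomplete and can
stop before it touches the original mailbox.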

In fact Elm *never* checks for errors after writing.  I went through it 
adding checks and trying to do something reasonably intelligent when a
write fails.  I now give up immediately if /tmp is full on startup, and
I copy /tmp/mbox.foo back to /usr/spool/mail/foo.<pid> and then rename
the latter to avoid abandoning mail in /tmp if /usr/spool is full.
I also removed the use of temporary file names constructed from getpid()+1
and replaced the mailbox locking code (which appears to contain a race)
with that from GNU Emacs.
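
The "write next to the target, then rename" idea can be sketched as
follows. This is an illustration under assumptions, not the code in the
diffs: replace_file is a made-up name, the new contents are passed as a
single string for brevity, and the final step uses rename() where the
patched Elm completes the move with link() and unlink().

```c
#include <stdio.h>
#include <unistd.h>

/* Write `data` to "<path>.<pid>" and rename it over `path` only if every
 * write succeeded.  On failure the original `path` is left untouched.
 * Returns 0 on success, -1 on error.  replace_file is a hypothetical name. */
int replace_file(const char *path, const char *data)
{
    char tmp[1024];
    FILE *fp;
    int n;

    n = snprintf(tmp, sizeof tmp, "%s.%ld", path, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    fp = fopen(tmp, "w");
    if (fp == NULL)
        return -1;

    if (fputs(data, fp) == EOF || fflush(fp) == EOF) {
        fclose(fp);
        unlink(tmp);
        return -1;
    }
    if (fclose(fp) == EOF) {
        unlink(tmp);
        return -1;
    }
    if (rename(tmp, path) != 0) {   /* same directory, so same file system */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Because the temporary file sits in the same directory as the mailbox, a
full file system shows up while writing the temporary file, and the old
mailbox is still intact at that point.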

My diffs follow.  Your line numbers will differ.  Also note that if
/usr/spool/mail is not world-writeable a little more work is needed...

diff -rc elm-1.5b/hdrs/defs.h elm-1.5c/hdrs/defs.h
*** elm-1.5b/hdrs/defs.h	Tue May  5 15:48:17 1987
--- elm-1.5c/hdrs/defs.h	Wed Jul 29 14:51:52 1987
***************
*** 6,12
  
  #include "sysdefs.h"	/* system/configurable defines */
  
! #define VERSION		"1.5b" /* Version number!  WHAT_STRING should agree */
  
  #define WHAT_STRING	"@(#) Version 1.5b, April 1987"
  

--- 6,12 -----
  
  #include "sysdefs.h"	/* system/configurable defines */
  
! #define VERSION		"1.5c" /* Version number!  WHAT_STRING should agree */
  
  #define WHAT_STRING	"@(#) Version 1.5c, 29th July 1987"
  
***************
*** 8,14
  
  #define VERSION		"1.5b" /* Version number!  WHAT_STRING should agree */
  
! #define WHAT_STRING	"@(#) Version 1.5b, April 1987"
  
  #define KLICK		10
  

--- 8,14 -----
  
  #define VERSION		"1.5c" /* Version number!  WHAT_STRING should agree */
  
! #define WHAT_STRING	"@(#) Version 1.5c, 29th July 1987"
  
  #define KLICK		10
  
diff -rc elm-1.5b/hdrs/sysdefs.h elm-1.5c/hdrs/sysdefs.h
*** elm-1.5b/hdrs/sysdefs.h	Tue Jun 30 16:45:55 1987
--- elm-1.5c/hdrs/sysdefs.h	Tue Jul 28 19:09:36 1987
***************
*** 190,195
  #define OLDEBUG		"ELM:debug.last"
  
  #define temp_file	"/tmp/snd."
  #define temp_form_file	"/tmp/form."
  #define temp_mbox	"/tmp/mbox."
  #define temp_print      "/tmp/print."

--- 190,196 -----
  #define OLDEBUG		"ELM:debug.last"
  
  #define temp_file	"/tmp/snd."
+ #define temp_hdr	"/tmp/hdr."
  #define temp_form_file	"/tmp/form."
  #define temp_mbox	"/tmp/mbox."
  #define temp_print      "/tmp/print."
diff -rc elm-1.5b/src/file.c elm-1.5c/src/file.c
*** elm-1.5b/src/file.c	Tue May  5 11:46:33 1987
--- elm-1.5c/src/file.c	Wed Jul 22 10:18:08 1987
***************
*** 162,168
  
  	save_current = current;
  	current = number+1;
! 	copy_message("", fd, FALSE, FALSE);
  	current = save_current;
  
  	if (resolve_mode)

--- 162,172 -----
  
  	save_current = current;
  	current = number+1;
! 	if (copy_message("", fd, FALSE, FALSE) != 0) {
! 	  error2("Error writing %s - message %d not saved", filename, number);
! 	  return;	/* we haven't marked the message DELETED yet,
! 			   so there's no cause to panic */
! 	}
  	current = save_current;
  
  	if (resolve_mode)
diff -rc elm-1.5b/src/file_utils.c elm-1.5c/src/file_utils.c
*** elm-1.5b/src/file_utils.c	Tue May  5 11:48:16 1987
--- elm-1.5c/src/file_utils.c	Wed Jul 29 14:23:10 1987
***************
*** 184,190
  	}
  
  	while (fgets(buffer, VERY_LONG_STRING, from_file) != NULL)
! 	  fputs(buffer, to_file);
  
  	fclose(from_file);
  	fclose(to_file);

--- 184,196 -----
  	}
  
  	while (fgets(buffer, VERY_LONG_STRING, from_file) != NULL)
! 	  if (fprintf(to_file, "%s", buffer) == EOF) {
! 	    dprint(1, (debugfile, "Error %d writing %s (copy)\n",
! 		   errno, to));
! 	    error1("error writing %s", to);
! 	    force_final_newline(to_file);
! 	    return(1);
! 	  }
  
  	if (fflush(to_file) == EOF) {
            dprint(1, (debugfile, "Error %d fflushing %s (copy)\n",
***************
*** 186,191
  	while (fgets(buffer, VERY_LONG_STRING, from_file) != NULL)
  	  fputs(buffer, to_file);
  
  	fclose(from_file);
  	fclose(to_file);
  

--- 192,205 -----
  	    return(1);
  	  }
  
+ 	if (fflush(to_file) == EOF) {
+           dprint(1, (debugfile, "Error %d fflushing %s (copy)\n",
+ 		 errno, to));
+ 	  error1("error writing %s", to);
+ 	  force_final_newline(to_file);
+ 	  return(1);
+ 	}
+ 
  	fclose(from_file);
  	fclose(to_file);
  
***************
*** 261,264
  
  	fprintf(fd, "%d\n", header_table[current-1].index_number);
  	fclose(fd);
  }

--- 275,292 -----
  
  	fprintf(fd, "%d\n", header_table[current-1].index_number);
  	fclose(fd);
+ }
+ 
+ force_final_newline(f)
+ FILE *f;
+ {
+ 	/** Try to replace the last byte of the file with a \n.
+ 	    Called when a write has failed, presumably because
+ 	    a file system is full, to prevent the next message
+ 	    written to the file "vanishing" because the "From "
+ 	    is not at the beginning of the line.  No error 
+ 	    checking - if it doesn't work at least we tried **/
+ 	
+ 	fseek(f,-1,2);	/* 1 byte before EOF */
+ 	putc('\n',f);
  }
diff -rc elm-1.5b/src/fileio.c elm-1.5c/src/fileio.c
*** elm-1.5b/src/fileio.c	Tue May  5 11:52:27 1987
--- elm-1.5c/src/fileio.c	Wed Jul 29 14:24:50 1987
***************
*** 19,24
  
  char *error_name();
  
  copy_message(prefix, dest_file, remove_header, remote)
  char *prefix;
  FILE *dest_file;

--- 19,25 -----
  
  char *error_name();
  
+ int
  copy_message(prefix, dest_file, remove_header, remote)
  char *prefix;
  FILE *dest_file;
***************
*** 30,35
              then it will start copying into the file... If remote is true
  	    then it will append "remote from <hostname>" at the end of the
  	    very first line of the file (for remailing) 
  	**/
  
      char buffer[LONG_SLEN];

--- 31,37 -----
              then it will start copying into the file... If remote is true
  	    then it will append "remote from <hostname>" at the end of the
  	    very first line of the file (for remailing) 
+ 	    Returns 0 if successful, non-zero otherwise
  	**/
  
      char buffer[LONG_SLEN];
***************
*** 43,49
  		header_table[current-1].offset, "copy_message"));
         error1("ELM [seek] failed trying to read %d bytes into file",
  	     header_table[current-1].offset);
!        return;
      }
  
      /* how many lines in message? */

--- 45,51 -----
  		header_table[current-1].offset, "copy_message"));
         error1("ELM [seek] failed trying to read %d bytes into file",
  	     header_table[current-1].offset);
!        return 1;
      }
  
      /* how many lines in message? */
***************
*** 71,77
  	    ok = 0;	/* STOP NOW! */
  	  }
  	  else
! 	    fprintf(dest_file, "%s%s", prefix, buffer);
      }
      if (strlen(buffer) + strlen(prefix) > 1)
        fprintf(dest_file, "\n");	/* blank line to keep mailx happy *sigh* */

--- 73,84 -----
  	    ok = 0;	/* STOP NOW! */
  	  }
  	  else
! 	    if (fprintf(dest_file, "%s%s", prefix, buffer) == EOF) {
! 	      dprint(1, (debugfile, "Error %d writing (copy_message)\n",
! 		     errno));
! 	      force_final_newline(dest_file);
! 	      return 1;
! 	    }
      }
  
      if (strlen(buffer) + strlen(prefix) > 1) {
***************
*** 73,80
  	  else
  	    fprintf(dest_file, "%s%s", prefix, buffer);
      }
!     if (strlen(buffer) + strlen(prefix) > 1)
!       fprintf(dest_file, "\n");	/* blank line to keep mailx happy *sigh* */
  }
  
  /********  the following routines are for a nice clean way to preserve

--- 80,104 -----
  	      return 1;
  	    }
      }
! 
!     if (strlen(buffer) + strlen(prefix) > 1) {
!       /* need blank line to keep mailx happy *sigh* */
!       if (fprintf(dest_file, "\n") == EOF) {
!         dprint(1, (debugfile, "Error %d writing \\n (copy_message)\n",
! 	       errno));
!         force_final_newline(dest_file);
!         return 1;
!       }
!     }
! 
!     if (fflush(dest_file) == EOF) {
!       dprint(1, (debugfile, "Error %d fflushing (copy_message)\n",
! 	     errno));
!       force_final_newline(dest_file);
!       return 1;
!     }
! 
!     return 0;
  }
  
  /********  the following routines are for a nice clean way to preserve
diff -rc elm-1.5b/src/leavembox.c elm-1.5c/src/leavembox.c
*** elm-1.5b/src/leavembox.c	Tue Jun 30 16:31:55 1987
--- elm-1.5c/src/leavembox.c	Tue Jul 28 15:11:43 1987
***************
*** 170,176
  
  	if (! mbox_specified) {
  	  if (pending) {                /* keep some messages pending! */
! 	    sprintf(outfile,"%s%d", temp_mbox, getpid());
  	    unlink(outfile);
  	  }
  	  else if (mailbox_defined)	/* save to specified mailbox */

--- 163,171 -----
  
  	if (! mbox_specified) {
  	  if (pending) {                /* keep some messages pending! */
! 	    /* put temp file into same filesystem as user's maildrop
! 	       to avoid leaving mail in /tmp if we have space problems */
! 	    sprintf(outfile,"%s%s.%d", mailhome, username, getpid());
  	    unlink(outfile);
  	  }
  	  else if (mailbox_defined)	/* save to specified mailbox */
***************
*** 220,226
  	      else {
  		dprint(2, (debugfile, "#%d, ", current));
  	      }
! 	      copy_message("", temp, FALSE, FALSE);
  	    }
  	  fclose(temp);
  	  dprint(2, (debugfile, "\n\n"));

--- 215,227 -----
  	      else {
  		dprint(2, (debugfile, "#%d, ", current));
  	      }
! 	      if (copy_message("", temp, FALSE, FALSE) != 0) {
! 		/* probably a file system full somewhere
! 		   we haven't deleted anything important yet,
! 		   so we can quit normally deleting temp files */
! 		error1("error writing %s - leaving mail unchanged", outfile);
! 		leave();
! 	      }
  	    }
  	  fclose(temp);
  	  dprint(2, (debugfile, "\n\n"));
***************
*** 264,270
  			  infile));
  	          dprint(1, (debugfile, "** %s - %s **\n", error_name(errno),
  		          error_description(errno)));
! 	          error("something godawful is happening to me!!!");
  		  emergency_exit();
  	        }
  		else {

--- 265,271 -----
  			  infile));
  	          dprint(1, (debugfile, "** %s - %s **\n", error_name(errno),
  		          error_description(errno)));
! 	          error1("leaving mail in %s", outfile);
  		  emergency_exit();
  	        }
  		else {
***************
*** 284,290
  		error_description(errno));
  	      emergency_exit();
  	    }
! 	  unlink(outfile);
  	}
  	else if (keep_empty_files) {
  	  sleep(1);

--- 285,291 -----
  		error_description(errno));
  	      emergency_exit();
  	    }
! 	  unlink(outfile);	/* link succeeded - complete rename */
  	}
  	else if (keep_empty_files) {
  	  sleep(1);
***************
*** 333,339
  	return(to_delete);	
  }
  
! char lock_name[SLEN];
  
  lock(direction)
  int direction;

--- 334,341 -----
  	return(to_delete);	
  }
  
! char lock_name[SLEN],
!      temp_name[SLEN];
  
  lock(direction)
  int direction;
***************
*** 340,351
  {
  	/** Create lock file to ensure that we don't get any mail 
  	    while altering the mailbox contents!
- 	    If it already exists sit and spin until 
-                either the lock file is removed...indicating new mail
- 	    or
- 	       we have iterated MAX_ATTEMPTS times, in which case we
- 	       either fail or remove it and make our own (determined
- 	       by if REMOVE_AT_LAST is defined in header file
  
  	    If direction == INCOMING then DON'T remove the lock file
  	    on the way out!  (It'd mess up whatever created it!).

--- 342,347 -----
  {
  	/** Create lock file to ensure that we don't get any mail 
  	    while altering the mailbox contents!
  
  	    This code was lifted from GNU Emacs (etc/movemail.c)
  	**/
***************
*** 347,354
  	       either fail or remove it and make our own (determined
  	       by if REMOVE_AT_LAST is defined in header file
  
! 	    If direction == INCOMING then DON'T remove the lock file
! 	    on the way out!  (It'd mess up whatever created it!).
  	**/
  
  	register int iteration = 0, access_val, lock_fd;

--- 343,349 -----
  	/** Create lock file to ensure that we don't get any mail 
  	    while altering the mailbox contents!
  
! 	    This code was lifted from GNU Emacs (etc/movemail.c)
  	**/
  
  	struct stat st;
***************
*** 351,357
  	    on the way out!  (It'd mess up whatever created it!).
  	**/
  
! 	register int iteration = 0, access_val, lock_fd;
  
  	sprintf(lock_name,"%s%s.lock", mailhome, username);
  

--- 346,354 -----
  	    This code was lifted from GNU Emacs (etc/movemail.c)
  	**/
  
! 	struct stat st;
! 	long now;
! 	int desc, tem;
  
  	sprintf(temp_name,"%s%s:%d", mailhome, username, getpid());
  	sprintf(lock_name,"%s%s.lock", mailhome, username);
***************
*** 353,358
  
  	register int iteration = 0, access_val, lock_fd;
  
  	sprintf(lock_name,"%s%s.lock", mailhome, username);
  
  	access_val = access(lock_name, ACCESS_EXISTS);

--- 350,356 -----
  	long now;
  	int desc, tem;
  
+ 	sprintf(temp_name,"%s%s:%d", mailhome, username, getpid());
  	sprintf(lock_name,"%s%s.lock", mailhome, username);
  
  	unlink (temp_name);
***************
*** 355,361
  
  	sprintf(lock_name,"%s%s.lock", mailhome, username);
  
! 	access_val = access(lock_name, ACCESS_EXISTS);
  
  	while (access_val != -1 && iteration++ < MAX_ATTEMPTS) {
  	  dprint(2, (debugfile, 

--- 353,359 -----
  	sprintf(temp_name,"%s%s:%d", mailhome, username, getpid());
  	sprintf(lock_name,"%s%s.lock", mailhome, username);
  
! 	unlink (temp_name);
  
  	while (1) {
  	  /* Create the lock file, but not under the lock file name.  */
***************
*** 357,401
  
  	access_val = access(lock_name, ACCESS_EXISTS);
  
! 	while (access_val != -1 && iteration++ < MAX_ATTEMPTS) {
! 	  dprint(2, (debugfile, 
! 		  "File '%s' currently exists!  Waiting...(lock)\n", 
! 		  lock_name));
! 	  if (direction == INCOMING)
! 	    PutLine0(LINES, 0, "Mail being received!\twaiting...");
! 	  else
! 	    error1("Attempt %d: Mail being received...waiting", 
!                    iteration);
! 	  sleep(5);
! 	  access_val = access(lock_name, ACCESS_EXISTS);
! 	}
! 	
! 	if (access_val != -1) {
! 
! #ifdef REMOVE_AT_LAST
! 
! 	  /** time to waste the lock file!  Must be there in error! **/
! 
! 	  dprint(2, (debugfile, 
! 	     "Warning: I'm giving up waiting - removing lock file(lock)\n"));
! 	  if (direction == INCOMING)
! 	    PutLine0(LINES, 0,"\nTimed out - removing current lock file...");
! 	  else
! 	    error("Throwing away the current lock file!");
! 
! 	  if (unlink(lock_name) != 0) {
! 	    dprint(1, (debugfile,
! 		  "Error %s (%s)\n\ttrying to unlink file %s (%s)\n", 
! 		     error_name(errno), error_description(errno), lock_name));
! 	    PutLine1(LINES, 0, 
! 		   "\n\rI couldn't remove the current lock file %s\n\r", 
! 		   lock_name);
! 	    PutLine2(LINES, 0, "** %s - %s **\n\r", error_name(errno),
! 		   error_description(errno));
! 	    if (direction == INCOMING)
! 	      leave();
! 	    else
! 	      emergency_exit();
  	  }
  	  
  	  /* everything is okay, so lets act as if nothing had happened... */

--- 355,367 -----
  
  	unlink (temp_name);
  
! 	while (1) {
! 	  /* Create the lock file, but not under the lock file name.  */
! 	  /* Give up if cannot do that.  */
! 	  desc = open (temp_name, O_WRONLY | O_CREAT, 0666);
! 	  if (desc < 0) {
! 	    error1("Can't create temporary lock file %s", temp_name);
! 	    leave();
  	  }
  	  close (desc);
  	  
***************
*** 397,402
  	    else
  	      emergency_exit();
  	  }
  	  
  	  /* everything is okay, so lets act as if nothing had happened... */
  

--- 363,369 -----
  	    error1("Can't create temporary lock file %s", temp_name);
  	    leave();
  	  }
+ 	  close (desc);
  	  
  	  tem = link (temp_name, lock_name);
  	  unlink (temp_name);
***************
*** 398,417
  	      emergency_exit();
  	  }
  	  
! 	  /* everything is okay, so lets act as if nothing had happened... */
! 
! #else
! 
! 	  /* okay...we die and leave, not updating the mailfile mbox or
! 	     any of those! */
! 	  if (direction == INCOMING) {
! 	    PutLine1(LINES, 0, "\nGiving up after %d iterations...", iteration);
! 	    PutLine0(LINES, 0, 
! 		"Please try to read your mail again in a few minutes.\n");
! 	    dprint(2, (debugfile, 
! 		    "Warning: bailing out after %d iterations...(lock)\n", 
! 		    iteration));
! 	    leave_locked(0);
  	  }
  	  else {
  	    dprint(2, (debugfile, 

--- 365,381 -----
  	  }
  	  close (desc);
  	  
! 	  tem = link (temp_name, lock_name);
! 	  unlink (temp_name);
! 	  if (tem >= 0)
! 	    break;
! 	  sleep (1);
! 	  
! 	  /* If lock file is a minute old, unlock it.  */
! 	  if (stat (lock_name, &st) >= 0) {
! 	    now = time (0);
! 	    if (st.st_ctime < now - 60)
! 	      unlink (lock_name);
  	  }
  	}
  }
***************
*** 413,426
  		    iteration));
  	    leave_locked(0);
  	  }
- 	  else {
- 	    dprint(2, (debugfile, 
- 		   "Warning: after %d iterations, timed out! (lock)\n", 
- 		   iteration));
- 	    leave(error("Timed out on lock file reads.  Leaving program"));
- 	  }
- 
- #endif
  	}
  
  	/* if we get here we can create the lock file, so lets do it! */

--- 377,382 -----
  	    if (st.st_ctime < now - 60)
  	      unlink (lock_name);
  	  }
  	}
  }
  
***************
*** 422,453
  
  #endif
  	}
- 
- 	/* if we get here we can create the lock file, so lets do it! */
- 
- 	if ((lock_fd = creat(lock_name, 0)) == -1) {
- 	  dprint(1, (debugfile,
- 		 "Can't create lock file: creat(%s) raises error %s (lock)\n", 
- 		  lock_name, error_name(errno)));
- 	  if (errno == EACCES)
- 	    leave(error1(
-                  "Can't create lock file!  I need write permission in %s!\n\r",
- 		  mailhome));
- 	  else {
- 	    dprint(1, (debugfile, 
- 		  "Error encountered attempting to create lock %s\n", 
- 		  lock_name));
- 	    dprint(1, (debugfile, "** %s - %s **\n", error_name(errno),
- 		  error_description(errno)));
- 	    PutLine1(LINES, 0,
-          "\n\rError encountered while attempting to create lock file %s;\n\r", 
- 		  lock_name);
- 	    PutLine2(LINES, 0, "** %s - %s **\n\r", error_name(errno),
- 		  error_description(errno));
- 	    leave();
- 	  }
- 	}
- 	close(lock_fd);	/* close it.  We don't want to KEEP the thing! */
  }
  
  unlock()

--- 378,383 -----
  	      unlink (lock_name);
  	  }
  	}
  }
  
  unlock()
diff -rc elm-1.5b/src/mailmsg2.c elm-1.5c/src/mailmsg2.c
*** elm-1.5b/src/mailmsg2.c	Tue May  5 11:54:10 1987
--- elm-1.5c/src/mailmsg2.c	Wed Jul 22 13:19:18 1987
***************
*** 211,217
  
  	/** write all header information into real_reply **/
  
! 	sprintf(filename2,"%s%d",temp_file, getpid()+1);
  	
  	/** try to write headers to new temp file **/
  

--- 211,217 -----
  
  	/** write all header information into real_reply **/
  
! 	sprintf(filename2,"%s%d",temp_hdr, getpid());
  	
  	/** try to write headers to new temp file **/
  
diff -rc elm-1.5b/src/newmbox.c elm-1.5c/src/newmbox.c
*** elm-1.5b/src/newmbox.c	Wed Jun 24 18:50:38 1987
--- elm-1.5c/src/newmbox.c	Wed Jul 29 11:12:13 1987
***************
*** 341,347
  	    }
  	  }
  
! 	  if (copyit) fputs(buffer, temp);
  	  line_bytes = (long) strlen(buffer); 
  	  line++;
  	  if (first_word(buffer,"From ")) {

--- 341,352 -----
  	    }
  	  }
  
! 	  if (copyit)
! 	    if(fprintf(temp, "%s", buffer) == EOF) {
! 	      error1("error writing %s - leaving mail unchanged", temp_filename);
! 	      leave();
! 	    }
! 
  	  line_bytes = (long) strlen(buffer); 
  	  line++;
  	  if (first_word(buffer,"From ")) {
***************
*** 451,456
  	  }
  	  bytes += (long) line_bytes;
  	}
  
  	header_table[count > 0? count-1:count].lines = line + 1;
  	

--- 467,478 -----
  	  }
  	  bytes += (long) line_bytes;
  	}
+ 
+ 	if (copyit)
+ 	  if(fflush(temp) == EOF) {
+ 	    error1("error writing %s - leaving mail unchanged", temp_filename);
+ 	    leave();
+ 	  }
  
  	header_table[count > 0? count-1:count].lines = line + 1;
  	
diff -rc elm-1.5b/src/utils.c elm-1.5c/src/utils.c
*** elm-1.5b/src/utils.c	Tue May  5 11:49:17 1987
--- elm-1.5c/src/utils.c	Wed Jul 22 13:20:30 1987
***************
*** 64,70
  	dprint(1, (debugfile,
  	     "     The composition file : %s%d\n", temp_file, getpid()));
  	dprint(1, (debugfile,
! 	     "     The header comp file : %s%d\n", temp_file, getpid()+1));
  	dprint(1, (debugfile,
  	     "     The readmsg data file: %s/%s\n", home, readmsg_file));
  

--- 64,70 -----
  	dprint(1, (debugfile,
  	     "     The composition file : %s%d\n", temp_file, getpid()));
  	dprint(1, (debugfile,
! 	     "     The header comp file : %s%d\n", temp_hdr, getpid()));
  	dprint(1, (debugfile,
  	     "     The readmsg data file: %s/%s\n", home, readmsg_file));
  
***************
*** 98,104
  	sprintf(buffer,"%s%d",temp_file, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
! 	sprintf(buffer,"%s%d",temp_file, getpid()+1);  /* editor buffer */
  	(void) unlink(buffer);
  
  	sprintf(buffer,"%s%s",temp_mbox, username);  /* temp mailbox */

--- 98,104 -----
  	sprintf(buffer,"%s%d",temp_file, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
! 	sprintf(buffer,"%s%d",temp_hdr, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
  	sprintf(buffer,"%s%s",temp_mbox, username);  /* temp mailbox */
***************
*** 110,115
  	sprintf(buffer,"%s%s.lock",mailhome, username); /* lock file */
  	(void) unlink(buffer);
  
  	if (! mail_only) {
  	  MoveCursor(LINES,0);
  	  Writechar('\n');

--- 110,118 -----
  	sprintf(buffer,"%s%s.lock",mailhome, username); /* lock file */
  	(void) unlink(buffer);
  
+ 	sprintf(buffer,"%s%s.%d",mailhome, username, getpid()); /* temp maildrop */
+ 	(void) unlink(buffer);
+ 
  	if (! mail_only) {
  	  MoveCursor(LINES,0);
  	  Writechar('\n');
***************
*** 135,141
  	sprintf(buffer,"%s%d",temp_file, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
! 	sprintf(buffer,"%s%d",temp_file, getpid()+1);  /* editor buffer */
  	(void) unlink(buffer);
  
  	if (! mail_only) {

--- 138,144 -----
  	sprintf(buffer,"%s%d",temp_file, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
! 	sprintf(buffer,"%s%d",temp_hdr, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
  	if (! mail_only) {
***************
*** 165,171
  	sprintf(buffer,"%s%d",temp_file, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
! 	sprintf(buffer,"%s%d",temp_file, getpid()+1);  /* editor buffer */
  	(void) unlink(buffer);
  
  	sprintf(buffer,"%s%s",temp_mbox, username);  /* temp mailbox */

--- 168,174 -----
  	sprintf(buffer,"%s%d",temp_file, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
! 	sprintf(buffer,"%s%d",temp_hdr, getpid());  /* editor buffer */
  	(void) unlink(buffer);
  
  	sprintf(buffer,"%s%s",temp_mbox, username);  /* temp mailbox */


-- 
Tim Rylance	Praxis Systems plc, 20 Manvers St, BATH BA1 1PX, UK
		...!uunet!mcvax!ukc!praxis!tkr

shields@ists (Paul Shields) (05/16/88)

In article <2287@newton.praxis.co.uk>, tkr@praxis.co.uk (Tim Rylance) writes:
[...]
> In fact Elm *never* checks for errors after writing.  I went through it 
> adding checks and trying to do something reasonably intelligent when a
> write fails.  I now give up immediately if /tmp is full on startup, and
> I copy /tmp/mbox.foo back to /usr/spool/mail/foo.<pid> and then rename
> the latter to avoid abandoning mail in /tmp if /usr/spool is full.
> I also removed the use of temporary file names constructed from getpid()+1
> and replaced the mailbox locking code (which appears to contain a race)
> with that from GNU Emacs.
[...]

The posted solution to this is nifty, but is not precisely what I'm looking
for. I'm trying to bring Elm 1.7 beta up under SunOS 3.5, and have encountered
difficulty with locking files across NFS.

Apparently the lockf() procedures in conjunction with the lockd daemon will 
do it for me.  But this raises compatibility issues, since I don't want
the mail delivery software to interfere, and the GNU code seems to use only
flock() or lock files, not lockf(). 

Has anyone fixed this yet?  Except for new mail delivery it doesn't appear
to be too much trouble, but I thought I'd ask first just in case anyone's 
already done this.

Thanks,
-- 
Paul Shields, shields@ists.yorku.CA, shields@yunccn.UUCP
(...utzoo!yunexus!ists, ...mnetor!ontmoh!yunccn)!shields
I wonder what Freud would have thought of self-serve gas stations?