[comp.unix.questions] fopen

6600pete@hub.UUCP (12/07/89)

When one opens a file under *most* flavors of UN*X (I realize this is
the kind of thing that will be system-dependent, though it oughtn't)
with fopen ( ..., "a" ), the file mark is supposed to be moved to EOF
before every write. Now, how is this done? Are there two system calls,
one to move the file mark and one to do the write, or is there one
system call, "append"? If the latter, then this is an easier solution
for a problem I have than figuring out how to do record locking.
-------------------------------------------------------------------------------
Pete Gontier   : InterNet: 6600pete@ucsbuxa.ucsb.edu, BitNet: 6600pete@ucsbuxa
Editor, Macker : Online Macintosh Programming Journal; mail for subscription
Hire this kid  : Mac, DOS, C, Pascal, asm, excellent communication skills

stevens@hsi.UUCP (Richard Stevens) (12/07/89)

In article <3250@hub.UUCP>, 6600pete@hub.UUCP writes:
> When one opens a file under *most* flavors of UN*X (I realize this is
> the kind of thing that will be system-dependent, though it oughtn't)
> with fopen ( ..., "a" ), the file mark is supposed to be moved to EOF
> before every write. Now, how is this done? Are there two system calls,
> one to move the file mark and one to do the write, or is there one
> system call, "append"?

With System V Release 2, fopen specifies the O_APPEND flag to the
open system call if you specify the "a" mode.  This has the kernel
move the inode's read/write offset to the end of the file every
time you write to the file.  Hence only one system call is required.
I suspect the later releases of System V also do this.

Interesting, however, is that the 4.3BSD source differs.  It does
an lseek to the EOF when fopen is called, and that's it.  4.3 does
have an O_APPEND option to open, but it doesn't appear to be used.
The 4.3 man page for fopen also doesn't go to the lengths that
the System V man page does in specifying that "a" really means that
every write gets appended, regardless of the file's current position.
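
The difference between the two behaviors can be seen with a small test.
This is only a sketch, assuming a POSIX system with O_APPEND; the function
name size_after_rewind_write and the file path passed to it are hypothetical
and come from neither source tree. It writes four bytes, reopens the file,
moves the offset back to 0, and writes two more bytes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the final file size, or -1 on error.
 * use_append != 0 models an fopen "a" that passes O_APPEND to open();
 * use_append == 0 models one that only seeks to EOF once at open time. */
off_t size_after_rewind_write(const char *path, int use_append)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, "AAAA", 4) != 4) {
        close(fd);
        return -1;
    }
    close(fd);

    fd = open(path, O_WRONLY | (use_append ? O_APPEND : 0));
    if (fd < 0)
        return -1;
    lseek(fd, 0, SEEK_END);   /* the one-time seek done at open */
    lseek(fd, 0, SEEK_SET);   /* later, the program moves the offset */
    if (write(fd, "BB", 2) != 2) {
        close(fd);
        return -1;
    }
    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

With use_append set, the kernel moves the offset to EOF before the second
write, so the file should end up 6 bytes long. Without it, the write lands
at offset 0 and the file stays at 4 bytes.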

	Richard Stevens
	Health Systems International, New Haven, CT
	   stevens@hsi.com
           ... { uunet | yale } ! hsi ! stevens

meissner@dg-rtp.dg.com (Michael Meissner) (12/07/89)

In article <3250@hub.UUCP> 6600pete@hub.UUCP writes:

|  When one opens a file under *most* flavors of UN*X (I realize this is
|  the kind of thing that will be system-dependent, though it oughtn't)
|  with fopen ( ..., "a" ), the file mark is supposed to be moved to EOF
|  before every write. Now, how is this done? Are there two system calls,
|  one to move the file mark and one to do the write, or is there one
|  system call, "append"? If the latter, then this is an easier solution
|  for a problem I have than figuring out how to do record locking.

In "modern" Unixes (i.e., System V.[01234], Berkeley BSD 4.[23],
possibly earlier in System III, and Berkeley 4.1, but I don't have
manuals for them), the open system call takes a flag (O_APPEND) that
says to reset the file position to the end of the file whenever a
write system call occurs.  On a filesystem local to the machine, this
is done atomically with the write call.  I'm not sure whether this is
guaranteed to be atomic under NFS, but I suspect not, particularly if
the NFS server is not a UNIX system (such as a VAX running VMS or IBM
mainframe).

My version 7 manual does not list any flags for open, and the fopen
man page does not make any promises about rubber-banding the file
position to the end of the file.

--
Michael Meissner, Data General.
Until 12/15:	meissner@dg-rtp.DG.COM
After 12/15:	meissner@osf.org

6600pete@hub.UUCP (12/07/89)

From article <895@hsi86.hsi.UUCP>, by stevens@hsi.UUCP (Richard Stevens):
- In article <3250@hub.UUCP>, 6600pete@hub.UUCP writes:
-- When one opens a file under *most* flavors of UN*X
-- with fopen ( ..., "a" ), the file mark is supposed to be moved to EOF
-- before every write. Now, how is this done? Are there two system calls,
-- one to move the file mark and one to do the write, or is there one
-- system call, "append"?
-
- With System V Release 2, [ there is one system call ].
- I suspect the later releases of System V also do this.
-
- Interesting, however, is that the 4.3BSD source differs.  It does
- an lseek to the EOF when fopen is called, and that's it.  4.3 does
- have an O_APPEND option to open, but it doesn't appear to be used.
 
From article <MEISSNER.89Dec6215032@tiktok.rtp.dg.com>, by meissner@dg-rtp.dg.com
(Michael Meissner):
- On a filesystem local to the machine, [ the append ]
- is done atomically with the write call.  I'm not sure whether this is
- guaranteed to be atomic under NFS, but I suspect not, particularly if
- the NFS server is not a UNIX system (such as a VAX running VMS or IBM
- mainframe).
 
- My version 7 manual does not list any flags for open, and the fopen
- man page does not make any promises about rubber-banding the file
- position to the end of the file.
 
Perhaps the best way to do it, then, is to call open() with O_APPEND,
then pass the handle to fdopen()? What does anyone think?
-------------------------------------------------------------------------------
Pete Gontier   : InterNet: 6600pete@ucsbuxa.ucsb.edu, BitNet: 6600pete@ucsbuxa
Editor, Macker : Online Macintosh Programming Journal; mail for subscription
Hire this kid  : Mac, DOS, C, Pascal, asm, excellent communication skills

cpcahil@virtech.uucp (Conor P. Cahill) (12/07/89)

In article <3250@hub.UUCP>, 6600pete@hub.UUCP writes:
> When one opens a file under *most* flavors of UN*X (I realize this is
> the kind of thing that will be system-dependent, though it oughtn't)
> with fopen ( ..., "a" ), the file mark is supposed to be moved to EOF
> before every write. Now, how is this done? Are there two system calls,
> one to move the file mark and one to do the write, or is there one
> system call, "append"? If the latter, then this is an easier solution
> for a problem I have than figuring out how to do record locking.

There exists an append mode for open files where the kernel automatically
places all writes at the end of file.  This is what fopen(3) will use
under unix.

Under other os's this will depend upon the capabilities of the os.

If the system has an append mode,
	 then it will probably be used.
Else If the system has a mechanism to move around a file (like unix lseek())
	it will probably be used to move to the end of the file before each
	write.
Else
	the library could just read data until it came to EOF
	and then write the data.  (Yes, this would be very inefficient)


-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

chris@mimsy.umd.edu (Chris Torek) (12/07/89)

In article <895@hsi86.hsi.UUCP> stevens@hsi.UUCP (Richard Stevens) writes:
>... Interesting, however, is that the 4.3BSD source differs.  It does
>an lseek to the EOF when fopen is called, and that's it.  4.3 does
>have an O_APPEND option to open, but it doesn't appear to be used.

This is scheduled to change (to match the wording in the ANSI standard).
I also wrote code to assert FAPPEND mode on fdopen(fd, "a").  This is a
bit nastier, but would seem to be required for reasons of sanity.  I do
not, however, clear FAPPEND on other fdopen calls.

What does SysV do?  What does the SVID say?
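
Turning on append mode for a descriptor that is already open can be done
with fcntl(). This is a sketch of the idea only, not the code referred to
above: it uses the POSIX flag name O_APPEND where the post says FAPPEND,
and the helper name set_append_flag is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Turn on append mode for an already-open descriptor.
 * Returns 0 on success, -1 on error. */
int set_append_flag(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (flags & O_APPEND)
        return 0;                       /* already set */
    return fcntl(fd, F_SETFL, flags | O_APPEND) == -1 ? -1 : 0;
}
```

An fdopen(fd, "a") that behaved as described could call something like
this before building the stream. Note that the flag is a property of the
open file, so it is also seen by any other descriptor that shares it
through dup() or fork().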
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

rec@dg.dg.com (Robert Cousins) (12/07/89)

In article <3250@hub.UUCP> 6600pete@hub.UUCP writes:
>When one opens a file under *most* flavors of UN*X (I realize this is
>the kind of thing that will be system-dependent, though it oughtn't)
>with fopen ( ..., "a" ), the file mark is supposed to be moved to EOF
>before every write. Now, how is this done? Are there two system calls,
>one to move the file mark and one to do the write, or is there one
>system call, "append"? If the latter, then this is an easier solution
>for a problem I have than figuring out how to do record locking.

It is important to point out that use of "a" mode in some circumstances
will not work as anticipated.  This is in any environment in which NFS
is used and two programs are writing to the same file without locking
the records in an effective fashion.  The reason for this is that the
NFS protocol does not have any concept of "guaranteed append" so the
client operating system translates append writes into something more
like lseek-write combinations which are more-or-less atomic.  The problem
with this is that the client operating system has its own idea of where
the end of the file is and therefore where to append.  If another client
has appended something to the file in the mean time it could be lost.

This all comes from the fact that NFS is a stateless protocol.  Each
NFS operation carries with it all (or at least is supposed to) information
the server needs to complete the operation.  Furthermore, all operations
are idempotent and therefore can be repeatedly performed.  (This is
how they get away with using UDP/IP.)  In effect, an NFS write translates
into a "write x bytes in file y starting at location z."  If the same
write is performed several times (since UDP/IP can deliver a request
multiple times), the data file's contents are the same.
Had this been a stateful protocol ("append x bytes to file y") and multiple
requests were delivered, one could easily see a datafile with a bad
case of the "stutters."
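
The idempotence argument can be imitated on a local file. The sketch below
is a single-process simulation, not NFS code; the function name
size_after_duplicate_delivery is hypothetical. It "delivers" the same
3-byte request twice, either as "write x bytes at offset z" or as a
hypothetical "append x bytes".

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the final file size, or -1 on error.
 * as_append == 0: each delivery writes "abc" at fixed offset 4.
 * as_append != 0: each delivery appends "abc" at the current EOF. */
off_t size_after_duplicate_delivery(const char *path, int as_append)
{
    int flags = O_RDWR | O_CREAT | O_TRUNC | (as_append ? O_APPEND : 0);
    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, "log:", 4) != 4) {
        close(fd);
        return -1;
    }
    for (int i = 0; i < 2; i++) {          /* same request, twice */
        if (!as_append && lseek(fd, 4, SEEK_SET) == (off_t)-1) {
            close(fd);
            return -1;
        }
        if (write(fd, "abc", 3) != 3) {
            close(fd);
            return -1;
        }
    }
    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

The positional form should leave 7 bytes ("log:abc") however many times the
request is repeated. The append form should leave 10 bytes ("log:abcabc"),
which is the "stutter" described above.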

Robert Cousins
Dept. Mgr, Workstation Dev't.
Data General Corp.

Speaking for myself alone.

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (12/08/89)

  In retrospect I think that one more key letter would have been useful
in pANS. The use of "a" to mean 'always append, never rewrite' is a
useful one, but often "a" is used when what is meant is to 'open the
existing file if there is one, otherwise create one.'

  If "a" really means append only, then the second use requires:
    open for "r"
    if that fails open for "w+"
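
In C the two-step open might be written as below. This is a sketch under
one assumption that departs from the steps above: it uses "r+" rather than
"r" for the first attempt, so that the existing file can also be written.
The helper name open_existing_or_create is hypothetical.

```c
#include <stdio.h>

/* Open an existing file for update without truncating it, or create
 * it if it does not exist.  The stream starts at offset 0 and writes
 * are NOT forced to EOF, unlike mode "a". */
FILE *open_existing_or_create(const char *path)
{
    FILE *fp = fopen(path, "r+");   /* succeeds only if the file exists */
    if (fp == NULL)
        fp = fopen(path, "w+");     /* otherwise create it */
    return fp;
}
```

There is a window between the two fopen() calls in which another process
could create the file, and "w+" would then truncate it.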

  This is not a big deal, but either another open type in addition to
{rwa} could have been provided, or another modifier in addition to {+b}
would suffice. Something for the next committee to consider, I suspect.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/08/89)

In article <3263@hub.UUCP> 6600pete@hub.UUCP writes:
>Perhaps the best way to do it, then, is to call open() with O_APPEND,
>then pass the handle to fdopen()? What does anyone think?

I think you haven't shown us a clear conception about what you
need to do.  Is there some reason for not using fopen(...,"a")
and letting the C implementation worry about the details?

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/08/89)

In article <21152@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>This is scheduled to change (to match the wording in the ANSI standard).
>I also wrote code to assert FAPPEND mode on fdopen(fd, "a").  This is a
>bit nastier, but would seem to be required for reasons of sanity.  I do
>not, however, clear FAPPEND on other fdopen calls.
>What does SysV do?  What does the SVID say?

All the System V documentation and sources I could find, including SVID
Issue 2, indicate that the proper open() modes were the responsibility
of the invoker of fdopen(), not of the fdopen() implementation.

IEEE Std 1003.1 is a bit more explicit in its description of fdopen():
"The type of the stream must be allowed by the mode of the open file".
To me this indicates clearly that the System V implementation is proper.

In fact, I think it is a disservice for some other (4.nBSD?)
implementation to add functionality such as you describe.  That
could mislead programmers on such systems into thinking that that
implementation's behavior was universal, whereas it is not.

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/08/89)

In article <1989Dec7.130813.4992@virtech.uucp> cpcahil@virtech.uucp (Conor P. Cahill) writes:
-Under other os's this will depend upon the capabilities of the os.
-Else If the system has a mechanism to move around a file (like unix lseek())
-	it will probably be used to move to the end of the file before each
-	write.

On a single-user non-multitasking system, a better implementation would
be to seek to the end only on the initial open, not for each write.

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/08/89)

In article <1890@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
-  In retrospect I think that one more key letter would have been useful
-in pANS. The use of "a" to mean 'always append, never rewrite' is a
-useful one, but often "a" is used when what is meant is to 'open the
-existing file if there is one, otherwise create one.'

If so, that's simply a user error.  That is not and never has been
the meaning of the "a" fopen() mode.

trt@rti.UUCP (Thomas Truscott) (12/09/89)

> It is important to point out that use of "a" mode in some circumstances
> will not work as anticipated.  ... [problems with NFS noted]

> Had this been a stateful protocol ("append x bytes to file y") and multiple
> requests were delivered, one could easily see a datafile with a bad
> case of the "stutters."

Except of course that stateful protocols invariably have "at most once"
semantics.  Since it is stateful the protocol can easily
detect and discard the duplicate requests.
	Tom Truscott

les@chinet.chi.il.us (Leslie Mikesell) (12/09/89)

In article <11775@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:

>On a single-user non-multitasking system, a better implementation would
>be to seek to the end only on the initial open, not for each write.

But what if the single-user non-multitasking system is networked to
a shared filesystem and you would like your log files to work?

Les Mikesell
  les@chinet.chi.il.us

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/11/89)

In article <1989Dec9.000805.1617@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
-In article <11775@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
->On a single-user non-multitasking system, a better implementation would
->be to seek to the end only on the initial open, not for each write.
-But what if the single-user non-multitasking system is networked to
-a shared filesystem and you would like your log files to work?

Suggest you look up "system" in a decent engineering textbook.
You described a system that doesn't fit my qualifiers.

rec@dg.dg.com (Robert Cousins) (12/11/89)

In article <3319@rti.UUCP> trt@rti.UUCP (Thomas Truscott) writes:
>> It is important to point out that use of "a" mode in some circumstances
>> will not work as anticipated.  ... [problems with NFS noted]
>
>> Had this been a stateful protocol ("append x bytes to file y") and multiple
>> requests were delivered, one could easily see a datafile with a bad
>> case of the "stutters."
>
>Except of course that stateful protocols invariably have "at most once"
>semantics.  Since it is stateful the protocol can easily
>detect and discard the duplicate requests.
>	Tom Truscott

It is true that there are a number of ways in which NFS could have
been designed differently. However, the point is, fopen(..., "a")
does have some implications in an NFS environment which do
derive from the early design decision to use a stateless
protocol.

Question in general:  How could NFS have been designed (from
scratch) to be more closely representative of UNIX semantics
while keeping its "nice" features?  I think it is time to
have this discussion again. Maybe some new ideas will come up.

Robert Cousins
Dept. Mgr, Workstation Dev't.
Data General Corp.

Speaking for myself alone.

bobmon@iuvax.cs.indiana.edu (RAMontante) (12/12/89)

gwyn@brl.arpa (Doug Gwyn) <11775@smoke.BRL.MIL> :
-On a single-user non-multitasking system, a better implementation
-[of append] would
-be to seek to the end only on the initial open, not for each write.

Is the process forbidden from doing an lseek, or are you allowing the
programmer to reposition somewhere else in the file?  What is the
semantics of the append behavior?

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/12/89)

In article <31276@iuvax.cs.indiana.edu> bobmon@iuvax.cs.indiana.edu (RAMontante) writes:
>gwyn@brl.arpa (Doug Gwyn) <11775@smoke.BRL.MIL> :
>-On a single-user non-multitasking system, a better implementation
>-[of append] would
>-be to seek to the end only on the initial open, not for each write.
>Is the process forbidden from doing an lseek, or are you allowing the
>programmer to reposition somewhere else in the file?  What is the
>semantics of the append behavior?

You must mean fseek(), as use of lseek() in conjunction with a stdio
stream can break stdio operation.

fseek() on an "a" mode stream could report failure (or, to be fancy,
it could succeed if the f.p.i. wouldn't be changed by the seek).

However, the story for an "a+" mode stream is different, because so
far as I can determine reads from the stream can be initiated anywhere
by preceding them with fseek() calls, and only writes are required to
jump to the end of the file.

Whether or not seek before write would be necessary depends on how
much state information is maintained for the stream by the stdio
implementation.
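
The "a+" case can be tried out with a few lines of stdio. This is a sketch
only, assuming an implementation in which "a+" forces writes to the end of
the file; the function name demo_a_plus is hypothetical.

```c
#include <stdio.h>

/* Create a 4-byte file, reopen it in "a+" mode, read its first byte
 * after an fseek(), then seek back to 0 and write.  Stores the byte
 * read and the final file size.  Returns 0 on success, -1 on error. */
int demo_a_plus(const char *path, char *first, long *final_size)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fputs("head", fp);
    fclose(fp);

    fp = fopen(path, "a+");
    if (fp == NULL)
        return -1;
    fseek(fp, 0L, SEEK_SET);        /* reads may start anywhere */
    int c = fgetc(fp);
    fseek(fp, 0L, SEEK_SET);        /* a seek is needed between read and write */
    fputs("tail", fp);              /* the write should still go to EOF */
    fflush(fp);
    fseek(fp, 0L, SEEK_END);
    *final_size = ftell(fp);
    fclose(fp);
    if (c == EOF)
        return -1;
    *first = (char)c;
    return 0;
}
```

Where writes are forced to the end, the byte read should be 'h' and the
final size 8 ("headtail"). An implementation that only seeks once at open
time would instead overwrite "head" and leave 4 bytes.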

les@chinet.chi.il.us (Leslie Mikesell) (12/12/89)

In article <11785@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:

>-In article <11775@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>->On a single-user non-multitasking system, a better implementation would
>->be to seek to the end only on the initial open, not for each write.
>-But what if the single-user non-multitasking system is networked to
>-a shared filesystem and you would like your log files to work?

>Suggest you look up "system" in a decent engineering textbook.
>You described a system that doesn't fit my qualifiers.

I fail to see how providing each user-level process with its own CPU
and i/o facilities would break anyone's concept of a "system".  Do
you mean that all filesystem clients and servers must maintain state
information to be worthy of being called a "system"?

Les Mikesell
  les@chinet.chi.il.us

Kemp@DOCKMASTER.NCSC.MIL (12/13/89)

Michael Meissner writes:
 > On a filesystem local to the machine, this [seeking to EOF] is
 > done atomically with the write call.  I'm not sure whether this
 > is guaranteed to be atomic under NFS, but I suspect not, particularly
 > if the NFS server is not a UNIX system (such as a VAX running VMS
 > or IBM mainframe).

This has *nothing* to do with the NFS server.  The client is responsible
for maintaining whatever state is associated with the open file,
including the seek position.

From the NFS Protocol Spec, Version 2:

     NFSPROC_WRITE(writeargs)
     struct writeargs {
         fhandle file;
         unsigned beginoffset;
         unsigned offset;
         unsigned totalcount;
         opaque data<NFS_MAXDATA>;
     };

   'Writes "data" beginning at "offset" bytes from the beginning of
   "file".  The first byte of the file is at offset zero.  ...  The
   write operation is atomic.  Data from this call to WRITE will not
   be mixed with data from another client's calls.

   Note: The arguments "beginoffset" and "totalcount" are ignored and
   are removed in the next protocol revision.'

Dave Kemp <Kemp@dockmaster.ncsc.mil> "My sister is a yahoo"

peter@ficc.uu.net (Peter da Silva) (12/13/89)

Names removed to protect the guilty.

a>On a single-user non-multitasking system, a better implementation would
a>be to seek to the end only on the initial open, not for each write.

b>But what if the single-user non-multitasking system is networked to
b>a shared filesystem and you would like your log files to work?

a>Suggest you look up "system" in a decent engineering textbook.
a>You described a system that doesn't fit my qualifiers.

b>I fail to see how providing each user-level process with its own CPU
b>and i/o facilities would break anyone's concept of a "system".

Now it's not a single-user non-multitasking system. That is, you didn't
fit his qualifiers. Like he said.

Now for something completely different:

b>Do you mean that all filesystem clients and servers must maintain state
b>information to be worthy of being called a "system"?

Well, it's a quality of information issue. But I agree with P1003.1 on
this one... stateless NFS isn't acceptable.
-- 
`-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
 'U`  Also <peter@ficc.lonestar.org> or <peter@sugar.lonestar.org>.
"It was just dumb luck that Unix managed to break through the Stupidity Barrier
and become popular in spite of its inherent elegance." -- gavin@krypton.sgi.com

les@chinet.chi.il.us (Leslie Mikesell) (12/14/89)

In article <21726@adm.BRL.MIL> Kemp@DOCKMASTER.NCSC.MIL writes:
>Michael Meissner writes:

> > On a filesystem local to the machine, this [seeking to EOF] is
> > done atomically with the write call.  I'm not sure whether this
> > is guaranteed to be atomic under NFS, but I suspect not, particularly
> > if the NFS server is not a UNIX system (such as a VAX running VMS
> > or IBM mainframe).

>This has *nothing* to do with the NFS server.  The client is responsible
>for maintaining whatever state is associated with the open file,
>including the seek position.

Which means that it can't be guaranteed to know the current EOF position
if there are multiple writers.  The server knows the EOF position, of
course, but doesn't accept "append" requests.  With a stateless protocol
the possibility would then exist for a request to succeed, but the
ack back to the client to be lost resulting in a retry on the request.
If another "append" request intervened before the retry, the write
would be duplicated in different places.

Les Mikesell
  les@chinet.chi.il.us

guy@auspex.UUCP (Guy Harris) (12/19/89)

>Which means that it can't be guaranteed to know the current EOF position
>if there are multiple writers.

Correct.

>The server knows the EOF position, of course, but doesn't accept "append"
>requests.  With a stateless protocol the possibility would then exist for
>a request to succeed, but the ack back to the client to be lost resulting
>in a retry on the request.  If another "append" request intervened before
>the retry, the write would be duplicated in different places.

No, since the "write" request contains the position in the file at which
the "write" is to occur; instead, you run the risk of having your data
overwritten by another writer.  If the server *did* accept an "append"
request that implicitly wrote to the end of the file, a retry would run
the risk of causing the write to be duplicated in different places.
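
The overwrite risk can be imitated on a local file. This is a
single-process simulation rather than NFS code: two "clients" each record
the same idea of EOF and then write at that offset, the way an offset-based
write request would. The names write_at and stale_eof_overwrite are
hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes at an explicit offset, like "write x bytes at z". */
static int write_at(int fd, off_t off, const char *buf, size_t len)
{
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, buf, len) == (ssize_t)len ? 0 : -1;
}

/* Copies the final file contents into out (NUL-terminated) and
 * returns their length, or -1 on error. */
int stale_eof_overwrite(const char *path, char *out, size_t outlen)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0 || outlen == 0)
        return -1;
    if (write(fd, "base", 4) != 4) {
        close(fd);
        return -1;
    }
    off_t eof_a = lseek(fd, 0, SEEK_END);   /* client A thinks EOF is 4 */
    off_t eof_b = eof_a;                    /* client B sampled the same EOF */

    if (write_at(fd, eof_a, "AAA", 3) != 0 ||
        write_at(fd, eof_b, "BBB", 3) != 0) {   /* lands on top of A's data */
        close(fd);
        return -1;
    }
    lseek(fd, 0, SEEK_SET);
    ssize_t n = read(fd, out, outlen - 1);
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

The file should end up as "baseBBB": client A's record is lost rather than
duplicated, which is the failure mode described above for offset-based
writes.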