[comp.lang.c] a couple of random questions

friedl@vsi.UUCP (Stephen J. Friedl) (04/13/88)

Hiho there,

     Two random questions.  First, the word "entry" used to be a
reserved keyword in C but appears to never have been used.  Does
anybody know the direction it might have taken had its use been
implemented?  I vaguely recall a similar keyword in FORTRAN,
but wonder if anybody has any stories from long ago in days of old.

     Second, what is the portable way to rewind a Unix file
descriptor?  On almost every machine I have ever used:

        lseek(fd, (off_t)0, SEEK_SET);

works because the offset is a byte count, but it is inevitable
that on some machine, off_t is the pointer to some kind of
struct, or at least is *not* simply a byte count.  What other
machines might work this way?  I know that the good old BDS C
compiler for the Z-80 measured its offset in records, but
nevertheless a zero arg did the trick.

     Related to this, how about a portable way to back up one
record?  On the assumption that doing math on an off_t is not
portable, I basically save the offset just before a read:

#define tell(fd)        lseek(fd, (off_t)0, SEEK_CUR)

extern off_t    lseek( /* int fd, off_t offset, int whence */ );

{
off_t   off;

        /* remember where each record starts before reading it */
        while (off = tell(fd), read(fd, buf, SIZE) == SIZE)
                if (some.condition->here)
                        lseek(fd, off, SEEK_SET);       /* back up */
}

Note that this is largely an academic exercise, and I know that
using dpANS buffered I/O resolves these issues, but I'm just
curious...
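
For reference, the save-the-offset idea above can be written as a small
self-contained sketch.  It assumes a POSIX-style system; the names
peek_record() and peek_demo() are made up for illustration and are not
part of any library.  The offset is only ever obtained from lseek() and
handed back to lseek(), with no arithmetic done on it.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `size` bytes, then put the descriptor
 * back where it was.  Returns read()'s result, or -1 if a seek fails. */
ssize_t peek_record(int fd, void *buf, size_t size)
{
    off_t start;
    ssize_t n;

    assert(buf != NULL);
    start = lseek(fd, (off_t)0, SEEK_CUR);   /* same as the tell() macro */
    if (start == (off_t)-1)
        return -1;                            /* e.g. fd is a pipe */
    n = read(fd, buf, size);
    if (lseek(fd, start, SEEK_SET) == (off_t)-1)
        return -1;
    return n;
}

/* Demo on a scratch file (the path is the caller's choice): peeking
 * twice must return the same bytes.  Returns 0 on success. */
int peek_demo(const char *path)
{
    char a[4], b[4];
    int ok, fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    ok = write(fd, "abcdefgh", 8) == 8
      && lseek(fd, (off_t)0, SEEK_SET) != (off_t)-1   /* rewind */
      && peek_record(fd, a, sizeof a) == 4
      && peek_record(fd, b, sizeof b) == 4
      && memcmp(a, b, sizeof a) == 0
      && memcmp(a, "abcd", 4) == 0;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Calling peek_demo() with the name of a writable scratch file should
return 0 on a system where lseek() behaves as described.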

-- 
Steve Friedl   V-Systems, Inc.   "Yes, I'm jeff@unh's brother"
friedl@vsi.com  {backbones}!vsi.com!friedl  attmail!vsi!friedl

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (04/14/88)

In article <530@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
| [...]
|      Two random questions.  First, the word "entry" used to be a
| reserved keyword in C but appears to never have been used.  Does
| anybody know the direction it might have taken had its use been
| implemented?  

  The FORTRAN version (the ENTRY statement) gave a subroutine an additional
entry point with its own name, which could be called like a separate
subroutine. I think I used it once, just to see how it worked.

|      Second, what is the portable way to rewind a Unix file
| descriptor?  On almost every machine I have ever used:
| 
|         lseek(fd, (off_t)0, SEEK_SET);
| 

Any implementation which doesn't use 
	lseek(int, long, int)
In the K&R manner (pg 164) will break virtually every program which uses
the feature. I have to check dpANS on this, or someone can post and tell
me that they found some way to justify doing something else.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

friedl@vsi.UUCP (Stephen J. Friedl) (04/15/88)

In an article, davidsen@steinmetz.ge.com (William E. Davidsen Jr) writes:
< In the first article  friedl@vsi.UUCP (Stephen J. Friedl) writes:
< |      Second, what is the portable way to rewind a Unix file
< | descriptor?  On almost every machine I have ever used:
< | 
< |         lseek(fd, (off_t)0, SEEK_SET);
< 
< Any implementation which doesn't use 
< 	lseek(int, long, int)
< In the K&R manner (pg 164) will break virtually every program which uses
< the feature. I have to check dpANS on this, or someone can post and tell
< me that they found some way to justify doing something else.

I believe that dpANS does not address lseek(2) because it is an
operating system function; they specify fseek(3) instead, where
the offset is defined to be in characters.  Presumably the stdio
library is required to "just figure it out" on a record-based
system.  I've seen it written somewhere that the only portable
way to get an lseek(2) offset is as a result from a previous lseek(2).


henry@utzoo.uucp (Henry Spencer) (04/15/88)

>      Second, what is the portable way to rewind a Unix file
> descriptor?  On almost every machine I have ever used:
> 
>         lseek(fd, (off_t)0, SEEK_SET);
> 
> works because the offset is a byte count, but it is inevitable
> that on some machine, off_t is the pointer to some kind of
> struct, or at least is *not* simply a byte count...

Not on a *Unix* machine.  There is no portable way to rewind a Unix
file descriptor, because Unix file descriptors are not portable!
On a Unix system, (off_t)0 is fine.  On a seriously non-Unix system,
you have to use stdio streams instead, in which case rewind() is
available.  For vaguely Unix-like systems, all bets are off.
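
A minimal sketch of the stdio route, assuming only the standard
<stdio.h> functions; count_and_rewind() is a made-up name used for
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the bytes left on a stream, then rewind it. */
long count_and_rewind(FILE *fp)
{
    long n = 0;

    assert(fp != NULL);
    while (getc(fp) != EOF)
        n++;
    rewind(fp);     /* back to the start; also clears EOF and error flags */
    return n;
}
```

After the call the next getc() returns the first character of the file
again, and no file descriptor or off_t is ever mentioned.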
-- 
"Noalias must go.  This is           |  Henry Spencer @ U of Toronto Zoology
non-negotiable."  --DMR              | {allegra,ihnp4,decvax,utai}!utzoo!henry

kenny@uiucdcsb.cs.uiuc.edu (04/16/88)

Subject: Re: a couple of random questions
/* Written 10:05 am  Apr 14, 1988 by davidsen@steinmetz.ge.com in uiucdcsb:comp.lang.c */
/* ---------- "Re: a couple of random questions" ---------- */
In article <530@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
|      Second, what is the portable way to rewind a Unix file
| descriptor?  On almost every machine I have ever used:
| 
|         lseek(fd, (off_t)0, SEEK_SET);
| 

Any implementation which doesn't use 
	lseek(int, long, int)
In the K&R manner (pg 164) will break virtually every program which uses
the feature. I have to check dpANS on this, or someone can post and tell
me that they found some way to justify doing something else.
/* End of text from uiucdcsb:comp.lang.c */

dpANS doesn't *have* lseek, since it's a low-level routine, but rather
has fseek; it also has ftell.

On *binary* files, you can fseek to anywhere.

On *text* files, the only things you can do portably are:
	fseek (stream, 0L, SEEK_SET); /* Rewind */
	fseek (stream, 0L, SEEK_CUR); /* Do nothing */
	fseek (stream, 0L, SEEK_END); /* Position to end of file */
	fseek (stream, p, SEEK_SET);  /* Position to previous location */

In the last form, the seek pointer p must be a long returned from an
earlier call to ftell.
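
As a sketch of that last form, here is one way to back up over a line
using nothing but an ftell() cookie.  The function name
read_line_and_back_up() is invented for this example; only fgets(),
ftell() and fseek() come from the library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one line into buf, then seek back to where
 * the line began, using only an ftell() cookie.  Returns 0 on success. */
int read_line_and_back_up(FILE *fp, char *buf, int size)
{
    long mark;

    assert(fp != NULL && buf != NULL && size > 0);
    mark = ftell(fp);               /* opaque cookie; no arithmetic on it */
    if (mark == -1L)
        return -1;
    if (fgets(buf, size, fp) == NULL)
        return -1;
    return fseek(fp, mark, SEEK_SET) == 0 ? 0 : -1;
}
```

Calling it twice in a row should yield the same line both times, since
the second call starts from the position the first one restored.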

While the standard doesn't mention this explicitly, I would not be
upset to see an implementation disallow attempts to seek beyond the
end of the last write() to a text file; in other words, if you rewrite
stuff in the middle of a text file, anything beyond the point of the
rewrite *may* be lost.  This restriction is essential to handle media
whose nature forbids `forward read after write' operations.  Moreover,
I wouldn't be distressed to see a restriction forbidding read()
operations which extend beyond the end of the last write().

Doug, did you ever submit the change request on the consideration that
I outlined in the last paragraph?  I know we discussed this, but I
don't remember the conclusion.

Kevin

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/16/88)

In article <541@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>I believe that dpANS does not address lseek(2) because it is an
>operating system function; they specify fseek(3) instead, where
>the offset is defined to be in characters.  Presumably the stdio
>library is required to "just figure it out" on a record-based
>system.  I've seen it written somewhere that the only portable
>way to get an lseek(2) offset is as a result from a previous lseek(2).

lseek() is not in the dpANS for C because it refers to a file descriptor,
and no file-descriptor oriented functions are permitted in the dpANS.
POSIX (actually IEEE 1003.1) on the other hand intends to standardize
such functions.

lseek() works in bytes, using the UNIX file model (sequence of bytes).
fseek() to an absolute position only works PORTABLY if the cookie you
give it is one obtained from ftell().  rewind() will always do what
one would think.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/16/88)

In article <165600037@uiucdcsb> kenny@uiucdcsb.cs.uiuc.edu writes:
>Doug, did you ever submit the change request on the consideration that
>I outlined in the last paragraph?

I honestly don't recall at this point.
The current dpANS says that fseek() returns nonzero for an "improper request"
but doesn't say what that consists of.  I think it's within the rights
of an implementation to say that seeking past EOF is "improper".
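
Since the standard leaves it to the implementation to decide which
requests cannot be satisfied, a careful caller checks the result.  In
ANSI C as eventually standardized, fseek() returns nonzero only for a
request that cannot be satisfied.  The sketch below assumes that
behaviour; try_seek() is a made-up name, not a library function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: attempt a seek; if it is refused, return to the
 * position we started from.
 * Returns 0 if the seek worked, 1 if it was refused and the old
 * position was restored, -1 if even that failed. */
int try_seek(FILE *fp, long offset, int whence)
{
    long here;

    assert(fp != NULL);
    here = ftell(fp);               /* cookie for the current position */
    if (here == -1L)
        return -1;
    if (fseek(fp, offset, whence) == 0)
        return 0;
    return fseek(fp, here, SEEK_SET) == 0 ? 1 : -1;
}
```

Which requests are refused is up to the implementation, so the return
value is the only thing portable code can rely on.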

dhesi@bsu-cs.UUCP (Rahul Dhesi) (04/18/88)

In article <7706@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>)
writes:
>fseek() to an absolute position only works PORTABLY if the cookie you
>give it is one obtained from ftell().

VAX/VMS C, as always, is an exception.  Open a standard VMS text file
for read, then:

(a) keep reading until you hit end-of-file
(b) do an ftell
(c) do an fseek to end-of-file (with whence=2, offset=0L)
(d) do an ftell again

The two ftells will give different answers.

(I last tried this some months ago.  VMS C evolves nearly as fast as
fruit flies do so things might have changed since then.)
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

karl@haddock.ISC.COM (Karl Heuer) (04/19/88)

In article <541@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>[in ANSI fseek()] the offset is defined to be in characters.

This is true for binary streams only.  The offset for a text stream must be
either zero or (with SEEK_SET) the cookie returned by ftell().
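
For the binary-stream case, a minimal sketch of backing up one record by
offset arithmetic.  RECSIZE and back_up_one_record() are hypothetical
names chosen for the example, and the stream is assumed to have been
opened in binary mode.

```c
#include <assert.h>
#include <stdio.h>

#define RECSIZE 4   /* hypothetical fixed record length, in characters */

/* Back up one record on a stream opened in binary mode ("rb", "wb+", ...).
 * Returns fseek()'s result: 0 on success, nonzero if refused. */
int back_up_one_record(FILE *fp)
{
    assert(fp != NULL);
    return fseek(fp, -(long)RECSIZE, SEEK_CUR);
}
```

On a text stream the same call is not portable, since a nonzero offset
there must be an ftell() cookie used with SEEK_SET.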

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/24/88)

In article <2636@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>(a) keep reading until you hit end-of-file
>(b) do an ftell
>(c) do an fseek to end-of-file (with whence=2, offset=0L)
>(d) do an ftell again
>The two ftells will give different answers.

Although I believe this to be correct, I don't see what relevance it has
for the preceding discussion.  Certainly we never promised that the
cookies returned by ftell() were uniquely determined by the byte-stream
model position.  In fact, in a record-oriented architecture, it is easy
to imagine that an ftell() cookie might represent the same position as
either the record number plus an offset into it, or as the next record
number minus an offset back into the preceding record, depending on how
one has been manipulating the stream.

dhesi@bsu-cs.UUCP (Rahul Dhesi) (04/26/88)

In article <7745@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>)
writes:
[about my observation that ftell() on VMS gives different values for
the same file position reached in different ways]

>Although I believe this to be correct, I don't see what relevance it has
>for the preceding discussion.  Certainly we never promised that the
>cookies returned by ftell() were uniquely determined by the byte-stream
>model position.

My thinking was affected by my VMS C manual, which says that "With
record files, ftell returns the starting position of the current
record, not the current byte offset."  What I observed seems to
illustrate that VMS C's idea of "the current record" depends on how you
reached end-of-file.

I certainly don't accuse the ANSI C committee's actions of causing VMS
C to do what it does.  Exactly the opposite, in fact.

terry@wsccs.UUCP (Every system needs one) (04/30/88)

In article <2736@bsu-cs.UUCP>, dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
> [my observation that ftell() on VMS gives different values for
> the same file position reached in different ways]
> 
> My thinking was affected by my VMS C manual, which says that "With
> record files, ftell returns the starting position of the current
> record, not the current byte offset."  What I observed seems to
> illustrate that VMS C's idea of "the current record" depends on how you
> reached end-of-file.

Another soul relegated to NL: by a buggy ftell()!  The specific problem you
get is this:  when reading a record oriented file using fgetc() (and also
fgets(), for all I know), you read a record [read...read...read].  This
record stops at the end of the record, where VMS kindly supplies a '\n',
pretending it's a stream file (as a record separator).  The problem is
that say you are bouncing around in a file, so you want to fseek() to
reposition the pointer to the record following the one you just read...
logically, you ftell(), and store the result (which is the record number,
not the byte offset into the file).  The problem you run into is that
you fseek() to the location later AND IT PUTS YOU AT THE START OF THE RECORD
YOU JUST DID THE FTELL FROM!  What's going on?!? You KNOW you read the
full record!

The problem is that the record pointer is not advanced to the next record
until you read the first character of that record, even after VMS has gone
and faked up the '\n' for you!  The proper operation would be to go and
advance the record pointer after faking the '\n' record terminator.  After
all, the next thing you read will be the first byte of that record, so one
would assume that if you are pointing at the front of a record and did an
fseek(ftell()), you would still be there, instead of the previous record!

Here is a fix that will be portable when this bug is finally fixed in a
future revision of the VMS libraries, God willing.  It makes the assumption
that you are using the standard C calls to do I/O and haven't done any of
the weird stuff required to talk to strange files.  Making the arguments
match is your own problem... the following code took 5 days:

[Should I declare this shareware and demand $5.00 or a first born male child?]

#include <stdio.h>

#define ftell(fp) dftell(fp)	/* Hi.  I live in a global include and*/
				/* come to visit all your .c files!*/
extern long dftell(FILE *fp);

....		/* some unreasonably large amount of normal code*/
....
....

/* this must be the last thing in some poor .c file*/
#undef ftell		/* so we can use the real one. UGH.*/
long dftell(FILE *fp)
{
	ungetc(getc(fp), fp);		/* force record pointer update*/
	return( ftell(fp));		/* return updated position*/
}

This is damned ugly, but will save you doing it everywhere or changing all
ftell()'s everywhere.  It also has the advantage of allowing you to say

#ifdef VMS
	/* code that is otherwise reasonable*/
#endif

instead of carrying it all over.  Never tried it under ANSI C because there
isn't one yet, at least until the standard lands on my head.  It would probably
bomb if they saw it, and become an example of "unallowable code" (volatile dftell()?).


| Terry Lambert           UUCP: ...{ decvax, ihnp4 } ...utah-cs!century!terry |
| @ Century Software        OR: ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my company's, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
| 'Admit it!  You're just harassing me because of the quote in my signature!' |