[comp.lang.c] fgetpos, fsetpos, and ANSIness in general

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/07/89)

In article <OTTO.89Oct8093607@tukki.jyu.fi> otto@tukki.jyu.fi (Otto J. Makela) writes:
>While leafing thru the "C Programming Language, 2nd Edition", I stumbled across
>the two functions fgetpos and fsetpos.  Now, the minimalist description given
>in the book seems to indicate the same functionality as ftell and fseek, with
>slightly different argument types.  What gives ?

The f*etpos() functions communicate file positions with cookies, while the
traditional fseek()/ftell() communicate them as long integers (byte offsets,
for binary streams).  Cookies are needed because many systems support files
containing more bytes than can be represented in the natural size for a
long integer.  You don't have to use the new f*etpos() functions unless your
program needs to randomly access huge files.  Few applications do.
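
A minimal sketch of the cookie style, for illustration only. The helper name
peek_byte is hypothetical; fgetpos(), fsetpos() and fpos_t are the standard
library names. The position is saved in an opaque fpos_t object rather than
a long, so the code never assumes the offset fits in an integer.

```c
#include <stdio.h>

/* Hypothetical helper: read one byte at the current position, then put
   the stream back where it was.  Returns the byte, or EOF on failure. */
int peek_byte(FILE *fp)
{
    fpos_t mark;    /* opaque position cookie; not necessarily an integer */
    int c;

    if (fgetpos(fp, &mark) != 0)    /* remember where we are */
        return EOF;
    c = getc(fp);
    if (fsetpos(fp, &mark) != 0)    /* go back to the saved position */
        return EOF;
    return c;
}
```

With ftell()/fseek() the same helper would store the position in a long,
which is exactly the limitation described above.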

>In general, I've come to the same conclusion as a friend: "Seems the ANSI thing
>has several changes just to make life difficult for traditional Unix C users."

Maybe you shouldn't draw conclusions based on insufficient information.
That's an utterly fantastic notion; what makes life "difficult for
traditional UNIX C programmers" is their ignorance of the differences
between UNIX and more general computing contexts.  Those differences
cannot be ignored in a non UNIX-specific programming language standard.

>Many of the "features" are truly bizarre (think about #) and there is a real
>danger of creeping Pascalism (now that was a nasty thing to say -- should I
>crosspost to alt.religion.computers ? :-)

There is nothing particularly bizarre about the # preprocessing operator.
It could have been spelled differently, and there were several alternative
proposals for obtaining the same functionality in different ways, but nobody
seriously challenged the practical need for such a "stringizing" operator,
as evidenced by a large body of existing code that exploited a misfeature
of Reiser's cpp to effect argument substitution within string literals.
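
For readers who have not met the operator, a small sketch of what it does.
The macro names STR and SHOW_INT are made up for this example; only the #
operator itself comes from the standard. The caller is assumed to supply a
buffer large enough for the formatted text.

```c
#include <stdio.h>

/* ANSI C: # turns the spelling of the macro argument into a string literal. */
#define STR(x) #x

/* Hypothetical debugging macro built on it: writes "expr = value" into buf
   and yields the number of characters written (the result of sprintf). */
#define SHOW_INT(buf, expr) sprintf((buf), "%s = %d", #expr, (expr))
```

The pre-ANSI trick relied on the preprocessor replacing x inside "x", which
a conforming ANSI preprocessor does not do.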

I don't know what "Pascalism" is supposed to consist of.  Practically all
Pascal-inspired proposals for extensions or changes to C were rejected.

>For example, I like the idea behind strftime, the date formatting routine ala
>printf.  But I dislike the interface: the more-or-less Unix tradition for such
>functions is to return a pointer to a fixed buffer.  All right, all kinds of
>things can be (and have been) said of this type of implementation, but I would
>claim that in general, it makes for simpler, no-hassles programming.

Other experienced programmers do not agree with you.  There are several
mistakes in the "UNIX tradition"; one should learn from them instead of
mimicking them.

>Now, this interface could be extended for this: define that the function will
>use an internal buffer of, er, let's say 80 characters IF the buffer pointer
>given is NULL.  Function returns zero, as originally, if the buffer length is
>exceeded.  This way, you get the best of both implementation ideas.  Nothing
>like this was done.  Why ?

We like our design better, that's why.  Why add complexity?

>Could someone point me to a good book ala K&R 2nd Ed. which would go into
>the library description in a bit more detail ?  I'd prefer not to dig into
>the actual ANSI committee stuff on it, but something a bit more digested.

"Standard C" by P. J. Plauger and Jim Brodie (1989, Microsoft Press),
ISBN 1-55615-158-6.

otto@tukki.jyu.fi (Otto J. Makela) (10/08/89)

Here I go again, braving all the flames of the whole network...

While leafing thru the "C Programming Language, 2nd Edition", I stumbled across
the two functions fgetpos and fsetpos.  Now, the minimalist description given
in the book seems to indicate the same functionality as ftell and fseek, with
slightly different argument types.  What gives ?

In general, I've come to the same conclusion as a friend: "Seems the ANSI thing
has several changes just to make life difficult for traditional Unix C users."
Many of the "features" are truly bizarre (think about #) and there is a real
danger of creeping Pascalism (now that was a nasty thing to say -- should I
crosspost to alt.religion.computers ? :-)

For example, I like the idea behind strftime, the date formatting routine ala
printf.  But I dislike the interface: the more-or-less Unix tradition for such
functions is to return a pointer to a fixed buffer.  All right, all kinds of
things can be (and have been) said of this type of implementation, but I would
claim that in general, it makes for simpler, no-hassles programming.

Now, this interface could be extended for this: define that the function will
use an internal buffer of, er, let's say 80 characters IF the buffer pointer
given is NULL.  Function returns zero, as originally, if the buffer length is
exceeded.  This way, you get the best of both implementation ideas.  Nothing
like this was done.  Why ?

Could someone point me to a good book ala K&R 2nd Ed. which would go into
the library description in a bit more detail ?  I'd prefer not to dig into
the actual ANSI committee stuff on it, but something a bit more digested.
Email please on this last.  Flames must (of course :-) be posted...
--
* * * Otto J. Makela (otto@jyu.fi, MAKELA_OTTO_@FINJYU.BITNET) * * * * * * *
* Phone: +358 41 613 847, BBS: +358 41 211 562 (CCITT, Bell 2400/1200/300) *
* Mail: Kauppakatu 1 B 18, SF-40100 Jyvaskyla, Finland, EUROPE             *
* * * freopen("/dev/null","r",stdflame); * * * * * * * * * * * * * * * * * *

amull@Morgan.COM (Andrew P. Mullhaupt) (10/09/89)

In article <OTTO.89Oct8093607@tukki.jyu.fi>, otto@tukki.jyu.fi (Otto J. Makela) writes:
> Here I go again, braving all the flames of the whole network...
> 
> In general, I've come to the same conclusion as a friend: "Seems the ANSI thing
> has several changes just to make life difficult for traditional Unix C users."
> Many of the "features" are truly bizarre (think about #) and there is a real
> danger of creeping Pascalism (now that was a nasty thing to say -- should I
> crosspost to alt.religion.computers ? :-)
> 
Well, some of us think that creeping Pascalism would improve the C language,
but raging militant jihad Pascalism is what is really called for.

Seriously, folks, one of the resident C experts (and here we have several
actual C experts) once explained to me how to structure all my include
files and always use function prototypes for sources with many files.
I was stunned to find how primitive C was compared even to the unit
facility of early Pascals and the sleek modernity of Modula-2. Why
can't I have an array of function variables? Why do I need to create
a spurious structure name on the way to a recursive typedef?

From what I can see the ANSI style of C is ten times more reasonable
than what went before. (ANSI standards are two for three with me,
since they are improving FORTRAN with 8x, but they could have had
variable length arrays (ISO level 1) in Pascal and chickened out
instead.)

Later,
Andrew Mullhaupt

P.S. Thanks to all who suggested ways to get macro expansion and
indentation into incomprehensible code. It was all for naught in
my case, since the macro expansion (even after removing trivial
and "include source" lines) resulted in an increase from 65 to
720 lines of source in one typical case. The indent and cb programs
were broken by trying this code, (indent introduced a syntax error
due to somehow substituting *= where =* was required). This could
only happen in C.

Disclaimer: These opinions are all mine, but I constantly try to
export them.

henry@utzoo.uucp (Henry Spencer) (10/09/89)

In article <OTTO.89Oct8093607@tukki.jyu.fi> otto@tukki.jyu.fi (Otto J. Makela) writes:
>While leafing thru the "C Programming Language, 2nd Edition", I stumbled across
>the two functions fgetpos and fsetpos.  Now, the minimalist description given
>in the book seems to indicate the same functionality as ftell and fseek, with
>slightly different argument types.  What gives ?

The slightly-different argument types are important, and are the whole
reason for those functions.  The crucial observation is that a 32-bit
number is no longer big enough for a file offset -- people with big
databases think nothing of files several gigabytes long.  Fixing fseek
and friends is impossible; the backward-compatibility hassles would be
beyond belief.  Fgetpos and fsetpos are versions that can use a larger
type if necessary.

>Many of the "features" are truly bizarre (think about #) ...

Some of us prefer not to think about #.  Its only saving grace is that it's
better than the awful undocumented implementation-specific kludges that it
replaces.  (You thought X3J11 made up the whole concept?  Ho ho.  Wrong.)
(Some of us think they should have buried it with a stake through its
heart, but that's another debate.)

>For example, I like the idea behind strftime, the date formatting routine ala
>printf.  But I dislike the interface: the more-or-less Unix tradition for such
>functions is to return a pointer to a fixed buffer...

Unfortunately, that traditional interface causes a *lot* of problems and
there are good reasons to try to avoid it.  Especially for things that can
generate potentially-unlimited amounts of data.

>[proposed alternative] Nothing like this was done.  Why ?

Probably because there was implementation experience with strftime as it
is (no, X3J11 didn't make it up) and there wasn't any with your alternative.
For all the nasty things that have been said about X3J11, *very* few of the
things in the final draft standard were invented from scratch; almost all
of it has been implemented, and found workable, somewhere.  The few real
inventions, in fact, are generally the most controversial parts.

>Could someone point me to a good book ala K&R 2nd Ed. which would go into
>the library description in a bit more detail ?  I'd prefer not to dig into
>the actual ANSI committee stuff on it, but something a bit more digested.

(I'm posting this rather than emailing because I think it's of general
interest.)  Believe it or not, the ANSI drafts are generally highly readable
and not at all hard to follow.  If you find K&R2 manageable, you should be
able to read an ANSI C draft without trouble.  There are a few areas of the
language itself where you have to read very carefully and legalistically,
but I don't recall anything like that in the library section.

You might try Harbison&Steele, 2nd ed, although it probably needs updating
to match the latest ANSI draft by now.
-- 
A bit of tolerance is worth a  |     Henry Spencer at U of Toronto Zoology
megabyte of flaming.           | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

chris@mimsy.UUCP (Chris Torek) (10/09/89)

In article <432@s5.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:
>... One of the resident C experts ... once explained to me how to
>structure all my include files and always use function prototypes for
>sources with many files.  I was stunned to find how primitive C was
>compared even to the unit facility of early Pascals

(What `unit' facility?  Early Pascals required that the entire source
code be a single file.  Pascal had `forward' declarations, but not
`external's.  All modern Pascals allow separate compilation, but most
of them use different, incompatible methods.)

>and the sleek modernity of Modula-2.

Other than `see Modula 3', no comment.

>Why can't I have an array of function variables?

You also cannot have an array of feather variables, for the same reason:
There is no such thing as a function variable.  You can, however, have
an array of pointers to functions.
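
A short sketch of such an array, with hypothetical names (add, sub, ops,
apply) chosen only for this illustration:

```c
/* Two ordinary functions with the same type. */
static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* An array of pointers to functions taking (int, int) and returning int. */
static int (*const ops[])(int, int) = { add, sub };

/* Hypothetical dispatcher: call the function selected by index.
   The caller is assumed to pass 0 or 1. */
int apply(int which, int a, int b)
{
    return ops[which](a, b);
}
```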

>Why do I need to create a spurious structure name on the way to a
>recursive typedef?

Because typedef name recognition must go in the lexer to build LALR
parsers for C.  Structure name parsing is context free, hence references
to them do not require that they already exist.

(Actually, the above question is also ill-phrased.  One cannot build
`recursive typedef's at all in C.  The closest thing to a recursive
type, and what you must have meant, is a self-referential structure
---a structure that contains a pointer to another instance of the same
structure, or a reference to itself via some chain of pointer references.)
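
A minimal sketch of a self-referential structure, assuming a simple linked
list; the names node, Node and list_length are hypothetical. The structure
tag is usable inside its own declaration, whereas the typedef name does not
exist until the declaration is finished, which is why the tag is needed.

```c
#include <stddef.h>

/* The tag "node" may be referred to before the struct is complete, so the
   member below can point to another struct node.  The typedef name Node
   is not available until the end of this declaration. */
typedef struct node {
    int value;
    struct node *next;
} Node;

/* Hypothetical helper: count the nodes in a list. */
size_t list_length(const Node *p)
{
    size_t n = 0;

    for (; p != NULL; p = p->next)
        n++;
    return n;
}
```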

On a different note entirely:
>... (indent introduced a syntax error due to somehow substituting *=
>where =* was required). This could only happen in C.

No, it could happen anywhere (consider what the Pascal program

	PROGRAM STUPID(INPUT, OUTPUT);
	VAR INPUTBUFFER[30];
	...

is likely to do when fed input lines longer than 30 characters), but
in this case the change from `=*' to `*=' is a `feature' of indent.
Long, long ago, in a history not far away (New Jersey, actually), the
C language had `=op' operators rather than `op=' operators.  One wrote

	char buf[] "initial value";

	main(argc, argv) char **argv; {
		int i 1;
		char *p;

		while (i < argc) {
			p = argv[i];
			if (*p == '-' && p[1] == 'c') {
				p =+ 2;
				...

Although indent does nothing about the `old fashioned initialisation's
above (`int i 1' and `char buf[] "..."'), it does do something about
the `old fashioned assignment operator': it turns it into a modern
assignment operator, by reversing the two characters `=' and `+'.
It does this for all possible old-style assignment operators, and due
to programmer error, some versions of indent also do it for `=!',
so that `a=!b' becomes `a!=b', although even old C did not parse
things this way.

The =op operators were changed to op= operators because of the frequency
of mysterious answers from

	i=-1;	/* set i to -1 */

and

	a=*b;	/* set a to *b */

which is actually remarkably similar to the complaint that started this
discussion (a/*b).  Back then, when large disks were 5 MB, it sometimes
seemed important to squeeze white space out of source code. . . .
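
For comparison, a sketch of how the first spelling behaves under the modern
rules; the function names are hypothetical. Under the old `=op' rules the
first function would have decremented i instead of assigning to it.

```c
/* Modern rules: "i=-1" is i = (-1).  Old compilers read it as i =- 1,
   which is what is now written i -= 1. */
int assign_minus_one(void)
{
    int i = 5;

    i=-1;           /* modern meaning: assign -1 */
    return i;
}

int subtract_one(void)
{
    int i = 5;

    i -= 1;         /* modern spelling of the old "i =- 1" */
    return i;
}
```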
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

diamond@csl.sony.co.jp (Norman Diamond) (10/11/89)

In article <OTTO.89Oct8093607@tukki.jyu.fi> otto@tukki.jyu.fi (Otto J. Makela) writes:

>>and there is a real danger of creeping Pascalism

In article <11244@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:

>I don't know what "Pascalism" is supposed to consist of.  Practically all
>Pascal-inspired proposals for extensions or changes to C were rejected.

Some of these ugly useless characteristics crept into C before ANSI, but
here are some abominable examples of Pascalism (:-I irony).

-  typedef

-  enum

-  union

-  separate namespace for members of each struct/union

-  function prototypes

-  strict typechecking of function pointers

-- 
Norman Diamond, Sony Corp. (diamond%ws.sony.junet@uunet.uu.net seems to work)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

gis@datlog.co.uk ( Ian Stewartson ) (10/13/89)

In article <OTTO.89Oct8093607@tukki.jyu.fi> otto@tukki.jyu.fi (Otto J. Makela) writes: (edited)

>For example, I like the idea behind strftime, the date formatting routine ala
>printf.  But I dislike the interface: the more-or-less Unix tradition for such
>functions is to return a pointer to a fixed buffer. 
>
>Now, this interface could be extended for this: define that the function will
>use an internal buffer of, er, let's say 80 characters IF the buffer pointer
>given is NULL.  Function returns zero, as originally, if the buffer length is
>exceeded.  This way, you get the best of both implementation ideas.  Nothing
>like this was done.  Why ?

One of the major problems with the world today is that we don't all speak
English as our first (and only) language and format our dates the way the
British (as opposed to any other 'english' speaking nation) do.  Indeed, some
governments and nations insist that computer software speaks their language
and date formats.  And even the British can't make up their mind.  Therefore,
when you pass a format to strftime, how does it know how big the resultant string
is going to be?  Well, it doesn't, so to put the responsibility for any
problems in the right place, you have to pass a buffer and length.  I believe
the reasons for this are very similar to the reasons for using fgets
instead of gets.
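
A minimal sketch of the buffer-and-length style, for illustration. The
wrapper name format_date is hypothetical, and a fixed numeric format is
used here only so the result is predictable; with locale-dependent
conversions such as %A or %x the output length varies, which is the case
the size argument protects against.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical wrapper: format *tm as a date into a caller-supplied buffer.
   Returns the number of characters written, or 0 if the result (including
   the terminating '\0') did not fit in size bytes. */
size_t format_date(char *buf, size_t size, const struct tm *tm)
{
    return strftime(buf, size, "%Y-%m-%d", tm);
}
```

A zero return tells the caller the buffer was too small, much as fgets
stops at the given length where gets would overrun.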

Strftime was designed to implement National Language Support which it does
quite sensibly (especially when you compare it with the original nl_ascxtime
and nl_cxtime).

Regards,

Ian Stewartson
Data Logic Ltd, Queens House, Greenhill Way, Harrow, Middlesex, HA1 1YR, UK.
(Phone) +44 1 863 0383 (Telex) 888103 (Fax) +44 1 861 2010
	+44 81 863 0383 after May 1990.
(Network) gis@datlog.co.uk or ukc!datlog!gis