[comp.sources.d] Comments on INSERT.c

webber@athos.rutgers.edu (Bob Webber) (01/09/89)

In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
> It is better to remain silent and be thought a fool than to speak and
> remove all doubt.

But fortunately you spoke.  It takes a brave man to make a fool of himself
like this just to learn a bit more about unix.  The net takes its hat off
to you Bill.  Now, let's take a look at the program:

> First, let's notice that this is essentially equivalent to the cp
> command, minus cp's frills. It does do one thing that I don't believe
> all cp's do: it verifies that the copy succeeded. But one can do the
> job with a little shell script that does a cp, some ls's, and some
> tests.  Thus this program is an example of bad analysis: it is a new
> program to do what existing tools can already do.

No, it is fundamentally different from cp.  cp truncates to 0 an
existing file (or unlinks it) before writing into it (or creating
a new one).  INSERT writes into the file and then truncates it to 
the appropriate size.  This difference is crucial if you are on a 
multiuser system that is out of file space.

> Second, it is nonportable. It is Berkeleyoid specific code.
> Ftruncate marks it as such. There are also those UNIX-specific I/O
> calls.

It is as portable as possible.  Tell me how to accomplish the same
correctly on a POSIX/v7 system and I will be happy to make the program
more portable. [By the way, a request for such an equiv has been
posted to comp.unix.wizards, so people interested in this coding
aspect should watch for conversation on it in that group (assuming
that there is anything to say other than it just can't be done except in BSD).]

> Third, it is a bad program, even if we didn't care about portability.
> 
>     1) The glorified numeric error codes. I mean really!  Each error
>        message is just "Error so-and-so : read the source..."!  Each
>        error message should have said something useful, dammit!

That has been improved in the new release (Jan688 release of INSERT in
alt.sources).  However, since perror was used to print the original
messages, they were pretty useful to begin with.  Incidentally, the new
release adopts the convention that if only one file name is given,
then standard input is read for what to insert (a sufficiently useful
enhancement to merit the new release).


>     2) The lack of variable naming consistency. If we're going to
>        have the verbosity of words-for-names and capitalization of
>        each word, names like `fdin', `fdout', and `chs' don't belong.
>        (This one is arguable.)

Consider it argued.  fdin, fdout, and ch are sufficiently standard in
unix programming that they are almost predeclared.  chs for multiple
ch's is just a bit of humor.  The rest of the names are equally
descriptive, although they are longer, e.g., ReadExit and ReadSize
(perhaps ReadTotal would be even better, but I will hold off a release
for that until something more useful can be bundled with it).

>     3) The unnecessary code sequences. The lseek's and the ftruncate
>        serve no purpose that I can see.

The lseeks position the file at the beginning.  Oddly enough, the man
page on open doesn't assert that this is default on our system. From a
diagnostic point of view, the lseeks let one know what the state of
the file is before the first read, which could be interesting.  The
ftruncate is crucial, otherwise when writing a smaller file into a
bigger one, the size doesn't change as it should.

>     4) The use of a cast to create a long constant. This indicates
>        ignorance of C, or a slavish obedience to a not-understood
>        coding standard.

Aren't you worried about portability to systems with 16 bit ints
running BSD?  The man page clearly says that arg is a long and lint
expects it.

>     5) Probably others, but I'm too blind mad to read it again.

I guess it was a rather long piece of code for you to have to read
all the way thru.

> We'll further note that writing a structured program and using a
> consistent style (except for the naming and one minor glitch) did not
> prevent him from writing a bad program.

That remains to be demonstrated.

> And this !@#$%^&* guy teaches!!!!! ?????

But some refuse to learn.

---- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

bill@ssbn.WLK.COM (Bill Kennedy) (01/09/89)

In article <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> webber@athos.rutgers.edu (Bob Webber) writes:
>In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>> It is better to remain silent and be thought a fool than to speak and
>> remove all doubt.

I felt moved to follow up when Bill posted here but I didn't because I felt
fairly sure that Bob would.

I don't frequently side with Bob or Bill, but this time I'm going to side
with Bob.  Bill brought this over from alt.sources and stuck it here.  Bob
followed up here, why not?  Seems logical to me.

I'm following up Bob's article to ask that any ensuing discussion go back
over to alt.sources, where Bob posted in the first place.  It doesn't make
a lot of sense to discuss it in a comp group where the original article
never appeared.  Bill stated his points, Bob answered them, can we keep
(any of) the rest over where it started?  Thanks,
-- 
Bill Kennedy  usenet      {killer,att,cs.utexas.edu,sun!daver}!ssbn!bill
              internet    bill@ssbn.WLK.COM

bill@twwells.uucp (T. William Wells) (01/10/89)

Well, the expected mail caused by my flame has been running about 2-1
for those more-or-less agreeing with my critique, as opposed to those
who just don't like flamers.

But I'm going to surprise you all.

You see, while I don't like incompetence, and I reserve a particularly
toasty place for those who not only are incompetent but also teach
(hence the virulence of my previous posting), Mr. Webber has done the
only thing which, under the circumstances, could obtain my approval:
he worked to improve his program.

I also owe him half an apology for missing the essential point of
his program: that it is intended to overwrite the output file
*without* freeing any blocks before the write is finished.  But only
half of one: this essential point was not documented.

As a consequence of these, I have removed him from my kill file and
shall remain civil till given cause.

So, to the meat of the matter:

In article <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> webber@athos.rutgers.edu (Bob Webber) writes:
: > Second, it is nonportable. It is Berkeleyoid specific code.
: > Ftruncate marks it as such. There are also those UNIX-specific I/O
: > calls.
:
: It is as portable as possible.  Tell me how to accomplish the same
: correctly on a POSIX/v7 system and I will be happy to make the program
: more portable.

On ftruncate, I think you are right: other versions of UNIX don't have
the capability, as far as I know. However, the UNIX I/O calls should
not have been used; the standard I/O calls should have been used
instead, with the call to ftruncate using fileno() to get the fd it wants.
Also, the use of ftruncate (and fileno, if used) needs to be
commented.  I note that you did so in your latest.

Another portability nit: the program depends overmuch on
system-specific include files. As near as I can tell, the only things
obtained from the include files are the standard I/O stuff and the
constant MAXBSIZE. The include of stdio.h is, of course, necessary,
but the rest don't appear to be. As for the constant, BUFSIZ, from
stdio.h, should be used instead.

: >     1) The glorified numeric error codes. I mean really!  Each error
: >        message is just "Error so-and-so : read the source..."!  Each
: >        error message should have said something useful, dammit!
:
: That has been improved in the new release (Jan688 release of INSERT in
: alt.sources).  However, since perror was used to print the original
: messages, they were pretty useful to begin with.

The error codes in your new version are a slight improvement; however,
they could be significantly better.

: >     3) The unnecessary code sequences. The lseek's
:
: The lseeks position the file at the beginning.  Oddly enough, the man
: page on open doesn't assert that this is default on our system. From a
: diagnostic point of view, the lseeks let one know what the state of
: the file is before the first read, which could be interesting.

The lseek's are completely unnecessary, regardless of omissions in
the man page. And of course, one should use standard I/O anyway.

: >     4) The use of a cast to create a long constant.
:
: Aren't you worried about portability to systems with 16 bit ints
: running BSD?  The man page clearly says that arg is a long and lint
: expects it.

One should write a long integer constant as the value followed by a
lower or upper case `l', as in 0L.
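For what it is worth, the two spellings produce the same thing as far as the compiler is concerned. On a compiler without prototypes, either one makes sure a long rather than an int is passed, which is what matters when int is 16 bits. A small check, with a function name invented for the purpose:

```c
/* Hypothetical check: 0L and (long)0 both have type long and value zero;
 * only the source spelling differs. Returns 1 when the two forms agree. */
int long_zero_forms_agree(void)
{
    return sizeof(0L) == sizeof(long)
        && sizeof((long)0) == sizeof(long)
        && 0L == (long)0;
}
```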

: >     5) Probably others, but I'm too blind mad to read it again.
:
: I guess it was a rather long piece of code for you to have to read
: all the way thru.

Hardly. However, I did miss these three points about the program:

    1) The name of the program follows the grand UNIX tradition: it
       has significance only to the author. How about calling it
       something like `fileover' (for `file overwrite')?

    2) `Int ReadSize;'! Try `long ReadSize;'.

    3) The program is abysmally documented, though it has been
       slightly improved in the latest version.

Here are suggestions for improving the documentation:

    1) Fix the program name.

    2) Note the nonportabilities in the header.

    3) Fix the error messages. Remove the error numbers; anyone who
       might find them useful will find the exit codes just as useful.
       Make the rest of the messages meaningful to one who does not
       have the source code.

       Example: change

      perror("Error 1 : ((fdin = open(argv[1],O_RDONLY)) == -1):");

       into (after setting up ProgramName)

      fprintf(stderr, "%s: error while opening %s for read: ",
	  ProgramName, argv[1]);
      perror("");

    4) State the purpose and functioning of the program more clearly.
       In particular, state that the program works even when there
       are other processes that could grab the file space that would
       have been freed had cp been used instead.

    5) There are two distinct code sections, one for argument parsing
       and one for copying the file. These each deserve at least a
       one line comment, to distinguish the two.

One final point: in your new-and-improved version, you essentially
duplicate the open code for the output file. Better would be to set a
variable to the argument to be used for the output file and then do
the output open in one place.
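A sketch of that last suggestion, assuming the argument order implied elsewhere in this thread (one name: read standard input and overwrite that file; two names: the first is the input, the second the output). The struct and function names are invented for illustration and do not come from INSERT.

```c
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical sketch: decide the file names first, then open the
 * output in exactly one place. */
struct insert_args {
    const char *in_name;    /* NULL means read standard input */
    const char *out_name;   /* file to be overwritten */
};

/* Returns 0 on success, -1 on a usage error. */
int parse_args(int argc, char **argv, struct insert_args *a)
{
    if (argc == 2) {
        a->in_name = NULL;
        a->out_name = argv[1];
    } else if (argc == 3) {
        a->in_name = argv[1];
        a->out_name = argv[2];
    } else {
        return -1;
    }
    return 0;
}

/* The single open site for the output file; returns a descriptor or -1. */
int open_output(const struct insert_args *a)
{
    return open(a->out_name, O_WRONLY);
}
```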

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

peter@ficc.uu.net (Peter da Silva) (01/12/89)

In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>       perror("Error 1 : ((fdin = open(argv[1],O_RDONLY)) == -1):");

>        into (after setting up ProgramName)

>       fprintf(stderr, "%s: error while opening %s for read: ",
> 	  ProgramName, argv[1]);
>       perror("");

On many, if not all, systems this will eventually produce the error message:

	insert: error while opening foobar.c for read: : Not a typewriter

Or:

	insert: error ...: : Inappropriate ioctl for device.

Because that fprintf() will stomp on errno. Please, people, if you're
going to do stuff before you call perror save and restore errno, or do
this:

	{ char buffer[BUFSIZ];
	  sprintf(buffer, "%s: error while opening %s for read",
		ProgramName, argv[1]);
	  perror(buffer);
	}
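The other option mentioned, saving and restoring errno, might look like the sketch below. The helper name report_open_error is invented here. Note that ISO C says perror("") prints no prefix and no colon, while the sample output above suggests some older libraries print ": " anyway, so the exact punctuation can vary.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: print
 *   "<prog>: error while opening <name> for read: <reason>"
 * using the errno value that was current on entry. errno is restored
 * before perror() runs and again before returning. */
void report_open_error(const char *prog, const char *name)
{
    int saved = errno;          /* capture before stdio can disturb it */

    fprintf(stderr, "%s: error while opening %s for read: ", prog, name);
    errno = saved;              /* undo anything fprintf() did */
    perror("");
    errno = saved;              /* leave the caller's errno as it was */
}
```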

Don't you just love getting bounced mail with the error message indicating
that "Fred Bloggs" is not a typewriter, or even worse that someone attempted
to perform an inappropriate ioctl on him? I'm glad I'm not Fred.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Work: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.   `-_-'
Home: bigtex!texbell!sugar!peter, peter@sugar.uu.net.                 'U`
Opinions may not represent the policies of FICC or the Xenix Support group.

guy@auspex.UUCP (Guy Harris) (01/12/89)

>Because that fprintf() will stomp on errno.

...

>Don't you just love getting bounced mail with the error message indicating
>that "Fred Bloggs" is not a typewriter, or even worse that someone attempted
>to perform an inappropriate ioctl on him? I'm glad I'm not Fred.

That's an unrelated problem.  "sendmail" has the habit of including the
error message for "errno" in its messages, *even if the failure that
provoked the message wasn't a system call failure and "errno" is just
left over from a system call N operations ago.*  Needless to say, it
really shouldn't be doing that. 

bill@twwells.uucp (T. William Wells) (01/12/89)

In article <2692@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
: In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
: >       perror("Error 1 : ((fdin = open(argv[1],O_RDONLY)) == -1):");
:
: >        into (after setting up ProgramName)
:
: >       fprintf(stderr, "%s: error while opening %s for read: ",
: >       ProgramName, argv[1]);
: >       perror("");
:
: On many, if not all, systems this will eventually produce the error message:
:
:       insert: error while opening foobar.c for read: : Not a typewriter
:
: Or:
:
:       insert: error ...: : Inappropriate ioctl for device.

Damn! You are right. Me, I normally grab the error message from the
tables and format it into my own messages, so, when offering this
suggestion, it slipped by me that fprintf may futz with errno.

Sorry.

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

webber@porthos.rutgers.edu (Bob Webber) (01/12/89)

It turns out that the man page on open does claim to start the file
pointer at the beginning.  As far as I am concerned, the lseek is the
equiv of initializing in the program text something that is already
implicitly properly initialized.

I see no reason to prefer the fopen(3) based routines over the open(2)
based routines.  Certainly the open(2) routines were quite adequate to
the task at hand.

Although a few people have mentioned to me that I could have written
0L instead of (long) 0; no one has been able to show me why I should
prefer to.  This seems purely a matter of personal style (and naturally
I prefer my own way of doing things which is why I do them that way).

At the time I wrote the program, I only accessed BSD man pages and so
was not aware that non-BSD systems lacked ftruncate (a function which
really should be added to any system that has to contend with tight
disk space).  Besides mentioning the use of ftruncate, the 2nd version
did mention that this was BSD dependent.  People who want to run INSERT
are strongly urged to get a BSD-based UNIX system.

Since ftruncate expects an unsigned long parameter, READSIZE has been
updated to be an unsigned long (hence, believe it or not, the 12Jan89
revision just posted for those people who want to stay on the cutting
edge of INSERT development).

The main motivation for the coding style was to make sure all errors
were covered, so that the program wouldn't fail without the user knowing
that it had.  Lord knows that there are plenty of programs out there
with pretty error messages that ignore many system error returns.
The fact that the program cannot be conveniently used without easy
access to its source seems to me a benefit rather than a drawback.
Fortunately we live in a country where people are permitted to
write code according to their own lights.  

If someone else wants to write a version by their own lights, I have
but one word of advice: read carefully the man pages of all system
routines used and test the program out on a file system that is 105%
full -- pretty user interfaces on broken programs are of no use to
anyone.

--- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

p.s., all of the versions posted work fine on int==long BSD systems;
only the 12Jan89 version is likely to work on int==short!=long BSD
systems (i.e., PDP-11s and perhaps some microcomputers).

john@frog.UUCP (John Woods) (01/14/89)

In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
> : >     3) The unnecessary code sequences. The lseek's
> The lseek's are completely unnecessary, regardless of omissions in
> the man page. And of course, one should use standard I/O anyway.
> 
I just checked the 3 Nov 87 draft of the ANSI C spec.  I was astonished to
find that they do not say what the value of the file pointer is after opening
a pathname.  Could someone with a recent copy check on this?  The draft POSIX
standard that I checked (12.3, the version of which was just short of being
approved) DOES explicitly say that open() sets the file position to the
beginning of the file.
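Anyone who wants to check their own system rather than its manual page could use something like the following, which asks lseek() for the current offset right after open() without moving it. The function name is invented; on a system that follows the POSIX wording quoted above the result should be 0.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: report the file offset immediately after open().
 * Returns the offset, or -1 if the file cannot be opened or queried. */
long offset_after_open(const char *path)
{
    off_t pos;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1L;
    pos = lseek(fd, (off_t)0, SEEK_CUR);    /* query only, no movement */
    close(fd);
    return (long)pos;
}
```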

Frankly, I find the statement "..., regardless of omissions in
the man page" offensive from someone complaining about coding style, as it
smacks much too strongly of the "I read the source, I know what it REALLY
does, and all the world's a VAX anyway" mentality.  Probably that's not
really what you feel, but that's how it reads.
-- 
John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101
...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu

Go be a `traves wasswort.		- Doug Gwyn

allbery@ncoast.UUCP (Brandon S. Allbery) (01/15/89)

Bill's not going to be too happy with me:  I side with Bob Webber on this.
[Heresy!  ;-) ]

As quoted from <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> by webber@athos.rutgers.edu (Bob Webber):
+---------------
| In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
| > First, let's notice that this is essentially equivalent to the cp
| > command, minus cp's frills. It does do one thing that I don't believe
| > all cp's do: it verifies that the copy succeeded. But one can do the
| > job with a little shell script that does a cp, some ls's, and some
| > tests.  Thus this program is an example of bad analysis: it is a new
| > program to do what existing tools can already do.
| 
| No, it is fundamentally different from cp.  cp truncates to 0 an
| existing file (or unlinks it) before writing into it (or creating
| a new one).  INSERT writes into the file and then truncates it to 
| the appropriate size.  This difference is crucial if you are on a 
| multiuser system that is out of file space.
+---------------

One could argue that the difference should not be necessary.  But I've seen
cases where it would have been quite nice.

+---------------
| > Second, it is nonportable. It is Berkeleyoid specific code.
| > Ftruncate marks it as such. There are also those UNIX-specific I/O
| > calls.
| 
| It is as portable as possible.  Tell me how to accomplish the same
| correctly on a POSIX/v7 system and I will be happy to make the program
| more portable. [By the way, a request for such an equiv has been
| posted to comp.unix.wizards, so people interested in this coding
| aspect should watch for conversation on it in that group (assuming
| that there is anything to say other than it just can't be done except in BSD).]
+---------------

Xenix 5.3.1, at least, has chsize(); this amounts to the same thing.  V7
and System V.2 have no such functionality.  I don't know about Posix.

Bill:  ftruncate() being the whole point of the program, referring to it as
a non-portable cp sounds like a missed target to me.

+---------------
| > Third, it is a bad program, even if we didn't care about portability.
| > 
| >     2) The lack of variable naming consistency. If we're going to
| >        have the verbosity of words-for-names and capitalization of
| >        each word, names like `fdin', `fdout', and `chs' don't belong.
| >        (This one is arguable.)
| 
| Consider it argued.  fdin, fdout, and ch are sufficiently standard in
| unix programming that they are almost predeclared.  chs for multiple
| ch's is just a bit of humor.  The rest of the names are equally
| descriptive, although they are longer, e.g., ReadExit and ReadSize
| (perhaps ReadTotal would be even better, but I will hold off a release
| for that until something more useful can be bundled with it).
+---------------

I have yet to be flamed for calling a random counter variable "cnt", a
random stat buffer "sb", a random file descriptor or file pointer "fd" or
"fp" (respectively), a random char pointer "cp", etc.  This is a foolish
attack, given that you haven't flamed anyone *else* who uses the convention.

+---------------
| >     3) The unnecessary code sequences. The lseek's and the ftruncate
| >        serve no purpose that I can see.
| 
| The lseeks position the file at the beginning.  Oddly enough, the man
| page on open doesn't assert that this is default on our system. From a
+---------------

UNIX seems to assure it; but for portability, you hit the mark.  I've *used*
OSes where the file pointer was undefined after an open().  [I prefer not to
think about them...!]  Unfortunately, the use of the low-level calls --
including ftruncate() -- pretty much invalidates the portability.  Nice try,
though.  [Actually, enough C libraries implement open(), read(), write(),
etc. on non-UNIX systems that they can be considered at least somewhat
portable.]

+---------------
| >     4) The use of a cast to create a long constant. This indicates
| >        ignorance of C, or a slavish obedience to a not-understood
| >        coding standard.
| 
| Aren't you worried about portability to systems with 16 bit ints
| running BSD?  The man page clearly says that arg is a long and lint
| expects it.
+---------------

Complaining about the use of "(long) 0" when "0L" will do is along the lines
of complaining about the use of "ls ." when "echo *" will do....

+---------------
| > We'll further note that writing a structured program and using a
| > consistent style (except for the naming and one minor glitch) did not
| > prevent him from writing a bad program.
| 
| That remains to be demonstrated.
+---------------

It also raises a *true* point:  the use of structure and consistent style
didn't make the program's *purpose* clear to at least one net.reader.  Which
is why "self-documenting" languages and programming styles have always
failed.

++Brandon
(P.S.  Bill:  Take a Valium and try again when you're calmer.)
-- 
Brandon S. Allbery, moderator of comp.sources.misc    allbery@ncoast.org (soon)
uunet!hal.cwru.edu!ncoast!allbery		    ncoast!allbery@hal.cwru.edu
      Send comp.sources.misc submissions to comp-sources-misc@<backbone>
NCoast Public Access UN*X - (216) 781-6201, 300/1200/2400 baud, login: makeuser

bill@twwells.uucp (T. William Wells) (01/16/89)

In article <Jan.12.05.48.19.1989.9091@porthos.rutgers.edu> webber@porthos.rutgers.edu (Bob Webber) writes:
: It turns out that the man page on open does claim to start the file
: pointer at the beginning.  As far as I am concerned, the lseek is the
: equiv of initializing in the program text something that is already
: implicitly properly initialized.

With one difference: the unnecessary initialization of data (usually)
does not have a cost.  The lseek's do.

: I see no reason to prefer the fopen(3) based routines over the open(2)
: based routines.  Certainly the open(2) routines were quite adequate to
: the task at hand.

Portability. They are specific to Unix.

: Although a few people have mentioned to me that I could have written
: 0L instead of (long) 0; no one has been able to show me why I should
: prefer to.  This seems purely a matter of personal style (and naturally
: I prefer my own way of doing things which is why I do them that way).

Consider: a complex construct is more difficult to understand than a
simple one. 0L is a single token, and is understood to be one.
(long)0 is an expression, which requires some level of interpretation.
Even though this is a trivial case, consistent application of the
implied principle creates more understandable code.

: At the time I wrote the program, I only accessed BSD man pages and so
: was not aware that non-BSD systems lacked ftruncate (a function which
: really should be added to any system that has to contend with tight
: disk space).  Besides mentioning the use of ftruncate, the 2nd version
: did mention that this was BSD dependent.  People who want to run INSERT
: are strongly urged to get a BSD-based UNIX system.

:-)

But seriously, I do wish BSD was available on a machine I can afford.

: The main motivation for the coding style was to make sure all errors
: were covered, so that the program wouldn't fail without the user knowing
: that it had.  Lord knows that there are plenty of programs out there
: with pretty error messages that ignore many system error returns.
: The fact that the program cannot be conveniently used without easy
: access to its source seems to me a benefit rather than a drawback.

I'm not going to argue that point, not here; but let me suggest that
you are in a distinct minority. Most people consider having to go to
the source to understand a program to be either a failure to generate
good error messages or bad documentation, or both.

: If someone else wants to write a version by their own lights, I have
: but one word of advice: read carefully the man pages of all system
: routines used and test the program out on a file system that is 105%
: full -- pretty user interfaces on broken programs are of no use to
: anyone.

Amen.

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

bill@twwells.uucp (T. William Wells) (01/16/89)

In article <1354@X.UUCP> john@frog.UUCP (John Woods) writes:
: In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
: > : >     3) The unnecessary code sequences. The lseek's
: > The lseek's are completely unnecessary, regardless of omissions in
: > the man page. And of course, one should use standard I/O anyway.
: >
: I just checked the 3 Nov 87 draft of the ANSI C spec.  I was astonished to
: find that they do not say what the value of the file pointer is after opening
: a pathname.  Could someone with a recent copy check on this?

It's in 4.9.3 of the May 88 draft.

: Frankly, I find the statement "..., regardless of omissions in
: the man page" offensive from someone complaining about coding style, as it
: smacks much too strongly of the "I read the source, I know what it REALLY
: does, and all the world's a VAX anyway" mentality.  Probably that's not
: really what you feel, but that's how it reads.

I couldn't. I don't have sources. (Whine :-) What I do know is that
it is said in my manual pages, and that all Unixes that I know of do
properly open the file at its beginning. Therefore, he either misread
the manual page (which is what actually happened) or the manual page
has an omission.

Consider the number of programs that would break if this weren't so
and you'll see why I considered any possible screwup in the manual
page to be irrelevant.

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

root@chessene.UUCP (This System) (01/17/89)

In article <Jan.12.05.48.19.1989.9091@porthos.rutgers.edu> webber@porthos.rutgers.edu (Bob Webber) writes:
>did mention that this was BSD dependent.  People who want to run INSERT
>are strongly urged to get a BSD-based UNIX system.

Oh good grief.

That's the silliest statement I've ever seen on Usenet.
--
Mark Buda 				Domain: hermit@chessene.uucp
Dumb: ...rutgers!bpa!vu-vlsi!devon!chessene!hermit
"Here, with a compressed air drill, parsnips are harvested." - an old newsreel

bobmon@iuvax.cs.indiana.edu (RAMontante) (01/19/89)

->did mention that this was BSD dependent.  People who want to run INSERT
->are strongly urged to get a BSD-based UNIX system.
-
-Oh good grief.
-
-That's the silliest statement I've ever seen on Usenet.


Sometimes you don't really want a smiley.  You want a tongue-in-cheeky.

Anyone who wants to run my RPN calculator program is urged to get an
MSDOS-based graphic system.  Or a real HP pocket calculator.

guy@auspex.UUCP (Guy Harris) (01/19/89)

 >>did mention that this was BSD dependent.  People who want to run INSERT
 >>are strongly urged to get a BSD-based UNIX system.
 >
 >Oh good grief.
 >
 >That's the silliest statement I've ever seen on Usenet.

OK, a non-silly statement of the same basic warning would be:

	People who want to run INSERT should note that, due to the way
	it is obliged to work in order to achieve the goal for which it
	was designed, it must be able to call a procedure that will,
	given a file descriptor or a path name, truncate the file
	referred to by that file descriptor or path name to a specified,
	and possibly non-zero, length.

	BSD-based UNIX systems (which include systems not explicitly
	advertised as such) have such a call, named "ftruncate", and
	other UNIX systems may also have such a call, whether named
	"ftruncate" or not.  INSERT cannot be run on systems lacking
	such a call.

(As for it being "the silliest statement (you've) seen on Usenet", I, at
least, have seen far "sillier" statements on USENET; one common category
is the bold claim, presented as fact, that can be shown to be completely
false by trivially looking something up.)

bill@twwells.uucp (T. William Wells) (01/19/89)

In article <13336@ncoast.UUCP> allbery@ncoast.UUCP (Brandon S. Allbery) writes:
: As quoted from <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> by webber@athos.rutgers.edu (Bob Webber):
: +---------------
: | In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
: | > Second, it is nonportable. It is Berkeleyoid specific code.
: | > Ftruncate marks it as such. There are also those UNIX-specific I/O
: | > calls.
: |
: | It is as portable as possible.  Tell me how to accomplish the same
: | correctly on a POSIX/v7 system and I will be happy to make the program
: | more portable. [By the way, a request for such an equiv has been
: | posted to comp.unix.wizards, so people interested in this coding
: | aspect should watch for conversation on it in that group (assuming
: | that there is anything to say other than it just can't be done except in BSD).]
: +---------------
:
: Xenix 5.3.1, at least, has chsize(); this amounts to the same thing.  V7
: and System V.2 have no such functionality.  I don't know about Posix.
:
: Bill:  ftruncate() being the whole point of the program, referring to it as
: a non-portable cp sounds like a missed target to me.

Well, I did miss the point of the program. And, as has already been
hashed out, using ftruncate or some other nonportability somewhere in
the program is pretty much forced, since truncating a file is
nonportable.

However, my portability complaint has more to it than that.  See
below.

: +---------------
: | > Third, it is a bad program, even if we didn't care about portability.
: | >
: | >     2) The lack of variable naming consistency. If we're going to
: | >        have the verbosity of words-for-names and capitalization of
: | >        each word, names like `fdin', `fdout', and `chs' don't belong.
: | >        (This one is arguable.)
: |
: | Consider it argued.  fdin, fdout, and ch are sufficiently standard in
: | unix programming that they are almost predeclared.  chs for multiple
: | ch's is just a bit of humor.  The rest of the names are equally
: | descriptive, although they are longer, e.g., ReadExit and ReadSize
: | (perhaps ReadTotal would be even better, but I will hold off a release
: | for that until something more useful can be bundled with it).
: +---------------
:
: I have yet to be flamed for calling a random counter variable "cnt", a
: random stat buffer "sb", a random file descriptor or file pointer "fd" or
: "fp" (respectively), a random char pointer "cp", etc.  This is a foolish
: attack, given that you haven't flamed anyone *else* who uses the convention.

The point of the complaint was not the names. Me, I'd use simple
names and be done with it. Consistency is one of the highest virtues
when it comes to style; the complaint was that this code used two
different styles for naming.

: +---------------
: | >     3) The unnecessary code sequences. The lseek's and the ftruncate
: | >        serve no purpose that I can see.
: |
: | The lseeks position the file at the beginning.  Oddly enough, the man
: | page on open doesn't assert that this is default on our system. From a
: +---------------
:
: UNIX seems to assure it; but for portability, you hit the mark.  I've *used*
: OSes where the file pointer was undefined after an open().  [I prefer not to
: think about them...!]  Unfortunately, the use of the low-level calls --
: including ftruncate() -- pretty much invalidates the portability.  Nice try,
: though.  [Actually, enough C libraries implement open(), read(), write(),
: etc. on non-UNIX systems that they can be considered at least somewhat
: portable.]

Y'all have missed the point about portability. There are four
nonportable things in the program:

    1) The include files
    2) Ftruncate
    3) The Unix-style I/O calls
    4) The int/long identity assumption

We can look at three levels of portability:

    1) Portability within Berkeley-like systems.

       The code is Berkeley specific. It had all four of the above
       sins, only the last of which has been fixed; thus the code
       can't be expected to port cleanly to non-Berkeley systems.

    2) Portability within Unix-like systems.

       Two things would make it better for porting to non-Berkeley
       Unix-like systems:

       Put the ftruncate call in a macro, and put the macro at the
       top of the program, so that people who are porting the program
       can see what is going on. For systems which do have some way
       of doing a file truncate, the change is then simple and
       obvious.

       Remove all the include files except stdio.h. They are there to
       get ONE constant that would be better obtained from stdio.h
       (BUFSIZ is the replacement I have in mind).

       At this or the previous level of portability the lseek's are
       not required.

    3) Portability in general.

       Presuming that one has fixed the above problems, there still
       remains the Unix-style file I/O. While it is true that many
       systems provide a similar interface, not all do. Thus, for
       this level of portability, one must replace those calls with
       standard I/O calls.

       For example, Microsoft C for MS-DOS has all the Unix-style I/O
       calls but the second parameter to open() is different!  MS-DOS
       does have a way to truncate files, so this is not moot. (For
       those who are curious, one truncates a file by seeking to the
       desired truncation point and writing zero bytes. One
       probably has to do the latter as an explicit DOS call, since,
       presuming that the write() call isn't brain damaged, it
       filters out zero byte writes.)

       Note also that, if the code were written for this level of
       portability, the lseek's would still be unnecessary.
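
As a rough illustration of level 2, here is a minimal sketch (not the
original INSERT.c) with the truncate call behind a macro at the top and
BUFSIZ from stdio.h as the buffer size. The names TRUNCATE_FILE,
HAVE_CHSIZE, and copy_in_place are hypothetical, invented for this
example. The chsize() branch assumes the Xenix call mentioned earlier
and is untested. The extra headers are kept only because current
compilers want prototypes for the Unix-style calls.

```c
#include <fcntl.h>      /* open, O_* flags */
#include <stdio.h>      /* BUFSIZ */
#include <sys/types.h>  /* off_t, ssize_t */
#include <unistd.h>     /* read, write, close, ftruncate */

/* Porting note: truncation is the one system-specific operation here.
 * HAVE_CHSIZE is a hypothetical build flag for systems (Xenix, say)
 * that offer chsize() in place of ftruncate(). */
#ifdef HAVE_CHSIZE
#define TRUNCATE_FILE(fd, len) chsize((fd), (long)(len))
#else
#define TRUNCATE_FILE(fd, len) ftruncate((fd), (off_t)(len))
#endif

/* Overwrite dst with the contents of src without truncating dst first,
 * then cut dst down to the number of bytes written.
 * Returns 0 on success, -1 on any error. */
int copy_in_place(const char *src, const char *dst)
{
    char buf[BUFSIZ];
    off_t total = 0;
    ssize_t n = 0;
    int status = -1;
    int fdin, fdout;

    fdin = open(src, O_RDONLY);
    if (fdin < 0)
        return -1;
    fdout = open(dst, O_WRONLY | O_CREAT, 0666);    /* note: no O_TRUNC */
    if (fdout < 0) {
        close(fdin);
        return -1;
    }

    while ((n = read(fdin, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                /* cope with short writes */
            ssize_t w = write(fdout, buf + off, (size_t)(n - off));
            if (w <= 0)
                goto done;               /* write error: status stays -1 */
            off += w;
        }
        total += n;
    }
    /* Truncate only after every byte has been written successfully. */
    if (n == 0 && TRUNCATE_FILE(fdout, total) == 0)
        status = 0;
done:
    close(fdin);
    if (close(fdout) != 0)
        status = -1;
    return status;
}
```

Someone porting this only has to look at the macro. No lseek is needed,
since both descriptors start at offset zero after open().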

: +---------------
: | > We'll further note that writing a structured program and using a
: | > consistent style (except for the naming and one minor glitch) did not
: | > prevent him from writing a bad program.
: |
: | That remains to be demonstrated.
: +---------------
:
: It also raises a *true* point:  the use of structure and consistent style
: didn't make the program's *purpose* clear to at least one net.reader.  Which
: is why "self-documenting" languages and programming styles have always
: failed.

Damn right. And they always will. "How" is not "why".

Oh yeah, I've found another use for the program. Suppose you have a
file specially laid out on your file system and you want to update
the file. If you just do a cp, the file gets truncated and then new
blocks are allocated, probably at random. Using this program would
eliminate that by directly re-using the blocks.  This is useful, even
on a single-thread system. Like MS-DOS. (Yuck!)

"How" is not "why"!

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

john@frog.UUCP (John Woods) (01/19/89)

In article <333@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>In article <1354@X.UUCP> john@frog.UUCP (John Woods) writes:
>:In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>:> : >     3) The unnecessary code sequences. The lseek's
>:> The lseek's are completely unnecessary, regardless of omissions in
>:> the man page. And of course, one should use standard I/O anyway.
>:>
>:I just checked the 3 Nov 87 draft of the ANSI C spec.  I was astonished to
>:find that they do not say what the value of the file pointer is after
>:opening a pathname.  Could someone with a recent copy check on this?
> 
> It's in 4.9.3 of the May 88 draft.

Which, I find, is where it is in the old draft (and not under the open()
section).  Grr.  
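
For the standard I/O side, a small sketch of what that section promises,
as I read it: a stream opened in a non-append mode on a file that
supports positioning starts at character number zero. The helper name
initial_position is made up for this example.

```c
#include <stdio.h>

/* Report the file position seen immediately after opening path for
 * reading.  Returns -1L if the file cannot be opened. */
long initial_position(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long pos;

    if (fp == NULL)
        return -1L;
    pos = ftell(fp);    /* no fseek() or rewind() has been done */
    fclose(fp);
    return pos;
}
```

Append mode is the exception: there the draft leaves the initial
position up to the implementation.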
-- 
John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101
...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu

Presumably this means that it is vital to get the wrong answers quickly.
		Kernighan and Plauger, The Elements of Programming Style