webber@athos.rutgers.edu (Bob Webber) (01/09/89)
In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
> It is better to remain silent and be thought a fool than to speak and
> remove all doubt.

But fortunately you spoke. It takes a brave man to make a fool of
himself like this just to learn a bit more about unix. The net takes
its hat off to you, Bill. Now, let's take a look at the program:

> First, let's notice that this is essentially equivalent the cp
> command, minus cp's frills. It does do one thing that I don't believe
> all cp's do: it verifies that the copy succeeded. But one can do the
> job with a little shell script that does a cp, some ls's, and some
> tests. Thus this program is an example of bad analysis: it is a new
> program to do what existing tools can already do.

No, it is fundamentally different from cp. cp truncates to 0 an
existing file (or unlinks it) before writing into it (or creating
a new one). INSERT writes into the file and then truncates it to
the appropriate size. This difference is crucial if you are on a
multiuser system that is out of file space.

> Second, it is nonportable. It is Berkeleyoid specific code.
> Ftruncate marks it as such. There are also those UNIX-specific I/O
> calls.

It is as portable as possible. Tell me how to accomplish the same
correctly on a POSIX/v7 system and I will be happy to make the program
more portable. [By the way, a request for such an equiv has been
posted to comp.unix.wizards, so people interested in this coding
aspect should watch for conversation on it in that group (assuming
that there is anything to say other than it just can't be done except in BSD).]

> Third, it is a bad program, even if we didn't care about portability.
>
> 1) The glorified numeric error codes. I mean really! Each error
> message is just "Error so-and-so : read the source..."! Each
> error message should have said something useful, dammit!

That has been improved in the new release (Jan688 release of INSERT in
alt.sources). However, since perror was used to print the original
messages, they were pretty useful to begin with. Incidentally, the new
release adopts the convention that if only one file name is given,
then standard input is read for what to insert (a sufficiently useful
enhancement to merit the new release).

> 2) The lack of variable naming consistency. If we're going to
> have the verbosity of words-for-names and capitalization of
> each word, names like `fdin', `fdout', and `chs' don't belong.
> (This one is arguable.)

Consider it argued. fdin, fdout, and ch are sufficiently standard in
unix programming that they are almost predeclared. chs for multiple
ch's is just a bit of humor. The rest of the names are equally
descriptive, although they are longer, e.g., ReadExit and ReadSize
(perhaps ReadTotal would be even better, but I will hold off a release
for that until some more useful can be bundled with it).

> 3) The unnecessary code sequences. The lseek's and the ftruncate
> serve no purpose that I can see.

The lseeks position the file at the beginning. Oddly enough, the man
page on open doesn't assert that this is default on our system. From a
diagnostic point of view, the lseeks let one know what the state of
the file is before the first read, which could be interesting. The
ftruncate is crucial, otherwise when writing a smaller file into a
bigger one, the size doesn't change as it should.

> 4) The use of a cast to create a long constant. This indicates
> ignorance of C, or a slavish obedience to a not-understood
> coding standard.

Aren't you worried about portability to systems with 16 bit ints
running BSD? The man page clearly says that arg is a long and lint
expects it.

> 5) Probably others, but I'm too blind mad to read it again.

I guess it was a rather long piece of code for you to have to read
all the way thru.

> We'll further note that writing a structured program and using a
> consistent style (except for the naming and one minor glitch) did not
> prevent him from writing a bad program.

That remains to be demonstrated.

> And this !@#$%^&* guy teaches!!!!!

????? But some refuse to learn.

---- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)
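The cp-versus-INSERT difference argued in the post above can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the INSERT source: the function name `overwrite_in_place`, the 8192-byte buffer, the 0644 creation mode, and the choice to create a missing destination are all hypothetical. The only part taken from the thread is the ordering: write first, call ftruncate last.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for ftruncate() on strict compilers */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite dst with the contents of src without
 * releasing any of dst's blocks before the new data is written.
 * Returns 0 on success, -1 on failure. */
int overwrite_in_place(const char *src, const char *dst)
{
    char buf[8192];              /* buffer size is an arbitrary choice */
    ssize_t n;
    off_t total = 0;
    int in, out;

    assert(src != NULL && dst != NULL);

    in = open(src, O_RDONLY);
    if (in == -1)
        return -1;

    /* Deliberately no O_TRUNC: the old blocks stay allocated while we write. */
    out = open(dst, O_WRONLY | O_CREAT, 0644);
    if (out == -1) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof buf)) > 0) {
        char *p = buf;
        while (n > 0) {          /* cope with short writes */
            ssize_t w = write(out, p, (size_t)n);
            if (w <= 0) {
                close(in);
                close(out);
                return -1;
            }
            p += w;
            n -= w;
            total += w;
        }
    }
    if (n == -1) {               /* read error */
        close(in);
        close(out);
        return -1;
    }

    /* Only now cut off whatever remains of a longer old file. */
    if (ftruncate(out, total) == -1) {
        close(in);
        close(out);
        return -1;
    }

    close(in);
    return close(out) == -1 ? -1 : 0;
}
```

Calling `overwrite_in_place("new.txt", "old.txt")` leaves old.txt holding exactly the bytes of new.txt. If the old file was longer, its tail is released only after the new data is in place, which is the property that matters on a full file system.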
bill@ssbn.WLK.COM (Bill Kennedy) (01/09/89)
In article <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> webber@athos.rutgers.edu (Bob Webber) writes:
>In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>> It is better to remain silent and be thought a fool than to speak and
>> remove all doubt.

I felt moved to follow up when Bill posted here but I didn't because I
felt fairly sure that Bob would. I don't frequently side with Bob or
Bill, but this time I'm going to side with Bob. Bill brought this over
from alt.sources and stuck it here. Bob followed up here, why not?
Seems logical to me.

I'm following up Bob's article to ask that any ensuing discussion go
back over to alt.sources, where Bob posted in the first place. It
doesn't make a lot of sense to discuss it in a comp group where the
original article never appeared. Bill stated his points, Bob answered
them, can we keep (any of) the rest over where it started? Thanks,
--
Bill Kennedy  usenet    {killer,att,cs.utexas.edu,sun!daver}!ssbn!bill
              internet  bill@ssbn.WLK.COM
bill@twwells.uucp (T. William Wells) (01/10/89)
Well, the expected mail caused by my flame has been running about 2-1
for those more-or-less agreeing with my critique, as opposed to those
who just don't like flamers.
But I'm going to surprise you all.
You see, while I don't like incompetence, and I reserve a particularly
toasty place for those who not only are incompetent but also teach
(hence the virulence of my previous posting), Mr. Webber has done the
only thing which, under the circumstances, could obtain my approval:
he worked to improve his program.
I also owe him a half an apology for missing the essential point of
his program: that it is intended to overwrite the output file
*without* freeing any blocks before the write is finished. But only
half of one: this essential point was not documented.
As a consequence of these, I have removed him from my kill file and
shall remain civil till given cause.
So, to the meat of the matter:
In article <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> webber@athos.rutgers.edu (Bob Webber) writes:
: > Second, it is nonportable. It is Berkeleyoid specific code.
: > Ftruncate marks it as such. There are also those UNIX-specific I/O
: > calls.
:
: It is as portable as possible. Tell me how to accomplish the same
: correctly on a POSIX/v7 system and I will be happy to make the program
: more portable.
On ftruncate, I think you are right: other versions of UNIX don't have
the capability, as far as I know. However, the UNIX I/O calls should
not have been used, the standard I/O calls should have instead; the
call to ftruncate should have used fileno() to get the fd it wants.
Also, the use of ftruncate (and fileno, if used) needs to be
commented. I note that you did so in your latest.
Another portability nit: the program depends overmuch on system-
specific include files. As near as I can tell, the only thing
obtained from the include files are the standard I/O stuff and the
constant MAXBSIZE. The include of stdio.h is, of course, necessary,
but the rest don't appear to be. As for the constant, BUFSIZ, from
stdio.h, should be used instead.
: > 1) The glorified numeric error codes. I mean really! Each error
: > message is just "Error so-and-so : read the source..."! Each
: > error message should have said something useful, dammit!
:
: That has been improved in the new release (Jan688 release of INSERT in
: alt.sources). However, since perror was used to print the original
: messages, they were pretty useful to begin with.
The error codes in your new version are a slight improvement; however,
they could be significantly better.
: > 3) The unnecessary code sequences. The lseek's
:
: The lseeks position the file at the beginning. Oddly enough, the man
: page on open doesn't assert that this is default on our system. From a
: diagnostic point of view, the lseeks let one know what the state of
: the file is before the first read, which could be interesting.
The lseek's are completely unnecessary, regardless of omissions in
the man page. And of course, one should use standard I/O anyway.
: > 4) The use of a cast to create a long constant.
:
: Aren't you worried about portability to systems with 16 bit ints
: running BSD? The man page clearly says that arg is a long and lint
: expects it.
One should write a long integer constant as the value followed by a
lower or upper case `l', as in 0L.
: > 5) Probably others, but I'm too blind mad to read it again.
:
: I guess it was a rather long piece of code for you to have to read
: all the way thru.
Hardly. However, I did miss these three points about the program:
1) The name of the program follows the grand UNIX tradition: it
has significance only to the author. How about calling it
something like `fileover' (for `file overwrite')?
2) `Int ReadSize;'! Try `long ReadSize;'.
3) The program is abysmally documented, though it has been
slightly improved in the latest version.
Here are suggestions for improving the documentation:
1) Fix the program name.
2) Note the nonportabilities in the header.
3) Fix the error messages. Remove the error numbers; to whomever
they might be useful to, the exit codes will be as useful.
Make the rest of the messages meaningful to one who does not
have the source code.
Example: change
    perror("Error 1 : ((fdin = open(argv[1],O_RDONLY)) == -1):");
into (after setting up ProgramName)
    fprintf(stderr, "%s: error while opening %s for read: ",
            ProgramName, argv[1]);
    perror("");
4) State the purpose and functioning of the program more clearly.
In particular, state that the program works even when there
are other processes that could grab the file space that would
have been freed had cp been used instead.
5) There are two distinct code sections, one for argument parsing
and one for copying the file. These each deserve at least a
one line comment, to distinguish the two.
One final point: in your new-and-improved version, you essentially
duplicate the open code for the output file. Better would be to set a
variable to the argument to be used for the output file and then do
the output open in one place.
---
Bill
{ uunet!proxftl | novavax } !twwells!bill
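The standard I/O variant suggested above, with fileno() supplying the descriptor for ftruncate, might look something like the sketch below. It is an illustration under assumptions rather than anything posted in the thread: the name `overwrite_stdio` is hypothetical, and the destination is opened with mode "r+" (so it must already exist) because mode "w" would truncate the file up front and defeat the purpose. Taking the input as a `FILE *` also means the output open happens in exactly one place, whether the input is a named file or stdin.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for fileno() and ftruncate() */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from `in` over the existing file
 * `dst`, then trim `dst` to the number of bytes written.
 * Returns 0 on success, -1 on failure. */
int overwrite_stdio(FILE *in, const char *dst)
{
    char buf[BUFSIZ];            /* BUFSIZ comes from stdio.h */
    size_t n;
    long total = 0;
    FILE *out;

    assert(in != NULL && dst != NULL);

    /* "r+" opens for update without truncating; "w" would truncate first. */
    out = fopen(dst, "r+");
    if (out == NULL)
        return -1;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            fclose(out);
            return -1;
        }
        total += (long)n;
    }

    /* Flush buffered output before truncating through the descriptor. */
    if (ferror(in) || fflush(out) == EOF) {
        fclose(out);
        return -1;
    }
    if (ftruncate(fileno(out), (off_t)total) == -1) {
        fclose(out);
        return -1;
    }
    return fclose(out) == EOF ? -1 : 0;
}
```

The fflush before ftruncate matters: without it, data still sitting in the stdio buffer would be written after the truncation rather than before it.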
peter@ficc.uu.net (Peter da Silva) (01/12/89)
In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
> perror("Error 1 : ((fdin = open(argv[1],O_RDONLY)) == -1):");

> into (after setting up ProgramName)

> fprintf(stderr, "%s: error while opening %s for read: ",
> ProgramName, argv[1]);
> perror("");

On many, if not all, systems this will eventually produce the error message:

    insert: error while opening foobar.c for read: : Not a typewriter

Or:

    insert: error ...: : Inappropriate ioctl for device.

Because that fprintf() will stomp on errno. Please, people, if you're going
to do stuff before you call perror save and restore errno, or do this:

    {
        char buffer[BUFSIZ];
        sprintf(buffer, "%s: error while opening %s for read", ProgramName, argv[1]);
        perror(buffer);
    }

Don't you just love getting bounced mail with the error message indicating
that "Fred Bloggs" is not a typewriter, or even worse that someone attempted
to perform an inappropriate ioctl on him? I'm glad I'm not Fred.
--
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Work: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180. `-_-'
Home: bigtex!texbell!sugar!peter, peter@sugar.uu.net. 'U`
Opinions may not represent the policies of FICC or the Xenix Support group.
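The "save and restore errno" advice above can be sketched as below. Note that this swaps in a different technique from the sprintf-then-perror snippet in the post: it captures errno first and formats the text with strerror(), so no stdio call gets a chance to clobber the value before it is used. The function name `report_open_error` and the `log` stream parameter are hypothetical; the message wording follows the example in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: report a failed open() on `path` to `log`.
 * errno is captured before any stdio call and restored afterwards,
 * so the caller still sees the original error code. */
void report_open_error(FILE *log, const char *prog, const char *path)
{
    int saved = errno;           /* grab it before stdio can change it */

    assert(log != NULL && prog != NULL && path != NULL);

    fprintf(log, "%s: error while opening %s for read: %s\n",
            prog, path, strerror(saved));

    errno = saved;               /* leave errno as we found it */
}
```

A caller would use it right after the failing call, for example `if ((fdin = open(argv[1], O_RDONLY)) == -1) report_open_error(stderr, ProgramName, argv[1]);`, where ProgramName is assumed to have been set up earlier as in the posts above.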
guy@auspex.UUCP (Guy Harris) (01/12/89)
>Because that fprintf() will stomp on errno.
...
>Don't you just love getting bounced mail with the error message indicating
>that "Fred Bloggs" is not a typewriter, or even worse that someone attempted
>to perform an inappropriate ioctl on him? I'm glad I'm not Fred.

That's an unrelated problem. "sendmail" has the habit of including the
error message for "errno" in its messages, *even if the failure that
provoked the message wasn't a system call failure and "errno" is just
left over from a system call N operations ago.* Needless to say, it
really shouldn't be doing that.
bill@twwells.uucp (T. William Wells) (01/12/89)
In article <2692@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
: In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
: > perror("Error 1 : ((fdin = open(argv[1],O_RDONLY)) == -1):");
:
: > into (after setting up ProgramName)
:
: > fprintf(stderr, "%s: error while opening %s for read: ",
: > ProgramName, argv[1]);
: > perror("");
:
: On many, if not all, systems this will eventually produce the error message:
:
: insert: error while opening foobar.c for read: : Not a typewriter
:
: Or:
:
: insert: error ...: : Inappropriate ioctl for device.

Damn! You are right. Me, I normally grab the error message from the
tables and format it into my own messages, so, when offering this
suggestion, it slipped by me that fprintf may futz with errno.

Sorry.
---
Bill
{ uunet!proxftl | novavax } !twwells!bill
webber@porthos.rutgers.edu (Bob Webber) (01/12/89)
It turns out that the man page on open does claim to start the file
pointer at the beginning. As far as I am concerned, the lseek is the
equiv of initializing in the program text something that is already
implicitly properly initialized.

I see no reason to prefer the fopen(3) based routines over the open(2)
based routines. Certainly the open(2) routines were quite adequate to
the task at hand.

Although a few people have mentioned to me that I could have written
0L instead of (long) 0, no one has been able to show me why I should
prefer to. This seems purely a matter of personal style (and naturally
I prefer my own way of doing things which is why I do them that way).

At the time I wrote the program, I only accessed BSD man pages and so
was not aware that non-BSD systems lacked ftruncate (a function which
really should be added to any system that has to contend with tight
disk space). Besides mentioning the use of ftruncate, the 2nd version
did mention that this was BSD dependent. People who want to run INSERT
are strongly urged to get a BSD-based UNIX system.

Since ftruncate expects an unsigned long parameter, READSIZE has been
updated to be an unsigned long (hence, believe it or not, the 12Jan89
revision just posted for those people who want to stay on the cutting
edge of INSERT development).

The main motivation for the coding style was to make sure all errors
were covered, so that the program wouldn't fail without the user knowing
that it had. Lord knows that there are plenty of programs out there
with pretty error messages that ignore many system error returns.
The fact that the program cannot be conveniently used without easy
access to its source seems to me a benefit rather than a drawback.
Fortunately we live in a country where people are permitted to write
code according to their own lights.

If someone else wants to write a version by their own lights, I have
but one word of advice: read carefully the man pages of all system
routines used and test the program out on a file system that is 105%
full -- pretty user interfaces on broken programs are of no use to
anyone.

--- BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

p.s., all of the versions posted work fine on int==long BSD systems;
only the 12Jan89 version is likely to work on int==short!=long BSD
systems (i.e., PDP-11s and perhaps some microcomputers).
john@frog.UUCP (John Woods) (01/14/89)
In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
> : > 3) The unnecessary code sequences. The lseek's
> The lseek's are completely unnecessary, regardless of omissions in
> the man page. And of course, one should use standard I/O anyway.
>

I just checked the 3 Nov 87 draft of the ANSI C spec. I was astonished to
find that they do not say what the value of the file pointer is after opening
a pathname. Could someone with a recent copy check on this?

The draft POSIX standard that I checked (12.3, the version of which was just
short of being approved) DOES explicitly say that open() sets the file
position to the beginning of the file.

Frankly, I find the statement "..., regardless of omissions in
the man page" offensive from someone complaining about coding style, as it
smacks much too strongly of the "I read the source, I know what it REALLY
does, and all the world's a VAX anyway" mentality. Probably that's not
really what you feel, but that's how it reads.
--
John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101
...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu
Go be a `traves wasswort. - Doug Gwyn
allbery@ncoast.UUCP (Brandon S. Allbery) (01/15/89)
Bill's not going to be too happy with me: I side with Bob Webber on this.
[Heresy! ;-) ]

As quoted from <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> by webber@athos.rutgers.edu (Bob Webber):
+---------------
| In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
| > First, let's notice that this is essentially equivalent the cp
| > command, minus cp's frills. It does do one thing that I don't believe
| > all cp's do: it verifies that the copy succeeded. But one can do the
| > job with a little shell script that does a cp, some ls's, and some
| > tests. Thus this program is an example of bad analysis: it is a new
| > program to do what existing tools can already do.
|
| No, it is fundamentally different from cp. cp truncates to 0 an
| existing file (or unlinks it) before writing into it (or creating
| a new one). INSERT writes into the file and then truncates it to
| the appropriate size. This difference is crucial if you are on a
| multiuser system that is out of file space.
+---------------

One could argue that the difference should not be necessary. But I've seen
cases where it would have been quite nice.

+---------------
| > Second, it is nonportable. It is Berkeleyoid specific code.
| > Ftruncate marks it as such. There are also those UNIX-specific I/O
| > calls.
|
| It is as portable as possible. Tell me how to accomplish the same
| correctly on a POSIX/v7 system and I will be happy to make the program
| more portable. [By the way, a request for such an equiv has been
| posted to comp.unix.wizards, so people interested in this coding
| aspect should watch for conversation on it in that group (assuming
| that there is anything to say other than it just can't be done except in BSD).]
+---------------

Xenix 5.3.1, at least, has chsize(); this amounts to the same thing. V7
and System V.2 have no such functionality. I don't know about Posix.

Bill: ftruncate() being the whole point of the program, referring to it as
a non-portable cp sounds like a missed target to me.

+---------------
| > Third, it is a bad program, even if we didn't care about portability.
| >
| > 2) The lack of variable naming consistency. If we're going to
| > have the verbosity of words-for-names and capitalization of
| > each word, names like `fdin', `fdout', and `chs' don't belong.
| > (This one is arguable.)
|
| Consider it argued. fdin, fdout, and ch are sufficiently standard in
| unix programming that they are almost predeclared. chs for multiple
| ch's is just a bit of humor. The rest of the names are equally
| descriptive, although they are longer, e.g., ReadExit and ReadSize
| (perhaps ReadTotal would be even better, but I will hold off a release
| for that until some more useful can be bundled with it).
+---------------

I have yet to be flamed for calling a random counter variable "cnt", a
random stat buffer "sb", a random file descriptor or file pointer "fd" or
"fp" (respectively), a random char pointer "cp", etc. This is a foolish
attack, given that you haven't flamed anyone *else* who uses the convention.

+---------------
| > 3) The unnecessary code sequences. The lseek's and the ftruncate
| > serve no purpose that I can see.
|
| The lseeks position the file at the beginning. Oddly enough, the man
| page on open doesn't assert that this is default on our system. From a
+---------------

UNIX seems to assure it; but for portability, you hit the mark. I've *used*
OSes where the file pointer was undefined after an open(). [I prefer not to
think about them...!] Unfortunately, the use of the low-level calls --
including ftruncate() pretty much invalidates the portability. Nice try,
though. [Actually, enough C libraries implement open(), read(), write(),
etc. on non-UNIX systems that they can be considered at least somewhat
portable.]

+---------------
| > 4) The use of a cast to create a long constant. This indicates
| > ignorance of C, or a slavish obedience to a not-understood
| > coding standard.
|
| Aren't you worried about portability to systems with 16 bit ints
| running BSD? The man page clearly says that arg is a long and lint
| expects it.
+---------------

Complaining about the use of "(long) 0" when "0L" will do is along the lines
of complaining about the use of "ls ." when "echo *" will do....

+---------------
| > We'll further note that writing a structured program and using a
| > consistent style (except for the naming and one minor glitch) did not
| > prevent him from writing a bad program.
|
| That remains to be demonstrated.
+---------------

It also raises a *true* point: the use of structure and consistent style
didn't make the program's *purpose* clear to at least one net.reader. Which
is why "self-documenting" languages and programming styles have always
failed.

++Brandon
(P.S. Bill: Take a Valium and try again when you're calmer.)
--
Brandon S. Allbery, moderator of comp.sources.misc allbery@ncoast.org (soon)
uunet!hal.cwru.edu!ncoast!allbery ncoast!allbery@hal.cwru.edu
Send comp.sources.misc submissions to comp-sources-misc@<backbone>
NCoast Public Access UN*X - (216) 781-6201, 300/1200/2400 baud, login: makeuser
bill@twwells.uucp (T. William Wells) (01/16/89)
In article <Jan.12.05.48.19.1989.9091@porthos.rutgers.edu> webber@porthos.rutgers.edu (Bob Webber) writes:
: It turns out that the man page on open does claim to start the file
: pointer at the beginning. As far as I am concerned, the lseek is the
: equiv of initializing in the program text something that is already
: implicitly properly initialized.
With one difference: the unnecessary initialization of data (usually)
does not have a cost. The lseek's do.
: I see no reason to prefer the fopen(3) based routines over the open(2)
: based routines. Certainly the open(2) routines were quite adequate to
: the task at hand.
Portability. They are specific to Unix.
: Although a few people have mentioned to me that I could have written
: 0L instead (long) 0; no one has been able to show me why I should
: prefer to. This seems purely a matter of personal style (and naturally
: I prefer my own way of doing things which is why I do them that way).
Consider: a complex construct is more difficult to understand than a
simple one. 0L is a single token, and is understood to be one.
(long)0 is an expression, which requires some level of interpretation.
Even though this is a trivial case, consistent application of the
implied principle produces more understandable code.
: At the time I wrote the program, I only accessed BSD man pages and so
: was not aware that non-BSD systems lacked ftruncate (a function which
: really should be added to any system that has to contend with tight
: disk space). Besides mentioning the use of ftruncate, the 2nd version
: did mention that this was BSD dependent. People who want to run INSERT
: are strongly urged to get a BSD-based UNIX system.
:-)
But seriously, I do wish BSD was available on a machine I can afford.
: The main motivation for the coding style was to make sure all errors
: were covered, so that the program wouldn't fail without the user knowing
: that it had. Lord knows that there are plenty of programs out there
: with pretty error messages that ignore many system error returns.
: The fact that the program cannot be conveniently used without easy
: access to its source seems to me a benefit rather than a drawback.
I'm not going to argue that point, not here; but let me suggest that
you are in a distinct minority. Most people consider having to go to
the source to understand a program to be either a failure to generate
good error messages or bad documentation, or both.
: If someone else wants to write a version by their own lights, I have
: but one word of advice: read carefully the man pages of all system
: routines used and test the program out on a file system that is 105%
: full -- pretty user interfaces on broken programs are of no use to
: anyone.
Amen.
---
Bill
{ uunet!proxftl | novavax } !twwells!bill
bill@twwells.uucp (T. William Wells) (01/16/89)
In article <1354@X.UUCP> john@frog.UUCP (John Woods) writes:
: In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
: > : > 3) The unnecessary code sequences. The lseek's
: > The lseek's are completely unnecessary, regardless of omissions in
: > the man page. And of course, one should use standard I/O anyway.
: >
: I just checked the 3 Nov 87 draft of the ANSI C spec. I was astonished to
: find that they do not say what the value of the file pointer is after opening
: a pathname. Could someone with a recent copy check on this?

It's in 4.9.3 of the May 88 draft.

: Frankly, I find the statement "..., regardless of omissions in
: the man page" offensive from someone complaining about coding style, as it
: smacks much too strongly of the "I read the source, I know what it REALLY
: does, and all the world's a VAX anyway" mentality. Probably that's not
: really what you feel, but that's how it reads.

I couldn't. I don't have sources. (Whine :-)

What I do know is that it is said in my manual pages, and that all
Unixes that I know of do properly open the file at its beginning.
Therefore, he either misread the manual page (which is what actually
happened) or the manual page has an omission.

Consider the number of programs that would break if this weren't so
and you'll see why I considered any possible screwup in the manual
page to be irrelevant.
---
Bill
{ uunet!proxftl | novavax } !twwells!bill
root@chessene.UUCP (This System) (01/17/89)
In article <Jan.12.05.48.19.1989.9091@porthos.rutgers.edu> webber@porthos.rutgers.edu (Bob Webber) writes:
>did mention that this was BSD dependent. People who want to run INSERT
>are strongly urged to get a BSD-based UNIX system.

Oh good grief.

That's the silliest statement I've ever seen on Usenet.
--
Mark Buda  Domain: hermit@chessene.uucp
           Dumb: ...rutgers!bpa!vu-vlsi!devon!chessene!hermit
"Here, with a compressed air drill, parsnips are harvested." - an old newsreel
bobmon@iuvax.cs.indiana.edu (RAMontante) (01/19/89)
->did mention that this was BSD dependent. People who want to run INSERT
->are strongly urged to get a BSD-based UNIX system.
-
-Oh good grief.
-
-That's the silliest statement I've ever seen on Usenet.

Sometimes you don't really want a smiley. You want a tongue-in-cheeky.

Anyone who wants to run my RPN calculator program is urged to get an
MSDOS-based graphic system. Or a real HP pocket calculator.
guy@auspex.UUCP (Guy Harris) (01/19/89)
>>did mention that this was BSD dependent. People who want to run INSERT
>>are strongly urged to get a BSD-based UNIX system.
>
>Oh good grief.
>
>That's the silliest statement I've ever seen on Usenet.

OK, a non-silly statement of the same basic warning would be:

    People who want to run INSERT should note that, due to the way
    it is obliged to work in order to achieve the goal for which it
    was designed, it must be able to call a procedure that will,
    given a file descriptor or a path name, truncate the file
    referred to by that file descriptor or path name to a
    specified, and possibly non-zero, length. BSD-based UNIX
    systems (which include systems not explicitly advertised as
    such) have such a call, named "ftruncate", and other UNIX
    systems may also have such a call, whether named "ftruncate" or
    not. INSERT cannot be run on systems lacking such a call.

(As for it being "the silliest statement (you've) seen on Usenet", I, at
least, have seen far "sillier" statements on USENET; one common category
is the bold claim, presented as fact, that can be shown to be completely
false by trivially looking something up.)
bill@twwells.uucp (T. William Wells) (01/19/89)
In article <13336@ncoast.UUCP> allbery@ncoast.UUCP (Brandon S. Allbery) writes:
: As quoted from <Jan.8.18.48.59.1989.10861@athos.rutgers.edu> by webber@athos.rutgers.edu (Bob Webber):
: +---------------
: | In article <308@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
: | > Second, it is nonportable. It is Berkeleyoid specific code.
: | > Ftruncate marks it as such. There are also those UNIX-specific I/O
: | > calls.
: |
: | It is as portable as possible. Tell me how to accomplish the same
: | correctly on a POSIX/v7 system and I will be happy to make the program
: | more portable. [By the way, a request for such an equiv has been
: | posted to comp.unix.wizards, so people interested in this coding
: | aspect should watch for conversation on it in that group (assuming
: | that there is anything to say other than it just can't be done except in BSD).]
: +---------------
:
: Xenix 5.3.1, at least, has chsize(); this amounts to the same thing. V7
: and System V.2 have no such functionality. I don't know about Posix.
:
: Bill: ftruncate() being the whole point of the program, referring to it as
: a non-portable cp sounds like a missed target to me.

Well, I did miss the point of the program. And, as has already been
hashed out, using ftruncate or some other nonportability somewhere in
the program is pretty much forced, since truncating a file is
nonportable. However, my portability complaint has more to it than
that. See below.

: +---------------
: | > Third, it is a bad program, even if we didn't care about portability.
: | >
: | > 2) The lack of variable naming consistency. If we're going to
: | > have the verbosity of words-for-names and capitalization of
: | > each word, names like `fdin', `fdout', and `chs' don't belong.
: | > (This one is arguable.)
: |
: | Consider it argued. fdin, fdout, and ch are sufficiently standard in
: | unix programming that they are almost predeclared. chs for multiple
: | ch's is just a bit of humor. The rest of the names are equally
: | descriptive, although they are longer, e.g., ReadExit and ReadSize
: | (perhaps ReadTotal would be even better, but I will hold off a release
: | for that until some more useful can be bundled with it).
: +---------------
:
: I have yet to be flamed for calling a random counter variable "cnt", a
: random stat buffer "sb", a random file descriptor or file pointer "fd" or
: "fp" (respectively), a random char pointer "cp", etc. This is a foolish
: attack, given that you haven't flamed anyone *else* who uses the convention.

The point of the complaint was not the names. Me, I'd use simple
names and be done with it. Consistency is one of the highest virtues
when it comes to style; the complaint was that this code used two
different styles for naming.

: +---------------
: | > 3) The unnecessary code sequences. The lseek's and the ftruncate
: | > serve no purpose that I can see.
: |
: | The lseeks position the file at the beginning. Oddly enough, the man
: | page on open doesn't assert that this is default on our system. From a
: +---------------
:
: UNIX seems to assure it; but for portability, you hit the mark. I've *used*
: OSes where the file pointer was undefined after an open(). [I prefer not to
: think about them...!] Unfortunately, the use of the low-level calls --
: including ftruncate() pretty much invalidates the portability. Nice try,
: though. [Actually, enough C libraries implement open(), read(), write(),
: etc. on non-UNIX systems that they can be considered at least somewhat
: portable.]

Y'all have missed the point about portability. There are four
nonportable things in the program:

    1) The include files
    2) Ftruncate
    3) The Unix-style I/O calls
    4) The int/long identity assumption

We can look at three levels of portability:

1) Portability within Berkeley-like systems.

   The code is Berkeley specific. It had all four of the above sins,
   only the last of which has been fixed; thus the code can't be
   expected to port cleanly to non-Berkeley systems.

2) Portability within Unix-like systems.

   Two things would make it better for porting to non-Berkeley
   Unix-like systems:

       Put the ftruncate call in a macro, and put the macro at the
       top of the program, so that people who are porting the
       program can see what is going on. For systems which do have
       some way of doing a file truncate, the change is then simple
       and obvious.

       Remove all the include files except stdio.h. They are there
       to get ONE constant that would be better obtained from
       stdio.h (BUFSIZ is the replacement I have in mind).

   At this or the previous level of portability the lseek's are not
   required.

3) Portability in general.

   Presuming that one has fixed the above problems, there still
   remains the Unix-style file I/O. While it is true that many
   systems provide a similar interface, not all do. Thus, for this
   level of portability, one must replace those calls with standard
   I/O calls.

   For example, Microsoft C for MS-DOS has all the Unix-style I/O
   calls but the second parameter to open() is different! MS-DOS
   does have a way to truncate files, so this is not moot. (For
   those who are curious, one truncates a file by seeking to the
   desired position to truncate and writing zero bytes. One probably
   has to do the latter as an explicit DOS call, since, presuming
   that the write() call isn't brain damaged, it filters out zero
   byte writes.)

   Note also that, if the code were written for this level of
   portability, the lseek's would still be unnecessary.

: +---------------
: | > We'll further note that writing a structured program and using a
: | > consistent style (except for the naming and one minor glitch) did not
: | > prevent him from writing a bad program.
: |
: | That remains to be demonstrated.
: +---------------
:
: It also raises a *true* point: the use of structure and consistent style
: didn't make the program's *purpose* clear to at least one net.reader. Which
: is why "self-documenting" languages and programming styles have always
: failed.

Damn right. And they always will. "How" is not "why".

Oh yeah, I've found another use for the program. Suppose you have a
file specially laid out on your file system and you want to update
the file. If you just do a cp, the file gets truncated and then new
blocks are allocated, probably at random. Using this program would
eliminate that by directly re-using the blocks. This is useful, even
on a single-thread system. Like MS-DOS. (Yuck!)

"How" is not "why"!

---
Bill
{ uunet!proxftl | novavax } !twwells!bill
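
For readers following along, here is a minimal sketch of the two suggestions in the post above (put the truncate call in a macro at the top, take the buffer size from BUFSIZ in stdio.h), combined with the write-then-truncate order that separates the program from cp. It is not the INSERT source, which does not appear in this thread. The names TRUNCATE_FILE and overwrite_in_place are invented for illustration, and the code assumes a present-day POSIX system where ftruncate() is declared in <unistd.h>, so it keeps the headers such a system needs.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: a modern POSIX system */

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>       /* BUFSIZ */
#include <sys/types.h>   /* off_t, ssize_t */
#include <unistd.h>      /* read, write, close, ftruncate */

/* The one system-specific operation, isolated where a porter will see it.
 * Hypothetical macro name; replace the right-hand side on systems that
 * truncate files some other way. */
#define TRUNCATE_FILE(fd, len) ftruncate((fd), (off_t)(len))

/* Overwrite dst_path with the contents of src_path WITHOUT truncating it
 * first: existing bytes are written over from offset 0, and only after the
 * last write is the file cut down to the number of bytes copied.
 * Returns 0 on success, -1 on any failure. */
int overwrite_in_place(const char *src_path, const char *dst_path)
{
    char buf[BUFSIZ];
    ssize_t got;
    off_t total = 0;
    int in, out, status = -1;

    assert(src_path != NULL && dst_path != NULL);

    in = open(src_path, O_RDONLY);
    if (in < 0)
        return -1;

    /* No O_TRUNC here: that is the whole difference from a plain copy. */
    out = open(dst_path, O_WRONLY | O_CREAT, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((got = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < got) {              /* cope with short writes */
            ssize_t put = write(out, buf + done, (size_t)(got - done));
            if (put <= 0)
                goto cleanup;
            done += put;
        }
        total += got;
    }

    /* got == 0 means clean end of input; got < 0 means a read error. */
    if (got == 0 && TRUNCATE_FILE(out, total) == 0)
        status = 0;

cleanup:
    close(in);
    if (close(out) != 0)
        status = -1;
    return status;
}
```

With this layout, moving to a system that has a different truncate primitive should come down to changing the one macro line. Note that no lseek() appears: the sketch relies on the descriptors starting at offset zero after open().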
john@frog.UUCP (John Woods) (01/19/89)
In article <333@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>In article <1354@X.UUCP> john@frog.UUCP (John Woods) writes:
>:In article <313@twwells.uucp>, bill@twwells.uucp (T. William Wells) writes:
>:> : > 3) The unnecessary code sequences. The lseek's
>:> The lseek's are completely unnecessary, regardless of omissions in
>:> the man page. And of course, one should use standard I/O anyway.
>:>
>:I just checked the 3 Nov 87 draft of the ANSI C spec. I was astonished to
>:find that they do not say what the value of the file pointer is after
>:opening a pathname. Could someone with a recent copy check on this?
>
> It's in 4.9.3 of the May 88 draft.

Which, I find, is where it is in the old draft (and not under the open()
section). Grr.

--
John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101
...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu

Presumably this means that it is vital to get the wrong answers quickly.
        Kernighan and Plauger, The Elements of Programming Style
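
The point being settled in this exchange, that a freshly opened file starts at position zero, can be checked directly. The following is a small sketch assuming a hosted C library and a POSIX system; the two function names are invented for illustration. It asks for the current position immediately after opening a file for reading, once through standard I/O and once through the Unix-style calls.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: a modern POSIX system */

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Position reported by ftell() right after fopen(path, "r"),
 * or -1L if the file cannot be opened. */
long initial_stream_position(const char *path)
{
    FILE *fp;
    long pos;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1L;
    pos = ftell(fp);          /* no reads or seeks have happened yet */
    fclose(fp);
    return pos;
}

/* Offset reported by lseek(fd, 0, SEEK_CUR) right after
 * open(path, O_RDONLY), or -1L if the file cannot be opened. */
long initial_descriptor_position(const char *path)
{
    int fd;
    off_t pos;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1L;
    pos = lseek(fd, (off_t)0, SEEK_CUR);   /* query only, does not move */
    close(fd);
    return (long)pos;
}
```

Both functions should return 0 for any existing readable regular file. Append mode is the one case the C standard leaves implementation-defined for streams, so it is deliberately not exercised here.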