mcneill@eplrx7.UUCP (mcneill) (11/16/88)
Here's a weird one folks:
As superuser on one of our machines I have this problem sometimes:
eplrx7 {root} % cat ~mcneill/.login
cat: write error bad file number
eplrx7 {root} % cat /u3/mcneill/.login
---- works fine ----
eplrx7 {root} % exit
(eplrx7:20) cat ~mcneill/.login
---- works fine ----
su again....
eplrx7 {root} % cat ~mcneill/.login
---- works fine ----
The problem is not confined to ~mcneill/.login. It happens with different
~users trying to cat different files. The problem seems to have something
to do with expansion of ~. The operating system is SUNOS 3.5 running yellow
pages.
--
Keith D. McNeill | E.I. du Pont de Nemours & Co.
uunet!eplrx7!mcneill | Experimental Station
(302) 695-7395 | P.O. Box 80357
| Wilmington, Delaware 19880-0357
jc@minya.UUCP (John Chambers) (11/21/88)
In article <41@eplrx7.UUCP>, mcneill@eplrx7.UUCP (mcneill) writes:
>
> As superuser on one of our machines I have this problem sometimes:
>
> eplrx7 {root} % cat ~mcneill/.login
> cat: write error bad file number
>

Hey, an excuse to post one of my favorite flames: Once again, some turkey
programmer wrote an application that produced an error message, and didn't
say what caused the error. The program could have given more details, and
perhaps the info would have helped track down the problem. For instance,
suppose the error had been:

> cat: write error bad file number 65537="/tmp/fubar"

telling us that the program had thought it was accessing /tmp/fubar, and
the variable containing the file number got garbaged. When passed on to
the vendor's support people, it would give some hints as to where the
problem lies. Just saying "bad file number" isn't very helpful.

It's especially annoying to be told that a program failed because of
"Permission denied", and not be told what the problem is. Knowing the
name of a file it was trying to open (or exec) will usually, after a
quick "ls -l" or "ls -ld", lead to an explanation. Without the file
name, it's often hopeless.

How can we get programmers to do this right? It isn't difficult. Or
perhaps we should be hitting on the QA people, and let them know how
shoddy we think their product is if it won't even tell us why it is
failing. Any ideas?
--
John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)
[Any errors in the above are due to failures in the logic of the keyboard,
not in the fingers that did the typing.]
gandalf@csli.STANFORD.EDU (Juergen Wagner) (11/22/88)
In article <134@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>...
>Hey, an excuse to post one of my favorite flames: Once again, some
>turkey programmer wrote an application that produced an error message,
>and didn't say what caused the error.
>...

One of the worst examples of this is probably the disk formatter on XEROX
1108 LispMachines. When you are formatting a disk, and some error occurs,
you get the message

	Something is wrong. Cannot proceed.

and you are back to the executive prompt. Very helpful, indeed. Something
must be wrong... :-)
--
Juergen Wagner		gandalf@csli.stanford.edu
			wagner@arisia.xerox.com
ps72234@naakka.tut.fi (Pertti Suomela) (11/22/88)
In article <41@eplrx7.UUCP> mcneill@eplrx7.UUCP (mcneill) writes:
eplrx7 {root} % cat ~mcneill/.login
cat: write error bad file number
Tcsh (I don't know if you were running it) contains a bug. If I try to
complete a '~user/file'-style filename by hitting <tab> in the middle
of typing 'file', a similar error results. I do not know of any way to
cure the bug; you just have to avoid completing filenames containing
'~username'. Tcsh gets into a strange state after the described error
condition. To get it straight again, type 'echo ~legalusername' twice.
The first time, echo won't print anything (strange?), but the
second time it works normally.
--
Pertti Suomela, studying (?) at ! Internet: ps72234@tut.fi
Tampere University of Technology, ! UUCP: ps72234@tut.uucp
Finland ! Bitnet: ps72234@fintut.bitnet
nagel@blanche.ics.uci.edu (Mark Nagel) (11/25/88)
In article <PS72234.88Nov22020044@naakka.tut.fi>, ps72234@naakka (Pertti Suomela) writes:
|In article <41@eplrx7.UUCP> mcneill@eplrx7.UUCP (mcneill) writes:
|
|   eplrx7 {root} % cat ~mcneill/.login
|   cat: write error bad file number
|
|Tcsh (I don't know if you were running it) contains a bug.

It isn't just tcsh -- csh has the same bug. So the problem is in the
original csh source, not in anything extra tcsh brings along. What the
actual problem is? I'm as clueless as the next person! If anyone knows
what's going on here, please, feel free to give us a hint...

Mark D. Nagel  UC Irvine - Dept of Info and Comp Sci | radiation n. 1. the act or process
nagel@ics.uci.edu             (ARPA)                 | of radiating. 2. smog with an
{sdcsvax|ucbvax}!ucivax!nagel (UUCP)                 | attitude.
jay@phoenix.Princeton.EDU (Jay Plett) (11/25/88)
In article <PS72234.88Nov22020044@naakka.tut.fi>, ps72234@naakka.tut.fi (Pertti Suomela) writes:
> In article <41@eplrx7.UUCP> mcneill@eplrx7.UUCP (mcneill) writes:
>>   eplrx7 {root} % cat ~mcneill/.login
>>   cat: write error bad file number
>
> Tcsh (I don't know if you were running it) contains a bug.
> .....................................type 'echo ~legalusername' twice.
> In the first time, echo won't print anything (strange?), but in the
> second time it works normally.

I get the same thing on a Sun 386i running a different hacked csh.
Haven't bothered to track it down yet, but here's a hint that might
help someone who can be bothered: the shell fires up the first command
with file descriptor 1 (stdout) closed. Here's the relevant trace
output from running ls twice in succession:

% trace ls ~pro
 . . .
open ("/home/pro", 0, 037367540134) = 1
 . . .
close (1) = 0
write (1, "admin\nbugs\ndistrib\nledger\nmail\nm".., 61) = 0
 . . .
% trace ls ~pro
open ("/home/pro", 0, 037367540134) = 3
 . . .
close (3) = 0
 . . .
write (1, "admin bugs distri".., 108) = 108
 . . .
close (1) = 0

---
...jay@princeton.edu
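
[Editor's illustration.] The first trace shows open() returning 1, which can
only happen if descriptor 1 was free when ls started: open() always hands back
the lowest-numbered descriptor not currently in use. The sketch below shows
that rule in isolation. It is an illustration only, not code from csh or ls;
the function name lowest_fd_is_reused() is hypothetical, and it assumes a
system that provides /dev/null.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if a freshly opened file reuses the descriptor number that
 * was just closed, 0 if it does not, and -1 if setup fails.
 * Hypothetical helper for illustration; not taken from any shell.
 */
int lowest_fd_is_reused(void)
{
    int p[2];
    int freed, fd, reused;

    if (pipe(p) == -1)          /* pipe() takes the two lowest free slots */
        return -1;
    freed = p[0];
    close(p[0]);                /* that slot is now the lowest free one */

    fd = open("/dev/null", O_RDONLY);   /* open() must take the lowest free slot */
    if (fd == -1) {
        close(p[1]);
        return -1;
    }
    reused = (fd == freed);

    close(fd);
    close(p[1]);
    return reused;
}
```

In a process started with descriptor 1 closed, the same rule means the first
open() lands on 1, which is what the first ls run in the trace shows.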
peter@stca77.stc.oz (Peter Jeremy) (11/28/88)
In article <134@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>It's especially annoying to be told that a program failed because of
>"Permission denied", and not be told what the problem is. Knowing the
>name of a file it was trying to open (or exec) will usually, after a
>quick "ls -l" or "ls -ld", lead to an explanation. Without the file
>name, it's often hopeless.

This leads one into the area of whether you want a secure system or a
friendly/usable one. If you want a really secure system, you don't want
to tell the users what went wrong, because if they were permitted to do
it, they wouldn't have gotten the message. If they are violating
security, any information you give them might help them to get around
the security system.

Anecdote time: I once worked on an OS (not Unix or a flavour thereof)
with a hole in this area - If you tried to create a file, it returned the
error "file already exists" if that file existed, whether or not you had
permission to access the file or directory. In some cases, just _knowing_
that a file exists (or doesn't exist) can be useful information.

>How can we get programmers to do this right?

From the security point of view, it is right.

Having said all that, I agree that messages like "Permission Denied" are
a severe pain when one is trying to debug a system. I tend towards the
view that you always provide additional information - just not necessarily
in a form useful to the end user (like giving the source file/line and
internal error numbers when an error occurs) when the end-user is just
a user.

What it comes down to is, do we want Unix to be friendly and helpful, or
secure? I prefer the friendly approach personally.
--
Peter Jeremy (VK2PJ)         peter@stca77.stc.oz
Alcatel-STC Australia        ...!uunet!stca77.stc.oz!peter
41 Mandible St               peter%stca77.stc.oz@uunet.UU.NET
ALEXANDRIA NSW 2015
dupuy@douglass.columbia.edu (Alexander Dupuy) (11/29/88)
In article <PS72234.88Nov22020044@naakka.tut.fi>, ps72234@naakka (Pertti Suomela) writes:
|In article <41@eplrx7.UUCP> mcneill@eplrx7.UUCP (mcneill) writes:
|
|   eplrx7 {root} % cat ~mcneill/.login
|   cat: write error bad file number
|
|Tcsh (I don't know if you were running it) contains a bug.

The problem is not with cat; in fact, cat is the most helpful program you
could run in the circumstance, because it tells you what the problem is.
Most commands just execute with no output whatsoever. This is much worse
than a weird error message. However, the cat error message is still not
helpful enough - I only figured out the bug using ofiles(1), available
from a sun-spots archive near you.

This bug exists in all versions of the csh which do filename completion
on Suns (or other NFS systems) running yellow pages.

What happens is that when getpwnam(3) is called to find the home
directory for some user (~j-user) in the tilde() function, the yellow
pages opens up a UDP socket or two to talk to the various yellow pages
daemons (ypbind, ypserv). It closes the first of these, but leaves the
second one open. Since the csh keeps file descriptors 0-3 unused (it
stashes stdin, etc. up around 20), the file descriptor for this socket is
usually 1 (aka stdout).

When the csh fork/execs a program, it moves stdin, etc. back down to the
0-2 range. But because the yp socket is already down there, the stdout
(1) file descriptor gets closed when all is said and done (this may be
due to close-on-exec being set for 1, or yp closing it explicitly, or
something else, I never bothered to find out).

The result of this is that your cat was invoked with no file descriptor
1. As a result it got a write error (EBADF bad file number). Most
programs which never checked the result of writes to stdout got errors
too, but didn't tell you about them. You can duplicate this with
'sh -c "program >&-"'.

The fix, in tilde(), is to make a call to the undocumented, but extremely
useful, yp_unbind() function after you've gotten the answer back from
getpwnam(3). You should also check that the closem() function invokes
yp_unbind(), although that may not be necessary if you fix tilde() (at
any rate, it won't hurt).

@alex
--
inet: dupuy@columbia.edu
uucp: ...!rutgers!columbia!dupuy
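
[Editor's illustration.] The failure described in this post, a write to a
descriptor that is not open, can be reproduced without csh or Yellow Pages.
The sketch below is an illustration under stated assumptions, not code from
cat or any shell: write_all() and write_to_closed_fd() are hypothetical helper
names. It shows a write loop that checks its result, which is what let cat
notice the problem while other programs stayed silent.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Write the whole buffer to fd.
 * Returns 0 on success, or the errno value of the failing write().
 * Hypothetical helper for illustration.
 */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return errno;       /* report the failure, don't hide it */
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Reproduce the symptom: write to a descriptor number that is closed.
 * Returns the errno from write_all(), or -1 if the pipe cannot be made.
 */
int write_to_closed_fd(void)
{
    int p[2];

    if (pipe(p) == -1)
        return -1;
    close(p[0]);
    close(p[1]);                /* p[1] is now an invalid descriptor */
    return write_all(p[1], "hello\n", 6);
}
```

Here write_to_closed_fd() returns EBADF, the error behind cat's "bad file
number" message; many current systems spell the same error "Bad file
descriptor". A program that ignores the return value of write() would see
nothing at all.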
gregg@ihlpb.ATT.COM (Wonderly) (12/06/88)
>>How can we get programmers to do this right?
> From the security point of view, it is right.
>
> Having said all that, I agree that messages like "Permission Denied" are
> a severe pain when one is trying to debug a system. I tend towards the
> view that you always provide additional information - just not necessarily
> in a form useful to the end user (like giving the source file/line and
> internal error numbers when an error occurs) when the end-user is just
> a user.

The biggest problem is getting people to use the OS error messages and
capabilities instead of inventing their own. Time after time I have
changed

    if ((fd = creat (file, 0600)) == -1) {
        printf ("Can't create some file\n");
        handle_the_error_exit();
    }

to

    if ((fd = creat (file, 0600)) == -1) {
        perror (file);
        handle_the_error_exit();
    }

in code I have ported from the net. Perror(3) (and the associated
sys_errlist array) is one of the MOST useful parts of the C-library under
UN*X (please don't start another 'errno should not be global' war though).
--
It isn't the DREAM that NASA's missing...  DOMAIN: gregg@ihlpb.att.com
It's a direction!                          UUCP:   att!ihlpb!gregg
ditto@cbmvax.UUCP (Michael "Ford" Ditto) (12/06/88)
In article <360@stca77.stc.oz> peter@stca77.stc.oz (Peter Jeremy) writes:
>In article <134@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>>It's especially annoying to be told that a program failed because of
>>"Permission denied", and not be told what the problem is.
>
>This leads one into the area of whether you want a secure system or
>a friendly/usable one. If you want a really secure system, you don't
>want to tell the users what went wrong, because if they were permitted
>to do it, they wouldn't have gotten the message. If they are violating
>security, any information you give them might help them to get around
>the security system.

Although you are right in the particular example of "Permission denied",
I think the original complaint was about error reporting in general, not
reporting of security violations.

This is a particular pet peeve of mine, and I always make it a point to
call perror() with the name of the program, and a description of the
operation that failed. A *minimal* error message should be something
like:

    $ cat foo
    cat: can't open "foo": Permission denied

yet, on many systems, the above command would print out

    foo: Permission denied
or
    cat: can't open input.
or
    cat: Permission denied

none of which is very useful, and some of them can be quite misleading.
The first message, for example, seems to be from a program called "foo",
and the last one makes it appear that the user doesn't have permission to
run the cat program.
--
-=] Ford [=-                    "The number of Unix installations
(In Real Life: Mike Ditto)       has grown to 10, with more expected."
ford@kenobi.cts.com              - The Unix Programmer's Manual,
...!sdcsvax!crash!elgar!ford       2nd Edition, June, 1972.
ditto@cbmvax.commodore.com
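
[Editor's illustration.] The message shape asked for in this post (program
name, the operation that failed, the object it failed on, then the system's
reason) fits in one small helper. This is a minimal sketch, not code from any
real cat: format_error() is a hypothetical name, and it uses snprintf(), which
postdates this thread, so treat it as a modern rendering of the idea.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a message of the form
 *     prog: can't OP "PATH": REASON
 * where REASON is the system's text for the errno value err.
 * Returns what snprintf() returns. Hypothetical helper for illustration.
 */
int format_error(char *buf, size_t size, const char *prog,
                 const char *op, const char *path, int err)
{
    return snprintf(buf, size, "%s: can't %s \"%s\": %s",
                    prog, op, path, strerror(err));
}
```

A caller would save errno immediately after the failing call and pass it in,
for example format_error(msg, sizeof msg, "cat", "open", "foo", saved_errno),
then print msg to stderr. The exact wording of the reason comes from
strerror(3) and varies by system and locale.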