vance@mtxinu.UUCP (Vance Vaughan) (11/09/84)
4.2 BUGLIST ABSTRACTS from MT XINU, part 5 of 10:

The following is part of the 4.2 buglist abstracts as processed by Mt Xinu.
The initial line of each abstract gives the offending program or source
file and source directory (separated by --), who submitted the bug, when,
and whether or not it contained a proposed fix. Due to license
restrictions, no source is included in these abstracts. Important general
information and disclaimers about this and other lists are appended at the
end of the list...

lpr/printjob.c--usr.lib dagobah!efo (Eben Ostby) 17 Nov 83 +FIX
lpd will die silently occasionally. Upon putting debugging info into it,
we found that the fields used to store items like fromhost, logname,
jobname, class, title were (a) woefully short, (b) not checked for bounds
overflow.

REPEAT BY: lpr -Panyone -Janythingatallthatsover32characters anything

FIX: increase size of those fields; use
    strncpy(class, line+1, sizeof class-1);
instead of strcpy.
_______________________________________________________________________________
lprm--usr.lib salkind@nyu (Lou Salkind) 30 May 84 +FIX
If you try to remove a file local to a machine, lprm can dump core.

REPEAT BY: lpr lprm nnn
_______________________________________________________________________________
lprm--usr.lib Tim Morgan <morgan@uci-750a> 6 Aug 84 +FIX
When you use lprm to remove a file from a printer queue, if that job is
the currently active job for that queue, the daemon is killed to stop it
from printing the job. The job is then dequeued, and lprm attempts to
restart the queue. But it always fails with the message
    /usr/lib/lpd: <host>: unknown printer

REPEAT BY: Try removing the active job with lprm.
_______________________________________________________________________________
ls.c--bin dlw@ucbopal.CC (David L. Wasley) 22 Nov 83 +FIX
The name lookup algorithm is horribly slow for UIDs > 2047. It
essentially reverts to rescanning the passwd file.
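The abstract carries no source, so the submitted fix is not shown here.
As a rough illustration only of how repeated passwd scans can be avoided,
the sketch below caches each UID-to-name result in a small chained hash
table. The names uid_name, struct uidname and NBUCKETS are hypothetical
and are not taken from ls.c.

```c
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define NBUCKETS 64             /* arbitrary size chosen for this sketch */

struct uidname {
    uid_t uid;
    char name[32];
    struct uidname *next;       /* chain of entries sharing a bucket */
};

static struct uidname *buckets[NBUCKETS];

/*
 * Return the login name for uid, consulting the passwd database at most
 * once per distinct uid. Unknown uids are cached as their decimal value.
 * Returns NULL only if memory runs out.
 */
const char *uid_name(uid_t uid)
{
    unsigned h = (unsigned)((unsigned long)uid % NBUCKETS);
    struct uidname *e;
    struct passwd *pw;

    for (e = buckets[h]; e != NULL; e = e->next)
        if (e->uid == uid)
            return e->name;     /* cache hit: no passwd lookup */

    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    pw = getpwuid(uid);
    if (pw != NULL)
        snprintf(e->name, sizeof e->name, "%s", pw->pw_name);
    else
        snprintf(e->name, sizeof e->name, "%lu", (unsigned long)uid);
    e->uid = uid;
    e->next = buckets[h];
    buckets[h] = e;
    return e->name;
}
```

With a cache of this kind the cost of a long listing depends on the number
of distinct owners, not on the number of directory entries.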
REPEAT BY: Cd to the parent directory of a group of users whose UIDs are
predominantly > 2047. Do 'ls -lg' and wait ... For example: A directory
with 56 entries took
    11.0u 3.6s 1:18 18% 22+86k 34+7io 14pf+0w
to list with the distributed /bin/ls. With the improved version (see
below) it took
    1.2u 1.3s 0:12 21% 20+71k 5+4io 2pf+0w
Roughly a factor of 6 !!
_______________________________________________________________________________
ls.c--bin sun!Jskud 21 Nov 83 +FIX
(1) ls wants to follow symbolic links if the -F flag is set
(2) ls wants to follow symbolic links if linked to a directory
(3) ls does not handle -F and -l properly

REPEAT BY:
    cd /tmp && ln -s /bin Jx
    ls -l Jx
will print
    lrwxr-xr-x 1 Jskud 4 Nov 21 15:28 Jx -> /bin
yet
    ls -lF Jx
will long list all of /bin! and
    ls -lF /tmp
will print
    drwxr-xr-x 2 bin 1024 Oct 20 14:48 Jx/
(that is, /Jx appears to be 1024 bytes big and a directory, when it should
appear to be 4 bytes long and a symbolic link)
_______________________________________________________________________________
ls.c--bin chris@maryland (Chris Torek) 2 Oct 84 +FIX
``ls'' has too small a field width for the inode number printed with the
-i option.

REPEAT BY: run ``ls -li'' on the H partition of a full RA81, in a
directory with inode numbers > 99999.

FIX: Change the %5d column to a %6d column -- that should handle up to
about a 2 gigabyte file system.... (Just for fun: at 2048 file bytes per
inode times 1,000,000 inodes => 2.048 gigabytes, which is a little more
than four and a half full RA81s.)

Chris
_______________________________________________________________________________
lseek.2--man donn@utah-cs (Donn Seeley) 23 Sep 84 +FIX
The lseek() manual page omits the useful fact that the 'whence' cookies
can be found in <sys/file.h>. It also describes the type of the offset as
'int' when it really ought to be 'off_t' or at least 'long', for
consistency with the lint library and the kernel.

REPEAT BY: N/A.

FIX: Here are the changes I made.
To check the changes, I made a file which uses the includes and the
defines, and compiled it; it worked and got no complaints from the
compiler. The changes are in man2/lseek.2:

----------------------------------------------------------------
*** /tmp/,RCSt1023012 Sun Sep 23 05:17:09 1984
--- lseek.2 Sun Sep 23 04:56:28 1984
***************
*** 5,11
  .SH SYNOPSIS
  .nf
  .ft B
! .ta 1.25i 1.6i
  #define L_SET 0 /* set the seek pointer */
  #define L_INCR 1 /* increment the seek pointer */
  #define L_XTND 2 /* extend the file size */
--- 5,16 -----
  .SH SYNOPSIS
  .nf
  .ft B
! #include <sys/types.h>
! #include <sys/file.h>
! .PP
! .nf
! .ft B
! .ta 1.25i 1.6i 1.8i
  #define L_SET 0 /* set the seek pointer */
  #define L_INCR 1 /* increment the seek pointer */
  #define L_XTND 2 /* extend the file size */
***************
*** 12,19
  .PP
  .ft B
  pos = lseek(d, offset, whence)
! int pos;
! int d, offset, whence;
  .fi
  .ft R
  .SH DESCRIPTION
--- 17,26 -----
  .PP
  .ft B
  pos = lseek(d, offset, whence)
! off_t pos;
! int d;
! off_t offset;
! int whence;
  .fi
  .ft R
  .SH DESCRIPTION
----------------------------------------------------------------

If you want to go with the typedefs, then the entry for lseek() in
usr.bin/lint/llib-lc should change from
    long lseek(f, o, d) long o; { return(0); }
to
    off_t lseek(f, o, d) off_t o; { return(0); }
In any event, the lint library and the manual page differ and need to be
made consistent...

Donn Seeley University of Utah CS Dept donn@utah-cs.arpa
40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn
_______________________________________________________________________________
mail--bin smoot@ut-sally.ARPA (Smoot Carl-Mitchell) 27 Mar 84 +FIX
file locking using 4.2 flock

The following mods to /bin/mail and /usr/ucb/Mail do the mail spool file
locking using the 4.2 BSD flock system call. Note that the release version
of Mail was not doing locking when reading the spool file. This is wrong
and can result in lost or mangled mail under some circumstances.
This fix also allows /usr/spool/mail to be protected 755. The only side
effect of this is that user mail boxes are truncated to zero length,
rather than being deleted.
_______________________________________________________________________________
mail--ucb raphael@wisc-crys.arpa (Raphael Finkel) 10 Oct 84
The word 'at' in subject lines of messages gets truncated to '@'

REPEAT BY: Mail yourself a message with '~s What's at to ya?'
_______________________________________________________________________________
mail--ucb mayo@UCBCALDER 7 Aug 83
The syntax 'user@system' changes into 'system:user' when replying to
mail.

REPEAT BY: Replying (r command) to this mail header (on ucbcalder):
    From grace@UCBIC Sat Aug 6 21:57:34 1983
    From: grace@UCBIC (Grace Mah)
    Subject: tpack.lib
    To: mayo@kim
    Cc: duksoon@cad
gives this mail header:
    To: kim:mayo grace@UCBIC
    Subject: Re: tpack.lib
    Cc: duksoon@cad
_______________________________________________________________________________
mail.c--bin cbosgd!mark Jun 4 83 +FIX
The following appeared on Usenet - the bug is still in 4.1c:

    From: phil@sequel.UUCP
    Subject: YAMB (Yet Another Mail Bug)
    Date: Thu, 2-Jun-83 16:11:26 EDT

There is another bug in 4.1 /bin/mail. I just had my third complaint of
mail getting mixed together so decided to fix it once and for all. I
looked through my bug files and found only the comsat problem (fwrite to
mail but no flush or close before writing to comsat). I had already fixed
the comsat bug.

REPEAT BY: I suspected the locking protocol was fubar (was running setuid
root, etc). This proved not to be the case. I tested it by running the
following script:
    (mail -s "Try number 1" phil < /etc/passwd ) &
    (mail -s "Try number 2" phil < /etc/passwd ) &
    (mail -s "Try number 3" phil < /etc/passwd ) &
    (mail -s "Try number 4" phil < /etc/passwd ) &
    (mail -s "Try number 5" phil < /etc/passwd ) &
I found that usually only 1 to 3 copies arrived, usually in a jumbled
order. Humm. Must be failing to lock the mailbox right?
WRONG! I crawled through the routines lock() and lock1(). They appear to
be 100% ok. In looking at where mail does the local delivery I noticed the
following sequence of code:
_______________________________________________________________________________
mail/cmd3.c--ucb smoot@ut-sally.ARPA (Smoot Carl-Mitchell) 27 Jan 84 +FIX
The reply command in Mail does not handle replies properly. Currently the
To: list in the original message is placed in the To: list of the reply.
It should be placed in the Cc: field. Also the resulting address list is
blank delimited instead of comma delimited as it should be. Another
problem is that the heuristic for deleting the recipient's name from the
reply list often does not work, especially for network mail. This fix adds
an "ownname" function which uses a slightly more powerful heuristic,
especially for internet and uucp mail. Note that the calls to "mapf" have
been ifdef'ed out also. There is no reason for Mail to know about the
network topology, since sendmail knows about it.
_______________________________________________________________________________
make--bin root%wisc-stat.uwisc@wisc-crys.arpa 18 Jul 84 +FIX
In the documentation for `make', they mention that the sequence of
suffixes that `make' will try when trying to create a .o file is
    .o.c.e.r.f.y.yr.ye.l.s
each with their corresponding rules. It does not work that way.
Apparently .f appears before .e and .r in the default sequence. This is a
real pain at times since .e and .r (efl and ratfor source files) are
translated into a .f file by f77 before they are compiled. If the
compilation is successful, the .f file is erased but, if there is an
error, the .f file is left intact to allow the user to determine what
produced the error message.
This means that you find the error, then edit the .e or .r file, then must
remember to remove the .f file or your next attempt at making the program
will produce the same errors since make uses the .f file instead of the .e
or .r file.

REPEAT BY: Set up an efl or ratfor program as described above (with
errors).

FIX: In /usr/src/bin/make/files.c the default suffixes are defined with .f
(and .F) in front of .e and .r. Put .f and .F after .e and .r.
_______________________________________________________________________________
make--bin quarles@ucbic (Tom Quarles) 19 Mar 84 +FIX
Make loses track of file descriptors when making very large programs, and
eventually a compiler will die due to an inability to open a temporary
file - f77 seems to be the most common cause, although it probably bothers
other compilers also.

REPEAT BY: attempt to 'make' a program that has many subroutines in
separate files and a command that must be exec'd (by the shell - ie has a
metacharacter in it) for each one. After about 15 subroutines have been
compiled, the compiler will complain about inability to open temp files.
_______________________________________________________________________________
makefile--sys allegra!rdg Jul 19 83
#ifdefs missed when "depend" makes dependency list
_______________________________________________________________________________
makefile--sys Chris Kent <decwrl!kent%Shasta@SU-Score> 18 Jul 83
The Makefile entry to build bootrl is incorrect. It compiles with confra.o
instead of confrl.o; thus attempts to boot an rl11 on a 750 always yield
"Unknown device".

REPEAT BY: Just try to build it and boot it!
FIX: s/confra.o/confrl.o/
----------
_______________________________________________________________________________
man.c--ucb smoot@ut-sally.ARPA (Smoot Carl-Mitchell) 15 Dec 83 +FIX
The way man searches mano, manl, manl and manp is inadequate
_______________________________________________________________________________
man.c--ucb clyde@ut-ngp.ARPA 2 Feb 84 +FIX
Man does not always interact well with job control. When displaying a
manual page using "more" and interrupted, processes can be left hanging.

Symptoms:
    there is a 'sh -c' in proc wait (on more)
    'more' is in stopped mode.
NOTE: This only happens when using the Cshell.

Cause: Man and more both trap SIGINT. Man removes temp file, and more
restores terminal modes before exiting. If man exits before more, then
Cshell resets the process group of the controlling terminal, then when
more finally gets around to restoring tty modes, it is "in the background"
and so gets stopped with SIGTTOU. The shell that fired up more is hung
waiting for more to exit.

Man rolls its own system() so it can use vfork; however it does NOT ignore
SIGINT and SIGQUIT while waiting for the shell it invoked to exit. So if
interrupted, man can exit before its subprocesses do. No problem unless
the subprocesses do things with the terminal - then they will get stopped
with SIGTTOU.

REPEAT BY: Do 'man something' and interrupt at the "--More--" prompt a few
times until man dies and the tty is left in noecho cbreak mode.
_______________________________________________________________________________
man.c--ucb dlw@ucbopal.CC (David L. Wasley) 1 Feb 84 +FIX
'Man' does not use the 'cat'able file if output is not a tty. Also, the
'nroff' command it uses to produce the 'cat'able file should NOT include
the -h option: this produces very sloppy output on terminals with
"standout mode glitch".
REPEAT BY: man 1 more > file & ps
_______________________________________________________________________________
mh--local cak (Chris Kent) 12 Aug 83
When using MH, if you do an 'inc' and your inbox fills up, all the letters
get scribbled on the file ? in inbox, each overwriting the previous. In
short, you lose mail.

REPEAT BY: Touch file 999 in your inbox, mail yourself two letters, and
inc.
_______________________________________________________________________________
mh/cmds/prompter.c--new rws@mit-bold (Robert W. Scheifler) 21 Nov 83 +FIX
prompter doesn't prompt for To:, Cc:, Subject:.

REPEAT BY: Make prompter your editor and run comp. It will start reading
in the body of the message.

FIX: In main(), change
    if(field[0] != '\n' || field[1] != 0) {
to
    i = 0;
    while ((field[i] == ' ') || (field[i] == '\t'))
        i++;
    if(field[i] != '\n' || field[i+1] != 0) {
Stripping off leading white space should really be done in m_getfld, so
all the commands don't have to repeat such code.
_______________________________________________________________________________
mh/cmds/replsubs.c--new sjk@sri-spam (Scott J. Kramer) 15 Dec 83 +FIX
It is sometimes possible for the MH "repl" program to generate the reply
addresses with spaces around the "@"s [eg, "sjk @ sri-spam"]. This
violates RFC822, if I understand it correctly, or at least causes
"sendmail" to complain since it parses the header as three addresses
rather than a single one.

REPEAT BY: If a user mails to a "non-forwarding" local user and the
message includes a "Reply-To" header containing a hostname, a bogus "Cc"
header will be generated by the "repl" program. For example, if "sjk"
mails a message to local user "sjkx" which contains "Reply-To:
sjk@sri-spam", the "repl" program will create a "Cc: sjkx @ sri-spam"
header, which is invalid. I believe that the "To" header will be correct,
however.
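
The submitted fix is not reproduced in this abstract. Purely as an
illustration of the symptom, the sketch below removes blanks and tabs that
sit directly next to an "@" in an address string. The function name
squeeze_at is hypothetical and is not part of MH; the sketch also ignores
RFC822 quoted strings and comments, which a real header parser would have
to respect.

```c
/*
 * Remove blanks and tabs that sit directly before or after an '@',
 * editing the string in place: "sjkx @ sri-spam" becomes "sjkx@sri-spam".
 * White space elsewhere in the string is left alone.
 */
void squeeze_at(char *addr)
{
    char *src = addr;   /* next character to examine */
    char *dst = addr;   /* next position to write */

    while (*src != '\0') {
        if (*src == '@') {
            /* back up over white space already copied before the '@' */
            while (dst > addr && (dst[-1] == ' ' || dst[-1] == '\t'))
                dst--;
            *dst++ = *src++;
            /* skip white space that follows the '@' */
            while (*src == ' ' || *src == '\t')
                src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Applied to the "Cc" value from the example above, this would yield a single
address that sendmail parses as one token.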
_______________________________________________________________________________
mip/match.c--lib solomon@wisc-crys.arpa 6 May 84 +FIX
In some circumstances, the second (code-generation) pass of the portable C
compiler cannot cope with a procedure call when an actual parameter is a
record of length exactly 4 bytes.

REPEAT BY: The problem is difficult to reproduce reliably. Try compiling
the following program.
    struct { int a; } arr[3];
    main() {
        register i;
        f(arr[i]);
    }
During the course of compilation, /lib/ccom will reference an array with a
subscript value that is too large. The results are unpredictable; there
may be an error message, a segmentation violation, or no error at all
depending on the details of how /lib/ccom was compiled (e.g., whether the
-O flag was specified).
_______________________________________________________________________________
mip/pftn.c--lib sdcsvax!sdcc3!steve@Nosc (steve serocki) 4 Sep 84 +FIX
Error checking in the portable C compiler's semantic analysis phase has
always been less than rigorous, to the point now of being the butt of
cruel jokes comparing the pcc in robustness to the C shell. Recent
ridicule (*) has been directed especially at the comic handling of
irregular parameter declarations.

The pcc will disdain catching parameters described as "static", or
"extern" storage class in semantic analysis, ultimately falling on its
nose in codegen, or worse. A catch is described in a companion bug report.

The pcc will often respond to missing parameters with a "bad arg temp"
warning, and then go on to generate incorrect code. If the missing
parameter is in addition (improperly) declared "static" or "external" no
warning will be given and the result will be even more bizarre. A catch
for the "bad arg temp" bug is described here.

[(*) Refs: 3015@utah-cs.UUCP, 8180-8181@ucmp-cs.UUCP]

REPEAT BY: "bad arg temp": Give the following C program to the pcc and
compile with the -S option.
The compiler will give the "bad arg temp" warning and display a ridiculous
stack offset for the parameter.

-------the program:------
int imp;
main(perverse )
int imp; /* "imp" missing from param list */
{
    imp = 6969;
}
-------the .s file:-------
LL0:
    .data
    .comm _imp,4
    .text
    .align 1
    .globl _main
_main:
    .word L13
    jbr L15
L16:
"foo.c", line 6: warning: bad arg temp
    cvtwl $6969,-1275(ap)
    ret
    .set L13,0x0
L15:
    jbr L16
    .data

Even more bizarre: Give the following C program to the pcc and compile
with the -S option. The compile will complete normally and silently. L13
will be undefined in the output.

int imp;
main(perverse )
static int imp; /* "imp" missing from param list */
{
    imp = 6969;
}
_______________________________________________________________________________
mip/pftn.c--lib sdcsvax!sdcc3!steve@Nosc (steve serocki) 4 Sep 84 +FIX
Error checking in the portable C compiler's semantic analysis phase has
always been less than rigorous, to the point now of being the butt of
cruel jokes comparing the pcc in robustness to the C shell. Recent
ridicule (*) has been directed especially at the comic handling of
irregular parameter declarations.

The pcc will often respond to missing parameters with a "bad arg temp"
warning, and then go on to generate incorrect code. A catch for the "bad
arg temp" bug is described in a second, co-posted bug report.

The pcc will disdain catching parameters described as "static", or
"extern" storage class in semantic analysis, ultimately falling on its
nose in codegen, or, gosh forbid, at assemble time or later if other bugs
are present. A catch for the misspecified parameter SC is described here.
The problem is simply that the appropriate check against these two
meaningless storage classes is never made.

[(*) Refs: 3015@utah-cs.UUCP, 8180-8181@ucmp-cs.UUCP]

REPEAT BY: Give the following C program to the pcc. The compilation will
terminate ungracefully with an internal compiler error.
main(perverse )
static int perverse; /* bad storage class */
{
}
_______________________________________________________________________________
mip/trees.c--lib donn@utah-cs (Donn Seeley) 27 Aug 84 +FIX
Initializers must evaluate to a constant, but the C compiler fails to
reduce expressions of the form '(unsigned) 2 >> 1' to a constant, so they
cause the compiler to bomb. '(unsigned)' can be replaced here by a cast to
any type compatible with but not the same as 'int' and the operator may be
varied as well (with varying incorrect results; some operators, in
particular '+', work correctly). This bug was found by Krzysztof Kozminski
at the University of Rochester (whose article we never got), and was
followed up by Chris Torek at the University of Maryland, from whose
article I got all the information.

REPEAT BY: Put the following line in a file 'one.c' and try to compile it.
----------------------------------------------------------------
int one = (unsigned) 2 >> 1;
----------------------------------------------------------------
The precise message you get is:
----------------------------------------------------------------
"one.c", line 1: compiler error: expression causes compiler loop: try simplifying
----------------------------------------------------------------
(The C compiler on monet (Ralph's version) complains about an illegal
initialization instead -- it's smart enough not to try to generate code
into the data area, which is what the older version did.)
_______________________________________________________________________________
mkfs--man Jay Lepreau <lepreau@utah-cs> 10 Jul 84 +FIX
The existing and useful option to set the "number of bytes per inode" is
not documented. (It is doc'ed in newfs(8) however.)

REPEAT BY: man mkfs; and see mkfs.c
_______________________________________________________________________________
more.c--ucb dlw@ucbopal.CC (David L. Wasley) 31 Jan 84 +FIX
'more' has a number of known bugs, among them:
1) shell escapes, when stdin is a pipe, get the pipe on fd:0,
2) underline mode is not turned off at the end of lines,
3) the prompt is not fully erased on terminals with "standout mode
   glitch",
4) underlining does not take advantage of white space for terminals with
   "underline mode glitch".

REPEAT BY: Try 'man diff' on a Televideo 925. Note extra spaces around
underlined words. Several screens down is a line that ends with an
underlined word: note that the underlining extends to the right edge of
the screen. Try 'more /usr/dict/words'. Note that the prompt is not fully
erased if the next word is 13 characters. Etc ...
_______________________________________________________________________________
more.c--ucb root%oregon-grad.csnet@csnet-relay.arpa 3 Aug 84 +FIX
"more" can behave improperly with large ( > 100 byte ) termcap entries,
e.g., loss of standout mode for "-- More --" prompt.

REPEAT BY: Inspection of the source will reveal the problem (see Fix
below), but if you want to duplicate it: Set TERMCAP to a filename
containing a termcap entry similar to that given below (it is > 100
chars), then invoke "more" on a file long enough to cause paging. The
symptom in my case was loss of standout mode for the "-- More --" prompt
at bottom of screen. To duplicate the problem, TERMCAP should be set to a
filename containing the termcap info, rather than having TERMCAP contain
the entry itself.

FIX: In more.c:
< char clearbuf[100];
---
> char clearbuf[TBUFSIZ];
------------------------------------------------------------------------
If you really want to duplicate the problem, use this termcap entry, with
TERMCAP env. var. set to its filename.
# $Header: termcap,v 1.9 84/07/24 03:58:17 root Exp $
d1|vt100|vt-100|pt100|pt-100|dec vt100 w/ no-scroll region:\
	:co#80:li#22:\
	:bs:pt:\
	:is=\E7\E>\E[?7h\E[2;23r\E[?6h\E8:\
	:if=/usr/lib/tabset/vt100:\
	:ks=\E[?1h\E=:ke=\E[?1l\E>:\
	:ku=\EOA:kd=\EOB:kr=\EOC:kl=\EOD:\
	:k1=\EOP:k2=\EOQ:k3=\EOR:k4=\EOS:\
	:ce=\E[K:\
	:cl=\E[H\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K^J\E[K\E[H:\
	:ti=\E[?6h\E[1q:te=\E[0q:\
	:cm=\E[%i%2;%2H:\
	:up=\E[A:\
	:do=^J:\
	:nd=\E[C:\
	:so=\E[7m:se=\E[m:\
	:us=\E[4m:ue=\E[m:
# This seems to cause weird cursor jumping on VT105
#	:sr=\EM:
# "dumb" entry to satisfy "ex" when run non-interactively.
su|dumb|un|unknown:\
	:am:bl=^G:co#80:cr=^M:do=^J:nl=^J:
_______________________________________________________________________________
more/more.c--ucb Tim Morgan <morgan@uci-750a> 28 Aug 84 +FIX
In more, the routine initterm() is called to initialize the terminal
capabilities which are used subsequently. It uses a buffer called
"clearbuf" of 100 bytes to store the capability strings into, using the
termcap(3) routines (eg, tgetstr(3)). But since clearbuf is local to
initterm(), when that routine exits, clearbuf and the pointers to the
capability strings stored within it can be overwritten by other routines.
Normally more works because just before clearbuf is another array called
"buf" which is 1024 bytes long. Thus clearbuf is high enough (or low
enough, depending on how you look at it) on the stack that it avoids being
trashed.

REPEAT BY: Modify more so that some routine uses (writes on) more than
1024 bytes of space on the stack. More will no longer correctly do things
like clear the screen or change to or from inverse video.
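
The lifetime problem described in this report can be sketched in isolation
as below. This is not the code from more.c: store_cap is a hypothetical
stand-in for tgetstr(3) that follows the same "copy into an area and
advance the area pointer" convention, and init_caps, capbuf, so_cap and
se_cap are names invented for the sketch. Like the real routine, store_cap
does no bounds checking.

```c
#include <string.h>

/*
 * Hypothetical stand-in for tgetstr(3): copy a capability string into
 * *area, advance *area past the copy, and return a pointer to the copy.
 * No bounds checking is done; the caller must supply enough room.
 */
static char *store_cap(const char *value, char **area)
{
    char *start = *area;
    size_t n = strlen(value) + 1;

    memcpy(start, value, n);
    *area += n;
    return start;
}

static char *so_cap;    /* "standout on" string, used after init_caps() */
static char *se_cap;    /* "standout off" string */

void init_caps(void)
{
    /*
     * The buffer is static, so it outlives this call and the saved
     * pointers stay valid. With a plain automatic array here, so_cap and
     * se_cap would point into a dead stack frame once we return.
     */
    static char capbuf[100];
    char *area = capbuf;

    so_cap = store_cap("\033[7m", &area);
    se_cap = store_cap("\033[m", &area);
}

const char *standout_on(void)  { return so_cap; }
const char *standout_off(void) { return se_cap; }
```

The point of the sketch is only that whatever holds the capability strings
must live at least as long as the pointers into it.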
FIX: Change the declaration of "clearbuf" in initterm() from
    char clearbuf[100];
to
    static char clearbuf[100];
_______________________________________________________________________________
more/more.c--ucb decwrl!qubix!msc (Mark Callow) 11 Jul 84 +FIX
On certain terminals such as the tvi925, more(1) incorrectly handles
underlining. When the last word on a line is underlined more(1) sends a CR
followed by the underline-off code instead of an underline-off code then a
CR. On certain terminals this causes the remainder of the line up to the
edge of the screen to be underlined.

REPEAT BY: Get yourselves a tvi925 or similar terminal. Try doing man on a
few manual pages. You'll soon find one where the last word on a line is
underlined. The tvi has other problems as well due to it needing a screen
space for attributes. This causes problems with the line endings which you
should ignore for the purposes of this test.
_______________________________________________________________________________
msgs.c--ucb sam@ucbarpa (Sam Leffler) 2 Sep 83
If you invoke msgs with the -p flag and then quit from more such that it
gets a SIGPIPE, msgs is killed (apparently) by the signal.

REPEAT BY: Find a msg long enough and use more...

FIX: Should probably catch/ignore SIGPIPE when forking more.
_______________________________________________________________________________
mt.c--bin genji@UCBTOPAZ.CC (Genji Schmeder) 30 Sep 83 +FIX
Given a null argument for operation code, mt defaults to weof (write
tapemark).

REPEAT BY: /bin/mt ""

FIX: Change this line (preceding usage message):
    if (argc < 2) {
to this:
    if (argc < 2 || strlen(argv[1]) <= 0) {
--Genji
_______________________________________________________________________________
net/if.c--sys Mike Braca <mjb%Brown@UDel-Relay> 27 Sep 83 +FIX
On a ubareset, if.c calls the network drivers with the wrong number of
arguments (the first argument is supposed to be the unit number).
This may cause some drivers to not be reset (depending on what garbage
they pull off the stack for a uba no.).

REPEAT BY: Just look at the code (match the call in /sys/net/if.c
ifubareset() with, e.g., ilreset in /sys/vaxif/if_il.c).
_______________________________________________________________________________
net/raw_usrreq.c--sys rws@mit-bold (Robert W. Scheifler) 21 Mar 84 +FIX
When raw_usrreq() frees a route in the process of sending a packet, it
doesn't zero the pointer to the route. This results in freeing the route
multiple times, and in using that route for the duration, as all
subsequent rtalloc's become no-ops (see the /* XXX */ comment in rtalloc).

REPEAT BY: Use one socket to send raw packets that need different routes;
many packets won't go where they are supposed to go.
_______________________________________________________________________________
net/route.c--sys rws@mit-bold (Robert W. Scheifler) 22 Nov 83 +FIX
ICMP host-specific redirects are treated the same as network redirects.

REPEAT BY: Go through a gateway that uses ICMP_REDIRECT_HOST and then do a
netstat -r.
_______________________________________________________________________________
netinet/if_ether.c--sys Christopher A Kent <cak@Purdue.ARPA> 8 Jan 84 +FIX
If it is desired, for testing or other purposes, to force ethernet traffic
for the local host to go out onto the wire, it is not possible to do so.
The changes to if_ether.c that follow cause ethernet traffic to use the
loopback device only if it is marked IFF_UP; thus by "ifconfig lo0 down",
one can force the packets to go onto the wire. Unfortunately, the ARP code
must be changed, as well, since it is designed to ignore incoming packets
from itself. A new type of entry, a "sticky" entry, is defined. The
interface address definition routine calls the new routine arpinstall() to
install a mapping entry for itself; this is a sticky entry that will not
be timed out.
Thus the resolution can always be done without broadcasting a packet. This
feature may also be useful for networks that have simple-minded diskless
stations that depend on someone knowing their IP address, using a modified
ARP to do a "reverse query".

REPEAT BY: On an ethernet host, rlogin `hostname`. Inspecting netstat -i
will show that the lo packet counts are going up, not the ether device.
_______________________________________________________________________________
netinet/if_ether.c--sys salkind@nyu (Lou Salkind) 2 Dec 83 +FIX
When you send out an ARP packet, the packet length is too big.

REPEAT BY: All arp packets generated have this bug.

FIX: In arpwhohas, change
    m->m_len = sizeof *ea + sizeof *eh;
to
    m->m_len = sizeof *ea;
_______________________________________________________________________________
netinet/if_ether.c--sys sun!rusty (Russel Sandberg) 2 Apr 84 +FIX
Sending a broadcast ICMP ECHO packet on an ethernet with lots of hosts
causes a panic.

REPEAT BY: Send an ICMP ECHO packet on an ethernet with >50 hosts.
_______________________________________________________________________________
netinet/in_pcb.c--sys watmath!arwhite (Alex White) 17 Feb 84
System dies with a panic from a garbage pointer in soqremque called from
sonewconn. What happens is that sonewconn calls tcp_usrreq which calls
tcp_attach; this calls in_pcballoc which succeeds, but then tcp_newtcpcb
fails due to lack of mbufs. tcp_attach hence calls in_pcbdetach to clean
up. in_pcbdetach unfortunately invokes sofree which releases the socket
itself. We then return back up to sonewconn. Sonewconn now tries to clean
up and release the socket itself; it calls soqremque with the socket which
now has a zero pointer for so_head and craps out. Superficial examination
of code as in udp_usrreq, PRU_ABORT shows that it invokes in_pcbdetach and
then itself calls sofree which was done in in_pcbdetach!

REPEAT BY: Run out of mbuf's.
If you don't crash with "panic: exit: m_getclr" first (I have a fix for
that one...) I suppose it's probable that you'll hit this after a while...
How to run out of mbuf's is another bug, which I haven't tracked down yet
- but I suspect that it has something to do with unix domain ipc being
done by students....
_______________________________________________________________________________
netinet/ip.h--sys orca!dougg (Doug Grote) 27 Oct 83 +FIX
In the file /4.2/usr/sys/netinet/ip.h, the macro definitions of
IPOPT_CLASS and IPOPT_NUMBER do not match the Internet Protocol
specifications in RFC 791. In particular, the class field should be 2 bits
long and the number field should be 5 bits long. These are the current
macro definitions:
    #define IPOPT_CLASS(o) ((o)&0x40)
    #define IPOPT_NUMBER(o) ((o)&0x3f)
I would expect them to be defined as:
    #define IPOPT_CLASS(o) ((o)&0x60)
    #define IPOPT_NUMBER(o) ((o)&0x1f)
Also, the bit definitions for the possible classes (IPOPT_CONTROL,
IPOPT_RESERVED1, IPOPT_DEBMEAS, and IPOPT_RESERVED2) appear to be
positionally 1 bit off. They are defined as follows:
    #define IPOPT_CONTROL 0x00
    #define IPOPT_RESERVED1 0x10
    #define IPOPT_DEBMEAS 0x20
    #define IPOPT_RESERVED2 0x30
I would expect them to be as follows:
    #define IPOPT_CONTROL 0x00
    #define IPOPT_RESERVED1 0x20
    #define IPOPT_DEBMEAS 0x40
    #define IPOPT_RESERVED2 0x60
Is this worth investigating? (These are the same definitions used in
4.1c ).
- Doug

Since none of these masks is used anywhere within 4.2 it hardly matters.
But then again, someday somebody might use them.
_______________________________________________________________________________
netinet/ip_icmp.c--sys Jeff Mogul <mogul@coyote> 27 Feb 84 +FIX
If the ICMP code receives a packet in two or more mbufs, it does an
m_pullup to put all the data into one mbuf. However, it has already
squirreled away a pointer into the original mbuf, and when it later uses
this pointer all sorts of nasty things happen.
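
The hazard described in this report (a pointer taken into a buffer before
the data is pulled up, then used afterwards) can be shown with a toy model.
Nothing below is kernel code: struct seg, new_seg, pullup and byte_at are
hypothetical names, and pullup only imitates the documented effect of
m_pullup, namely that the head of the chain may be replaced and the old
head freed. The safe pattern is to derive data pointers only after the
pullup has been done.

```c
#include <stdlib.h>
#include <string.h>

#define SEGMAX 64

/* Toy stand-in for an mbuf: a short data area plus a link. */
struct seg {
    struct seg *next;
    size_t len;
    unsigned char data[SEGMAX];
};

struct seg *new_seg(const void *bytes, size_t len, struct seg *next)
{
    struct seg *s = calloc(1, sizeof *s);

    if (s == NULL || len > SEGMAX) {
        free(s);
        return NULL;
    }
    memcpy(s->data, bytes, len);
    s->len = len;
    s->next = next;
    return s;
}

static void free_chain(struct seg *s)
{
    while (s != NULL) {
        struct seg *dead = s;
        s = s->next;
        free(dead);
    }
}

/*
 * Imitation of m_pullup(): return a chain whose first segment holds at
 * least `need` contiguous bytes. Old segments may be freed, so pointers
 * into them are dead afterwards. On failure the whole chain is freed and
 * NULL is returned.
 */
struct seg *pullup(struct seg *head, size_t need)
{
    struct seg *n, *s = head;

    n = (need <= SEGMAX) ? calloc(1, sizeof *n) : NULL;
    if (n == NULL) {
        free_chain(head);
        return NULL;
    }
    while (s != NULL && n->len < need) {
        size_t take = need - n->len;

        if (take > s->len)
            take = s->len;
        memcpy(n->data + n->len, s->data, take);
        n->len += take;
        if (take == s->len) {           /* segment fully consumed */
            struct seg *dead = s;
            s = s->next;
            free(dead);
        } else {                        /* keep the unread tail */
            memmove(s->data, s->data + take, s->len - take);
            s->len -= take;
        }
    }
    n->next = s;
    if (n->len < need) {
        free_chain(n);
        return NULL;
    }
    return n;
}

/* Fetch the byte at offset `off`, pulling up first if needed. */
int byte_at(struct seg **headp, size_t off)
{
    if ((*headp)->len <= off) {
        *headp = pullup(*headp, off + 1);
        if (*headp == NULL)
            return -1;
    }
    /* The data pointer is derived only now, after any pullup. */
    return (*headp)->data[off];
}
```

A caller that saved a pointer into the old head before calling byte_at
would be holding freed memory, which is the situation the report describes.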

REPEAT BY:
If you've got an ICMP echo ("ping") program, do "ping localhost". Otherwise, write something to do raw IP output to "localhost" and send an ICMP; also, if you can arrange things so that an ICMP coming from another host is fragmented, you can probably crash your system.
_______________________________________________________________________________
netinet/ip_icmp.c--sys   rws@mit-bold (Robert W. Scheifler)   1 Dec 83   +FIX

icmp_input() assumes that both the IP and ICMP headers are in the first mbuf, which they needn't be.

REPEAT BY:
Well, first you should give superuser access to raw ICMP. All the code is there, but the protosw doesn't give access, so change the ICMP protosw in in_proto.c to

    { SOCK_RAW, PF_INET, IPPROTO_ICMP, PR_ATOMIC|PR_ADDR,
      icmp_input, rip_output, 0, 0,
      raw_usrreq,
      0, 0, 0, 0,
    },

Then write yourself a "ping" program to send ICMP ECHOs and look for ICMP ECHOREPLYs. Watch it work over a real network. Watch it fail over the loopback-net to yourself, with ECHO packets apparently being zeroed out.
_______________________________________________________________________________
netinet/ip_icmp.c--sys   Michael John Muuss <mike@brl-vgr>   14 Dec 83   +FIX

When using a "ping" program to send ICMP_ECHO packets onto the network, a steady 20% of them are lost, regardless of interface, including the software loopback driver. The problem is that ip_icmp is incorrectly looking in the sequence number and ID field of the ICMP header, as if this were an "error advice" packet much like ICMP_REDIRECT.

REPEAT BY:
Ping the loopback device. Ping program available on request. (Pity that one was not included with 4.2.)
_______________________________________________________________________________
netinet/ip_output.c--sys   Bill Croft <croft%Safe@SU-Score>   27 Oct 83   +FIX

If you set up a manual route to a specific host, ip_output will fail to use it.

REPEAT BY:
For example:

    /etc/route add diablo 36.45.0.73

sets up a correct routing table entry to send packets for host "diablo" to gateway 36.45.0.73. However, attempting to connect to diablo will now fail because the IP packet is not being sent to the gateway.

FIX:
In "ip_output", change the line
    if (ro->ro_rt->rt_flags & RTF_GATEWAY)
to read
    if (ro->ro_rt->rt_flags & (RTF_GATEWAY|RTF_HOST))

We only used the "route to host" feature for a few days until we got our proper subnet routing algorithm into the network drivers, so it's not very critical. But you do advertise this feature in "route(8)".
_______________________________________________________________________________
netinet/ip_output.c,net/route.c--sys   Paul Kirton   5 Dec 83   +FIX

New routes installed by the routing daemon or manually are not used by existing TCP connections, which continue to use the old route.

To save time when looking up routing table entries, TCP connections save the location of the current routing table entry and pass it to ip_output() with each packet to be sent. When a routing table entry is deleted manually or by the routing daemon, the rtrequest() routine checks the rt_refcnt to see if any connections presently reference it; if so, it unlinks the entry from the route table but does not free the memory. Instead the route is marked as down by clearing the RTF_UP flag.

When a connection with an existing reference to the deleted route sends a packet, it passes the route entry pointer to ip_output() as usual, but ip_output() does not check the RTF_UP flag to see if the route is still usable. It just uses the old route. Thus TCP connections will continue to try to use the old route until the connection terminates.

REPEAT BY:
The problem can be demonstrated as follows:
1. set up a Telnet connection and then suspend it,
2. netstat -r to check refcnts on routes,
3. delete the route used by the Telnet connection,
4. send data over Telnet to verify that the connection is still up, then suspend again,
5.
   check netstat -r again, which will show that no refcnts have been incremented, indicating that no new routes have been established; thus the old route is still being used.

The Telnet connection should have switched to a new route that is up, typically the default gateway.
_______________________________________________________________________________
netinet/raw_ip.c--sys   lwa@MIT-CSR   30 Nov 83   +FIX

When performing raw internet output, the ip_off field in the internet header is never completely cleared. Although the ip_output routine later zeroes everything but the IP_DF flag, this flag may still be randomly set (depending on the previous contents of the mbuf used to hold the ip header). As a result, raw output packets larger than the maximum local net packet size may be rejected as "too large".

REPEAT BY:
Try repeatedly sending packets larger than the maximum local net packet size using the raw interface. Some will be rejected as too large.
_______________________________________________________________________________
netinet/tcp_input.c--sys   rws@mit-bold (Robert W. Scheifler)   7 Nov 83   +FIX

I just noticed this looking through the code. Don't know how it would manifest itself.

REPEAT BY:
Beats me.
_______________________________________________________________________________
netinet/tcp_input.c--sys   jsq@ut-sally.ARPA (John Quarterman)   10 Dec 83   +FIX

TCP connections with TOPS-20 systems occasionally hang in FIN_WAIT_2 state. When this happens, the connection *never* closes and such hung connections accumulate.

REPEAT BY:
If you have frequent TCP connections with a TOPS-20 host, you will notice, using netstat, connections in FIN_WAIT_2 state accumulating. If you have frequent TOPS-20 traffic, all your incoming network pty ports will eventually be eaten up.
_______________________________________________________________________________
netinet/tcp_input.c--sys   spgggm@ucbopal.CC (Greg Minshall)   9 Feb 84   +FIX

TCP sockets get left in FIN_WAIT_2 on one end, and LAST_ACK on the other.

Essentially, A and B are connected. A does a close (this is a unix close (PRU_DISCONNECT), not a TCP close (PRU_SHUTDOWN)). A sends a FIN to B, and also causes the RCV.WND to go to zero. B goes into CLOSE_WAIT, and ACKs A's FIN (causing A to go from FIN_WAIT_1 to FIN_WAIT_2). At some point, B would like to FIN (i.e., when B's user does a close/shutdown).

If B has NO DATA, the FIN goes to A, A ACKs and enters TIME_WAIT. B, now in LAST_ACK, gets the ACK, and also enters TIME_WAIT.

Unfortunately, if B has DATA to send, he refuses (properly) to send the FIN, and instead tries to send DATA (length 1, every time the PERSIST timer goes off). A, with a socket window of 0, refuses to ACK this data. And so we sit. Every PERSIST time, a packet flows from B to A. No traffic in the opposite direction.

Part of the problem is the mapping of unix file operations (in particular close()) to TCP operations. Sometimes, the close() SHOULD be PRU_DISCONNECT (i.e., when the unix process has inherited a file descriptor, not knowing what it was, wrote/read, and then close()'d). Other times, when the unix process is knowledgeable, a close() should probably be a PRU_ABORT (because, e.g., the unix process has been kill(1)'d); these people are supposed to use PRU_SHUTDOWN.

REPEAT BY:
Oh, do a "netstat -a", and look for FIN_WAIT_2 and/or LAST_ACK. Probably you can duplicate it by doing an rlogin(1), cat(1)'ing some large file, and doing "~." (but, I don't know). We caused it by kill(1)'ing "rlogind" (trying to understand some other phenomenon).
_______________________________________________________________________________
netinet/tcp_input.c--sys   Mike Muuss <mike@BRL-VGR.ARPA>   12 Oct 84   +FIX

1) Tune TCP max segment size based upon IP interface MTU.

Bob Gilligan at sri-spam, Chris Kent at Purdue, and steveh at tektronix have provided bug reports which contain code to address this problem. This bug report includes code which gives the same results, but with no loss in efficiency, or in clarity of code. By making the correct abstractions, problem #2 can be easily fixed at the same time.

On input, if a connection is in the LISTEN state (e.g., a server), an incoming maximum segment size option is ignored. The maximum segment size option is accepted on all packets, contrary to the spec, which says it is only acceptable on packets with SYN.

In no case does this code result in worse global network performance than the vanilla 4.2BSD code. For sites with most machines on Ethernets or Pronets behind a gateway to the ARPANET, this code will not make any difference. For sites communicating directly over devices with an MTU < 1064 bytes (ARPANET, PRNET, etc.), this code will prevent TCP from constructing IP packets bigger than the MTU, thus saving the CPU and network overhead associated with IP fragmentation/reassembly.

Because of (a) the decreased reliability of TCP segment delivery when IP fragmentation is present (due to total non-delivery of the TCP segment if any IP fragment is lost in transit), and (b) the fact that most network interfaces have a constant "per-packet" cost, rather than a "per byte" cost, decreasing the number of packets being sent by nearly 50% on bulk data connections has a dual benefit of substantial magnitude. In my opinion, this new code should be mandatory for all 4.2 BSD systems directly attached to an IMP.

2) Choose local address in IP packets based upon route and interface being used to send packets.

For background, I quote "A 4.2BSD Interprocess Communication Primer":

    Internet address binding

    Local address binding by the system is currently done somewhat haphazardly when a host is on multiple networks.
    Logically, one would expect the system to bind the local address associated with the network through which a peer was communicating. For instance, if the local host is connected to networks 46 and 10 and the foreign host is on network 32, and traffic from network 32 were arriving via network 10, the local address to be bound would be the host's address on network 10, not network 46. This unfortunately, is not always the case. For reasons too complicated to discuss here, the local address bound may appear to be chosen at random. This property of local address binding will normally be invisible to users unless the foreign host does not understand how to reach the address selected. For example, if network 46 were unknown to the host on network 32, and the local address were bound to that located on network 46, then even though a route between the two hosts existed through network 10, a connection would fail.

By calling in_getif() in in_pcbconnect() early on in the establishment of the TCP connection, the correct local IP address can now be assigned to the connection. This has important implications for traffic routing.

REPEAT BY:
1) Connect to various sites with different mtu or max seg options, and look at the tcpcb's with adb. Of particular interest are sites which have very small mtu networks attached.

2) Default route all your packets through your second interface; use "netstat -i" to see where they leave and return... For example: default route all packets via second interface (from "netstat -i" listing). Packets will leave on that interface, but will return via the first interface. If this configuration is elected because of difficulties with the first path, you lose.
_______________________________________________________________________________
netinet/tcp_output.c--sys   rws@mit-bold (Robert W. Scheifler)   7 Nov 83   +FIX

I just noticed this looking through the code. Don't know how it would manifest itself.

REPEAT BY:
Beats me.
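
Several reports in this part (Muuss above; Kent and gilligan@sri-spam below) concern deriving the TCP maximum segment size from the interface MTU. As a rough sketch of the arithmetic only, following the three rules stated in the gilligan@sri-spam report and assuming 20-byte IP and TCP headers with no options: the function names below are hypothetical, and this is not the code submitted with any of those reports.

```c
#include <assert.h>

enum {
    IP_HDRLEN       = 20,   /* IP header without options (assumed) */
    TCP_HDRLEN      = 20,   /* TCP header without options (assumed) */
    TCP_DEFAULT_MSS = 536   /* default cited in the reports */
};

static unsigned min_u(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/* f(MTU): largest TCP segment that fits in one IP packet of size mtu. */
unsigned mss_from_mtu(unsigned mtu)
{
    assert(mtu > IP_HDRLEN + TCP_HDRLEN);
    return mtu - IP_HDRLEN - TCP_HDRLEN;
}

/* Rule 1: initial send size before any option has been received. */
unsigned mss_initial_send(unsigned mtu)
{
    return min_u(mss_from_mtu(mtu), TCP_DEFAULT_MSS);
}

/* Rule 2: send size after the peer's max seg option arrives. */
unsigned mss_after_option(unsigned mtu, unsigned requested)
{
    return min_u(mss_from_mtu(mtu), requested);
}

/* Rule 3: value to offer to the foreign side. */
unsigned mss_to_offer(unsigned mtu)
{
    return mss_from_mtu(mtu);
}
```

With the Arpanet MTU of 1007 quoted below, f(MTU) is 967, so a peer asking for 1024 would be held to 967 and no packet would need fragmenting on the local net; with the PRNET MTU of 254 every one of the three values comes out as 214.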
_______________________________________________________________________________
netinet/tcp_subr.c--sys   Christopher A Kent <cak@PURDUE.ARPA>   2 Dec 83   +FIX

In a letter of 7 Nov 1983, Jon Postel clarified the TCP max seg option default and its relation to the IP Maximum Datagram Size. The basic message was that the default Max Seg Size should be 536, not 512 (as in 4.2).

REPEAT BY:
Open a TCP connection, I guess.
_______________________________________________________________________________
netinet/tcp_{input,output,subr}.c--sys   Christopher A Kent   21 Mar 84   +FIX

Handling of the TCP maximum segment size option is broken in many respects. Since they are all related, this is submitted as just one fix.

On output, the max seg size is always offered as 1024. This causes IP fragmentation overhead for networks that do not support this large a packet; it is a particularly bad value for the Arpanet.

On input, if a connection is in the LISTEN state (e.g., a server), an incoming maximum segment size option is ignored. The maximum segment size option is accepted on all packets, contrary to the spec, which says it is only acceptable on packets with SYN.

Both of these values should be tuned to the mtu of the interface being used for the connection; if the mtu is larger, use the offered value; otherwise set it to the mtu minus headers.

REPEAT BY:
Connect to various sites with different mtu or max seg options, and look at the tcpcb's with adb. Of particular interest are sites which have very small mtu networks attached.
_______________________________________________________________________________
netinet/tcp_{input,output,subr}.c--sys   gilligan@sri-spam   7 Dec 83   +FIX

The TCP code selects its default maximum segment size, and negotiates the foreign end of the connection maximum segment size, based upon constants hard-wired into the code. These constants are appropriate only for connections over certain types of networks.
For most networks, this causes large packets to be passed to the IP layer which will have to be fragmented. The result of this unnecessary fragmentation is poor performance. The increased overhead of fragmentation and re-assembly, as well as the case where one fragment in a group is lost, forcing the discarding of all of the received fragments of a TCP segment, will degrade performance.

Jon Postel, in his RFC 879 titled "The TCP maximum segment size and related topics", advises that the TCP maximum segment size should be selected and negotiated based upon the maximum transmission unit (MTU) of the attached network being used.

The problem manifests itself in three places in the TCP code:

1) The initial maximum send segment size is set to a constant in tcp_subr.c;
2) If a TCP maximum segment size option is received, the maximum send segment size is automatically set to the requested size in tcp_input.c; and
3) A constant sized TCP maximum segment size option is always sent to the foreign side in tcp_output.c.

The appropriate action in each of these cases is:

1) Choose an initial maximum send segment size that is:
       max_send_size = min (f (MTU), 536);
2) Upon receiving a TCP maximum segment size option, set the maximum send segment size to:
       max_send_size = min (f (MTU), requested_size);
3) Negotiate a foreign maximum segment size that is:
       foreign_seg_size = f (MTU);

In the above three examples, f (MTU) translates a network MTU (maximum transmission unit) into a maximum appropriate TCP segment size by subtracting the tcp and ip header sizes.

REPEAT BY:
Open TCP connections between two 4.2 BSD Vaxes over a network with MTU less than 1024 (e.g. Arpanet with MTU = 1007, or PRNET with MTU = 254). Use an external packet monitor to see fragmented packets passing between the two machines.
_______________________________________________________________________________
netinet/udp_usrreq.c--sys   rws@mit-bold (Robert W. Scheifler)   22 Nov 83   +FIX

(This supersedes my previous report.)
UDP checksums are turned off, and don't work when turned on.

REPEAT BY:
Turn on checksums (udpcksum = 1).
_______________________________________________________________________________
netinet/udp_usrreq.c--sys   Dave Rosenthal   5 Apr 84   +FIX

The computation of the checksum for UDP packets is incorrect. The length enters into the computation twice, once in the actual header, and once in the implied header. As delivered, in the implied header the length is NOT in network order.

REPEAT BY:
Attempting to tftp files from an IBM PC to a 4.2 VAX using the MIT code on the PC.
_______________________________________________________________________________
netser/misc/rexecd.c--ucb   root.Oregon-Grad@Rand-Relay   17 Aug 83   +FIX

TERM is not set in the environment by rexecd, leading to possible remote command failure due to "TERM: undefined variable" error.

REPEAT BY:
I do not know how to produce this problem, but similar code in rshd does cause problems as follows, and presumably remote commands using rexecd would also. The following conditions must be met:

1. root's shell on remote host is /bin/csh
2. $TERM is referenced in ~root/.cshrc

Then, execute any remote command using the rshd server, e.g.,
    % rsh <host> date
or
    % rcp xxx <host>:zzz
Both will give "TERM: undefined variable" errors; the rsh will succeed anyway, but the rcp will fail.
_______________________________________________________________________________
netser/rwho/{ruptime.c,rwho.c,rwhod.c}--ucb   ogcvax!root   Sep 8 83

The integer values in struct whod are not converted to/from network byte order, despite the statement that they are in the RWHOD(8C) manual section. As a result, rwho information coming from/going to a machine with different host byte order is garbled.

REPEAT BY:
Execute "rwho" or "ruptime" on a SUN, and observe the uptime, load averages, and user login time data originating from a VAX.
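
The conversion that the manual section promises can be sketched as follows. This is only an illustration, assuming the standard htonl()/ntohl() routines; struct toy_status and its fields are hypothetical and are not the layout of struct whod, nor is this the submitter's fix.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status record; NOT the real struct whod. */
struct toy_status {
    uint32_t sendtime;      /* seconds, as sent on the wire */
    uint32_t boottime;
    uint32_t loadav[3];     /* load averages, scaled to integers */
};

/* Sender side: convert every integer field before transmitting. */
void toy_status_to_net(struct toy_status *s)
{
    int i;

    assert(s != NULL);
    s->sendtime = htonl(s->sendtime);
    s->boottime = htonl(s->boottime);
    for (i = 0; i < 3; i++)
        s->loadav[i] = htonl(s->loadav[i]);
}

/* Receiver side: convert back before interpreting any field. */
void toy_status_to_host(struct toy_status *s)
{
    int i;

    assert(s != NULL);
    s->sendtime = ntohl(s->sendtime);
    s->boottime = ntohl(s->boottime);
    for (i = 0; i < 3; i++)
        s->loadav[i] = ntohl(s->loadav[i]);
}
```

If both ends apply the pair consistently, the bytes on the wire are most-significant-first whatever the host order, so two machines with opposite byte order read the same values.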

FIX:
I have modified versions of ruptime.c, rwho.c, and rwhod.c which I can send; I did not do so because I suspect this is a well known problem which has already been fixed. Please let me know if you want copies of my fixes.

Bruce Jerrick
Oregon Graduate Center
(503) 645-1121 ex. 355
CSNet: bruce@Oregon-Grad
UUCP: ...teklabs!ogcvax!bruce
_______________________________________________________________________________
netser/rwho/{ruptime.c,rwho.c}--ucb   ogcvax!root.tektronix   Sep 19 83

argv handling allows 0 to be passed as an argument to strcmp(), which on some machines may cause a segmentation fault when strcmp() tries to access *0.

REPEAT BY:
Compile code on a SUN, then invoke it with at least one option (so argc >= 2 on entry). This will cause a USER BUS ERROR; an adb trace ($c) shows strcmp() called as strcmp(0, ...).

FIX:
I can send my mods to fix this; I haven't done so because I suspect the problem has already been fixed. A log of my mods follows:

    Fixed argument handling -- Would formerly allow null pointer to be passed to strcmp(), which can cause core dumps on some machines (from trying to access *0). Now allows combined options, e.g. "-al".

---------------------------------------
Bruce Jerrick
Oregon Graduate Center
(503) 645-1121 ex. 355
CSNet: bruce@Oregon-Grad
UUCP: ...teklabs!ogcvax!bruce
_______________________________________________________________________________

GENERAL INFORMATION ON THE 4.2 BUGLIST FROM MT XINU
_________________________________________________________________

--IMPORTANT DISCLAIMERS--

Material in this announcement and the accompanying reports has been edited and organized by MT XINU as a service to the UNIX community on a non-profit, non-commercial basis. MT XINU MAKES NO WARRANTY, EXPRESSED OR IMPLIED, ABOUT THE ACCURACY, COMPLETENESS, OR FITNESS FOR USE FOR ANY PURPOSE OF ANY MATERIAL INCLUDED IN THESE REPORTS.

MT XINU welcomes comments in writing about the contents of these reports via uucp or US mail.
MT XINU cannot, however, accept telephone calls or enter into telephone conversations about this material.
_________________________________________________________________

Legal difficulties which have delayed the distribution of 4.2bsd buglist summaries by MT XINU have been resolved and three versions of the buglist are now available.

The current buglist has been derived from reports submitted to 4bsd-bugs@BERKELEY (not from reports submitted only to net.bugs.4bsd, for example). Reports are integrated into the buglist as they are received, so that any distributions are current to within a week or so.

Buglists now being distributed are essentially "raw". No judgment has been passed as to whether the submitted bug is real or not or whether it has been fixed. Only minimal editing has been done to produce a manageable list. Reports which are complaints (rather than bug reports) have been eliminated; obscenities and content-free flames have been eliminated; and duplicates have been combined. The resulting collection contains over 500 bugs.

Three versions of the buglist are now ready for distribution:

2-Liners:
Two lines per bug, including a concise description, the affected module, and the submitter. Approximately 55K bytes, it is being distributed to net.sources concurrently with this announcement.

All-but-Source:
All material, except that all but the most innocuous of source material has been removed to meet AT&T license restrictions. Nearly a megabyte, this will be distributed to net.sources in several 50K byte pieces later this week. A paper listing or mag tape is also available; see below.

Please note that local usenet size restrictions may prevent large files from being received and/or retransmitted. MT XINU will not dump this material on the net a second time; if your site has not received material of interest to you within a reasonable time, please send for a paper or tape copy.

All-with-Source (FOR SOURCE LICENSEES ONLY):
4.2 licensees who also have a suitable AT&T source license can obtain a tape containing all the material, including proposed source fixes where such were submitted. Once again, MT XINU has not evaluated, tested or passed judgment on proposed fixes; all we have done is organize the collection and eliminate obvious irrelevancies and duplications.

A free paper copy of the All-but-Source list can be obtained by sending mail to:

    MT XINU
    739 Allston Way
    Berkeley CA 94710
    attn: buglist

or electronic mail to:

    ucbvax!mtxinu!buglist

(Be sure to include your US mail address!)

For a tape, send a check for $110 or a purchase order for $150 to cover MT XINU's costs to the address given above (California orders add sales tax).

For the All-with-Source list, mail us a request for the details of license verification at either of the above addresses.