[comp.bugs.4bsd] Serious IBM 4.3 network bug

cyrus@pprg.unm.edu (Tait Cyrus) (10/23/88)

There has been A LOT of talk lately concerning DEC LANBridges learning
ff-ff-ff-ff-ff-ff (the broadcast address).  Well, as I posted a few
days ago, I thought that I saw a relation between seeing these packets
and an IBM PC/RT booting.

Well, after some tests, we have hard evidence showing that an IBM PC/RT
running "IBM Academic Operating System 4.3" sometimes produces giant
packets or packets containing garbage upon boot up.  Unfortunately,
the data contained in these packets is mostly 1's which puts some DEC
LANBridges into a bad state.
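
For anyone who wants to look for such frames in a packet capture, here
is a minimal sketch of the check. It is not taken from any driver or
bridge; the function names and the ETH_* constants are made up for
illustration. It assumes the usual Ethernet layout (6-byte destination,
6-byte source, 2-byte type), and relies on the fact that a learning
bridge records the SOURCE address of each frame, so a frame that is
mostly 1's offers ff-ff-ff-ff-ff-ff as a station address.

```c
#include <stddef.h>

#define ETH_ADDR_LEN 6   /* bytes in an Ethernet address */
#define ETH_HDR_LEN  14  /* destination + source + type */

/*
 * Return 1 if the frame's source address is ff-ff-ff-ff-ff-ff,
 * 0 if it is not, and -1 if the frame is too short to have a header.
 */
int src_is_broadcast(const unsigned char *frame, size_t len)
{
    size_t k;

    if (frame == NULL || len < ETH_HDR_LEN)
        return -1;
    /* The source address follows the 6-byte destination address. */
    for (k = 0; k < ETH_ADDR_LEN; k++)
        if (frame[ETH_ADDR_LEN + k] != 0xff)
            return 0;
    return 1;
}

/*
 * Return 1 if the source address has the group (multicast) bit set,
 * which no legitimate sender should use as its own address.
 * The group bit is the least significant bit of the first octet.
 */
int src_is_group(const unsigned char *frame, size_t len)
{
    if (frame == NULL || len < ETH_HDR_LEN)
        return -1;
    return (frame[ETH_ADDR_LEN] & 0x01) != 0;
}
```

A frame filled entirely with 0xff trips both checks, while a frame sent
from a normal station address such as 0:dd:0:f8:23:0 trips neither.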

Below are the results of our tests.  We have not fully tested AIX to
see if it also has the same problems, though preliminary tests indicate
that AIX does NOT have these problems.

Ways which do NOT cause it to generate bogus packets:
	- reboot while running (i.e. using reboot/shutdown/halt/etc)
	- software reset (i.e. CNTRL-ALT-Pause) while running

Ways which DO cause it to generate bogus packets:
	- halt & wait for "halting (via wait)" message -> power off ->
		power on -> normal reboot
	- halt & wait for "halting (via wait)" message -> software reset
		via CNTRL-ALT-Pause & wait for boot prompt ":" ->
		power off -> power on -> normal reboot
	- simulated power hit (power off while running)


The two places in the boot process that we have seen these bogus packets
produced are:
1) when the ethernet device is probed when the kernel is
   looking for devices (marked with ---> below).

        4.3 BSD UNIX (GENERIC) #1: Fri Sep 23 10:32:48 PDT 1988
            ibmacis@clam:/usr/sys/GENERIC
        
        5799-WZQ (C) Copyright IBM Corporation 1986,1987
        All Rights Reserved
        Licensed Materials - Property of IBM
        
        Using 238 buffers containing 716K bytes
        Memory summary: total 12288K (0xc00000), available 10196K (0x9f5000)
        AFPA marked down pending microcode load and initialization.
        68881 enabled.
        autoconf
        hdc0: card level 0x4b microcode level 0x46 configuration bits 0xc7
        hdc0 adapter f00001f0 IRQ 12 CPU level 4 
        hd0 at hdc0 slave 0
        hd0: hd70e; interleave factor is 1 to 1
        hd1 at hdc0 slave 1
        hd1: hd70e; interleave factor is 1 to 1
        hd2 at hdc0 slave 2
        hd2: hd70e; interleave factor is 1 to 1
        fdc0 adapter f00003f2 IRQ 6 CPU level 4 
        fd0: 1.2M drive
        fd0 at fdc0 slave 0
        stc0 adapter f00001e8 IRQ 12 
        st0 at stc0 slave 0
--->    un0 adapter f4080000 IRQ 3 CPU level 3 
--->    un0: ethernet address 0:dd:0:f8:23:0
        lp0 adapter f00003bc IRQ 7 
        psp0 adapter f0008000 IRQ 2 CPU level 3 
        root on hd0
        configure end

2) and when the ifconfig is done in /etc/rc.local
        ifconfig ${network} inet ${hostname} ${net_flags}

We will be notifying IBM of these problems, but in the
meantime we thought everyone should be apprised of them.

---
    @__________@    W. Tait Cyrus   (505) 277-0806
   /|         /|    University of New Mexico
  / |        / |    Dept of ECE - Parallel Processing Research Group
 @__|_______@  |    Albuquerque, New Mexico 87131
 |  |       |  |
 |  |  hc   |  |    e-mail:
 |  @.......|..@       cyrus@pprg.unm.edu
 | /        | /
 @/_________@/

ostholm@ce.chalmers.se (Stig Ostholm) (10/26/88)

In article <23657@pprg.unm.edu> cyrus@pprg.unm.edu (Tait Cyrus) writes:
>
>There has been A LOT of talk lately concerning DEC LANBridges learning
>ff-ff-ff-ff-ff-ff (the broadcast address).  Well, as I posted a few
>days ago, I thought that I saw a relation between seeing these packets
>and an IBM PC/RT booting.
			.
			.

We at Chalmers University of Technology have also experienced this behavior
and made the following observations:

	* Both the miniroot diskette and the "normal" kernel send garbage.

	* This bug is also present in ibm4.3 release 2

	* The bug is NOT present in AIX.

We have also made a fix to the kernel that makes all garbage packets
"controlled" (the adapter sends "IBM RT/PC" from itself to itself).
The real cause of this behavior is currently unknown.

The fix in /usr/sys/caif/if_un.c:
2c2
<  * 5799-WZQ (C) COPYRIGHT IBM CORPORATION  1986,1987
---
>  * 5799-CGZ (C) COPYRIGHT IBM CORPORATION  1986,1987
6c6
< /* $Header: if_un.c,v 1.1 88/05/19 09:59:24 ostholm Exp $ */
---
> /* $Header: if_un.c,v 1.2 88/05/20 10:24:38 ostholm Exp $ */
11c11
< static char *rcsid = "$Header: if_un.c,v 1.1 88/05/19 09:59:24 ostholm Exp $";
---
> static char *rcsid = "$Header: if_un.c,v 1.2 88/05/20 10:24:38 ostholm Exp $";
120a121,125
> 	/*
> 	 * The next line puts the transmitter in loopback mode so that no
> 	 * uncontrolled packets are sent on the ethernet.
> 	MM_OUT(&addr->un_edlc.tmode, TM_NORMAL - TM_LBC);
> 	 */
124a130,133
> 	/*
> 	 * The next line puts the transmitter in normal mode.
> 	MM_OUT(&addr->un_edlc.tmode, TM_NORMAL);
> 	 */
876a886
> 	register int i;
879d888
< 	MM_OUT(&addr->un_edlc.reset, RESET_ON);
880a890,909
> 	MM_OUT(&addr->un_edlc.reset, RESET_ON);
> 	/*
> 	 * Set the Xmit-buffer area to a known value
> 	 */
> #define	DUMMY_MSG	"IBM RT/PC"
> #define	MSG_LEN		((sizeof DUMMY_MSG)-1)
> 	i = UN_XBSIZE - MSG_LEN;
> 	/* Set the dummy message */
> 	bcopyout(DUMMY_MSG, &addr->un_xmtbuf[0][i],MSG_LEN);
> 	/* set protocol type 0 (IEEE packet, illegal) */
> 	i--, MM_OUT(&addr->un_xmtbuf[0][i], 0x00);
> 	i--, MM_OUT(&addr->un_xmtbuf[0][i], 0x00);
> 	/* source and destination address = this card */
> 	i -= ETH_ADDR_SIZE;
> 	bcopyout(&addr->un_eprom[UN_EADDROFF], &addr->un_xmtbuf[0][i],ETH_ADDR_SIZE);
> 	i -= ETH_ADDR_SIZE;
> 	bcopyout(&addr->un_eprom[UN_EADDROFF], &addr->un_xmtbuf[0][i],ETH_ADDR_SIZE);
> 	/* Set the tx-start registers */
> 	MM_OUT(&addr->un_xsar[1], i & 0xFF);
> 	MM_OUT(&addr->un_xsar[0], i >> 8 & 0xF);
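
The offset arithmetic in the last hunk can be tried outside the kernel.
The following is only a sketch of that arithmetic against an ordinary
array, not the driver code: XB_SIZE is a made-up stand-in for UN_XBSIZE
(the real value is not given in the patch), ADDR_LEN stands in for
ETH_ADDR_SIZE, and memcpy replaces bcopyout and MM_OUT. The patch
appears to build the frame backwards from the end of the transmit
buffer and then point the transmit start registers at its first byte.

```c
#include <stddef.h>
#include <string.h>

#define XB_SIZE   2048            /* hypothetical stand-in for UN_XBSIZE */
#define ADDR_LEN  6               /* stand-in for ETH_ADDR_SIZE */
#define DUMMY_MSG "IBM RT/PC"
#define MSG_LEN   ((sizeof DUMMY_MSG) - 1)

/*
 * Lay out the dummy frame at the tail of xb (XB_SIZE bytes) and
 * return the offset of its first byte.
 */
size_t build_dummy_frame(unsigned char *xb, const unsigned char *mac)
{
    size_t i = XB_SIZE - MSG_LEN;

    memcpy(&xb[i], DUMMY_MSG, MSG_LEN);   /* payload, no trailing NUL */
    xb[--i] = 0x00;                       /* protocol type, low byte */
    xb[--i] = 0x00;                       /* protocol type, high byte */
    i -= ADDR_LEN;
    memcpy(&xb[i], mac, ADDR_LEN);        /* source = this card */
    i -= ADDR_LEN;
    memcpy(&xb[i], mac, ADDR_LEN);        /* destination = this card */
    return i;
}

/*
 * Split the start offset the way the patch writes it to un_xsar:
 * low 8 bits in one register, next 4 bits in the other.
 */
void split_start(size_t start, unsigned char *hi, unsigned char *lo)
{
    *lo = (unsigned char)(start & 0xFF);
    *hi = (unsigned char)((start >> 8) & 0xF);
}
```

With the assumed 2048-byte buffer the frame occupies the last 23 bytes
(6 + 6 + 2 + 9), so the start offset comes out as 2025, which splits
into 0x07 and 0xE9. A different UN_XBSIZE shifts those numbers but not
the layout.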