efb@slced1.nswses.navy.mil (Everett F Batey) (03/07/91)
This evening I thought I had cslip ( UofT SLIP 4.0 with cslipbeta and the
sunos4 fixes ) working .. got connected onto a remote VAX (+e?) .. slattach
dev locip remip .. locally (Sun 4/20) .. slattach devc locip remip baud.

No errors upon kernel or slattach make .. double checked the three READMEs
for UT, cslipbeta and sunos4.

Mar 6 19:01:39 slced1 vmunix: SunOS Release 4.1 (SLCED-SL) #1: Wed Mar 6
18:48:11 PST 1991 .. new kernel booted

Got the good slip0 starting IP IP .. baud .. finally was able to route add
hisnet myip 1 .. even got ping and finally rsh running .. compatible compress,
baud .. as the traffic increased .. couldn't find much evidence of what else
may have been going on. ALL at once with no visible degradation ..

WHAT IS THE diffs from the above mentioned release configs to make a bullet
proof kernel for an SLC (4/20 Sun) with SunOS 4.1.1 ? ( NO PPP not an option,
really ) Thank you /Ev/

************************* /var/adm/messages extract *************************

Mar 6 21:09:41 slced1 vmunix: slip0: coming up
Mar 6 21:23:43 slced1 vmunix: panic: mclput
Mar 6 21:23:43 slced1 vmunix: syncing file systems... BAD TRAP
Mar 6 21:23:43 slced1 vmunix: pid 234, `tcsh': Data fault

tcsh has been running stably for many moons ..

Mar 6 21:23:43 slced1 vmunix: kernel read fault at addr=0x2000034, pme=0x0
Mar 6 21:23:43 slced1 vmunix: Sync Error Reg 80<INVALID>
Mar 6 21:23:43 slced1 vmunix: pc=0xf80a63e4, sp=0xf8112be0, psr=0x1000c3, context=0x7
Mar 6 21:23:43 slced1 vmunix: g1-g7: f8185a5c, ff009fe0, f8138658, 0, f824c000, f8138400, f8138400
Mar 6 21:23:43 slced1 vmunix: Begin traceback... sp = f8112be0
Mar 6 21:23:43 slced1 vmunix: Called from f80a7f38, fp=f8112c40, args=ff027494 1 0 f8148f98 f818ffb4 2000000
Mar 6 21:23:43 slced1 vmunix: Called from f8064638, fp=f8112ca0, args=0 f81227a0 f8129b78 f8147730 f81227a0 f8112c38
Mar 6 21:23:43 slced1 vmunix: Called from f8065258, fp=f8112d00, args=f812266b 4006e6 f8038f44 40 1e6 f8122760
Mar 6 21:23:43 slced1 vmunix: Called from f80edb2c, fp=f8112d60, args=4006e1 0 ff02754c f813e800 ffffffff f8138010
Mar 6 21:23:43 slced1 vmunix: Called from f804f7a4, fp=f8112dc0, args=80 f8122328 f8116c00 f8122328 f8138000 4006e1
Mar 6 21:23:43 slced1 vmunix: Called from f805db24, fp=f8112e20, args=f8122328 800ae3 f8116c00 f813e800 5 2
Mar 6 21:23:43 slced1 vmunix: Called from f805d024, fp=f8112e80, args=ff64f400 f8138c72 f8116c00 f8138c70 68ab0000 237
Mar 6 21:23:43 slced1 vmunix: Called from f80179cc, fp=f8112ee0, args=ff64f400 20 6c 8001e4 8006e4 ff64f400
Mar 6 21:23:43 slced1 vmunix: Called from f8041ba4, fp=f8112f40, args=0 20 f8141768 8001e5 ff651190 ff651180
Mar 6 21:23:43 slced1 vmunix: Called from f8005c0c, fp=f8112fa0, args=1 4000c0 f8017364 0 1e6 f813ef4c
Mar 6 21:23:43 slced1 vmunix: Called from f805039c, fp=f824bb60, args=f8169674 f824bea4 f824beb8 1 1 f824bea4
Mar 6 21:23:43 slced1 vmunix: End traceback...
Mar 6 21:23:43 slced1 vmunix: panic: Data fault
Mar 6 21:23:43 slced1 vmunix: 00000 low-memory static kernel pages
Mar 6 21:23:43 slced1 vmunix: 00694 additional static and sysmap kernel pages
Mar 6 21:23:43 slced1 vmunix: 00016 dynamic kernel data pages
Mar 6 21:23:43 slced1 vmunix: 00163 additional user structure pages
Mar 6 21:23:43 slced1 vmunix: 00000 segmap kernel pages
Mar 6 21:23:43 slced1 vmunix: 00000 segvn kernel pages
Mar 6 21:23:43 slced1 vmunix: 00153 current user process pages
Mar 6 21:23:43 slced1 vmunix: failure dumping user stacks: proc=0xf816f868 as=0xff07dea8 seg=0xff0da2c0
Mar 6 21:23:43 slced1 vmunix: 01026 total pages (1026 chunks)
Mar 6 21:23:43 slced1 vmunix:
Mar 6 21:23:43 slced1 vmunix: dumping to vp ff020f5c, offset 72108
Mar 6 21:23:43 slced1 vmunix: panic: zero
Mar 6 21:23:43 slced1 vmunix: rebooting...
Mar 6 21:23:43 slced1 vmunix: SunOS Release 4.1 (SLCED-SL) #1: Wed Mar 6 18:48:11 PST 1991
Mar 6 21:23:43 slced1 vmunix: Copyright (c) 1983-1990, Sun Microsystems, Inc.
Mar 6 21:23:43 slced1 vmunix: mem = 8192K (0x800000)
Mar 6 21:23:43 slced1 vmunix: avail mem = 6680576
Mar 6 21:23:43 slced1 vmunix: Ethernet address = 8:0:20:2:da:5c
Mar 6 21:23:43 slced1 vmunix: cpu = Sun 4/20
Mar 6 21:23:43 slced1 vmunix: zs0 at obio 0xf1000000 pri 12
Mar 6 21:23:43 slced1 vmunix: zs1 at obio 0xf0000000 pri 12

our gadgets .. all appearing as normal ..

Mar 6 21:23:43 slced1 vmunix: root on sd0a fstype 4.2
Mar 6 21:23:43 slced1 vmunix: swap on sd0b fstype spec size 40170K
Mar 6 21:23:43 slced1 vmunix: dump on sd0b fstype spec size 40156K
Mar 6 21:37:21 slced1 vmunix: slip0: coming up
Mar 6 21:46:29 slced1 vmunix: BAD TRAP
Mar 6 21:46:29 slced1 vmunix: pid 123, `update': Data fault
Mar 6 21:46:29 slced1 vmunix: bad kernel read fault at addr=0x68692098
Mar 6 21:46:29 slced1 vmunix: Sync Error Reg 80<INVALID>
Mar 6 21:46:29 slced1 vmunix: pc=0xf80a63e4, sp=0xf81abda0, psr=0x1010c0, context=0x3
Mar 6 21:46:29 slced1 vmunix: g1-g7: 401ae4, 8000000, ffffffff, 80, f8116c00, f8138400, f8138400
Mar 6 21:46:29 slced1 vmunix: Begin traceback... sp = f81abda0
Mar 6 21:46:29 slced1 vmunix: Called from f80a7f38, fp=f81abe00, args=ff027494 1 0 f8149770 f81c9778 68692064
Mar 6 21:46:29 slced1 vmunix: Called from f8064638, fp=f81abe60, args=0 f81227a0 f8129b78 f8147730 f81227a0 ff09535c
Mar 6 21:46:29 slced1 vmunix: Called from f80f2650, fp=f81abec0, args=f81abfe0 120 f811ff30 f8120050 f81ac000 f8122760
Mar 6 21:46:29 slced1 vmunix: Called from f80059e4, fp=f81abf58, args=f81ac000 f81abfb4 f81abfe0 f81ac000 f81ac000 f81abfb4
Mar 6 21:46:29 slced1 vmunix: Called from 22c4, fp=f7fff488, args=0 0 3 0 0 f816d75c
Mar 6 21:46:29 slced1 vmunix: End traceback...
Mar 6 21:46:29 slced1 vmunix: panic: Data fault
Mar 6 21:46:29 slced1 vmunix: syncing file systems... done
Mar 6 21:46:29 slced1 vmunix: 00000 low-memory static kernel pages
Mar 6 21:46:29 slced1 vmunix: 00665 additional static and sysmap kernel pages
Mar 6 21:46:29 slced1 vmunix: 00016 dynamic kernel data pages
Mar 6 21:46:29 slced1 vmunix: 00229 additional user structure pages
Mar 6 21:46:29 slced1 vmunix: 00000 segmap kernel pages
Mar 6 21:46:29 slced1 vmunix: 00000 segvn kernel pages
Mar 6 21:46:29 slced1 vmunix: 00002 current user process pages
Mar 6 21:46:29 slced1 vmunix: 00106 user stack pages
Mar 6 21:46:29 slced1 vmunix: 01018 total pages (1018 chunks)
Mar 6 21:46:29 slced1 vmunix:
Mar 6 21:46:29 slced1 vmunix: dumping to vp ff020f5c, offset 72172
Mar 6 21:46:29 slced1 vmunix: SunOS Release 4.1 (SLCED-SL) #1: Wed Mar 6 18:48:11 PST 1991
Mar 6 21:46:29 slced1 vmunix: Copyright (c) 1983-1990, Sun Microsystems, Inc.
Mar 6 21:46:29 slced1 vmunix: mem = 8192K (0x800000)
Mar 6 21:46:29 slced1 vmunix: avail mem = 6680576
Mar 6 21:46:29 slced1 vmunix: Ethernet address = 8:0:20:2:da:5c
Mar 6 21:46:29 slced1 vmunix: cpu = Sun 4/20
Mar 6 21:46:29 slced1 vmunix: zs0 at obio 0xf1000000 pri 12
Mar 6 21:46:29 slced1 vmunix: zs1 at obio 0xf0000000 pri 12

our gadgets again ..

Mar 6 21:46:29 slced1 vmunix: root on sd0a fstype 4.2
Mar 6 21:46:29 slced1 vmunix: swap on sd0b fstype spec size 40170K
Mar 6 21:46:29 slced1 vmunix: dump on sd0b fstype spec size 40156K
Mar 6 21:51:06 slced1 vmunix: slip0: coming up
Mar 6 22:10:54 slced1 vmunix: BAD TRAP
Mar 6 22:10:54 slced1 vmunix: pid 287, `ping': Data fault
Mar 6 22:10:54 slced1 vmunix: kernel read fault at addr=0xff654000, pme=0x0
Mar 6 22:10:54 slced1 vmunix: Sync Error Reg 80<INVALID>
Mar 6 22:10:54 slced1 vmunix: pc=0xf80d7320, sp=0xf8112db0, psr=0x1c3, context=0x7
Mar 6 22:10:54 slced1 vmunix: g1-g7: 4006e4, 400ae1, ffffff80, 0, f82f6000, 0, 0
Mar 6 22:10:54 slced1 vmunix: Begin traceback... sp = f8112db0
Mar 6 22:10:54 slced1 vmunix: Called from f8016900, fp=f8112e10, args=788 ff653878 0 4 24 58
Mar 6 22:10:54 slced1 vmunix: Called from f8018b5c, fp=f8112e70, args=ff653fe8 ff653860 0 f8138f48 ff653858 ff653800
Mar 6 22:10:54 slced1 vmunix: Called from f801783c, fp=f8112ee0, args=ff653fe8 f8138f48 c037 0 f811d400 b
Mar 6 22:10:54 slced1 vmunix: Called from f8041ba4, fp=f8112f40, args=ff653f80 14 f8138f48 0 ff653fe8 0
Mar 6 22:10:54 slced1 vmunix: Called from f8005c0c, fp=f8112fa0, args=0 0 f8017364 0 8001e1 f813ef64
Mar 6 22:10:54 slced1 vmunix: Called from f805eba0, fp=f82f5d38, args=100 4001e3 4000e3 ff653080 0 0
Mar 6 22:10:54 slced1 vmunix: End traceback...
Mar 6 22:10:54 slced1 vmunix: panic: Data fault
Mar 6 22:10:54 slced1 vmunix: syncing file systems... [11] [11] [9] [7] [3] done
Mar 6 22:10:54 slced1 vmunix: 00000 low-memory static kernel pages
Mar 6 22:10:54 slced1 vmunix: 00657 additional static and sysmap kernel pages
Mar 6 22:10:54 slced1 vmunix: 00016 dynamic kernel data pages
Mar 6 22:10:54 slced1 vmunix: 00188 additional user structure pages
Mar 6 22:10:54 slced1 vmunix: 00000 segmap kernel pages
Mar 6 22:10:54 slced1 vmunix: 00000 segvn kernel pages
Mar 6 22:10:54 slced1 vmunix: 00110 current user process pages
Mar 6 22:10:54 slced1 vmunix: 00068 user stack pages
Mar 6 22:10:54 slced1 vmunix: 01039 total pages (1039 chunks)
Mar 6 22:10:54 slced1 vmunix:
Mar 6 22:10:54 slced1 vmunix: dumping to vp ff020f5c, offset 72004

two more reboots with NO clues ..
--
+ efb@suned1.nswses.Navy.MIL efb@gcpacix.uucp efb@gcpacix.cotdazr.org +
+ efb@nosc.mil WA6CRE Gold Coast Sun Users Vta-SB-SLO DECUS gnu +
+ Opinions, MINE, NOT Uncle Sam_s | b-news postmaster xntp dns WAFFLE +

tim@appenzell.cs.wisc.edu (Tim Theisen) (03/10/91)
In article <8329@suned1.Nswses.Navy.MIL> efb@slced1.nswses.navy.mil (Everett F Batey) writes:
>This evening I thought I had cslip ( UofT SLIP 4.0 with cslipbeta and the
>sunos4 fixes ) working .. got connected onto a remote VAX (+e?) .. slattach
>dev locip remip .. locally (Sun 4/20) .. slattach devc locip remip baud.
>
>No errors upon kernel or slattach make .. double checked the three READMEs
>for UT, cslipbeta and sunos4.
>
>Mar 6 19:01:39 slced1 vmunix: SunOS Release 4.1 (SLCED-SL) #1: Wed Mar 6
>18:48:11 PST 1991 .. new kernel booted
>
>Got the good slip0 starting IP IP .. baud .. finally was able to route add
>hisnet myip 1 .. even got ping and finally rsh running .. compatible compress,
>baud .. as the traffic increased .. couldn't find much evidence of what else
>may have been going on. ALL at once with no visible degradation ..
>
>WHAT IS THE diffs from the above mentioned release configs to make a bullet
>proof kernel for an SLC (4/20 Sun) with SunOS 4.1.1 ? ( NO PPP not an option,
>really ) Thank you /Ev/
>
>two more reboots with NO clues ..

OK, here is a clue. I found the following bug when I ported cslipbeta to
Ultrix 4.0. In if_sl.c, there is a spot where the code looks at the data in
the mbuf to do type-of-service queueing. However, it just blindly looks where
it expects the data to be. When IP fragmentation is occurring, the IP header
and data are in separate mbufs, so the code may access a memory location off
the end of the mbuf. On the Ultrix MIPS kernel, if you were unlucky enough to
have the mbuf against the top of kernel memory, the access would generate a
trap and the kernel would panic.

Here is the fix I applied to cslipbeta. This might be the cause of your
problem. In any case, it would not hurt to apply the patch.

*** if_sl.c.old Sat Mar 9 14:09:12 1991
--- if_sl.c Sat Mar 9 14:11:35 1991
***************
*** 456,462 ****
      }
      ifq = &ifp->if_snd;
      if ((ip = mtod(m, struct ip *))->ip_p == IPPROTO_TCP) {
!         register int p = ((int *)ip)[ip->ip_hl];
  
          if (INTERACTIVE(p & 0xffff) || INTERACTIVE(p >> 16)) {
              ifq = &sc->sc_fastq;
--- 456,464 ----
      }
      ifq = &ifp->if_snd;
      if ((ip = mtod(m, struct ip *))->ip_p == IPPROTO_TCP) {
!         register int p = -1;
!         if (m->m_len > sizeof(struct ip))
!             p = ((int *)ip)[ip->ip_hl];
  
          if (INTERACTIVE(p & 0xffff) || INTERACTIVE(p >> 16)) {
              ifq = &sc->sc_fastq;

Hope this solves your problem,
...Tim
--
Tim Theisen             Department of Computer Sciences
Systems Programmer      University of Wisconsin-Madison
tim@cs.wisc.edu         1210 West Dayton Street
(608)262-0438           Madison, WI 53706
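
For readers without the kernel source at hand, the idea behind the fix in the reply above (do not read the TCP port word unless it actually lies inside the first buffer) can be sketched in standalone C. This is only an illustration under stated assumptions: `struct buf` and `tcp_ports()` are hypothetical names standing in for the kernel's mbuf and the inline code in if_sl.c, and the length test here (full IP header plus four port bytes) is deliberately a little stricter than the `m->m_len > sizeof(struct ip)` test in the posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mbuf: a data pointer and the number of
 * valid bytes behind it. Not the kernel's struct mbuf. */
struct buf {
    const uint8_t *data;
    size_t len;
};

/* Hypothetical helper: extract the TCP source and destination ports that
 * follow the IP header, but only if they are inside this buffer.
 * Returns 1 and fills *src / *dst on success, 0 if the ports are not
 * present here (for example, the header sits alone in its own buffer). */
static int tcp_ports(const struct buf *b, uint16_t *src, uint16_t *dst)
{
    size_t hdr_bytes;

    assert(b != NULL && b->data != NULL && src != NULL && dst != NULL);

    if (b->len < 1)
        return 0;

    /* Low nibble of the first byte is the IP header length in 32-bit words. */
    hdr_bytes = (size_t)(b->data[0] & 0x0f) * 4;

    /* Refuse to look past the end of the buffer. */
    if (hdr_bytes < 20 || b->len < hdr_bytes + 4)
        return 0;

    /* Ports are stored in network (big-endian) byte order. */
    *src = (uint16_t)((b->data[hdr_bytes] << 8) | b->data[hdr_bytes + 1]);
    *dst = (uint16_t)((b->data[hdr_bytes + 2] << 8) | b->data[hdr_bytes + 3]);
    return 1;
}
```

A caller would treat a return of 0 the way the patch treats `p = -1`: the packet simply is not classified as interactive and goes on the normal send queue.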