mwm@hslrswi.hasler.ascom.ch (Mike McGann) (04/10/91)
I have brought 4.3 Reno up on a uVax with an RD53 and a CDC XMD (RA81
emulation). There are some small problems:

uda.c    - now compares not only the drive code but also the geometry
           against its tables, and they are wrong for an RD53.
passwd.c - thinks the username is argv[0] instead of argv[1], so it works
           if you rename it to the user name.

But I am having a bigger problem. It seems to trash a filesystem in some
circumstances, to the extent that fsck can't fix it. I only have to run as
root and try to make GNU Emacs 18.55. Suddenly one of the directories
disappears; it just looks like somebody unlinked it. OK, I run fsck on the
filesystem and it acts just like that: it finds some disconnected
directories and so forth. But I never get the space back, I can't delete
the files from lost+found, and it's downhill from there until I remake
the fs.

Has anybody seen this problem, or have a fix?

mike
mwm@hslrswi.hasler.ascom.ch

jhma@tharr.UUCP (James Aldridge) (04/11/91)
In article <1926@hslrswi.hasler.ascom.ch> mwm@hslrswi.hasler.ascom.ch (Mike McGann) writes:
>I have brought 4.3 Reno up on a uVax with an
>RD53 and a cdc xmd(ra81 emulation) There are some
>small problems:

I have brought 4.3 Reno up on a uVAX and attempted it on a VAX-11/750.

>uda.c - now compares not only drive code but also
> geometry against its tables and they are wrong for a rd53.

Our uVAX II is entirely DEC kit so I can't comment on this one.

>passwd.c - thinks the username is argv[0] instead of 1, so it works if you rename it to
> the user name.

The problem only exists if you don't run the Kerberos stuff. It can be made
to work as normal (passwd username) if you add the following lines as a
#else clause to the #ifdef KERBEROS in main():

	#else
		argc--;
		argv++;

which corresponds to the equivalent code in the Kerberos case (decoding the
-l flag using getopt).

>But I am having a bigger problem. It seems to trash a filesystem in some circumstances
>to the extent that fsck can't fix it.

I believe there was a bug in very early releases of Reno's fsck program. I
seem to recall a posting some time back but can't remember when or exactly
what the problem was.

The major problem I have had installing 4.3BSD-Reno on our 11/750 is that it
doesn't recognise our tape drive when using the supplied kernels. We have an
Emulex TC12 (ts11 compatible) tape controller which has worked quite happily
under all previous 4.x BSD releases. 4.3 Reno recognises that there is some
sort of controller present but complains that "zs0: didn't interrupt". I
have managed to get around the problem by using a modified version of the
driver from 4.3BSD and now have to wait for a suitable weekend to do the
rest of the installation!

>mike
>mwm@hslrswi.hasler.ascom.ch

James
-- 
James Aldridge / Solid State Logic Ltd. / Begbroke / Oxford / UK
Telephone: +44 865 842300 x229 / Fax: +44 865 842118
----------------------------------------------------------------
<- tharr *free* public access to Usenet in the UK 0234 720202 ->

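The effect of the passwd.c change described in the post above can be seen in a small stand-alone sketch. This is not the Reno source: `username_operand` is a hypothetical helper, written only to show that once the program name is skipped with `argc--; argv++;`, the user name operand is found at `argv[0]`, which is where the rest of main() apparently expects it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from passwd.c): return the user name operand,
 * or NULL if the command was run with no operand.
 */
const char *username_operand(int argc, char **argv)
{
    assert(argc >= 1 && argv != NULL);

    /* Skip the program name, as the proposed #else branch does. */
    argc--;
    argv++;

    /* After the shift, the first operand (if any) is argv[0]. */
    return argc > 0 ? argv[0] : NULL;
}
```

Called with the vector { "passwd", "mike" }, the helper returns "mike". Without the shift, argv[0] would still be the program name, which matches the symptom that renaming the binary to the user name made it work.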
dennis@gpu.utcs.utoronto.ca (Dennis Ferguson) (04/11/91)
In article <1926@hslrswi.hasler.ascom.ch> mwm@hslrswi.hasler.ascom.ch (Mike McGann) writes:
>But I am having a bigger problem. It seems to trash a filesystem in some circumstances
>to the extent that fsck can't fix it. I only have to run as root, and try to make
>gnu emacs 18.55. Suddenly one of the directories disappears just looks like somebody
>unlinked it. Ok I run fsck on the filesystem and it acts just like that.

We had vaguely similar symptoms. Our machines would crash frequently when
the file systems got busy and would come back up with some really ugly file
system inconsistencies. This started to happen when we turned on accounting.

We found a bug in the accounting code. The routine which checks whether the
file system is full enough that accounting should be turned off was being
called on every clock tick, rather than once every 15 seconds as intended.
A patch to kern/kern_acct.c follows.

I don't think this change actually repaired the more serious problem (which
I suspect is some sort of race condition in the file system code) but it did
make the symptoms go away, which made us happy. Note too that the Reno
version we were having trouble with is a homebrew port to the IBM RT, which
may have its own unique set of problems.

Dennis Ferguson

*** /tmp/,RCSt1000583	Thu Apr 11 21:18:28 1991
--- kern_acct.c	Thu Mar  7 18:38:11 1991
***************
*** 113,119 ****
  		acctp = NULL;
  		log(LOG_NOTICE, "Accounting suspended\n");
  	}
! 	timeout(acctwatch, (caddr_t)resettime, hzto(resettime));
  }
  
  /*
--- 113,120 ----
  		acctp = NULL;
  		log(LOG_NOTICE, "Accounting suspended\n");
  	}
! 	timeout(acctwatch, (caddr_t)resettime,
! 		(int)(resettime->tv_sec * hz + resettime->tv_usec / tick));
  }
  
  /*

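The arithmetic in the patch above can be tried in user space. The sketch below is a model under stated assumptions, not kernel code: `interval_to_ticks` mirrors the patched expression, while `ticks_until` is a simplified stand-in for an hzto()-style routine that treats its argument as an absolute time of day. It is an assumption here, consistent with the symptom reported, that a non-positive delay ends up firing on the next tick. The values hz = 100 and tick = 10000 microseconds are examples only.

```c
#include <assert.h>
#include <sys/time.h>

/* Convert a relative interval to clock ticks, as the patched call does. */
int interval_to_ticks(const struct timeval *tv, int hz, int tick)
{
    assert(hz > 0 && tick > 0);
    return (int)(tv->tv_sec * hz + tv->tv_usec / tick);
}

/*
 * Simplified model (an assumption, not the kernel's hzto()): ticks from
 * `now` until the absolute time `when`, never less than one tick.
 */
int ticks_until(const struct timeval *when, const struct timeval *now,
                int hz, int tick)
{
    long sec, usec, ticks;

    assert(hz > 0 && tick > 0);
    sec = (long)(when->tv_sec - now->tv_sec);
    usec = (long)(when->tv_usec - now->tv_usec);
    ticks = sec * hz + usec / tick;

    /* A time already in the past is treated as "next tick". */
    return ticks < 1 ? 1 : (int)ticks;
}
```

With a 15 second interval, `interval_to_ticks` gives 1500 ticks at hz = 100. Passing the same { 15, 0 } value to `ticks_until` together with a 1991 time of day as `now` gives 1, because 15 seconds after the epoch is long past. That would reschedule the check on every tick, as described in the post.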
torek@elf.ee.lbl.gov (Chris Torek) (04/13/91)
In article <2032@tharr.UUCP> jhma@tharr.UUCP (James Aldridge) writes:
>The Major problem I have had installing 4.3BSD-Reno on our 11/750 is that it
>doesn't recognise our tape drive when using the supplied kernels. We have an
>Emulex TC12 (ts11 compatible) tape controller which has worked quite happily
>under all previous 4.x BSD releases. 4.3 Reno, recognises that there is some
>sort of controller present but complains that "zs0: didn't interrupt".

I wrote this code. It worked fine on my VAXen at Maryland (which had Dilog
`DEC' controllers and Emulex TC13s) and apparently works on a real DEC TS11
controller at Berkeley, but you are not the first one to report this. I
cannot explain it: the code obeys all the rules in the Emulex manual. If I
had some time with a system with this problem I could no doubt fix it (and
maybe this explains why the original driver author thought it was `too hard'
to make it interrupt).

There is a `quick fix': in tsprobe(), just before

	if (cvec == 0 || cvec == 0x200)	/* no interrupt */
		ubarelse(numuba, &a);

add

	if (cvec == 0x200 && ctlr == 0) {
		/*
		 * No interrupt, assume standard vector XXX
		 * (need to find out why this happens)
		 */
		cvec = 0224;
		br = 0x15;
	}

This will only support ts0; if you prefer you can make it:

	if (cvec == 0x200) {
		/*
		 * No interrupt, assume standard vector XXX
		 * (need to find out why this happens)
		 */
		cvec = (unsigned)reg & 7 ? 0260 : 0224;
		br = 0x15;
	}

(which then makes the following test for 0x200 fail, so it could be
removed). This will make the code act as before except when the probe
succeeds (then the code will use the real interrupt vector).

The 4.3reno code is older than the stuff I have now, which just uses
TS_IE|TS_SETCHR (no TS_SENSE) and allows up to 2 minutes for the interrupt
(in case the controller is busy with a rewind), but this did not fix the
problem on the other system on which it was reported.

If you are in the SF Bay area and have a machine with this problem, and
will let me experiment, send me mail....
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

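The vector selection in the second form of the quick fix above can be isolated into a small user-space sketch. The function name `fallback_vector` and the constant names are hypothetical; only the values 0x200, 0224 and 0260 and the `& 7` test come from the post. The register addresses mentioned afterwards are illustrative values, not taken from the thread.

```c
#include <assert.h>

enum {
    NO_INTERRUPT  = 0x200,  /* probe result meaning "did not interrupt" */
    TS_VEC_FIRST  = 0224,   /* vector assumed for the first controller */
    TS_VEC_SECOND = 0260    /* vector assumed for the second controller */
};

/*
 * Hypothetical helper: keep the probed vector if the probe worked,
 * otherwise guess one from the low bits of the register address.
 */
int fallback_vector(int probed_cvec, unsigned long csr_addr)
{
    assert(probed_cvec >= 0);

    if (probed_cvec != NO_INTERRUPT)
        return probed_cvec;     /* probe succeeded: use the real vector */

    /* Same test as the quick fix: nonzero low three bits pick 0260. */
    return (csr_addr & 7) ? TS_VEC_SECOND : TS_VEC_FIRST;
}
```

For example, with an address ending in octal 0 (such as 0172520) the guess is 0224, and with one ending in octal 4 (such as 0172524) it is 0260. A probe that did interrupt is passed through unchanged, which is the "act as before except when the probe succeeds" behaviour described in the post.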
muller@sdcc10.ucsd.edu (Keith Muller) (04/13/91)
In article <12027@dog.ee.lbl.gov>, torek@elf.ee.lbl.gov (Chris Torek) writes:
> If I had some time with a system with this problem I
> could no doubt fix it (and maybe this explains why the original driver
> author thought it was `too hard' to make it interrupt).

There is a firmware bug in several versions of the PROMs in Emulex TC12 and
TC13 controllers that causes them not to interrupt in the probe routine. I
know that there are PROMs for the TC13 that make it interrupt under both
Tahoe and Reno. You probably want to give Emulex customer service a call.

Keith Muller
University of California, San Diego