henry@utzoo.UUCP (Henry Spencer) (10/17/84)
Well, we tested out a new kernel, with a couple of significant performance
improvements (hashing in the inode table, and a freelist for the file
table), plus other minor trivia. Looked fine single-user. When we came
up multi-user, bizarre things started happening. Shut down again, came
back up with the old /unix. Same bizarre problems. I noticed we were
getting disk errors. Shut down again, set out to figure out just what
was going on -- were our precious Eagles failing all of a sudden? -- and
suddenly realized that I'd write-protected drive B during the standalone
testing, and never un-protected it!! A flip of a switch, and we came back
up perfectly.

The real puzzle is, why was the system reacting so poorly to this? The
proper response to something like this is a spew of console messages;
there weren't any until I tinkered with the srm parameters. Even worse,
there were indications that user programs weren't seeing errors either.
I've seen some signs of error-handling problems in the rm driver before;
it's time for a thorough investigation.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry