[comp.unix.microport] uport drive bug?

zeeff@b-tech.UUCP (Jon Zeeff) (02/22/88)

> I was plagued with random pseudo-errors when I tried to use two hard drives.
> These errors were fake in that no media error existed, but real in that the
> file system got corrupted.  I finally had to solve it by going back to just
> one hard disk.  The problem was repeatable on demand by using cpio to try to
> 

This is exactly the problem I have here.  Ok, uport, it appears that there
is a problem here (in both '286 and '386 versions).


-- 
Jon Zeeff           		Branch Technology,
uunet!umix!b-tech!zeeff  	zeeff%b-tech.uucp@umix.cc.umich.edu

david@bdt.UUCP (David Beckemeyer) (02/23/88)

In article <4281@b-tech.UUCP> zeeff@b-tech.UUCP (Jon Zeeff) writes:
>
>> I was plagued with random pseudo-errors when I tried to use two hard drives.
>> These errors were fake in that no media error existed, but real in that the
>> file system got corrupted.  I finally had to solve it by going back to just
>> one hard disk.  The problem was repeatable on demand by using cpio to try to
>> 
>
>This is exactly the problem I have here.  Ok, uport, it appears that there
>is a problem here (in both '286 and '386 versions).

I also have this problem with the 286 version.  I have been told by
uport that this is fixed in the 386 version.

The fun part was that it took several calls to uport techs who just
kept saying "map out the bad track" and try again.  Each time this
meant a backup, re-format/divvy, and a restore.

I hope it is *really* fixed in the 386 version.  Does anybody have this
problem with 386 uport?

-- 
David Beckemeyer			| "To understand ranch lingo all yuh
Beckemeyer Development Tools		| have to do is to know in advance what
478 Santa Clara Ave, Oakland, CA 94610	| the other feller means an' then pay
UUCP: ...!ihnp4!hoptoad!bdt!david 	| no attention to what he says"

bhj@system5.UUCP (Burt Janz) (02/25/88)

Hmmm...

I am currently using two hard disk drives under Microport SV/AT 2.3 in both
the normal and Merge kernel (I alternately boot when necessary...) and
have noted NO problems with the hard disk drive.

Mebbe I missed something in an earlier posting.

Anyway, I'm using a WD controller, Maxtor 1065 and Quantum 540 drives.  My
configuration is normal, and I have the Quantum as an entire file system
(~36mb) as /k.

Burt Janz
..decvax!bhjat!bhj

john@wa3wbu.UUCP (John Gayman) (02/27/88)

In article <4281@b-tech.UUCP>, zeeff@b-tech.UUCP (Jon Zeeff) writes:
> 
> > I was plagued with random pseudo-errors when I tried to use two hard drives.
> > These errors were fake in that no media error existed, but real in that the
> > file system got corrupted.  I finally had to solve it by going back to just
> > one hard disk.  The problem was repeatable on demand by using cpio to try to


     Is there any common controller being used by the sites having massive
dual-disk errors?  I have been running two hard disks for almost 9
months and haven't had any problems. (I should know better than to say
this). I had some problems with divvy setting up the file system properly
on the second disk, but once I got through that, the drives worked fine.
I even tried the cpio experiment. I cpio -p 10MB worth from the second
disk onto the first disk, simultaneously with News expire, simultaneously
with the first disk doing a "find . -depth -print" and had no problems.
I mean, the disks were cranking big-time forever. :-)  I'm using fairly
bland hardware: a Compu-Add 8 MHz 286 with a WD-1002 controller. One
unique thing about my second hard disk: due to the annoying problem with
divvy, it has always been a single partition configured
as /dev/dsk/1s10.  Hope this helps shed some light on the problem.
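
A copy test like the one described above is only conclusive if the copied
data is compared against the source afterwards. The following is a minimal
sketch of that idea in Python, not the procedure anyone in this thread used:
it swaps shutil.copytree in for "cpio -p", and the function names
tree_digests and copy_and_verify are made up for illustration. Note that a
read-back right after the copy may be served from the buffer cache rather
than from the disk.

```python
import hashlib
import shutil
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            digests[str(path.relative_to(root))] = hashlib.sha256(data).hexdigest()
    return digests


def copy_and_verify(src, dst):
    """Copy the tree at src to dst (dst must not exist yet) and return the
    relative paths that are missing or differ in content after the copy."""
    shutil.copytree(src, dst)
    before = tree_digests(src)
    after = tree_digests(dst)
    return sorted(name for name in before if after.get(name) != before[name])
```

An empty list from copy_and_verify means every file in the source tree has
an identical copy in the destination; any names returned are files worth
inspecting by hand.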


					John


-- 
John Gayman, WA3WBU              |           UUCP: uunet!wa3wbu!john
1869 Valley Rd.                  |           ARPA: wa3wbu!john@uunet.UU.NET 
Marysville, PA 17053             |           Packet: WA3WBU @ AK3P 

james@bigtex.UUCP (James Van Artsdalen) (02/28/88)

In article <505@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
>      Is there any common controller being used by the sites having massive
> dual-disk errors?

I was using a Western Digital 1003 with a Seagate 4096 and a Rodime 103E on
an old-style PCs Ltd 12MHz 286 with 2.5meg of RAM.  The problems occurred at
6MHz also, and at various interleaves.  The drives are believed not to be at fault,
as the reported error did not occur on a manufacturer-marked bad spot, and
both drives passed Novell's COMPSURF on a 150hr test (since uPort didn't
say *which* drive failed, both were so tested).  Multiple controllers were
tried, although all were various revisions of the WD1003, as that's all I had
access to.  The problem was observed with uPort 1.3.6, 1.3.8b3 & 2.2.

> I even tried the cpio experiment. I cpio -p 10MB worth from the second
> disk onto the first disk, simultaneously with News expire, simultaneously
> with the first disk doing a "find . -depth -print" and had no problems.
> I mean, the disks were cranking big-time forever. :-)

On my machine, as soon as the disk buffers had filled and the system started
bouncing between the two drives on almost every access, there would be a bogus
error within a minute.  This test would have shown it.
-- 
James R. Van Artsdalen    ...!uunet!utastro!bigtex!james     "Live Free or Die"
Home: 512-346-2444 Work: 328-0282; 110 Wild Basin Rd. Ste #230, Austin TX 78746

zeeff@b-tech.UUCP (Jon Zeeff) (02/29/88)

I used to get the drive errors when using a seagate 4096 and ST251 drives
in an IBM AT with uport '286 version 2.2.  I now get the problem with
the same drives and a PC CLUB '386 machine and '386 Unix version 2.1
(actually the current Merge version).

-- 
Jon Zeeff           		Branch Technology,
uunet!umix!b-tech!zeeff  	zeeff%b-tech.uucp@umix.cc.umich.edu

fyl@ssc.UUCP (Phil Hughes) (03/01/88)

I have had some 'bad spots' appear with Microport 386.  It's too early to
tell yet, but I am concerned.
-- 
Phil    uunet!pilchuck!ssc!fyl 

fyl@ssc.UUCP (Phil Hughes) (03/01/88)

I have disk errors with a Seagate 4096 on uPort and it seems that
everyone else who posted a problem had a 4096.  My disk controller
is a National (of Japan) NDC-5425.  Maybe we can figure this out.
-- 
Phil    uunet!pilchuck!ssc!fyl 

james@bigtex.UUCP (James Van Artsdalen) (03/02/88)

In article <1053@ssc.UUCP>, fyl@ssc.UUCP (Phil Hughes) writes:
> I have disk errors with a Seagate 4096 on uPort and it seems that
> everyone else who posted a problem had a 4096.  My disk controller
> is a National (of Japan) NDC-5425.  Maybe we can figure this out.

I also tried the combination of a 72meg Toshiba and the 33meg Rodime with no
luck.  Remember, using any drive alone, even under heavy load, produces no
problems.  It's only when *both* drives are used that failure is induced.  Not
only that, but both drives must be under heavy load.  This is why I suspect
either a design flaw in the controller (WD1003) or a bug in the uPort driver.  DOS
never generates accesses back and forth rapidly between drives as far as I
know, so there isn't an easy way to test the controller separately.  Could ask
the Xenix folks who use a WD1003 I suppose.
-- 
James R. Van Artsdalen    ...!uunet!utastro!bigtex!james     "Live Free or Die"
Home: 512-346-2444 Work: 328-0282; 110 Wild Basin Rd. Ste #230, Austin TX 78746

mjy@sdti.UUCP (Michael J. Young) (03/02/88)

In article <1053@ssc.UUCP> fyl@ssc.UUCP (Phil Hughes) writes:
>I have disk errors with a Seagate 4096 on uPort and it seems that
>everyone else who posted a problem had a 4096.  My disk controller
>is a National (of Japan) NDC-5425.  Maybe we can figure this out.

I've had disk errors with a number of different disk configurations.
The original configuration was a Seagate 4051 and 4038.  The 4051 eventually
died a horrible death (unrelated problem), and was temporarily replaced
with a 251.  The intermittent I/O errors were still there.  Now I have a
Tandon 3085 along with the old 4038, and I can still create I/O errors
using the cpio -p test.  So it's not just limited to the 4096.

I have a WD WA2 controller in an ACER 900 10MHz.
-- 
Mike Young - Software Development Technologies, Inc., Sudbury MA 01776
UUCP     : {decvax,harvard,linus,mit-eddie}!necntc!necis!mrst!sdti!mjy
Internet : mjy%sdti.uucp@harvard.harvard.edu      Tel: +1 617 443 5779

mjy@sdti.UUCP (Michael J. Young) (03/02/88)

In article <882@bigtex.UUCP> james@bigtex.UUCP (James Van Artsdalen) writes:
>I also tried the combination of a 72meg Toshiba and the 33meg Rodime with no
>luck.  Remember, using any drive alone, even under heavy load, produces no
>problems.  It's only when *both* drives are used that failure is induced.  Not
>only that, but both drives must be under heavy load.  This is why I suspect
>either a design flaw in the controller (WD1003) or a bug in the uPort driver.

Actually, it isn't the amount of load on the drives, it's the fact that the
system is trying to access BOTH drives simultaneously.  The easiest way to
get this to happen is to have multiple processes accessing the different
drives at the same time.  I used to get the errors a lot by generating a
lot of swapping activity (e.g., using pathalias) while doing a lot of
accesses to the drive that doesn't have the swap file.
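
The trigger described above, two drives being accessed at the same time, can
be sketched as a small write-and-read-back test. This is an illustration
only, under stated assumptions: it uses two threads rather than separate
processes, the names hammer and hammer_both are hypothetical, and the two
directories are assumed to live on different physical drives. It is not
expected to show anything on a healthy modern system.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def hammer(directory, rounds=20, size=64 * 1024):
    """Write a known pattern to a scratch file in `directory`, read it back,
    and count the rounds where the data read differs from the data written."""
    mismatches = 0
    for i in range(rounds):
        pattern = bytes([i % 256]) * size
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(pattern)
                f.flush()
                os.fsync(f.fileno())  # ask the OS to push the data to the device
            with open(path, "rb") as f:
                if f.read() != pattern:
                    mismatches += 1
        finally:
            os.remove(path)  # clean up the scratch file every round
    return mismatches


def hammer_both(dir_a, dir_b, rounds=20):
    """Run hammer() against both directories at the same time and return
    the mismatch count for each, keyed "a" and "b"."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fut_a = pool.submit(hammer, dir_a, rounds)
        fut_b = pool.submit(hammer, dir_b, rounds)
        return {"a": fut_a.result(), "b": fut_b.result()}
```

Calling hammer_both with one directory on each drive keeps both drives busy
together; running hammer against each directory separately gives the
single-drive baseline for comparison.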
-- 
Mike Young - Software Development Technologies, Inc., Sudbury MA 01776
UUCP     : {decvax,harvard,linus,mit-eddie}!necntc!necis!mrst!sdti!mjy
Internet : mjy%sdti.uucp@harvard.harvard.edu      Tel: +1 617 443 5779

david@bdt.UUCP (David Beckemeyer) (03/04/88)

I have the hard drive problem with a Micropolis 34.5 MB drive and a
CDC 72 MB drive.  Both drives work perfectly in a single drive configuration,
but I get disk errors when both drives are installed, regardless of
which drive is unit 0 and which is unit 1.   The system has a WD controller
and a 10 MHz 1MB motherboard, with 1MB of ext. RAM.
-- 
David Beckemeyer			| "To understand ranch lingo all yuh
Beckemeyer Development Tools		| have to do is to know in advance what
478 Santa Clara Ave, Oakland, CA 94610	| the other feller means an' then pay
UUCP: ...!ihnp4!hoptoad!bdt!david 	| no attention to what he says"

chad@anasaz.UUCP (Chad R. Larson) (03/05/88)

To add to the statistics:
  I have a Seagate and an Atasi drive attached to a WD-1002 controller
  and have not seen the problem.  I used cpio -pdmv to move complete
  file systems between the drives.  It seems to me (no tests, just
  reading this newsgroup) that the regularity is a WD-1003
  controller.
---------------
"I read the news today, oh boy!"  --John Lennon
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| DCF, Inc.               | UUCP: ...noao!mcdsun!nud!anasaz!dcfinc!chad |
| 14623 North 49th Place  | Ma Bell: (602) 953-1392                     |
| Scottsdale, AZ 85254    | Loran: N-33deg37min20sec W-111deg58min26sec |
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|         Disclaimer: These ARE the opinions of my employer!            |
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

jay@splut.UUCP (Jay Maynard) (03/06/88)

I, too, get the wn io errors. (In fact, that was the subject of a memorable
flame I posted a few months ago.) I'm running an ST4051 and an ST251, with
the root file system on /dev/dsk/0s0, the /usr file system on /dev/dsk/1s2,
and a 1 MB ramdisk on /tmp. The ramdisk has greatly reduced my i/o errors.
This is on an Everex 1800A, with the standard Everex disk controller. I have
a WD1003 sitting here...maybe I oughtta try that, too.
BTW, has anyone tried making a new floppy-rooted kernel to replace the
(apparently) bad one supplied on the 2.3.0-L boot floppy? Is the unlimited
one bad, too?

-- 
Jay Maynard, EMT-P, K5ZC...>splut!< | GEnie: JAYMAYNARD  CI$: 71036,1603
uucp: {uunet!nuchat,academ!uhnix1,{ihnp4,bellcore,killer}!tness1}!splut!jay
Never ascribe to malice that which can adequately be explained by stupidity.
The opinions herein are shared by none of my cats, much less anyone else.

karl@sugar.UUCP (Karl Lehenbauer) (03/06/88)

In article <1053@ssc.UUCP>, fyl@ssc.UUCP (Phil Hughes) writes:
> ...My disk controller
> is a National (of Japan) NDC-5425. ...

Again, National (NCL) disk controllers do not work well, if at all, with
Microport System V/AT.  Anyone trying to use a National controller for
this purpose would be well advised to get a different one, say one based
on the Western Digital controller chip.
-- 
"Lack of skill dictates economy of style." - Joey Ramone
..!uunet!nuchat!sugar!karl, Unix BBS (713) 438-5018

mjy@sdti.UUCP (Michael J. Young) (03/09/88)

In article <415@splut.UUCP> jay@splut.UUCP (Jay Maynard) writes:
>BTW, has anyone tried making a new floppy-rooted kernel to replace the
>(apparently) bad one supplied on the 2.3.0-L boot floppy? Is the unlimited
>one bad, too?

What's wrong with the boot floppy kernel?  I had no problems with it -- of
course, I switched to the large kernel shortly after installation :-).
If you received a bad boot floppy, call Microport.  My 2.3 upgrade contained
a bad floppy (couldn't read the kernel), and they were very helpful.
They sent out a new floppy, which I received within 3 days.  In the meantime,
they gave me tips on how to install the upgrade without a working boot
floppy.  I thought technical support was both knowledgeable and helpful.
-- 
Mike Young - Software Development Technologies, Inc., Sudbury MA 01776
UUCP     : {decvax,harvard,linus,mit-eddie}!necntc!necis!mrst!sdti!mjy
Internet : mjy%sdti.uucp@harvard.harvard.edu      Tel: +1 617 443 5779

plocher@cat2.CS.WISC.EDU (John Plocher) (03/09/88)

I also am using 2 drives with NO problems - a Maxtor 1140 and a Seagate 225.
These are hung off a WD 1002-WAH controller.

Others reported problems with WD 1003 controllers - is this the only
difference?

/root, /swap, /usr, and /u are on the Maxtor, /usr/spool/news is the 225.
Needless to say, there is much copying between both drives, and no "phantom"
bad sectors...

  -John

karl@sugar.UUCP (Karl Lehenbauer) (03/11/88)

I haven't had the problem, although I am running with two drives.  The
drives are both Miniscribe 72 meg units.  I noticed that everyone who's
posted about having these problems has unlike kinds of drives, and I was
wondering if any people with two identical drives are having the problem.
-- 
"Lack of skill dictates economy of style." - Joey Ramone
..!uunet!nuchat!sugar!karl, Unix BBS (713) 438-5018

jay@splut.UUCP (Jay Maynard) (03/19/88)

From article <240@sdti.UUCP>, by mjy@sdti.UUCP (Michael J. Young):
> In article <415@splut.UUCP> jay@splut.UUCP (Jay Maynard) writes:
>>BTW, has anyone tried making a new floppy-rooted kernel to replace the
>>(apparently) bad one supplied on the 2.3.0-L boot floppy? Is the unlimited
>>one bad, too?
> 
> What's wrong with the boot floppy kernel?  I had no problems with it -- of
> course, I switched to the large kernel shortly after installation :-).

I had a problem both reading the boot kernel and running patch against
it. Both were due to a problem I discovered trying to make a copy of the
floppy: bad sectors on tracks 63, 64, and 65 of the boot disk. As it
turns out, because of some great help I got from Dwight Leu, I wound up
with two copies of the 2.3 upgrade, and both had bad sectors in the same
place, leading me to believe that the master was bad.

> they gave me tips on how to install the upgrade without a working boot
> floppy.  I thought technical support was both knowledgeable and helpful.

I managed, with the help of a bit of head scratching and my 2.2 kernel,
to get the large kernel on my system; at that point, it was no trouble
at all. I'm asking mainly so that I can build a good boot floppy for
fsck-ing. (A procedure I strongly recommend to guard against the fsck
file corrupter.)
-- 
Jay Maynard, EMT-P, K5ZC...>splut!< | GEnie: JAYMAYNARD  CI$: 71036,1603
uucp: {uunet!nuchat,academ!uhnix1,{ihnp4,bellcore,killer}!tness1}!splut!jay
Never ascribe to malice that which can adequately be explained by stupidity.
The opinions herein are shared by none of my cats, much less anyone else.