[comp.os.minix] 1.3 hard disk question

Chuck_M_Grandgent@cup.portal.com (06/04/89)

I'm still in the process of trying to upgrade my 1.2 to 1.3.
I have two partitions, hd2 @ 6680 blocks, and hd3 @ about 3500.
I'd changed 1.2 to have it load the file system from hd2.
I did this by changing ROOT_DEV in const.h for file system stuff.
I also changed main.c in file system stuff to not load f.s. to RAM
disk.  I set the RAM disk to just 10k, 'cuz I really don't want it.
This worked fine for 1.2, but doesn't for 1.3.
As I initially load the 1.3 root file system floppy, it sets my
hd2 to the size of my f.s. floppy, 360 blocks.  It also copies
that root file system floppy contents to my hd2.
Which module is doing this, and are there any suggestions for
preventing Minix from trying to load my root file system to RAM
disk?  By the way, I'd make hd2 minimal, say 400 blocks, but then
I can't get mkfs to make a file system bigger than about 7000 blocks,
so that's why I like my root at 6680 and the rest in hd3.
============================================================
sun!portal!cup.portal.com!chuck_m_grandgent        K1OM
AEG Modicon, Industrial Automation Systems Group
North Andover / Concord, Massachusetts
============================================================

rbthomas@athos.rutgers.edu (Rick Thomas) (06/13/89)

Did you ever find the answer to this?  I am having a similar problem.

I have two hard disks in my AT.  One (the first one) I have reserved
for DOS.  (DOS refuses to boot unless there is an active partition on
the first hard disk, so I gave it the whole thing.) The second disk is
for Minix.  In Minix it is called /dev/hd5 and the partitions are
/dev/hd{6,7,8,9}

While booting -- during the copy root to ramdisk phase -- I get the
diagnostic "Illegal partition table" (always at block 45 for some
reason)  and partitions 6,7,8 and 9 all have length zero according to
"dd if=/dev/hd6 of=/dev/null"  (where 6 = one of 6,7,8,9)  DD claims
to have copied "0+0" records.

"fdisk /dev/hd5" shows a perfectly normal partition table with
partition 1 equal to the whole disk.  mkfs /dev/hd6 fails with a write
error, so I have put my /usr partition on /dev/hd5, and all seems well
except for the "illegal partition table" message and the fact that the
boot-time fsck refuses to check /dev/hd5 (it won't even give me the
option! It only accepts 1-4 and 6-9.)  Run-time fsck works fine.

Rick

BTW -- I'm using Minix vers 1.3 "fresh out of the box" from P-H with no
hacks (yet).

========================================================================
Rick Thomas, Manager
Supercomputer Remote Access Center
Rutgers University, College of Engineering
Brett and Bowser Roads
Piscataway, NJ 08855-0909

Phone: (201) 932-4301
Internet: rbthomas@jove.rutgers.edu
UUCP: {any backbone site}!rutgers!jove.rutgers.edu!rbthomas
Alternate UUCP: {convex|c1apple|karna}!kingtut!rbthomas

There are lots of dangerous people on the streets of Manhattan...
	     muggers,  rapists,  mimes...
========================================================================

From Chuck_M_Grandgent@cup.portal.com Sat Jun  3 19:14:50 1989
Subject: 1.3 hard disk question
Date: 3 Jun 89 23:14:50 GMT
Organization: The Portal System (TM)

-- 

Rick Thomas
uucp: {ames, cbosgd, harvard, moss, seismo}!rutgers!jove.rutgers.edu!rbthomas
arpa: rbthomas@JOVE.RUTGERS.EDU
Phone: (201) 932-4301

rbthomas@aramis.rutgers.edu (Rick Thomas) (06/16/89)

I have received exactly two responses to my request for help mentioned in
the subject line.

One reply was from Andy Tanenbaum in Amsterdam, The Netherlands; and
the other was from Bruce Evans, in Sydney, Australia.  In other words,
my message has gone and been read halfway round the world in both directions.

Isn't the net neat!

Rick
-- 

Rick Thomas
uucp: {ames, cbosgd, harvard, moss, seismo}!rutgers!jove.rutgers.edu!rbthomas
arpa: rbthomas@JOVE.RUTGERS.EDU
Phone: (201) 932-4301

rbthomas@athos.rutgers.edu (Rick Thomas) (06/22/89)

I finally solved the problem.  It turned out to be caused by having a
hard disk with 5 heads instead of the 4 that fdisk and buddies expected.

The above is the executive summary, now for the gory details.


First a summary of the problem:  

I have two hard disks in my AT.  One (the first one -- 20 MB) I have
reserved for DOS.  (DOS refuses to boot unless there is an active
partition on the first hard disk, so I gave it the whole thing.) The
second disk (30 MB) is for Minix.  In Minix it is called /dev/hd5 and
the partitions are /dev/hd{6,7,8,9}
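
The naming above (hd5 for the whole second disk, hd6 through hd9 for its
partitions) suggests five minor devices per drive: one for the whole disk
plus four partitions.  A minimal sketch of that mapping follows; the names
decode_hd_minor and struct hd_unit are made up for illustration and are not
taken from the Minix source.

```c
/* Assumption: each drive owns 1 whole-disk device plus 4 partitions,
 * as the hd5 / hd6..hd9 naming in the post implies. */
#define NR_PARTITIONS   4
#define DEV_PER_DRIVE   (1 + NR_PARTITIONS)

/* Hypothetical helper type: partition 0 means "the whole disk". */
struct hd_unit {
    int drive;      /* 0 = first hard disk, 1 = second */
    int partition;  /* 0 = whole disk, 1..4 = partition number */
};

/* Split a minor device number N (as in /dev/hdN) into drive and partition. */
struct hd_unit decode_hd_minor(int minor)
{
    struct hd_unit u;
    u.drive = minor / DEV_PER_DRIVE;
    u.partition = minor % DEV_PER_DRIVE;
    return u;
}
```

Under that assumption, decode_hd_minor(5) gives drive 1, partition 0 (the
whole second disk), and decode_hd_minor(6) gives drive 1, partition 1.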

While booting -- during the copy root to ramdisk phase -- I get the
diagnostic "Invalid partition table" (always at block 45 for some
reason)  and partitions 6,7,8 and 9 all have length zero according to
"dd if=/dev/hd6 of=/dev/null"  (where 6 = one of 6,7,8,9)  DD claims
to have copied "0+0" records.

"fdisk /dev/hd5" shows a perfectly normal partition table with
partition 1 equal to the whole disk.  mkfs /dev/hd6 fails with a write
error, so I have put my /usr partition on /dev/hd5, and all seems well
except for the "Invalid partition table" message and the fact that the
boot-time fsck refuses to check /dev/hd5 (it won't even give me the
option! It only accepts 1-4 and 6-9.)  Run-time fsck works fine.

Now an exciting tale of tracking the wild bug monster in its native
habitat:

My initial posting of the problem brought back two responses, one from
Andy Tanenbaum and the other from Bruce Evans.  Andy had no immediate
help (He noted that his home machine has two disks and it works just
fine!) but Bruce pointed out that doing mkfs on /dev/hd5 is a "no-no"
because it writes zero bytes all over the boot block (which includes
the partition table).  I had already discovered this myself, and
restored (I thought) the partition table using the Minix fdisk
program.  Certainly fdisk displayed a partition table that looked OK to
me.

My first step was to use grep to go looking through the whole source
tree for the place that was producing the "Invalid partition" message.
I found it in kernel/at_wini.c .  The actual message was somewhat
longer than that and included the number of the disk that it judged
bad, but the message was getting truncated by the screen-will-not-wrap
(mis-)feature of vanilla Minix 1.3 out of the box.  I edited the
message to put a newline at the beginning and end, so I could see which
disk was in trouble (hoping it was simply that Minix was not happy with
the DOS partitioning of my first disk, i.e. something I could ignore),
and built a new kernel and boot disk.

Unfortunately, it turned out to be the Minix disk that was causing the
trouble.

So I hacked a little more on at_wini.c, and made it print out the
in-core partition table that it built from what it read off the boot
sector.  (Putting kernel printf's into system initialization code is a
scary business.  If you happen to rip open a timing window and print
to the console before its driver gets initialized, you can do all kinds
of bad stuff!  I was working without a net, too, in the sense that if I
clobbered my disk, I would have to go back to square one and spend
three or four evenings reloading from the distribution disks and
re-applying patches.)  Andy was the one who suggested that putting
printf's into the driver is a useful debugging technique when nothing
else seems to help.  Thanks, Andy!  The upshot was that the partition
table seemed to be OK on disk, but was all zeros in core.

So now it's time to read the code. (-8 When all else fails, read the
instructions... 8-)  A line-by line examination of the code revealed
that there is a "magic number" (0xAA55) in the last two bytes of the
boot block that at_wini.c wants to see, and if it doesn't see it, it
prints the "Invalid partition" message and zeros out the in-core
partition table.  The zeroed partition table was why "dd if=/dev/hd6"
found nothing to read.
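
The check described above can be sketched roughly as follows.  This is not
the at_wini.c code, only an illustration of the same idea.  The function
names are hypothetical, and the table offset of 446 bytes is the usual PC
boot-sector layout, assumed here rather than taken from the driver.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE     512
#define SIG_OFFSET      510      /* last two bytes of the boot sector */
#define BOOT_SIGNATURE  0xAA55u  /* on disk: byte 0x55, then byte 0xAA */
#define TABLE_OFFSET    446      /* assumed standard PC layout */
#define TABLE_BYTES     64       /* four 16-byte partition entries */

/* Returns 1 if the sector ends with the 0xAA55 magic number, else 0. */
int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    uint16_t sig = (uint16_t)(sector[SIG_OFFSET] |
                              (sector[SIG_OFFSET + 1] << 8));
    return sig == BOOT_SIGNATURE;
}

/* Copies the raw on-disk table into `table`.  If the magic number is
 * missing, the in-core copy is zeroed instead and 0 is returned. */
int load_partition_table(const uint8_t sector[SECTOR_SIZE],
                         uint8_t table[TABLE_BYTES])
{
    if (!has_boot_signature(sector)) {
        memset(table, 0, TABLE_BYTES);
        return 0;
    }
    memcpy(table, sector + TABLE_OFFSET, TABLE_BYTES);
    return 1;
}
```

A zeroed in-core table means every partition has length zero, which matches
the "0+0" records that dd reported.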

I hacked together a version of fdisk that put back the magic number and
ran it.  Lo and behold, when I rebooted, the nasty message had gone
away and I could read from /dev/hd6.  But the in-core partition table
was a mess.  (The on-disk version still looked ok when I printed it
out with fdisk.  The in-core version had non-zero numbers in it but they
were wrong.)

Time to read the code again.  I had recently retrieved a public domain
disk partitioning program from the Clarkson University software
archives, and reading it gave me a very good description of how the
on-disk partition table was supposed to look.  (If anyone is
interested, I can post the relevant parts of the code and comments.
The formats are definitely non-obvious!) The program itself was not
much help, because it was tightly coupled with an odd-ball disk driver
TSR that I did not want to try out just then.  (I had enough
troubles!)  In the light of that new knowledge, I looked at fdisk with
a critical eye, and found that it had hard-coded assumptions about the
number of sectors and heads on a disk.  Those assumptions were wrong
for my second (Minix) disk, though they were right for my first (DOS)
disk.  I hacked on fdisk until I had a version that hard-coded the right
assumptions for my disk (which was *much* easier than doing the job
right by making it find out from the bios what the correct assumptions
were!)  I ran it, and everything is beautiful again.
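
To see why a wrong head count scrambles the table, here is a small sketch of
the usual cylinder/head/sector arithmetic with the geometry passed in rather
than hard-coded.  The struct and function names are hypothetical, the byte
packing is the conventional PC partition-entry layout, and the 17 sectors
per track in the note below is an assumption, not a figure from the post.

```c
#include <stdint.h>

struct geometry { unsigned heads; unsigned sectors_per_track; };
struct chs      { unsigned cyl; unsigned head; unsigned sector; };

/* Unpack the three CHS bytes of a PC partition entry: the head byte,
 * then sector in bits 0-5 with cylinder bits 8-9 in bits 6-7, then the
 * low 8 bits of the cylinder. */
struct chs unpack_chs(uint8_t head_byte, uint8_t sec_cyl_byte,
                      uint8_t cyl_low_byte)
{
    struct chs c;
    c.head   = head_byte;
    c.sector = sec_cyl_byte & 0x3F;
    c.cyl    = ((unsigned)(sec_cyl_byte & 0xC0) << 2) | cyl_low_byte;
    return c;
}

/* CHS to absolute sector number; sectors are numbered from 1. */
unsigned long chs_to_linear(struct chs c, struct geometry g)
{
    return ((unsigned long)c.cyl * g.heads + c.head) * g.sectors_per_track
           + (c.sector - 1);
}

/* Absolute sector number back to CHS for the given geometry. */
struct chs linear_to_chs(unsigned long lba, struct geometry g)
{
    struct chs c;
    c.sector = (unsigned)(lba % g.sectors_per_track) + 1;
    lba /= g.sectors_per_track;
    c.head = (unsigned)(lba % g.heads);
    c.cyl  = (unsigned)(lba / g.heads);
    return c;
}
```

With 17 sectors per track, cylinder 100, head 0, sector 1 comes out as
absolute sector 6800 when 4 heads are assumed but 8500 with 5 heads, so a
tool that hard-codes the wrong head count records start sectors and sizes
that do not match the real disk.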

Now "all" I have to do is backup my /dev/hd5 filesystem, run mkfs on
/dev/hd6, and reload.  Shouldn't take me more than a week.

Anybody want to hack fdisk to do the job *right* once and for all?
-- 

Rick Thomas
uucp: {ames, cbosgd, harvard, moss, seismo}!rutgers!jove.rutgers.edu!rbthomas
arpa: rbthomas@JOVE.RUTGERS.EDU
Phone: (201) 932-4301

Chuck_M_Grandgent@cup.portal.com (06/22/89)

> Did you ever find the answer to this?  I am having a similar problem.

> I have two hard disks in my AT.  One (the first one) I have reserved
> for DOS.  (DOS refuses to boot unless there is an active partition on
> the first hard disk, so I gave it the whole thing.) The second disk is
> for Minix.  In Minix it is called /dev/hd5 and the partitions are
> /dev/hd{6,7,8,9}

> While booting -- during the copy root to ramdisk phase -- I get the
> diagnostic "Illegal partition table" (always at block 45 for some
> reason)  and partitions 6,7,8 and 9 all have length zero according to
> "dd if=/dev/hd6 of=/dev/null"  (where 6 = one of 6,7,8,9)  DD claims
> to have copied "0+0" records.

I think I resolved my problem.
I think (but haven't confirmed) that I had one of my v1.2 header
files (modified for my change) in place when I built the v1.3
kernel.  Hence 1.3's attempt to use /dev/hd2 as the root device.
I'm hoping that once I've got ALL 1.3 files in place when I build
the kernel, it'll be all right....
=======================================================================
Chuck_M_Grandgent@cup.portal.com
sun!portal!cup.portal.com!chuck_m_grandgent                K1OM
AEG Modicon, Industrial Automation Systems Group
North Andover/Concord Massachusetts
=======================================================================