[comp.os.minix] Hard Disk Problem...

arthur@warwick.UUCP (02/25/87)

I would be grateful if anyone can help me with this one.

I posted a short while back saying I was having problems partitioning
my hard disk under MINIX. I have narrowed down the problem, and it appears
that MINIX cannot reliably access more than the first three or so megabytes
of the disk. If I create three one-megabyte partitions it works fine.
Also, when I was running without partitions, I started to get disk errors
once I had filled the first three megabytes; the fuller the disk, the more
errors. I have no problems when using MSDOS, so I think it must be a
software problem.

I assume it is some sort of incompatibility between the real PC controller
and the one in my machine. Mine is a Xebec; I am not sure if it has a model
number, and if it does I do not know it. If anyone can shed any light on this
problem I would be very grateful, as I don't have sufficient information
about the controller to fix this myself. What chipset does a real IBM XT use?

While I am at it I thought I'd mention a few more bugs:-

1) You can backspace over your prompt, I think this is a tty driver problem.

2) If the argument list to a program gets too long the shell complains
that it cannot execute the program. I understand that exec is meant to return
some error status telling it that the arg list was too big. I guess this is
a kernel problem(?)

3) $# and $0 are set wrongly in shell scripts. $# is one too big, and $0 is
set to /bin/sh rather than the name of the shell script. A colleague of mine
has fixed this and I guess he will post the fix soon.

4) tar won't read from standard input using the -, and rm won't let you
remove things that start with a -. Probably missing features rather than bugs
really.

5) C programs must explicitly flush all buffers before exiting. I don't know
if this is a V7ism or if it is a bug, as I don't have access to a V7 system.


I am sorry to keep reporting all these things, maybe I'll get round to
trying to fix some of them just as soon as I sort this hard disk out!


John.


UUCP: ..ukc!warwick!arthur          JANET: arthur@uk.ac.warwick.uu


"Na, that's not an operating system. THIS is an operating system!"

patwood@esquire.UUCP (02/28/87)

In article <496@ubu.warwick.UUCP>, arthur@warwick.UUCP (John Vaudin) writes:

> 1) You can backspace over your prompt, I think this is a tty driver problem.

Only the BSD driver doesn't allow you to backspace over your prompt.
On other versions of UNIX, the Korn shell prevents this; however, on vanilla
UNIX, you are allowed to backspace over your prompt.

Pat Wood

diamant@hpfclp.UUCP (03/01/87)

> 4) tar won't read from standard input using the -, and rm won't let you
> remove things that start with a -. Probably missing features rather than bugs
> really.

I don't think MINIX is to blame here for rm.  This is because if you say
something like "rm -foobar" it looks like an option to rm.  I have seen this
behavior in most UNIX systems.  You can do something like "rm -i *foobar" to
get rid of files like this.
> 
> 5) C programs must explicitly flush all buffers before exiting. I don't know
> if this is a V7ism or if it is a bug, as I don't have access to a V7 system.

This is a documented incompatibility (in the Libraries part of Appendix C,
pages 409-410).  The explanation is that to avoid including stdio in all
programs (to reduce their size), the MINIX C compiler doesn't flush stdio's
buffers since stdio may not be included.  I suspect that the same result
could have been accomplished without creating a source incompatibility by
one of two approaches:

	1) Use a compiler switch to tell the compiler not to include stdio
	   cleanup code.
	2) Include by default a very small cleanup routine that detects whether
	   stdio was included, and flushes the buffers only if the library
	   was loaded.
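
Approach 2 could be sketched roughly as follows. This is only an illustration, assuming a hypothetical hook pointer that the stdio code fills in the first time a stream is used; the names `stdio_cleanup_hook`, `stdio_first_use`, and `run_exit_cleanup` are invented here and do not come from the MINIX library.

```c
#include <stddef.h>

/* Hypothetical hook: stays NULL unless stdio has been used. */
void (*stdio_cleanup_hook)(void) = NULL;

/* For illustration only: counts how many times a flush was requested. */
int flush_count = 0;

/* Stand-in for the routine that would flush every open stdio buffer. */
void flush_all_buffers(void)
{
    flush_count++;
}

/* What the stdio layer would call the first time any stream is used. */
void stdio_first_use(void)
{
    stdio_cleanup_hook = flush_all_buffers;
}

/* What exit() would run before terminating: a few bytes of code that
 * do nothing at all for programs that never touched stdio. */
void run_exit_cleanup(void)
{
    if (stdio_cleanup_hook != NULL)
        stdio_cleanup_hook();
}
```

A program that never calls into stdio pays only for the pointer and the test in `run_exit_cleanup()`; the flushing code itself is pulled in only along with the rest of stdio.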

John Diamant
SCO				UUCP:  {hplabs,hpfcla}!hpfclp!diamant
Hewlett Packard Co.		ARPA Internet: diamant%hpfclp@hplabs.HP.COM
Fort Collins, CO

braun@m10ux.UUCP (03/02/87)

The way System V (and 4.2 BSD) lets you call exit() and link in the
code to flush stdio's buffers only when stdio is used is as follows:

Remember, main() is just another function.
The actual main program (csu.c or csu.s) works basically
like this:

/* First diddle stack pointer, etc. so main() can get environ, argc, argv */
r0 = main();
exit (r0);


Exit() looks like this:  (actually in assembler on sys V)

exit(exitval)
{
    _cleanup();
    _exit(exitval);
}

So far, so good. Csu makes it possible to not call exit() and still
flush things in _cleanup().  So the problem is: we need to load one of two
versions of _cleanup().  If we use stdio, we want a cleanup to flush buffers.
Otherwise, it should do nothing.  The first version is part of flsbuf.c
in the stdio library.  Since it is in the same object file, it will be loaded
whenever flsbuf.o is loaded, which is whenever you use stdio.
    Exit.o is located in libc.a after flsbuf.o, so if stdio is not
used, there will be no references to _cleanup() when flsbuf.o 
is searched by the loader, and it will not be loaded.
Of course if stdio is used, flsbuf.o (containing _cleanup())
will be loaded.
    When exit.o is searched by the linker, it will always be loaded,
because it is referenced by csu, the actual "main" program.
If we have used stdio, we have now loaded everything we need.
If not, we now have an outstanding reference to _cleanup(),
which must be the non-stdio version.
This dummy version of cleanup is located near the end of libc.a,
after flsbuf.o and exit.o.  This gets loaded only if there is an
outstanding reference to _cleanup(), which is only if stdio was not used.

Does this make any sense?  I imagine it will be obvious to some,
and incoherent to most others.  The important point is that
we can have everything work fine without any compiler switches,
or other action on the part of the user.   The only assumption
is that we have a loader which linearly searches the object library.
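
The library ordering described above can be modelled with a toy single-pass archive search. This is a sketch under stated assumptions, not a real loader: the member name `cleanup.o` for the dummy routine and the function `link_once` are hypothetical, and each member is reduced to the symbols it defines and the one symbol it references.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16
#define NMEMBERS 3

struct member {
    const char *name;        /* object file inside the archive */
    const char *defines[2];  /* symbols it defines (NULL = unused slot) */
    const char *needs;       /* symbol it references, or NULL */
};

/* Archive order from the post: flsbuf.o, then exit.o, then the dummy. */
static const struct member libc[NMEMBERS] = {
    { "flsbuf.o",  { "_flsbuf", "_cleanup" }, NULL       },
    { "exit.o",    { "exit", NULL },          "_cleanup" },
    { "cleanup.o", { "_cleanup", NULL },      NULL       },  /* hypothetical name */
};

struct symtab {
    const char *name[MAX_SYMS];
    int defined[MAX_SYMS];
    int n;
};

static int find(const struct symtab *t, const char *s)
{
    for (int i = 0; i < t->n; i++)
        if (strcmp(t->name[i], s) == 0)
            return i;
    return -1;
}

/* Record a reference: the symbol is added as undefined if not yet seen. */
static void reference(struct symtab *t, const char *s)
{
    if (find(t, s) < 0 && t->n < MAX_SYMS) {
        t->name[t->n] = s;
        t->defined[t->n] = 0;
        t->n++;
    }
}

static void define(struct symtab *t, const char *s)
{
    reference(t, s);
    int i = find(t, s);
    if (i >= 0)
        t->defined[i] = 1;
}

static int is_undefined(const struct symtab *t, const char *s)
{
    int i = find(t, s);
    return i >= 0 && !t->defined[i];
}

/* One linear pass over the archive. refs lists the library symbols the
 * user's program references. Returns a bitmask: bit i set means libc[i]
 * was loaded. */
unsigned link_once(const char *const *refs, int nrefs)
{
    struct symtab t = { {0}, {0}, 0 };
    unsigned loaded = 0;

    reference(&t, "exit");               /* csu always references exit */
    for (int i = 0; i < nrefs; i++)
        reference(&t, refs[i]);

    for (int i = 0; i < NMEMBERS; i++) {
        const struct member *m = &libc[i];
        int wanted = 0;
        for (int j = 0; j < 2; j++)
            if (m->defines[j] && is_undefined(&t, m->defines[j]))
                wanted = 1;
        if (!wanted)
            continue;                    /* nothing outstanding: skip member */
        loaded |= 1u << i;
        for (int j = 0; j < 2; j++)
            if (m->defines[j])
                define(&t, m->defines[j]);
        if (m->needs)
            reference(&t, m->needs);
    }
    return loaded;
}
```

With a reference to `_flsbuf` the model loads flsbuf.o and exit.o and skips the dummy; with no stdio references it skips flsbuf.o and loads exit.o followed by the dummy, which is the behaviour the post relies on.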


-- 

Doug Braun		AT+T Bell Labs, Murray Hill, NJ
m10ux!braun		201 582-7039

zemon@felix.UUCP (03/03/87)

In article <496@ubu.warwick.UUCP> arthur@ubu.UUCP (John Vaudin) writes:
>
>2) If the argument list to a program gets too long the shell complains
>that it cannot execute the program. I understand that exec is meant to
>return some error status telling it that the arg list was too big. I guess
>this is a kernel problem(?)

This is a design limitation.  If I remember correctly,
there is a buffer in one of the routines in the memory
manager called by do_exec() which is large enough to
contain the initial stack for a new program.  Since the
command line and the environment are pushed onto the stack,
I am not surprised that you overflowed the buffer.  When
this happens, the exec() fails.

There is a single constant to change in one of the MM's
header files and then recompile.  That should solve your
problem at the expense of a slightly larger MM task.

>5) C programs must explicitly flush all buffers before exiting. I don't know
>if this is a V7ism or if it is a bug, as I don't have access to a V7 system.

This is a designed-in incompatibility with Unix and is
mentioned in the appendix for Minix implementors.  C
programs must either flush all standard I/O buffers or call
cleanup() before calling exit().  The reason exit() doesn't
call cleanup() like on a Unix system is that most of the
utilities don't use standard I/O.  Did you ever wonder how
AST got most of the utilities down to just a few hundred bytes?

You can remove this incompatibility by changing exit() to
call cleanup() and then rebuilding libc.a.  You might want
to keep the old libc.a around, though, in case you ever
want to write small programs.
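
Until the library is changed, a program can protect itself by flushing explicitly. A minimal sketch in standard C, assuming nothing about the MINIX library beyond what is said above (the helper name `write_and_flush` is made up for the example):

```c
#include <stdio.h>

/* Write msg to fp and push it out of the stdio buffer immediately.
 * Returns 0 on success, EOF on failure. */
int write_and_flush(FILE *fp, const char *msg)
{
    if (fputs(msg, fp) == EOF)
        return EOF;
    return fflush(fp);
}
```

Calling something like `write_and_flush(stdout, "done\n")` as the last output before exit() means the text does not depend on exit() flushing anything.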


Let me wind up by telling you that all of my comments are
based on reading the book.  I'm still waiting for my
diskettes.  I'm sure no saint when it comes to reading all
the documentation before plunging head first into a neat
new program.  But when all I have is the documentation....
			:-)

Cheers,
-- 
	-- Art Zemon
	   FileNet Corporation
	   Costa Mesa, California
	   ...!hplabs!felix!zemon

rs@mirror.UUCP (03/03/87)

Anyone hacking on the assembler/loader?  The way many Unix systems
avoid calling in stdio when they don't need it is that they have two
routines named exit() in their library.  The one that comes *before*
printf, _flsbuf, sprintf, fread, etc., doesn't call cleanup().  The
exit that comes *after* those routines does.

This idea goes all the way back to (at least) Version 7.
--
Rich $alz					"Drug tests p**s me off"
Mirror Systems, Cambridge Massachusetts		rs@mirror.TMC.COM
{adelie, mit-eddie, ihnp4, harvard!wjh12, cca, cbosgd, seismo}!mirror!rs

ron@brl-sem.UUCP (03/04/87)

In article <9490002@hpfclp.HP.COM>, diamant@hpfclp.HP.COM (John Diamant) writes:
> buffers since stdio may not be included.  I suspect that the same result
> could have been accomplished without creating a source incompatibility by
> one of two approaches:
> 
> 	1) Use a compiler switch to tell the compiler not to include stdio
> 	   cleanup code.

This used to be how it was done.  Standard I/O was in its own library.
You specified it as an option to the loader.  This caused a different
version of exit to be loaded.

I notice that the MINIX stdio is not true to the V7 code, at least not
in tar, in that it runs buffered to a terminal.  It's a pain for tar to
do this.  V7 used to do isatty in making this determination, but there
are probably better ways.
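
The isatty approach might look something like this on a POSIX-style system. It is a sketch of the general idea rather than the V7 code; the policy (unbuffered for terminals, fully buffered otherwise) and the function names are assumptions made for the example.

```c
#include <stdio.h>
#include <unistd.h>

/* Pick a stdio buffering mode for the stream attached to descriptor fd:
 * unbuffered when it is a terminal, fully buffered otherwise. */
int choose_buffer_mode(int fd)
{
    return isatty(fd) ? _IONBF : _IOFBF;
}

/* Apply the chosen mode to fp, which must be the stream for fd and must
 * not have been used yet. Returns 0 on success, as setvbuf() does. */
int configure_stream(FILE *fp, int fd)
{
    return setvbuf(fp, NULL, choose_buffer_mode(fd), BUFSIZ);
}
```

A program such as tar would call `configure_stream(stdout, 1)` once at startup, before writing anything.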

-Ron

steve@warwick.UUCP (Steve Rumsby) (03/09/87)

In article <2371@felix.UUCP> you write:
>In article <496@ubu.warwick.UUCP> arthur@ubu.UUCP (John Vaudin) writes:
>>
>>2) If the argument list to a program gets too long the shell complains
>>that it cannot execute the program. I understand that exec is meant to
>>return some error status telling it that the arg list was too big. I guess
>>this is a kernel problem(?)
>
>This is a design limitation.  If I remember correctly,
>there is a buffer in one of the routines in the memory
>manager called by do_exec() which is large enough to
>contain the initial stack for a new program.  Since the
>command line and the environment are pushed onto the stack,
>I am not surprised that you overflowed the buffer.  When
>this happens, the exec() fails.
>
>There is a single constant to change in one of the MM's
>header files and then recompile.  That should solve your
>problem at the expense of a slightly larger MM task.
>

Not quite. Perhaps the description of the problem was not clear enough. The
exec call doesn't return the appropriate error code. The shell code checks
the appropriate thing, and works correctly on a real UNIX system. Under
MINIX, however, something strange happens; I can't remember what, as I was
only briefly shown the bug. How about a reminder, John?

					Steve.

-- 
_______________________________________________________________________________
|UUCP:	 ...!ukc!warwick!steve			| Steve Rumsby		      |
|JANET:	 steve@uk.ac.warwick.maths		| Maths Institute	      |
|ARPA:	 steve%uk.ac.warwick.maths@ucl-cs.ARPA	| University of Warwick	      |
|BITNET: steve%uk.ac.warwick.maths@UK.AC	| Coventry		      |
|						| CV4 7AL		      |
|PHONE:	 +44 203 523523 x2657			| ENGLAND		      |
-------------------------------------------------------------------------------

"For every problem there is one solution which is simple, neat, and wrong."
								-- H. L. Mencken


arthur@warwick.UUCP (John Vaudin) (03/10/87)

I think I confused several people with my rather vague posting, so to
clarify things a bit:

The problem with having to flush buffers in C programs is indeed mentioned
in the book, so I apologise for that one. My excuse (and I'm sticking
to it :-) is that I had the software a long time before I got the book.
Some of the fixes suggested seem quite interesting; I must try implementing
them just as soon as I get this hard disk going!

The problem with exec, as Steve points out, is not that there is a finite
stack limit, but that the kernel does not return the correct status when
the limit is exceeded. Thus instead of getting the message "argument list too
long" you get the message "fnurd: cannot execute", which is not at all obvious.
This is not a shell bug: when the MINIX shell is run on a real UNIX
system (it worked first time) it produces the correct result.
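
The shell-side check being described could be sketched like this. It assumes the kernel sets errno to E2BIG when the argument list is too large, which is the behaviour expected on a real UNIX system; the function name and the exact message strings are illustrative and are not taken from the MINIX shell source.

```c
#include <errno.h>

/* Map the errno left behind by a failed exec to the text a shell might
 * print after the command name. */
const char *exec_failure_reason(int err)
{
    switch (err) {
    case E2BIG:
        return "argument list too long";
    case ENOENT:
        return "not found";
    default:
        return "cannot execute";
    }
}
```

If exec fails without setting E2BIG, a check like this can only fall through to the generic "cannot execute" case, which matches the symptom reported.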

The problem with rm, which is not a bug, is simply that it lacks the facility
to say rm - -foobar to remove -foobar. I didn't expect to be able to say 
just rm -foobar. 
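
For anyone wanting to add the missing facility, the option loop only needs to treat a lone "-" as the end of the options. A minimal sketch, with a hypothetical helper name and no claim to match the actual MINIX rm source:

```c
/* Return the index of the first file operand in argv.
 * A lone "-" ends option processing, so everything after it is taken as
 * a file name even if it begins with '-'. */
int first_operand(int argc, char *const argv[])
{
    int i = 1;

    while (i < argc && argv[i][0] == '-') {
        if (argv[i][1] == '\0')   /* lone "-": stop parsing options */
            return i + 1;
        i++;                      /* an ordinary option such as -f or -r */
    }
    return i;
}
```

With the arguments "rm", "-", "-foobar" this returns 2, so -foobar is handled as a file rather than as a string of option letters.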

By the way Minix does indeed run on an Amstrad PC, but it does not seem to
work with the hard disk at all. Probably a new driver required I guess.
Just thought you'd like to know.

John.

UUCP: ..!ukc!warwick!arthur     JANET: arthur@uk.ac.warwick.uu

"Na, that's not an operating system. THIS is an operating system!"

zimmer@Shasta.UUCP (05/05/87)

I am having problems using the hard disk of an IBM XT (genuine) with Minix.
After creating a Minix partition following the instructions included
with the disks, I try to copy the /usr file system to the hard disk.
This results in one of two things happening:
1. I get a Winchester read error, and Minix remains running, or
2. Minix goes into a hard crash, constantly printing a message to
   the screen.  This message is sometimes about an unexpected interrupt.

Since Minix does copy some of the files over correctly, and can read 
them back, I suspect that the problem might be bad sectors on the hard disk.
(DOS does report bad sectors on the disk when I format it.)  Does Minix
have a facility to map bad sectors and to mark them unusable?  


reply to zimmer@shasta.stanford.edu

james@alberta.UUCP (05/06/87)

In article <1569@Shasta.STANFORD.EDU> zimmer@Shasta.STANFORD.EDU (Andrew Zimmerman) writes:
>I am having problems using the hard disk of an IBM XT (genuine) with Minix.
>After creating a Minix partition following the instructions included
>with the disks, I try to copy the /usr file system to the hard disk.
>This results in one of two things happening:
...
I am having similar problems with my no-name AT clone.  It may be
something to do with the file system.  When I copy a file over, it is fine
(as determined by the cmp command).  Later, when I copy over another
file, one of my earlier files may get trounced (usually only partially).
This also happens when I use the dd command.  It gets really frustrating
as it seems to be somewhat deterministic: no matter what I try, I can't
port over all of the files I need to make the C compiler work.
(This is happening on my hard disk partition 2; the DOS partition
seems to work just fine.)

I do not believe that bad sectors are causing the problem.
james@alberta
PS. If anyone is interested, I will post a short program to read the
AT real time clock - useful for setting the date.

ast@cs.vu.nl (Andy Tanenbaum) (05/07/87)

In article <1569@Shasta.STANFORD.EDU> zimmer@Shasta.STANFORD.EDU (Andrew Zimmerman) writes:
>I am having problems using the hard disk of an IBM XT (genuine) with Minix.
>Does MINIX have a facility to map bad sectors and to mark them unusable?  
>
Unfortunately, no.  It assumes the disk is perfect.

Andy Tanenbaum (ast@cs.vu.nl)

egisin@orchid.UUCP (05/07/87)

In article <1171@botter.cs.vu.nl>, ast@cs.vu.nl (Andy Tanenbaum) writes:
> >Does MINIX have a facility to map bad sectors and to mark them unusable?  
> Unfortunately, no.  It assumes the disk is perfect.

It is possible to write a utility that allocates bad blocks to
a regular file.  Call the file something like ..badblocks
so that it doesn't accidentally get read.

stuart@bms-at.UUCP (05/12/87)

Bad sectors can be handled by flagging them in the bit map (unless the
superblock or a bitmap block are bad).  The extra space in the super
block should be used to save a list of bad zones.  There are two 
approaches:

	1) modify fsck to know about the bad block table.  Write
	   a utility to maintain the list in the super-block.
	2) write a utility that reads a list of bad blocks and
	   turns them into files on an unmounted file system.
	   This does not handle bad blocks in the inode table.
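
The core of either approach is marking the bad zones as allocated so that the file system never hands them out. A rough sketch of that step on an in-memory bitmap follows; the bit numbering, the function names, and the idea of passing zone numbers directly are assumptions for illustration and do not model the real MINIX on-disk layout.

```c
#include <limits.h>
#include <stddef.h>

/* Mark one zone as in use. Returns 0 on success, -1 if out of range. */
int mark_zone_used(unsigned char *bitmap, size_t nzones, size_t zone)
{
    if (zone >= nzones)
        return -1;
    bitmap[zone / CHAR_BIT] |= (unsigned char)(1u << (zone % CHAR_BIT));
    return 0;
}

/* Report whether a zone is marked in use (0 for out-of-range zones). */
int zone_is_used(const unsigned char *bitmap, size_t nzones, size_t zone)
{
    if (zone >= nzones)
        return 0;
    return (bitmap[zone / CHAR_BIT] >> (zone % CHAR_BIT)) & 1;
}

/* Mark every zone in the bad list; returns how many were marked. */
size_t mark_bad_zones(unsigned char *bitmap, size_t nzones,
                      const size_t *bad, size_t nbad)
{
    size_t marked = 0;

    for (size_t i = 0; i < nbad; i++)
        if (mark_zone_used(bitmap, nzones, bad[i]) == 0)
            marked++;
    return marked;
}
```

A real utility would also have to attach the zones to an inode (or record them in the super-block) so that fsck does not report them as allocated but unreferenced.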
-- 
Stuart D. Gathman	<..!seismo!dgis!bms-at!stuart>

jjc@sdiris1.UUCP (Jim J. Carter) (08/03/87)

I have been trying to track down a problem with my system's hard
disk (driver) and have come to a result many others may be interested in.

   The disk was formatted with the controller's BIOS "format" program
   by executing under MS-DOS
C> debug
-g=c800:5
   ... etc format format format etc ...

   ... format bad tracks ? ...

PROBLEM :
To this I answered Yes, and filled in the bad cyl/heads that the disk
manufacturer (Seagate) had told me about.  Anyway, to make a long story
short, I have been receiving
   Unrecoverable Disk errors on device 3/3 ( the third partition on disk 0 )

ACTION :
I called a print routine in w_reset() to dump the command and results
that led to w_reset() being called.  Behold, the cyl/head was
one of the ones that Seagate told me was bad.

The actual error sent back from the controller is
	wn_results[0] = 99h
	wn_results[1] = 01h
	wn_results[2] = 4fh
	wn_results[3] = 70h

These values break down as :
	Error # = 19h 
    The OEM manual for my controller, a WD1002A-WX1, says :
  Track is Flagged Bad.  A sector had been encountered that has the bad
  block mark set in the ID field.  The format Bad Track command records
  this bit in all sectors of the designated track, flagging them as bad.
  No retries are attempted in response to this error.

	AV = 1
	Drive = 0
	head = 1
	sector = 15
	cyl    = 368

This cyl/head just happens to be one of the "hard" errors on this drive.
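
The breakdown above can be reproduced with a few shifts and masks. The field layout used here is inferred from the four bytes and the decoded values in this post, not copied from the controller manual, so treat it as an assumption; `struct sense` and `decode_sense` are names invented for the example.

```c
#include <stdint.h>

struct sense {
    int address_valid;  /* AV flag */
    int error_code;     /* 6-bit error number */
    int drive;
    int head;
    int sector;
    int cylinder;       /* 10 bits: two high bits live in byte 2 */
};

/* Decode the four result bytes returned by the controller. */
struct sense decode_sense(const uint8_t r[4])
{
    struct sense s;

    s.address_valid = (r[0] >> 7) & 1;
    s.error_code    = r[0] & 0x3F;
    s.drive         = (r[1] >> 5) & 1;
    s.head          = r[1] & 0x1F;
    s.sector        = r[2] & 0x3F;
    s.cylinder      = ((r[2] & 0xC0) << 2) | r[3];
    return s;
}
```

Fed the bytes 99h, 01h, 4fh, 70h, this gives error 19h, AV 1, drive 0, head 1, sector 15, and cylinder 368, matching the figures listed.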

It took me so long to narrow down the problem because I have not been able
to "develop" on my own machine and I have had to use someone else's machine.

QUESTION :
  Does someone have a program that will mark bad tracks as Used or Error ?
	Or 
Is there a fix to the disk driver to do something with this type of error?

-----------------------------------------------------------------------
UUCP: ...!hp-sdd!crash!sdiris1!jim     |  Jim Carter 
 or:  ...!sdcsvax!jack!man!sdiris1!jim |  Control Data Corporation (CIM)
Work : +1 619 450 6516                 |  4455 Eastgate Mall, 
Home : +1 619 455 0607                 |  San Diego, CA  92121

baugh@hal.CSS.GOV (Nam Myoho Renge Kyo) (10/07/89)

'Folks,
	
	Still looking for some help/suggestions on how to get minix
up & running on my AT...again the problem was :

		Error: put_block couldn't write
		Line 1 being processed when error detected.

	Now, here are some other facts that may help or be part of the
problem.  First, the disks were set up with Seagate's OnTrack Disk
Manager (DM).  I've looked at the Seagate 225 (20MB) with fdisk
(MSDOS 3.3) and it says it has a non-DOS partition, but won't let me
remove it.  Is it possible to get rid of a non-DOS partition with fdisk?
The Seagate 251 (40MB) is divided in half: one DOS partition, and
one that is non-DOS according to fdisk.  With the device driver
installed, the 40MB looks like C: and D:, and the 20MB like E:.  Without
the driver, one time I couldn't get to D or E, but C looked fine.  Another
time I could get to the first partition of the 40MB (C) and the
20MB, but the 20MB was accessed as D, not E, and the second partition of the
40MB was unavailable.  (I don't know what I did differently, though I was
playing with fdisk, so the time I couldn't get to the C drive may have been
because of this.)  The disk controller is a genuine IBM controller (from a
6 MHz PC-AT) and works fine under DOS.  Am I experiencing a MINIX 1.1 at_wini
problem, or does it look like hardware?

	When I attempted to mkfs on my 2nd hard disk, I am pretty sure that
/dev/hd6 was created OK (though in 1.1 I may be incorrect), so could some
kind soul send me a 1.4a at_wini.c or whatever you think might be causing this?
I just need to get the drive up with a file system so I can begin the 'upgrade',
since I only have one 1.2MB floppy in the system.  (Rebuilding the kernel on a
single 1.2MB floppy isn't fun, but it won't be as bad this time since I've saved
the setup from the last time.)  Also, given that I have access to a bunch of fdisk
versions, which is the best to use?  3.3?  I'm ready to start from scratch on the
disk, so I'd also appreciate hearing about any low-level formatting, etc., that
it might be smart for me to do now.

	I'd really like to get this up and going before the 1.4b postings...
	thanks.

-- 
	Earl D. Baugh Jr.
	ENSCO INC.
	Internet : baugh@hal.CSS.GOV
	"Open the pod bay doors...."