[comp.sys.att] Looks like a bug in the 7300 disk driver

pfales@ttrde.UUCP (Peter Fales) (01/09/89)

I have discovered what appears to be a bug in the hard disk device driver
for the unix-pc.  To demonstrate the bug, open the device "/dev/rfp000",
using open(2) and read four bytes (one int) using read(2).  The
read will actually return 512 bytes!  However, if "/dev/fp000" is used,
the correct value of 4 is returned.

I am using version 3.51 of the operating system.  If anyone wants to try
out the following program on another version of the operating system
I would be interested in their results:

---------------------------------- cut here -----------------------------

#include <stdio.h>
#include <fcntl.h>

int	disk;

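main(argc, argv)
int argc;
char *argv[];
{
	/* x is far larger than one int, so a 512-byte transfer cannot overrun it */
	int 	i,x[1024];
	
	disk=open("/dev/rfp000",O_RDONLY);
	if ( disk == -1 ) {
		fprintf(stderr,"Unable to open device /dev/rfp000\n");
		exit(2);
	}

	printf("File is %d\n",disk);
	/* request only four bytes (one int) from the raw device */
	i=read(disk,x,sizeof(int));
	printf("Read returned %d\n",i);
	/* x[1]..x[3] show whether more than the requested int was written */
	printf("First four ints are %x, %x, %x, %x\n",x[0],x[1],x[2],x[3]);
	exit(0);
}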
-- 
Peter Fales			AT&T, Room 2F-217
				200 Park Plaza
UUCP:	...att!ttrde!pfales	Naperville, IL 60566
Domain: pfales@ttrde.att.com	work:	(312) 416-5357		

cjc@ulysses.homer.nj.att.com (Chris Calabrese[mav]) (01/10/89)

In article <813@ttrde.UUCP>, pfales@ttrde.UUCP (Peter Fales) writes:
| I have discovered what appears to be a bug in the hard disk device driver
| for the unix-pc.  To demonstrate the bug, open the device "/dev/rfp000",
| using open(2) and read four bytes (one int) using read(2).  The
| read will actually return 512 bytes!  However, if "/dev/fp000" is used,
| the correct value of 4 is returned.

Correct me (not flame me :-) if I'm wrong, but I believe
that  /dev/rfp000 is the 'raw', or block version of /dev/fp000.
As such, it has no idea of 'bytes', but only 'blocks', which
happen to be 512 'bytes' in this case.
-- 
	Christopher J. Calabrese
	AT&T Bell Laboratories
	att!ulysses!cjc		cjc@ulysses.att.com

jcm@mtunb.ATT.COM (was-John McMillan) (01/10/89)

In article <813@ttrde.UUCP> pfales@ttrde.UUCP (Peter Fales) writes:
>
>I have discovered what appears to be a bug in the hard disk device driver
>for the unix-pc.
...
It's a FEATURE, not a bug.  Moreover, this feature will never be changed ;^}

Consider TAPE drives: in many systems, you can only communicate to the
drive in EVEN byte multiples -- unless things have changed in the last
twenty years since I've used a tape drive %v).  Is this an error?

In the UNIX-PC, the RAW -- read that "character-device" -- interface
to the disk directly maps transfers into user RAM.  Therefore, since
transfers to/from the disk are in 512 byte blocks, ya can only read
in 512-byte multiples.

In some other systems, boundary cases -- un-aligned first and last blocks
-- are read into kernel buffers and NOT into user-space.  This permits
Peter's anticipated results.

Finally (fat chance of this ;-):
1)	TSK-TSK -- users are NOT supposed to be accessing the RAW disk
	at all.  And if this is permitted... the special characteristics
	of the RAW driver have to be coped with!

2)	I believe I've seen this feature documented -- and isn't that
	enough to satisfy everyone?

3)	Lord[s], let us not descend into the hell of other groups that
	have wasted untold kilo-bucks thrashing unresolvable differences
	of opinion over system features -- not to mention address-to-index
	computations by name.

PS: I haven't pursued the above issue through the source code so the above
drivel may be the first flawed argument of my life -- oops, I meant HOUR.

jc mcmillan	-- att!mtunb!jcm	-- just frothing for myself, not THEM.

karl@mstar.UUCP (Karl Fox) (01/10/89)

Come on, folks!  If you provide a 4-byte buffer to read, it is a BUG to
write 512 bytes to it!  Sure, the raw disk device has size limitations,
but it should never round up.  The driver should probably return EINVAL
if the size isn't a multiple of 512.
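
A minimal sketch of the kind of check Karl describes, assuming a 512-byte
sector.  The names raw_count_ok and SECTOR_SIZE are hypothetical and are not
taken from the 7300 driver source; this is user-level C that only illustrates
the rule "reject, never round up":

```c
#include <errno.h>
#include <stddef.h>

#define SECTOR_SIZE 512   /* assumed sector size, as discussed in the thread */

/*
 * Hypothetical length check for a raw transfer.
 * Returns 0 if count is a whole number of sectors; otherwise sets
 * errno to EINVAL and returns -1 instead of rounding the request up.
 */
int raw_count_ok(size_t count)
{
    if (count % SECTOR_SIZE != 0) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

With a check like this, the 4-byte request from the original program would
fail with EINVAL, while 512- and 1024-byte requests would pass.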
-- 
Karl Fox, Morning Star Technologies
UUCP:     osu-cis!mstar!karl -or- pyramid!mstar!karl -or- sequent!mstar!karl
Internet: osu-cis!mstar!karl@tut.cis.ohio-state.edu

andys@genesis.ATT.COM (a.b.sherman) (01/11/89)

In article <1360@mtunb.ATT.COM> jcm@mtunb.UUCP (was-John McMillan) writes:
>In article <813@ttrde.UUCP> pfales@ttrde.UUCP (Peter Fales) writes:
>>
>>I have discovered what appears to be a bug in the hard disk device driver
>>for the unix-pc.
(R) UNIX is a registered trademark of AT&T
>...
>It's a FEATURE, not a bug.  Moreover, this feature will never be changed ;^}
>

It is *NOT* a feature; it's a bug.  See below.

>Consider TAPE drives: in many systems, you can only communicate to the
>drive in EVEN byte multiples -- unless things have changed in the last
>twenty years since I've used a tape drive %v).  Is this an error?

You might be excessively DEC oriented.  Stuff for the PDP-11 and VAX
does its DMA based on a *WORD* count, hence even numbers of bytes
are required.  This is not exclusive to tape.  Actually, since most
tape is 9-track, the drive reads a byte at a time. (8 bits + parity).
>
>In the UNIX-PC, the RAW -- read that "character-device" -- interface
>to the disk directly maps transfers into user RAM.  Therefore, since
>transfers to/from the disk are in 512 byte blocks, ya can only read
>in 512-byte multiples.
>

Why a CHARACTER device must program its DMA device in BLOCK
multiples escapes me.  Regardless of the block size, you can always
tell a DMA controller to transfer X bytes or X words.

>In some other systems, boundary cases -- un-aligned first and last blocks
>-- are read into kernel buffers and NOT into user-space.  This permits
>Peter's anticipated results.
>

Which, the 512 byte read on rfd000 or the 4 byte read on fd000?

>Finally (fat chance of this ;-):
>1)	TSK-TSK -- users are NOT supposed to be accessing the RAW disk
>	at all.  And if this is permitted... the special characteristics
>	of the RAW driver have to be coped with!

Why shouldn't I do raw I/O on the disk?  Maybe I want to make a raw
slice for my own special purposes, like building a TUXEDO database.
Why put the driver entry there if it's not to be used?  Features
that are not intended to be used do not belong in the system.

>
>2)	I believe I've seen this feature documented -- and isn't that
>	enough to satisfy everyone?

Where is it documented?  It is not on the manual pages for read(2),
open(2), gd(7).  Even so, documented brain damage is still brain
damage.

>
>3)	Lord[s], let us not descend into the hell of other groups that
>	have wasted untold kilo-bucks thrashing unresolvable differences
>	of opinion over system features -- not to mention address-to-index
>	computations by name.
>

Again I repeat, this is a bug.  Calling it a feature doesn't make it
any less a bug.

>PS: I haven't pursued the above issue through the source code so the above
>drivel may be the first flawed argument of my life -- oops, I meant HOUR.
>
>jc mcmillan	-- att!mtunb!jcm	-- just frothing for myself, not THEM.


-- 
andy sherman / at&t bell laboratories (medical diagnostic systems)
room 2e-108 / 185 monmouth pkwy / west long branch, nj 07764-1394
(201) 870-7018 / andys@shlepper.ATT.COM
...The views and opinions are my own.  Who else would want them?

jcm@mtunb.ATT.COM (was-John McMillan) (01/11/89)

ASBESTOS-DONNED...

In article <513@genesis.ATT.COM> andys@shlepper.ATT.COM (a.b.sherman) writes:
...

>You might be excessively DEC oriented.

	... yup... I might be.  Or at least raised thereupon.

>Why a CHARACTER device must program its DMA device in BLOCK
>multiples escapes me.  Regardless of the block size, you can always
>tell a DMA controller to transfer X bytes or X words.

	... maybe... but, when ya buy a tres inexpensive computer -- made
	possible by (supposed) cost savings in every corner they could
	think of -- ya get things like the DMA in the 7300 which is NOT
	TOTALLY GENERAL.  While I think it COULD be trained to transfer
	N-WORDS, I doubt it could do this at anything but a SECTOR
	boundary -- which is a severely non-general case.

>>In some other systems, boundary cases -- un-aligned first and last blocks
>>-- are read into kernel buffers and NOT into user-space.  This permits
>>Peter's anticipated results.
>>
>
>Which, the 512 byte read on rfd000 or the 4 byte read on fd000?

	Peter: Wasn't your only surprise from "/dev/rfp000" accesses?
		If I missed your point, I apologize profusely.
	Andy:  If the boundary cases are read into temporary kernel
		buffers, then one can permit "N" byte RAW I/O to/from
		user-space by just bcopy-ing across.  The disk still
		requires full PHYSICAL-block writes and SECTOR-aligned
		reads.  Aligned, block transfers better be working!
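
The bounce-buffer idea described above can also be applied at user level.  The
following is only a sketch under stated assumptions: a 512-byte sector, a file
descriptor already positioned on a sector boundary, and a request no larger
than one sector.  The name read_first_bytes is hypothetical.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 512   /* assumed sector size */

/*
 * Read one whole sector into a private buffer, then copy only the
 * first n bytes to the caller.  Returns the number of bytes copied,
 * or -1 on error (errno is EINVAL if n exceeds one sector).
 */
ssize_t read_first_bytes(int fd, void *dst, size_t n)
{
    unsigned char sector[SECTOR_SIZE];
    ssize_t got;

    if (n > SECTOR_SIZE) {
        errno = EINVAL;
        return -1;
    }
    got = read(fd, sector, SECTOR_SIZE);   /* always a full-sector request */
    if (got < 0)
        return -1;
    if ((size_t)got < n)                   /* short read: copy what arrived */
        n = (size_t)got;
    memcpy(dst, sector, n);
    return (ssize_t)n;
}
```

Calling read_first_bytes(disk, x, sizeof(int)) in place of the read() in the
original program should touch only four bytes of x, assuming the raw device
accepts a full-sector request at the current offset.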

>>Finally (fat chance of this ;-):
>>1)	TSK-TSK -- users are NOT supposed to be accessing the RAW disk
>>	at all.  And if this is permitted... the special characteristics
>>	of the RAW driver have to be coped with!
>
>Why shouldn't I do raw I/O on the disk?  Maybe I want to make a raw
>slice for my own special purposes, like building a TUXEDO database.
>Why put the driver entry there if it's not to be used?  Features
>that are not intended to be used do not belong in the system.

	I've not the SLIGHTEST concern about your use of RAW disk
	accesses.  "TSK-TSK", as a serious social criticism,
	went out with knickers: I presumed this was a giveaway that
	you can make NON-STANDARD DISK ACCESSES AT YOUR OWN RISK.

	I'm sorry you don't run FSCK, IV, or other system codes that use
	the RAW disk entries.  I do.  And, despite the pain, the system
	codes that do use the RAW entries work!  Mirabile visu!  So,
	I don't think I'll recommend that these features be pulled... yet!

>>2)	I believe I've seen this feature documented -- and isn't that
>>	enough to satisfy everyone?
>
>Where is it documented?  It is not on the manual pages for read(2),
>open(2), gd(7).  Even so, documented brain damage is still brain
>damage.

	O no.  So much for forgetting the ";^)"  Guess I thought the
	phrasing was, again, a giveaway.  So much for humor, let me
	present, then, a concise synopsis of how I feel about the
	issue of RAW I/O to the 3B1 hard-disk:

	a) It should work reliably.  I believe it does.
	b) It should be documented well: "better" is the operative word.
	c) I'd prefer that it were more general, but accept it as it is.
	d) I note: this is the first time I've heard a complaint on this
		in 4 yrs.
	e) It sounds like the Surprised_Party[0], Peter, has more grace
		than the Surprised_Party[1].
	f) I'd really be surprised if, knowing the limitations, these
		really place a meaningful limit on codes.
	g) I'm inclined to help Peter, if he asks for advice -- I'm just
		quite certain Product Management wouldn't risk diddling
		the disk-driver.

>>3)	Lord[s], let us not descend into the hell of other groups that
>>	have wasted untold kilo-bucks thrashing unresolvable differences
>>	of opinion over system features -- not to mention address-to-index
>>	computations by name.
>
>Again I repeat, this is a bug.  Calling it a feature doesn't make it
>any less a bug.

	Confusing personal philosophy with critical insight is a curse.
	I wish you a cure, and hope I can offer you a warm blanket on some
	other issue, Andy.

NOMEX OFF -- Sweaty things, but necessary apparel.  Meanwhile, I'll try
to help where I can.

jc mcmillan	-- att!mtunb!jcm	-- speaking for himself, at most.

jcm@mtunb.ATT.COM (was-John McMillan) (01/11/89)

In article <983@mstar.UUCP> karl@mstar.UUCP (Karl Fox) writes:
>Come on, folks!  If you provide a 4-byte buffer to read, it is a BUG to
>write 512 bytes to it!  Sure, the raw disk device has size limitations,
>but it should never round up.  The driver should probably return EINVAL
>if the size isn't a multiple of 512.

	OK.  I agree.  But I live in terror of making this patch and
	finding it breaks someone's code.

jc mcmillan

sitongia@hao.ucar.edu (Leonard Sitongia) (01/11/89)

In article <513@genesis.ATT.COM> andys@shlepper.ATT.COM (a.b.sherman) writes:
>In article <1360@mtunb.ATT.COM> jcm@mtunb.UUCP (was-John McMillan) writes:
>>In article <813@ttrde.UUCP> pfales@ttrde.UUCP (Peter Fales) writes:
>>>
>>>I have discovered what appears to be a bug in the hard disk device driver
>>>for the unix-pc.
>(R) UNIX is a registered trademark of AT&T
>>...
>>It's a FEATURE, not a bug.  Moreover, this feature will never be changed ;^}
>>
>
>It is *NOT* a feature; it's a bug.  See below.
>
[ALSO DONNING SUPER HIGH-TECH FLAME RETARDANT NOMEX GEAR]
[hot stuff deleted]

Seems to me that having raw devices do I/O on 512-byte blocks is pretty
standard.  I've seen it in BSD and SYSTEM-V.  Seems to me that this is
just the way it is done.  P. Fales has now learned a convention.  What
is all this argument about?  The original question was indicative of
unawareness about this convention.  That's it.  

[LEAVING FLAME RETARDANT GEAR ON FOR A WHILE]

-Leonard E. Sitongia    System Programmer		 (303) 497-1509
USPS Mail: High Altitude Observatory P.O. Box 3000 Boulder CO  80307
Internet:               sitongia@hao.ucar.edu
SPAN:			NSFGW::"hao.ucar.edu!sitongia"	[NSFGW=9580]

les@chinet.chi.il.us (Leslie Mikesell) (01/11/89)

>>In the UNIX-PC, the RAW -- read that "character-device" -- interface
>>to the disk directly maps transfers into user RAM.  Therefore, since
>>transfers to/from the disk are in 512 byte blocks, ya can only read
>>in 512-byte multiples.

For even more fun try writing from a buffer in a shared memory
segment -- the machine locks up completely.  (I ran into this
while trying to write a high-speed multi-process dump eventually
intended for a 3B2 tape drive.  After crashing the 3B2 by putting
the tape in streaming mode and writing too large a block, I thought
I would work the bugs out on a 3B1. Oh, well..)

Les Mikesell

ditto@cbmvax.UUCP (Michael "Ford" Ditto) (01/11/89)

In article <813@ttrde.UUCP> pfales@ttrde.UUCP (Peter Fales) writes:
>I have discovered what appears to be a bug in the hard disk device driver
>for the unix-pc.

In article <513@genesis.ATT.COM> andys@shlepper.ATT.COM (a.b.sherman) writes:
>It is *NOT* a feature; it's a bug.  See below.
[ ... ]
>Why a CHARACTER device must program its DMA device in BLOCK
>multiples escapes me.  Regardless of the block size, you can always
>tell a DMA controller to transfer X bytes or X words.

Well, there's your problem:  It's not a CHARACTER device.  The /dev/r*
devices are RAW device interfaces; that's what the 'r' stands for.  When
you access any raw device, you have to do it in accordance with that
particular device's physical requirements.  In the case of the Unix PC's
hard disk interface (and those of most other computers), partial block
transfers are not possible.  The so-called "block" device can be used
for more "structured" access (like 4 bytes at a time, etc.).
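
Whatever one calls them, on most Unix systems the raw nodes appear as
character special files and the buffered ones as block special files.  A small
sketch, using only stat(2) and the standard S_ISCHR/S_ISBLK macros, shows how
a program could tell which kind of node it has been given.  The enum and
function names are hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>

enum dev_kind { DEV_ERROR = -1, DEV_OTHER = 0, DEV_CHAR = 1, DEV_BLOCK = 2 };

/*
 * Classify a path by its file type:
 *   DEV_CHAR  - character special file (how raw disk nodes usually appear)
 *   DEV_BLOCK - block special file (the buffered interface)
 *   DEV_OTHER - anything else (regular file, directory, ...)
 *   DEV_ERROR - stat() failed
 */
enum dev_kind device_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return DEV_ERROR;
    if (S_ISCHR(st.st_mode))
        return DEV_CHAR;
    if (S_ISBLK(st.st_mode))
        return DEV_BLOCK;
    return DEV_OTHER;
}
```

A program that gets DEV_CHAR for a disk node would then know to issue only
sector-sized, sector-aligned requests.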

According to the "Unix Implementation" paper by Ken Thompson, the term
"character I/O" is "a complete misnomer".
-- 
					-=] Ford [=-

"The number of Unix installations	(In Real Life:  Mike Ditto)
has grown to 10, with more expected."	ford@kenobi.cts.com
- The Unix Programmer's Manual,		...!sdcsvax!crash!elgar!ford
  2nd Edition, June, 1972.		ditto@cbmvax.commodore.com

clewis@ecicrl.UUCP (01/12/89)

In article <1365@mtunb.ATT.COM> jcm@mtunb.UUCP (was-John McMillan) writes:
>In article <983@mstar.UUCP> karl@mstar.UUCP (Karl Fox) writes:
>>Come on, folks!  If you provide a 4-byte buffer to read, it is a BUG to
>>write 512 bytes to it!  Sure, the raw disk device has size limitations,
>>but it should never round up.  The driver should probably return EINVAL
>>if the size isn't a multiple of 512.
>
>	OK.  I agree.  But I live in terror of making this patch and
>	finding it breaks someone's code.

Which is nothing compared to the terror of discovering that someone
has exploited this bug to overwrite his own user structure and modify
and/or crash the kernel.  (In some kernels the user area is in a protected
area at the top of a process's stack)

This is an integrity hole if not security hole.

Hint: what's in the other 508 bytes?  If the driver is so stupid
as to *round up* certain requests, how can we be sure that it's actually
checking the bounds of *any* I/O request?
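
One way to probe for this from user level is a guard-region test: pad the
buffer with a known pattern, issue the short read, and see whether anything
past the requested length changed.  This is a sketch only; read_overruns,
GUARD_BYTES and CANARY are hypothetical names, and the test can miss an
overrun if the data on disk happens to match the pattern.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define GUARD_BYTES 512    /* room for one assumed 512-byte sector */
#define CANARY      0xA5   /* arbitrary fill pattern */

/*
 * Returns 1 if read() reported or wrote more than `want` bytes,
 * 0 if the guard region is intact, -1 on allocation or read error.
 */
int read_overruns(int fd, size_t want)
{
    unsigned char *buf = malloc(want + GUARD_BYTES);
    ssize_t got;
    size_t i;
    int overrun = 0;

    if (buf == NULL)
        return -1;
    memset(buf, CANARY, want + GUARD_BYTES);

    got = read(fd, buf, want);
    if (got < 0) {
        free(buf);
        return -1;
    }
    if ((size_t)got > want)            /* return value larger than request */
        overrun = 1;
    for (i = want; i < want + GUARD_BYTES; i++) {
        if (buf[i] != CANARY) {        /* bytes past the request were touched */
            overrun = 1;
            break;
        }
    }
    free(buf);
    return overrun;
}
```

Because the buffer really is want + GUARD_BYTES long, a driver that rounds a
4-byte request up to one sector lands in the guard region rather than in
unrelated memory.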

This is analogous to the buffer overrun of "gets()" which someone
(who shall remain nameless) used to inject a virus into the internet.

Sheesh.

This bug should be officially reported.
-- 
Chris Lewis, Markham, Ontario, Canada
{uunet!attcan,utgpu,yunexus,utzoo}!lsuc!ecicrl!clewis
Ferret Mailing list: ...!lsuc!gate!eci386!ferret-request
(or lsuc!gate!eci386!clewis or lsuc!clewis)

andys@genesis.ATT.COM (a.b.sherman) (01/13/89)

In article <5660@cbmvax.UUCP> ditto@cbmvax.UUCP (Michael "Ford" Ditto) writes:
>In article <513@genesis.ATT.COM> andys@shlepper.ATT.COM (a.b.sherman) writes:
>>It is *NOT* a feature; it's a bug.  See below.
>[ ... ]
>>Why a CHARACTER device must program its DMA device in BLOCK
>>multiples escapes me.  Regardless of the block size, you can always
>>tell a DMA controller to transfer X bytes or X words.
>
>Well, there's your problem:  It's not a CHARACTER device.  The /dev/r*
>devices are RAW device interfaces; that's what the 'r' stands for.  When
>you access any raw device, you have to do it in accordance with that
>particular device's physical requirements.  In the case of the Unix PC's
>hard disk interface (and those of most other computers), partial block
>transfers are not possible.  The so-called "block" device can be used
>for more "structured" access (like 4 bytes at a time, etc.).
>
>According to the "Unix Implementation" paper by Ken Thompson, the term
>"character I/O" is "a complete misnomer".


Most of this is true.  However, I disagree with the assertion that
most computers' disk controllers require you to DMA in sector
multiples.  I believe that most or all disk controllers will require
that you begin your I/O *on a sector boundary*, because you can only
seek to a sector.  However, the transfer length is generally given
as a word count, not a sector count.  If you give a word count of 2
on a 16-bit controller, you will get 4 bytes.  I *have* seen the
restriction that the transfer must be a multiple of the word size,
because you DMA words, not bytes, on most busses.
-- 
andy sherman / at&t bell laboratories (medical diagnostic systems)
room 2e-108 / 185 monmouth pkwy / west long branch, nj 07764-1394
(201) 870-7018 / andys@shlepper.ATT.COM
...The views and opinions are my own.  Who else would want them?

wfl@lznh.UUCP (<10000>Bill Lanky) (01/14/89)

If the 7300 raw disk driver always transferred blocks but other UNIX systems'
raw disks could transfer smaller units, you would have a "7300 device driver
bug".  However, the 7300 disk driver actually works the same way that raw
disks do in all other UNIX systems, so it cannot be called a 7300 bug.

This behavior of the UNIX raw disk driver is neither unknown nor undocumented;
ask any kernel guru, or see M. Bach's book "Design of the UNIX Operating
System" for example.  The raw driver brings you closer to the disk hardware
for potentially better performance, but being close to the hardware means
you have to become aware of the disk block size (which incidentally is NOT
always 512 bytes on every UNIX system.)  By designing the raw disk interface
in this way, the developers of UNIX have provided a means for low-level
disk access which is the same across all UNIX implementations, regardless
of whether they have intelligent or dumb hardware disk controllers.
(Note that write() and lseek() behave consistently, too).
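
Being "aware of the disk block size" mostly comes down to alignment
arithmetic.  The sketch below takes the sector size as a parameter, since it
is not always 512; the function names are hypothetical and non-negative
offsets are assumed.

```c
#include <stddef.h>
#include <sys/types.h>

/* Round a byte offset down to the start of the sector containing it. */
off_t sector_floor(off_t pos, off_t sector_size)
{
    return pos - (pos % sector_size);
}

/* Number of whole sectors that must be transferred to cover
 * len bytes starting at byte offset pos. */
off_t sectors_spanned(off_t pos, size_t len, off_t sector_size)
{
    off_t first, last;

    if (len == 0)
        return 0;
    first = pos / sector_size;
    last = (pos + (off_t)len - 1) / sector_size;
    return last - first + 1;
}
```

For example, to fetch 4 bytes at offset 510 with 512-byte sectors, a program
would lseek() to sector_floor(510, 512), which is 0, read
sectors_spanned(510, 4, 512) sectors, which is 2, and then copy the 4 bytes
starting at offset 510 within its own buffer.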

In short, since it was designed and documented to do what it does,
it cannot be called a bug.  If some UNIX version had a security loophole
associated with this feature, the bug would be that the implementors
did not provide enough security checks when they coded it.

Bill Linke