[comp.unix.wizards] Using the raw disk partition

mark@intek01.UUCP (Mark McWiggins) (07/22/90)

I've read of databases that let you use the raw disk partition instead
of ordinary files.  We do real-time factory automation work, and it
occurs to me that, if this is a significant speed win, this might be
worth looking into for some of our own applications (particularly as
I try to wrench my company away from DOS and toward SysVR4 for our
work.)

What are the pitfalls in doing this?  Is it as easy programmatically
as I envision (opening /dev/rdsk/<whatever> and using it as one big
file)?  Is it even enough of a performance win to bother with?

Thanks in advance for any insight.

-- 
Mark McWiggins			Integration Technologies, Inc. (Intek)
+1 206 455 9935			DISCLAIMER:  I could be wrong ...
1400 112th Ave SE #202		Bellevue WA  98004
uunet!intek01!mark		Ask me about C++!

marz@cbnewsm.att.com (martin.zam) (07/27/90)

: I've read of databases that let you use the raw disk partition instead
: of ordinary files.  We do real-time factory automation work, and it
: occurs to me that, if this is a significant speed win, this might be
: worth looking into for some of our own applications (particularly as
: I try to wrench my company away from DOS and toward SysVR4 for our
: work.)
: 
: What are the pitfalls in doing this?  Is it as easy programmatically
: as I envision (opening /dev/rdsk/<whatever> and using it as one big
: file)?  Is it even enough of a performance win to bother with?
: 
: Thanks in advance for any insight.

I have run Oracle Databases with Raw disk for production systems on heavily
loaded systems.  I must warn you that Raw disk uses the same Clist buffer
pool as terminals, and therefore your system must be appropriately tuned
to maintain data integrity.

UNIX systems are not very graceful about running out of system resources,
so you should plan for this requirement in advance.  Large blocks of data
moving to/from the Raw disk will steal Clist buffers at a much faster rate
than any terminal with typical flow control settings.  This means that your
seemingly abundant buffer pool could dwindle at a fantastic rate.

A typical symptom of exhausted Clist buffers would be losing characters
as they are being typed in, or blocks of data missing when you cat large
files to the terminal screen.  Now imagine that your screen is the Raw disk
device.  You should be able to see the extent of the exposure to data
integrity problems.

Another issue is speed.  On heavily loaded systems, I can measure no
*real world* difference in performance.  

					Martin Zam
					(201)564-2554

dave@dptechno.UUCP (Dave Lee) (07/27/90)

In article <1990Jul26.195530.24961@cbnewsm.att.com> marz@cbnewsm.att.com (martin.zam) writes:
>I have run Oracle Databases with Raw disk for production systems on heavily
>loaded systems.  I must warn you that Raw disk uses the same Clist buffer
>pool as terminals, and therefore your system must be appropriately tuned
>to maintain data integrity.
>

Someone please tell me this isn't so!  Last I looked, Clist blocks were 64
bytes each, with a typical small system having only ~100.

Also the raw disk shouldn't be doing buffering at all, otherwise it wouldn't
be raw.  

>A typical symptom of exhausted Clist buffers would be losing characters
>as they are being typed in, or blocks of data missing when you cat large
>files to the terminal screen.  Now imagine that your screen is the Raw disk
>device.  You should be able to see the extent of the exposure to data
>integrity problems.

This must be due to high interrupt overhead, a loaded system, or massive
terminal activity, not write()s to a DISK stealing clists.
Most likely the disk has a higher interrupt priority than the
serial ports, and massive raw (i.e. UNbuffered) disk I/O generates a
high interrupt rate, potentially causing missed serial I/O
interrupts or data.

How could fsck possibly work on raw devices if they were as unreliable
as implied here?

-- 
Dave Lee
uunet!dptechno!dave

guy@auspex.auspex.com (Guy Harris) (07/28/90)

>Large blocks of data moving to/from the Raw disk will steal Clist
>buffers at a much faster rate than any terminal with typical flow
>control settings.

What on earth are you talking about?  Do you have some Mutant UNIX From
Hell that uses clists for raw disk I/O?  I've *never* seen any such UNIX
system (and I've seen lots of UNIX systems - V6, V7, 4.[123]BSD, System
{III, V Release [1234]}, SunOS [234].x), and frankly, I don't expect to
*ever* see such a UNIX system....

hutch@fps.com (Jim Hutchison) (07/28/90)

In <1990Jul26.195530.24961@cbnewsm.att.com> marz@cbnewsm.att.com (martin.zam):
>In article <281@intek01.UUCP> uunet!intek01!mark (Mark McWiggins) writes:
>: I've read of databases that let you use the raw disk partition instead
>: of ordinary files.  We do real-time factory automation work, and it
>: occurs to me that, if this is a significant speed win, this might be
>: worth looking into for some of our own applications [...]

Raw disk allows you to get the data straight from the disk into your program's
buffer.  You end up getting "stuck" doing I/O in multiples of your disk's
sector size (probably 512 bytes), since most disk drivers will not give
fractions of a sector.  (Aside: this is theoretically possible with SCSI.)
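
As a rough illustration of the sector-multiple constraint, the sketch below
rounds an arbitrary byte range out to whole sectors.  The 512-byte SECTOR_SIZE
and the helper names (sector_floor, sector_ceil, sector_span) are assumptions
made up for this example; a real program should find out the actual sector
size from the driver or its documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sector size (hypothetical); check the real value for your disk. */
#define SECTOR_SIZE 512u

/* Round a byte offset down to the start of the sector containing it. */
static uint64_t sector_floor(uint64_t off)
{
    return off - (off % SECTOR_SIZE);
}

/* Round a byte offset up to the next sector boundary. */
static uint64_t sector_ceil(uint64_t off)
{
    /* Guard against wrap-around when adding the rounding term. */
    assert(off <= UINT64_MAX - (SECTOR_SIZE - 1));
    return sector_floor(off + (SECTOR_SIZE - 1));
}

/* Bytes that must be transferred so that the byte range
 * [off, off + len) is covered entirely by whole sectors. */
static uint64_t sector_span(uint64_t off, uint64_t len)
{
    return sector_ceil(off + len) - sector_floor(off);
}
```

So a request for 100 bytes at offset 1000 turns into a transfer that starts at
sector_floor(1000) and is sector_span(1000, 100) bytes long; the program then
picks the bytes it wanted out of its own buffer.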

You avoid the copy, and you lose any benefits the Unix buffer cache may have
given you.  The buffer cache loss is more of a hit when you write to the raw
device, because you lose Unix's delayed write feature.  If you aren't writing
and your data accesses are such that they would never get a buffer cache hit,
you don't lose anything on that score.  Note that the kernel still makes
sure that you can't use the device to write on other people when it locks
your buffer into physical memory to make it an I/O target/source.
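
A minimal sketch of what a whole-sector read into the program's own buffer
might look like on a POSIX system follows.  The function name read_sectors,
the 512-byte SECTOR_SIZE, and the choice to align the buffer to the sector
size are all assumptions for illustration, not something any particular
driver is known to require.  The device path is left to the caller.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread() and posix_memalign() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed sector size (hypothetical); check the real value for your disk. */
#define SECTOR_SIZE 512

/*
 * Read nsect whole sectors, starting at sector number first, from the
 * device (or file) named by path.  Returns a sector-aligned buffer that
 * the caller must free(), or NULL on any error or short read.
 */
static void *read_sectors(const char *path, off_t first, size_t nsect)
{
    void *buf = NULL;
    size_t want = nsect * SECTOR_SIZE;
    size_t got = 0;
    int fd;

    if (nsect == 0)
        return NULL;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    /* Align the buffer to the sector size; some drivers may insist on it. */
    if (posix_memalign(&buf, SECTOR_SIZE, want) != 0) {
        close(fd);
        return NULL;
    }

    /* Every transfer starts on a sector boundary of the device. */
    while (got < want) {
        ssize_t n = pread(fd, (char *)buf + got, want - got,
                          first * SECTOR_SIZE + (off_t)got);
        if (n <= 0)
            break;              /* error or end of device */
        got += (size_t)n;
    }
    close(fd);

    if (got != want) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Nothing here is cached on the program's behalf: reading the same sectors
twice means going to the disk twice, which is the trade-off described above.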

Your mileage may vary; this is from an FPS 500, which runs a derivative of
BSD 4.2/3 with System V extensions.  The same is said to be true on our
future machines, and reads-as-true in the Tahoe source I have.  System VrX
may do something else.

--
-
Jim Hutchison   	{dcdwest,ucbvax}!ucsd!fps!hutch
Disclaimer:  I am not an official spokesman for FPS computing