[comp.lang.c] Record-access libraries

eric@snark.UUCP (Eric S. Raymond) (09/23/88)

In article <207@cvbnet2.uucp>, aperez@cvbnet2.UUCP (Arturo Perez Ext.) writes:
> What I'm curious about is the fact that I've never heard of any record
> access libraries for Unix.  I know that I've written simpleminded record
> access applications.  I'm sure other people have as well.  Is there anyone
> actually selling record access libraries for the Unix community?  If not
> why isn't anyone doing it?

Shortest answer: because it's not worth doing.

Medium-length answer: because, for the most common case of sequentially-
accessed fixed-length records, 'record access' under UNIX is adequately
handled from C by fread(3)/fwrite(3) of a struct buffer. The less common cases
are traditionally handled by ASCII encoding with a \n-terminated line
structure, the one kind of variable-length 'record' for which there is strong
support under UNIX.
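
A minimal sketch of that common case, assuming a hypothetical struct rec
and helper names (put_rec, get_rec) that come from no library. Writing a
struct image this way ties the file to one machine's padding and byte
order, so treat it as an illustration rather than a portable format:

```c
#include <stdio.h>

/* Hypothetical fixed-length record; the layout is an assumption. */
struct rec {
    long id;
    char name[28];
};

/* Append one record at the stream's current position.
   Returns 0 on success, -1 on a short write. */
int put_rec(FILE *fp, const struct rec *r)
{
    return fwrite(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
}

/* Read the next record.  Returns 1 if one was read, 0 at end of
   file (a trailing partial record also counts as end of file),
   -1 on a read error. */
int get_rec(FILE *fp, struct rec *r)
{
    if (fread(r, sizeof *r, 1, fp) == 1)
        return 1;
    return ferror(fp) ? -1 : 0;
}
```

A caller would fopen() the file in binary mode and loop on get_rec()
until it stops returning 1.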

Long answer: well, what does a record-access library *do*, anyhow, that
byte-stream access doesn't give you? There are basically two possible answers:

    1. Encourage code simplification by provision of 'higher-level' primitives.
    2. Access speed or space optimization.

Both of these turn out to be mostly illusion. UNIX's byte-stream primitives
are sort of ultimately simple, certainly a lot simpler than the thicket of RMS
entry points. The text-file support in stdio is pretty clean and basic too
(which is why it's become a de-facto standard for C implementations even on
non-UNIX systems).

A possibility of significant speed or space optimization over stdio or simple
fread/fwrite only enters if you're talking about access to variable-length
records that can't reasonably be represented in ASCII. Such applications tend
to a) require indexed access, in which case you're already talking database
and the 'record library' likely has to be custom to handle housekeeping info,
or b) be sufficiently speed- or space-critical that no savvy programmer is
really going to like someone else's packaged library for access.

And don't forget that at some level the record-access has to be doing
byte-stream I/O *anyway*. The only way it can win is by buffering. And wouldn't
you rather tune your own buffer sizes?
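
Tuning your own buffer size under stdio might look like the sketch below.
The function name open_tuned and its parameters are made up for
illustration; setvbuf(3) itself is the standard call:

```c
#include <stdio.h>

/* Open `path` for reading with a fully buffered stream that uses a
   caller-supplied buffer of `bufsize` bytes.  The buffer must stay
   valid until the stream is closed.  Returns NULL on failure.
   (open_tuned is a hypothetical helper, not a library routine.) */
FILE *open_tuned(const char *path, char *buf, size_t bufsize)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return NULL;
    /* setvbuf must come before any other operation on the stream */
    if (setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Whether a bigger buffer actually helps depends on the access pattern, so
it is worth measuring rather than assuming.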

Record-access libraries sound like a decent idea, but in practice they tend to
introduce a lot of interface complexity (which makes your code ugly) and
premature optimization (which actually hurts performance). VMS's RMS could
stand as a perfect example of both these lossages.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

mcdonald@uxe.cso.uiuc.edu (09/24/88)

/* Written 12:32 pm  Sep 22, 1988 by eric@snark.UUCP in uxe.cso.uiuc.edu:comp.lang.c */
/* ---------- "Record-access libraries (Was: Re: V" ---------- */
In article <207@cvbnet2.uucp>, aperez@cvbnet2.UUCP (Arturo Perez Ext.) writes:
> What I'm curious about is the fact that I've never heard of any record
> access libraries for Unix.  I know that I've written simpleminded record
> access applications.  I'm sure other people have as well.  Is there anyone
> actually selling record access libraries for the Unix community?  If not
> why isn't anyone doing it?

Shortest answer: because it's not worth doing.

<<<<<<long section deleted, read the original>>>>>>>>

>Record-access libraries sound like a decent idea, but in practice they tend to
>introduce a lot of interface complexity (which makes your code ugly) and
>premature optimization (which actually hurts performance). VMS's RMS could
>stand as a perfect example of both these lossages.

/* End of text from uxe.cso.uiuc.edu:comp.lang.c */

Well stated indeed. Except that for some unknown reason the VMS RMS is
very fast indeed. However, VMS has one feature that I can't find in Unix:
asynchronous IO, that is, start an IO operation and let processing
continue, and when the IO finishes it sets a flag or calls an
out-of-line routine. The only way I can see to do this in Unix
is with separate processes, which is a complicated loser.
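
For what it's worth, the separate-process approach can be sketched
roughly as follows. The names start_read and finish_read are invented
for this example, and it assumes the whole file fits in the caller's
buffer (a larger file is reported as an error):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>

/* Start reading `path` in a child process.  Returns a pipe
   descriptor the parent can drain later, or -1 on failure. */
int start_read(const char *path, pid_t *child)
{
    int p[2];
    pid_t pid;

    if (pipe(p) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: do the slow I/O and forward it through the pipe */
        char buf[4096];
        ssize_t n;
        int fd = open(path, O_RDONLY);

        close(p[0]);
        if (fd < 0)
            _exit(1);
        while ((n = read(fd, buf, sizeof buf)) > 0)
            if (write(p[1], buf, (size_t)n) != n)
                _exit(1);
        _exit(n < 0);
    }
    close(p[1]);        /* parent keeps only the read end */
    *child = pid;
    return p[0];
}

/* Collect the data and reap the child.  Returns the byte count,
   or -1 if the child failed or the file did not fit in `max`. */
ssize_t finish_read(int fd, pid_t child, char *out, size_t max)
{
    size_t got = 0;
    ssize_t n;
    int status;

    while (got < max && (n = read(fd, out + got, max - got)) > 0)
        got += (size_t)n;
    close(fd);
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)got;
}
```

The parent is free to compute between the two calls, though the child
stalls once the pipe fills, so the overlap is limited.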

Doug McDonald

bzs@xenna (Barry Shein) (09/25/88)

Eric, I disagree and think you dismiss the value of record access
libraries too glibly. There's more to it than fixed and blocked record
formats.

Consider various associative schemes like ISAM or B-TREE management of
records by keys (dbm is similar to this under Unix.) These can be very
critical to manageable performance in certain areas, particularly the
management of large numbers of records where lookup is only by a very
few keys and quick insertion of new key/record pairs is needed. Your
typical example is a customer data base where the lookup/insertion key
is based on some customer code.
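
To make the customer-code case concrete, here is a deliberately simple
sketch of keyed lookup: a binary search over a file of fixed-length
records kept sorted by code. This is neither ISAM nor a B-tree nor dbm
(insertion into a sorted file is slow), and struct cust and lookup are
hypothetical names:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical customer record, sorted in the file by `code`. */
struct cust {
    char code[8];
    char name[24];
};

/* Binary-search `fp` for `code`.  Returns 1 and fills *out if found,
   0 if absent, -1 on an I/O error. */
int lookup(FILE *fp, const char *code, struct cust *out)
{
    long lo = 0, hi;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    hi = ftell(fp);
    if (hi < 0)
        return -1;
    hi /= (long)sizeof *out;            /* number of records */

    while (lo < hi) {
        long mid = lo + (hi - lo) / 2;
        int c;

        if (fseek(fp, mid * (long)sizeof *out, SEEK_SET) != 0)
            return -1;
        if (fread(out, sizeof *out, 1, fp) != 1)
            return -1;
        c = strncmp(code, out->code, sizeof out->code);
        if (c == 0)
            return 1;
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return 0;
}
```

Each probe costs a seek and a read, which is the overhead a real
associative scheme is designed to cut down.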

Particular implementation strategies are a whole other kettle of fish.

	-Barry Shein, ||Encore||

eric@snark.UUCP (Eric S. Raymond) (09/27/88)

In article <3717@encore.uucp>, bzs@xenna (Barry Shein) writes:
> Eric, I disagree and think you dismiss the value of record access
> libraries too glibly. There's more to it than fixed and blocked record
> formats.
> 
> Consider various associative schemes like ISAM or B-TREE management...

I agree that libraries like UNIX dbm are a Good Thing -- but then ISAM, B-TREE
management, and other associative retrieval schemes are not within the scope
of the original question. Let's not muddy the waters by confusing 'record
access' in the RMS "fixed and blocked record" sense with more general database
access techniques.

aperez@blazer.uucp (Arturo Perez Ext.) (10/04/88)

From article <e1Lt8#2CJhqZ=eric@snark.UUCP>, by eric@snark.UUCP (Eric S. Raymond):
> In article <3717@encore.uucp>, bzs@xenna (Barry Shein) writes:
>> Eric, I disagree and think you dismiss the value of record access
>> libraries too glibly. There's more to it than fixed and blocked record
>> formats.
>> 
>> Consider various associative schemes like ISAM or B-TREE management...
> 
> I agree that libraries like UNIX dbm are a Good Thing -- but then ISAM, B-TREE
> management, and other associative retrieval schemes are not within the scope
> of the original question. Let's not muddy the waters by confusing 'record
> access' in the RMS "fixed and blocked record" sense with more general database
> access techniques.

Why needlessly restrict the scope of the argument? I admit, the question I
originally asked was a query regarding the availability of record-access 
libraries under Unix.  However, I believe that any retrieval method is
a valid conception of "record."  After all, aren't we just trying to
optimize the retrieval of data from the file?  Isn't that what ISAM,
and fixed records and all that other cruft is about?  

I can almost (but not quite :-) concede your point that it isn't worth
optimizing the access of fixed block files.  But say I had a fixed block
file that I only accessed randomly, occasionally sequentially but not
too often.  Wouldn't it be a useful embedded attribute of the file?
That way the kernel wouldn't waste too much of its time trying to do
readahead for me, which I won't be using anyway. 
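
As a sketch of what such a hint could look like from the application
side (not as a file attribute), later POSIX systems offer
posix_fadvise(), which did not exist when this thread was written and
is purely advisory. RECSIZE, open_random and read_record are names
assumed for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

#define RECSIZE 64      /* hypothetical fixed record length */

/* Open a record file and tell the kernel that access will be
   random.  The advice is only a hint, so its failure is ignored. */
int open_random(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
    return fd;
}

/* Fetch record number `n` (counting from 0) without moving the
   file offset.  Returns 0 on success, -1 on error or short read. */
int read_record(int fd, long n, char rec[RECSIZE])
{
    ssize_t got = pread(fd, rec, RECSIZE, (off_t)n * RECSIZE);

    return got == RECSIZE ? 0 : -1;
}
```

Whether the kernel really skips read-ahead after this call is up to the
implementation.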

I guess I am starting to diverge.  However, I do believe that there
are uses for standard access methods that take these things into 
account.  What I don't understand is why isn't anyone providing them.

Arturo Perez
ComputerVision, a division of Prime
primerd!cvbnet!aperez
The difference between genius and idiocy is that genius has its limits.

fmayhar@killer.DALLAS.TX.US (Frank Mayhar) (10/05/88)

In article <258@cvbnet2.UUCP> aperez@blazer.uucp (Arturo Perez Ext.) writes:
>I guess I am starting to diverge.  However, I do believe that there
>are uses for standard access methods that take these things [record-level
>access, as in database applications, etc.] into 
>account.  What I don't understand is why isn't anyone providing them.
>
>Arturo Perez

I've thought about this myself.  The operating system I help support (Honeywell
Bull CP-6) supports many different kinds of file types, most of which are
record-level (the rest are block types, and some that are unique to CP-6).  I
feel that the major reason that this capability hasn't been (yet) provided for
Un*x is that (1) there hasn't been a great need for it up to now (very few
business-type applications run on Un*x, typically), (2) if you provide the
capability in a library (and you're not AT&T) applications that use it become
non-portable, and (3) providing it in a library is not as efficient as providing
it as part of the operating system, and if you do that what you end up with is
no longer Un*x, as such.  One of the things that I hate most about Un*x is that
it locks you into one way of looking at data:  as a stream of bytes.  While this
is fine for certain applications, it (in a word) sucks for most others.

What I would like to see is some company having the guts to build a Un*x-
compatible system that would allow multiple file types, more flexible file
access controls, decent async terminal handling, etc., etc.  I would prefer it
to be Honeywell Bull, but I really don't think that it will happen at all, and
if it does it certainly won't be soon.

The usual disclaimers.
-- 
Frank Mayhar            UUCP: fmayhar@killer.dallas.tx.us
                        ARPA: Frank-Mayhar%ladc@bco-multics.hbi.honeywell.com
                        USmail: 2116 Nelson Ave. Apt A, Redondo Beach, CA  90278
                        Phone: (213) 371-3979 (home)  (213) 216-6241 (work)

eric@snark.UUCP (Eric S. Raymond) (10/05/88)

In article <258@cvbnet2.uucp>, aperez@blazer.uucp (Arturo Perez Ext.) writes:
>                                      But say I had a fixed block
> file that I only accessed randomly, occasionally sequentially but not
> too often.  Wouldn't it be a useful embedded attribute of the file?
> That way the kernel wouldn't waste too much of its time trying to do
> readahead for me, which I won't be using anyway. 

Yes, but useful how often? That is, is the additional kernel and interface
complexity justified by the gain in that one case?

One of the great insights of the UNIX file system design is that, practically
speaking, all forms of file structuring above byte stream level have a low
marginal advantage relative to their complexity cost. I could argue for this
on lots of theoretical grounds, but I don't have to; the overwhelming
*practical* success of the byte-stream paradigm in UNIX and its adoption by
most OS designs of more recent vintage speaks for itself.

The trouble with your proposed optimization is that it's just one of a crowd
of equally marginal features that, once added, would render the UNIX interface
as cluttered as VMS's without really ever cost-justifying themselves.

Interestingly, some real-time-oriented UNIX versions have a 'write-through'
mode flag for files that will do what you want (disable read-ahead) as a
side-effect. The intent of this is to disable buffer caching for guaranteed
write of precious data -- a feature that is no mere optimization.
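
The nearest standardized relative of such a flag is O_SYNC, which POSIX
adopted after this thread. It covers only the write-through half and
promises nothing about read-ahead. A minimal sketch, with
open_sync_append and write_all as made-up helper names:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` for appending so that each write() returns only after
   the data has been pushed out to the storage device. */
int open_sync_append(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
}

/* Write all `len` bytes, retrying after short writes.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Every write then pays for a physical transfer, which is the price of
the guarantee.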

> I guess I am starting to diverge.  However, I do believe that there
> are uses for standard access methods that take these things into 
> account.  What I don't understand is why isn't anyone providing them.

Well, by Barry Shein's report, some outfits are. The *interesting* question
is why they're so rarely called for that neither you nor I have ever seen a
commercial product of that kind.

*I* think it's because the byte-stream paradigm *works*...

aperez@cvbnet2.UUCP (Arturo Perez Ext.) (10/10/88)

From article <e4G91#2Wa6Tw=eric@snark.UUCP>, by eric@snark.UUCP (Eric S. Raymond):
> In article <258@cvbnet2.uucp>, aperez@blazer.uucp (Arturo Perez Ext.) writes:
>> too often.  Wouldn't it be a useful embedded attribute of the file?
>> That way the kernel wouldn't waste too much of its time trying to do
>> readahead for me, which I won't be using anyway. 
> 
> Yes, but useful how often? That is, is the additional kernel and interface
> complexity justified by the gain in that one case?
> 
> Interestingly, some real-time-oriented UNIX versions have a 'write-through'
> mode flag for files that will do what you want (disable read-ahead) as a
> side-effect. The intent of this is to disable buffer caching for guaranteed
> write of precious data -- a feature that is no mere optimization.
> 
> *I* think it's because the byte-stream paradigm *works*...

Oh, I quite agree that the byte-stream paradigm works.  That isn't
the gist of my question.  In retrospect, it is obvious that the
reason no one is providing record access libraries is because
it is too easy to "roll your own."  The UNIX file system paradigm
is very powerful and very flexible; obviously, if you can access a
byte then you can access any number of them and treat that larger
aggregate as a record.

However, it is true that the kernel's read-ahead, write-behind philosophy
can sometimes get in the way, especially for things like critical 
databases where you want to be able to guarantee when the data is
written to disk.
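
On systems that have fsync(2) (4.2BSD and its descendants, and later
POSIX), a database can at least pick the moment at which it waits for
the disk. A minimal sketch for a stdio stream, where commit is a name
made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Push everything written to `fp` so far out to the disk.
   Returns 0 on success, -1 on failure. */
int commit(FILE *fp)
{
    if (fflush(fp) != 0)            /* stdio buffer -> kernel */
        return -1;
    return fsync(fileno(fp));       /* kernel buffers -> device */
}
```

Between calls to commit() the kernel is still free to write behind as
it likes.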

If you access the raw disk device do you disable that read-ahead and
write-behind aspect of the UNIX filesystem abstraction?

And, in another vein, why does Sun's NFS (latest version notwithstanding)
disable that write-behind?  Isn't it enough for the kernel to say
"Yeah, I got the data," since that's the only guarantee you get in
the normal case anyway (i.e. with write-behind, you pass the data to
the kernel with write() and eventually it makes it to the actual disk)?
Why does Sun's NFS require the server to wait for the data to actually
be on the disk before responses are sent to the client?  It makes for
slow data transfers...

Arturo Perez
ComputerVision, a division of Prime
primerd!cvbnet!aperez
The difference between genius and idiocy is that genius has its limits.

gwyn@smoke.ARPA (Doug Gwyn ) (10/12/88)

In article <287@cvbnet2.UUCP> aperez@cvbnet2.UUCP (Arturo Perez Ext.) writes:
>If you access the raw disk device do you disable that read-ahead and
>write-behind aspect of the UNIX filesystem abstraction?

It doesn't disable it, it bypasses it.

andrew@frip.gwd.tek.com (Andrew Klossner) (10/13/88)

	"If you access the raw disk device do you disable that
	read-ahead and write-behind aspect of the UNIX filesystem
	abstraction?"

Yes.  But note: a raw disk does NOT support a byte stream model (at
least under Berkeley).  It implements a block model only.  An
application that tries to do a read(2) or write(2) beginning with a
byte that does not begin a block will find that its lseek pointer is
rounded down to the beginning of the block.  This burned some
application writers here.

  -=- Andrew Klossner   (uunet!tektronix!tekecs!frip!andrew)    [UUCP]
                        (andrew%frip.gwd.tek.com@relay.cs.net)  [ARPA]

friedl@vsi.COM (Stephen J. Friedl) (10/14/88)

In article <10467@tekecs.TEK.COM>, andrew@frip.gwd.tek.com (Andrew Klossner) writes:
> Yes.  But note: a raw disk does NOT support a byte stream model (at
> least under Berkeley).  It implements a block model only.  An
> application that tries to do a read(2) or write(2) beginning with a
> byte that does not begin a block will find that its lseek pointer is
> rounded down to the beginning of the block.  This burned some
> application writers here.

In addition, sometimes you'll get the *whole block* even if you request
just a couple of bytes.  We got burned by it by moving some code from
the 3B2 (allows byte-at-a-time raw reads) to another machine that didn't.
We found lots of memory getting trashed, and (vague recollection here)
the return value from read(2) indicated that the requested amount had
been read, not the whole block.
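
One defensive pattern, sketched here under the assumption of a 512-byte
block (real devices vary, and some drivers also want the buffer itself
aligned, which this does not handle): always transfer whole aligned
blocks into a block-sized scratch buffer and copy out only the bytes
wanted. BLKSIZE and read_unaligned are hypothetical names:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <fcntl.h>      /* open(), for callers */
#include <string.h>
#include <unistd.h>

#define BLKSIZE 512     /* assumed block size; check your device */

/* Read `len` bytes starting at arbitrary offset `off`, issuing only
   block-aligned, block-sized transfers.  Returns the number of bytes
   copied (short at end of device), or -1 on error. */
ssize_t read_unaligned(int fd, off_t off, char *dst, size_t len)
{
    char blk[BLKSIZE];
    size_t done = 0;

    while (done < len) {
        off_t pos = off + (off_t)done;
        off_t start = pos - pos % BLKSIZE;      /* round down */
        size_t skip = (size_t)(pos - start);
        size_t take = BLKSIZE - skip;
        ssize_t got = pread(fd, blk, BLKSIZE, start);

        if (got < 0)
            return -1;
        if ((size_t)got <= skip)
            break;                              /* nothing usable */
        if (take > (size_t)got - skip)
            take = (size_t)got - skip;
        if (take > len - done)
            take = len - done;
        memcpy(dst + done, blk + skip, take);
        done += take;
        if (got < BLKSIZE)
            break;                              /* end of device */
    }
    return (ssize_t)done;
}
```

Since the scratch buffer always has room for a full block, a driver
that hands back the whole block cannot overrun the caller's memory.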

-- 
Steve Friedl    V-Systems, Inc.  +1 714 545 6442    3B2-kind-of-guy
friedl@vsi.com     {backbones}!vsi.com!friedl    attmail!vsi!friedl
---------Nancy Reagan on the Three Stooges: "Just say Moe"---------

jc@minya.UUCP (John Chambers) (10/18/88)

> If you access the raw disk device do you disable that read-ahead and
> write-behind aspect of the UNIX filesystem abstraction?

Oh, wow!  A question with a simple answer: Yes.  According to several
manuals, the main difference between /dev/dsk* and /dev/rdsk* is that
there is no buffering for the latter.  Reads always delay for physical
I/O, and writes always go immediately to disk (though with DMA, the
write may not be complete when write() returns).  There's also a
warning that the raw disks should be only accessed in multiples of
a sector.  In fact, most programs use multiples of BUFSIZ, which 
is invariably a multiple of a sector.

The exact wording in one of the manuals describes the "'raw' interface
which provides for direct transmission between the disk and the user's
read or write buffer.  A single read or write call results in exactly
one I/O operation and therefore raw I/O is considerably more efficient
when many words are transmitted."   Note the specific claim that the
transfer is direct between the disk and the buffer in user space,
without going through a kernel buffer.

Of course, this is all at the whim of the driver, so some vendors 
could have screwed it up...

-- 
John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)

[Any errors in the above are due to failures in the logic of the keyboard,
not in the fingers that did the typing.]