[comp.unix.wizards] VMS vs. UNIX file system

samperi@marob.MASA.COM (Dominick Samperi) (09/13/88)

Can people who have had experience working with both VMS files (at the
FDL level) and UNIX files (at the inode level, say) comment on the
advantages and disadvantages of the file systems used by these operating
systems? My experience is mostly with the UNIX file system, so I was a
little surprised when I discovered recently that VMS text files, object
code files, and executable files all have different record structures.
What does the added complexity of having to deal with RMS, FDL, CONVERT,
etc., buy?
-- 
Dominick Samperi, NYC
    samperi@acf8.NYU.EDU	samperi@marob.MASA.COM
    cmcl2!phri!marob        	uunet!hombre!samperi
      (^ ell)

dave@arnold.UUCP (Dave Arnold) (09/13/88)

samperi@marob.MASA.COM (Dominick Samperi) writes:
> [...] stuff deleted
> ...comment on the
> advantages and disadvantages of the file systems used by these operating
> systems? My experience is mostly with the UNIX file system, so I was a
> little surprised when I discovered recently that VMS text files, object
> code files, and executable files all have different record structures.
> What does the added complexity of having to deal with RMS, FDL, CONVERT,
> etc., buy?

The VMS file system doesn't buy you anything, unless your application
requires ISAM---However, how often do you need ISAM?

I think the VMS filesystem is overly complicated, and one of the major
downfalls of VMS (but can be tolerated).  If the original DEC designers
had it to do over again, I suspect they would have stuck with a
Stream-only based filesystem (Like UNIX), and provided ISAM libraries.
The FORTRAN record format, FIXED SIZE RECORDS, VARIABLE LENGTH,
CARRIAGE RETURN CARRIAGE CONTROL... Oh, don't forget the VFC record
format...  These are all completely archaic, and date the VMS
operating system.

I feel very strongly about this.  Anyone disagree?

VMS's strengths?

AST's, Timer queues, condition handling, exit handling, message
facility.

In regards to the above, VMS was way ahead of its time circa 1978,
and life would be difficult without the above.

Other VMS pitfalls?

The resource quota system!!!!!!!

How often have you written a program, and got the famous:

%SYSTEM-F-EXCEEDED QUOTA

message?  Isn't it fun trying to figure out which bloody quota
was exceeded?!  Stupid!
-- 
Dave Arnold
dave@arnold.UUCP	{cci632|uunet}!ccicpg!arnold!dave

bzs@encore.UUCP (Barry Shein) (09/13/88)

>What does the added complexity of having to deal with RMS, FDL, CONVERT,
>etc., buy?
>-- 
>Dominick Samperi, NYC

There are plusses and minuses in both approaches. The intention of
formalizing a bunch of file access methods is to put the code
wherever the vendor (designer) believes it will do the most good. For
example, by knowing you have promised to access some file only
sequentially it can be stored in a manner optimal for that usage.
Similarly, an indexed file can have its read methods set up, perhaps
maintaining two separate caches (indices, data), for optimal access.

It also means that you go through some standard set of routines with a
standard set of assumptions (e.g., I can open an ISAM file, knowing a
few things about it, w/o asking for details about how it's stored; if
one builds their own ISAM file into a bag-of-bytes file it may not be
at all obvious how to read it without access to the original program
which wrote it).

The downside is that these access methods tend to get used.

What I mean is, used unnecessarily where bag-of-bytes files would do
just fine and cause much less confusion.

For example, on an earlier release (probably 1.6) of VMS I wanted to
edit a file produced by RUNOFF (to do a few global changes so
underlining or some such would print properly on my printer.) Not as
easy as it sounded, EDT refused to load this print file for editing,
complained about an illegal file type.

One could point the finger at EDT and say it was deficient in not
handling enough file formats, I tend to think that barring super-human
effort it was inherent in the design environment, it would be hard to
properly edit every file type that was allowed (last I checked CONVERT
still couldn't convert some reasonable-looking conversions.) I believe
TECO did the job fine, but I was pretty shocked at not being able to
edit this fairly plain looking text file.

It wasn't the *data* which was preventing loading this into EDT (as
with, say, trying to load an a.out into VI which wouldn't work too
well either, but for a different reason), it was merely a bit
somewhere identifying this as a print file or some such nonsense and
thus EDT kicking it out without trying. Such problems were ubiquitous
(at least it always seemed like someone was coming to me trying to
work around a similar problem, utilities wouldn't cooperate.)

Under IBM systems with a similar record oriented philosophy I remember
real panic if we couldn't find the original parameters under which a
file was created. It basically couldn't be opened anymore unless you
could produce the right magic numbers it was created with (blocking
factors etc.) I'm sure some wizardly types could have solved that
directly but it sure wasn't obvious to us, other than guessing numbers
and paying real money to watch perhaps dozens of tries go down the
drain and feeling kind of foolish and seriously out of control.

The problem with the Unix "unstructured" approach is that either you
use some of the (very few) library routines (dbm is a major one, so
are the object deck readers in SYSV) or you roll your own, each
application will have its own way of storing data (compare termcap
with passwd with inittab with crontab with ...) often not terribly
well documented or efficient (agreed, often efficiency is a poor
excuse for obscurity.)
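
As a rough illustration of the "use a library routine" option, here is a
minimal sketch of keyed storage layered on ordinary files. It assumes
Python's `dbm.dumb` module as a stand-in for the Unix dbm library; the
function names `build_catalogue` and `lookup` are hypothetical and exist
only for this example.

```python
import dbm.dumb
import os
import tempfile


def build_catalogue(path, entries):
    """Store key/value pairs in a dbm-style keyed file at `path`."""
    with dbm.dumb.open(path, "c") as db:
        for key, value in entries.items():
            db[key.encode()] = value.encode()


def lookup(path, key):
    """Return the value stored under `key`, or None if it is absent."""
    with dbm.dumb.open(path, "r") as db:
        value = db.get(key.encode())
        return None if value is None else value.decode()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    catalogue = os.path.join(workdir, "catalogue")
    build_catalogue(catalogue, {"termcap": "terminal capabilities"})
    print(lookup(catalogue, "termcap"))
```

The caller never sees how the keys are laid out on disk, which is the
property an access method provides; the operating system underneath still
sees only byte-stream files.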

It's all a balancing act. In my ideal world there would be a variety
of standardized access methods and you would avoid using them like the
plague, especially in general system utilities, simple byte-stream
files should account for most input and output (a la Unix), but for
those occasional, carefully justified problems, access methods could
be resorted to. Also, the operating system would know as little about
them as possible (eg. opening any file as a byte-stream would do
something reasonable, *never* return an error.)

	-Barry Shein, ||Encore||

dave@arnold.UUCP (Dave Arnold) (09/14/88)

In article <3597@encore.UUCP>, bzs@encore.UUCP (Barry Shein) writes:
> 
> What I mean is, used unnecessarily where bag-of-bytes files would do
> just fine and cause much less confusion.

Exactly.

> For example, on an earlier release (probably 1.6) of VMS I wanted to
> edit a file produced by RUNOFF (to do a few global changes so
> underlining or some such would print properly on my printer.) Not as
> easy as it sounded, EDT refused to load this print file for editing,

EDT still gives a warning about files created with VAXC.  Dumb!

> The problem with the Unix "unstructured" approach is that either you
> use some of the (very few) library routines (dbm is a major one, so
> are the object deck readers in SYSV) or you roll your own, each
> application will have its own way of storing data (compare termcap
> with passwd with inittab with crontab with ...) often not terribly
> well documented or efficient (agreed, often efficiency is a poor
> excuse for obscurity.)

This is not a problem.  It's not often that your application requires
you to "Roll your own".  And you get a very simple filesystem.
When you try to design a filesystem that will attempt to please
everyone under all circumstances, you over build---A real mess.

Anyone try tuning a RMS ISAM file?  Some pretty spiffy analysis
tools :-,

> It's all a balancing act.

Tightrope.

> 	-Barry Shein, ||Encore||

I appreciate your points, Barry, but don't agree.
-- 
Dave Arnold
dave@arnold.UUCP	{cci632|uunet}!ccicpg!arnold!dave

jbw@bucsb.UUCP (Joe Wells) (09/15/88)

In article <178@arnold.UUCP> dave@arnold.UUCP (Dave Arnold) writes:
>VMS's strengths?
>AST's, Timer queues, condition handling, exit handling, message
 ^^^^^                ^^^^^^^^^^^^^^^^^^
>facility.

ASTs are for me VMS's greatest advantage.  The ability to have
multiple system calls outstanding at one time is a godsend for
realtime control systems.  Running a separate process for each
blocking task and using IPC just doesn't cut it.

Stack unwinding is really nice too.  The ability in LISP to abort
instruction sequences with "throw" and have everything clean itself up
as the stack is unwound is *very* powerful.  In addition, you can post
your own unwinding cleanup instructions.  Under VMS, you can do this
in *any* language.  The operating system and the VMS procedure
calling standard provide for generic stack unwinding.

Directory links under VMS are not necessary for a file to exist.
Under UNIX, when all the links to a file disappear, and all processes
close the file, the file is deleted.  In VMS, a file can exist without
a name.  It can be accessed by its unique file identifier.  In
addition, the problem of dangling directory links to deleted files
does not exist.  When the VMS equivalent of the UNIX inode is reused,
a counter in the index table slot is incremented.  Thus any dangling
pointers to the previous file that used the same slot won't have any
effect.
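
The slot-plus-counter scheme described above can be modelled in a few
lines. This is only a toy sketch: the class `IndexTable`, its methods, and
the (slot, sequence) tuple are assumptions made for illustration, not the
actual Files-11 on-disk layout.

```python
class IndexTable:
    """Toy index table: each slot carries a reuse counter."""

    def __init__(self, nslots):
        self.seq = [0] * nslots          # bumped every time a slot is reused
        self.in_use = [False] * nslots
        self.contents = [None] * nslots

    def create(self, contents):
        """Allocate the first free slot and return a (slot, seq) file ID."""
        slot = self.in_use.index(False)  # raises ValueError if table is full
        self.seq[slot] += 1
        self.in_use[slot] = True
        self.contents[slot] = contents
        return (slot, self.seq[slot])

    def lookup(self, file_id):
        """Return the contents, rejecting IDs whose counter no longer matches."""
        slot, seq = file_id
        if not self.in_use[slot] or self.seq[slot] != seq:
            raise LookupError("stale file ID")
        return self.contents[slot]

    def delete(self, file_id):
        """Free the slot named by a valid file ID."""
        self.lookup(file_id)
        slot = file_id[0]
        self.in_use[slot] = False
        self.contents[slot] = None
```

A directory entry that still holds the old (slot, seq) pair after the slot
is reused fails the comparison in `lookup`, so it reads as an invalid link
rather than silently naming the new file.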

I would be much happier with the UNIX environment if it supported
these features, but then if money grew on trees, I probably couldn't
climb them.  I'm also not trying to imply that VMS doesn't have more
than its own share of ridiculously stupid features.

>Other VMS pitfalls?
>The resource quota system!!!!!!!
>How often have you written a program, and got the famous:
>%SYSTEM-F-EXCEEDED QUOTA
>message?  Isn't it fun trying to figure out which bloody quota
>was exceeded?!  Stupid!

Good lord!  Don't remind me of this!  What a royal pain in the *ss!

>-- 
>Dave Arnold
>dave@arnold.UUCP	{cci632|uunet}!ccicpg!arnold!dave

Joe Wells
UUCP: ...!harvard!bu-cs!bucsf!jbw
INTERNET: jbw@bucsf.bu.edu

chris@mimsy.UUCP (Chris Torek) (09/15/88)

In article <1986@bucsb.UUCP> jbw@bucsb.UUCP (Joe Wells) writes:
[ASTs and stack unwinding are nice: agreed, although I think ASTs
are `too complicated' for general use, by which I mean there should
be a nice simple method of just getting several syscalls going at
the same time ... e.g., lightweight processes.  (Note that you can
implement this yourself if you have the full-blown general mechanism.)
But neither of these have anything to do with the file systems:]

>Directory links under VMS are not necessary for a file to exist.
>Under UNIX, when all the links to a file disappear, and all processes
>close the file, the file is deleted.

(Easily argued to be the correct behaviour.)

>In VMS, a file can exist without a name.  It can be accessed by its
>unique file identifier.

Presumably a `unique file ID' is like a Unix <dev,inode> pair.  How
does one construct a file ID once the names are gone?  Search the disks
directly for IDs without names attached?  Store the IDs in another file?
(Then that file is a directory, so why not use a real directory?)
This really sounds more like a bug than a feature.

>In addition, the problem of dangling directory links to deleted files
>does not exist.

`The problem of dangling directory links to deleted files'?  If you
mean that I can remove a file that you depend upon, even though you
have an alternate name for that file: that does not happen in Unix,
only in VMS.  Sounds like a point *against* VMS to me.  (Not entirely
so: someone linking to your files can keep you from deleting them when
you had intended to.  This problem does not seem to occur often in
practise.)
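
The Unix behaviour under discussion can be checked with a short sketch. It
assumes a POSIX system and a filesystem that supports hard links; the
function name and the file names "yours" and "mine" are hypothetical.

```python
import os
import tempfile


def survives_unlink(directory):
    """Create a file, hard-link it, remove the first name, read via the second."""
    original = os.path.join(directory, "yours")
    alias = os.path.join(directory, "mine")
    with open(original, "w") as f:
        f.write("shared data\n")
    os.link(original, alias)                   # second name, same inode
    links_before = os.stat(alias).st_nlink
    os.remove(original)                        # drops one name only
    links_after = os.stat(alias).st_nlink
    with open(alias) as f:
        contents = f.read()
    return links_before, links_after, contents


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(survives_unlink(d))
```

Removing one name only decrements the link count; the data stays reachable
through the remaining name until the count reaches zero and no process has
the file open.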

Unix has its weak points, but its file system is not one of them.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

guy@gorodish.Sun.COM (Guy Harris) (09/15/88)

> Presumably a `unique file ID' is like a Unix <dev,inode> pair.  How
> does one construct a file ID once the names are gone?  Search the disks
> directly for IDs without names attached?  Store the IDs in another file?
> (Then that file is a directory, so why not use a real directory?)
> This really sounds more like a bug than a feature.

Well, maybe.  (The Xerox Pilot file system works this way also, I believe.)
There may be cases where you want a file to have references to other files, but
where the referencing file is more than just a list of referenced files; this
might be a case where "open by file ID" could be a win.  You can do this in
UNIX, using the file's pathname - full, or relative to some particular
directory if referenced files are always stored there - as the ID in question.

Of course, there are problems with doing what, as you observe, more-or-less
amounts to setting up your own directory mechanism.  One of them is that you
have to set up your own directory mechanism; since VMS doesn't maintain link
counts, you would at least not have to do this yourself.  Another is that if
somebody makes a new file and does the moral equivalent of an "mv" to replace
the referenced file with that file, you lose; this may be a feature in some
applications, and a bug in others.  (We won't discuss the issue of editing
the referenced file - named, say "DKn:[...]FILE.DAT;33" - to get
"DKn:[...]FILE.DAT;34", but having the file that references the file still
refer to "DKn:[...]FILE.DAT;33"; again, this could be a feature in some
instances or a bug in others.)

> >In addition, the problem of dangling directory links to deleted files
> >does not exist.
> 
> `The problem of dangling directory links to deleted files'?  If you
> mean that I can remove a file that you depend upon, even though you
> have an alternate name for that file: that does not happen in Unix,
> only in VMS.

No, he said

	When the VMS equivalent of the UNIX inode is reused, a counter in the
	index table slot is incremented.  Thus any dangling pointers to the
	previous file that used the same slot won't have any effect.

which is, of course, a problem in UNIX *only* after a crash, and then only if
you don't run "fsck" to fix things.  (VMS's predecessor, RSX-11M, had its moral
equivalent of "fsck", and I suspect VMS has it as well.)  I would not classify
this as a real problem; it isn't *supposed* to happen, and there are standard
procedures for fixing it.

> Unix has its weak points, but its file system is not one of them.

I agree.  The lack of reference counting in Files-11 doesn't strike me as
unambiguously being a feature, and the ability to open files by unique ID may
not necessarily be a winning feature if, for instance, you have to give up
reference counting for it.

jbw@bucsb.UUCP (Joe Wells) (09/15/88)

In article <13562@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>... I think ASTs
>are `too complicated' for general use, by which I mean there should
>be a nice simple method of just getting several syscalls going at
>the same time ... e.g., lightweight processes.

Agreed.  Having separately scheduled tasks each with its own stack is
really the best way.

>>In VMS, a file can exist without a name.  It can be accessed by its
>>unique file identifier.
>
>Presumably a `unique file ID' is like a Unix <dev,inode> pair.  How
>does one construct a file ID once the names are gone?  Search the disks
>directly for IDs without names attached?  Store the IDs in another file?
>(Then that file is a directory, so why not use a real directory?)
>This really sounds more like a bug than a feature.

Yes, in VMS it's not really well thought out.  One use is for temp
files, which you have to mark as "delete on close".  Another use is
for more quickly opening files.  Opening by file id is much quicker.
This can be useful when you have a directory with 10,000 files in it.
But yes, you have to remember the file ids in *another* file ...

>>In addition, the problem of dangling directory links to deleted files
>>does not exist.
>
>`The problem of dangling directory links to deleted files'?  If you
>mean that I can remove a file that you depend upon, even though you
>have an alternate name for that file: that does not happen in Unix,
>only in VMS.

This problem can occur every time a UNIX system crashes.  An entry in a
directory file can point to an inode whose file has been deleted.  Then
when the inode is reused, it has two links, but a link *count* of one.
That is why fsck is run every time a UNIX system boots.  Under VMS,
the directory entry would have a file id.  The file id would still
point to a slot in the file index table, but its use count would be
different from the count in the table.  So it would simply be an
invalid link.

I don't think the ability of a file to exist without a directory entry
is an advantage.  However, I like the ability to open a file without
knowing a name for the file.  In UNIX such a feature would require
rethinking file security since a directory's privileges are used to
protect files inside it.

>Unix has its weak points, but its file system is not one of them.

Its simplicity is its strength, in my opinion.

>In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
>Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

Joe Wells
INTERNET: jbw@bucsf.bu.edu
UUCP: ...!harvard!bu-cs!bucsf!jbw

peter@ficc.uu.net (Peter da Silva) (09/16/88)

The ability to open a file by unique id (for example dev/inode in
UNIX) is not necessarily desirable.  It breaks one aspect of UNIX
security... the ability to hide a group of files by putting  them
in a  mode 700 directory.  If it was possible to open a file by a
unique id (via, for example the flat file  system)  it  would  be
trivial to get around this.

-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

bzs@encore.UUCP (Barry Shein) (09/16/88)

Last things first...

>> It's all a balancing act.
>
>Tightrope.
>
>> 	-Barry Shein, ||Encore||
>
>I appreciate your points, Barry, but don't agree.
>-- 
>Dave Arnold

Not sure what you don't agree with, I assume it's the following:

>> The problem with the Unix "unstructured" approach is that either you
>> use some of the (very few) library routines (dbm is a major one, so
>> are the object deck readers in SYSV) or you roll your own, each
>> application will have its own way of storing data (compare termcap
>> with passwd with inittab with crontab with ...) often not terribly
>> well documented or efficient (agreed, often efficiency is a poor
>> excuse for obscurity.)
>
>This is not a problem.  It's not often that your application requires
>you to "Roll your own".  And you get a very simple filesystem.
>When you try to design a filesystem that will attempt to please
>everyone under all circumstances, you over build---A real mess.

It's a problem if you have the problem.

"It's not often" might be true in your world, I doubt you could
convince the people I know trying to store their library catalogues
(eg) that efficient keyed storage and lookup is an uncommon problem.
Or business types trying to keep payroll or customer lists etc.

I agree it's hard to design a general filing system which pleases
everyone.  I'm not sure it's a law of nature that one cannot. In fact,
Unix might be quite close, just missing some application level
standards in regards to file storage libraries (from which, perhaps,
interested people could investigate tuning the system a little, the
buffer cache probably does most of what they want anyhow.)

	-Barry Shein, ||Encore||

dave@arnold.UUCP (Dave Arnold) (09/17/88)

In article <3613@encore.UUCP>, bzs@encore.UUCP (Barry Shein) writes:
> 
> Last things first...
> 
> >> It's all a balancing act.
> >
> >Tightrope.

I should have added a bunch of :-) to my original followup.  It seems
there is a bitter feeling towards my posting.  I didn't intend to cause
such feelings.  I will be more careful in the future.

My apologies.

Signed,

Dave (egg on my face) Arnold
-- 
Dave Arnold
dave@arnold.UUCP	{cci632|uunet}!ccicpg!arnold!dave

jeh@crash.cts.com (Jamie Hanrahan) (09/18/88)

In article <3597@encore.UUCP> bzs@encore.UUCP (Barry Shein) writes:
	[much good stuff...]
>
>It's all a balancing act. In my ideal world there would be a variety
>of standardized access methods and you would avoid using them like the
>plague, especially in general system utilities, simple byte-stream
>files should account for most input and output (a la Unix), ...

I disagree.  I much prefer VMS's variable-length-record text file format
to Unix's byte-stream.  Why?  Because the Unix byte stream uses perfectly
legitimate data as a record separator.  To make matters worse, the standard
C method for dealing with strings uses a *different* character as a string
terminator!  Unix has a lot of GREAT ideas in it, but this isn't one of them.

Barry goes on to say that you should be able to open any file as a byte
stream and not get an error.  Well, you can do the equivalent under VMS--
you can open any file, sequential, relative, or indexed, for sequential
access, and RMS will happily hand you the records in order (in order by
primary key if it's an indexed file).  And if you prefer a byte-stream
rather than a record-oriented interface (and, yes, the byte-stream i/f
has GREAT advantages from a program style standpoint; non-believers,
particularly those who have never looked inside Unix utilities, should
take a look at Kernighan and Plauger's _Software Tools_ or _Software
Tools in Pascal_ to see what I mean), you or the system can provide a
set of byte-stream interface routines to do that with a record-oriented
file system.  (That DEC's VAX C RTL does this, shall we say, imperfectly,
is a problem in the implementation, not the concept.)

(Incidentally, Barry's problem with EDT stems from Runoff's former use of
print-format files, wherein carriage control information for each record
is stored in a fixed-length field preceding the text information.  A 
program that expects to read an ordinary text file can read such a file,
but it won't see the fixed-length field, so if it's an editor it can't
reconstruct the field on output.  The print-format file is one use of
"Variable with Fixed Control" record format, and I'm very happy to report
that very few VMS programs generate such files these days; it's one record
format that VMS could have done without.)
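
To make the editor problem concrete, here is a small sketch of records
that carry a fixed-length control field ahead of the text. The layout used
here (2-byte length, then a 2-byte control field, then text) and the names
`pack` and `read_text_only` are assumptions for illustration, not the real
RMS format.

```python
CONTROL_LEN = 2  # assumed size of the fixed control field, in bytes


def pack(records):
    """Serialize (control, text) pairs: 2-byte length, control field, text."""
    out = bytearray()
    for control, text in records:
        if len(control) != CONTROL_LEN:
            raise ValueError("control field has the wrong length")
        body = control + text
        out += len(body).to_bytes(2, "little") + body
    return bytes(out)


def read_text_only(blob):
    """Read records the way a plain-text program would: drop the control field."""
    texts = []
    pos = 0
    while pos < len(blob):
        n = int.from_bytes(blob[pos:pos + 2], "little")
        body = blob[pos + 2:pos + 2 + n]
        texts.append(body[CONTROL_LEN:])
        pos += 2 + n
    return texts
```

A program that reads with `read_text_only` and writes the text back has to
invent a control field for every record, so the rewritten file cannot match
the original carriage control. That is the information an editor would be
unable to reconstruct on output.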

To give you an idea of the generality of VAX RMS, the system runs happily
using just a few of the available file formats.  Text is stored in variable-
length-record, sequential files.  So is object code (possible even though
you can have null bytes, line feeds, etc., etc., in object records... because
RMS doesn't use in-band data for record terminators!).  Images and
library files go in fixed-length-record files, essentially with their own
internal format implemented by the programs that deal with them.  There
are a few indexed files like the user authorization file.  And that's about
it.  

For me, the bottom line is that it works, that RMS with all its fabled 
"inefficiencies" runs rings around most folks who try to bypass it (whoa,
now!  I said "most".  This because most people don't do the good job with
read-ahead and write-behind buffering that RMS does.  Sure, if you do that,
AND implement the record handling yourself, you can beat RMS, barely.  My
point is that you don't have to bother to get good performance), and that
I've dealt with VMS's file system for years without feeling I was doing 
battle with it.  No doubt if I was moved to a Unix environment I would 
gripe a lot for a few weeks about "those stupid byte-stream files", but 
I'd like to think that I'd adapt and figure out how to do things the Unix way
and work with the system instead of fighting it.  I'd like to think that most
Unix folks who come to VMS would do the converse.  I'm probably wrong on
both counts... :-)

jeh@crash.cts.com (Jamie Hanrahan) (09/18/88)

In article <178@arnold.UUCP> dave@arnold.UUCP (Dave Arnold) writes:
>How often have you written a program, and got the famous:
>
>%SYSTEM-F-EXCEEDED QUOTA
>
>message?  Isn't it fun trying to figure out which bloody quota
>was exceeded?!  Stupid!

Apologies for the cross-followup to the unix group; I don't know if
Dave reads the VMS group.  

The VMS quota system has two good reasons for being.  First, it prevents
a runaway program from using up all of something that might be in short
supply, like nonpaged pool or process slots.  Without this you could sit
in a loop doing $QIO with no wait to an offline device, and you'd bother
everybody on the system.  With quotas in effect you only bother yourself.
You can always enable process resource wait mode, which will cause your
process to go into MWAIT state (usually seen, for this purpose, as RWAST)
until the needed quota is returned, presumably by the completion of a 
previously-requested operation.  (Process resource wait mode is enabled
by default.)

You can also get EXQUOTA if you try to do a buffered I/O operation that's
larger than the size permitted by the SYSGEN parameter MAXBUF.  This is a
common pitfall.  MAXBUF is only middling-sized by default (somewhere near
1K if I recall correctly).  Many sites routinely set this up to 8K or so,
especially those that have megabytes of pool available.  

The other purpose of the quota system is to make sure that everything 
you've started is finished before your image is allowed to run down.  
Say you start a direct I/O operation to a flaky device driver; the system
charges your DIOLM by one.  You wait and decide to ^Y out, but the driver's
cancel I/O code doesn't work right so the I/O doesn't get aborted.  The
system notes that the original DIOLM is different from the current value
and won't permit your image to run down until they're the same.  This is
one of the great banes of both system managers and driver writers, but it's
necessary much of the time; if that I/O op is a read, and it decides to 
complete AFTER your image has run down and the physical pages you used 
to own (and to which the DMA will be performed) get assigned to somebody
else, watch out!  It's impossible for the system to distinguish where this
is necessary and where it isn't, so it's done all of the time.  
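
The accounting described above amounts to a counter that is charged when
an operation starts and credited when it completes. The sketch below is a
toy model under that assumption; `IoQuota` and its method names are
hypothetical and do not correspond to any VMS system service.

```python
class IoQuota:
    """Toy model of a per-process limit on outstanding I/O operations."""

    def __init__(self, limit):
        self.limit = limit        # original quota value
        self.available = limit    # current value, charged per outstanding I/O

    def start_io(self):
        """Charge the quota; refuse if nothing is left."""
        if self.available == 0:
            raise RuntimeError("quota exceeded")
        self.available -= 1

    def complete_io(self):
        """Credit the quota when an operation finishes or is cancelled."""
        if self.available == self.limit:
            raise RuntimeError("no I/O outstanding")
        self.available += 1

    def can_run_down(self):
        """Rundown is allowed only when the current value matches the original."""
        return self.available == self.limit
```

An operation that never completes, as with the flaky driver, leaves
`can_run_down()` false indefinitely, which is the hang the system manager
sees.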

jeh@crash.cts.com (Jamie Hanrahan) (09/18/88)

In article <178@arnold.UUCP> dave@arnold.UUCP (Dave Arnold) writes:
>The VMS file system doesn't buy you anything, unless your application
>requires ISAM---However, how often do you need ISAM?
>
>I think the VMS filesystem is overly complicated, and one of the major
>downfalls of VMS (but can be tolerated).  If the original DEC designers
>had it to do over again, I suspect they would have stuck with a
>Stream-only based filesystem (Like UNIX), and provided ISAM libraries.
>The FORTRAN record format, FIXED SIZE RECORDS, VARIABLE LENGTH,
>CARRIAGE RETURN CARRIAGE CONTROL... Oh, don't forget the VFC record
>format...  These are all completely archaic, and date the VMS
>operating system.

I strongly disagree.  I answered this in another note, but there are a
few other points here... 

How often do you need ISAM?  Well, if you have to implement it yourself,
probably you'll do without.  But if it's there it gets used, for good and
sufficient reasons.  There are MANY great applications for indexed files...
Netnews, for instance.  Some folks at BYU did a netnews workalike for VMS,
relying heavily upon indexed files to keep track of the newsgroup contents,
but storing the articles in individual files just as Unix netnews does.  It's
a VERY clean design, and they can process a batch of received news MUCH 
faster than Unix can running on the same hardware.  (To be as exact as dim
memory allows, I think they said ten times faster or so, and that the Unix
folks at the site were both amazed and jealous.)

Someone will likely complain that "all that RMS code" costs a lot in terms
of efficiency.  I offer this challenge:  Take a simple Unix filter like
DETAB running on some Unix system on a VAX (Ultrix, BSD, AT&T, whatever).
Rewrite it to use record-oriented I/O under VMS.  Boot VMS on the same 
hardware (or the equivalent).  We've done this and the VMS/RMS versions
run *at least* twice as fast, sometimes five or six times.  (The much greater
improvement in BYU News comes from a redesign to take advantage of indexed
files, not just conversion from stream- to record-oriented I/O.)

I know, I know -- for many applications stream I/O makes for much cleaner
program design.  But for others, it doesn't, at least not when you have
good alternatives available.  

I don't think that fixed vs. variable length records, implied
carriage control, etc., are archaic at all.  Variable with fixed control,
on the other hand, is right down there with punched cards and paper tape!

daryl@ihlpe.ATT.COM (Daryl Monge) (09/18/88)

What I would like in the UNIX file system that the VMS system has is the ability
to break links when a file is written.  I have an application where we wish
to share information between two directories, but I want the link broken if
the file is written when accessed from any directory it is listed in.  This
happens with VMS links because of versions of files in the file system.

Daryl Monge				UUCP:	...!ihnp4!ihcae!daryl
AT&T					CIS:	72717,65
Bell Labs, Naperville, Ill		AT&T	312-979-3603

dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/18/88)

In article <3438@crash.cts.com> jeh@crash.CTS.COM (Jamie Hanrahan) writes:
>I much prefer VMS's variable-length-record text file format
>to Unix's byte-stream.  Why?  Because the Unix byte stream uses perfectly
>legitimate data as a record separator.

UNIX files have no records, so there is no record separator.

But if you consider lines of text to be records and the newline
character to be a record separator (the concept is in your mind, not in
the filesystem), then VMS has a similar problem:  The low-level I/O
routines use perfectly legitimate data for administrative information!
Only at the RMS level is the overhead data made out-of-band.  And even
under UNIX, it is perfectly possible for an ISAM library to maintain
out-of-band administrative data.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

chris@mimsy.UUCP (Chris Torek) (09/18/88)

In article <3438@crash.cts.com> jeh@crash.cts.com (Jamie Hanrahan) writes:
>... the Unix byte stream uses perfectly legitimate data as a record
>separator.

Do you know what a `byte stream' is?  Byte streams do not have records;
they can hardly have record separators.  If you want records in a Unix
file system file, you must define them yourself.  This is what Barry
Shein was talking about.

>Barry goes on to say that you should be able to open any file as a byte
>stream and not get an error.  Well, you can do the equivalent under VMS--
>you can open any file, sequential, relative, or indexed, for sequential
>access, and RMS will happily hand you the records in order (in order by
>primary key if it's an indexed file).  And if you prefer a byte-stream
>... you or the system can provide a set of byte-stream interface routines
>to do that with a record-oriented file system.

Simulating a byte stream on top of records is considerably more
difficult than simulating records on top of a byte stream.  I have been
led to believe that, under VMS, each different kind of record-oriented
file must be read with a different primitive.  (You must also provide a
buffer that is as large as the largest record.)  Hence to simulate a
byte stream, you must know about every possible record format.

On the other hand, to simulate a record format, you must know about
every possible byte stream.  Fortunately, there is only one possible
byte stream, by the definition of `byte stream'. . . .
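
A minimal sketch of records layered on a byte stream follows, assuming a
4-byte big-endian length prefix per record. The framing and the names
`write_record` and `read_records` are choices made for this example; any
other self-describing framing would serve equally well.

```python
import io
import struct


def write_record(stream, payload):
    """Append one record: 4-byte big-endian length, then the payload bytes."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield each record's payload until the stream is exhausted."""
    while True:
        header = stream.read(4)
        if not header:
            return
        if len(header) < 4:
            raise ValueError("truncated record header")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record body")
        yield payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_record(buf, b"a record with\na newline in it")
    buf.seek(0)
    print(list(read_records(buf)))
```

Because the length is carried out of band, a payload may contain newlines
or NUL bytes without confusing the reader, and the code works against any
byte stream without knowing how the file was created.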
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

chris@mimsy.UUCP (Chris Torek) (09/18/88)

In article <3442@crash.cts.com> jeh@crash.cts.com (Jamie Hanrahan) writes:
>How often do you need ISAM?  Well, if you have to implement it yourself,
>probably you'll do without.  But if it's there it gets used, for good and
>sufficient reasons.

This was Barry Shein's point.  But he might, and I will, go a bit
further:  sometimes it also gets used for bad, insufficient reasons.
(That does not mean it should not be there; but maybe it should not
be *too* easy to use.)

>... Some folks at BYU did a netnews workalike for VMS [using indexed
>files] ....  It's a VERY clean design, and they can process a batch of
>received news MUCH faster than Unix can running on the same hardware.
>(To be as exact as dim memory allows, I think they said ten times
>faster or so, and that the Unix folks at the site were both amazed
>and jealous.)

You are comparing incomparable things here.  The reason their news
unbatcher is that much faster than the one in B news is almost certainly
because `it's a very clean design' and not because it uses any
particular storage format.  The B news unbatcher is a model of
inefficiency, clumsy patches, and re-re-re-re-worked code.  For
instance, an uncompressed batch file is read by forking a separate
process for each article in the file.  B news's only saving grace is
that it works, and it works on everything from PDP-11s to Convexes.

(Henry Spencer and Geoff Collyer rewrote the B news software and got
a similar order of magnitude performance increase, without changing
the file formats at all.)

>...  I offer this challenge:

Oh dear.

>Take a simple Unix filter like DETAB running on some Unix system on a
>VAX (Ultrix, BSD, AT&T, whatever).  Rewrite it to use record-oriented
>I/O under VMS.  Boot VMS on the same hardware (or the equivalent).
>We've done this and the VMS/RMS versions run *at least* twice as fast,
>sometimes five or six times.

If you pick your benchmarks carefully, you can prove anything.  Many
real programs spend a fair bit of time doing I/O, and VMS RMS I/O is
indeed quite efficient when properly used.  But so is Unix I/O.  VMS
currently has an implementation edge if the application reads large
blocks, since it does this by playing games with the MMU.  On the other
hand, Mach can do the same trick.

>I don't think that fixed vs.variable length records, implied
>carriage control, etc., are archaic at all.

I like the way Ken Thompson put it:

    These concepts fill a much-needed gap in other operating systems.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bzs@encore.UUCP (Barry Shein) (09/18/88)

>I should have added a bunch of :-) to my original followup.  It seems
>there is a bitter feeling towards my posting.  I didn't intend to cause
>such feelings.  I will be more careful in the future.
>
>My apologies.
>
>Signed,
>
>Dave (egg on my face) Arnold

No bitter feeling or any such thing, I was just trying to draw out
exactly what in my note you were objecting to, I might have been
wrong, as inconceivable as that may be. Such disagreements are
oftentimes explained by nothing more (or less) than differing
perceptions of priorities, in this case the importance/frequency of a
need for efficient keyed (&c) storage access methods.

But do wipe the egg off your face, it's just left over from breakfast
and is irrelevant to the conversation, it's making us sick Dave :-)

	-Barry Shein, ||Encore||

bzs@encore.UUCP (Barry Shein) (09/19/88)

In the first place, it's not obviously an either/or situation. I
suspect that VMS's RMS could be implemented on top of Unix with little
or no change to the O/S (although performance tuning would have to
trade off asynch read-ahead/write-behind and Unix's buffer cache which
accomplishes much the same basic thing [ie. the block you want next is
highly likely to be off the disk and in memory by the time you need
it], albeit in a different manner with different considerations.)

I wouldn't be at all shocked to see DEC announce (essentially) RMS
under Ultrix (and I'll bet a dollar someone is working on this.) Fine
idea, as long as it's not in the OS.

One problem with structured files that's easy to see is whether
information stored in the file to represent the structure is part of
the file or not.

For example, if in a variable length, blocked format you store the
length of each record as a preceding field of 16-bits, is the size of
the file the size of all its data + NRECORDS*2 (2 bytes)? Or just the
size of the file (that is, what does a file status query return?)

That doesn't seem terribly important at first (who cares, choose
a solution and stick to it) until one wants to access the thing
as a raw file (something always trivial to do in Unix's scheme.)

Now, is the 16-bit field counted in a file position seek? Can I safely
take two positions, POS1 and POS2 (byte offsets into the file, a la
ftell or lseek) and subtract them, perhaps then allocating and copying
the data? Or might the result be larger (OS adds in the 16-bit fields)
or even incorrect (POS2 should have been incremented by NRECORDS*2,
but I can't really calculate that number NRECORDS very easily, in
advance.)
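
The arithmetic behind those questions can be sketched in a few lines of
C.  Everything here is hypothetical: the layout is the 16-bit length
prefix described above, and payload_size, stored_size and
data_bytes_before are names invented for the illustration.

```c
#include <stddef.h>

#define PREFIX_BYTES 2   /* the 16-bit length field in front of each record */

/* Bytes of user data only. */
unsigned long payload_size(const unsigned short *lens, size_t nrecords)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < nrecords; i++)
        total += lens[i];
    return total;
}

/* Bytes actually stored: data + NRECORDS*2. */
unsigned long stored_size(const unsigned short *lens, size_t nrecords)
{
    return payload_size(lens, nrecords)
         + (unsigned long)nrecords * PREFIX_BYTES;
}

/* Translate a raw file offset into the number of data bytes that
 * precede it.  Returns -1 if the offset lands inside a length field or
 * beyond the end of the file. */
long data_bytes_before(const unsigned short *lens, size_t nrecords,
                       unsigned long raw_off)
{
    unsigned long pos = 0, data = 0;
    size_t i;

    for (i = 0; i < nrecords; i++) {
        if (raw_off == pos)
            return (long)data;          /* exactly on a record boundary */
        if (raw_off < pos + PREFIX_BYTES)
            return -1;                  /* inside a length field */
        pos += PREFIX_BYTES;
        if (raw_off <= pos + lens[i])
            return (long)(data + (raw_off - pos));
        pos += lens[i];
        data += lens[i];
    }
    return raw_off == pos ? (long)data : -1;
}
```

With three records of 5, 0 and 3 bytes the data adds up to 8 bytes but
the file holds 14.  Raw offsets 2 and 14 are 12 apart, yet only 8 data
bytes lie between them, and getting from one number to the other needs
the length of every record in between.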

I'm not sure I'm claiming that Unix solves any of this other than
laying things out so very barebones and w/o OS interpretation that
it's totally up to the user, no hand-to-hand combat with a record
management system required.

Anyhow, I may not be expressing myself very well, but I have used VMS
and IBM record access methods enough over the years to know that
sometimes they can drive you to tears (usually because the OS feels it
has a better idea of what you are doing than the programmer does,
and modifies or otherwise "corrects" your requests.)

What's far more important, in my experience, is to have an orderly set
of access methods and to use them only where they are truly justified
(ie. simply because it's faster is not a good enough excuse if 99% of
the actual applications will perform faster than human response time
with either method, naive or sophisticated.)

I remember, for example, when the VMS HELP files went from a very
simple, textual format to their current library format and it made
working with them in new and creative ways nearly impossible (I had
written a full-screen access to the VMS help files in TECO, no
kidding, which was nearly impossible to salvage, I never bothered.)
I'm not sure the changeover was really much of an improvement, sped up
something which was fast enough already and added a lot of complexity
where it was unappreciated, adding a new help topic became more
complicated etc.

Not a flame, just trying to emphasize my point about it's good to
have access methods, but it tends to lead people astray into using
them just to avoid scanning a file when the latter would perform
fine and would greatly simplify later maintenance (typically, the
file can be manipulated with a text editor) etc.

	-Barry Shein, ||Encore||

schwartz@shire (Scott Schwartz) (09/19/88)

In article <3442@crash.cts.com> jeh@crash.CTS.COM (Jamie Hanrahan) writes:
|I offer this challenge:  Take a simple Unix filter like
|DETAB running on some Unix system on a VAX (Ultrix, BSD, AT&T, whatever).
|Rewrite it to use record-oriented I/O under VMS.  ...
|We've done this and the VMS/RMS versions run *at least* twice as fast, 
|sometimes five or six times.

I've seen unix programs (things like a grep replacement) that got
similar speedups by replacing stdio calls with read/write and a large
buffer.  I wonder how much of that 2-6x is from overhead in stdio,
rather than in the filesystem.  

Here is some sample data:

/* f1.c: */

#include <stdio.h>
main()
{
	int c;	

	while ((c = getchar()) != EOF)
		putchar(c);
}

/* f3.c: */

#include <stdio.h>
main()
{
	int len;	
	char buffer[BUFSIZ*10];

	while ((len = read(0, buffer, sizeof(buffer))) > 0)
		write(1, buffer, len);
}

/* test file */

shire% wc test
26388  100728 1292508 test

/* results */

shire% time f1 <test >foo
5.8u 0.7s 0:06 100% 0+224k 2+163io 0pf+0w

shire% time f3 <test >foo
0.0u 1.0s 0:01 63% 0+248k 0+160io 0pf+0w

Shire is a Sun 4 running SunOS 4.0.  I got similar results on a Vax 780
running 4.3 BSD (except that it took 10 times longer to run.)

-- Scott Schwartz     schwartz@gondor.cs.psu.edu

Your array may be without head or tail, yet it will be proof against defeat.  
   Sun Tzu, "The Art of War"

james@bigtex.uucp (James Van Artsdalen) (09/19/88)

In article <3438@crash.cts.com>, jeh@crash.CTS.COM (Jamie Hanrahan) wrote:

> I disagree.  I much prefer VMS's variable-length-record text file format
> to Unix's byte-stream.  Why?  Because the Unix byte stream uses perfectly
> legitimate data as a record separator.

In reading the write(2) man page, I somehow completely missed the
discussion of file record separators in unix.
-- 
James R. Van Artsdalen    ...!uunet!utastro!bigtex!james     "Live Free or Die"
Home: 512-346-2444 Work: 328-0282; 110 Wild Basin Rd. Ste #230, Austin TX 78746

gwyn@smoke.ARPA (Doug Gwyn ) (09/19/88)

In article <3951@psuvax1.cs.psu.edu> schwartz@shire.cs.psu.edu (Scott Schwartz) writes:
>In article <3442@crash.cts.com> jeh@crash.CTS.COM (Jamie Hanrahan) writes:
>|I offer this challenge:  Take a simple Unix filter like
>|DETAB running on some Unix system on a VAX (Ultrix, BSD, AT&T, whatever).
>|Rewrite it to use record-oriented I/O under VMS.  ...
>|We've done this and the VMS/RMS versions run *at least* twice as fast, 
>|sometimes five or six times.
>I've seen unix programs (things like a grep replacement) that got
>similar speedups by replacing stdio calls with read/write and a large
>buffer.  I wonder how much of that 2-6x is from overhead in stdio,
>rather than in the filesystem.  
>Here is some sample data:

The point is valid, although your two examples were not functionally
identical, since in one case you were inspecting EVERY character in
a file and in the other you never inspected ANY character.  User-mode
overhead from stdio tends to be comparable to system overhead for
typical applications, assuming a fairly good implementation of stdio.
Certainly it is a mistake to use stdio to implement "cat", for example
(for several reasons), but for most applications the additional
services provided by stdio (buffering, etc.) are useful, as is the
fact that the stdio functions are available on all systems whereas
open()/read()/etc. may not be (and when they are, their semantics are
not as well defined).
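
For a fairer comparison, a read()/write() loop can be made to look at
every byte too.  The following is only a sketch: copy_count_lines is a
made-up name, the 64K buffer size is arbitrary, and it assumes a system
that has the UNIX read() and write() calls.

```c
#include <unistd.h>

/* Copy file descriptor `in` to `out`, counting '\n' bytes on the way.
 * Returns the number of newlines seen, or -1 on an I/O error. */
long copy_count_lines(int in, int out)
{
    static char buffer[64 * 1024];      /* arbitrary "large" buffer */
    long lines = 0;
    ssize_t len;

    while ((len = read(in, buffer, sizeof buffer)) > 0) {
        ssize_t i, done = 0;

        for (i = 0; i < len; i++)       /* inspect every character */
            if (buffer[i] == '\n')
                lines++;
        while (done < len) {            /* write() may be partial */
            ssize_t w = write(out, buffer + done, (size_t)(len - done));

            if (w <= 0)
                return -1;
            done += w;
        }
    }
    return len < 0 ? -1 : lines;
}
```

Timing something like this against a getchar()/putchar() loop that also
counts newlines would show how much of the difference is per-character
stdio overhead and how much is simply the work of looking at the data.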

The analogous UNIX "challenge" would be:  Take a simple UNIX filter
(I have no idea where he gets "detab", which is not standard on UNIX)
and rewrite it to use direct system calls on UNIX...

Personally I think I have better things to do than crank out system-
specific code.

guy@gorodish.Sun.COM (Guy Harris) (09/19/88)

> I wouldn't be at all shocked to see DEC announce (essentially) RMS
> under Ultrix (and I'll bet a dollar someone is working on this.) Fine
> idea, as long as it's not in the OS.

Or, more precisely, not in a more-privileged mode than user mode; I consider
the OS to be more than just the kernel - for instance, I consider UNIX standard
I/O to be part of the OS.

Under RSX-11, if I remember correctly, RMS is just a library that runs in user
mode; VMS decided to fill another much-needed gap by running it in executive
mode.  Neither of them stuffed it into the kernel, at least....

guy@gorodish.Sun.COM (Guy Harris) (09/19/88)

> What I would like in the UNIX file system the VMS system has is the ability
> to break links when a file is written.  I have an application where we wish
> to share information between two directories, but I want the link broken if
> the file is written when accessed from any directory it is listed in.  This
> happens with VMS links because of versions of files in the file system.

This can happen in UNIX as well, if you write your application to break links
in this fashion.  Few applications do this, but I suspect many users of those
applications may well consider this to be a feature....

(It can, I believe, also *not* happen in VMS if your application does whatever
magic is needed to write on an existing file rather than creating a new file
with a higher version number.)

guy@gorodish.Sun.COM (Guy Harris) (09/19/88)

> I disagree.  I much prefer VMS's variable-length-record text file format
> to Unix's byte-stream.  Why?  Because the Unix byte stream uses perfectly
> legitimate data as a record separator.  To make matters worse, the standard
> C method for dealing with strings uses a *different* character as a string
> terminator!  Unix has a lot of GREAT ideas in it, but this isn't one of them.

Umm, as others have already pointed out, UNIX doesn't use '\n' as a record
separator; it uses it as a *line* separator.  UNIX - like VMS - ultimately (at
the kernel level) implements files as a sequence of bytes (RMS sits on top of
QIOs that read virtual blocks of the file, *n'est ce pas?*).

One file format UNIX happens to implement atop this abstraction is the "text
file"; "text files" consist of "lines", which are sequences of bytes (not
containing '\0' - some applications can't handle them, since it's the C string
terminator) ending with '\n'.
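
That definition is small enough to write down as code.  A minimal
sketch, with text_lines as a made-up name: it accepts a buffer only if
it contains no '\0' and, when non-empty, ends with '\n'.

```c
#include <stddef.h>

/* Returns the number of lines if buf is a well-formed "text file" in
 * the sense above, or -1 if it contains a '\0' or has an unterminated
 * last line. */
long text_lines(const unsigned char *buf, size_t len)
{
    long lines = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\0')
            return -1;                  /* C string terminator in the data */
        if (buf[i] == '\n')
            lines++;
    }
    if (len > 0 && buf[len - 1] != '\n')
        return -1;                      /* last line not terminated */
    return lines;
}
```

Nothing in the kernel enforces this; it is purely a convention that the
text tools agree on.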

Other file formats exist, such as executable images and archives, which are,
respectively, the UNIX equivalents of images (and object files - object files
and images use the same format) and library files.

However, UNIX doesn't come standard with any libraries that implement "record"
files.  Such libraries are available from third-party vendors (e.g., C-ISAM),
and I very much doubt that they use '\n' or any other particular byte value as
a record separator.

Some of the real differences between UNIX and VMS here are that:

	1) As already stated, VMS comes with libraries that implement "record"
	   files, while UNIX doesn't;

	2) Many UNIX utilities (e.g., "cp") deal with files at the byte-stream
	   level, so they don't care *what* format the file is in;

	3) Many more UNIX facilities use text files, rather than record files,
	   as their underlying file format; while one reason for this may be
	   the absence of a "record file" library, another reason is that you
	   can use the standard UNIX text file tools to manipulate those files.

mike@turing.unm.edu (Michael I. Bushnell) (09/19/88)

In article <68850@sun.uucp>, guy@gorodish (Guy Harris) writes:
>> I wouldn't be at all shocked to see DEC announce (essentially) RMS
>> under Ultrix (and I'll bet a dollar someone is working on this.) Fine
>> idea, as long as it's not in the OS.
>
>Or, more precisely, not in a more-privileged mode than user mode; I consider
>the OS to be more than just the kernel - for instance, I consider UNIX standard
>I/O to be part of the OS.

But standard I/O runs in user mode, not in a more-privileged mode.  What
you consider the OS to be is not what it in fact is.  A good working 
description of OS is that part of the system which the arbitrary user
cannot rewrite and use in lieu of the distributed code.  You can rewrite
stdio, and then not use the distributed one.  This definition is
*very* closely linked to what privilege mode the code runs in...if it
runs in user mode, the user could replace it.

>Under RSX-11, if I remember correctly, RMS is just a library that runs in user
>mode; VMS decided to fill another much-needed gap by running it in executive
>mode.  Neither of them stuffed it into the kernel, at least....

But...the user can't necessarily replace RMS without getting to write
his own CHME dispatch table, something the kernel is not likely to let
him do.
-- 
-- 
                N u m q u a m   G l o r i a   D e o 

       \                Michael I. Bushnell
        \               HASA - "A" division
        /\              mike@turing.unm.edu
       /  \ {ucbvax,gatech}!unmvax!turing.unm.edu!mike

mike@turing.unm.edu (Michael I. Bushnell) (09/19/88)

In article <68855@sun.uucp>, guy@gorodish (Guy Harris) writes:
>> I disagree.  I much prefer VMS's variable-length-record text file format
>> to Unix's byte-stream.  Why?  Because the Unix byte stream uses perfectly
>> legitimate data as a record separator.  To make matters worse, the standard
>> C method for dealing with strings uses a *different* character as a string
>> terminator!  Unix has a lot of GREAT ideas in it, but this isn't one of them.
>
>Umm, as others have already pointed out, UNIX doesn't use '\n' as a record
>separator; it uses it as a *line* separator.  UNIX - like VMS - ultimately (at
>the kernel level) implements files as a sequence of bytes (RMS sits on top of
>QIOs that read virtual blocks of the file, *n'est ce pas?*).
>
>One file format UNIX happens to implement atop this abstraction is the "text
>file"; "text files" consist of "lines", which are sequences of bytes (not
>containing '\0' - some applications can't handle them, since it's the C string
>terminator) ending with '\n'.
>
>Other file formats exist, such as executable images and archives, which are,
>respectively, the UNIX equivalents of images (and object files - object files
>and images use the same format) and library files.

But a very important thing to remember is this:  The designers of UNIX
didn't expect to see people edit binaries, but they stuck with the
byte-stream abstraction.  Programs that are willing to stick to it
(like GNU emacs, and unlike ed, ex, and vi) can benefit tremendously.
I can and do edit binaries using emacs.  It didn't take *any*
modification of the operating system to do this, and emacs didn't
require *any* special modifications to do so...all it needed was to
learn how *not* to use separators. 

The point is that while you might not see the value in it now, you
might later, when it is too late.  Try using your favorite VMS editor
to edit a binary and change a string constant!  Not too likely, I'm afraid.
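
The same kind of edit takes only a few lines of C when the file is just
bytes.  This is a sketch under assumptions: patch_string is a made-up
name, the file must be open for update in binary mode, and the
replacement must be the same length as the original so that no other
byte in the image moves.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Overwrite the first occurrence of `from` in the open file with `to`.
 * Returns the byte offset that was patched, or -1 if the string was not
 * found, the lengths differ, or an I/O error occurred. */
long patch_string(FILE *fp, const char *from, const char *to)
{
    size_t n = strlen(from), size, i;
    long end, found = -1;
    unsigned char *image;

    if (n == 0 || strlen(to) != n)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0 || (end = ftell(fp)) < 0)
        return -1;
    size = (size_t)end;
    if (size < n)
        return -1;
    if ((image = malloc(size)) == NULL)
        return -1;
    rewind(fp);
    if (fread(image, 1, size, fp) == size) {
        for (i = 0; i + n <= size; i++) {
            if (memcmp(image + i, from, n) == 0) {
                found = (long)i;
                break;
            }
        }
    }
    free(image);
    if (found < 0)
        return -1;
    /* Seek back and overwrite in place; nothing else is touched. */
    if (fseek(fp, found, SEEK_SET) != 0 ||
        fwrite(to, 1, n, fp) != n || fflush(fp) != 0)
        return -1;
    return found;
}
```

The search uses memcmp() rather than the str functions on the file
contents, so '\0' and '\n' bytes in the image are treated like any
other data.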
-- 
-- 
                N u m q u a m   G l o r i a   D e o 

       \                Michael I. Bushnell
        \               HASA - "A" division
        /\              mike@turing.unm.edu
       /  \ {ucbvax,gatech}!unmvax!turing.unm.edu!mike

sommar@enea.se (Erland Sommarskog) (09/20/88)

Jamie Hanrahan (jeh@crash.CTS.COM) writes:
>I know, I know -- for many applications stream I/O makes for much cleaner
>program design.  But for others, it doesn't, at least not when you have
>good alternatives available.  

I don't think one should over-emphasize the importance of what I/O-
concept the OS uses. If I program in a high-level language it is
rather the I/O-concept of that language which is of interest. At 
least if I/O is well-defined. In many modern languages, I/O is not 
part of the language, but rather a library which could be more or
less standardized. What is left is of course the question of efficiency.

So if the language like C only has stream I/O (I assume it is so, 
I don't speak C, so I could be wrong) then we don't benefit from 
a complex file system when all we want is simple streams.

Ada, on the other hand, has text files, and record files both for
sequential and direct access. For the compiler-writer it may be
of interest if the file system supports the appropriate formats,
for me as a programmer it does not. Whether it's in the file system
or the RTL doesn't matter.
  Jamie Hanrahan complained that stream I/O meant that in-band data
were used as a terminator. In practice this means that writing an LF in 
the middle of a text line is impossible in Unix, while it is quite OK
in VMS. (Which on the other hand imposes a maximum length on the line.)
  So what about Ada? If I write an LF character the result will be
different on VMS and Unix? Non-portable? Yes, but the manual also
clearly says that I/O of non-printable characters is not defined
by the language.

-- 
Erland Sommarskog            ! "Hon ligger med min b{ste v{n, 
ENEA Data, Stockholm         !  jag v}gar inte sova l{ngre", Orup
sommar@enea.UUCP             ! ("She's making love with best friend,
                             !   I dare not to sleep anymore")

jeh@crash.cts.com (Jamie Hanrahan) (09/20/88)

No, this isn't a followup rebuttal, even though I've been beat up pretty 
badly re. my statement about "record separators" (okay, okay, "line 
separators") in Unix files.  I said my piece already, right?  

But I was annoyed to see someone say "Please, don't start another
Unix vs. VMS war".  I don't think this is a "war" at all.  I think I've
learned a bit about the right way to think about Unix files, knowledge
which will no doubt come in handy some day, probably sooner than I think it
will (if past experience is any guide).  Maybe some other folks have learned
something about VMS files too.  Isn't this what the net is about?  (But if
someone says "Please don't let this get out of hand", I'll second.)

jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/20/88)

In article <68855@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>Umm, as others have already pointed out, UNIX doesn't use '\n' as a record
>separator; it uses it as a *line* separator.  UNIX - like VMS - ultimately (at
>the kernel level) implements files as a sequence of bytes (RMS sits on top of
>QIOs that read virtual blocks of the file, *n'est ce pas?*).

vms has file attributes directly associated with the file.  qio does
read virtual blocks - but you can't easily convince rms to read a file
in some mode other than the mode the file was created with.  if you
have an isam file you want to read as a 80 character fixed length record
file, it's qio or nothing [ but grief ]
-- 
John F. Haugh II (jfh@rpp386.Dallas.TX.US)                   HASA, "S" Division

    "If the code and the comments disagree, then both are probably wrong."
                -- Norm Schryer

daryl@ihlpe.ATT.COM (Daryl Monge) (09/20/88)

In article <68853@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>> What I would like in the UNIX file system the VMS system has is the ability
>> to break links when a file is written.
>
>This can happen in UNIX as well
>(It can, I believe, also *not* happen in VMS if your application does whatever
>magic is needed to write on an existing file rather than creating a new file
>with a higher version number.)

Sorry for not being complete enough.  We not only have our own applications, 
but normal UNIX commands need to run properly also. (ex: cc awk sed)

The observation that VMS is not a complete solution is correct.  Some
tools modify the file in place.  However, most VMS commands work right 
on VMS linked files but most UNIX commands do not work the way I need.

Needless to say, there are some problems with what I want to do.  For
example, consider O_RDWR mode or O_APPEND.  The way you would want a link
broken in this case is to have the file copied before modified, or have some
type of block copy on write (as in paged virtual memory management.)
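
The "copy before modify" variant can be approximated in user code on
UNIX.  A rough sketch, with assumptions: break_link is a made-up name,
the ".cow" suffix for the temporary file is arbitrary, and an
application would have to call it itself before opening the file for
writing (stock commands will not).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* If `path` has more than one hard link, replace that name with a
 * private copy so later writes do not show through the other links.
 * Returns 1 if a copy was made, 0 if none was needed, -1 on error. */
int break_link(const char *path)
{
    char tmp[1024], buf[8192];
    struct stat st;
    int in, out, len;
    ssize_t n;

    if (stat(path, &st) != 0)
        return -1;
    if (st.st_nlink < 2)
        return 0;                       /* nothing is shared */
    len = snprintf(tmp, sizeof tmp, "%s.cow", path);
    if (len < 0 || len >= (int)sizeof tmp)
        return -1;
    if ((in = open(path, O_RDONLY)) < 0)
        return -1;
    out = open(tmp, O_WRONLY | O_CREAT | O_EXCL, st.st_mode & 0777);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;

        while (done < n) {              /* write() may be partial */
            ssize_t w = write(out, buf + done, (size_t)(n - done));

            if (w <= 0) {
                n = -1;
                break;
            }
            done += w;
        }
        if (n < 0)
            break;
    }
    close(in);
    if (close(out) != 0)
        n = -1;
    /* rename() swaps the copy in under the old name in one step. */
    if (n < 0 || rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 1;
}
```

This copies the whole file up front; it does nothing for a file that is
already open through another name, and it does not preserve ownership
or timestamps.  A block-level copy on write would need help from the
file system itself.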

Daryl Monge				UUCP:	...!att!ihcae!daryl
AT&T					CIS:	72717,65
Bell Labs, Naperville, Ill		AT&T	312-979-3603

eric@snark.UUCP (Eric S. Raymond) (09/20/88)

In article <13608@mimsy.uucp>, chris@mimsy.UUCP (Chris Torek) writes:
> (Henry Spencer and Geoff Collyer rewrote the B news software and got
> a similar order of magnitude performance increase, without changing
> the file formats at all.)

And I did likewise, with similar results, for B3.0. Chris is, as usual, quite
correct; the fault lies not in our file formats, but in our code. The major
win was just eliminating the fork-per-article overhead in the unbatcher.

The principle exemplified here bears repeating yet again:

	A CLEAN DESIGN IS THE ROYAL ROAD TO SPEEDY CODE

and fiddling with flat-vs-ISAM files, clever code hacks or other 'micro-level'
optimizations is usually a recipe for lots of pain with very little gain.

-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

marc@ima.ima.isc.com (Marc Evans) (09/21/88)

Below is a program which I received from DEC a while back which demonstrates
a mechanism which can be used to modify the type of file that RMS thinks any
file is. I have used this as it is below, to work under VMS, while maintaining
UNIX like file IO characteristics.

Happy Hacking...
===============================================================================
Marc Evans | decvax<--\    /-->marc<--\               | That's not a bug...It's
Synergytics| harvard<--\  /            \  /--->norton | a design feature... 8-)
Pelham, NH | necntc<---->ima<---->symetrx<---->dupont | =======================
===============================================================================
-------------------- C U T   H E R E ------------------------------------------
#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  mungattr.c build.com
# Wrapped by marc@edogte on Tue Sep 20 11:41:23 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'mungattr.c' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'mungattr.c'\"
else
echo shar: Extracting \"'mungattr.c'\" \(2842 characters\)
sed "s/^X//" >'mungattr.c' <<'END_OF_FILE'
X/****************************************************************************
X *	Program:	mungattr.c
X *	Purpose:	This program changes the file attributes and record
X *			length to a specified value.
X *	Author:		Mark Turner - Language Support Team (DEC)
X *	Date:		June, 1988
X ****************************************************************************
X *	Modified:	Marc Evans - Independent Contractor (Synergytics)
X *	Date:		Sept., 1988
X ****************************************************************************/
X
X#include <stdio.h>
X#include <iodef.h>
X#include <fibdef.h>
X#include <atrdef.h>
X#include <descrip.h>
X#include <stat.h>
X#include "fatdef.h"
X
struct
X{	short cond_value, count;
X	int info;
X} iosb;
X
struct
X{	unsigned short w_size, w_type;
X	unsigned int l_addr;
X} acb[2];
X
struct fibdef fib;
struct stat my_buff;
struct
X{	unsigned char rtype, rattrib;
X	short rsize;
X	char filler_2[12];
X	short mrec;
X	char filler_3[14];
X} ratt_area;
X
int status;
X
short func_code, chan;
X
X$DESCRIPTOR(fibdesc, &fib);
X$DESCRIPTOR(device, "SYS$DISK:");	/* Disk the file is on */
X
main(argc,argv)
char **argv;
X{
X	char *filename;
X
X	/* Did the invoker supply a filename? */
X	if (argc != 2)
X	{	fprintf(stderr,"USAGE: %s filename\n",argv[0]);
X		exit(1);
X	}
X	filename = argv[1];
X
X	/* Get the FID of the file */
X	stat(filename, &my_buff);
X
X	/* Assign a channel to the disk */
X	if (((status = SYS$ASSIGN(&device,&chan,0,0))&1) != 1)
X	{ LIB$STOP(status); }
X
X	/* Init the appropriate fields of the FIB */
X	fibdesc.dsc$w_length = FIB$C_LENGTH;
X	fib.fib$r_acctl_overlay.fib$r_acctl_bits0.fib$v_write = 1;
X	fib.fib$r_fid_overlay.fib$w_fid[0] = my_buff.st_ino[0];
X	fib.fib$r_fid_overlay.fib$w_fid[1] = my_buff.st_ino[1];
X	fib.fib$r_fid_overlay.fib$w_fid[2] = my_buff.st_ino[2];
X
X	/* Set up the attribute control block */
X	acb[1].w_size = 0;
X	acb[1].w_type = 0;
X	acb[1].l_addr = 0;
X	acb[0].w_size = ATR$S_RECATTR;
X	acb[0].w_type = ATR$C_RECATTR;
X	acb[0].l_addr = &ratt_area;
X
X	/* Access the file */
X	func_code = IO$_ACCESS | IO$M_ACCESS;
X	status = SYS$QIOW(0,chan,func_code,&iosb,0,0,&fibdesc,0,0,0,&acb,0);
X	if ((status & 1) != 1)
X	{ LIB$STOP(status); }
X	if ((iosb.cond_value & 1) != 1)
X	{ LIB$STOP(iosb.cond_value); }
X
X	/* Change the file to a sequential stream file */
X	ratt_area.rtype = FAT$C_FIXED | FAT$C_SEQUENTIAL;
X	ratt_area.rsize = 512;
X	ratt_area.mrec = 512;
X	ratt_area.rattrib = FAT$M_IMPLIEDCC;
X
X	/* Modify the file header information */
X	status = SYS$QIOW(0,chan,IO$_MODIFY,&iosb,0,0,&fibdesc,0,0,0,&acb,0);
X	if ((status & 1) != 1)
X	{ LIB$STOP(status); }
X	if ((iosb.cond_value & 1) != 1)
X	{ LIB$STOP(iosb.cond_value); }
X
X	/* Deaccess the file */
X	func_code = IO$_DEACCESS;
X	status = SYS$QIOW(0,chan,func_code,&iosb,0,0,&fibdesc,0,0,0,0,0);
X	if ((status & 1) != 1)
X	{ LIB$STOP(status); }
X	if ((iosb.cond_value & 1) != 1)
X	{ LIB$STOP(iosb.cond_value); }
X
X}
END_OF_FILE
if test 2842 -ne `wc -c <'mungattr.c'`; then
    echo shar: \"'mungattr.c'\" unpacked with wrong size!
fi
# end of 'mungattr.c'
fi
if test -f 'build.com' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'build.com'\"
else
echo shar: Extracting \"'build.com'\" \(80 characters\)
sed "s/^X//" >'build.com' <<'END_OF_FILE'
X$ set verify
X$ cc/opt/nodebug mungattr.c
X$ link/nodebug mungattr
X$ set noverify
END_OF_FILE
if test 80 -ne `wc -c <'build.com'`; then
    echo shar: \"'build.com'\" unpacked with wrong size!
fi
# end of 'build.com'
fi
echo shar: End of shell archive.
exit 0

guy@gorodish.Sun.COM (Guy Harris) (09/21/88)

> vms has file attributes directly associated with the file.  qio does
> read virtual blocks - but you can't easily convince rms to read a file
> in some mode other than the mode the file was created with.

As I remember, the VMS file attributes are maintained, but not really used, by
the code that I would refer to as the VMS file system (the ACPs or "extended
QIO processors" or whatever they call the new stuff they added in recent
versions).  I think there are QIOs (perhaps undocumented) that RMS uses to
fetch and store those attributes.

guy@gorodish.Sun.COM (Guy Harris) (09/21/88)

> But standard I/O runs in user mode, not in a more-privileged mode.  What
> you consider the OS to be is not what it in fact is.

Help stamp out ontology in our lifetimes!

"In fact", the OS "is" what somebody defines it to be.  I see no reason to
define it in such a way as to exclude code running in user mode; in fact, a
good reason to define it to *include* that code is that it helps people realize
that "putting something into the OS" need not be synonymous with "putting it
into the kernel".

> A good working description of OS is that part of the system which the
> arbitrary user cannot rewrite and use in lieu of the distributed code.

An equally good, if not better, working description is "that part of the system
that comes on the tape that the vendor calls the operating system distribution
tape".

If you want a term that describes that part of the system that the user can't
replace, "kernel" is an OK one, although systems such as VMS run some of that
part of the system in modes less privileged than kernel mode.  "OS" is somewhat
of a poor one, because it does not "in fact" correspond to the way "operating
system" is generally used.  "OS/360" includes a bunch of stuff not run in
supervisor state; for instance, as I understand it, the access methods actually
run in problem state, and build channel programs and issue SVCs to get them
started.

Lacking any better term, I would opt for "privileged portion of the OS" or some
such term.

> But...the user can't necessarily replace RMS without getting to write
> his own CHME dispatch table, something the kernel is not likely to let
> him do.

This depends on what you mean by "replace".

Under RSX-11, you can presumably "replace" RMS, in the sense of having a
package that does the same general sort of function, without being able to get
at a more privileged mode.  If an RMS-like package doesn't require QIOs that
can be run only from executive mode, you could do the same under VMS (although
I wouldn't put it past DEC to have stuck in executive-mode-only QIOs which RMS
uses).

The point is that lots of people confuse "OS" with "kernel", and I suspect many
of them either advocate putting features into the kernel, express alarm at the
prospect of features being put into the kernel, or think features are
implemented in the kernel because they confuse "OS" with "kernel".

jbs@fenchurch.MIT.EDU (Jeff Siegal) (09/21/88)

I'm not so much concerned with the issue of stream vs. record-oriented
file access.  Either can always (perhaps not easily or quite so
efficiently, but if you need to get the job done...) be used to
simulate the other.
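
As a minimal sketch of records simulated on top of a byte stream: each
record below is stored as a 2-byte length followed by its data, so a
record may contain '\n' or '\0' freely.  The length-prefix layout and
the names put_record() and get_record() are assumptions made up for
illustration; this is not claimed to be RMS's on-disk format.

```c
#include <assert.h>
#include <stdio.h>

/* Write one record: 2-byte big-endian length, then the data bytes.
   Returns 0 on success, -1 on error or if the record is too long. */
int put_record(FILE *fp, const void *data, size_t len)
{
    assert(fp != NULL && data != NULL);
    if (len > 0xFFFF)
        return -1;
    if (putc((int)(len >> 8), fp) == EOF ||
        putc((int)(len & 0xFF), fp) == EOF)
        return -1;
    return fwrite(data, 1, len, fp) == len ? 0 : -1;
}

/* Read the next record into buf (capacity cap bytes).
   Returns the record length, or -1 on EOF, error, or overflow. */
long get_record(FILE *fp, void *buf, size_t cap)
{
    int hi, lo;
    size_t len;

    assert(fp != NULL && buf != NULL);
    if ((hi = getc(fp)) == EOF || (lo = getc(fp)) == EOF)
        return -1;
    len = ((size_t)hi << 8) | (size_t)lo;
    if (len > cap)
        return -1;
    return fread(buf, 1, len, fp) == len ? (long)len : -1;
}
```

The cost is that such a file is no longer a "text file" to the usual
tools, which is the trade-off being argued about in this thread.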

I think a more fundamental advantage of the VMS I/O system is that
QIO's can be queued to execute asynchronously (similarly for RMS
operations).  An AST (software trap) is delivered to your process when
the I/O completes.  The AST tells you which I/O operation completed,
and the error status.

On Unix, you can get asynchronous I/O by going through the buffer
cache, but this doesn't provide a clean way for the error status to be
returned (for example, NFS writes that go over quota can appear to
complete, but return the error later, when performing another
operation), and doesn't allow you full access to the device (for
example, the buffered tape device only provides 1024-byte records).

I believe certain VMS I/O operations are performed directly from/to
process memory, rather than being copied from/to the kernel, but this
is "only" a performance issue (and an arguable one at that).

Jeff Siegal

aperez@cvbnet2.UUCP (Arturo Perez Ext.) (09/21/88)

From article <68855@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
>> I disagree.  I much prefer VMS's variable-length-record text file format
>> to Unix's byte-stream.  Why?  Because the Unix byte stream uses perfectly
>> legitimate data as a record separator.  To make matters worse, the standard
>> C method for dealing with strings uses a *different* character as a string
>> terminator!  Unix has a lot of GREAT ideas in it, but this isn't one of them.
> 
> One file format UNIX happens to implement atop this abstraction is the "text
> file"; "text files" consist of "lines", which are sequences of bytes (not
> containing '\0' - some applications can't handle them, since it's the C string
> terminator) ending with '\n'.
> 
> Other file formats exist, such as executable images and archives, which are,
> respectively, the UNIX equivalents of images (and object files - object files
> and images use the same format) and library files.
> 
> However, UNIX doesn't come standard with any libraries that implement "record"
> files.  Such libraries are available from third-party vendors (e.g., C-ISAM),
> and I very much doubt that they use '\n' or any other particular byte value as
> a record separator.
> 

I'm curious.  I understand VMS's supposed need for the various file formats.
And although I disagree, that's DEC's decision; let them live with it.  They just
want application designers to use the tools that DEC designed.  Maybe because
it makes their software easier to support.  I don't really know.  And I don't
really work with VMS often enough to really care.

But I do know from experience that the Unix file system is so straightforward
that ANYBODY can use it without having to worry about the millions of 
descriptors that are needed to set up an I/O request on RMS. 


What I'm curious about is the fact that I've never heard of any record
access libraries for Unix.  I know that I've written simpleminded record
access applications.  I'm sure other people have as well.  Is there anyone
actually selling record access libraries for the Unix community?  If not
why isn't anyone doing it?


Arturo Perez
ComputerVision, a division of Prime
primerd!cvbnet!aperez
The difference between genius and idiocy is that genius has its limits.

gwyn@smoke.ARPA (Doug Gwyn ) (09/21/88)

In article <3954@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>In practice this mean writing an LF in the middle of a text line is
>impossible in Unix, while is quite OK in VMS.

On the other hand, what is a "text line" that occupies portions of
multiple lines on a display device?  Change "text line" to "text
record" and the concept makes more sense, but then why is text
necessarily organized into records, and why do these records look
like they do instead of something like

x T aps
x res 723 1 1
x init
x font 1 R
x font 2 I
x font 3 B
x font 4 H
x font 5 CW
x font 6 S
x font 7 S1
x font 8 GR
V0
p1
s10
f1
H696
V480
h2075c-
35 33152 33-n120 0
H696
V960
cT
67h54i28sw71i28sw71a50nw86e45x50a50m82p52l28e45.n120 0
x trailer
V7953
x stop

dg@lakart.UUCP (David Goodenough) (09/21/88)

From article <3506@ihlpe.ATT.COM>, by daryl@ihlpe.ATT.COM (Daryl Monge):
> 
> What I would like in the UNIX file system the VMS system has is the ability
> to break links when a file is written.  I have an application where we wish
> to share information between two directories, but I want the link broken if
> the file is written when accessed from any directory it is listed in.  This
> happens with VMS links because of versions of files in the file system.

You have it. Consider the following:

	% mkdir foo bar
	% cat /etc/passwd >foo/file
	% ln foo/file bar

I now have one file in foo and bar - same data, cause it's a link

	% mv foo/file foo/file.old

Make a new version (o.k. so version numbers don't work so hot, but with a 10
line procedure it can be implemented)

	% update <foo/file.old >foo/file

You now have your new version, PLUS the old original, which is still a link.
In both instances we've done about the same amount of work: in either
instance update is going to have to cause the system to write the new file
completely, but that's life.

Just like you wanted.
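
The same steps can be sketched in C, assuming the POSIX rename(),
unlink() and stat() calls.  The function names same_inode() and
write_new_version() are made up for this example, and the roll-back on
failure is my own addition rather than part of the shell recipe above.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if both names refer to the same inode, 0 if not, -1 on error. */
int same_inode(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Undo a failed update: drop the partial new file, restore the old name. */
static void roll_back(const char *path, const char *oldpath)
{
    unlink(path);
    rename(oldpath, path);
}

/* Move `path` aside to `oldpath` (mv foo/file foo/file.old), then create
   a fresh `path` holding `text`.  Any other link to the old inode is
   left alone.  Returns 0 on success, -1 on error. */
int write_new_version(const char *path, const char *oldpath, const char *text)
{
    FILE *fp;

    assert(path != NULL && oldpath != NULL && text != NULL);
    if (rename(path, oldpath) != 0)
        return -1;
    fp = fopen(path, "w");              /* this creates a new inode */
    if (fp == NULL) {
        roll_back(path, oldpath);
        return -1;
    }
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        roll_back(path, oldpath);
        return -1;
    }
    if (fclose(fp) != 0) {
        roll_back(path, oldpath);
        return -1;
    }
    return 0;
}
```

After write_new_version("foo/file", "foo/file.old", ...), same_inode()
should report that bar/file and foo/file.old still share an inode while
foo/file does not.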
-- 
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!cca!lakart!dg			+-+-+ |
						  	  +---+

jeremy@chook.ua.oz (Jeremy Webber) (09/21/88)

In all this discussion I have not seen mention of the fact that you can open a
VMS file for block i/o and then treat it as a stream of blocks.  This can be
useful for just moving data around.  It can also be dangerous, but no more so
than treating a file as a stream of bytes.

One thing that I think DEC stuffed up badly though is that they did not define
a standard for text files.  Instead, you have variable-length-carriage-control,
Fortran carriage control, List carriage control, stream-LF, stream-CR and
probably half a dozen others that I have not thought about.  This makes writing
text file manipulation programs, such as text editors, a real pain.  It also
makes manipulation of text by programs written in different languages
hazardous.  I believe that DEC should modify the run time libraries of all
languages to convert internal text to and from a standard text form when
reading and writing files.

I can see the performance advantages of letting the file system "know" about
RMS.  Particularly with regard to record locking and other commercial uses.

In short, there are advantages and disadvantages in the VMS as against the UNIX
method of treating files, and you'll probably choose the one best for your
application.

-Jeremy Webber (jeremy@chook.ua.oz.au)
Computer Science, Adelaide University, Australia

"One of these days I'll get around to writing a .signature file"

guy@gorodish.Sun.COM (Guy Harris) (09/21/88)

> I think a more fundamental advantage of the VMS I/O system is that
> QIO's can be queued to execute asynchronously (similarly for RMS
> operations).

This is orthogonal to the issue of the VMS file system vs. the UNIX file
system.  You could have a VMS-style file system, RMS and all, atop a UNIX-style
I/O subsystem (imagine implementing RMS atop "read" and "write", with RMS doing
no read-ahead nor write-behind, but leaving it up to the "read" and "write"
code to do so), and you could have a UNIX-style file system, wherein the file
system thinks of files merely as collections of bytes, atop a VMS-style I/O
subsystem (imagine VMS with QIOs to open/read/write/close/etc. files being the
published fundamental I/O operations, and with "read" and "write" perhaps
implemented as library routines atop this).

I suspect you may see various flavors of asynchronous I/O on files supported
by various versions of UNIX in the future.  ("Asynchronous I/O" means
"asynchronous" in the VMS sense - e.g., "aread" and "awrite" calls that do not
block until completion, and an "iowait" call to wait for one or more such calls
to complete - not "asynchronous" in the 4.[23]BSD sense, where the ability to
do some amount of I/O without blocking can be signalled for certain objects by
sending a SIGIO to interested processes.)  I think HP may already have this.

> On Unix, you can get asynchronous I/O by going through the buffer
> cache, but this doesn't provide a clean way for the error status to be
> returned (for example, NFS writes that go over quota can appear to
> complete, but return the error later, when performing another
> operation),

"fsync" provides this to some degree; NFS implementations tend to provide such
write-behind errors to the caller of "fsync", but I don't know if they also
provide write-behind errors for local file systems, e.g. an attempt to push a
buffer to disk getting a physical I/O error.
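
A minimal sketch of checking for such deferred errors from a program
that writes through stdio, assuming the POSIX fileno() and fsync()
calls.  The helper name flush_and_sync() is made up for this example,
and whether a given system actually reports write-behind errors at this
point is exactly the open question above.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Push stdio's buffer to the kernel, then ask the kernel to push its
   buffers for this file to the device.
   Returns 0 on success, -1 on error (errno says why). */
int flush_and_sync(FILE *fp)
{
    assert(fp != NULL);
    if (fflush(fp) == EOF)
        return -1;
    if (fsync(fileno(fp)) != 0)     /* a deferred write error may show up here */
        return -1;
    return 0;
}
```

The point is only that the status of every step gets looked at; a
caller that ignores the return value of flush_and_sync() is no better
off than one that never called fsync().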

meo@stiatl.UUCP (Miles O'Neal) (09/21/88)

In article <68855@sun.uucp>, guy@gorodish.Sun.COM (Guy Harris) writes:
> 	3) Many more UNIX facilities use text files, rather than record files,
> 	   as their underlying file format; while one reason for this may be
> 	   the absence of a "record file" library, another reason is that you
> 	   can use the standard UNIX text file tools to manipulate those files.

If you even have your data files as text files, debugging
becomes much easier. For instance, would you rather debug

98764389437034gh307ytfhr398f39

or

12/22/88 01:30 10790 100 100 382 -1

?
These are not real data, but examples of what data files I've dealt
with looked like. The processing to do all this is cheap nowadays,
so why not use text files if there is no OVERWHELMING reason not to?

Another thing this buys you is that, in my experience, it's easier
to change file formats if you use text files. It requires a little
planning, but in general is a lot less work than doing the same
thing with any other type of data.

Strangely enough, you can do similar things with VMS, OS/32, or
even CP/M...

peter@ficc.uu.net (Peter da Silva) (09/22/88)

In article <10105@eddie.MIT.EDU>, jbs@fenchurch.MIT.EDU (Jeff Siegal) writes:
> I think a more fundamental advantage of the VMS I/O system is that
> QIO's can be queued to execute asynchronously (similarly for RMS
> operations).  An AST (software trap) is delivered to your process when
> the I/O completes.  The AST tells you which I/O operation completed,
> and the error status.

I really really wish UNIX supported this. At home I use AmigaDOS, and
it lets you do the moral equivalent:

	Issue a write by sending a message to the file's handler task.
	Do something else.
	Either:
		get a software interrupt when the I/O completes,
	or:
		wait on a signal bit (event flag, for DEC types).

Since every outstanding I/O has a signal bit, you can always wait for
whatever combination of events you want.

There's nothing in the UNIX file model that should prevent this, and
in fact the signal bit for a FD could be the FD.

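	printf("Hit any key to abort.\n");
	/* aread(), await() and akill() are hypothetical calls */
	if(aread(0, &ch, 1) && aread(fd, addr, nbytes)) {
		bitmap_t bits = (1<<fd)|(1<<0);
		bits=await(bits);
		if( (bits & (1<<fd)) == 0) {	/* read on fd not done: a key was hit */
			printf("Aborted\n");
			akill(fd);
		}
		if( (bits & (1<<0)) == 0)	/* no key hit: cancel the key read */
			akill(0);
	}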
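
For comparison, the select() call that came in with 4.2BSD covers part
of this.  It waits on a set of descriptors, but it reports that a read
would not block rather than that a queued read has completed, so it is
not the asynchronous I/O being asked for.  A minimal sketch, in which
the function name read_first_ready() is made up for the example:

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations */
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait until either descriptor is readable, then read one byte from it.
   If both are ready, fd_a wins.
   Returns the descriptor that supplied the byte, or -1 on error or EOF. */
int read_first_ready(int fd_a, int fd_b, char *ch)
{
    fd_set rd;
    int maxfd = fd_a > fd_b ? fd_a : fd_b;
    int ready;

    assert(ch != NULL);
    assert(fd_a >= 0 && fd_b >= 0 && maxfd < FD_SETSIZE);

    FD_ZERO(&rd);
    FD_SET(fd_a, &rd);
    FD_SET(fd_b, &rd);

    /* No timeout: block until at least one descriptor is readable. */
    if (select(maxfd + 1, &rd, NULL, NULL, NULL) < 0)
        return -1;

    ready = FD_ISSET(fd_a, &rd) ? fd_a : fd_b;
    return read(ready, ch, 1) == 1 ? ready : -1;
}
```

The read itself still happens synchronously after select() returns, and
there is nothing here corresponding to akill().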

Oh well, something else to put on the list of "what I'd do if I was
in charge of UNIX", along with things like mapping arbitrary files
into memory and cleaning up the ioctl/fcntl mess...
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

dave@arnold.UUCP (Dave Arnold) (09/22/88)

eric@snark.UUCP (Eric S. Raymond) writes:
> In article <13608@mimsy.uucp>, chris@mimsy.UUCP (Chris Torek) writes:
> > (Henry Spencer and Geoff Collyer rewrote the B news software and got
> > a similar order of magnitude performance increase, without changing
> > the file formats at all.)
> 
> [...]
> 
> The principle exemplified here bears repeating yet again:
> 
> 	A CLEAN DESIGN IS THE ROYAL ROAD TO SPEEDY CODE
> 

I couldn't agree more.  People I work with seem to get bogged down
in the "How big of a QIO can I do" syndrome during early program
design and development.  I really protest this (especially when they
encourage me to do the same).  One of the reasons why I am a *GREAT*
:-) programmer...is...because...: I much prefer to view things in the
most simple way.  I actually go to great effort rewriting things
(with my boss's glare $$$) just to achieve a simpler program design.
Sometimes the rewrite achieves better performance (not intentionally).
And if not, facilitates easier performance enhancements---But I save
those for last.

This is the thing that I love about UNIX so much that I wish VMS
shared: SIMPLICITY.  Everything is so damn simple, it goes right
over some people's head.  Now if UNIX only had AST's, timer queues,
exception handling, and a better "SHELL"---I would be in heaven.

Remember the days when we would bring monolithic
straight-line code to bed with us, and make marks on the listing?

I even remember back in the late 1970's my boss teaching me the
cons of structured programming by explaining to me that a function
call just turns into a JMP instruction :-)  This is the 80's!!!
Soon to be 90's!! Let's not get stuck in the dark ages!
-- 
Dave Arnold
dave@arnold.UUCP	{cci632|uunet}!ccicpg!arnold!dave

eric@snark.UUCP (Eric S. Raymond) (09/22/88)

In article <3453@crash.cts.com>, jeh@crash.CTS.COM (Jamie Hanrahan) writes:
>                 I don't think this is a "war" at all.  I think I've
> learned a bit about the right way to think about Unix files, knowledge
> which will no doubt come in handy some day, probably sooner than I think it
> will (if past experience is any guide).  Maybe some other folks have learned
> something about VMS files too.  Isn't this what the net is about?

Yup. Me, I learned a lot about VMS from your postings. Not that I'd ever use
it without you put a gun to my head, but I learned a lot. Thank you for your
lucid descriptions of how RMS works.

BTW, cultural differences are funny; I kept wanting to parse that acronym RMS
as "Richard M. Stallman", an entity even more complex and obscure (but much
less brain-damaged :-)) than VMS file I/O.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

allbery@ncoast.UUCP (Brandon S. Allbery) (09/23/88)

As quoted from <179@arnold.UUCP> by dave@arnold.UUCP (Dave Arnold):
+---------------
| In article <3597@encore.UUCP>, bzs@encore.UUCP (Barry Shein) writes:
| > The problem with the Unix "unstructured" approach is that either you
| > use some of the (very few) library routines (dbm is a major one, so
| > are the object deck readers in SYSV) or you roll your own, each
| > application will have its own way of storing data (compare termcap
| > with passwd with inittab with crontab with ...) often not terribly
| > well documented or efficient (agreed, often efficiency is a poor
| > excuse for obscurity.)
| 
| This is not a problem.  It's not often that your application requires
| you to "Roll your own".  And you get a very simple filesystem.
+---------------

This all ties together with the terminfo-vs.-termcap discussion.  Actually, I
have written an interpreted terminfo (as part of the "tgraph" compatibility
package for SVR2 curses); it is slow, but that's mainly because of laziness.
It should be quite possible to write it to work quickly, with the same
longer name usage *but* *extensible* unlike terminfo.

Just as byte-stream file systems are more general and more useful than typed
file systems, simple, general, FAST "access method" routines on top of the
stream file systems are better than either typed file systems or roll-your-
own access methods.  (Example:  COFF, or the new format perhaps, could
easily be generalized to make a "resource library file" similar to Macintosh
resource forks.  Which would make "ld" a general utility rather than just an
object relocation editor.)

Termcap's obscurity and outright bugs (skip a backslash or expand a tab to
spaces and the whole file goes to pot) make it a rather bad access method;
while fixed versions (such as the Gnu version) handle the bugs, it's still
harder to understand those two-character capnames than terminfo capnames.
The interpretive terminfo-style reader is a step in the right direction.  I
also have a terminfo-like routine (currently implemented via yacc, so it's
REALLY slow) which supports typed arrays.

On the other hand, termcap/info doesn't solve all problems; it's senseless
to complain about termcap and passwd not having the same format, they're
keyed and used differently.  Passwd uses yet another SIMPLE, GENERAL format,
which is easily manipulated even at the shell level.  Crontab is actually a
simple variant of that format, and perhaps should be merged, but the
existing tools can very easily deal with both.  (After all, there's really a
difference only in that a colon is used as passwd's field separator, while
crontab uses a tab.  Interpretation of fields varies, but that's going to
happen anyway in a real-world database situation.)
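
To make the "only the separator differs" point concrete, here is a
minimal sketch of one field splitter that serves both formats.  The
name split_fields() is made up for the example.  Unlike strtok(), it
keeps empty fields, which matters for passwd-style lines.

```c
#include <assert.h>
#include <string.h>

/* Split `line` in place on `delim`, storing up to `max` field pointers
   in `fields`.  Empty fields are kept.  If there are more than `max`
   fields, the last pointer holds the unsplit remainder.
   Returns the number of fields stored. */
int split_fields(char *line, char delim, char **fields, int max)
{
    int n = 0;
    char *p;

    assert(line != NULL && fields != NULL && delim != '\0');
    while (n < max) {
        fields[n++] = line;
        if (n == max)
            break;
        p = strchr(line, delim);
        if (p == NULL)
            break;
        *p = '\0';          /* terminate this field */
        line = p + 1;       /* next field starts after the separator */
    }
    return n;
}
```

Called with ':' it splits a passwd-style line, and called with '\t' it
splits a line whose fields are tab-separated; what each field means is
still up to the caller.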

++Brandon
-- 
Brandon S. Allbery, uunet!marque!ncoast!allbery			DELPHI: ALLBERY
	    For comp.sources.misc send mail to ncoast!sources-misc
"Don't discount flying pigs before you have good air defense." -- jvh@clinet.FI

cmf@cisunx.UUCP (Carl M. Fongheiser) (09/24/88)

In article <69166@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>> But...the user can't necessarily replace RMS without getting to write
>> his own CHME dispatch table, something the kernel is not likely to let
>> him do.
>
>This depends on what you mean by "replace".
>
>Under RSX-11, you can presumably "replace" RMS, in the sense of having a
>package that does the same general sort of function, without being able to get
>at a more privileged mode.  If an RMS-like package doesn't require QIOs that
>can be run only from executive mode, you could do the same under VMS (although
>I wouldn't put it past DEC to have stuck in executive-mode-only QIOs which RMS
>uses).

As a matter of fact, RMS does *not* do anything you can't do in user mode.
All of the QIO's for reading and writing file attributes are available in
user mode.  The only thing that makes having RMS run in executive mode
worthwhile is that open files can persist past the activation of a single
image.  (Remember that VMS processes typically last a lot longer than Unix
ones, normally from login to logout).  The tricky part about replacing RMS
is doing record-locking, since that's a concept foreign to the ACP itself.
Also note that in Version 4.0 and later, many of the system services use
RMS themselves.

				Carl Fongheiser
				University of Pittsburgh
				...!pitt!cisunx!cmf
				cmf@unix.cis.pittsburgh.edu
				cmf@PITTUNIX.BITNET

bzs@xenna (Barry Shein) (09/25/88)

If I can be permitted to summarize this discussion:

VMS's RMS can be useful in many situations and amounts to an added
application library bundled in with VMS which Unix folks would have to
go out and purchase separately (I've seen similar libraries for Unix
advertised in trade mags, they do exist.) Presumably one can add a
similarly useful access methods library to Unix, the biggest question
being the desirability of true asynchronous I/O (it's possible that,
from a pure performance standpoint, Unix wouldn't benefit that much
from this due to its buffer cache although some would still like it.)

VMS's biggest drawback, as regards RMS, is that there wasn't much more
discipline on the part of the applications designers to use
(preferably) one access method for most applications so utilities
could work together more smoothly. Having one utility produce a text
file which cannot be read in and manipulated by another seems to
violate "the law of least astonishment" in a major way. Simply
handling all the permutations is not as reliable as agreeing on one
format except where carefully justified. This is particularly true
when changing between programming languages (at least one reader
claims this.)

I think it's safe to say this was a constructive discussion.

	-Barry Shein, ||Encore||

mazumdar@fredonia.UUCP (Jin Mazumdar) (09/29/88)

	
	I have just been browsing through this discussion and have not
read all follow ups. Although UNIX does not have fixed length
records can one not convert any file in UNIX to fixed length records
using the dd utility?  On the other hand on fixed format systems the
best you could do is fake variable format with an end of record marker
and possibly wasting the rest of the record.

   Jin Mazumdar (uucp:) ...decvax!sunybcs!fredonia!mazumdar          
   >>>  The following are for historical interest only  <<<
   Dept. Of Math and C. S.     
   State University of New York College at Fredonia     
   Fredonia, N.Y. 14063         (716) 673 3459                               
 

dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/29/88)

In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes:
>Although UNIX does not have fixed length
>records...

It certainly does.  Look at the structure of /etc/utmp and /usr/adm/wtmp
or equivalent files on your system.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi

jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/30/88)

In article <4136@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes:
>>Although UNIX does not have fixed length
>>records...
>
>It certainly does.  Look at the structure of /etc/utmp and /usr/adm/wtmp
>or equivalent files on your system.

not in the typical sense.  there is no file-system level support for
fixed length records.  unix files are byte streams, meaning [ with the
exception of certain device files ] you can read 1 byte or, hardware
permitting, 1MB.

with other operating systems the size of the record is fixed at file
creation time and may not be changed without copying the contents of
the file using a file conversion utility of some type.  /etc/utmp may
be read one byte at a time, except that the "records" would not have
any meaning.


-- 
John F. Haugh II (jfh@rpp386.Dallas.TX.US)                   HASA, "S" Division

      "Why waste negative entropy on comments, when you could use the same
                   entropy to create bugs instead?" -- Steve Elias

allbery@ncoast.UUCP (Brandon S. Allbery) (10/07/88)

As quoted from <4136@bsu-cs.UUCP> by dhesi@bsu-cs.UUCP (Rahul Dhesi):
+---------------
| In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes:
| >Although UNIX does not have fixed length
| >records...
| 
| It certainly does.  Look at the structure of /etc/utmp and /usr/adm/wtmp
| or equivalent files on your system.
+---------------

The programs that use those files use fixed-length "records"; the file system
itself does not enforce them, however.  The difference is that you don't have
to tell your favorite binary editor that it must open /etc/utmp with a record
size of (sizeof (struct utmp)) bytes.
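
A minimal sketch of that convention: the record size lives in the
program, and the file is just bytes that get seeked into.  The struct
login_rec and the function read_record() are made up for the example;
this is deliberately not the real struct utmp, whose layout varies
between systems.

```c
#include <assert.h>
#include <stdio.h>

/* A hypothetical fixed-size record, standing in for something like
   struct utmp. */
struct login_rec {
    char user[8];
    char line[8];
    long when;
};

/* Fetch record number idx (counting from 0) from a file that is nothing
   more than login_rec structures written back to back.
   Returns 0 on success, -1 if the record is not there. */
int read_record(FILE *fp, long idx, struct login_rec *out)
{
    assert(fp != NULL && out != NULL && idx >= 0);
    if (fseek(fp, idx * (long)sizeof *out, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

Nothing stops another program from reading the same file a byte at a
time, or with the wrong struct; the "records" exist only by agreement
between the programs that use the file.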

++Brandon
-- 
Brandon S. Allbery, uunet!marque!ncoast!allbery			DELPHI: ALLBERY
	  For comp.sources.misc send mail to <backbone>!sources-misc
comp.sources.misc is moving off ncoast -- please do NOT send submissions direct
	  "So many articles, so little time...."  -- The Line-Eater