[comp.lang.c] VMS C & records in files

rja@edison.GE.COM (rja) (08/17/88)

I'm not aware of any solution to the problem of VMS file types.  The
problem is precisely that VMS is so record-oriented.  Even nominal 
text files don't work like UNIX.  We find that we have to use a loop of
successive calls to read() to fill (for example) a 512 byte buffer
because it gives only 1 record at a time even though you asked for 512
bytes. :-(
  UNIX and even MS-DOS will let you read 512 bytes in a chunk so it's VMS
that is brain-damaged in this case.
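
A minimal sketch of the loop described above, assuming a POSIX-style read() that may return fewer bytes than requested (one record, in the VMS case). The name read_full is hypothetical and not part of any library:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* read_full() is a hypothetical helper, not a library routine.
   It keeps calling read() until `want` bytes have arrived, end of
   file is reached, or an error occurs.  Returns the number of bytes
   stored in buf, or -1 on error. */
long read_full(int fd, char *buf, size_t want)
{
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);

        if (n == 0)                 /* end of file: return what we have */
            break;
        if (n < 0) {
            if (errno == EINTR)     /* interrupted: just retry */
                continue;
            return -1;
        }
        got += (size_t) n;          /* short read (e.g. one record): loop */
    }
    return (long) got;
}
```

A caller asking for 512 bytes would write read_full(fd, buf, 512) and treat a smaller positive result as "end of file reached".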

  If anyone hears of a SOLUTION to this problem with VMS C, please e-mail
me the details.

______________________________________________________________________________
         rja@edison.GE.COM      or      ...uunet!virginia!edison!rja  
     via Internet (preferable)          via uucp  (if you must)
______________________________________________________________________________

ward@eplrx7.UUCP (Rick Ward) (08/18/88)

[ article about difficulties with reading records in C on VMS ]

On a related note, why couldn't Digital have made it easier to call system
functions from C.  This difficulty alone makes C unusable on VMS, at least 
in my opinion :(.  The problem is that you have to allocate and build 
structures describing each variable you want to pass to a system routine.
YUCK!

Rick
-- 
    Rick Ward                         |        E.I. Dupont Co.
    uunet!eplrx7!ward                 |        Engineering Physics Lab
    (302) 695-7395                    |        Wilmington, Delaware 19898
                                      |        Mail Stop: E357-302

gwyn@smoke.ARPA (Doug Gwyn ) (08/19/88)

In article <1609@edison.GE.COM> rja@edison.GE.COM (rja) writes:
>We find that we have to use a loop of successive calls to read() to fill
>(for example) a 512 byte buffer because it gives only 1 record at a time
>even though you asked for 512 bytes.

UNIX also returns no more than one record per read() call.  The main
difference is that UNIX disk files are just one big record.  Other
UNIX files may be multi-record (e.g. magtape, terminal).

scjones@sdrc.UUCP (Larry Jones) (08/19/88)

In article <1609@edison.GE.COM>, rja@edison.GE.COM (rja) writes:
> I'm not aware of any solution to the problem of VMS file types.  The
> problem is precisely that VMS is so record-oriented.  Even nominal 
> text files don't work like UNIX.  We find that we have to use a loop of
> successive calls to read() to fill (for example) a 512 byte buffer
> because it gives only 1 record at a time even though you asked for 512
> bytes. :-(
>   UNIX and even MS-DOS will let you read 512 bytes in a chunk so it's VMS
> that is brain-damaged in this case.
> 
>   If anyone hears of a SOLUTION to this problem with VMS C, please e-mail
> me the details.

The solution is to use STANDARD io instead of system-dependent calls like
read.  If you use fread instead of read all works wonderfully.
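
As a rough illustration of the suggestion above: the C standard says fread() returns a short count only at end of file or on error, so the loop lives inside the library rather than in the caller. The wrapper name fill_block is hypothetical, and how a given VMS C library maps this onto records is not verified here.

```c
#include <stdio.h>

/* fill_block() is a hypothetical wrapper name.  fread() keeps reading
   until `want` bytes are stored, end of file is reached, or an error
   occurs, so the caller does not see individual record boundaries. */
size_t fill_block(FILE *fp, char *buf, size_t want)
{
    return fread(buf, 1, want, fp);
}
```

A result smaller than `want` means end of file or an error; feof() and ferror() tell the two apart.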

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@sdrc
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150                  AT&T: (513) 576-2070
Nancy Reagan on superconductivity: "Just say mho."

warner@hydrovax.nmt.edu (M. Warner Losh) (08/19/88)

In article <644@eplrx7.UUCP>, ward@eplrx7.UUCP (Rick Ward) writes...
>On a related note, why couldn't Digital have made it easier to call system
>functions from C.  This difficulty alone makes C unusable on VMS, at least 
>in my opinion :(.  The problem is that you have to allocate and build 
>structures describing each variable you want to pass to a system routine.

It has been my experience you only need to build descriptors whenever you
are playing with strings.  Everything else is simply a matter of maybe
putting a & in front of what you want to call.  You can make your life
a lot easier if you use the following function (DEC, why didn't you provide
this?):

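#include <stdio.h>
#include <stdlib.h>	/* malloc() */
#include <string.h>	/* strlen() */
#include <descrip.h>

/* Build a heap-allocated string descriptor that points at st.
   st is not copied; the caller must free() the descriptor. */
struct dsc$descriptor * make_descriptor(st)
char *st;
{
	struct dsc$descriptor * temp;

	temp = (struct dsc$descriptor *) malloc 
		(sizeof (struct dsc$descriptor));
	if (temp == NULL)
		return (NULL);
	temp->dsc$w_length = strlen (st);
	temp->dsc$a_pointer = st;
	temp->dsc$b_dtype = DSC$K_DTYPE_T;
	temp->dsc$b_class = DSC$K_CLASS_S;
	return (temp);
}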

>YUCK!

Well, it is yucky if you include that code every time you want to make a 
simple system call.  The only caveat about this method is that you must 
free those allocated descriptors at some point.  All in all, it's fairly 
straightforward.

Now then, if you had done it the way that DEC documented it .....  The
VAX-C documentation that came with V2.2 is very bad.  Not as bad as many of 
the UNIX manuals, mind you, but still bad.  The stuff that comes with V2.3 
looks a lot better.

Warner
hydrovax%nmt@relay.cs.net

joshua@uop.edu (Ed Bates: Joshua is my son's name.) (08/19/88)

In article <8353@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
> In article <1609@edison.GE.COM> rja@edison.GE.COM (rja) writes:
> >We find that we have to use a loop of successive calls to read() to fill
> >(for example) a 512 byte buffer because it gives only 1 record at a time
> >even though you asked for 512 bytes.
> 
> UNIX also returns no more than one record per read() call.  The main
> difference is that UNIX disk files are just one big record.  Other
> UNIX files may be multi-record (e.g. magtape, terminal).

Could it be, El Guapo, that you want to read a block?  If this is true, try
the VMS IO$_READxBLK operations.

						Ed Bates
						Academic Computer Specialist
						University of the Pacific
						    in sunny Stockton, CA!

joshua@uop.edu (Ed Bates: Joshua is my son's name.) (08/19/88)

In article <644@eplrx7.UUCP>, ward@eplrx7.UUCP (Rick Ward) writes:
> 
> [ article about difficulties with reading records in C on VMS ]
> 
> On a related note, why couldn't Digital have made it easier to call system
> functions from C.  This difficulty alone makes C unusable on VMS, at least 
> in my opinion :(.  The problem is that you have to allocate and build 
> structures describing each variable you want to pass to a system routine.
> YUCK!

It doesn't seem like such a difficult thing to create an include file with most
of the structures used by system services that you would need.  Granted, this is
more than you would need to do with Unix, but it would save the need for re-
creating this information each time you needed it.

-- Ed

leo@philmds.UUCP (Leo de Wit) (08/23/88)

In article <966@nmtsun.nmt.edu> warner@hydrovax.nmt.edu (M. Warner Losh) writes:
|It has been my experience you only need to build descriptors whenever you
|are playing with strings.  Everything else is simply a matter of maybe
|putting a & in front of what you want to call.  You can make your life
|a lot easier if you use the following function (DEC, why didn't you provide
|this?):
|
|#include <stdio.h>
|#include <descrip.h>
|
|struct dsc$descriptor * make_descriptor(st)
|char *st;
|{
   [body discarded]...
|}

Don't know which version of VMS C you're using, but this is from
descrip.h on our system, the last 5 lines (maybe you overlooked them):

/*
 *	A simple macro to construct a string descriptor:
 */
#define $DESCRIPTOR(name,string)	struct dsc$descriptor_s name = { sizeof(string)-1, DSC$K_DTYPE_T, DSC$K_CLASS_S, string }

Note that this deals with removal of the declared object as well (no need for
free). The initialization of an auto struct may not be portable, but then: who
cares; neither is the stupid use of $ in names.
So you can throw your function overboard now 8-).
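
One caveat worth sketching: the macro relies on sizeof, which only gives the string length for a literal or a char array, not for a char * variable. The example below uses a portable stand-in struct; str_desc, STR_DESC, desc_from_pointer and the two constants are hypothetical names invented here, not the contents of descrip.h.

```c
#include <string.h>

/* Stand-in for struct dsc$descriptor_s; field order follows the
   initializer in the macro quoted above.  All names are hypothetical. */
struct str_desc {
    unsigned short length;
    unsigned char  dtype;
    unsigned char  dclass;
    const char    *pointer;
};

#define DTYPE_TEXT   14   /* placeholder, standing in for DSC$K_DTYPE_T */
#define CLASS_STATIC 1    /* placeholder, standing in for DSC$K_CLASS_S */

/* Same shape as $DESCRIPTOR: sizeof counts the terminating NUL, hence
   the -1.  Only meaningful when `string` is a literal or a char array. */
#define STR_DESC(name, string) \
    struct str_desc name = { (unsigned short)(sizeof(string) - 1), \
                             DTYPE_TEXT, CLASS_STATIC, string }

/* For a char * whose length is known only at run time, sizeof would
   yield the size of the pointer, so strlen() is needed instead. */
struct str_desc desc_from_pointer(const char *s)
{
    struct str_desc d;

    d.length  = (unsigned short) strlen(s);
    d.dtype   = DTYPE_TEXT;
    d.dclass  = CLASS_STATIC;
    d.pointer = s;
    return d;
}
```

So the macro covers the common case of a literal, while a run-time string still needs a strlen()-based helper along the lines of the function it was meant to replace.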

                                          Leo.

terry@wsccs.UUCP (Every system needs one) (08/31/88)

In article <1609@edison.GE.COM>, rja@edison.GE.COM (rja) writes:
> I'm not aware of any solution to the problem of VMS file types.  The
> problem is precisely that VMS is so record-oriented.  Even nominal 
> text files don't work like UNIX.  We find that we have to use a loop of
> successive calls to read() to fill (for example) a 512 byte buffer
> because it gives only 1 record at a time even though you asked for 512
> bytes. :-(
>   UNIX and even MS-DOS will let you read 512 bytes in a chunk so it's VMS
> that is brain-damaged in this case.

Generally, I am on the UNIX side of the VMS/UNIX See-Digital-Sue-The-
Government-When-It-Mentions-UNIX-But-Still-Lose-The-Contract debate.

*BUT* it is the programmer, not VMS in this case.  Try one of:

	char *file = "myfile";

			/---- avoid the extra versions now!
			|
			v
	fopen( file, "r+");
	fopen( file, "r+", "rfm=var", "rat=cr");
	fopen( file, "r+", "rfm=fix", "mrs=512");

	If you must use open(), then at least have the decent-C to
read with more than one character in your buffer and write the same
way.  Files written with either the 2nd or 3rd above can't use putc()
at the low level, or they will (of course!) write out single records.
A record is sort of defined (gentleman's agreement?!?) as the result
of a single write operation.  The function fprintf() (for example)
uses putc() at the low level and so is broken.  If you can stand the
link warnings, write your own buffered putc().  The fprintf() can
be fixed by using your own with sprintf() and puts() or write()...
again, you will have to ignore the link errors, as well as the new
compile-time warnings about variadic functions.
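
A minimal sketch of the "format first, then write once" idea above. It swaps in vsnprintf() and fwrite() for the sprintf() and puts()/write() pair mentioned in the post; record_printf and RECORD_MAX are hypothetical names, and whether a single fwrite() turns into a single record on a given VMS C library is an assumption, not something verified here.

```c
#include <stdarg.h>
#include <stdio.h>

#define RECORD_MAX 512   /* assumed upper bound on one formatted record */

/* record_printf() is a hypothetical replacement for fprintf(): it
   formats into a local buffer and hands the result to the stream in a
   single fwrite() call instead of one putc() per character.
   Returns the number of bytes written, or -1 on error or overflow. */
int record_printf(FILE *fp, const char *fmt, ...)
{
    char    buf[RECORD_MAX];
    va_list ap;
    int     len;

    va_start(ap, fmt);
    len = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (len < 0 || (size_t) len >= sizeof buf)
        return -1;                       /* formatting error or too long */
    if (fwrite(buf, 1, (size_t) len, fp) != (size_t) len)
        return -1;                       /* short write */
    return len;
}
```

Giving the function its own name, rather than redefining fprintf(), sidesteps the link-time clashes the post warns about.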

The one real gripe I have is with record oriented files and implied
carriage control.  When reading from one of these things, VMS "fakes-up"
the record delimiter at the end.  After it has done that, ftell() (which
gives the record # even though you aren't supposed to assume that)
returns the fseek() target of the _beginning_of_the_current_record_,
even though the next character read will come from the next record.  I
am of the opinion that it should instead give the seek offset for the
beginning of the next record.  A fix that will survive them fixing it
(a real bummer if you're linked shared) follows:


#define ftell myftell	/* depends on not being a macro*/
long myftell();		/* declared up front: it returns long, like ftell()*/


....
....
....

/* DANGER! don't use ftell() after this in your code!*/
#undef ftell

long myftell( fp)	/* defined after the #undef, so the ftell() below is the real one*/
FILE *fp;
{
	int c;

	c = getc( fp);		/* pull in the first character of the next record...*/
	if (c != EOF)
		ungetc( c, fp);	/* ...and push it straight back*/
	return( ftell( fp));	/* uses the real ftell()*/
}

If ftell() is a macro on your system, happy search/replace!


| Terry Lambert           UUCP: ...{ decvax, ihnp4 } ...utah-cs!century!terry |
| @ Century Software        OR: ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my companies, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
|                   'I have an eight user poetic liscence' - me               |

terry@wsccs.UUCP (Every system needs one) (08/31/88)

In article <644@eplrx7.UUCP>, ward@eplrx7.UUCP (Rick Ward) writes:
> On a related note, why couldn't Digital have made it easier to call system
> functions from C.  This difficulty alone makes C unusable on VMS, at least 
> in my opinion :(.  The problem is that you have to allocate and build 
> structures describing each variable you want to pass to a system routine.
> YUCK!

Not if your C has #include or #define... it does, doesn't it?

#include <descrip.h>

...
...
...

$DESCRIPTOR( foo, "whatever");

makes a descriptor foo.

As far as modding a descriptor struct, once built, the most common
problem is that you need a sizeof() op, and that's a compile-time
thing and you can't pass it to your function to fill the descriptor
without being kludgy.  Observe:

#define filldesc( x, y) magicfilldesc( x, y, sizeof(y))

	filldesc( desc, "hello");


magicfilldesc( desc, object, size)	/* VMS enjoys long names*/
{
...
...
...
}

Admittedly, it makes for some verbose assembly, but what can you expect
from a converted optimising PL/1 compiler anyway?


| Terry Lambert           UUCP: ...{ decvax, ihnp4 } ...utah-cs!century!terry |
| @ Century Software        OR: ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my companies, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
|                   'I have an eight user poetic liscence' - me               |

moore@utkcs2.cs.utk.edu (Keith Moore) (09/15/88)

In article <1609@edison.GE.COM>, rja@edison.GE.COM (rja) writes:
> I'm not aware of any solution to the problem of VMS file types.  The
> problem is precisely that VMS is so record-oriented.  Even nominal 
> text files don't work like UNIX.  We find that we have to use a loop of
> successive calls to read() to fill (for example) a 512 byte buffer
> because it gives only 1 record at a time even though you asked for 512
> bytes. :-(
>   UNIX and even MS-DOS will let you read 512 bytes in a chunk so it's VMS
> that is brain-damaged in this case.

To be fair, Unix does read a record at a time when it is 
appropriate to do so (e.g. from a 1/2 inch tape drive, but NOT
from a 1/4 inch tape drive, where records are always 512 bytes).  
Certainly record-at-a-time behavior is often appropriate for VMS 
when using non-text files (there are no carriage control attributes).  
Record-at-a-time behavior is generally inappropriate for text files.
The C library should treat text files as streams by default,
flush a record when it sees a newline, and read as many bytes as you ask for.  

The VMS C library needs a convenient way to change its default behavior.
This way you could say 
"treat the file as a stream" 
	(record boundaries don't correspond with read boundaries), or
"treat the file like a 1/4 inch tape" 
	(read in as many records as will fit in the buffer), or 
"treat the file like a 1/2 inch tape"
	(read in only one record at a time).
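
The three behaviours can be pinned down with a toy model. This is only an in-memory illustration with hypothetical names (read_mode, record_file, model_read); it is not how the VMS C library is implemented. Dropping the unread tail of a record that is too big for the buffer in the two tape-like modes is an assumption of the model.

```c
#include <stddef.h>
#include <string.h>

enum read_mode {
    MODE_STREAM,         /* record boundaries ignored             */
    MODE_WHOLE_RECORDS,  /* "1/4 inch tape": as many as will fit  */
    MODE_ONE_RECORD      /* "1/2 inch tape": one record per read  */
};

struct record_file {
    const char **records;  /* each record is a NUL-terminated string */
    size_t       count;    /* number of records                      */
    size_t       next;     /* index of the next record to read       */
    size_t       offset;   /* bytes already consumed from it         */
};

/* Copy up to `want` bytes into buf according to `mode`; returns the
   number of bytes copied (0 once all records are consumed). */
size_t model_read(struct record_file *f, enum read_mode mode,
                  char *buf, size_t want)
{
    size_t got = 0;

    while (f->next < f->count && got < want) {
        const char *rec  = f->records[f->next] + f->offset;
        size_t      len  = strlen(rec);
        size_t      room = want - got;

        if (mode == MODE_WHOLE_RECORDS && len > room && got > 0)
            break;                     /* next record would not fit whole */

        if (len > room) {              /* only part of the record fits */
            memcpy(buf + got, rec, room);
            got += room;
            if (mode == MODE_STREAM) {
                f->offset += room;     /* the rest waits for the next read */
            } else {
                f->offset = 0;         /* assumption: the excess is dropped */
                f->next++;
            }
            break;
        }

        memcpy(buf + got, rec, len);   /* the whole record fits */
        got += len;
        f->offset = 0;
        f->next++;
        if (mode == MODE_ONE_RECORD)
            break;
    }
    return got;
}
```

With records "ab\n", "cde\n" and "f\n", a request for 8 bytes yields 8 bytes in stream mode, 7 in whole-record mode and 3 in one-record mode.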

Some of these you can do with the VMS C library, with some convoluted
and obscure arguments to open, creat, etc.  At any rate, the default 
behavior of the library i/o routines often doesn't do what you want or expect.

In article <623@wsccs.UUCP> terry@wsccs.UUCP (Every system needs one) writes:
[...]
>	fopen( file, "r+");
>	fopen( file, "r+", "rfm=var", "rat=cr");
>	fopen( file, "r+", "rfm=fix", "mrs=512");
>
>	If you must use open(), then at least have the decent-C to
>read with more than one character in your buffer and write the same
>way.  Files written with either the 2nd or 3rd above can't use putc()
>at the low level, or they will (of course!) write out single records.

For the 2nd form of fopen (), the behavior is brain-damaged.  For the
3rd form, it is probably correct, since you aren't creating a text
file.  Record-oriented files which aren't text files should look 
like magnetic tapes.

>A record is sort of defined (gentleman's agreement?!?) as the result
>of a single write operation.  
This is consistent with Unix behavior for record-oriented files.
For VMS, you need to be able to choose which behavior you want.

>The function fprintf() (for example)
>uses putc() at the low level and so is broken.  If you can stand the
>link warnings, write your own buffered putc().  The fprintf() can
>be fixed by using your own with sprintf() and puts() or write()...
>again, you will have to ignore the link errors, as well as the new
>compile-time warnings about variadic functions.

There's no reason why fprintf can't work right for any kind of file.
Since it's a library function, it could be taught to know about the
VMS file system.  For instance:  if you are writing to a text file,
fprintf should flush the file buffer (thus writing out a new record) 
when it sees a newline.  If the file is a fixed-length record text
file (!), pad the rest of the record with blanks.

Actually, this could all be done at the putc/flushbuf level and fprintf
would not have to be concerned with it.
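
A rough sketch of what "doing it at the putc level" might look like: characters accumulate in a buffer, a newline ends the record, and fixed-length records are padded with blanks. The names rec_writer, rec_flush, rec_putc and REC_MAX are hypothetical, fixed_len is assumed to be no larger than REC_MAX, and the single fwrite() per record stands in for whatever record-level write the real library would issue.

```c
#include <stdio.h>
#include <string.h>

#define REC_MAX 512   /* assumed maximum record size */

struct rec_writer {
    FILE  *out;
    size_t fixed_len;       /* 0 = variable length, else pad to this size */
    size_t used;            /* bytes collected for the current record     */
    char   buf[REC_MAX];
};

/* Write out the current record, blank-padded if records are fixed length.
   Returns 0 on success, -1 on a short write. */
int rec_flush(struct rec_writer *w)
{
    size_t len;

    if (w->used < w->fixed_len) {
        memset(w->buf + w->used, ' ', w->fixed_len - w->used);
        w->used = w->fixed_len;
    }
    len = w->used;
    w->used = 0;
    return fwrite(w->buf, 1, len, w->out) == len ? 0 : -1;
}

/* putc() look-alike: a newline ends the record and is not stored in it,
   mirroring implied carriage control.  Returns c, or EOF on error. */
int rec_putc(int c, struct rec_writer *w)
{
    size_t limit = w->fixed_len ? w->fixed_len : REC_MAX;

    if (c == '\n')
        return rec_flush(w) == 0 ? c : EOF;
    if (w->used == limit && rec_flush(w) != 0)   /* record full: start a new one */
        return EOF;
    w->buf[w->used++] = (char) c;
    return c;
}
```

Because the record logic sits below the formatting layer, an fprintf() built on such a putc() would need no knowledge of the file's record format.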

I am of the opinion that the VMS C library is not very well designed.
The defaults for many of the i/o functions are poorly chosen, and it
is often difficult to work around them without circumventing C-style i/o
altogether in favor of RMS or system services.  More recent versions
of the library have attempted to address the design deficiencies, but
backward compatibility considerations will keep Digital from ever fixing
the library to work well.

(Does a free or public-domain implementation of stdio exist?)
-- 
Keith Moore
UT Computer Science Dept.	Internet/CSnet: moore@utkcs2.cs.utk.edu
107 Ayres Hall, UT Campus	BITNET: moore@utkcs1
Knoxville Tennessee 37996-1301	Telephone: +1 615 974 0822

marc@ima.ima.isc.com (Marc Evans) (09/15/88)

I found that avoiding the RMS facilities is the most practical way of dealing
with the VMS system. It took a little work up front, but the end product was
quite a bit easier to debug/use/code.

What I did was write the CLIB functions that I needed using the QIO facilities.
This bypasses the RMS problem entirely, and in fact has a reasonable mapping
to the read/write/lseek/open/close counterparts. The result was a scheme of
/* DO WHAT I SAY, DAMN IT! */.

The unfortunate part is that I am an independent consultant, and the code was
left behind at a client site. I don't seem to have a copy hanging around here
anyplace, but if I discover it, I will attempt to post it.

===============================================================================
Marc Evans | decvax<--\    /-->marc<--\               | That's not a bug...It's
Synergytics| harvard<--\  /            \  /--->norton | a design feature... 8-)
Pelham, NH | necntc<---->ima<---->symetrx<---->dupont | =======================
===============================================================================

scs@athena.mit.edu (Steve Summit) (09/16/88)

In article <2627@ima.ima.isc.com> marc@ima.UUCP (Marc Evans) writes:
>I found that avoiding the RMS facilities is the most practical way of dealing
>with the VMS system.  What I did was write the CLIB functions that I needed
>using the QIO facilities.  This bypasses the RMS problem entirely...

People from a Unix background who try to use VMS have my
sympathies; and it is true that the low-level QIO interface,
though unwieldy, is functionally reminiscent of good ol' Unix
read/write/ioctl, but using it instead of RMS for disk file I/O
is generally a bad idea.  When DEC changes things, they change
RMS, and programs which (properly) use RMS don't notice the
change.  Programs which have gone the long, lonely road of wheel
reinvention (they essentially end up reimplementing parts of RMS,
and there's an awful lot of code there to reimplement) find the
road very lonely indeed when operating system upgrades which were
supposed to be transparent (new filesystems, network file access,
etc.) render the renegade programs non-functional.

                                            Steve Summit
                                            scs@adam.pika.mit.edu