bagpiper@mcosm.uucp (06/02/90)
I've been trying to use VMS GNU C v1.36 (and I am using the VMS C RTL libs, so this probably isn't GNU-specific). I had a piece of code under MS-DOS which reads a file in binary mode (fopen("filename","rb")). This piece of code does not work under VMS because all of the CRs get cooked out of the file. The file's record attribute is "Carriage Return Carriage Control". I am using fgetc to read data out of the file. Any hints, clues, etc.? I am fairly new to VMS, so I don't know all the insides of RMS. Is there any way to do this using portable C? How about some RMS call? (Oh, I am using VMS 5.1-1... that shouldn't make much of a difference, should it?)

Thanks for any help,
Michael
-------------------------------------------------------------------------------
+ Michael Hunter {backbone}!hacgate!trwind!mcosm!bagpiper +
+ BIX:bagpiper +
+ NOTHING like a spacecraft with a bad attitude!!! +
-------------------------------------------------------------------------------
a204@mindlink.UUCP (Alexander Stockdale) (06/11/90)
Regarding reads from RMS files, I've played with this at various times. It always works better if the file characteristic is Stream-LF (the UNIX format). There are utilities which will convert files to this format for you (check some of the DECUS tapes). If you don't want to convert, you'll probably have to modify the code so it knows about the implicit carriage return information in the file. How much work that is depends on what the code is supposed to do. For example, if you're trying to strip CR/LF, no problem. If, on the other hand, you're trying to convert them to something else, this can be a real hassle.
--
------------------------------------------------------------------------
Alexander Stockdale  | I'm not getting older -- I'm getting bitter.
Vancouver, BC, Canada| - me (as far as I know)
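A minimal sketch of the "strip CR/LF" case mentioned above, in portable C. The helper name strip_eol is made up for illustration; it only assumes the line has already been read into a NUL-terminated buffer (for example with fgets), and it says nothing about how the VMS RTL delivers record boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove any trailing CR and LF characters from a line, in place.
   Returns the new length. strip_eol is a hypothetical helper name. */
size_t strip_eol(char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r')) {
        line[--len] = '\0';
    }
    return len;
}
```

Calling strip_eol on each buffer returned by fgets gives the same text whether the line terminator arrived as LF, CR/LF, or not at all.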
pauls@lion.inmos.co.uk (Paul Sidnell) (06/13/90)
>which reads a file in binary mode (fopen("filename","rb")). This piece of code
>does not work under VMS because all of the cr's get cooked out of the file.
>The file's record attribute is "Carriage Return Carriage Control". I am using
>fgetc to read data out of the file. Any hints, clues, etc. I am fairly

So you love the VAX too :-) If your program had survived a little longer, its dying comment might have been "I CAN'T FIND EOF EITHER"!

My understanding (arrived at with much pain and frustration) is that if a file already exists, the mode that you open the file in is ignored completely. If you delete the file and THEN fopen("filename","rb"); then a 'STREAM-LF' file will be created and everything will be happy again.

Similarly, you will find many 'departures' from ANSI C using ftell and fseek on non-'STREAM-LF' files. Generally, the I/O is at its sanest ONLY on these types of files.

Please excuse any froth around my mouth while I discuss this subject.
-------------------------------------------------------------------------------
| Disclaimer: IT'S ALL MY FAULT |
-------------------------------------------------------------------------------
daniels@hanoi.enet.dec.com (Bradford R. Daniels) (06/15/90)
The VAX C RTL tries to "do the right thing" with record files. If the file has a carriage control attribute (e.g. carriage return carriage control in this case), it will (by default) append a newline to each record as it is read in. Your fopen() statement, however, specifies "rb" as the file open mode, or "read only in binary mode". This specification that the file is binary overrides C's default carriage control interpretation, and you get the data unadorned, as it were, with newlines. This is fine if you're truly reading a binary file which just happens to be in variable record format, but in the case of the file in question, the record format is significant, and the file is not, in fact, simply binary data.

Hope this helps.

- Brad
-----------------------------------------------------------------
Brad Daniels             | Digital Equipment Corp. almost
DEC Software Devo        | definitely wouldn't approve of
"VAX C RTL Whipping Boy" | anything I say here...
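One way to see the difference described above is to count the newlines fgetc delivers under each open mode. This is only a sketch in portable C; count_newlines is a made-up name. Going by the description in this post, on a carriage-return-carriage-control file under VMS the "r" count would be expected to match the number of records while the "rb" count would not; on Unix the two counts are the same.

```c
#include <assert.h>
#include <stdio.h>

/* Count the newline characters fgetc() delivers when the file is opened
   with the given mode ("r" or "rb"). Returns -1 if the open fails.
   count_newlines is a hypothetical helper name. */
long count_newlines(const char *path, const char *mode)
{
    FILE *fp;
    long count = 0;
    int ch;

    assert(path != NULL && mode != NULL);
    fp = fopen(path, mode);
    if (fp == NULL) {
        return -1;
    }
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

Comparing count_newlines(name, "r") with count_newlines(name, "rb") on the problem file is a quick check of whether the binary flag is what removes the line structure.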
daniels@hanoi.enet.dec.com (Bradford R. Daniels) (06/15/90)
In article <2081@mindlink.UUCP>, a204@mindlink.UUCP (Alexander Stockdale) writes:
> Regarding reads from RMS files, I've played with this at various times.
> It always works better if the file characteristic is Stream-LF (the
> UNIX format). There are utilities which will convert files to this format for
> you (check some of the DECUS tapes). If you don't want to convert, you'll
> probably have to modify the code so it knows about the implicit carriage return
> information in the file. This can be a real hassle, depending on what the code
> is supposed to do. For example, if
> you're trying to strip CR/LF, no problem. If, on the other hand, you're trying
> to convert them to something else, this can be a real hassle.

The reason stream-LF works so well is that we access stream-LF files in block mode by default (i.e., we have RMS do direct QIOs to the file). Since we pay no attention to the record format of the file in block mode, we can provide a much higher degree of Unix emulation, particularly as regards file positioning. Unfortunately, though, block mode I/O is much slower than record mode I/O because it cannot use such beneficial features as multibuffering, read-ahead, and write-behind.

The important thing to remember about the VAX C/VMS RTL is that it always tries to make the data it reads in look like a stream, even if there is record structuring information in the file. That means that things like record boundaries and carriage control have an effect on the data you receive. If you accept the default behavior, you will usually get the right behavior. If you randomly throw RMS options at the open statement, specify binary data when you don't mean it, or give a file carriage control attributes it shouldn't have, however, you won't get the right behavior. I'm not saying the carriage control attributes on a file are always control, but in general, these are good rules of thumb:

1. If the file contains text, it should either be stream-LF or some form of variable record length file with an appropriate carriage control attribute.

2. If the file contains binary data, it should either be stream-LF format (possibly with carriage return carriage control), or some other format with no carriage control. The latter is preferable, since a legal stream-LF file is supposed to end with a newline, and real binary files rarely do. Of course, if you never access the file in record mode, it doesn't really matter.

3. Don't use "b" mode in fopen unless the file really is just binary data. You will usually lose any line formatting information there may be in the file (e.g. newlines if the file uses carriage return carriage control).

I had some others in mind when I started, but I can't remember them right now... I'm sure a future question will bring them to mind, though...

- Brad
-----------------------------------------------------------------
Brad Daniels             | Digital Equipment Corp. almost
DEC Software Devo        | definitely wouldn't approve of
"VAX C RTL Whipping Boy" | anything I say here...
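Rules 1 and 2 could be applied at file creation time roughly as follows. This is a sketch under stated assumptions, not something taken from the posts: the extra string arguments to fopen ("rfm=var", "rat=cr", "rfm=fix", "mrs=512", "rat=none") are the VMS C RTL file attribute keywords as recalled from DEC's documentation and should be checked against the RTL manual for the version in use. The function names and the 512-byte record size are made up. Off VMS the code falls back to plain ANSI fopen.

```c
#include <assert.h>
#include <stdio.h>

/* Rule 1: text goes in a variable-length record file with carriage
   return carriage control (VMS only; plain fopen elsewhere).
   The "rfm="/"rat=" keywords are an assumption to verify locally. */
FILE *create_text_file(const char *path)
{
    assert(path != NULL);
#if defined(__VMS) || defined(VMS)
    return fopen(path, "w", "rfm=var", "rat=cr");
#else
    return fopen(path, "w");
#endif
}

/* Rule 2: binary data goes in a format with no carriage control.
   Fixed 512-byte records are a hypothetical choice for illustration. */
FILE *create_binary_file(const char *path)
{
    assert(path != NULL);
#if defined(__VMS) || defined(VMS)
    return fopen(path, "wb", "rfm=fix", "mrs=512", "rat=none");
#else
    return fopen(path, "wb");
#endif
}
```

Keeping the VMS-specific arguments behind the preprocessor test leaves the rest of the program portable, which matters for the kind of MS-DOS code the thread started with.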
daniels@hanoi.enet.dec.com (Bradford R. Daniels) (06/15/90)
In article <7459@ganymede.inmos.co.uk>, pauls@lion.inmos.co.uk (Paul Sidnell) writes:
> My understanding (arrived at with much pain and frustration) is that if a file
> already exists, the mode that you open the file in is ignored completely.
> If you delete the file and THEN fopen("filename","rb"); then a 'STREAM-LF'
> file will be created and everything will be happy again.

Huh? "rb" opens the file read only. Of course the file will still have whatever attributes it had before it was opened.

VAX C tries to balance Unix compatibility and VMS integration wherever possible. On Unix, opening an existing file, whether for input or output, changes nothing about the file except the date and possibly its contents (if you're truncating the file). Other attributes (ownership, permissions, etc.) remain unchanged. The first implementors of the VAX C RTL decided (quite reasonably, I feel) to extend that concept to the much larger set of attributes files may have under VMS. Thus, if you supersede an existing file, it should (were it not for some bugs in certain code paths in the RTL) have the same protection, format, carriage control, and whatever else as the existing version of the file. If you did not explicitly specify any file format or carriage control options, the C RTL assumes you don't want to change anything.

> Similarly you will find many 'departures' from ANSI C using ftell and fseek on
> non 'STREAM-LF' files.

Yeah, it kinda bugs me, too... It actually would have been possible to get better emulation for files smaller than 4MB if a different encoding had been chosen from day 1, but the algorithm would have been completely broken for larger files, and the values returned by ftell() would not have been simple integer offsets from the beginning of the file. As it is, the current algorithm is the best you can do given 32-bit integers and RMS' requirements.

> Generally, the I/O is at its sanest ONLY on these types of files.

Actually, it's pretty sane on most file types if you know what it's supposed to do. The problem is a lack of documentation as to what it's supposed to do...

> Please excuse any froth around my mouth while I discuss this subject.

We've been looking over ways to improve the VAX C I/O system in a major way, perhaps even rewriting the whole thing. Constructive frothing is always appreciated.

- Brad
-----------------------------------------------------------------
Brad Daniels             | Digital Equipment Corp. almost
DEC Software Devo        | definitely wouldn't approve of
"VAX C RTL Whipping Boy" | anything I say here...
martin@mwtech.UUCP (Martin Weitzel) (06/26/90)
In article <1951@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
[about 32-bit values being too small to represent file position offsets]
>A strategy that DEC could have used, but did not, would have been to
>make ftell() return a magic cookie. The magic cookie would exactly
>encode record number and offset for small files. For larger files, the
>magic cookie would be the index of a seek value in an internal table.
>This table would grow as needed.

The "magic cookie" approach is exactly what ANSI C supports through the (new) "fgetpos/fsetpos" functions. IMHO it could not be generally applied to "ftell/fseek", because it is existing practice to do some arithmetic with the return value (e.g. to advance some bytes in either direction).

This should also be taken as a guideline for *when* to use *which* of the two similar function pairs for positioning in files.

1) Use "fgetpos/fsetpos" regardless of the file type (binary or text), if you only want to "mark" some places for later "restore".

2) Use "ftell/fseek" only if you are sure that the following restrictions will be met:

   a) For a "text" file, you only fseek to the start (offset 0 with SEEK_SET), the current position (offset 0 with SEEK_CUR, somewhat pointless but allowed), or the end (offset 0 with SEEK_END). Otherwise you use *exactly* the value that is returned from "ftell" (no arithmetic!) as the offset with SEEK_SET.

   b) For a "binary" file you may "fseek" to any position, even one you calculate by doing some arithmetic with the return value of "ftell", but be sure that the maximum file size will fit into a long on the target hardware.

Note that 2a) opens the door for the "magic cookie" approach, but it can only be applied to "text" files. Obviously, ANSI C fails to specify a *portable* method to seek around within binary files with sizes that exceed the range of values for a long.

IMHO, this is not a serious drawback (though I'm sure that there are some readers on the net who will have exactly this requirement :-)). Consider that many operating systems currently do not support files of this size, that even if they do, files of this size are not very frequently found, and that nobody forbids a C implementation to support additional library functions that address the problem.
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
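The two guidelines above can be sketched in portable ANSI C. The function names peek_line and skip_bytes are made up for illustration. The first only marks and restores a position, so it uses fgetpos/fsetpos and is meant to work on text or binary streams; the second does arithmetic on the ftell value, so by guideline 2b it should only be used on streams opened in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Guideline 1: "mark" and "restore" with fgetpos/fsetpos.
   Reads the next line into buf without consuming it.
   Returns 0 on success, -1 on failure or end of file. */
int peek_line(FILE *fp, char *buf, int size)
{
    fpos_t mark;
    int got_line;

    assert(fp != NULL && buf != NULL && size > 0);
    if (fgetpos(fp, &mark) != 0) {
        return -1;
    }
    got_line = (fgets(buf, size, fp) != NULL);
    if (fsetpos(fp, &mark) != 0) {
        return -1;
    }
    return got_line ? 0 : -1;
}

/* Guideline 2b: arithmetic on the ftell() value, binary streams only.
   Moves n bytes forward (or backward if n is negative).
   Returns 0 on success, -1 on failure. */
int skip_bytes(FILE *fp, long n)
{
    long here;

    assert(fp != NULL);
    here = ftell(fp);
    if (here == -1L) {
        return -1;
    }
    return fseek(fp, here + n, SEEK_SET) == 0 ? 0 : -1;
}
```

Because peek_line never inspects the fpos_t value, it leaves the implementation free to store a "magic cookie" there, which is the point made about fgetpos/fsetpos above.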
peter@ficc.ferranti.com (Peter da Silva) (06/27/90)
In article <12865@shlump.nac.dec.com> daniels@hanoi.enet.dec.com (Bradford R. Daniels) writes:
> It's provided as an alternative to the normal
> mechanism, has slightly different semantics, and is a bit more
> efficient than the alternative.

The existing syntax, "#include <stdio.h>" was already perfectly well suited for those semantics.
--
Peter da Silva. `-_-' +1 713 274 5180. <peter@ficc.ferranti.com>
djh@osc.edu (David Heisterberg) (06/28/90)
In article <.X94JD6@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes:
> The existing syntax, "#include <stdio.h>" was already perfectly well
> suited for those semantics.

Then use it. It works just fine. I don't understand what you're complaining about. VMS often has include-like files in text libraries, and the VAX C extension to #include takes advantage of that. If you don't want to use it, then don't. Who's putting a gun to your head?
--
David J. Heisterberg           djh@osc.edu        And you all know
The Ohio Supercomputer Center  djh@ohstpy.bitnet  security Is mortals'
Columbus, Ohio                 ohstpy::djh        chiefest enemy.
peter@ficc.ferranti.com (Peter da Silva) (06/28/90)
The subject is: "#include stdio" in VAX C.

I claim:
> The existing syntax, "#include <stdio.h>" was already perfectly well
> suited for those semantics.

In article <686@illini.osc.edu> djh@osc.edu (David Heisterberg) writes:
> VMS often has include-like files in text libraries, and the VAX C
> extension to #include takes advantage of that.

They didn't need to create a new syntax for it. What I'm saying is that they should have made "#include <stdio.h>" extract those files from those text libraries, thus making the new syntax redundant.

> If you don't want to use it,
> then don't. Who's putting a gun to your head?

The people writing the code that I had to support. I don't know about you, but I don't have the advantage of programming in a vacuum.
--
Peter da Silva. `-_-' +1 713 274 5180. <peter@ficc.ferranti.com>
henry@zoo.toronto.edu (Henry Spencer) (06/29/90)
In article <686@illini.osc.edu> djh@osc.edu (David Heisterberg) writes:
>> The existing syntax, "#include <stdio.h>" was already perfectly well
>> suited for those semantics.
>
>Then use it. It works just fine. I don't understand what you're complaining
>about. VMS often has include-like files in text libraries, and the VAX C
>extension to #include takes advantage of that. If you don't want to use it,
>then don't. Who's putting a gun to your head?

Does the word "portability" mean anything to you? How about "many wasted man-months trying to port software written by clever people who think it's cute to use VMS-specific language features"? We don't want *you* to use it either, because someday we may have to port or maintain your code.
--
"Either NFS must be scrapped or NFS | Henry Spencer at U of Toronto Zoology
must be changed." -John Osterhout   | henry@zoo.toronto.edu utzoo!henry
karl@haddock.ima.isc.com (Karl Heuer) (06/29/90)
In article <686@illini.osc.edu> djh@osc.edu (David Heisterberg) writes:
>In article <.X94JD6@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes:
>> The existing syntax, "#include <stdio.h>" was already perfectly well
>> suited for those semantics.
>
>Then use it. It works just fine. I don't understand what you're complaining
>about. VMS often has include-like files in text libraries, and the VAX C
>extension to #include takes advantage of that.

The point is, there was no need to invent a syntax extension. DEC should have simply asserted that `#include <stdio.h>' searches a text library in addition to a directory. That way, we all get the benefit of the speed advantage, and DEC wouldn't now be in the embarrassing position of having a feature that is in direct conflict with the ANSI C `#include MACRONAME' feature.

(It was a botch to use angle brackets as well as quotes in the first place, but it's way too late to correct that.)

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
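For readers who have not met the ANSI C `#include MACRONAME' form mentioned above, here is a minimal sketch. After macro expansion the directive must match one of the two standard forms, so a bare identifier following #include is treated as a macro name, which is why it collides with the VAX C `#include stdio' spelling. The macro name STDIO_HEADER and the function format_answer are made up for illustration.

```c
/* STDIO_HEADER is a hypothetical macro name; after expansion the
   directive below reads exactly like #include <stdio.h>. */
#define STDIO_HEADER <stdio.h>
#include STDIO_HEADER
#include <assert.h>

/* Uses sprintf() to show that the declarations from <stdio.h> really
   were pulled in through the macro form of #include.
   Returns the number of characters written to buf. */
int format_answer(char *buf, int value)
{
    assert(buf != NULL);
    return sprintf(buf, "value=%d", value);
}
```

A build system can use the same form to pick a header per platform by defining the macro on the compiler command line, with no source changes.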