STEINBERGER@KL.SRI.COM (Richard Steinberger) (09/13/87)
I have 2 questions concerning VAX C. Thanks in advance to anyone who can help.

Question 1: I wrote a short main routine (see below) to test a function. One of the inputs is an integer number, and the next input is a filename. The problem is that the scanf that reads the number apparently leaves a LF character in a buffer that is then read by the code that is expecting the filename. Because it's a LF, I never get a chance to enter a filename (see code). My "kludgy" solution was to put the line "i = getchar()" after the scanf to remove the LF character; when this is done the following lines that get a filename work fine.

Am I missing something fundamental? I've tried using gets after the scanf and the result is the same, i.e. unless the "i = getchar()" is there to remove the LF, gets doesn't "work" either. Why does scanf leave a LF character, or am I misinterpreting what's going on? Are there solutions other than using a getchar() call after a scanf that precedes a gets or getchar call? The relevant code section is below:

    printf("\nEnter a string size: ");
    scanf("%d",&n_chars);
    /* clear NEWLINE char left after scanf */
    i = getchar();      /* WHY DO I NEED THIS ? */
    printf("\nEnter a file to use: ");
    for (i=0; (input_file[i] = getchar()) != '\n'; i++);
    input_file[i] = '\0';

______________________________________________________________________________

Question 2: I am trying to read and write binary files similar to ones created by FORTRAN open statements (that produce sequential, unformatted, fixed recordsize files). Specifying "rb" in an fopen statement seems to be enough to access existing files (using an fread call). If I want to create a new binary file with fixed-length records, which, if any, of the keywords on p 4-5 of the C RTL Ref Man are appropriate? Is "rfm=fix" enough? Do I need "mrs=size" as well, and if so, is size in units of bytes, or longwords (like Fortran)?

Also, is there any reason to use the chapter 4 open and read or write functions instead of the chapter 2 functions (fopen, fread and fwrite)?

______________________________________________________________________________

Ric Steinberger
steinberger@kl.sri.com
-------
leichter@VENUS.YCC.YALE.EDU.UUCP (09/17/87)
> Question 1: I wrote a short main routine ... to test a function. One of the inputs is an integer... and the next a filename.... [T]he scanf that reads the number apparently leaves a LF character in a buffer that is then read by the code that is expecting the filename.... Why does scanf leave a LF character, or am I misinterpreting what's going on? Are there solutions other than using a getchar() call? ...
>
>     printf("\nEnter a string size: ");
>     scanf("%d",&n_chars);
>     /* clear NEWLINE char left after scanf */
>     i = getchar();      /* WHY DO I NEED THIS ? */
>     printf("\nEnter a file to use: ");
>     for (i=0; (input_file[i] = getchar()) != '\n'; i++);
>     input_file[i] = '\0';

You've been bitten by a very common Unix programming "gotcha". The problem is that you are thinking of scanf() as operating on lines - a seemingly natural way to look at things, because after all you are typing one value per line. Unfortunately, that's NOT the way scanf() is actually defined; it reads streams of bytes, and attaches no special significance to the end of an input line - as far as it is concerned, '\n' is just another whitespace character. You've provided nothing to "swallow" that whitespace character, so it remains in the buffer, ready to screw up the next input request.

There are two work-arounds. A direct, strange-looking, and not very good technique is to provide something to "swallow" the newline. Since a SPACE in the format specification matches zero or more whitespace characters in the input, you need merely change your scanf() call to:

    scanf("%d ",&n_chars);

(Actually, scanf("%d\n",&n_chars); is equivalent and looks a bit better.) The big problem with this technique is that it will keep reading input until it sees a non-whitespace character (which it will push back with ungetc()). This is fine for reading files, terrible for interactive input.

So, scratch the direct approach. The RIGHT way to solve this problem is to forget about scanf() entirely. You want line-at-a-time parsing, which scanf doesn't give you - but you can easily build it: Use gets() or fgets() to read a line, then use sscanf() to parse it. (In fact, my advice - it's not just mine, my Unix-oriented officemate agrees - is that you'll probably be happiest if you take your manual and ink out the entries for scanf() and fscanf(). Actually, he says he wouldn't be at all unhappy if sscanf() suddenly vanished at the same time....)

> Question 2: I am trying to read and write binary files similar to ones created by FORTRAN open statements (that produce sequential, unformatted, fixed recordsize files). Specifying "rb" in an fopen statement seems to be enough to access existing files (using an fread call). If I want to create a new binary file with fixed-length records, which if any, of the keywords on p 4-5 of the C RTL Ref Man are appropriate? Is "rfm=fix" enough? Do I need "mrs=size" as well, and if so, is size in units of bytes, or longwords (like Fortran)?

Yes, you must specify "mrs=512" as well. (The size is in bytes.) You may also need to specify the carriage control attributes (as none, probably) - check a full directory listing of a FORTRAN file.

Note: You can specify only one setting per argument to open/fopen. That is:

    fopen("foo","w","rfm=fix","mrs=512");   /* Correct */
    fopen("foo","w","rfm=fix,mrs=512");     /* **WRONG** */

While the documentation says this, it's easy to misunderstand.

> Also, is there any reason to use the chapter 4 open and read or write functions instead of the chapter 2 functions (fopen, fread and fwrite)?

open/read/write may be somewhat faster; I have no idea if the difference would be noticeable. I'd probably use them for clarity - I use the Chapter 2 functions when I want their "added value". In using the Chapter 4 ones, I'm saying that I'm NOT thinking of the files as Unix-style byte streams.

-- Jerry
------
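The line-at-a-time approach recommended above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names read_line and parse_int_line are invented for the example, and it uses fgets() rather than gets() because fgets() takes the buffer size as an argument.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (at most size-1
 * characters) and strip the trailing newline, if there is one.
 * Returns 1 on success, 0 on end-of-file or error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;

    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
    return 1;
}

/* Hypothetical helper: parse a decimal integer from a line that has
 * already been read.  Returns 1 if a number was converted, 0 otherwise. */
int parse_int_line(const char *line, int *value)
{
    return sscanf(line, "%d", value) == 1;
}
```

With helpers like these, the original prompts would become a read_line() call followed by parse_int_line() for the string size, then a second read_line() straight into input_file for the filename. No stray newline is left over, because each read consumes a whole line. One limitation of the sketch: if a line is longer than the buffer, the remainder stays in the stream for the next read.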
R022DB3L@VB.CC.CMU.EDU.UUCP (09/18/87)
For Richard Steinberger's query:

> I wrote a short main routine (see below) to test a function. One
> of the inputs is an integer number, and the next input is a filename.
> The problem is that the scanf() that reads the number apparently leaves
> a LF character in a buffer that is then read by the code that is
> expecting the filename. .....
> Why does scanf leave a LF
> character, or am I misinterpreting what's going on? Are there solutions
> other than using a getchar() call after a scanf() that precedes a gets()
> or getchar() call? ....

I've never actually programmed in Vax C, but just from my general C work, I would expect this to be the case, according to the definition of SCANF. Basically, your call to scanf is causing an I/O request, to which you enter the string size and press RETURN. Thus, your input buffer contains the number followed by a newline. You're asking scanf to read a decimal input field into an integer variable. According to scanf (or at least the definition of it that I have handy), an "input field" is:

* All characters up to **(but not including)** the next whitespace character
* All characters up to the first one that cannot be converted under the current format specification (such as an 8 or 9 under octal format)
* Up to 'n' characters where 'n' is the specified field width.

"Whitespace" is defined as one of the characters blank ( ), tab (\t) or **newline (\n)**.

Therefore, when you do a scanf("%d",&variable), scanf simply parses through the input until it hits the whitespace character (the newline). The number it reads gets assigned into the variable, and according to the first rule above, the whitespace character is not parsed, and therefore stays in the input buffer for the next call to scanf (or any function accessing the standard input stream).

As far as a solution, how about just changing your scanf to:

    scanf("%d\n",&variable)

(Actually, according to scanf, if there is any whitespace in the format string, it will scan over as much whitespace as necessary in the input at that point, so I suppose technically, you could use a space or \t in place of the \n above since they're all whitespace, but the \n seems logical).

I do presume however, that even the above code would run into problems if the user were to input a string like "10 20{RETURN}", since the scanf would stop at the space, and you'd still have the rest of the string in the buffer for the next scanf. Perhaps an even better solution would be:

    scanf("%d%*[^\n]\n",&variable)

What this code will do is read an integer, followed by a string which will be ignored (due to the *), the string consisting of the series of characters in the input which aren't a newline. The [] construct is a character search set specifier - since the first character is ^, it's an inverted set, matching any characters not listed. Effectively, this will skip over any extra characters on the input up to the newline, and then the \n in the format string will read the newline out of the buffer.

Please note that I haven't actually had the time to try out these function calls, but I used very similar function calls this summer, so if they aren't quite perfect, play around a bit with the concept... :-)

-- David Bolen                   Arpanet: R022DB3L@VB.CC.CMU.EDU
   Carnegie-Mellon University    Bitnet : R022DB3L@CMCCVB
   Pittsburgh, PA 15213
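One caveat about the %*[^\n] idea: in standard C a [ directive has to match at least one character, so when the number is followed immediately by the newline, the directive fails and scanf stops before it reaches the final \n in the format. A sketch of an alternative, assuming hypothetical helpers named skip_rest_of_line and read_int_discard_line, is to discard the remainder of the line with a plain getc() loop:

```c
#include <stdio.h>

/* Hypothetical helper: discard characters up to and including the next
 * newline, or until end-of-file, whichever comes first. */
void skip_rest_of_line(FILE *fp)
{
    int c;

    while ((c = getc(fp)) != EOF && c != '\n')
        ;   /* nothing to do: just consume the character */
}

/* Read one integer from fp, then throw away whatever else is on that
 * line.  Returns 1 if a number was read, 0 otherwise. */
int read_int_discard_line(FILE *fp, int *value)
{
    int ok = (fscanf(fp, "%d", value) == 1);

    skip_rest_of_line(fp);
    return ok;
}
```

Because the loop stops right after the first newline, it does not wait for further non-whitespace input the way a trailing space or \n in the format string does, so it should behave sensibly at a terminal as well as on a file.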
STEINBERGER@KL.SRI.COM (Richard Steinberger) (04/04/88)
C vs. Fortran speed: Since C variables are automatic (i.e., dynamic) by default and Fortran variables are static, is it fair to conclude that in general, a routine in C having the same number of local variables as a "roughly identical" Fortran routine will take a bit longer because the OS must allocate (and deallocate) space for the local variables? If the C local variables are made static, does this possible performance advantage disappear?

In Fortran, it is sometimes convenient to use END=n, where n is a label number, in a READ statement to transfer control when an EOF is detected. Has anyone found a *simple* way to get the same effect in C? I had been using fopen on TT:, then a while (!feof(fptr)) {...} loop, but this still results in the loop body getting executed after the last record unless yet another feof() is put within the loop to "double" check for EOF.

Thanks for any help or suggestions.

-Ric Steinberger
steinberger@kl.sri.com
-------
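On the END= question: feof() only becomes true after a read has already run into end-of-file, which is why a while (!feof(fptr)) loop executes its body one extra time. A common C idiom, sketched here with a hypothetical count_lines function, is to make the read call itself the loop condition:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical example: count the newline-terminated lines in fp.
 * The loop condition is the read itself, so the body never runs once
 * end-of-file has been reached. */
long count_lines(FILE *fp)
{
    char buf[256];
    long lines = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        /* A line longer than the buffer arrives in several pieces;
         * count it only when the piece that ends in '\n' shows up. */
        if (strchr(buf, '\n') != NULL)
            lines++;
    }
    return lines;
}
```

The same pattern works with the other read functions: compare the return value of fscanf() against the number of items expected, or the return value of getc() against EOF. The code after the loop then plays the role of the Fortran label that END= jumps to.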
hvo@hawk.ulowell.edu (Huy D. Vo) (04/05/88)
In article <12387779883.20.STEINBERGER@KL.SRI.COM> STEINBERGER@KL.SRI.COM (Richard Steinberger) writes:
>C vs. Fortran speed: Since C variables are automatic (i.e., dynamic) by
>default and Fortran variables are static, is it fair to conclude that in
>general, a routine in C having the same number of local variables as
>a "roughly identical" Fortran routine will take a bit longer because
>the OS must allocate (and deallocate) space for the local variables?

This requires one instruction to raise the stack pointer to point beyond the C local variables.

Now my point: most of the time, C functions want values, whereas Fortran subroutines invariably expect addresses. If fetching an address is faster than fetching a value, then Fortran wins. Any microcode experts out there? How about some benchmarks?

Huy D. Vo
hvo@hawk.ulowell.edu
jayz@cullsj.UUCP (Jay Zorzy) (04/05/88)
From article <12387779883.20.STEINBERGER@KL.SRI.COM>, by STEINBERGER@KL.SRI.COM (Richard Steinberger):
> C vs. Fortran speed: Since C variables are automatic (i.e., dynamic) by
> default and Fortran variables are static, is it fair to conclude that in
> general, a routine in C having the same number of local variables as
> a "roughly identical" Fortran routine will take a bit longer because
> the OS must allocate (and deallocate) space for the local variables?
> If the C local variables are made static, does this possible performance
> advantage disappear?

It depends on the size of the variables. Dynamic variables are allocated from the stack, so if you've got huge arrays, VMS will obviously have more work to do to allocate them on the stack. Another point to consider is that static variables are allocated during image activation in R/W, copy-on-reference (CRF) sections. Depending on how frequently these sections are accessed, you may have paging activity to consider.

Jay Zorzy
Cullinet Software
San Jose, CA
levy@ttrdc.UUCP (Daniel R. Levy) (04/06/88)
In article <5981@swan.ulowell.edu>, hvo@hawk.ulowell.edu (Huy D. Vo) writes: > Now my point: most of the time, C functions want values, whereas > Fortran subroutines invariably expect addresses. If fetching an > address is faster than fetching a value, then Fortran wins. Any > microcode experts out there? I'm not a "microcode expert" but I can say right off the bat that I can't imagine indirection ever being faster, and it may well be slower (needing two bus cycles to fetch the address and then the object being addressed). If the object is available directly as a value, that obviates the need for a double fetch. In both cases, the object might then be cached in a register, which makes any further fetching unnecessary. -- |------------Dan Levy------------| Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa, | an Engihacker @ | <most AT&T machines>}!ttrdc!ttrda!levy | AT&T Data Systems Group | Disclaimer? Huh? What disclaimer??? |--------Skokie, Illinois--------|
jeh@crash.cts.com (Jamie Hanrahan) (04/07/88)
In article <281@cullsj.UUCP> jayz@cullsj.UUCP (Jay Zorzy) writes:
>From article <12387779883.20.STEINBERGER@KL.SRI.COM>, by STEINBERGER@KL.SRI.COM (Richard Steinberger):
>> C vs. Fortran speed: Since C variables are automatic (i.e., dynamic) by
>> default and Fortran variables are static, is it fair to conclude that in
>> general, a routine in C having the same number of local variables as
>> a "roughly identical" Fortran routine will take a bit longer because
>> the OS must allocate (and deallocate) space for the local variables?
>> If the C local variables are made static, does this possible performance
>> advantage disappear?
>
>It depends on the size of the variables. Dynamic variables are allocated
>from the stack, so if you've got huge arrays, VMS will obviously have more
>work to do to allocate them on the stack. Another point to consider, is
>that static variables are allocated during image activation in R/W,
>copy-on-reference (CRF) sections. Depending on the frequency these sections
>are accessed, you may have paging activity to consider.
>

Er, no. The job of allocating the space on the stack is accomplished by a single SUBL2 instruction -- the stack pointer value is decremented by the number of bytes in the dynamic variables. All references to such variables are handled as displacements from either the SP or another register in which the appropriate value of the SP is stored (this because other things might be pushed on the stack).

The stack is in a copy-on-reference section just like Fortran static variables are; offhand I'd say the number of page faults would be similar for similarly-designed programs.

Another question in the original posting had to do with the efficiency of pass-by-reference (Fortran default) vs. pass-by-value (C default). It will take longer for the called procedure to pick up an argument passed by reference, as one additional fetch is required.

On the other hand, if we're comparing C to Fortran, recall that C always pushes its argument lists on the stack and uses CALLS, while Fortran allocates static argument lists, modifies only those arguments that need to be modified at runtime, and uses CALLG. So if your argument list has nothing that needs to be evaluated at runtime (an array element with a non-constant subscript, for instance), the Fortran call will be faster.

Now you get to worry about frequency of procedure call vs. frequency of the procedure's access to its arguments...
darin@laic.UUCP (Darin Johnson) (04/08/88)
In article <281@cullsj.UUCP>, jayz@cullsj.UUCP (Jay Zorzy) writes:
> From article <12387779883.20.STEINBERGER@KL.SRI.COM>, by STEINBERGER@KL.SRI.COM (Richard Steinberger):
> > C vs. Fortran speed: Since C variables are automatic (i.e., dynamic) by
> > default and Fortran variables are static, is it fair to conclude that in
> > general, a routine in C having the same number of local variables as
> > a "roughly identical" Fortran routine will take a bit longer because
> > the OS must allocate (and deallocate) space for the local variables?
>
> It depends on the size of the variables. Dynamic variables are allocated
> from the stack, so if you've got huge arrays, VMS will obviously have more
> work to do to allocate them on the stack.

I haven't looked at the compiler output for a long time, BUT... doesn't the allocation of automatic variables just involve adjusting the stack (and possibly frame) pointers? (In fact, I am pretty sure that this is what happens, but I have been known to be worng :-)

If this is all that happens, then the size and number of automatic variables makes no difference at all to the overhead, which would just be one or two instructions. Since I don't know how FORTRAN allocates static variables (I hope they aren't put into a demand paged section :-) I would assume the C method is just as efficient. It is probably more efficient memory-wise, since automatic variables in rarely called routines never get allocated, whereas the memory for static variables would always hang around (of course, it could be paged out).

--
Darin Johnson (...ucbvax!sun!sunncal!leadsv!laic!darin)
              (...lll-lcc.arpa!leadsv!laic!darin)
All aboard the DOOMED express!
IMHW400@INDYVAX.BITNET (04/09/88)
Regarding relative speed of C and FORTRAN code: no, you can't assume that dynamic allocation takes significantly more time. Most VAX languages "allocate" dynamic local storage by assuming that the stack has enough space left and defining their addresses relative to the stack pointer. The runtime overhead involved is a single MOVAx instruction with registers for source and destination, plus indirection overhead. Depending on the degree of overlap during instruction decoding for the particular machine, the indirection overhead could in fact be zero. So, the best-case cost for stack allocation is one very brief instruction per CALL. So the overhead may hold academic interest, but no real practical interest.

On the other hand, IF your machine uses a significant amount of time for each indirection, AND the FORTRAN compiler is using absolute rather than PC-relative addressing (probably not!) then the overhead in C might become significant. But if it is really causing you grief, you probably have refined your code into too many itty-bitty subroutines.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mark H. Wood    IMHW400@INDYVAX.BITNET    (317)274-0749
Indiana University - Purdue University at Indianapolis
799 West Michigan Street, ET 1023
Indianapolis, IN 46202 USA
[@disclaimer@]
nagy%warner.hepnet@LBL.GOV (Frank J. Nagy, VAX Wizard & Guru) (04/10/88)
> C vs. Fortran speed: Since C variables are automatic (i.e., dynamic) by
> default and Fortran variables are static, is it fair to conclude that in
> general, a routine in C having the same number of local variables as
> a "roughly identical" Fortran routine will take a bit longer because
> the OS must allocate (and deallocate) space for the local variables?
> If the C local variables are made static, does this possible performance
> advantage disappear?

On the VAX there is very little performance lost due to the automatic variables. Assuming the variables are not initialized (in which case they would be essentially the same as static variables in terms of performance), then all that is needed is a single instruction:

    SUBL2   #<# of bytes of auto variables>,SP

to bump the stack pointer down to allocate the auto variables. Nothing need be done to deallocate them since the old value of the stack pointer is taken from the call frame by the RET instruction. This automatically "pops" the auto variables, the call frame AND the argument list (when a CALLS instruction is used).

With VAX C versus VAX FORTRAN the performance "hit" in doing a routine call is that C (also PASCAL for that matter) uses a CALLS and must PUSH the argument list onto the stack (in reverse order) before the CALLS. (Well, normally the argument list is built directly on the stack and is not PUSHed on from a static location.) VAX FORTRAN uses CALLG and a static argument list; but here FORTRAN must overwrite argument values which have changed from the values in the static list - this is true when a subroutine passes arguments it was called with to an inner routine. In these cases, much of the performance advantage of VAX FORTRAN is lost.

The other side of the coin is that by building the argument list on the stack at run-time, C and PASCAL automatically support recursion and ASTs in a natural and easy manner. Try that in FORTRAN (I have, I know and that's why I use C these days).

= Frank J. Nagy   "VAX Guru & Wizard"
= Fermilab Research Division EED/Controls
= HEPNET: WARNER::NAGY (43198::NAGY) or FNAL::NAGY (43009::NAGY)
= BitNet: NAGY@FNAL
= USnail: Fermilab POB 500 MS/220 Batavia, IL 60510
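The point about recursion can be illustrated in plain C. In this sketch (the function names are invented for the example) the same recursive sum is written once with an automatic local and once with a static local. The static version gives the wrong answer because every activation shares a single copy of the variable, which is roughly the position a routine with statically allocated locals would be in if it called itself.

```c
/* Automatic local: each activation gets its own `saved` on the stack,
 * so the value survives the recursive call. */
int sum_to_auto(int n)
{
    int saved = n;
    int rest;

    if (n == 0)
        return 0;
    rest = sum_to_auto(n - 1);
    return saved + rest;
}

/* Static local: all activations share one `saved`, so the innermost
 * call overwrites it (with 0) before any outer call reads it back. */
int sum_to_static(int n)
{
    static int saved;
    int rest;

    saved = n;
    if (n == 0)
        return 0;
    rest = sum_to_static(n - 1);
    return saved + rest;    /* saved no longer holds this call's n */
}
```

Calling sum_to_auto(4) yields 10, while sum_to_static(4) yields 0, since by the time each outer call adds saved to rest the shared variable has already been reset by the deepest call.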
jayz@cullsj.UUCP (Jay Zorzy) (04/13/88)
From article <281@cullsj.UUCP>, by jayz@cullsj.UUCP (Jay Zorzy):
> It depends on the size of the variables. Dynamic variables are allocated
> from the stack, so if you've got huge arrays, VMS will obviously have more
> work to do to allocate them on the stack. Another point to consider, is
> that static variables are allocated during image activation in R/W,
> copy-on-reference (CRF) sections. Depending on the frequency these sections
> are accessed, you may have paging activity to consider.

I've taken a lot of heat on this one. Yes, it's true that all that's required to expand the stack is a simple SUBL instruction to reset the stack pointer to a lower address in P1 space. Where you pay a penalty is when this address is beyond any previously allocated P1 space; the next instruction that attempts to access this address will incur an access violation. The access violation is then handled by an exception service routine which, if possible, must allocate additional virtual memory in P1 space. So there is some dynamic memory allocation overhead involved in some cases, particularly in recursive routines that have large local data segments. The difference is the memory is allocated during execution instead of image activation.

By the way, if the aforementioned exception service routine fails to allocate the needed memory (insufficient PGFLQUO/VIRTUALPAGECNT/etc), then the access violation is passed on to user or system condition handlers. So an ACCVIO with a reason mask bit 0 set and virtual address in P1 space (i.e. 3FFFFFFF < VA < 80000000) is really an insufficient memory condition.

Speaking of stacks, here's a common gotcha to watch out for: If you're issuing any asynchronous $QIOs (no wait, I/O completion handled by AST routine), make sure to declare your IOSB variables as static. Otherwise, once the calling routine returns, the stack is reset; an IOSB on the stack may now point to never-never land.

Hope this information is useful...
Jay Zorzy Cullinet Software San Jose, CA
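The IOSB gotcha comes down to storage lifetime. The sketch below is a portable analogy only: struct iosb, its fields, and both function names are invented for illustration, and no real $QIO call is made. It shows a status block with static storage whose address stays valid after the function that handed it out has returned, which is what an asynchronous completion routine would need.

```c
/* Hypothetical stand-in for an I/O status block; this is NOT the real
 * VMS IOSB layout. */
struct iosb {
    unsigned short status;
    unsigned short count;
    unsigned long  info;
};

/* Simulated "start I/O": returns the address where the completion
 * status will be written later.  The block is static, so it still
 * exists after this function returns; an automatic (stack) block
 * would not. */
struct iosb *start_request(void)
{
    static struct iosb request_iosb;

    request_iosb.status = 0;    /* 0 means "still pending" in this sketch */
    request_iosb.count = 0;
    request_iosb.info = 0;
    return &request_iosb;
}

/* Simulated completion routine, run after start_request() has returned. */
void complete_request(struct iosb *p, unsigned short status,
                      unsigned short count)
{
    p->status = status;
    p->count = count;
}
```

Note that a single static block supports only one outstanding request at a time in this sketch; code with several requests in flight would need one block per request, each with a lifetime that outlasts the completion.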