john@uw-nsr.UUCP (John Sambrook) (05/29/86)
[]

Regarding error recovery in C compilers, I like the error recovery provided by the Data General C compiler. Here is an example of a botched program:

    main() {
        int a = 0                                   /* missing ";" */

        printf("a: %s\n", (a == 1) ? "1" : "?";     /* missing ")" */
    }

When compiled the following is written on stderr:

    Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
        printf("a: %s\n", (a == 1) ? "1" : "?";
        ^
    Syntax Error.
    A symbol of type ";" has been inserted before this symbol.

    Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
        printf("a: %s\n", (a == 1) ? "1" : "?";
                                              ^
    Syntax Error.
    A symbol of type ")" has been inserted before this symbol.

In this example the compiler produced a program that executed correctly.

To be fair, both errors are "errors of omission." I believe, but do not assert, that these errors are easier to repair than other types of errors. In the event of serious errors the compiler will cease code generation and only check the remaining input. I don't know the parsing method used in this compiler; it does not seem to suffer from poor error recovery as do many recursive-descent parsers.

While on the subject of compilers, I would like to share two other features of this compiler that I find useful. I have not found these features in other C compilers that I have used, although I have heard that the VAX/VMS C compiler is very good.

The first feature is the ability to generate a stack trace ("traceback") in the event of a serious error. There are two compiler switches that control the amount of information in a traceback. The "-Clineid" switch causes the offending line number to be included while the "-Cprocid" switch causes the procedure name to be included.

The second feature is the ability to declare certain data structures as "read only." This is done via a compiler switch "-R" and applies to all data structures that are initialized to a constant value within the compilation unit.
Here is an example program that demonstrates both features:

    int a = 1;              /* "read only" */

    main() {
        int b;              /* "read / write" */

        /* this is legal */
        b = a;

        /* prove it */
        printf("a: %d b: %d\n", a, b);

        /* this is not */
        a = 2;

        /* lie detector */
        printf("Can't happen\n");
    }

This program was compiled with "cc -R -Clineid -Cprocid main.c -o main." Executing the program produced the following output on stderr:

    a: 1 b: 1
    ERROR 71237. from line 13 of main.
    Call Traceback:
    from fp=16000002722, pc=16001754200, line 10 of main
    from fp= 0, pc=16001762472
    Hardware protection violation: Write access denied.

In the traceback the phrase "line 10" is because "-Clineid" was specified at compile time. The phrase "of main" is because "-Cprocid" was specified at compile time. Note though that the two "offending" line numbers differ. I suspect that this is because the last line to execute successfully was at line 10; line 13 did not execute successfully but rather generated a processor trap.

Some people might say that this compiler is too verbose, or that the features cost too much in terms of execution overhead. I have not found this to be the case. And, while I hate to see any tracebacks, I find them to be far better than:

    % mumble foo bar
    Segmentation violation - core dumped.
    %

Please note that I do not intend to participate in any "holy wars." I do not speak for Data General nor do I receive any compensation from them. I just felt that this might be of interest to the readers of this group.

--
John Sambrook                    Work: (206) 545-2018
University of Washington WD-12   Home: (206) 487-0180
Seattle, Washington 98195        UUCP: uw-beaver!uw-nsr!john
bright@dataioDataio.UUCP (Walter Bright) (05/29/86)
In article <312@uw-nsr.UUCP> john@uw-nsr.UUCP (John Sambrook) writes:
>While on the subject of compilers, I would like to share two other features
>of this compiler that I find useful.
>The second feature is the ability to declare certain data structures as
>"read only." This is done via a compiler switch "-R" and applies to all
>data structures that are initialized to a constant value within the
>compilation unit.
>
>    int a = 1;          /* "read only" */
>    main() {
>        int b;          /* "read / write" */
>
>        b = a;          /* this is legal */
>        a = 2;          /* this is not */
>    }

The declaration for a can be made 'read only' by declaring it as follows:

    const int a = 1;

Doing an assignment to a will then cause a syntax error when compiling. This is in the draft ANSI C spec.
guy@sun.uucp (Guy Harris) (06/01/86)
> Regarding error recovery in C compilers, I like the error recovery
> provided by the Data General C compiler. Here is an example of a
> botched program:
>
>     main() {
>         int a = 0                                   /* missing ";" */
>
>         printf("a: %s\n", (a == 1) ? "1" : "?";     /* missing ")" */
>     }
>
> When compiled the following is written on stderr:
>
>     Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
>         printf("a: %s\n", (a == 1) ? "1" : "?";
>         ^
>     Syntax Error.
>     A symbol of type ";" has been inserted before this symbol.

I think the Berkeley Pascal compiler does the same sort of thing. I don't know whether this would be practical for the C compiler or not.

> I don't know the parsing method used in this compiler; it does not seem
> to suffer from poor error recovery as do many recursive-descent parsers.

For what it's worth, many (if not most) UNIX C compilers don't use recursive descent parsers; they are based on PCC which uses YACC.

> While on the subject of compilers, I would like to share two other features
> of this compiler that I find useful. I have not found these features in
> other C compilers that I have used, although I have heard that the VAX/VMS
> C compiler is very good.
>
> The first feature is the ability to generate a stack trace ("traceback")
> in the event of a serious error. There are two compiler switches that
> control the amount of information in a traceback. The "-Clineid" switch
> causes the offending line number to be included while the "-Cprocid" switch
> causes the procedure name to be included.

You can certainly get the equivalent of that in many UNIX systems; the only difference is that you go into a debugger when the program drops core and ask the debugger for a stack trace. I'm not convinced that getting a traceback "for free" is any better than going "dbx mumble" or "sdb mumble" when you get the "Segmentation violation - core dumped" error you so dislike and asking for a stack trace.
(If you've compiled the program with the "-g" flag, it will give you line numbers in the stack trace. It will also permit you to examine variables in the program. Merely being told what the call stack looked like may not be sufficient to enable you to figure out what happened, so you may have to go into a debugger anyway, even on the DG system.)

> The second feature is the ability to declare certain data structures as
> "read only." This is done via a compiler switch "-R" and applies to all
> data structures that are initialized to a constant value within the
> compilation unit.

Much too crude. What you want is the feature which will appear in ANSI C, where you can specify which objects are to be constants and which are not. The objects which are to be constants can be put in sharable read-only regions. The trouble with the "-R" flag as you describe it is that you have to make sure that the initialized data structures which are to be read-only are in separate source files from the ones which are to be read-write.

--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.arpa
mike@peregrine.UUCP (Mike Wexler) (06/01/86)
Here's how I do it.

> % mumble foo bar
> Segmentation violation - core dumped.

    % dbx mumble
    Reading symbolic information..
    Read 98 Symbols
    (dbx) where
    foo(b = 7), line 15 in "mumble.c"
    main(argc = 1, argv = 16776148, 0xfffbdc), line 5 in "mumble.c"
    (dbx) list 15
       15   a=*q;
    (dbx) print q
    q = (nil)
    (dbx) quit

The advantage of this is that dbx will give me all the information I want and not the information that I don't want.

--
Mike Wexler
Email address:(trwrb|scgvaxd)!felix!peregrine!mike
Tel Co. address: (714)855-3923
;-) Internet address: ucivax@ucbvax.BERKELY.EDU!ucivax%felix!mike@peregrine :-(
mike@peregrine.UUCP (Mike Wexler) (06/03/86)
In article <1009@dataioDataio.UUCP> bright@dataio.UUCP (Walter Bright) writes:
>In article <312@uw-nsr.UUCP> john@uw-nsr.UUCP (John Sambrook) writes:
>>The second feature is the ability to declare certain data structures as
>>"read only." This is done via a compiler switch "-R" and applies to all
>>data structures that are initialized to a constant value within the
>>compilation unit.
>The declaration for a can be made 'read only' by declaring it as follows:
>
>    const int a = 1;
>
>Doing an assignment to a will then cause a syntax error when compiling.
>This is in the draft ANSI C spec.

This only works if you have an ANSI C compiler. Do you? BTW, the C compiler on the SUN has a -R option that makes initialized variables shared and read-only.

--
Mike Wexler
Email address:(trwrb|scgvaxd)!felix!peregrine!mike
Tel Co. address: (714)855-3923
;-) Internet address: ucivax@ucbvax.BERKELY.EDU!ucivax%felix!mike@peregrine :-(
franka@mmintl.UUCP (Frank Adams) (06/03/86)
In article <312@uw-nsr.UUCP> john@uw-nsr.UUCP writes:
>I don't know the parsing method used in
>this compiler; it does not seem to suffer from poor error recovery as do
>many recursive-descent parsers.

My impression is that the quality of the error recovery has little to do with the parsing method used, and a great deal to do with how much effort is invested in making the error recovery good.

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108
john@frog.UUCP (John Woods, Software) (06/04/86)
>>While on the subject of compilers, I would like to share two other features
>>of this compiler that I find useful.
>>
>>The first feature is the ability to generate a stack trace ("traceback")
>>in the event of a serious error. There are two compiler switches that
>>control the amount of information in a traceback. The "-Clineid" switch
>>causes the offending line number to be included while the "-Cprocid" switch
>>causes the procedure name to be included.
>

I have occasionally used a trick for programs which only seem to crash when I am not around to poke through entrails: writing a C function that catches signals and prints an admittedly crude stack backtrace. While it is probably nicer to have the compiler assist in doing this (I can remember the reams of output that a student-oriented ALGOL-60 interpreter once gave), I still find that if I need much more than the hint that this little backtrace routine of mine gives, I need all of a debugger, anyway.

By the way, my backtracer is written entirely in C, with the exception of a call on an assembly routine getfp(), which returns the current frame-pointer value. Which, by the way, is another function with a problematical type:

    it returns a pointer to a --\
                 ^               |
                  \______________/   ...

(I could have cheated on the getfp() by using the address of a parameter, but as long as I had the RIGHT tool...)

--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw%mit-ccc@MIT-XX.ARPA

"Imagine if every Thursday your shoes exploded if you tied them the usual way. This happens to us all the time with computers, and nobody thinks of complaining."
			Jeff Raskin, interviewed in Doctor Dobb's Journal
meissner@dg_rtp.UUCP (Michael Meissner) (06/06/86)
In article <312@uw-nsr.UUCP> john@uw-nsr.UUCP (John Sambrook) writes:
>
>Regarding error recovery in C compilers, I like the error recovery
>provided by the Data General C compiler. Here is an example of a
>botched program:
>
>    main() {
>        int a = 0                                   /* missing ";" */
>
>        printf("a: %s\n", (a == 1) ? "1" : "?";     /* missing ")" */
>    }
>
>When compiled the following is written on stderr:
>
>    Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
>        printf("a: %s\n", (a == 1) ? "1" : "?";
>        ^
>    Syntax Error.
>    A symbol of type ";" has been inserted before this symbol.
>
>
>    Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
>        printf("a: %s\n", (a == 1) ? "1" : "?";
>                                              ^
>    Syntax Error.
>    A symbol of type ")" has been inserted before this symbol.
>
>In this example the compiler produced a program that executed correctly.
>
>To be fair, both errors are "errors of omission." I believe, but do not
>assert, that these errors are easier to repair than other types of errors.
>In the event of serious errors the compiler will cease code generation and
>only check the remaining input. I don't know the parsing method used in
>this compiler; it does not seem to suffer from poor error recovery as do
>many recursive-descent parsers.

It's a pleasant surprise when somebody says he likes something. I am the author of the Data General C compilers. The parsing method that I use is a standard LALR parse, based on an internal tool that constructs the tables from a BNF input grammar. In comparison to YACC, the tool is not as developer friendly, i.e., it only creates the tables; I have to write the routine that actually interprets the parse state machine and dispatches on the semantic actions. The error recovery routines must be provided as well. YACC, on the other hand, encapsulates the parser into the C program it generates. It also handles error recovery (badly, in my opinion), so that in general the user doesn't have to mess with it. It also means that the user does not really have the control either.

The algorithm that I use, which is the first part of Jerry Fisher's (from the SIGPLAN compiler construction conference), first attempts to insert, delete, or replace the token that is in error with any of the tokens that are in the follow set (i.e., would be possible, legal input), and then parses ahead 3 tokens. The first parse that will succeed for 3 tokens is selected (the tokens are given a priority, and tried in priority order). The second part of Jerry Fisher's algorithm is a complicated secondary recovery, which I initially attempted, and gave up because adapting his algorithm to my parser kept coming up with errors in my translation, or areas where I did not really understand what is going on deep within the LALR tables.

As near as I can understand from looking at it, the YACC approach is to discard tokens until it can reduce from an 'error' production. It's been my experience that this rarely does what the compiler writer wants.

As far as local replacement goes, I am currently thinking of adding another pass that would attempt to glue two tokens together (to make += out of + and = separated by whitespace). The priorities are the hardest thing to get a feeling for, and I still play with them every so often. As far as secondary recovery goes, my feeling still is that if you ever need to go to more extreme methods, the program is hopelessly damaged, and I question whether the programmer gets anything useful after the first few error messages.

>While on the subject of compilers, I would like to share two other features
>of this compiler that I find useful. I have not found these features in
>other C compilers that I have used, although I have heard that the VAX/VMS
>C compiler is very good.
>
>The first feature is the ability to generate a stack trace ("traceback")
>in the event of a serious error. There are two compiler switches that
>control the amount of information in a traceback. The "-Clineid" switch
>causes the offending line number to be included while the "-Cprocid" switch
>causes the procedure name to be included.

There have been a few responses saying dbx/adb gives you the information, if you compile with -g and look at the core file. The traceback feature (which is standard on almost all 32-bit DG compilers) produces smallish tables, which can be kept in the program file, even when it is shipped to users in production mode. We also support -g and dbx.

>The second feature is the ability to declare certain data structures as
>"read only." This is done via a compiler switch "-R" and applies to all
>data structures that are initialized to a constant value within the
>compilation unit.

This came from Berkeley 4.2 (and 4.3) and was added in an attempt to be as compatible with both 4.2 and System V.2 as we could. At some point in the future, when the ANSI X3J11 draft stabilizes to the point of going for public review, the `const' feature will also allow this without having to set the option. The private Data General keyword $shared allows this in the released revisions.

>John Sambrook                    Work: (206) 545-2018
>University of Washington WD-12   Home: (206) 487-0180
>Seattle, Washington 98195        UUCP: uw-beaver!uw-nsr!john

Michael Meissner
Data General Corporation
...{ decvax, ihnp4, ucbvax, ... }!mcnc!rti-sel!dg_rtp!meissner
bright@dataioDataio.UUCP (Walter Bright) (06/09/86)
In article <397@peregrine.UUCP> mike@peregrine.UUCP (Mike Wexler) writes:
>In article <1009@dataioDataio.UUCP> bright@dataio.UUCP (Walter Bright) writes:
>>The declaration for a can be made 'read only' by declaring it as follows:
>>    const int a = 1;
>>Doing an assignment to a will then cause a syntax error when compiling.
>>This is in the draft ANSI C spec.
>This only works if you have an ANSI C compiler. Do you?

I use the Datalight C compiler which does support const and volatile as defined in the draft standard. No other compiler that I'm aware of does.