najmi@hpcuhb.HP.COM (Farrukh Najmi) (02/04/89)
I would appreciate some help with a YACC related question. I know I can look at the y.output (on HPUX anyways) to find the productions in each state's LALR(1) items. However I want to find the lookahead set for a given production for a given state. In other words I need to know how I can get the lookahead portion of the LALR(1) item by looking at YACC generated files. Thanks in advance for any help. Farrukh Najmi. [From hp-sdd!najmi@hpcuhb.HP.COM (Farrukh Najmi)] -- Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
djones@megatest.uucp (Dave Jones) (02/07/89)
> I know I can look at the y.output (on HPUX anyways) to > find the productions in each state's LALR(1) items. However > I want to find the lookahead set for a given production > for a given state. > > In other words I need to know how I can get the lookahead portion > of the LALR(1) item by looking at YACC generated files. > Thanks in advance for any help. > > Farrukh Najmi. Tain't easy. And if you could do it, you probably would not have what you wanted anyway. Do you want to calculate these for error- messages? (If so, you might ought to have said so.) You see, which tokens can next be accepted depends not just on the state which is on the top of the stack, but potenitally on other states deep underneath. That's because yacc tables are LALR(1) (with default productions as well), not the space-consuming canonical LR(1) tables. So if you had a list of all tokens which could be shifted when a given state was on top of the stack, it might have entries which would not be legal as a next token in an input file which had blocked the parser with a syntax error. But be of good cheer! If you only need the information at runtime, for error-messages, you can hack the parser, yyparse() or its genotype, yaccpar, so that when it blocks, it will first save a copy of the stack, then run in simulation on each possible token until it either shifts or blocks. If it shifts it, remember the token. In any case, restore the stack and try again, until you've tested all the tokens. Then report, "legal tokens are...", and spew out all the ones you have remembered. Dave J. P.S. I'm writing a compiler using yacc, even as we speak, and I just figured this out on the fly. I think I'm going to incorporate it! Thanks for the inspiration. -- Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
steve@cs.utexas.edu (02/17/89)
In article <3328@ima.ima.isc.com>, djones@megatest.uucp (Dave Jones) writes: > [yacc error states are hard to analyze, particularly if you want to see > the follow sets in a particular context.] In fact here is the code that does this error dumping: 257,275c256,257 < if ((yy_n = yypact[yystate]) > YYFLAG && yy_n < YYLAST) < { < short index; < extern char *p_error(); < fprintf(stderr,"expecting: "); < for (index = yy_n>0 ? yy_n : 0; < index < YYLAST; < ++index) < { < if (yychk[yyact[index]] == index - yy_n < && index - yy_n != YYERRCODE) < fprintf(stderr,"%s, ",p_error(index-yy_n)); < } < fprintf(stderr,"found:"); < fprintf(stderr,"%s\n",p_error(yychar)); < } < else < yyerror( "syntax error" ); < goto skip_init; --- > yyerror( "syntax error" ); > goto skip_init; For those of you who cannot tell this is a diff of yaccpar that does that does this error dumping. This yaccpar is from the standard SYSTEM V. You do need to supply the function char *p_error(int). This function takes the token number passed in, and changes it into a string name for that token. Errors come out in the fashion: expecting: ')', ',', '+', found '['. Not totally coherent, but better than 'syntax error'. This message gets very long in some cases. Three last notes: 1) Credit for this change does not belong to me but to one of my partners at Remora, Inc. I think he got this trick from a book (which my copy is unfortunately at home) on building a compiler in UNIX (reference supplied if requested). At least I can claim I understand what is happening here. 2) Do not comment on the programing style too much. This code was hacked today from our (Remora's) version of yaccpar into the UNIX version just for this post. The code as posted works (is tested). 3) Excuse the cryptic nature of this message, I am typing it in on my own lunch time when I should be debuging Motorola's code. enough from this mooncalf - Steven -- Steven R Weintraub cs.utexas.edu!oakhill!devsys!steve Motorola Inc. Austin, Texas (512) 440-3023 (office) (512) 453-6953 (home) -- Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request