bobg+@andrew.cmu.edu (Robert Steven Glickstein) (01/15/90)
Here is a fragment of Yacc code:

    %token PLUS MINUS
    %%
    expr: mulexpr PLUS mulexpr
        | mulexpr MINUS mulexpr

It's very straightforward; the yylex() routine must be written to return
the constant PLUS when it encounters a '+' in the input, and the
constant MINUS when it encounters a '-' in the input. However, Yacc
allows you to rewrite the above fragment as

    %%
    expr: mulexpr '+' mulexpr
        | mulexpr '-' mulexpr

My question is, where does Yacc find the '+' and the '-' characters?
Apparently they're not gotten via a call to yylex(). Does Yacc simply
do a getchar()?

I ask because I have written a parser which can be configured to read
from various input sources (standard input, file, string). There are
many places in the parser where token-name constants are used when a
single character will do; however, if the single character is retrieved
by getchar() or some other hardwired mechanism, I'll have to stick to
the yylex() approach (since my yylex() knows where to read characters
from).

Please e-mail your replies. Thanks in advance.

_______________________________
Bob Glickstein, System Designer
Information Technology Center room 220
Carnegie Mellon University
Pittsburgh, PA 15213-3890
(412) 268-6743

Internet: bobg+@andrew.cmu.edu
Bitnet:   bobg%andrew.cmu.edu@cmuccvma.bitnet
UUCP:     ...!harvard!andrew.cmu.edu!bobg

I could dance till the cows come home. On second thought, I'd rather
dance with the cows till you come home.
        -- Groucho Marx
evan@plx.UUCP (Evan Bigall) (01/16/90)
>
>    expr: mulexpr PLUS mulexpr
>        | mulexpr MINUS mulexpr
>
>It's very straightforward; the yylex() routine must be written to return
>the constant PLUS when it encounters a '+' in the input, and the
>constant MINUS when it encounters a '-' in the input. However, Yacc
>allows you to rewrite the above fragment as
>
>    expr: mulexpr '+' mulexpr
>        | mulexpr '-' mulexpr
>
>My question is, where does Yacc find the '+' and the '-' characters?
>Apparently they're not gotten via a call to yylex(). Does Yacc simply
>do a getchar()?

Quoting from the yacc section of my sys5.2 "Support Tool Guide":

}  The rules section is made up of one or more grammar rules. A grammar
}rule has the form
}
}A : BODY ;
}
}where "A" represents a nonterminal name, and "BODY" represents a sequence of
}zero or more names and LITERALS {my emphasis}. The colon and the semicolon
}are yacc punctuation.

{later it says:}

}A literal consists of a character enclosed in single quotes ('). As in C
}language, the backslash (\) is an escape character within literals....

Really all that is going on here is that yacc is using the value of the
character literal as the token number. The characters still arrive through
yylex(); yacc never reads the input itself. This is why the yacc-generated
token numbers start at 257 (on machines with "normal" char sets).

The standard way to represent this as a lex rule is:

    .    return(*yytext);

to return a literal for all characters not recognized by another rule.

Evan
--
Evan Bigall, Plexus Software, Santa Clara CA (408)982-4840 ...!sun!plx!evan
"I barely have the authority to speak for myself, certainly not anybody else"
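
For a hand-written yylex() like the one in the original question, the same idea might look roughly like the sketch below. It is only an illustration under stated assumptions: NUMBER, set_input(), next_char() and unread_char() are hypothetical names that do not come from either post, the input source is hard-wired to a string for brevity, and yylval is declared here as a plain int only so the fragment compiles on its own (in a real build the yacc-generated parser defines it).

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical named token. yacc numbers named tokens from 257 upward,
   so they cannot collide with single-character token codes. */
#define NUMBER 257

/* Stand-in for the parser's semantic value; normally supplied by the
   yacc-generated code, declared here only to keep the sketch standalone. */
int yylval;

/* Hypothetical input source: a NUL-terminated string. A configurable
   lexer would dispatch to stdin, a file, or a string at this point. */
static const char *src = "";

void set_input(const char *s)
{
    src = s;
}

static int next_char(void)
{
    return *src ? (unsigned char)*src++ : EOF;
}

static void unread_char(int c)
{
    if (c != EOF)
        --src;          /* step back over the character just consumed */
}

int yylex(void)
{
    int c;

    /* Skip blanks and tabs between tokens. */
    while ((c = next_char()) == ' ' || c == '\t')
        ;

    if (c == EOF)
        return 0;       /* yacc treats 0 as end of input */

    if (isdigit(c)) {
        yylval = 0;
        do {
            yylval = yylval * 10 + (c - '0');
            c = next_char();
        } while (c != EOF && isdigit(c));
        unread_char(c); /* the non-digit belongs to the next token */
        return NUMBER;
    }

    /* Any other character is its own token number, which is what a
       grammar rule such as  expr: mulexpr '+' mulexpr  expects. */
    return c;
}
```

Because every character passes through next_char(), the grammar can use '+' and '-' directly while the lexer keeps full control over where the input comes from.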