[comp.compilers] context sensitive parsing

johnl@ima.UUCP (05/02/87)

Does anyone know any leads to compiler work where the target language
is context sensitive?  I'm interested in designing a compiler for a
language known as REXX, which is currently interpreted.  Most of the
development tools and books I've seen so far tend to emphasize
context free languages.  And here is the kicker: are there any
tools which work just as well in an IBM (EBCDIC) as well as the
ASCII environments?  Please respond to by BITNET address if
possible.

Kenneth Ng: Post office: NJIT - CCCC, Newark New Jersey  07102
uucp !ihnp4!allegra!bellcore!argus!ken
     ***   WARNING:  NOT ken@bellcore.uucp ***
bitnet(prefered) ken@orion.bitnet

Kirk: "I don't care if you hit the broadside of a barn"
Spock: "Why should I aim at such an object?"
[I wouldn't have thought that REXX was context sensitive.  Can you send along
some examples of stuff that you can't parse with the usual context-free
techniques?  Remember, just because something is ambiguous doesn't mean that
you can't fake it with yacc.  As far as EBCDIC goes, yacc will try to parse
the tokens you give it without regard to religion, national origin, or
character set.  Your lexer has to speak EBCDIC, but the parser needn't care.]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

dowding@bigburd.PRC.Unisys.COM (John Dowding) (05/05/87)

In article <562@ima.UUCP> Kenneth Ng writes:
>Does anyone know any leads to compiler work where the target language
>is context sensitive?  

I think that the use of term context-free and context-sensitive
are misleading here.  Most programming languages have context-sensitive
restrictions, but they are handled in the semantics, not the syntax.
As the semantic actions are generally arbitrarily powerful, they can
certainly do context-sensitive checking.  

An example of this is the constraint in some languages that a function
be defined before it is called.  This can not be checked for by
a context-free grammar.

John Dowding

dowding@bigburd.prc.unisys.com
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

johnl@ima.UUCP (05/14/87)

In article <564@ima.UUCP>, dowding@bigburd.PRC.Unisys.COM (John Dowding) writes:
> I think that the use of term context-free and context-sensitive
> are misleading here.  Most programming languages have context-sensitive
> restrictions, but they are handled in the semantics, not the syntax.
> As the semantic actions are generally arbitrarily powerful, they can
> certainly do context-sensitive checking.  
> An example of this is the constraint in some languages that a function
> be defined before it is called.  This can not be checked for by
> a context-free grammar.

Syntax is to do with structure and can be checked by static
analysis of the program. We can then distinguish context free syntax
which is to do with raw structure from context sensitive syntax which
is to do with consistency within raw structure. We can also distinguish
concrete syntax which is to do with representation from abstract syntax
which is to do with meaningful structure. Semantics is to do with meaning
alone. In Chomsky terms: lexicon = type 3 grammar
                         context free syntax = type 2 grammar
                         context sensitive syntax = type 1 grammar
                         semantics = type 0 grammar

                         concrete syntax = surface structure
                         abstract syntax = deep structure

This confusion originates from the early days of syntax directed translation.
The term 'static semantics' used to be applied to context sensitive syntax
because it couldn't be handled by a context free approach and people thought
that all that was needed for parsing was a context free grammar. Thus,
anything else must be semantics! Furthermore, in denotational semantics,
details like context sensitivity are deemed irrelevant and so get pushed
into the semantic equations. That doesn't make them semantics though!

Nowadays, attribute grammars are used to specify context sensitivity and
to enable context sensitive checking in compilers. See articles by Knuth,
& Watt. See McGettrick's & Pagan's books on language definition.
[It's always been my impression that the distinction between syntax and
semantics degenerates into religion pretty quickly.  Practically any aspect
of a language could be argued to be either.  In useful compilers, though, it
seems that you're better off having your parser recognize an overgeneral
version of your language, and then throwing out invalid cases later; it makes
it a lot easier to produce useful error messages, e.g. "this should have been
a pointer but it's not" rather than "invalid token after `*'".  -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request