[net.unix] Regular Expressions in Lex

bill@hao.UUCP (07/21/86)

I'm trying to use Lex to dig out the OPEN statements in a Fortran source file.
Of course, I have to be aware of continuation lines, so I use a regular
expression (see example 1.) to dig out a statement with a single continuation
line.

	1. ^[^Cc].*open[ ]*\(.*\n[     .].*\)  printf ("%s", yytext);  

This seems to work fine.  But when I try to find multiple continuation lines
with the pattern given in 2., I don't even get the single continuation lines.

        2. ^[^Cc].*open[ ]*\(.*\n[[     .].*[\n\)]]+ printf ("%s", yytext); 

My questions are:
	a) Isn't the r.e. [[     .].*[\n\)]]+ valid?
	b) If it is, what am I doing wrong?  i.e. why doesn't it work?
	c) Do you have any suggestions on a better reference to Lex than that
	   given in the Unix Programmer's Manual, III, by Lesk and Schmidt?

Thanks in advance for any ideas or suggestions.

							Bill Roberts
							NCAR/HAO
							Boulder, CO
							!hao!bill

bill@hao.UUCP (Bill Roberts) (07/25/86)

Thanks to all who responded to my earlier query (msgs # 9080).  I finally got
it figured out the other day and thought some people might be interested in the 
solution.  To restate the problem:
  I needed a regular expression to use as part of a lex input file to dig out 
  all syntactically valid instances of f77 open statements, in their various
  guises.  That is, I want to find things like
       
	if (fred .eq. betty)
     1   open(unit=1, file=fred, .... )

     and

	open(unit=1, ....,
     #  more..............,
     @  xya, abc, and more,
	  .
	  .
	  .
     n  and finally this)

     as well as regular things like

	open (fred the open statement)

  In addition, I wanted to ignore any commented open statements.  One solution 
  is this: (note that this is a Lex input file)

C	\n"     ".
A	([ \t]|{C})*
B	[ \t]
D	[^=\n]
%%
^{B}+{D}*o{A}p{A}e{A}n{A}\(.*({C}.*)*\)[ ]*\n  {
			return(OPENR);
			}
.	;

\n	;

  This seems to work on all of the cases I've stated above (are there others?).
  It even handles the cases where the letters of 'open' are separated by spaces,
  tabs, or newlines (as is allowed in Fortran, argh!).  I hope this might be of
  interest or help to someone out there.  It was quite enlightening (and
  frustrating) to me.  I only had to read the Lex paper
  ("Lex - A Lexical Analyzer Generator", M.E. Lesk and E. Schmidt) n times!
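
  For anyone who wants to check the pattern against test input without running
  Lex, the fixed-form rules it leans on can be sketched in plain C.  This is
  only an illustration under stated assumptions, not part of the Lex file
  above: the function names (is_comment, is_continuation, has_open,
  count_opens) and the MAXSTMT limit are made up for this sketch, it folds case
  (the Lex pattern matches lower-case 'open' only), and it ignores character
  constants that happen to contain the text 'open('.

```c
#include <ctype.h>
#include <string.h>

#define MAXSTMT 2048   /* hypothetical cap on one joined statement */

/* Fixed-form comment line: 'C', 'c' or '*' in column 1. */
int is_comment(const char *line)
{
    return line[0] == 'C' || line[0] == 'c' || line[0] == '*';
}

/* Continuation line: columns 1-5 blank, column 6 neither blank nor '0'. */
int is_continuation(const char *line)
{
    int i;

    for (i = 0; i < 5; i++)
        if (line[i] != ' ')
            return 0;           /* also stops at '\0' on short lines */
    return line[5] != '\0' && line[5] != '\n' &&
           line[5] != ' ' && line[5] != '0';
}

/* Append src to dst without overflowing MAXSTMT. */
static void append(char *dst, size_t *used, const char *src)
{
    while (*src != '\0' && *used < MAXSTMT - 1)
        dst[(*used)++] = *src++;
    dst[*used] = '\0';
}

/* 1 if the joined statement contains "open(" once all white space is
   dropped, with no '=' in front of it (the job {D} does in the pattern). */
int has_open(const char *stmt)
{
    char squeezed[MAXSTMT];
    size_t n = 0;
    const char *hit, *eq;

    for (; *stmt != '\0' && n < MAXSTMT - 1; stmt++)
        if (!isspace((unsigned char)*stmt))
            squeezed[n++] = (char)tolower((unsigned char)*stmt);
    squeezed[n] = '\0';

    hit = strstr(squeezed, "open(");
    eq = strchr(squeezed, '=');
    return hit != NULL && (eq == NULL || eq > hit);
}

/* Count the logical statements that contain an OPEN. */
int count_opens(const char *lines[], int nlines)
{
    char stmt[MAXSTMT];
    size_t used = 0;
    int i, count = 0;

    stmt[0] = '\0';
    for (i = 0; i <= nlines; i++) {
        int at_end = (i == nlines);

        if (!at_end && is_comment(lines[i]))
            continue;                        /* comments are skipped */
        if (!at_end && is_continuation(lines[i])) {
            append(stmt, &used, lines[i] + 6);   /* text after column 6 */
            continue;
        }
        /* a new initial line, or end of input, closes the statement */
        if (used > 0 && has_open(stmt))
            count++;
        used = 0;
        stmt[0] = '\0';
        if (!at_end)
            append(stmt, &used, lines[i]);
    }
    return count;
}
```

  count_opens() takes the source as an array of lines and returns how many
  logical statements contain an OPEN.  A line such as 'x = reopen(3)' is
  rejected because of the '=' in front of the match, and a commented-out open
  is never looked at.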

							Bill Roberts
							NCAR/HAO
							Boulder, CO
							!hao!bill