[comp.archives] [perl] Re: Quoting and Splitting

hakanson@ogicse.ogi.edu (Marion Hakanson) (09/26/90)

Archive-name: dnsparse/25-Sep-90
Original-posting-by: hakanson@ogicse.ogi.edu (Marion Hakanson)
Original-subject: Re: Quoting and Splitting
Archive-site: cse.ogi.edu [129.95.10.2]
Archive-directory: /pub
Reposted-by: emv@math.lsa.umich.edu (Edward Vielmetti)

In article <adler.654289321@betwixt> adler@betwixt..caltech.edu (B. Thomas Adler) writes:
>. . .
>spacing.  My question is, is there a way to have split() split on
>white-space, while respecting the restrictions imposed by any double quoting?
>
>i.e., I'd like the line
>	Field_1  parm_1		"This is example one"
>
>to split into three components, rather than 6.

This has been discussed several times before.  If you allow the quotes
to be escaped (with a backslash, which can itself be escaped by a
backslash, etc.), then you aren't going to be able to do this with a
regular expression.  Even if you don't allow escapes, the regular
expression will be ugly.

Since you mentioned nameserver files, you may find the approach I took
to be of use to you.  Use anonymous FTP to retrieve the file
pub/dnsparse-2.0.tar.Z from host cse.ogi.edu.  Briefly, there is a
lexical analyzer (tokenizer) written in C, which is used by Perl code
to fully parse DNS master files.  The lexer deals with quotes, etc.,
and the Perl code does the rest.
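
To give a feel for what such a tokenizer has to do, here is a minimal
sketch in C.  It is not the dnsparse lexer, just an illustration under
some assumptions: the names split_quoted and MAX_TOKEN_LEN are made up
for this example, tokens are copied into fixed-size buffers, the quote
characters are dropped from the output, and a backslash makes the next
character literal whether or not it appears inside quotes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 256   /* hypothetical limit, including the NUL */

/* True for the characters treated as token separators. */
static int is_blank(char c)
{
    return c != '\0' && strchr(" \t\r\n", c) != NULL;
}

/*
 * Split `line` on whitespace, keeping a double-quoted run inside one
 * token.  A backslash makes the following character literal, so \" and
 * \\ can appear in a token.  The quote characters are not copied.
 * Returns the number of tokens stored (at most max_tokens); tokens
 * longer than MAX_TOKEN_LEN - 1 characters are truncated.
 */
size_t split_quoted(const char *line, char tokens[][MAX_TOKEN_LEN],
                    size_t max_tokens)
{
    size_t count = 0;
    const char *p = line;

    assert(line != NULL);

    while (count < max_tokens) {
        size_t len = 0;
        int in_quotes = 0;

        /* Skip the whitespace that separates tokens. */
        while (is_blank(*p))
            p++;
        if (*p == '\0')
            break;

        /* Collect one token: stop at unquoted whitespace or end of line. */
        while (*p != '\0' && (in_quotes || !is_blank(*p))) {
            char c = *p++;

            if (c == '"') {
                in_quotes = !in_quotes;   /* toggle state, drop the quote */
                continue;
            }
            if (c == '\\' && *p != '\0')
                c = *p++;                 /* take the escaped char as-is */
            if (len < MAX_TOKEN_LEN - 1)
                tokens[count][len++] = c;
        }
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

Called on the example line from the question, this yields three tokens
rather than six.  The point of the state variable is that the decision
"is this whitespace a separator?" depends on what came before it, which
is exactly the part that is awkward to express as a split() pattern.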

-- 
Marion Hakanson         Domain: hakanson@cse.ogi.edu
                        UUCP  : {hp-pcd,tektronix}!ogicse!hakanson