jkh@pcsbst.UUCP (jkh) (03/12/89)
[ something *is* actually eating this line these days.. ]
I'm using the regex code out of 18.52 and would like to construct a regular expression that will match *all* of the characters in a string of text containing newlines. I tried ".*" (the most obvious), but it just matches up to the first newline. I haven't got newlines being treated specially in any way (I.E. as "or"s) so I can't figure out why this is happening. Any suggestions?
Jordan Hubbard
PCS Computer Systems
pyramid!pcsbst!jkh
pinkas@hobbit.intel.com (Israel Pinkas ~) (03/16/89)
In article <768@pcsbst.UUCP> jkh@pcsbst.UUCP (jkh) writes:
> I'm using the regex code out of 18.52 and would like to construct a
> regular expression that will match *all* of the characters in a string
> of text containing newlines. I tried ".*" (the most obvious), but it
> just matches up to the first newline. I haven't got newlines being treated
> specially in any way (I.E. as "or"s) so I can't figure out why this is
> happening.
The regular expression for . is defined to match any single character except a newline. See the GNU Emacs manual, section 13.5, "Syntax of Regular Expressions." Also see the man page for grep, vi, sed, or any other Unix program that uses regular expressions.
-Israel Pinkas
--------------------------------------
Disclaimer: The above are my personal opinions, and in no way represent the opinions of Intel Corporation. In no way should the above be taken to be a statement of Intel.
UUCP: {amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!cadev4!pinkas
ARPA: pinkas%cadev4.intel.com@relay.cs.net
CSNET: pinkas@cadev4.intel.com
gaynor@athos.rutgers.edu (Silver) (03/16/89)
"[^]" fails with an error (it would be nice if this were fixed even if just for this purpose, and "[]" for completeness), but "[\0-\255]" did the trick. The same could have been performed at extra cost with "\(\n\|.\)".
Regards, [Ag]
gaynor@rutgers.edu
piet@ruuinf (Piet van Oostrum) (03/16/89)
In article <768@pcsbst.UUCP>, jkh@pcsbst (jkh) writes:
`
`I'm using the regex code out of 18.52 and would like to construct a
`regular expression that will match *all* of the characters in a string
`of text containing newlines. I tried ".*" (the most obvious), but it
`just matches up to the first newline.
The definition of '.' in r.e's is any character except newline:
`. (Period)'
is a special character that matches any single character except a
newline. Using concatenation, we can make regular expressions like
`a.b' which matches any three-character string which begins with `a'
and ends with `b'.
So use (in Lisp syntax):
"\\(.\\|\n\\)*"
--
Piet van Oostrum, Dept of Computer Science, University of Utrecht
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
Telephone: +31-30-531806. piet@cs.ruu.nl (mcvax!hp4nl!ruuinf!piet)
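To make the point above concrete, here is a minimal sketch in Python rather than Emacs Lisp. It assumes Python's `re` module, which follows the same convention that `.` excludes newline by default, but writes grouping and alternation as `(`, `|`, `)` where the Emacs syntax uses `\(`, `\|`, `\)`. The sample text and variable names are made up for illustration.

```python
import re

text = "one\ntwo\nthree"  # hypothetical sample containing newlines

# "." never matches a newline, so ".*" stops at the end of the first line.
dot_only = re.match(r".*", text)

# Alternating "." with an explicit newline lets the match run to the end.
dot_or_newline = re.match(r"(?:.|\n)*", text)

# Python-specific shortcut (not part of the thread): DOTALL changes ".".
dot_all = re.match(r".*", text, re.DOTALL)

print(repr(dot_only.group(0)))        # 'one'
print(repr(dot_or_newline.group(0)))  # 'one\ntwo\nthree'
print(repr(dot_all.group(0)))         # 'one\ntwo\nthree'
```

The `re.DOTALL` flag is specific to Python; nothing in this thread suggests the 18.52 regex code has an equivalent switch, so the alternation is the form that carries over.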
worley@EDDIE.MIT.EDU (Dale Worley) (03/16/89)
> I'm using the regex code out of 18.52 and would like to construct a
> regular expression that will match *all* of the characters in a string
> of text containing newlines. I tried ".*" (the most obvious), but it
> just matches up to the first newline. I haven't got newlines being treated
> specially in any way (I.E. as "or"s) so I can't figure out why this is
> happening.
This happens because '.' is defined not to match newlines. (See the Info node "Regexps", or search for 'anychar' (the internal code for the '.' regexp) in regex.c and examine the code in those areas.) If you want to match newlines as well, you will have to say '\(.\|^J\)', or in C, "\\(.\\|\n\\)".
Dale
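The two spellings above differ only in string-literal escaping, which a short sketch can show. This one uses Python, on the assumption that its string literals treat `\\` and `\n` the same way the C literal does; the variable names are made up for illustration.

```python
# The C-style literal from the post, as written in source code.
c_style = "\\(.\\|\n\\)"

# The same pattern built from the characters the regex code actually sees:
# backslash-paren, dot, backslash-bar, a real newline (^J), backslash-paren.
typed = "\\(.\\|" + chr(10) + "\\)"

print(len(c_style))        # 8
print(c_style == typed)    # True
print(c_style[5] == "\n")  # True: the sixth character is a literal newline
```

Each doubled backslash in the source becomes one backslash in the pattern, and `\n` becomes the single newline character that `^J` denotes when the pattern is typed interactively.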
merlyn@intelob.intel.com (Randal L. Schwartz @ Stonehenge) (03/17/89)
In article <Mar.15.20.52.02.1989.11741@athos.rutgers.edu>, gaynor@athos (Silver) writes:
| "[^]" fails with an error (it would be nice if this were fixed even if just
| this purpose, and "[]" for completeness), but "[\0-\255]" did the trick. The
| same could have been performed at extra cost with "\(\n\|.\)".
Ahh, but "[^]]" is a valid reg-ex, and matches any single character *but* the right bracket. Similarly, "[]]" matches *just* the right bracket. So, there is *nothing* to fix. True, there is no trivial way to say "everything" and "nothing", but so what.
Now, how to match just a "["? :-)
--
Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095
on contract to BiiN (for now :-), Hillsboro, Oregon, USA.
ARPA: <@intel-iwarp.arpa:merlyn@intelob> (fastest!)
MX-Internet: <merlyn@intelob.intel.com>
UUCP: ...[!uunet]!tektronix!biin!merlyn
Standard disclaimer: I *am* my employer!
Cute quote: "Welcome to Oregon... home of the California Raisins!"
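The bracket behaviour discussed above can be tried out in a small sketch. It uses Python's `re` module as a stand-in, on the assumption that it treats a leading "]" inside brackets the same way; the sample strings and names are made up. The comment about "\255" also assumes the escape is read as octal, as it would be in a Lisp or C string.

```python
import re

sample = "a]b\n"  # hypothetical sample text

# A "]" placed first inside the brackets is taken literally.
not_bracket = re.findall(r"[^]]", sample)   # every character except "]"
only_bracket = re.findall(r"[]]", sample)   # just the "]"

# "[^]" is incomplete: the "]" is read as a member, so the set never closes.
try:
    re.compile(r"[^]")
    empty_negation_ok = True
except re.error:
    empty_negation_ok = False

# An explicit range is one way to spell "everything", newline included.
# Assumption: "\255" is octal, i.e. 173 decimal, so a full 8-bit range
# would have to end at "\377" (255 decimal) instead.
narrow = re.compile(r"[\x00-\xad]*")  # 0xad == 0o255 == 173
full = re.compile(r"[\x00-\xff]*")    # 0xff == 0o377 == 255

print(not_bracket)        # ['a', 'b', '\n']
print(only_bracket)       # [']']
print(empty_negation_ok)  # False
print(narrow.fullmatch("plain\ntext\n") is not None)  # True
print(narrow.fullmatch("caf\xe9\n") is not None)      # False
print(full.fullmatch("caf\xe9\n") is not None)        # True
```

Under that octal reading, the narrower range is enough for plain ASCII text with newlines, which would explain why it "did the trick", but it would miss characters above 173.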