schmidt%crimee.ics.uci.edu@PARIS.ICS.UCI.EDU ("Douglas C. Schmidt") (01/26/89)
Hi,
G++ 1.32 seems to have a problem. Consider the following program,
which uses regular expression matching in libg++:
----------------------------------------
#include <stdio.h>
#include <String.h>

Regex rx_param ("[()\\[a-zA-Z_0-9*\\]]+", 1);

main () {
  String param = "foobar[]";
  if (param.matches (rx_param)) {
    printf ("works\n");
  }
  else {
    printf ("fails\n");
  }
}
----------------------------------------
The problem is matching a literal ']' within a character class: the
constructor for rx_param seems to treat the escaped ']' as the closing
']' of the class. When I looked at the SPARC assembler output, the
'\\' did not appear to have been reduced to '\' at all, i.e., the
output looked like:
.text
.align 0
LC0:
.ascii "[()\\[a-zA-Z_0-9*\\]]+\0"
which doesn't work correctly.
Unfortunately, using a single backslash, i.e.,
"[()\[a-zA-Z_0-9*\]]+"
DOES strip the backslash off, so that this leaves
"[()[a-zA-Z_0-9*]]+"
which is clearly not going to work (since the regular expression
scanner will mistake the first ']' for the end of the character class).
So the question remains: how does one get a single backslash before
a ']'?
Doug