[comp.unix.questions] CSH Help -- I don't get it ...

rich@eddie.MIT.EDU (Richard Caloggero) (08/06/88)

     What am I doing wrong?
I think the following script should print
'yes' three times.

-----
#!/bin/csh
set x=stuff_and_more
echo try one:
if "$x" =~ "stuff*" echo yes

echo try two:
alias x 'if "$x" =~ "stuff*" echo yes'
x

echo try three:
alias x 'if "$x" =~ "stuff"* echo yes'
x

----------


    It only prints 'yes' once -- on try three.
I'm sure it has something to do with my
poor understanding of csh's various quoting conventions.


-- 
						-- Rich (rich@eddie.mit.edu).
	The circle is open, but unbroken.
	Merry meet, merry part,
	and merry meet again.

greim@sbsvax.UUCP (Michael Greim) (08/08/88)

In article <9832@eddie.MIT.EDU>, rich@eddie.MIT.EDU (Richard Caloggero) writes:
< 
< 
<      What am I doing wrong?
< I think the following script should print
< 'yes' three times.
< 
< -----
< #!/bin/csh
< set x=stuff_and_more
< echo try one:
< if "$x" =~ "stuff*" echo yes
< 
< echo try two:
< alias x 'if "$x" =~ "stuff*" echo yes'
< x
< 
< echo try three:
< alias x 'if "$x" =~ "stuff"* echo yes'
< x
< 
< ----------
< 
< 
<     It only prints 'yes' once -- on try three.
< I'm sure it has something to do with my
< poor understanding of csh's various quoting conventions.
< 
It's a bug in csh or at least an undocumented strangeness.
There was some discussion a few months ago about
how to fix it, or indeed whether we should fix it at all. Rob McMahon (hello!)
and I had a lot of arguments about how to fix it, and we came up with
a solution, well, sort of. I wanted to write a short text on pattern matching,
but as I am normally rather lazy, it has sat on my shelf until now.
I promise I will look into it again.
If you have source code, you are lucky; if not, you will have to contact
someone who has it to get the modified version.

(the following may lack accuracy due to a memory fault of the author's)

In your first example the string resulting from expanding "$x" has
no 8 bit turned on, but the string "stuff*" has. Thus "*" is not recognized
as a shell meta character. A fix I made some months ago and posted to the
net took care of this bug, I think. It was listed as <486@sbsvax.UUCP>
from 25.apr.88. But there were still some flaws in pattern matching
which it might even be impossible to straighten out.

BTW:
Did you ever wonder why 'if "$x" =~ "stuff"*' does not try to do
any filename substitution? The big question is : should it ?

	-mg
-- 
+------------------------------------------------------------------------------+
| UUCP:  ...!uunet!unido!sbsvax!greim   | Michael T. Greim                     |
|        or greim@sbsvax.UUCP           | Universitaet des Saarlandes          |
| CSNET: greim%sbsvax.uucp@Germany.CSnet| FB 10 - Informatik (Dept. of CS)     |
| ARPA:  greim%sbsvax.uucp@uunet.UU.NET | Bau 36, Im Stadtwald 15              |
| Phone: +49 681 302 2434               | D-6600 Saarbruecken 11, West Germany |
+------------------------------------------------------------------------------+
| # include <disclaimers/std.h>                                                |
+------------------------------------------------------------------------------+

pdc@otter.hple.hp.com (Damian Cugley) (08/10/88)

/ comp.unix.questions / greim@sbsvax.UUCP (Michael Greim) /  Aug  8, 1988 /

> < if "$x" =~ "stuff*" echo yes
> < alias x 'if "$x" =~ "stuff*" echo yes'
> < x
> < alias x 'if "$x" =~ "stuff"* echo yes'
> < x
> <     It only prints 'yes' once -- on try three.
> It's a bug in csh or at least an undocumented strangeness.

It may not be a bug. It *is* consistent with the rest of csh.

"..."  quoting prevents pattern matching when expanding filenames in
commands, so it makes sense for the same to happen when using patterns
with =~ or !~.  

In the first two of his 'if' examples, the * was quoted, i.e.  it was no
longer 'magical'.  In the last one the * is magic and so pattern
matching works.  The quotes around the $x and the use of 'alias' are all
red herrings.
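
A minimal sketch of the same rule, written for the Bourne shell (sh) rather than csh so that it can be run without a csh at hand. It is an analogy only: sh's `case` stands in for csh's `=~`, and the function names `quoted_star` and `unquoted_star` are made up for the illustration.

```shell
#!/bin/sh
# Analogy only: sh's `case` stands in for csh's `=~`.
x=stuff_and_more

# Hypothetical helper: the whole pattern, * included, sits inside quotes.
quoted_star() {
    case "$x" in
        "stuff*") echo yes ;;   # quoted *, so only the literal string stuff* matches
        *)        echo no ;;
    esac
}

# Hypothetical helper: only the fixed part is quoted, the * is left bare.
unquoted_star() {
    case "$x" in
        "stuff"*) echo yes ;;   # bare *, so it acts as a wildcard
        *)        echo no ;;
    esac
}

echo "quoted:   $(quoted_star)"
echo "unquoted: $(unquoted_star)"
```

The first function answers no and the second answers yes, which mirrors tries one and three of the original script: the only thing that changes is whether the * falls inside the quotes.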

> In your first example the string resulting from expanding "$x" has
> no 8 bit turned on, but the string "stuff*" has. Thus "*" is not recognized
> as a shell meta character.

I don't think this makes much sense to me - I wonder what the person
asking the question made of it :-( .  The * isn't recognised as being
magical because it is quoted (i.e.  in "...").  Exactly how csh keeps
track of what's quoted and what isn't is beside the point.

>                             A fix I made some months ago and posted to the
> net took care of this bug, I think. It was listed as <486@sbsvax.UUCP>
> from 25.apr.88. But there were still some flaws in pattern matching
> which it might even be impossible to straighten out.

I don't *think* it's a bug (I don't claim to be an expert).  In the man
page for csh it says "..."  prevents filename expansion, i.e. makes *
unmagical.  It should've said it made it unmagic in =~ patterns too, but
that's a reasonable deduction.  

Why would it be useful for pattern-matching to work even when quoted?
How then would we turn it *off*!?  If there's no bug, 'fixing' it will
only cause more problems (i.e. introduce more bugs...).

> BTW:
> Did you ever wonder why 'if "$x" =~ "stuff"*' does not try to do
> any filename substitution? The big question is : should it ?

No, I didn't wonder.  No, it shouldn't.  'If's are special csh keywords,
and the test part isn't a command; hence there's no reason whatsoever
why it should try to expand patterns into filenames in the test part of
an 'if'.

--
Damian Cugley			'His feet are the wrong size for his shoes.'

greim@sbsvax.UUCP (Michael Greim) (08/12/88)

Ok, Damian, my posting was unclear.
Thanks for pointing it out. (I still have much to learn :-)

In article <1170003@otter.hple.hp.com>, pdc@otter.hple.hp.com (Damian Cugley) writes:
> / comp.unix.questions / greim@sbsvax.UUCP (Michael Greim) /  Aug  8, 1988 /
> 
> > < if "$x" =~ "stuff*" echo yes
> > < alias x 'if "$x" =~ "stuff*" echo yes'
> > < x
> > < alias x 'if "$x" =~ "stuff"* echo yes'
> > < x
> > <     It only prints 'yes' once -- on try three.
> > It's a bug in csh or at least an undocumented strangeness.
> 
> It may not be a bug. It *is* consistent with the rest of csh.
Yes, this may not be a bug. Let's agree that it is not a bug.
     ^^^^
(So nobody needs a fix, nobody has to work one out, and
all versions of csh behave the same :-)
> 
> "..."  quoting prevents pattern matching when expanding filenames in
> commands, so it makes sense for the same to happen when using patterns
> with =~ or !~.  
Ok.
> 
> > In your first example the string resulting from expanding "$x" has
> > no 8 bit turned on, but the string "stuff*" has. Thus "*" is not recognized
> > as a shell meta character.
> 
> I don't think this makes much sense to me - I wonder what the person
> asking the question made of it :-( .  The * isn't recognised as being
> magical because it is quoted (i.e.  in "...").  Exactly how csh keeps
> track of what's quoted and what isn't is beside the point.
Sorry.
> 
> >                             A fix I made some months ago and posted to the
> > net took care of this bug, I think. It was listed as <486@sbsvax.UUCP>
> > from 25.apr.88. But there were still some flaws in pattern matching
> > which it might even be impossible to straighten out.
> 
> I don't *think* it's a bug (I don't claim to be an expert).  In the man
> page for csh it says "..."  prevents filename expansion, i.e. makes *
> unmagical.  It should've said it made it unmagic in =~ patterns too, but
> that's a reasonable deduction.  
> 
> Why would it be useful for pattern-matching to work even when quoted?
> How then would we turn it *off*!?  If there's no bug, 'fixing' it will
> only cause more problems (i.e. introduce more bugs...).
I have made a modification to pattern matching, where
a pattern is command- and filename- expanded like any other operand.
"abc*" is a pattern matching abc, abcd, ... but "abc\*" only matches
the string "abc*". Ok, it introduces more problems and its usefulness
is at least doubtful. That's one of the reasons why I have not posted it.
(And I could not bring myself to write a small text on the differences.)

The bug fix I mentioned above was for another but similar problem
(my, am I confused lately :-), viz.:
	- create files "a,c" and "abc"
	- try to list these two files using "{" and "}" filename expansion
		metacharacters as in the example below
Script started on Fri Aug 12 10:48:16 1988
% ls
a,c		abc		typescript
% ls a{b,,}c
ac not found
ac not found
abc
% ls a{b,\,}c
ac not found
ac not found
abc
% ls a{b,","}c
ac not found
ac not found
abc
% ls a{b,','}c
ac not found
ac not found
abc
% 
And so on. The above cited fix was for this problem. Afterwards
% ls a{b,\,}c
produced the correct output :
a,c	abc
The same applies to metacharacters "[", "-" and "]".
This problem concerns pattern matching too, but it is beside the point
of this discussion.
> 
> > BTW:
> > Did you ever wonder why 'if "$x" =~ "stuff"*' does not try to do
> > any filename substitution? The big question is : should it ?
> 
> No, I didn't wonder.  No, it shouldn't.  'If's are special csh keywords,
> and the test part isn't a command; hence there's no reason whatsoever
> why it should try to expand patterns into filenames in the test part of
> an 'if'.
But it expands other operands. Try :
#! /bin/csh
if ( Gargle* =~ "Gargle"*) then
	echo "1 : ok."
else
	echo "1 : false."
endif
and you will see that csh filename-expands the left operand of "=~".

BTW:
And why is the pattern not command substituted? Does the manual say
so? Should it not be possible to
	if ( "$a" =~ *`hostname`*) ...
whereas
	if ( "$a" == `hostname` ) ...
works ??
(You may not have noticed this, as it produces no diagnostic message.)
Ok, you can circumvent it, but isn't it a bug? Or would it not be a bug
if it had been documented?
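
One way to circumvent it can be sketched as follows, again in Bourne shell (sh) rather than csh so that it runs anywhere: run the command first, keep its output in a variable, and build the pattern from the variable. This is an assumption-laden illustration, not the csh fix itself. `uname -n` stands in for `hostname`, and the names `a`, `h` and `contains_host` and the sample string are invented. In csh the first step would presumably be a `set` with backquotes ahead of the `if`.

```shell
#!/bin/sh
# Sketch in sh, not csh. uname -n stands in for hostname.
h=$(uname -n)                 # step 1: run the command, keep its output
a="logged in on $h today"     # invented sample string to test against

# Hypothetical helper. Step 2: quote the variable so its contents are
# taken literally, and leave the surrounding * characters bare.
contains_host() {
    case "$1" in
        *"$h"*) echo match ;;
        *)      echo "no match" ;;
    esac
}

contains_host "$a"
```

Quoting `$h` keeps any odd characters in the host name from being read as pattern characters, while the bare * on either side still match anything.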

The manual should be extended to say some lines about patterns in
csh expressions.
One can adapt to a lot of strange rules if they are documented.

	-mg

pdc@otter.hple.hp.com (Damian Cugley) (08/16/88)

/ comp.unix.questions / greim@sbsvax.UUCP (Michael Greim) /   Aug 12, 1988 /
Quickly dispose of the basenote topic before getting on with the drift :-)

> > It may not be a bug. 
> Yes, this may not be a bug.
>      ^^^^
> (So nobody needs a fix, nobody has to work one out, and
> all versions of csh behave the same :-)
Agree!  Agree!  :-D

> The bug fix I mentioned above was for another but similar problem
> [...See Michael's posting for details!...]
> The above cited fix was for this problem. Afterwards
> % ls a{b,\,}c  produced the correct output :
> a,c     abc

I would also have expected '\,' to work, using the same logic as before.


> [Csh] expands other operands. Try :
> if ( Gargle* =~ "Gargle"*) then
> [...]
> and you will see, that csh filename expands the left operand of "=~".

By gum, so it does.  Wild.  (I guess the csh designers would reply that, as the
left operand of =~ isn't supposed to be a pattern, it can only be expected
to act weirdly if it is...  I dunno what I'd've done, maybe an error
message ('Pattern on left of =~'), or maybe treat the '*' as unmagical?)

I would suspect the csh programmers never considered that these situations
would occur often, and didn't bother making csh react to them sensibly...


> BTW:
> And why is the pattern not command substituted? Does the manual say
> so? Should it not be possible to
>         if ( "$a" =~ *`hostname`*) ...
> whereas
>         if ( "$a" == `hostname` ) ...
> works ??

Ummm...  Errmmm...  well, the csh man page says command substitution comes
before filename expansion, but not where =~ patterns come in (it should do,
though).  I guess the designers of csh thought something like

	if ($wombat =~ `cmd file.*`)

would be more useful more often than having patterns outside the `...`.

(As a user of csh (as opposed to a programmer of csh) I'd've expected the
filenames in `...` to be expanded, *then* the command executed, *then* the
patterns involving the output of the command.  This would make both of
these work - but would need two separate stages of pattern-checking (or
three?) and would be even more complicated.)


> The manual should be extended to say some lines about patterns in
> csh expressions.
> One can adapt to a lot of strange rules if they are documented.
> 
>         -mg

I agree 1000%.  Just having a summary of the order of the various stages
would help in these sorts of questions.

The problem is that the csh 'page' is already too long for its present format - I
had a terrible time ploughing through it for the first time.  It's a big
enough topic to warrant its own mini-glossary, ToC and/or index.  (Or maybe
hypertext, like the Emacs info pages?  Then each section could have a
summary of how X-substitution interacts with Y-substitution without the
manual seeming repetitive.)

Anyone like to volunteer to re-write the entire manual?  Just a thought :-) .

pdc
--
/--Damian Cugley---\/---------------------------------------\/----------------\
|  St Edmund Hall  ||  UKNet: <pdc@uk.ac.oxford.prg.pampa>  || 'His feet are  |
|  OXFORD          ||  Other-  pdc@hplb.lb.hp.co.uk  (Until || the wrong size |
|  U.K.  OX1 4AR   ||  net:(?)/---------------------\  Oct) || for his shoes.'|
\------------------/\--------/ #disclaim <net/std.h> \------/\----------------/