[comp.os.vms] The DCL "&" operator

leichter@venus.ycc.yale.EDU ("Jerry Leichter") (11/11/87)

	>At DCL level, entering a line with only a "&" on it will instantly
	>crash your process, with some kind of signal error.
	I have no explanation for that as I can't duplicate it.

This was a DCL bug introduced in, I believe, VMS V4.4 - could have been 4.2 -
and fixed in the next release after (either 4.3 or 4.5).

	The "&" symbol is used by DCL as an "indirect substitute variable"
	directive.  It is similar in function to the "'" directive, but
	carries the substitution one step further.  It uses the contents of a
	symbol to refer to a symbol that should be operated on.

This explanation, while it correctly describes some uses of the "&" operator,
is not correct.  The basic idea is rather simple:  DCL line processing occurs
in several passes, or phases.

Phase 1:  Scan the line for "'"'s and make the substitutions specified.
	  Phase 1 scans the entire line ONCE, from left to right; after
	  each substitution, it resumes scanning at the first character
	  just substituted.

Phase 2:  This one is simple:  The first lexeme is pulled off the line.  If
	  it is a defined symbol, the value of the symbol replaces the
	  symbol.  No further substitution is done.  (There are actually
	  some complications to allow you to redefine symbols that are
	  already defined - if the second lexeme is something like "=", no
	  substitution is done.)

Phase 3:  So far, we've been doing low-level lexical processing.  The code
	  for Phases 1 and 2 knows little about DCL - just things like the
	  syntax of strings.  Phase 3 is "syntactic" processing:  It pulls
	  off the first lexeme remaining after Phase 2 completes, and looks
	  it up in its syntax tables.  It then knows what syntax to expect
	  from the rest of the line.  It now parses the rest of the line.
	  There is one special case:  If the first character of a lexeme
	  it finds during parsing is "&", the rest of the lexeme must be
	  a symbol name - NOT a lexical function!  The value of the symbol
	  replaces the lexeme and parsing continues.

	  There are all sorts of "smarts" in Phase 3.  For example, it knows
	  that some contexts - like just after an "IF" - require evaluation
	  as DCL expressions.  That's why MOST uses of "'" in IF conditionals
	  and similar contexts are unnecessary - holdovers from earlier, less
	  sophisticated versions of DCL.  (Mainly pre-V3 VMS.)

If verification is turned on, it occurs between Phases 1 and 2.  The side-
effects of lexical functions occur immediately; that's why:

	$ SET VERIFY
	$ X = F$VERIFY(0)

displays the second line - F$VERIFY isn't evaluated until Phase 3 - while

	$ SET VERIFY
	$ X = 'F$VERIFY(0)'

does not - F$VERIFY is evaluated during Phase 1.

If you follow the logic of the phases through, you can see why after:

	$ X1 = "X2"
	$ X2 = "DIRECTORY"

the command:

	$ X1

complains about X2 not being a command, while the command:

	$ 'X1'

gives you a directory listing.  Also, why a command file that does:

	$ DIR = "DIR"

can then use the DIR command and get a directory without using any options
that might have been added to DIR in the surrounding context.  (Note:  Because
of some special-case code in Phase 2, DIR = "" does the same thing.)
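
Spelled out, the traces for the examples above go roughly as follows.  The
comment lines only restate the phase descriptions given earlier; nothing in
them is new DCL behavior, and they avoid apostrophes on purpose so that Phase 1
has nothing to find inside them:

	$ X1 = "X2"
	$ X2 = "DIRECTORY"
	$ ! Phase 1: no apostrophes, nothing to do.
	$ ! Phase 2: first lexeme X1 is a symbol; X2 replaces it, and stop.
	$ ! Phase 3: X2 is looked up as a command and rejected.
	$ X1
	$ ! Phase 1: the apostrophe form of X1 becomes X2.
	$ ! Phase 2: first lexeme X2 is a symbol; DIRECTORY replaces it.
	$ ! Phase 3: DIRECTORY is looked up as a command and run.
	$ 'X1'
	$ ! Phase 2 sees = as the second lexeme and leaves DIR alone.
	$ DIR = "DIR"
	$ ! Phase 2: DIR is replaced by its value, DIR, and is not looked at
	$ ! again, so Phase 3 finds the built-in command.
	$ DIR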

So, why "&"?  The "&" operator was introduced early on to allow you to get
away from the strict left-to-right processing of Phase 1.  Essentially, it
gives you an extra pass to work with.  It was necessary before the "="
operator worked on strings (pre-V3); in those days, you had to use := for ALL
string operations - and := then, as now, does NOT evaluate its right hand
side.  Consider the command file arguments P1, P2, ... P8.  These are more
or less an array of values.  Suppose you wanted to copy the I'th element
somewhere:

	$ COPY 'P'I'' somewhere:

is the obvious try, but it fails:  Phase 1 proceeds left to right.  First it
sees 'P'.  Since P is presumably undefined, it substitutes the null string:

	$ COPY I'' somewhere:

All that's left for Phase 1 is the empty substitution for '', and we get:

	$ COPY I somewhere:

Not at all what we wanted.

You can try other combinations, but you'll soon convince yourself that there
is no way to get Phase 1 to give you the VALUE of P[I].  The problem is that
the I is to the RIGHT of the P, but must be substituted FIRST - and Phase 1
never backs up.

With "&", this is easy:

	$ COPY &P'I' somewhere:

Phase 1 (if I is 3) produces:

	$ COPY &P3 somewhere:

and DCL gives COPY the VALUE of P3, exactly as desired.
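
As a sketch of how this might look inside a command procedure, the loop below
walks through P1 to P8 and types each file named.  The loop itself, the label
names NEXT and DONE, and the choice of TYPE rather than COPY are illustrative
assumptions, not anything from the original discussion:

	$ I = 1
	$ NEXT:
	$   IF I .GT. 8 THEN GOTO DONE
	$   ! Phase 1 turns the next line into a test of P1, P2, ... in turn.
	$   IF P'I' .EQS. "" THEN GOTO DONE
	$   ! Phase 1 builds the name; Phase 3 fetches its value for TYPE.
	$   TYPE &P'I'
	$   I = I + 1
	$   GOTO NEXT
	$ DONE:

The loop stops at the first empty argument, so it assumes the arguments were
supplied without gaps.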

Using "=", this effect can be accomplished indirectly by doing:

	$ V = P'I'

since the right hand side of "=" is evaluated; so you don't need "&".  (This
simple operation - setting V to P[I]'s value - is probably impossible using
just the operators of V2 DCL.)	
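
A sketch of the same idea without "&", assuming a V3 or later DCL where "="
evaluates string expressions:  it collects the non-empty arguments of a command
procedure into one comma-separated string.  The symbol names LIST and V and
the labels NEXT and DONE are made up for the example:

	$ LIST = ""
	$ I = 1
	$ NEXT:
	$   IF I .GT. 8 THEN GOTO DONE
	$   ! Phase 1 yields V = P1, V = P2, ...; "=" then copies the value.
	$   V = P'I'
	$   IF V .NES. "" THEN LIST = LIST + V + ","
	$   I = I + 1
	$   GOTO NEXT
	$ DONE:
	$ WRITE SYS$OUTPUT LIST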

So, "&" is rarely needed these days.  In theory, it could play another role:
Phase 1 substitution is simple string substitution, so the string substituted
may end up being split up into multiple lexical units later.  "&" substitution
specifies a value for a single lexical entity - it could contain spaces or
"illegal" characters.  In practice, this doesn't always work right.

							-- Jerry
------