[comp.lang.c] lvalues

dmr@alice.UUCP (01/16/87)

The question of lvalues has arisen recently both in the context
of the silly C book review, and also in relation to the operand
of ++.  The term is old (it comes from BCPL or earlier) and
just denotes the things that can appear on the left (`l') of
an assignment.

The White Book and the current ANSI draft both waffle about whether
the term is formal or descriptive; they introduce it by, respectively,
"an expression referring to an object [which is a] manipulatable
region of storage;" and "an expression that designates an object."

It might cause less confusion if the definition were explicitly syntactic,
and only certain lvalues were permitted by the semantics to be
assigned to.  In this scheme, an lvalue (eliding precedence) is defined
as one of

	identifier
	( lvalue )
	lvalue . identifier
	* expression

Also, by applying equivalence rules,

	expression[expression]     =>    *(expression + expression)
	expression->identifier     =>    (*expression).identifier

Only some lvalues can appear on the left of `=' (e.g.: not array,
not const, not function).  Even more restrictions apply to operands
of `&': not register, not bit-field.
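
For illustration, a minimal C sketch of the scheme above.  The struct,
variable, and function names are hypothetical and not from the post; the
commented-out lines are cases a conforming compiler is expected to reject.

```c
#include <assert.h>

struct point { int x; int y; };

/* Assign through each syntactic form of lvalue from the list above.
   Returns the sum of the stored values: 1 + 2 + 3 + 4 + 5 = 15. */
int lvalue_forms(void)
{
    int n = 0;
    struct point s, *ps = &s;
    int a[2];
    int *p = &a[0];
    const int k = 0;

    (n) = 1;          /* ( lvalue ), wrapping an identifier          */
    s.x = 2;          /* lvalue . identifier                         */
    *p = 3;           /* * expression                                */
    a[1] = 4;         /* expression[expression], i.e. *(a + 1)       */
    ps->y = 5;        /* expression->identifier, i.e. (*ps).y        */

    /* a = p;            an array is an lvalue, but not assignable   */
    /* k = 1;            a const object is an lvalue, not assignable */
    /* lvalue_forms = 0; a function cannot be assigned to            */

    return n + s.x + a[0] + a[1] + s.y + k;
}

/* A register variable can be assigned to, but not used with `&'. */
int register_restriction(void)
{
    register int r = 7;

    r = r + 1;        /* fine: r is a modifiable lvalue              */
    /* int *q = &r;      rejected: address of a register variable    */
    return r;
}
```

Calling lvalue_forms() should give 15 and register_restriction() should
give 8.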

This suggestion doesn't change the language, but it makes it
clearer what an lvalue actually is.  Both the old and new
reference manuals make it hard for the reader to enumerate the possible
lvalues.

		Dennis Ritchie
		research!dmr

firth@sei.cmu.edu (Robert Firth) (01/19/87)

In article <6539@alice.UUCP> dmr@alice.UUCP writes:
>The question of lvalues has arisen recently both in the context
>of the silly C book review, and also in relation to the operand
>of ++.  The term is old (it comes from BCPL or earlier) and
>just denotes the things that can appear on the left (`l') of
>an assignment.
>

The current BCPL Standard does not use the terms Lvalue and
Rvalue.  It talks of "lmode evaluation" and "rmode evaluation".
The latter mode yields a value; the former mode yields a
value with the special property that it represents the address
of a "cell", ie a storage location able to hold a BCPL value.

L-expressions occur in two contexts: as the LHS of an assignment
and after the address construction operator ('@' in BCPL, '&'
in C).

>The White Book and the current ANSI draft both waffle about whether
>the term is formal or descriptive; they introduce it by, respectively,
>"an expression referring to an object [which is a] manipulatable
>region of storage;" and "an expression that designates an object."

The concept is part of the understanding of the execution semantics
of the language; in particular, the semantics of the abstract store.
Hence, it seems better to explain it in semantic, ie descriptive terms.

>It might cause less confusion if the definition were explicitly syntactic,
>and only certain lvalues were permitted by the semantics to be
>assigned to.  In this scheme, an lvalue (eliding precedence) is defined
>as one of
>
>	identifier
>	( lvalue )
>	lvalue . identifier
>	* expression
>
>Also, by applying equivalence rules,
>
>	expression[expression]     =>    *(expression + expression)
>	expression->identifier     =>    (*expression).identifier
>

I would find this most unhelpful: it tells you nothing about what an
Lvalue actually IS!  Syntactic sugar is empty calories - understanding
a concept comes first; the rules for writing things down come later.

>Only some lvalues can appear on the left of `=' (e.g.: not array,
>not const, not function).  Even more restrictions apply to operands
>of `&': not register, not bit-field.
>

Rather, the C language has evolved to a degree of complexity where
concepts such as lvalues are no longer fully appropriate.  Assignment
to a bit-field cannot be explained in terms of lvalue and rvalue,
but only in terms of extractor/updater functions.  The expression
designating a bit-field cannot be evaluated to yield a context-free
representable "value" that can be passed around, stored &c.  Hence there
is no lvalue of a bit-field.

>This suggestion doesn't change the language, but it makes it
>clearer what an lvalue actually is.  Both the old and new
>reference manuals make it hard for the reader to enumerate the possible
>lvalues.
>
>		Dennis Ritchie
>		research!dmr

Robert Firth

brian@merlyn.UUCP (Merlyn Leroy) (01/20/87)

>..Assignment
>to a bit-field cannot be explained in terms of lvalue and rvalue,
>but only in terms of extractor/updater functions.  The expression
>designating a bit-field cannot be evaluated to yield a context-free
>representable "value" that can be passed around, stored &c.  Hence there
>is no lvalue of a bit-field.

This is true of most machines; however, the new TI graphics chip is
bit-addressable, and pointers to bitfields are perfectly legal.
(This is of course a non-portable extension).  This is also a good
argument for little-endianism.

-- 
Merlyn Leroy	(...ihnp4!umn-cs!rosevax!rose3!merlyn!brian)

guy%gorodish@Sun.COM (Guy Harris) (01/21/87)

> >The expression designating a bit-field cannot be evaluated to yield a
> >context-free representable "value" that can be passed around, stored &c.
> >Hence there is no lvalue of a bit-field.
> 
> This is true of most machines;

This is neither true nor false of any machine.  Machines aren't being
discussed here; C compilers are.  You can have a C compiler that doesn't
support bit-field pointers on a machine that supports bit addressing; you
can have a C compiler that supports bit-field pointers on a machine that
doesn't (this would, of course, be an extended version of C, and you might
have to be careful not to make these extensions in such a way as to render
the implementation nonconforming).

There is no law that requires that implementations on machines that support
bit addressing support bit-field pointers, nor that implementations on
machines that don't support bit addressing not support bit-field pointers.
(One might just as well argue that you can't have "char *" on a
word-addressable machine.)  Consider all the PL/I implementations on
machines not supporting bit addressing.

throopw@dg_rtp.UUCP (01/24/87)

> brian@merlyn.UUCP (Merlyn Leroy)
>> ???

>>..Assignment
>>to a bit-field cannot be explained in terms of lvalue and rvalue,
>>but only in terms of extractor/updater functions.

Why not?  Granted, to do so gives up the principle that lvalues can
always be converted to addresses via '&', but this is already not the
case for other reasons.  Further, assignment to bit-fields IS NOW
defined in terms of lvalue/rvalue in K&R, H&S, and the current X3J11
draft.  Again... why not?

>>  The expression
>>designating a bit-field cannot be evaluated to yield a context-free
>>representable "value" that can be passed around, stored &c.  Hence there
>>is no lvalue of a bit-field.

This is a non-sequitur.  The statement before the "hence" is incorrect,
and even if correct would not imply the statement after the "hence"
(which is also incorrect).  Bit-field expressions evaluate quite
successfully to "values" when they are evaluated where lvalues are not
permitted (to use the X3J11 terminology), and evaluate equally
successfully to "lvalues" when they are evaluated where lvalues ARE
permitted (such as to the left of an assignment operator).  As noted
above, lvalues are permitted as the operand of unary '&', while
bit-fields are not.  But this has nothing to do with the assignment
operators.
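
For illustration, a minimal C sketch of a bit-field used on both sides of
an assignment.  The struct and function names are hypothetical, not taken
from any of the posts; the commented-out line is the one case a conforming
compiler is expected to reject.

```c
#include <assert.h>

struct status {
    unsigned mode  : 3;    /* holds 0..7 */
    unsigned ready : 1;    /* holds 0..1 */
};

/* Use bit-fields where lvalues are required and where they are not. */
unsigned bitfield_as_lvalue(void)
{
    struct status st = { 0, 0 };
    unsigned copy;

    st.mode = 5;           /* bit-field to the left of `='           */
    st.mode += 1;          /* compound assignment: mode is now 6     */
    st.mode++;             /* increment: mode is now 7               */
    st.ready = 1;

    copy = st.mode;        /* bit-field evaluated for its value      */

    /* unsigned *p = &st.mode;   rejected: `&' applied to bit-field  */

    return copy + st.ready;   /* 7 + 1 */
}
```

Calling bitfield_as_lvalue() should give 8.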

> This [the above non-sequitur]
> is true of most machines; however, the new TI graphics chip is
> bit-addressable, and pointers to bitfields are perfectly legal.
> (This is of course a non-portable extension).  This is also a good
> argument for little-endianism.

Well, first, the "this" from above is not true of any machines.
Further, the advantage of little-endianism referred to here has little
to do with bit granular addressing, but rather to do with certain kinds
of type punning.  Big-endianism, of course, has a different set of
advantages in type punning.  I am speaking as one who has programmed a
marvelous big-endian machine with bit-granular addressing, and spent
considerable time thinking about what would be better and what would be
worse if the machine were little-endian, so you can take my word for it.
:-)

--
All prediction is difficult, especially about the future.
                                --- Niels Bohr
-- 
Wayne Throop      <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

chris@mimsy.UUCP (Chris Torek) (01/18/88)

In some article somewhere, someone writes:
>>I believe that Standard C does not allow type casting on the left of the 
>>= sign.

In article <5080012@hpfcdc.HP.COM> boemker@hpfcdc.HP.COM (Tim Boemker) writes:
>How about this:
>
>int i; char *c;
>c = (char *) &i;
>* (int *) c = 0;

It is not portable, but it is legal.

The result of a cast is an `rvalue' (R for Right: or in other words,
the kind of thing one finds on the right hand side of an assignment).
Assignments may be made only to `lvalues' (L for Left: the kind of
thing found on the left hand side of an assignment).  The C indirection
operator `*' takes an rvalue and yields an lvalue, hence the result
of `* (int *) c' is an lvalue.
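
For illustration, a minimal C sketch of the example under discussion,
wrapped in a function so it can be run.  The function name is
hypothetical; the commented-out line shows the cast alone used as an
assignment target, which a conforming compiler is expected to reject.

```c
#include <assert.h>

/* Store through a pointer that has been cast back to its real type. */
int store_through_cast(void)
{
    int i = 5;
    char *c;

    c = (char *) &i;       /* the cast yields a value, not an lvalue */

    /* (int *) c = 0;         rejected: target is a cast result      */

    * (int *) c = 0;       /* `*' applied to that value designates i */
    return i;
}
```

Calling store_through_cast() should give 0, since the store goes to i
itself.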
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

jfh@killer.UUCP (The Beach Bum) (01/18/88)

In article <10227@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In some article somewhere, someone writes:
> >>I believe that Standard C does not allow type casting on the left of the 
> >>= sign.
> 
> In article <5080012@hpfcdc.HP.COM> boemker@hpfcdc.HP.COM (Tim Boemker) writes:
> >How about this:
> >
> >int i; char *c;
> >c = (char *) &i;
> >* (int *) c = 0;
> 
> It is not portable, but it is legal.
 
I believe it used to be portable.  Now that X3J11 has mangled the language,
it isn't guaranteed to be portable.  If I recall correctly, a pointer to
any object can _portably_ be converted to a pointer to a smaller object and
back again.  Thus,

	int i; char * cp;
	cp = (char *) &i;
	* (int *) cp = 0;

is portable, but

	int *ip; char c;
	ip = (int *) &c;
	* (char *) ip = '0';

is not.

> The result of a cast is an `rvalue' (R for Right: or in other words,
> the kind of thing one finds on the right hand side of an assignment).
> Assignments may be made only to `lvalues' (L for Left: the kind of
> thing found on the left hand side of an assignment).  The C indirection
> operator `*' takes an rvalue and yields an lvalue, hence the result
> of `* (int *) c' is an lvalue.
> -- 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)

Chris, that's some good voodoo.  A cast takes an object of whatever
type it happens to be, and whatever state of (l|r)value-ness and changes
its type, but not its (l|r)value-ness.  The difference between lvalue
expressions and rvalue expressions is more obvious in a language such
as BLISS with all those silly dots. . . . . .

The only restriction on what can be on the left hand side of an `=' is
having an address and a type, or being finaglable into having both.  I
can cast an obvious rvalue like `2' into an int * with ((int *) 2).
This thing can now appear on either side -

	*((short *) 2) = *((short *) 4);

- John.
-- 
John F. Haugh II                  SNAIL:  HECI Exploration Co. Inc.
UUCP: ...!ihnp4!killer!jfh                11910 Greenville Ave, Suite 600
"Don't Have an Oil Well? ...              Dallas, TX. 75243
 ... Then Buy One!"                       (214) 231-0993 Ext 260

chris@mimsy.UUCP (Chris Torek) (01/19/88)

>In article <10227@mimsy.UUCP> I claimed that
>>>int i; char *c;
>>>c = (char *) &i;
>>>* (int *) c = 0;
>is not portable, but it is legal.

In article <2975@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
>I believe it used to be portable.

It used to be `portable by default': that is, there was no portable
way to do it at all, but that came closest.

>Now that X3J11 has mangled the language, it isn't guaranteed to be
>portable.

Doug Gwyn mentioned that the current draft makes it portable again.
(My interpretation of his remark; beware.)

>>The result of a cast is an `rvalue' ....
>>Assignments may be made only to `lvalues' .... The C indirection
>>operator `*' takes an rvalue and yields an lvalue ....

>Chris, that's some good voodoo.

?

>A cast takes an object of whatever type it happens to be, and whatever
>state of (l|r)value-ness and changes its type,

Good so far.  Note that any lvalue can be converted to an rvalue.
Note also that the cast may change its bit-pattern and/or shape (e.g.,
casting short to double).

>but not its (l|r)value-ness.

Whoa!  Stop!  No!  The result of a cast is effectively that of
assigning to an unnamed temporary variable with the type given
in the cast.  This is an rvalue.
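
For illustration, a minimal C sketch of the "unnamed temporary" reading of
a cast.  The function and variable names are hypothetical; tmp spells out
by hand what the cast does implicitly, and the commented-out lines are
ones a conforming compiler is expected to reject.

```c
#include <assert.h>

/* Returns 1 if the cast and the explicit temporary agree and the
   operand of the cast is left untouched, 0 otherwise. */
int cast_as_temporary(void)
{
    short s = 3;
    double via_cast;
    double tmp;

    via_cast = (double) s; /* value of s, converted to double        */
    tmp = s;               /* the same conversion, with a named temp */

    /* (double) s = 1.0;      rejected: cast result is not an lvalue */
    /* &(double) s;           rejected for the same reason           */

    return via_cast == tmp && via_cast == 3.0 && s == 3;
}
```

Calling cast_as_temporary() should give 1.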

>The difference between lvalue expressions and rvalue expressions is
>more obvious in a language such as BLISS with all those silly dots. . . .

No: If I understand correctly, in BLISS, every expression can be
viewed as both an lvalue *and* an rvalue.  This is untrue in C.

>The only restriction on what can be on the left hand side of an `=' is
>having an address and a type, or being finaglable into having both.

No: it must be an lvalue.  Naturally, all lvalues have a type; most
even have addresses, although some (register) might not.  But this
is not the abstraction defined by the C language. [**]

>I can cast an obvious rvalue like `2' into an int * with ((int *) 2).

Yet it is still an rvalue.

>This thing can now appear on either side -
>
>	*((short *) 2) = *((short *) 4);

(There are no `int *'s above.)  Once again, `*' (indirection) takes
an rvalue and produces an lvalue.  *(short *)2, if it means anything,
means an lvalue at 2 (whatever that is).  If `2' happens to be a
valid address for a variable of type `short', this goes; if not,
you get a segmentation fault or other weird error.

If I had my K&R here, I would check, but I believe the only operators
that convert rvalues to lvalues are unary `*' (indirect, not
multiply) and `->' (pointer-to-structure-member).  All others result
in rvalues; some, such as unary `&' (address of) work only on
lvalues.

Mathematically speaking, unary `&' and `*' are inverse functions.
Unary `&' takes an lvalue of type `T' and produces an rvalue of
type `pointer to T'; unary `*' takes an rvalue of type `pointer to
T' and produces an lvalue of type `T'.  The <value, type> pairs
from unary `&' must be distinct for distinct lvalues [*], and there
may be only one value produced for any particular lvalue.  This
makes it an invertible function; `*' is then defined as the inverse
function.  While `*' is a well-behaved function on conventional
architectures, all that the language requires is that it be the
inverse of `&'.
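
For illustration, a minimal C sketch of `&' and `*' undoing each other.
The function and variable names are hypothetical, not from the post.

```c
#include <assert.h>

/* Returns 1 if `&' and `*' behave as inverses for one int object. */
int address_and_indirection(void)
{
    int x = 10;
    int *p = &x;           /* & : lvalue of type int -> value int *  */

    *p = 20;               /* * : value int * -> lvalue of type int  */
    *&x = *&x + 1;         /* *&x designates x itself: x is now 21   */

    return (&*p == p) && (*&x == x) && (x == 21);
}
```

Calling address_and_indirection() should give 1.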

-----
[*] Actually, <v,t> pairs need not be distinct for `dead' lvalues,
e.g., `int *i1,*i2; { int j1; i1 = &j1; } { int j2; i2 = &j2; }'.
The language states that using the <v,t> pair of a dead lvalue is
an error.  Whether certain different-looking lvalues are in fact
the same on segmented architectures with built-in aliasing (80x86)
is open to debate, but there is no legal way to obtain two different
<segment,offset> pairs that yield the same address.  That is, it
is easy to do, but it is not part of the C language itself.

[**] The strongest statements the C language makes about addresses
are: they exist; they have values of type `pointer to T'; some form
of arithmetic, with the two operations A + I => A and A - A => I,
works on them; the legal range for the integer value I is defined
by the declaration (array) or allocation (via malloc) of the object;
adding a legal integer I to an address A yields an address A1 such
that A1 - A = I; successively adding 1 to an address yields successive
array elements.  (I think that covers it: the rest can be derived.)
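
For illustration, a minimal C sketch of the address arithmetic listed in
[**].  The function and variable names are hypothetical, not from the
post.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the integer I recovered from A1 - A, or -1 if stepping by 1
   does not visit the array elements in order. */
ptrdiff_t address_arithmetic(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int *base = a;
    int *a1 = base + 3;          /* A + I => A                       */
    ptrdiff_t i = a1 - base;     /* A - A => I, so A1 - A = I        */
    int *q = base;
    int k;

    for (k = 0; k < 4; k++) {    /* adding 1 yields the next element */
        if (*q != 10 * (k + 1))
            return -1;
        q = q + 1;               /* ends one past the array: allowed */
    }
    return i;
}
```

Calling address_arithmetic() should give 3.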
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/19/88)

In article <2975@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
>In article <10227@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
>> The result of a cast is an `rvalue' (R for Right: or in other words,
>> the kind of thing one finds on the right hand side of an assignment).
>> Assignments may be made only to `lvalues' (L for Left: the kind of
>> thing found on the left hand side of an assignment).  The C indirection
>> operator `*' takes an rvalue and yields an lvalue, hence the result
>> of `* (int *) c' is an lvalue.
>Chris, that's some good voodoo.  A cast takes an object of whatever
>type it happens to be, and whatever state of (l|r)value-ness and changes
>its type, but not its (l|r)value-ness.

If you don't know what you're talking about, you should shut up!

Chris is right (as usual) -- a cast produces a non-lvalue (which you
may call an `rvalue' if you wish), and it is only the result of
applying a * operator that is an lvalue as required for the target
of the assignment.

A cast performs a simultaneous conversion and virtual assignment of
its operand into an unnamed temporary register, in effect.  There
is no way that the result can be conceived of as being addressable.

>The only restriction on what can be on the left hand side of an `=' is
>having an address and a type, or being finaglable into having both.

Wrong -- it must be a modifiable lvalue.

jfh@killer.UUCP (John Haugh) (01/20/88)

In article <10236@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> >In article <10227@mimsy.UUCP> I claimed that
> >>>int i; char *c;
> >>>c = (char *) &i;
> >>>* (int *) c = 0;
> >is not portable, but it is legal.
> 
> In article <2975@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
> >I believe it used to be portable.
> 
> It used to be `portable by default': that is, there was no portable
> way to do it at all, but that came closest.
 
Sorry Chris, here's the reference I was referring to in my `I believe'
statement:

	C Language Reference Manual, D. M. Ritchie, 1978

	Section 14.4, paragraph 4

	"A pointer to one type may be converted to a pointer to another
	type.  The resulting pointer may cause addressing exceptions
	if the subject pointer does not refer to an object suitably
	aligned in storage.  It is guaranteed that a pointer to an
	object of a given size may be converted to a pointer to an
	object of a smaller size and back again without change."

	[ reprinted without permission ]

Sounds like the closest statement to being portable I can come up
with.

> >Now that X3J11 has mangled the language, it isn't guaranteed to be
> >portable.
> 
> Doug Gwyn mentioned that the current draft makes it portable again.
> (My interpretation of his remark; beware.)

Except that the X3J11 (how close is that to XJ6 or XJS ;-) committee
changed it from `any smaller object' to `may be cast to and from a
void object' (ie, replace (char *) with (void *) ).  This is what I've
read.  I believe the ability to cast to/from a short has been removed.
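
For illustration, a minimal C sketch of the round trips being discussed:
an int pointer taken through `char *' and through `void *' and back.  It
assumes an ANSI-conforming compiler and does not settle what the draft
says about other intermediate types such as `short *'.  The function and
variable names are hypothetical.

```c
#include <assert.h>

/* Returns 1 if both round trips give back the original pointer. */
int pointer_round_trip(void)
{
    int i = 42;
    char *cp = (char *) &i;      /* int * -> char *                  */
    void *vp = &i;               /* int * -> void *, no cast needed  */
    int *from_char = (int *) cp; /* char * -> int *                  */
    int *from_void = vp;         /* void * -> int *                  */

    return from_char == &i && from_void == &i && *from_void == 42;
}
```

Calling pointer_round_trip() should give 1.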

> >>The result of a cast is an `rvalue' ....
> >>Assignments may be made only to `lvalues' .... The C indirection
> >>operator `*' takes an rvalue and yields an lvalue ....
> 
> >Chris, that's some good voodoo.
> 
> ?
> 
> >A cast takes an object of whatever type it happens to be, and whatever
> >state of (l|r)value-ness and changes its type,
> 
> Good so far.  Note that any lvalue can be converted to an rvalue.
> Note also that the cast may change its bit-pattern and/or shape (e.g.,
> casting short to double).
> 
> >but not its (l|r)value-ness.
> 
> Whoa!  Stop!  No!  The result of a cast is effectively that of
> assigning to an unnamed temporary variable with the type given
> in the cast.  This is an rvalue.

Yes, you are right.  I deleted the rest of the mess I made.  I misunderstood
you to be saying that `*' somehow turned things into l-values only, which
contradicts being able to dereference a pointer and get an r-value.  The
example was intended to show U-star in both positions.

The best point you [ went on to make ] is that '&' and '*' are complementary
operations.

> -- 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
> Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

In-Real-Life:  Captain Fanastic, Defender of Bathroom Soap Scum,

But in this life, I remain -
- John.
-- 
John F. Haugh II                  SNAIL:  HECI Exploration Co. Inc.
UUCP: ...!ihnp4!killer!jfh                11910 Greenville Ave, Suite 600
"Don't Have an Oil Well? ...              Dallas, TX. 75243
 ... Then Buy One!"                       (214) 231-0993 Ext 260