[comp.std.c] so how do I do it?

minar@reed.edu (06/27/91)

If

void * p;
(int *)p++;

is illegal, how do I do what I mean?

I'm writing some code right now that needs to extract information from a buffer
that contains various types in it. Let's say that there's a chunk of memory that
I *know* first contains a char, then an unsigned. I want to get these values.

The obvious way to do it is
struct foo {
  char c;
  unsigned u;
};

and then (struct foo *)p->c  or (struct foo *)p->u

this is nonportable, as to my understanding, as struct arrangements are not
guaranteed.

The next best thing is:
*(char *)p
to get the char. Then, I want to get to the unsigned that's next, so the
obvious next step is
(char *)p++
ie: increment one character. Later, I can
(unsigned *)p++
to get past the unsigned.

This seems very natural to me. Is it illegal? If it is, what do I do instead?

(if you're curious, the structures I'm dealing with are more like:

struct foo {
  char c[x];
  int i[x];
};

where x is variable at runtime.)

I don't have to write this code portably, but I'd like to write it correctly
according to ANSI anyway.

while I'm at it, how do you get the offset of an element of a structure
the ANSI way?

(please send responses to me via mail (or post, too, if you want) - I don't
always get to read news.)

diamond@jit533.swstokyo.dec.com (Norman Diamond) (06/27/91)

In article <m0jspJY-0001ieC@shiva.reed.edu> minar@reed.edu writes:
>If
>void * p;
>(int *)p++;
>is illegal, how do I do what I mean?

p = (int *)p + 1;

This was already answered earlier.

>I'm writing some code right now that needs to extract information from a buffer
>that contains various types in it. Lets say that there's a chunk of memory that
>I *know* first contains a char, then an unsigned. I want to get these values.
>The obvious way to do it is
>struct foo {
>  char c;
>  unsigned u;
>};
>and then (struct foo *)p->c  or (struct foo *)p->u

This is legal.

>this is nonportable, as to my understanding, as struct arrangements are not
>guaranteed.

Huh?  Oh, you mean that the buffer was laid out by some other entity
than your C program.  Yes, C (along with every other language except Ada!)
fails to provide for manipulations of such data.  Actually the Ada syntax
is also non-portable, but there's a standardized way to express that you're
manipulating such things.

>The next best thing is:   *(char *)p   to get the char.
Yes.

>Then, I want to get to the unsigned that's next, so the
>obvious next step is   (char *)p++   ie: increment one character.
p = (char *)p + 1

>Later, I can  (unsigned *)p++  to get past the unsigned.
p = (unsigned *)p + 1

Actually in these cases, you might prefer to declare p as a char * instead
of void *.  Then you can say   p++   and   p += sizeof(int)

>while I'm at it, how do you get the offset of an element of a structure
>the ANSI way?
offsetof()
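
A minimal sketch of offsetof() in use, assuming the struct foo from the
original question. The helper name foo_padding is made up for
illustration; only offsetof() and size_t come from <stddef.h>.

```c
#include <stddef.h>   /* offsetof, size_t */

struct foo {
    char c;
    unsigned u;
};

/*
 * Hypothetical helper: number of padding bytes the compiler placed
 * between c and u. The value is implementation-defined.
 */
size_t foo_padding(void)
{
    return offsetof(struct foo, u) - (offsetof(struct foo, c) + sizeof(char));
}
```

A nonzero result from foo_padding() is exactly the reason the struct
cannot be laid over a packed buffer.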
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.
Permission is granted to feel this signature, but not to look at it.

torek@elf.ee.lbl.gov (Chris Torek) (06/27/91)

In article <1991Jun27.115736.18417@tkou02.enet.dec.com>
diamond@jit533.enet@tkou02.enet.dec.com (Norman Diamond) writes:
>Huh?  Oh, you mean that the buffer was laid out by some other entity
>than your C program.  Yes, C (along with every other language except Ada!)
>fails to provide for manipulations of such data.

Not strictly true:  Mesa (for instance) allows record specifiers that
include actual bit or byte addresses.  In Mesa, though, it was all
or nothing: either you said

	type foo = record [ blah, blah, blah ];

and got arbitrary compiler-munching (including order), or you gave it
the `machine-dependent' clause and got absolute control.

There are a bunch of different concepts that different languages do
or do not allow you to express.  One is:

	This is exactly how the bits work.  Change none of them.
	[Example: Mesa `machine-dependent' records]

which is good for talking to devices or network byte streams.  Another
is:

	This is the data I need carried about.  Optimize for time.
	[Example: Pascal records]

Still another is:

	This is the data I need carried about.  Optimize for space.
	[Example: Pascal packed records]

C takes a somewhat wishy-washy approach and tries to do everything at
once: `struct's are not reorderable, but are not necessarily packed.
As it turns out, though, even Pascal's packed/notpacked approach is
insufficient in some cases: you might want to say `pack this as much as
is convenient, but not so much as to cause order of magnitude
slowdowns'.  Some Pascal compilers treat

	type sevenbits = 0..127;
	var foo : packed array [0..16777215] of sevenbits;

as arrays of bytes, some as arrays of seven-bit fields; the latter
is sometimes severely slower, but does save 2 megabytes.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

gwyn@smoke.brl.mil (Doug Gwyn) (06/28/91)

In article <1991Jun27.115736.18417@tkou02.enet.dec.com> diamond@jit533.enet@tkou02.enet.dec.com (Norman Diamond) writes:
>>and then (struct foo *)p->c  or (struct foo *)p->u
>This is legal.

Actually, no.  ((struct foo *)p)->c is the way to write this.
You want to cast the pointer, not the char member.

>>this is nonportable, as to my understanding, as struct arrangements are not
>>guaranteed.
>Huh?  Oh, you mean that the buffer was laid out by some other entity
>than your C program.

I don't know exactly what he meant (could have been any of several things),
but the problems are:  (1) there might be padding between the struct members
and (2) the representation, especially of the unsigned member, may not agree
with the C implementation's.  The first problem is easy; the second one is
harder.

diamond@jit533.swstokyo.dec.com (Norman Diamond) (06/28/91)

In article <16561@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <1991Jun27.115736.18417@tkou02.enet.dec.com> diamond@jit533.enet@tkou02.enet.dec.com (Norman Diamond) writes:
>>>and then (struct foo *)p->c  or (struct foo *)p->u
>>This is legal.
>Actually, no.  ((struct foo *)p)->c is the way to write this.
>You want to cast the pointer, not the char member.

Sorry.  I probably messed up the syntax on a few other lines too.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.
Permission is granted to feel this signature, but not to look at it.

volpe@camelback.crd.ge.com (Christopher R Volpe) (06/28/91)

In article <m0jspJY-0001ieC@shiva.reed.edu>, minar@reed.edu writes:
|>If
|>
|>void * p;
|>(int *)p++;
|>
|>is illegal, how do I do what I mean?

When you say "(int *)p++;", what you're really saying is:
  (int *)p = (int *)p + 1;

The only problem with that is that the left hand side is not an lvalue, because
of the cast. So, nuke the cast:
  p = (int *)p + 1;

There, the RHS is the expression you want, and it can legally be assigned
to p, since p is an lvalue of type (void *).
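
A small sketch of that assignment form, wrapped in two functions whose
names (skip_int, skip_char) are made up for illustration. One detail
worth noting: postfix ++ binds tighter than a cast, so a compiler reads
(int *)p++ as (int *)(p++), and p++ is itself rejected on a void *
because void has no size. Either way the fix is the same.

```c
/* Hypothetical helper: advance a void pointer past one int. */
void *skip_int(void *p)
{
    p = (int *)p + 1;    /* cast only on the right; p stays an lvalue */
    return p;
}

/* Hypothetical helper: advance a void pointer past one char. */
void *skip_char(void *p)
{
    p = (char *)p + 1;
    return p;
}
```

Given int a[2], skip_int(a) yields the address of a[1]. This only moves
the pointer; it says nothing about whether the new address is suitably
aligned for the next object.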

-Chris
                                                  
==================
Chris Volpe
G.E. Corporate R&D
volpecr@crd.ge.com

jfw@ksr.com (John F. Woods) (06/28/91)

minar@reed.edu writes:
>If
>void * p;
>(int *)p++;
>is illegal, how do I do what I mean?

Since it is illegal, you'll have to explain what you WANT it to mean.

>I'm writing some code right now that needs to extract information from a
>buffer that contains various types in it. Lets say that there's a chunk of
>memory that I *know* first contains a char, then an unsigned. I want to get
>these values.

Ah.

>The obvious way to do it is
>struct foo {
>  char c;
>  unsigned u;
>};
>and then (struct foo *)p->c  or (struct foo *)p->u
>this is nonportable, as to my understanding, as struct arrangements are not
>guaranteed.

They are guaranteed.  The address of p->c is guaranteed to be less than the
address of p->u.  Of course, the PADDING is implementation defined.


>The next best thing is:
>*(char *)p
>to get the char.

Yes.

>Then, I want to get to the unsigned that's next, so the obvious next step is
>(char *)p++

Not if you plan to program in C.

p = (char *)p + sizeof(char);

will increment p by the size of a char.  NOTE:  I assume you think you can then
type

*(unsigned *)p

to get the value of the unsigned int which follows, and this is WRONG.  A wide
variety of interesting (i.e. neither VAX nor washing-machine-controller)
machines will take addressing exceptions if you access objects on random
alignments.  To access that unsigned int will require something like

unsigned int victim;
bcopy(p, &victim, sizeof(unsigned int));	/* or your favorite memory
						 * mover or even a carefully
						 * chosen macro
						 */
p = (char *)p + sizeof(unsigned int);
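
bcopy() is a BSD routine; the ANSI spelling of the same idea is
memcpy() from <string.h> (note the swapped argument order). A minimal
sketch, assuming the buffer holds the unsigned in the machine's own
representation; the function name read_unsigned is made up for
illustration.

```c
#include <string.h>   /* memcpy */

/*
 * Hypothetical helper: copy one unsigned int out of a possibly
 * misaligned buffer position and return the position just past it.
 */
const void *read_unsigned(const void *p, unsigned int *out)
{
    memcpy(out, p, sizeof *out);          /* byte copy, no alignment trap */
    return (const char *)p + sizeof *out;
}
```

Calling it as p = read_unsigned(p, &victim) fetches the value and steps
past it in one go.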

But this, of course, assumes that the byte order in that packed "structure"
is the same as the machine's natural byte order.  If you're exchanging these
"structures" with another machine (via disk file or network), that may not
be the case, and then "accessor macros" are an excellent idea (e.g. the
putlong()/getlong() macros found in the BSD named resolver routines).

This kind of problem has been solved correctly and well in C over and over
again.

>I don't have to write this code portably, but I'd like to write it correctly
>according to ANSI anyway.

Note that unportable code is not only a problem when you finally upgrade from
your 8080 system to a Laptop Cray; it can cause you grief the next time you
get a compiler upgrade (hahahahahahahahahahahaha) or even using the same
compiler, if it is an aggressive optimizer which prizes speed of code over
consistency in undefined cases.

>while I'm at it, how do you get the offset of an element of a structure
>the ANSI way?

Using the offsetof() macro.  Please go buy and read an ANSI C manual.

"I'm having troubles with an Ada program; I don't actually have an Ada manual,
nor have I ever seen one, so I just typed in the following BASIC program, and
I don't understand why it doesn't work:"

peter@ficc.ferranti.com (Peter da Silva) (06/29/91)

In article <m0jspJY-0001ieC@shiva.reed.edu> minar@reed.edu writes:
> I'm writing some code right now that needs to extract information from a
> buffer that contains various types in it.

> Lets say that there's a chunk of memory that
> I *know* first contains a char, then an unsigned. I want to get these values.

Option 1: define a structure for the buffer. This will automatically handle
alignment requirements and the like.

Option 2: If the buffer is packed, or imported, you will have to step
through it byte by byte:

	unsigned char *bufp;	/* unsigned, so bytes above 127 don't sign-extend */
	unsigned u;
	char c;

	bufp = (unsigned char *)buffer;

	c = *bufp++;
	u = *bufp++ << bitsperchar;
	u |= *bufp++;

Or, if the buffer is little-endian:

	c = *bufp++;
	u = *bufp++;
	u |= *bufp++ << bitsperchar;
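
The two byte orders above can be sketched as a pair of functions. This
assumes the field occupies exactly two bytes in the buffer and that
unsigned is wider than CHAR_BIT bits; the names get_u16_be and
get_u16_le are made up for illustration, and CHAR_BIT from <limits.h>
stands in for bitsperchar.

```c
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical helper: two-byte field, most significant byte first. */
unsigned get_u16_be(const unsigned char *b)
{
    return ((unsigned)b[0] << CHAR_BIT) | (unsigned)b[1];
}

/* Hypothetical helper: two-byte field, least significant byte first. */
unsigned get_u16_le(const unsigned char *b)
{
    return (unsigned)b[0] | ((unsigned)b[1] << CHAR_BIT);
}
```

Reading through unsigned char keeps bytes above 127 from being
sign-extended before the shift, and the result does not depend on the
host's own byte order.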

> this is nonportable, as to my understanding, as struct arrangements are not
> guaranteed.

No, that's portable. Structs are guaranteed to be in increasing order. Padding
is undefined, as is byte order within a word.

> while I'm at it, how do you get the offset of an element of a structure
> the ANSI way?

Use the offsetof() macro.
-- 
Peter da Silva; Ferranti International Controls Corporation; +1 713 274 5180;
Sugar Land, TX  77487-5012;         `-_-' "Have you hugged your wolf, today?"