[comp.lang.c] struct accessing

dsr@stl.stc.co.uk (David Riches) (06/22/89)

I have a problem here which I'd like to get round if possible.
Say I have a structure like :-

struct fred
  {
   int tom;
   int dick;
   int harry;
  };

To look at any particular field I would say fred.tom or fred.dick, etc.

Now, I have a variable which tells me the name of the field in fred
which I would like to look at, e.g. field_name.  So if field_name
holds the name dick I want to look at fred.dick and so on.

What I'm doing at the moment is using a case statement to interrogate
field_name and to switch to the appropriate statement which lets me
look at the field I want.

This gets messy when the struct gets big. Is there a more subtle way
of doing this?  For instance, in my dreams, I would like to have a
statement which says :-

	person = fred.$field_name$

where $field_name$ means "take my contents and use that as the field
name.".

Does anyone have an elegant solution to this?

   Dave Riches
   PSS:    dsr@stl.stc.co.uk
   ARPA:   dsr%stl.stc.co.uk@earn-relay.ac.uk
   Smail:  Software Design Centre, (Dept. 103, T2 West), 
	   STC Technology Ltd., London Road,
	   Harlow, Essex. CM17 9NA.  England
   Phone:  +44 (0)279-29531 x2496

henry@utzoo.uucp (Henry Spencer) (06/24/89)

In article <1545@stl.stc.co.uk> dsr@stl.stc.co.uk (David Riches) writes:
>Now, I have a variable which tells me the name of the field in fred
>which I would like to look at, e.g. field_name...
>What I'm doing at the moment is using a case statement to interrogate
>field_name...
>This gets messy when the struct gets big. Is there a more subtle way
>of doing this?  For instance, in my dreams, I would like to have a
>statement which says :-
>
>	person = fred.$field_name$

There is no non-messy way of doing this in C.  Somewhere there has to be
a table mapping names into members; C does not normally keep such a thing
around at run time, so you have to supply it yourself.
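
A minimal sketch of such a hand-supplied table, assuming every member is an
int and that an ANSI <stddef.h> providing offsetof is available.  The names
field_entry, fred_fields and fred_field are made up for illustration.

```c
#include <stddef.h>   /* offsetof, size_t, NULL */
#include <string.h>   /* strcmp */

struct fred {
    int tom;
    int dick;
    int harry;
};

/* One table entry per member: its name and its byte offset in struct fred. */
struct field_entry {
    const char *name;
    size_t      offset;
};

static const struct field_entry fred_fields[] = {
    { "tom",   offsetof(struct fred, tom)   },
    { "dick",  offsetof(struct fred, dick)  },
    { "harry", offsetof(struct fred, harry) },
};

/* Return a pointer to the int member called field_name, or NULL if unknown. */
int *fred_field(struct fred *f, const char *field_name)
{
    size_t i;

    for (i = 0; i < sizeof fred_fields / sizeof fred_fields[0]; ++i)
        if (strcmp(fred_fields[i].name, field_name) == 0)
            return (int *)((char *)f + fred_fields[i].offset);
    return NULL;
}
```

With this, *fred_field(&f, field_name) plays the role of fred.$field_name$,
provided the caller checks for NULL first.
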
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

jacob@gore.com (Jacob Gore) (06/25/89)

/ comp.lang.c / dsr@stl.stc.co.uk (David Riches) / Jun 22, 1989 /
struct fred
  {
   int tom;
   int dick;
   int harry;
  }
... in my dreams, I would like to have a statement which says :-

	person = fred.$field_name$

where $field_name$ means "take my contents and use that as the field
name.".
----------

Something that comes close:

struct fred
  {
   int tom;
   int dick;
   int harry;
  } fred_proto;
#define TOM	(&fred_proto.tom   - &fred_proto)
#define DICK	(&fred_proto.dick  - &fred_proto)
#define HARRY	(&fred_proto.harry - &fred_proto)
...
int field_name;
struct fred fred;
int *person;
...
person = &fred + field_name;


Or some variation of the above, like
#define FIELD(aStruct,offset)	(*(&aStruct + offset))
int person = FIELD(fred,TOM);

--
Jacob Gore	Jacob@Gore.Com		{oddjob,chinet,att}!nucsrl!gore!jacob

pc@cs.keele.ac.uk (Phil Cornes) (06/26/89)

From article <1545@stl.stc.co.uk>, by dsr@stl.stc.co.uk (David Riches):
> ......... in my dreams, I would like to have a
> statement which says :-
> 
> 	person = fred.$field_name$
> 

This sounds like a nice idea, but I seriously doubt that you'll find a
general and portable way to do it.

The problem is that your compiler needs to be able to resolve variable
references at compile time, but the contents of your variable $field_name$
may well not be known until run time.

This means that the best you can do is to write some sort of interpreter
which can scan the variable and sort out its meaning at run time instead of
at compile time.  And it didn't sound like this is what you were after
(indeed it's what you've already got).
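
For comparison, the simplest form of such a run-time "interpreter" might look
like the sketch below.  It assumes field_name is a string and that all the
members are ints; fred_lookup is a hypothetical name.  C cannot switch on a
string, so the case statement becomes a chain of strcmp() tests.

```c
#include <stddef.h>   /* NULL */
#include <string.h>   /* strcmp */

struct fred { int tom; int dick; int harry; };

/* Resolve field_name at run time, one explicit test per member.
   Returns a pointer to the member, or NULL for an unknown name. */
int *fred_lookup(struct fred *f, const char *field_name)
{
    if (strcmp(field_name, "tom") == 0)
        return &f->tom;
    if (strcmp(field_name, "dick") == 0)
        return &f->dick;
    if (strcmp(field_name, "harry") == 0)
        return &f->harry;
    return NULL;
}
```

Every new member needs another test here, which is exactly the mess the
original question complains about.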

foessmei@lan.informatik.tu-muenchen.dbp.de (Reinhard Foessmeier) (06/26/89)

In article <1545@stl.stc.co.uk> dsr@stl.stc.co.uk (David Riches) writes:
 .....
>Say I have a structure like :-
>
>struct fred
>  {
>   int tom;
>   int dick;
>   int harry;
>  }
>
>Now, I have a variable which tells me the name of the field in fred
>which I would like to look at, e.g. field_name.  So if field_name
>holds the name dick I want to look at fred.dick and so on.
>
 .....
>
>Does anyone have an elegant solution to this?

Since you want to process the component names of your structure, the
only reasonable way seems to be to use an array instead of a struct.
You may #define symbolic names for the indices, e.g.

#define TOM	0
#define DICK	1
#define HARRY	2
int fred[3];

Now you can access the components as "fred[TOM]" etc.  Fiddling with
the full strings is probably a waste of resources anyway.  Only if you
want to read a name from an external file do you have to convert it
(in a switch statement, for instance), but that is the place where
conversion belongs.

Sorry for my crude English.
Reinhard Fössmeier, Technische Univ. München   |  "Lasciate ogni speranza,
foessmeier@infovax.informatik.tu-muenchen.dbp.de |      voi che entrate!"
   [ { relay.cs.net | unido.uucp } ]             |      (Dante, Inferno)
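
A sketch of the array approach, with the name converted to an index once at
the point where it enters the program.  It uses an enum instead of #defines
for the indices, which is a substitution and not what the post shows; NFIELDS,
field_names and field_index are hypothetical names.

```c
#include <string.h>   /* strcmp */

enum { TOM, DICK, HARRY, NFIELDS };   /* TOM = 0, DICK = 1, HARRY = 2 */

/* External spellings of the indices, in the same order as the enum. */
static const char *const field_names[NFIELDS] = { "tom", "dick", "harry" };

/* Convert an external name to an array index.
   Returns -1 if the name is not one of the known fields. */
int field_index(const char *name)
{
    int i;

    for (i = 0; i < NFIELDS; ++i)
        if (strcmp(field_names[i], name) == 0)
            return i;
    return -1;
}
```

After "int fred[NFIELDS];" the rest of the program works with fred[DICK] or
fred[i] and never touches the strings again.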

dwho@nmtsun.nmt.edu (David Olix) (06/27/89)

In article <1545@stl.stc.co.uk> dsr@stl.stc.co.uk (David Riches) writes:
>I have a problem here which I'd like to get round if possible.
>Say I have a structure like :-
>
>struct fred
>  {
>   int tom;
>   int dick;
>   int harry;
>  }
>

OK, as long as all of the structure members are of the same type (in this
case int), there is a *slightly* sneaky way to handle it, although I admit,
it probably isn't the best thing for code readability.

Also I am assuming that the variable you have the name stored in is a
char * (or char[]).  In this case I have used 'name' as the variable.

Suppose you have the following:

struct fred {
  int tom;
  int dick;
  int harry;
   ...
};

char *people[] = {
  "tom", "dick", "harry", ...
};

char *name;
int i;
struct fred peoples;
int person;

...

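/* Search the table, stopping at its end so an unknown name cannot run off people[]. */
for (i = 0; i < (int)(sizeof people / sizeof people[0]); ++i)
  if (strcmp(people[i], name) == 0)
    break;
if (i < (int)(sizeof people / sizeof people[0]))
  person = *((int *)&peoples + i);  /* assumes no padding between the int members */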

Yeah, I know it's sleazy and nearly illegible, but it beats miles of case
statements.  Personally, though, if this is your problem, I would not define
fred as a structure, but rather as an array.
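
The cast trick quietly depends on the ints being laid out back to back with
no padding.  One way to make that assumption explicit is a compile-time check,
sketched here with the negative-array-size idiom.  The typedef name and
fred_nth are hypothetical.  The check guards against padding only; it does not
turn indexing across members into something the language promises.

```c
#include <stddef.h>   /* offsetof */

struct fred { int tom; int dick; int harry; };

/* Compile-time check: the array size becomes -1 (a compile error) unless
   the members are laid out exactly like int[3]. */
typedef char fred_is_packed_like_int_array[
    (offsetof(struct fred, dick)  == 1 * sizeof(int) &&
     offsetof(struct fred, harry) == 2 * sizeof(int) &&
     sizeof(struct fred)          == 3 * sizeof(int)) ? 1 : -1];

/* Read the i-th int member by position; the caller keeps i in 0..2. */
int fred_nth(const struct fred *f, int i)
{
    return *((const int *)f + i);
}
```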

-- David Olix (dwho@nmtsun.nmt.edu)
"I take full responsibility for my own opinions,
   you take responsibility for yours!"

mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) (06/28/89)

In article <470001@gore.com> jacob@gore.com (Jacob Gore) writes:
> #define TOM	(&fred_proto.tom   - &fred_proto)
> #define DICK	(&fred_proto.dick  - &fred_proto)
> #define HARRY	(&fred_proto.harry - &fred_proto)

Close, but no cigar.  The pointer subtraction yields an error message,
since &fred_proto is of type "struct fred *" and &fred_proto.tom is of
type "int *".

Suppose, instead, that it was
    #define TOM 	(&fred_proto.tom   - &fred_proto.tom)
    #define DICK	(&fred_proto.dick  - &fred_proto.tom)
    #define HARRY	(&fred_proto.harry - &fred_proto.tom)

Even if tom is the first int member of a struct fred, and all the int
members are contiguous, it still isn't guaranteed to work.  Pointer
subtraction is defined by pANS C only over pointers into the same
array.  In particular, a compiler can put an arbitrary number of bytes
of padding between struct members, and such padding wouldn't have to
be a multiple of sizeof(int) bytes long.  Such an implementation
would, in most situations, be silly, but it could happen.

Another possible solution is:

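        #include <stddef.h>     /* defines offsetof */

        struct fred { int tom, dick, harry; } fred_proto;

        /* offsetof takes the struct type, not an object, and yields a byte offset */
        #define TOM     offsetof(struct fred, tom)
        #define DICK    offsetof(struct fred, dick)
        #define HARRY   offsetof(struct fred, harry)
        /* dereference here so that REF(...) can be assigned to */
        #define REF(p,f) (*(int *) ((char *) &(p) + (f)))

        struct fred foo;
        ...
        REF(foo,TOM) = 10;      /* sets foo.tom */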

Actually, K&R's 2nd edition doesn't say anything about the "offsetof"
macro, so I had to guess about its syntax.  Also, some #include file
is needed to #define it.  Anyway, I think this is portable under pANS
C, because offsetof is required to give a field offset in bytes.  If
this use is not portable, what's the portable use of offsetof?

Another requirement is implied by the original article
<1545@stl.stc.co.uk>, in which dsr@stl.stc.co.uk (David Riches)
writes:
> Now, I have a variable which tells me the name of the field in fred
> which I would like to look at, e.g. field_name.  So if field_name
> holds the name dick I want to look at fred.dick and so on. 

The argument to REF has to be an int, not a "name".  There would have
to be some lookup table to associate a character string name with an
offset.
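
One way to get that lookup table without typing every member name twice is
the "X-macro" idiom, sketched below.  This is a swapped-in technique, not
something proposed in the thread.  FRED_FIELDS, DECLARE_MEMBER, TABLE_ENTRY,
fred_table and fred_offset are hypothetical names, and REF is written here so
that it dereferences.  All members are assumed to be ints.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* strcmp */

/* The single list of members; everything else is generated from it. */
#define FRED_FIELDS(X) \
    X(tom)             \
    X(dick)            \
    X(harry)

/* Expands to: struct fred { int tom; int dick; int harry; }; */
#define DECLARE_MEMBER(name) int name;
struct fred { FRED_FIELDS(DECLARE_MEMBER) };

/* Expands to one { "name", byte offset } entry per member. */
#define TABLE_ENTRY(name) { #name, offsetof(struct fred, name) },
static const struct { const char *name; size_t offset; } fred_table[] = {
    FRED_FIELDS(TABLE_ENTRY)
};

/* Map a member name to its byte offset; returns (size_t)-1 if unknown. */
size_t fred_offset(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof fred_table / sizeof fred_table[0]; ++i)
        if (strcmp(fred_table[i].name, name) == 0)
            return fred_table[i].offset;
    return (size_t)-1;
}

/* An lvalue for the int at byte offset f inside object p. */
#define REF(p, f) (*(int *)((char *)&(p) + (f)))
```

Adding a member then means adding one X(...) line; the struct and the table
cannot drift apart.  Callers should check for (size_t)-1 before using REF.
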
--
"Let me control a planet's oxygen supply, and I don't care who makes
the laws." - GREAT CTHUHLU'S STARRY WISDOM BAND (via Roger Leroux)
 __
   \         Tim, the Bizarre and Oddly-Dressed Enchanter
    \               mcdaniel@uicsrd.csrd.uiuc.edu
    /\       mcdaniel%uicsrd@{uxc.cso.uiuc.edu,uiuc.csnet}
  _/  \_     {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel

jacob@gore.com (Jacob Gore) (06/29/89)

/ comp.lang.c / mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) / Jun 27, 1989 /
> In article <470001@gore.com> jacob@gore.com (Jacob Gore) writes:
> > #define TOM	(&fred_proto.tom   - &fred_proto)
> > #define DICK	(&fred_proto.dick  - &fred_proto)
> > #define HARRY	(&fred_proto.harry - &fred_proto)
> 
> Close, but no cigar.  The pointer subtraction yields an error message,
> since &fred_proto is of type "struct fred *" and &fred_proto.tom is of
> type "int *".

I suppose casting one into the other is non-portable?

> Suppose, instead, that it was
>     #define TOM 	(&fred_proto.tom   - &fred_proto.tom)
>     #define DICK	(&fred_proto.dick  - &fred_proto.tom)
>     #define HARRY	(&fred_proto.harry - &fred_proto.tom)
> 
> Even if tom is the first int member of a struct fred, and all the int
> members are contiguous, it still isn't guaranteed to work.  Pointer
> subtraction is defined by pANS C only over pointers into the same
> array.  In particular, a compiler can put an arbitrary number of bytes
> of padding between struct members, and such padding wouldn't have to
> be a multiple of sizeof(int) bytes long.

Why is the padding relevant?  The only requirement is that a field's offset
is the same for all instances of the struct.

--
Jacob Gore	Jacob@Gore.Com		{nucsrl,boulder}!gore!jacob

mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) (07/04/89)

All bracketed references refer to a section in Appendix A of K&R,
version 2.

In article <470002@gore.com> jacob@gore.com (Jacob Gore) writes:
>> Close, but no cigar.  The pointer subtraction yields an error message,
>> since &fred_proto is of type "struct fred *" and &fred_proto.tom is of
>> type "int *".
>
>I suppose casting one into the other is non-portable?

Generally, the only pointer-pointer casts that cause guaranteed
results are casting an A* into a B* and back into an A*, which causes
the final result to be equal to the starting value.  However, B has to
require less or equally strict storage alignment, and "alignment" is
implementation-dependent (except that "char" has the least alignment
restrictions). [6.6]

There is a special dispensation, though: "If a pointer to a structure
is cast to the type of a pointer to its first member, the result
refers to the first member".  [8.3]

So
        int *         mordecai;
        struct fred * haman;
        mordecai = (int *) &fred_proto;                 /* blessed */
        haman    = (struct fred *) &fred_proto.tom;     /* cursed  */

>>     #define DICK	(&fred_proto.dick - &fred_proto.tom)
>>     #define HARRY	(&fred_proto.harry - &fred_proto.tom)
>> Even if tom is the first int member of a struct fred, and all the int
>> members are contiguous, it still isn't guaranteed to work.  Pointer
>> subtraction is defined by pANS C only over pointers into the same
>> array.  In particular, a compiler can put an arbitrary number of bytes
>> of padding between struct members, and such padding wouldn't have to
>> be a multiple of sizeof(int) bytes long.
>
>Why is the padding relevant?  The only requirement is that a field's offset
>is the same for all instances of the struct.

"A non-[bit]field member of a structure is aligned at an addressing
boundary dependent on its type; therefore, there may be unnamed holes
in a structure".  [A8.3]

Suppose pointer subtraction were permitted between structure members.
Consider a sample storage layout for "struct fred" on some 32-bit
machine:
        __ __ __ __ | __ __ __ __ | __ __ __ __
        tom           dick          harry
Since pointer subtraction yields "a signed integral value representing
the displacement between the pointed-to objects; pointers to
successive objects differ by 1" [7.7], DICK is 1 and HARRY is 2, and
all is well with the Universe.

But consider another possible storage layout:
        __ __ __ __ | xx xx __ __ | __ __ xx xx | __ __ __ __
        tom                 dick                  harry
where "x"s refer to "unnamed holes".  Then DICK, which is
        (&fred_proto.dick - &fred_proto.tom)
is ... what?  1.5?  No, the result has to be an integer.  But "dick"
and "tom" are not separated by an integer multiple of "sizeof (int)"
bytes.  If the result of the subtraction is either 1 or 2, only half
of "dick" would be accessed, and some undefined filler value would be
brought along.

"The value [of a pointer subtraction] is undefined unless the pointers
point to objects within the same array" [7.7].  pANS C could have
required that consecutive non-bitfield members of the same type have
no padding between them, and it wouldn't be at all hard to implement
on any architecture I can think of.  I guess that nobody considered
such a special case because nobody saw a need for it.
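
If the distance is measured in bytes rather than in ints, the "1.5" problem
does not arise.  A minimal sketch, assuming it is acceptable to view the
struct as an array of chars; byte_offset is a hypothetical name.

```c
#include <stddef.h>   /* ptrdiff_t */

struct fred { int tom; int dick; int harry; };

/* Distance in bytes from the start of *s to its member *m.  Both pointers
   are converted to char pointers first, so the result does not depend on
   the gap being a multiple of sizeof(int). */
ptrdiff_t byte_offset(const struct fred *s, const int *m)
{
    return (const char *)m - (const char *)s;
}
```

In the second layout pictured above this would give 6 for dick, which is a
perfectly good byte count even though it is not a whole number of 4-byte ints.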

The moral of the story is highly offensive: lbhe pbzcvyre'f
vzcyrzragbe fubhyq chg uvf "ybat" qvpx arkg gb gbz naq uneel.  Vs vg
ehof ntnvafg "fubeg"f pbagnvavat na haanzrq ubyr, vg znl trg pubccrq
va unys.
--
"Let me control a planet's oxygen supply, and I don't care who makes
the laws." - GREAT CTHUHLU'S STARRY WISDOM BAND (via Roger Leroux)
 __
   \         Tim, the Bizarre and Oddly-Dressed Enchanter
    \               mcdaniel@uicsrd.csrd.uiuc.edu
    /\       mcdaniel%uicsrd@{uxc.cso.uiuc.edu,uiuc.csnet}
  _/  \_     {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel

blarson@basil.usc.edu (bob larson) (07/04/89)

In article <1411@garcon.cso.uiuc.edu> mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) writes:
>There is a special dispensation, though: "If a pointer to a structure
>is cast to the type of a pointer to its first member, the result
>refers to the first member".  [8.3]

There is at least one current (non-ansi) compiler that does not do this
currently, and will be forced to have extra overhead if the first structure
member is a char.  Prime 50 series computers aren't byte addressable%, and
the ix mode C compiler avoids the overhead of bit field insert/extract
on char variables not part of an array by treating them as shorts.  (char
is unsigned by default, signed char will be rather nasty to implement.)

64v mode C on the Prime puts char variables in the left side of a 16 bit
word, thus using both more time and memory than needed.

% The ix instruction set does have instructions designed for use from C
that do try to fake byte addressing.  Unfortunately, most of the overhead
of the bit field insert/extract is still there.

-- 
Bob Larson	Arpa:	blarson@basil.usc.edu
Uucp: {uunet,cit-vax}!usc!basil!blarson
Prime mailing list:	info-prime-request%ais1@ecla.usc.edu
			usc!ais1!info-prime-request

les@chinet.chi.il.us (Leslie Mikesell) (07/06/89)

In article <18255@usc.edu> blarson@basil.usc.edu (bob larson) writes:

>>There is a special dispensation, though: "If a pointer to a structure
>>is cast to the type of a pointer to its first member, the result
>>refers to the first member".  [8.3]

>There is at least one current (non-ansi) compiler that does not do this
>currently...

Is it possible to use the bsearch(), tsearch(), etc., library routines
for data in structs if a pointer to the struct cannot be cast to a
character pointer?  Alternatively, given a pointer to an element of
a struct, is it possible to deduce the base address (I assume this would
require pointer subtraction and thus be illegal)?

Les Mikesell
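
On the second question, a common idiom steps back from the member by its
byte offset instead of subtracting int pointers.  A sketch, assuming an ANSI
offsetof is available; fred_from_dick is a hypothetical name.  Whether this
behaves on a machine like the Prime described above is not something the
sketch can settle.

```c
#include <stddef.h>   /* offsetof */

struct fred { int tom; int dick; int harry; };

/* Given a pointer to the dick member of some struct fred, recover a pointer
   to the enclosing struct by stepping back offsetof(struct fred, dick)
   bytes.  The arithmetic is done on char *, not on int *. */
struct fred *fred_from_dick(int *dick_ptr)
{
    return (struct fred *)((char *)dick_ptr - offsetof(struct fred, dick));
}
```

The pointer passed in must really point at the dick member of a struct fred;
nothing in the function can verify that.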