scs@adam.mit.edu (Steve Summit) (05/31/90)
In article <3078@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <2574@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes:
>The problem I've often had is that I have a data structure containing
>some pointers, and *I* have filled some of them in (using strdup())
>and the caller preset some of them to defaults.  Now I'm going to change
>one of them.  Should I free it?  If *I* allocated it, certainly, there
>isn't any other copy of the pointer.  If the *caller* allocated it...

It is an axiom of data abstraction methodologies that only certain
routines associated with a data structure should directly modify or
access any of the fields in it.  This means that you don't need to
worry whether the caller coded

	structp->namefield = "rosebud";
or
	structp->namefield = strdup("rosebud");

because you gave him a routine so he could

	setname(structp, "rosebud");

and within your implementation of setname() you can enforce whatever
allocation scheme you require.

When I am writing a seriously modularized package, the structure
declarations in the header files are surrounded by #ifdefs which make
the declarations visible only to the routines which implement the
package.  For the above example, the file would look something like

	#ifdef STRUCT_INTERNALS

	struct whatever
		{
		char *namefield;
		/* other fields... */
		};

	#endif

	extern struct whatever *structalloc(void);
	extern void setname(struct whatever *, char *);

Any calling program (unless it cheats and #defines STRUCT_INTERNALS)
sees only extern declarations and maybe #definitions of flag values,
but does not end up knowing the "shape" of the structure.  C allows
functions to pass around pointers to undefined (incomplete) structures,
and this programming style makes good use of that feature.
(Unfortunately, lint -h complains about "struct whatever never
defined".)
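
A minimal single-file sketch of the scheme above, under some stated
assumptions: the header and implementation are shown together (in
practice they would be two files, and only the implementation would
define STRUCT_INTERNALS); structfree() and getname() are hypothetical
additions not in the declarations above; setname() returns int here,
rather than void, so that it can report allocation failure; and the
copy is made with malloc() and strcpy() rather than strdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STRUCT_INTERNALS	/* only the implementation defines this */

/* ---- what would be the header file ---- */
#ifdef STRUCT_INTERNALS
struct whatever {
	char *namefield;	/* invariant: NULL, or malloc'ed by setname() */
};
#endif

struct whatever *structalloc(void);
void structfree(struct whatever *);		/* hypothetical */
int setname(struct whatever *, const char *);	/* 0 on success, -1 on failure */
const char *getname(const struct whatever *);	/* hypothetical */

/* ---- what would be the implementation file ---- */
struct whatever *
structalloc(void)
{
	struct whatever *p = malloc(sizeof *p);

	if(p != NULL)
		p->namefield = NULL;
	return p;
}

void
structfree(struct whatever *p)
{
	if(p != NULL) {
		free(p->namefield);
		free(p);
	}
}

int
setname(struct whatever *p, const char *name)
{
	char *copy;

	assert(p != NULL && name != NULL);
	copy = malloc(strlen(name) + 1);
	if(copy == NULL)
		return -1;	/* the old name is left untouched */
	strcpy(copy, name);
	free(p->namefield);	/* always ours to free, or NULL */
	p->namefield = copy;
	return 0;
}

const char *
getname(const struct whatever *p)
{
	assert(p != NULL);
	return p->namefield;
}
```

Since every name stored in the structure was allocated by setname(),
the "should I free it?" question has a single answer.  The new copy is
made before the old one is freed, so in this sketch even
setname(p, getname(p)) is safe.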

In article <224@taumet.COM> steve@taumet.UUCP (Stephen Clamage) writes:
>This is easy to handle (safely and portably) in C++ (which is not
>precisely the question you asked).
>In the constructor for the data structure...
>The destructor,
>and each member function which modifies the structure, checks...
>to see whether to free the data...
>Forever after, only
>the member functions are used to modify the data...

Steve is saying the same thing, although the techniques are by no
means limited to languages like C++.  C++'s programming style (like
object-oriented programming in general) merely encourages the use of
short little "member access" functions.  C++ also allows inlining them
in case you are worried about function call overhead.

In article <3102@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>C++ is another language.  I simply do not have the option of using it...

As mentioned, you don't have to.  Just use the techniques it would
have encouraged, such as good data hiding.

In article <3103@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <1739@necisa.ho.necisa.oz>, boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>> Well, once you've coded yourself into a corner all bets are off.  Choose
>> a better algorithm, one that has all the pointers in a free-able state.
>Whether something is freeable is NOT a property of the algorithm which has
>to make the immediate decision.  It is a property of the program that USES
>the algorithm.  If you're writing a library function, you simply haven't
>any control over the code that uses your function.

The problem is when you are writing a program which tries to use a
badly-written library which you have no control over.  When you are
the one writing the library, you are finally in a position to set
things up correctly.
You must arrange to give the calling program sufficient control, but
not by "giving away the farm"; many details must be reserved to the
library implementation so that it can be changed or improved later
without breaking things.

Member access functions, even for seemingly trivial operations, are
an important part of a successful interface.  The differences between
having the caller say

	structp->somefield = 1;
and
	setfield(structp, 1);

are vast, and not immediately obvious.  By using the second form, it
is possible to

     -	link the calling program against a new version of the library
	without recompiling, even if the structure layout has changed

     -	change the library in such a way that some other value must
	be changed whenever somefield's value changes

     -	enforce access rights on the field, disallowing changes or
	checking changes for validity

For example, suppose that one of the fields in the structure is an
averaging interval.  One day you decide you're spending too much time
dividing by the number of samples, and you want to replace it with a
right shift.  You can change the function to set the averaging
interval from

	int
	setaveraginginterval(struct whatever *structp, int avgint)
	{
		structp->avgint = avgint;
		return 0;		/* always succeeds, for now */
	}

to

	int
	setaveraginginterval(struct whatever *structp, int avgint)
	{
		int shift = 0;

		/* complain unless avgint is a power of two */
		if(avgint <= 0 || (avgint & (avgint - 1)) != 0)
			return -1;
		while((1 << shift) < avgint)	/* integer log2 */
			shift++;
		structp->avgint = avgint;
		structp->log2avgint = shift;
		return 0;
	}

and instead of later saying

	average = total / p->avgint;

you can say

	average = total >> p->log2avgint;

(Note the implication that a function might want to be declared as
returning int, rather than void, even if initially it always returns
successfully, so that later you can add cases which complain and/or
return a failure code, such as when the requested averaging interval
isn't a power of two.)

Someone will inevitably complain that the function call overhead
implied by all these little access functions is intolerable.
You can avoid function calls by using a compiler that supports
inlining, but that does remove the relink-without-recompile advantage.
Unless the function is called so often that the call overhead is truly
a factor, it's MUCH better to have it a bona-fide function -- the
existence of a "hook" at which arbitrary code can fire whenever a
structure field is modified (or even examined!) is frequently
invaluable.

(I'm sorry that the example above centers on an efficiency hack, since
normally I eschew them.  The example is artificial, and is merely
meant to illustrate why you might need to keep two fields in synch,
which you couldn't depend on your caller to do.  In fact, having to
keep them in synch at all is undesirable.  Keep those silly little
function calls and long divisions in there, unless it is conclusively
demonstrated that replacing them with in-line code and/or right shifts
will have tangible and useful benefits.)

Good library design is a fascinating subject, and one worth careful
thought and study.

                                            Steve Summit
                                            scs@adam.mit.edu