[comp.lang.c] Debugging, statics.

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/18/88)

In article <9174@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:

	[ ... describes problems he had reusing a sort program ... ]

    [Side note: Because about 30% of my code was "assert" statements, when
    things did go wrong I was able to spot the problem immediately and had
    good clues as to the causes.]

Triple Hurrah! This is not a side note, it is a very important thing!

I am a debugger hater. I want to cheer anybody who says that he or she didn't
fire up a debugger to find mistakes, but that the code itself reported them.
The best checks/diagnostics/traces are those that an intelligent programmer
puts in the code, as they are algorithm based. As Knuth wrote "you will find
that in many well written programs half of the code is there to check the
other half".
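
A minimal sketch of what "algorithm based" checks can look like, assuming a
plain insertion sort. The names checked_sort and is_sorted are hypothetical
and are not taken from the sort program discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a[0..n-1] is in non-decreasing order, 0 otherwise. */
static int is_sorted(const int *a, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (a[i - 1] > a[i])
			return 0;
	return 1;
}

/* Insertion sort whose checks come from the algorithm itself. */
void checked_sort(int *a, size_t n)
{
	size_t i, j;

	assert(a != NULL || n == 0);		/* precondition */
	for (i = 1; i < n; i++) {
		int key = a[i];

		assert(is_sorted(a, i));	/* invariant: prefix is sorted */
		for (j = i; j > 0 && a[j - 1] > key; j--)
			a[j] = a[j - 1];
		a[j] = key;
	}
	assert(is_sorted(a, n));		/* postcondition */
}
```

The checks are deliberately expensive; compiling with NDEBUG defined removes
them once the code is trusted.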

(The only debugger I use is 'adb' to understand what is going on when an
aggressive optimizer sanctified by dpANS C fails to generate correct code...)

    The apparent moral is to not use block-scope static initialization, but I
    think a better one would be, to design utility programs with the thought
    that they should be serially reusable.

I beg to differ, actually to dig deeper into the problem. The moral is that
by using statics (whether local or global) you can only write single instance
generators, and that this single instance is indeed not reusable if you don't
refresh it.

    That way if they ever become subroutines (or repeating slaves like the
    one I had) you already have them in shape for the task.

Generators are not subroutines. In most algorithmic languages like C the easy
way to write a generator (a coroutine) is to store its state in
statics/globals. Then you can only have one instance of the generator.
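
A minimal sketch of the "easy way", assuming a trivial generator of even
numbers. The name next_even is hypothetical.

```c
#include <assert.h>

/*
 * Generator whose state lives in a block-scope static.
 * Successive calls yield 0, 2, 4, ...
 */
int next_even(void)
{
	static int n = 0;	/* initialized once, before the first call */
	int value = n;

	n += 2;
	assert(value % 2 == 0);
	return value;
}
```

Every caller in the program shares the one sequence, and nothing short of
starting a new process sets n back to 0.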

Whenever you fork an executable under Unix a separate instance of its globals
is allocated, and refreshed. You have not used this mechanism for refreshing
the single instance you wanted, because it has high overheads. If you were
using multiple threads (precisely because of their low overhead), you would
not want to use fork for creating multiple instances of it as well.

The real solution would be to have the state of the generator in a struct,
explicitly identified as such. It would then be easy to convert it, as
necessary, from relying on fork() to create new instances and refresh them,
to instead doing it with low overhead program logic, as needed.
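
A sketch of the struct approach, assuming a trivial generator of even
numbers. The names even_gen, even_gen_init and even_gen_next are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* All of the generator's state, explicitly identified as such. */
struct even_gen {
	int next;
};

/* Creates or refreshes an instance: the job fork() was doing before. */
void even_gen_init(struct even_gen *g)
{
	assert(g != NULL);
	g->next = 0;
}

/* Yields 0, 2, 4, ... independently for each instance. */
int even_gen_next(struct even_gen *g)
{
	int value;

	assert(g != NULL);
	value = g->next;
	g->next += 2;
	return value;
}
```

Two variables of type struct even_gen advance independently, and calling
even_gen_init again on either one restarts only that one.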

In this way one could have as many independent instances of the generator as
one wants.  Existing examples of this:

[1] struct _iob and all the stdio procedures;

[2] struct u and the Unix system calls.

Yes, struct u is merely a conglobation of the states of all the system calls
of Unix that are generators for a process; when the Mach people wanted to
introduce multiple threads, they had to dynamically allocate a new struct u
per thread, and use a pointer to access the current instance, instead of
addressing it statically.
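
The change from static addressing to a pointer can be illustrated with a
small sketch. It is only an illustration of the idea, not the Mach or Unix
code: struct ctx, ctx_switch and set_error are hypothetical names, and the
"scheduler" is just an explicit call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread state; a stand-in, not the real struct u. */
struct ctx {
	int last_error;
};

/* Formerly a single static struct; now a pointer to the current one. */
static struct ctx *current;

/* What a scheduler would do when it switches threads. */
void ctx_switch(struct ctx *c)
{
	assert(c != NULL);
	current = c;
}

/* Code that used to write to the static now goes through the pointer. */
void set_error(int e)
{
	assert(current != NULL);
	current->last_error = e;
}
```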
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

mat@mole-end.UUCP (Mark A Terribile) (12/20/88)

|     The ... moral is to not use block-scope static initialization, ... a
|     better one [is] to design utility programs [that are] serially reusable.
| 
| I beg to ... dig deeper into the problem. The moral is that by using statics
| ... you can only write single instance generators, and that this single
| instance is indeed not reusable if you don't refresh it.
| ...
| Generators are not subroutines. In most algorithmic languages like C the easy
| way to write a generator (a coroutine) is to store its state in
| statics/globals. Then you can only have one instance of the generator.

True, but please allow me to point out that this problem is trivial in C++
(I know, this is a group about C) and that using a struct to represent the
state is exactly the *implementation* of the C++ idiom.  The difficulty in
C is that you can't force an instance of a struct to be initialized when
it is declared, especially if it is declared locally.  Because there is no
automatic aggregate initialization, you end up resorting to statics.  I have
no good solution except to take the plunge UPWARD to C++.
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile

asg@pyuxf.UUCP (alan geller) (12/22/88)

In article <421@aber-cs.UUCP>, pcg@aber-cs.UUCP (Piercarlo Grandi) writes:
> In article <9174@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> ... a discussion of debugging by assertion versus by debugger ...
> ... then starting to talk about generators ...
> The moral is that
> by using statics (whether local or global) you can only write single instance
> generators, and that this single instance is indeed not reusable if you don't
> refresh it.
> ...
> The real solution would be to have the state of the generator in a struct,
> explicitly identified as such. It would then be easy to convert it, as
> necessary, from relying on fork() to create new instances and refresh them,
> to instead doing it with low overhead program logic, as needed.
> In this way one could have as many independent instances of the generator as
> you want.  ...

Essentially, this amounts to creating what LISP programmers know as
a 'function closure', or 'closure' for short:  an object that consists
of a procedure, plus an associated execution environment (usually the
values of some procedure parameters).

It is indeed true that closures provide a useful, general-purpose
facility for defining things like multi-instance generators.  Also,
if you extend the data in the associated execution environment to
include some notion of 'program counter' and the local variables
of the procedure, then closures become a natural way to view coroutines
or process threads.

Unfortunately, LISP (and its relatives) is the only language I know
of that supports explicitly the notion of a closure (please, no flames
just to prove my ignorance here).  It is possible to provide such a
mechanism in C, but usually only for a specific special case -- in order
to provide a general-purpose function closure mechanism for C, you'd
have to monkey around a lot with either the operating system or the
run-time environment.
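
A sketch of the "specific special case" kind of closure in C, assuming the
caller supplies storage for the environment by hand. All names here
(closure, closure_call, adder_env, make_adder) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A procedure plus a pointer to its execution environment. */
struct closure {
	int (*fn)(void *env, int arg);
	void *env;
};

int closure_call(const struct closure *c, int arg)
{
	assert(c != NULL && c->fn != NULL);
	return c->fn(c->env, arg);
}

/* One specific case: "add k", where k is the captured value. */
struct adder_env {
	int k;
};

static int adder_fn(void *env, int arg)
{
	const struct adder_env *e = env;

	return arg + e->k;
}

/* The caller owns *env and must keep it alive as long as the closure. */
struct closure make_adder(struct adder_env *env, int k)
{
	struct closure c;

	assert(env != NULL);
	env->k = k;
	c.fn = adder_fn;
	c.env = env;
	return c;
}
```

What is missing compared with LISP is exactly the general part: the
environment has to be declared, filled in and kept alive explicitly for
each kind of closure.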

Alan Geller
Bellcore
...!{rutgers|princeton}!bcr!asg

guy@auspex.UUCP (Guy Harris) (12/22/88)

>The difficulty in C is that you can't force an instance of a struct to
>be initialized when it is declared, especially if it is declared
>locally.

Err, umm, better make that "*only* if it is declared locally".  There's no
particular problem with

	struct foo {
		int a;
		char *b;
		float c;
	} bar = {
		666,
		"Hello, sailor!",
		137.06
	};

if "bar" is "static" or "external".

>Because there is no automatic aggregate initialization,

Automatic initialization is often (always?) just syntactic sugar.  You
can do

	foo()
	{
		int a = 33;
		struct foo b = { 666, "Hello, sailor!", 137.06};

		...
	}

by doing

	foo()
	{
		int a = 33;
		struct foo b;

		b.a = 666;
		b.b = "Hello, sailor!";
		b.c = 137.06;

		...
	}

The problem here appears to be that C makes it inconvenient to have
non-automatic, private data that belongs to an *instance* of the
generator; there's only one copy of a "static", so you don't get one per
customer.  (C doesn't make it *impossible*, of course.)

pcg@aber-cs.UUCP (Piercarlo Grandi) (12/23/88)

In article <127@mole-end.UUCP> mat@mole-end.UUCP (Mark A Terribile) writes:

#	[ ... where I (pcg) point out that if one implements generators state
#	in explicit structs one can build multiple instances of them ... ]

#    True, but please allow me to point out that this problem is trivial in
#    C++ (I know, this is a group about C) and that using a struct to
#    represent the state is exactly the *implementation* of the C++ idiom.

Indeed, indeed. Hail C++! It saves drudgery. By the way, Stroustrup has
indeed written a fairly general coroutines/generators package, and he does
demonstrate writing "iterator" classes in his book. Others have used them,
too. On my own, I have written a generalized package that, with a few syntax
macros, does that in C as well, and I am going to convert it to a C++ version
as soon as I can.

    The difficulty in C is that you can't force an instance of a struct to be
    initialized when it is declared, especially if it is declared locally.
    Because there is no automatic aggregate initialization, you end up
    resorting to statics.

Agreed, but there is also a question of a bit of laziness. Almost every
time I declare a struct in C I define create and delete procedures for it, as
a matter of habit, but I do too occasionally give in to the laziness of not
struct'ifying the state of an impure function, and then as often I get
bitten.
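
A sketch of the create/delete habit, assuming heap allocation with malloc.
The names counter, counter_create, counter_next and counter_delete are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct counter {
	int next;
	int step;
};

/* Returns a fully initialized counter, or NULL if malloc fails. */
struct counter *counter_create(int start, int step)
{
	struct counter *c = malloc(sizeof *c);

	if (c != NULL) {
		c->next = start;
		c->step = step;
	}
	return c;
}

/* Yields start, start + step, start + 2*step, ... */
int counter_next(struct counter *c)
{
	int value;

	assert(c != NULL);
	value = c->next;
	c->next += c->step;
	return value;
}

void counter_delete(struct counter *c)
{
	free(c);	/* free(NULL) is a no-op */
}
```

If every instance is obtained through counter_create, no instance can be
left uninitialized, which is the discipline a C++ constructor enforces
automatically.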

Clearly the fact that C++ invites and helps you to declare constructors and
destructors does help a lot in overcoming laziness.  They are also useful in
other ways; I have recently (re)posted to comp.lang.c++ a somewhat neat way
of using them to implement shallow bound, dynamically scoped variables (e.g.
exception handlers).

Let me add that C++ classes, as helpful as they are to hold state, are only
half of the full solution: to have a truly generalized generator coroutine
facility, you also need non local control transfers, to implement suspend and
resume.  I used a little known "feature" of PCC to implement this (taking &
of a label...), but clearly there ought to be a language supported mechanism.
This is a very general problem that algorithmic languages do not support very
well, save for noble exceptions (e.g. SL5/Icon).
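
For illustration only, here is a portable approximation of suspend and
resume. It is not the PCC label-address trick mentioned above: it stores a
resume point in the state struct and re-enters the function through a
switch, so it can only suspend within a single function. The names walker,
walker_init and walker_next are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct walker {
	int resume;	/* where to continue; plays the role of a program counter */
	int i;		/* a "local" that must survive a suspend */
};

void walker_init(struct walker *w)
{
	assert(w != NULL);
	w->resume = 0;
	w->i = 0;
}

/*
 * Yields 1, 2, 3: stores the next value in *out and returns 1,
 * or returns 0 once the sequence is exhausted.
 */
int walker_next(struct walker *w, int *out)
{
	assert(w != NULL && out != NULL);
	switch (w->resume) {
	case 0:
		for (w->i = 1; w->i <= 3; w->i++) {
			*out = w->i;
			w->resume = 1;
			return 1;	/* suspend */
	case 1:		;		/* resume here, inside the loop */
		}
		w->resume = 2;
		/* fall through */
	case 2:
		return 0;
	}
	return 0;
}
```

The case label placed inside the for loop is legal C; it is the same
property of switch that Duff's device relies on.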

Stroustrup (and Tiemann) are supposed to be working on neater ways to
implement non nested CONTROL transfer, just as C++ classes can be used to
implement non nested environments.

    I have no good solution except to take the plunge UPWARD to C++.

Highly recommended, I have already done that, with special thanks to g++.
-- 
Piercarlo "Peter" Grandi			INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science	UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)

mat@mole-end.UUCP (Mark A Terribile) (12/24/88)

> >The difficulty in C is that you can't force an instance of a struct to
> >be initialized when it is declared, especially if it is declared locally.
 
> Err, umm, better make that "*only* if it is declared locally".  [e.g.]
 
> 	struct foo {
> 		int a;
> 		char *b;
> 		float c;
> 	} bar = { ...

I should have written more carefully.

There is no way to declare a struct template such that EVERY INSTANCE of
that struct will be initialized AS SPECIFIED IN THE DECLARATION.
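
The closest C convention is probably an initializer macro kept next to the
struct declaration. The sketch below assumes an ANSI C compiler, which
accepts initializers on automatic aggregates; the names range, RANGE_INIT
and range_remaining are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct range {
	int next;
	int limit;
};

/* By convention only: every declaration is expected to use this. */
#define RANGE_INIT { 0, 10 }

/* Number of values not yet produced. */
int range_remaining(const struct range *r)
{
	assert(r != NULL);
	return r->limit - r->next;
}
```

A declaration then reads "struct range r = RANGE_INIT;". Nothing stops
anyone from writing "struct range r;" instead, which is exactly the
complaint: the compiler does not enforce the convention.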

> >Because there is no automatic aggregate initialization,

> Automatic initialization is often (always?) just syntactic sugar.  You
> can do [a sequence of assignments].

True.  So are compound statements.  (You can always use gotos).  A datum
with an initializer tends to read somewhat differently than a datum
initialized by assignments to multitudes of sub-data.

On a different aspect of the topic,

>> ... using statics ... you can only write single instance generators, and
>> ... this single instance is indeed not reusable if you don't refresh it.
 ...
>> The real solution would be to have the state of the generator in a struct,
 ...
>... this [is] what LISP programmers know as a 'function closure', or 'closure'
>for short:  an object that consists of a procedure, plus an associated
>execution environment ...  Unfortunately, LISP (and its relatives) is the
>only language I know of that supports explicitly the notion of a closure ...

Actually, your description of a closure could easily be applied to a C++ class.

> ... It is possible to provide such a mechanism in C ... for a specific ...
>case -- in order to provide a general-purpose function closure mechanism for
>C, you'd have to monkey ... with either the [OS] or the run-time environment.

C++ requires neither.
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile