[comp.sys.amiga] IPC - Standard Message Blocks

shf@well.UUCP (Stuart H. Ferguson) (04/15/88)

A Comparison of Different Standard Message Formats

There have been a variety of proposals for a standard message structure
which allows any number of blocks of data of any type.  The three I
compare are nested chunks, entry list, and linked list of chunks. Nested
chunks structure is basically a simple IFF file in memory.  Entry list
structure has an array of pointers to entries.  Linked chunk list is
just a plain old-fashioned linked list of chunks.  For the purpose of
comparing the efficiency and ease of use of the different techniques, I
write a function which takes a message and returns a pointer to the
entry or chunk of type "ctype."  I'm leaving out the type definitions
for brevity -- hopefully that won't render things completely
incomprehensible. 

Nested chunk list (or any other in-line memory structure):
---------------------------------------------------------

struct Chunk *FindChunk (mess,ctype)
	struct StdMsg *mess;
	long ctype;
{
	char *ptr;
	int i;

	ptr = (char*) mess->first_chunk;
	for (i=0; i<mess->number_of_chunks; i++) {
		if (((struct Chunk*)ptr)->ctype == ctype) return ptr;
		ptr += ((struct Chunk*)ptr)->chunk_size;
	}
	return NULL;
}

Entry list:
----------

struct Entry *FindEntry (mess,ctype)
	struct StdMsg *mess;
	long ctype;
{
	struct Entry *ptr;
	int i;

	for (i=0; i<mess->number_of_entries; i++) {
		ptr = mess->entry_array[i];
		if (ptr->ctype == ctype) return ptr;
	}
	return NULL;
}

Linked Chunk list:
-----------------

struct Chunk *FindChunk (mess,ctype)
	struct StdMsg *mess;
	long ctype;
{
	struct Chunk *ptr;

	for (ptr=mess->first_chunk; ptr; ptr=ptr->next_chunk)
		if (ptr->ctype == ctype) break;
	return ptr;
}


So - suprise! - the winner in minimum code complexity is the linked list
design.  This design also wins with regard to ease of changing the data
in a message.  Changing either of the other two involves allocating new
memory, copying old data to the new memory and de-allocating the old
memory.  With a linked list, adding or removing nodes is as simple as
swapping pointers.  It may not seem like a big deal, but de-allocating
memory can incur substantial overhead, since the system takes some time
to merge the freed memory back into the available memory lists to reduce
fragmentation. 

It's hard to beat good old-fashioned linked lists -- especially
Exec-style linked lists with no special cases for insertion and
deletion.  Linked lists: a fast, reliable, no-limits design.  Not just a
good idea -- a great idea! 

This message has been brought to you by the Linked List Avisory Board. 
-- 
		Stuart Ferguson		(shf@well.UUCP)
		Action by HAVOC		(shf@Solar.Stanford.EDU)

pete@violet.berkeley.edu (Pete Goodeve) (04/23/88)

Oh oh! Looks like the blind men have their fingers on the elephant again...

In article <5699@well.UUCP> Stuart H. Ferguson writes:

> A Comparison of Different Standard Message Formats
>
> There have been a variety of proposals for a standard message structure
> which allows any number of blocks of data of any type.  The three I
> compare are nested chunks, entry list, and linked list of chunks. [...]

[He then goes on to show code for finding a particular "chunk type ID" with
each of the formats, and finds that a linked list is a little faster in
this situation than the IPCMessage format we've been suggesting.]

It's a little disheartening to see people totally ignore the last two
months of wide ranging discussion, and try to demonstrate a minor supposed
disadvantage of the format we've settled on (the "Entry" or "Item" array)
in one particular case.

What we've been trying to do over the past couple of months is come up with
a single format that covers all the situations we've been able to think of.
The Pete(r) IPCMessage format has evolved from this over a long series of
iterations.

There are quite a few reasons for choosing the compactness of a (variable
size) array over a linked list.  The main one is probably that your average
message WON'T have an arbitrary series of items, but rather one or two
fixed fields that the server program won't have to search for at all.  The
IDs serve two purposes in such a case: first as a validity check, and
also as a secondary selection mechanism for variant fields.  If a list is
more suitable in a particular case -- a list of file names, for example --
there is nothing to stop an item pointing to one.  Long, complex, messages
seem not in the spirit of IPC though: better to send a string of short and
to-the-point ones that can be handled quickly.

Another major reason is that the "control information" for an item, and the
data it points to, must in general be kept separate.  There's no guarantee
that there would be any space for a list node (or any of the control
information) at the head of a piece of data -- it might be embedded in
something else.  And we want to avoid copying, remember?

                           + + + + + + + +


Stuart also stacked the deck a little when he showed code for finding an
item in our structure:
        ....
>         for (i=0; i<mess->number_of_entries; i++) {
>                 ptr = mess->entry_array[i];
>                 if (ptr->ctype == ctype) return ptr;
>         }
        ....

Now I would do it more like this (using Stuart's -- not our -- notation):

    ....
    struct Entry *ptr;
    int i;
    for (i = mess->number of entries, ptr = mess->item_array; i--; ptr++)
         if (ptr->ctype == ctype) return ptr;
    ....

(remembering that the increment value of a pointer in C is equal to the
size of its type).  This involves a simple addition -- no array
computations -- which is just about as fast as address chain following.
Sure, there are probably a couple more machine cycles involved, and you do
need the counter as well, but, heck, you're talking such minor savings for
a function that will be insignificant compared to all the other things
you'll want to do with the contents of a message.

                           + + + + + + + +


>  [.....]    Changing either of the other two involves allocating new
> memory, copying old data to the new memory and de-allocating the old
> memory.  With a linked list, adding or removing nodes is as simple as
> swapping pointers.  [......]

Dunh hunh?  Changing the SIZE of a piece of data in ANY scheme will involve
allocating memory (and copying, if some of the old data is to be retained
in the new chunk).  Of course in the first scheme -- using a sequential
"IFF" type format -- any such change would mean regenerating the whole
thing, but both the Item and Linked-list approaches have almost exactly the
same overhead.

To make such a change to an entry in a list, you allocate memory for the
new node, copy over any of the old data needed and create the revised data,
modify two node pointers appropriately, and deallocate the old chunk.

To make such a change to an Item, you allocate memory for the new data
block, copy over any of the old data needed and create the revised data,
modify one pointer (in the "Entry" structure) appropriately, and deallocate
the old chunk.

Now, look at those carefully.  Which one wins?

And another minor, but relevant, point.  At some point the server is
(usually!) going to reply the message, and the client is going to want to
know what happened to the stuff it sent over.  If some of it was simply
snipped out of a linked list, it is just gone, with no record.  On the
other hand in our protocol an Item's pointer is set to NULL if its data is
removed (or taken over by the server).  (If it is simply altered, things
are different of course; we may want flags in the status word to indicate
such happenings.)

                           + + + + + + + +

Listen, I think linked lists are great, too.  I use them everywhere.  But
no one construct is right for everything.  I'll try to select the best one
for the job, taking ALL the relevant factors into account.  (I even
actually descended to using a "goto" today [...My GOD, How will I ever hold
my head up again...?].)

                                                    -- Pete --

BTW -- I'm moving my future postings on IPC over to the *official*
comp.sys.amiga.tech (cross-posting days are done!).  I hope everyone can
now join me there...?

peter@sugar.UUCP (Peter da Silva) (04/25/88)

In article ... pete@violet.berkeley.edu (Pete Goodeve) writes:
> Oh oh! Looks like the blind men have their fingers on the elephant again...

Kinky.

> And another minor, but relevant, point.  At some point the server is
> (usually!) going to reply the message, and the client is going to want to
> know what happened to the stuff it sent over.  If some of it was simply
> snipped out of a linked list, it is just gone, with no record.  On the
> other hand in our protocol an Item's pointer is set to NULL if its data is
> removed (or taken over by the server).  (If it is simply altered, things
> are different of course; we may want flags in the status word to indicate
> such happenings.)

Actually, I think we want a flag for takeovers, too. I can see cases where the
PTR field is going to be NULL. Didn't you get my last message (knew I shoulda
saved it).

----------

On another track, have you had a look at the midi library? I think it's a good
standard for its purpose, and can take some of the load.
-- 
-- Peter da Silva      `-_-'      ...!hoptoad!academ!uhnix1!sugar!peter
-- "Have you hugged your U wolf today?" ...!bellcore!tness1!sugar!peter
-- Disclaimer: These aren't mere opinions, these are *values*.

pete@violet.berkeley.edu (Pete Goodeve) (04/28/88)

In article <1888@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>
>Actually, I think we want a flag for takeovers, too. I can see cases where the
>PTR field is going to be NULL. Didn't you get my last message (knew I shoulda
>saved it).
>
Yes, I got it (I think!).  I agree that we need that too.  Sorry if I
wasn't clear again.  I've got a complete spec worked out now, which I was
about to post, but I see someone else (Stuart) has beaten me to it (in .tech)
Have to look at that first, I guess! (:-))
>----------
>
>On another track, have you had a look at the midi library? I think it's a good
>standard for its purpose, and can take some of the load.
Arggh! One of the things I haven't had time to look at yet. Guess I'd better.

							-- Pete --