pete@violet.berkeley.edu ( Pete Goodeve ) (03/06/88)
In an earlier message [Subject: Re: the good old "tools vs. integrated
systems" debate] I suggested that we needed some thought on the business of
communicating between programs using ports and messages. So here's the way
I've been thinking, anyway... Apologies for it getting a mite long.

By the way, I forgot to mention AREXX last time, and everybody promptly set
me straight on that. I'm not sure that AREXX is quite what I'm looking for,
though. I'm not even sure how relevant it is to the points that I'm
concerned with at the moment. It's hard to tell, because I haven't had a
chance to look at it yet. All I know is what I can remember from Bill
Hawes' talk at BADGE, and from reading a REXXText. It's obviously a very
comprehensive command language, for situations where you need that sort of
thing [but NOT a pretty language... my impressions on its structure from
the manual were definitely un-positive]. I don't think it's necessarily
particularly incompatible with my concerns here either.

The thing I feel is most important is that when one program gets a message
from another -- via AREXX or wherever -- there should be a standard way for
it to know what to expect as the contents of that message and how to handle
it. Remember that we're not just talking command lines here: there are all
sorts of things we might want to pass between programs.

I was saying before that there are several levels that we have to look at,
from the basic use of ports up to maybe very complex information protocols.
Let's start at the bottom and work up, as even the basic matter of messages
and ports seems to need some standardization.

First of all, people still get it wrong. VideoScape-3D is an example: it
supposedly can be synchronized to an external task if it finds a couple of
public ports declared; the trouble is, both task- and reply-ports have to
be created (and therefore owned) by the external task, and so there's no
way that the Waiting V-3D can be signalled when the task replies!
Obviously if even a top notch guy like Allen can slip up (he admits he was
in a hurry...(:-)) there's need for an expansion on the good ol' RKM. Even
the KickStart Guide skimps a bit here, I think.

Things aren't helped by some limitations in the design of the message/port
system. It's fine for a group of tasks that are well behaved towards each
other, but there are loopholes when two independent programs want to
communicate with each other. Consider one program that has a desire to
send a message to another program's named port, IF THAT EXISTS; one
sequence you might think of is:

        ....
        his_port = FindPort("his_port");
        ....
        while (appropriate) {
            .....
            if (his_port != NULL) PutMsg(his_port,mymsg);
            ....
        }
        ....

You can see that this has problems, because "his_port" could go away at any
time and the program wouldn't know about it. Even if it did a FindPort
each time, this still isn't clobber-proof because it COULD happen that the
port went away just at that moment. The only perfect method is to always
use the sequence:

        Forbid();
        his_port = FindPort("his_port");
        if (his_port != NULL) PutMsg(his_port,mymsg);
        Permit();

This is cumbersome, and has quite a bit of overhead each time, because
FindPort has to search. Obviously we want something a little better for a
general protocol. I can suggest two possible paths to take. The first is
straightforward; the second needs more work, but I can see advantages.

The simple approach is to decide on a standard protocol in which, to open
communication, the sending program does something like:

        Forbid();
        his_port = FindPort("his_port");
        if (his_port != NULL) PutMsg(his_port,LOCKMSG);
        Permit();

The program owning "his_port" notes (and replies to, naturally) the
LOCKMSG, and will promise not to go away until it receives a corresponding
RELEASEMSG from the sender. Then the sender can use the his_port handle
with impunity until it wants to end the conversation.
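[Editorial sketch] The LOCKMSG/RELEASEMSG promise described above amounts to a reference count kept by the owner of "his_port". What follows is a minimal portable-C simulation of the owner's bookkeeping only, assuming nothing about Exec: it makes no FindPort or PutMsg calls, and the names PortOwner, MsgKind, owner_handle and owner_may_exit are hypothetical, not part of any proposal in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds. The post names LOCKMSG and RELEASEMSG but
   assigns them no values; MSG_DATA stands for any ordinary message. */
enum MsgKind { MSG_LOCK, MSG_RELEASE, MSG_DATA };

/* Owner-side state for one public port. */
struct PortOwner {
    int  locks;          /* LOCKMSGs received and not yet released */
    bool wants_to_quit;  /* owner would like to remove its port */
};

/* Process one incoming message. Returns false on a protocol violation,
   i.e. a release or data message arriving while no lock is held. */
bool owner_handle(struct PortOwner *o, enum MsgKind kind)
{
    assert(o != NULL);
    switch (kind) {
    case MSG_LOCK:
        o->locks++;
        return true;
    case MSG_RELEASE:
        if (o->locks == 0)
            return false;
        o->locks--;
        return true;
    case MSG_DATA:
        return o->locks > 0;
    }
    return false;
}

/* The owner keeps its promise by staying alive until every lock is gone. */
bool owner_may_exit(const struct PortOwner *o)
{
    assert(o != NULL);
    return o->wants_to_quit && o->locks == 0;
}
```

In a real program the owner would call something like owner_handle for each message it takes off its port, and would only remove the port once owner_may_exit reports true. Counting locks rather than keeping a single flag is an assumption made here so that several senders can hold the port at once.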
It would be a good idea to also reserve a special cut-off message for the
"his_port" owner to send back if it wanted to end the conversation, to keep
things symmetrical.

By implication we have two ports, "his_port" and the reply port of the
first program. This should be all we need for two-way communication, as
long as the reply port is prepared to handle messages other than replies.
I don't see any difficulties here, and we don't want more ports around than
we need.

That's the first approach, and I can't really see any reasons not to adopt
it for situations where two programs want to talk directly to each other,
but I keep visualizing scenarios which beg for a little more flexibility.
I think I'm going to leave expanding on these scenarios for another time
(I'd like to avoid too many topics in one heading -- there's probably
already too much here, but nevah mind...). I'll just say for now that I
think they need some kind of central "message broker" process. This could
either be a process with a single recognized message port or a resident
library.

There's one other thing to look at first, and that's the format and content
of the messages being passed around. As I said in that earlier note, I
don't think we need anything very elaborate for a basic structure --
something like:

        struct IPCMsg {
            struct Message  ipc_Msg;
            ULONG           ipc_ID;
            /* Data follows... */
        };

ipc_ID would indicate the type of data concerned; ipc_Msg.mn_Length would
be (4 + length_of_data). I suggest that the ID should normally be a four
character string pretty much like an IFF ID word -- in fact we should
reserve "FORM" etc. for messages that ARE IFF objects.
To cover all situations, one should be allowed less than four characters,
but always left-justified (padding could be spaces or nulls -- I see no
preference); on the other hand if the first byte was ZERO, the ID would be
a "private" one agreed between the programs involved; maybe we should also
reserve codes 00000000-000000FF for universally agreed shorthands; two of
these could well be the LOCKMSG and RELEASEMSG above.

A couple of basic IDs that immediately come to mind are "TEXT", for simple
ASCII, and "CLI ", for DOS command lines (to a shell). Decisions on
particular IDs would be handled in the same way as IFFs are now -- through
a clearing house.

The rest of the structure should be filled in properly, too, and we should
be sure that the standard rules are followed. (I presume -- but haven't
checked -- that PutMsg sets the ln_Type to NT_MESSAGE, and ReplyMsg sets it
to NT_REPLYMSG. Anyone know for sure?) The receiving program should be
able to assume that if mn_ReplyPort is NULL it should dispose of the
message's memory when done; if it is non-NULL, of course, the message MUST
be replied (possibly with data and/or ID changed).

Yeah.. well that's quite enough for now. Nothing very exotic, but maybe
it'll be a start on a foundation we can build a flexible IPC protocol from.
I'll churn out my rationale for a message broker as soon as I can.
Meanwhile, I'll sit back and wait for the flak on this lot.

                                        -- Pete --
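[Editorial sketch] The ID rules in the article above (four characters, left-justified, first byte zero for private IDs, 00000000-000000FF reserved for shorthands) can be checked with a few lines of portable C. This is only an illustration under stated assumptions: padding uses spaces, matching the "CLI " example, although the post leaves spaces versus nulls open; the first character goes in the most significant byte, as in IFF; and the names ipc_make_id, ipc_classify, ipc_msg_length and IdClass are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three ranges the article distinguishes. */
enum IdClass { ID_SHORTHAND, ID_PRIVATE, ID_PUBLIC };

/* Pack up to four characters into a 32-bit ID, first character in the
   most significant byte, left-justified and padded with spaces. */
uint32_t ipc_make_id(const char *name)
{
    uint32_t id = 0;
    size_t i = 0;

    assert(name != NULL);
    for (int n = 0; n < 4; n++) {
        unsigned char c = ' ';            /* pad once the name runs out */
        if (name[i] != '\0')
            c = (unsigned char)name[i++];
        id = (id << 8) | c;
    }
    return id;
}

/* Decide which of the article's ranges an ID falls into. */
enum IdClass ipc_classify(uint32_t id)
{
    if (id <= 0xFFu)
        return ID_SHORTHAND;              /* 00000000-000000FF */
    if ((id >> 24) == 0)
        return ID_PRIVATE;                /* first byte is zero */
    return ID_PUBLIC;                     /* IFF-style four-character ID */
}

/* mn_Length as the article defines it: 4 bytes of ID plus the data. */
uint32_t ipc_msg_length(uint32_t data_len)
{
    return 4u + data_len;
}
```

With these assumptions "TEXT" packs to 0x54455854 and "CLI" to 0x434C4920, the same value as the padded "CLI ". A receiver could switch on the result of ipc_classify before looking at the data area.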
pete@violet.berkeley.edu ( Pete Goodeve ) (03/06/88)
In my previous article [IPC (1): Ports & Messages] I was arguing that we
need a properly defined protocol for passing messages between programs,
and I suggested that we might need something more general than
individually created links between specific programs. In this article I
want to look at that idea in more detail.
Ron Minnich's "tools-based hypercard" concept, which started this whole
discussion off, is just a nebulous glimmer at this stage, but if you
envisage it as I do it will be as a suite of small programs, any one of
which can call on any other to perform the task it is designed for, using
information provided by -- or generating information for -- the invoking
program. As the whole point is to keep the system modular and adaptable,
it would be a horrendous task to keep track of a network of ports belonging
to the individual programs.
It will be much cleaner if a module that wants something done can "say to
the air", so to speak, "I want information on this; does anyone have it?",
or "I have this new information: does anyone want it?". In other words, a
Broadcast message. Once a broadcast request has been answered, the two
programs can always set up private communication ports of course, but this
might not always be the way to go either. You can broadcast specific
information as well as requests, and the advantage of this is that you
don't have to be concerned how many others are listening: any program that
wants the information can pick it up and use it; if no one wants it, it
will just drift off into the ozone.
Another example: we have a "tools-oriented" publisher package (I wish!),
together with our favorite editor and paint package, which also have such
facilities [Stop dreaming, Peter...]. The publisher has no edit or paint
facilities of its own: it can read existing files, but all manipulation is
done by the other programs. Suppose we start up the publisher and ask it
to format a particular text file; it reads the file as it is at that
moment, but at the same time sets up to receive any new information that
might be generated about that file. We look at the results previewed on
the screen, and decide that we want to re-edit a paragraph, so we click on
the editor icon (on the workbench -- the publisher doesn't know about it
specifically). The editor first off sends out a message "What text file?",
which the publisher -- knowing which file we are working on -- will respond
to. Note that in the general case SEVERAL programs might respond, and the
editor would then pop up a requester asking the user to select one. The
editor then reads the original file (it won't try to work on the
publisher's data image -- no telling what format it is in...), and you can
start editing. Every significant change you make to the text (with some
appropriate atomicity) is put into a message which the editor tosses out.
The publisher is looking for these, so each time it updates its own preview
image suitably.
That's a sketch which might give you some idea of my dreams -- I've left
out most of the details, and ALL the snags. This sort of scenario could
also be handled by straight message passing between the two programs, but I
like the general idea of the editor, for instance, not having to know or
care what use is going to be made of the information it tosses out. The
program receiving it might be an incremental compiler rather than a
publishing package, or it could be just a communication link, sending it
(via DNET, naturally...) to some remote installation.
In this situation, there probably wouldn't be more than one other program
interested in a particular message, but how about a scientific data
acquisition module, continuously collecting data? You would want a
permanent record of it all on disk, perhaps compressed by a second module,
and you might also want to do some analysis (or a couple of kinds of
analysis) on it in real time as it arrived. Here there are several
modules, all working on the same stream of messages from the first module.
Closer to the needs of most people, there's been some desire expressed on
the net recently for VT100 to have "plug-in modules" for specific purposes.
The "Broadcast message" mechanism would be a natural for this. In fact I'd
go one further and just have a fairly dumb Communications Module (without
screen display or keyboard input) that simply sends text block messages it
receives out the serial port, and broadcasts incoming characters to whoever
wants them. If all messages ended up being one character long, we could
hit severe speed restrictions, but I think there may be ways around this.
I'll leave that for further thought, though.
Alright, let's get to some specifics. What I'm proposing is a "message
broker", which is built from a single public port with a process attached
[or a process with a port attached; whatever]. It could also be
constructed as a resident library, but for now let's go with it as a
process. Other programs would pass it messages regarding their interest in
a particular "topic" (using the standard "IPC" protocol defined in the
previous article). These messages would initially be one of: "I want to
know about this topic", or "I want to give information on this topic".
The topic is a text string -- whatever is suitable --, but this only has to
be used for the initial contact. In the ID word at the head of the message
data area the broker will return a handle value, to be used when actual
information is passed on that topic. Topics are kept (with their handles)
in a lookup list, which is added to as necessary. The requesting program
supplies a reply port as usual, and this is filed away by the broker under
that topic.
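[Editorial sketch] The topic lookup list described above could be as simple as a table of strings whose index yields the handle. The portable-C sketch below is an assumption-laden illustration, not the author's design: handles start at 0x100 so that they avoid the 00-FF shorthand range from the previous article while keeping the top byte zero, and the names TopicTable, topic_handle, MAX_TOPICS and MAX_NAME are hypothetical. Filing reply ports under each topic is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_TOPICS 16   /* arbitrary limits for the sketch */
#define MAX_NAME   32

/* The broker's lookup list: topic strings, indexed by handle - 0x100. */
struct TopicTable {
    char names[MAX_TOPICS][MAX_NAME];
    int  count;
};

/* Return the handle for `name`, adding the topic if it is new.
   Returns 0 if the name is too long or the table is full. */
uint32_t topic_handle(struct TopicTable *t, const char *name)
{
    assert(t != NULL && name != NULL);
    if (strlen(name) >= MAX_NAME)
        return 0;
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return 0x100u + (uint32_t)i;   /* already known */
    if (t->count == MAX_TOPICS)
        return 0;
    strcpy(t->names[t->count], name);      /* fits: length checked above */
    return 0x100u + (uint32_t)t->count++;
}
```

The string comparison happens only on first contact; after that a program passes the handle, which is the point of the scheme. The table must be zeroed (count = 0) before first use.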
When the broker receives a data message on a topic, it looks up -- using
the handle -- which reply ports should be informed, makes duplicates of the
message as necessary, and dispatches these. Normally the original message
will only be replied when all the spawned copies have been replied; the
sender could probably also specify that it wanted an immediate reply if the
message didn't contain any pointers to other memory. You won't be able to
pass information back in replied messages, though, because they're copies.
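[Editorial sketch] The rule that the original is replied only once every spawned copy has come back is a small counting problem. The portable-C sketch below models just that counter; the actual duplication and PutMsg/ReplyMsg traffic is omitted, and the names Original, broker_dispatch and broker_copy_replied are hypothetical. Replying at once when nobody is listening is an assumption, chosen to match the "drift off into the ozone" remark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Broker-side bookkeeping for one message received from a sender. */
struct Original {
    int  pending;   /* copies dispatched but not yet replied */
    bool replied;   /* has the original gone back to its sender? */
};

/* Record that `listeners` copies were sent out. If the sender asked for
   an immediate reply, or nobody is listening, the original is replied
   straight away. */
void broker_dispatch(struct Original *m, int listeners, bool immediate)
{
    assert(m != NULL && listeners >= 0);
    m->pending = listeners;
    m->replied = immediate || listeners == 0;
}

/* Call once for each copy that comes back. Returns true when this reply
   is the one that releases the original to its sender. */
bool broker_copy_replied(struct Original *m)
{
    assert(m != NULL && m->pending > 0);
    m->pending--;
    if (m->pending == 0 && !m->replied) {
        m->replied = true;
        return true;
    }
    return false;
}
```

When the sender requests an immediate reply the copies still have to be tracked so the broker can free them, which is why pending is counted in both cases.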
If we keep the handle less than 32 bits it can be one of the "private" IDs
discussed in the last article, and can go in the ipc_ID slot. The message
itself also needs a type ID, so that the receiving program will know how
to handle it, and this would presumably go into the next word. The data
area proper would follow this, so the message structure would be -- aside
from the extra word -- just the same as a standard IPC message.
        struct BrokerMsg {
            struct Message  brkr_Msg;
            ULONG           brkr_Topic;
            ULONG           brkr_ID;
            /* Data follows...*/
        };
With a scheme like this, a receiving program gets only the messages it is
interested in without having to know who or where the originator is. To me
this looks like a good foundation for a flexible network of programs. The
disadvantages as usual are in extra overhead, both in message structure and
the time taken in duplication and so on. So now tell me all the other
objections...
-- Pete --
rminnich@udel.EDU (Ron Minnich) (03/07/88)
In article <7409@agate.BERKELEY.EDU> pete@violet.berkeley.edu () writes:
>The rest of the structure should be filled in properly, too, and we should
>be sure that the standard rules are followed. (I presume -- but haven't
>checked -- that PutMsg sets the ln_Type to NT_MESSAGE, and ReplyMsg sets it
>to NT_REPLYMSG. Anyone know for sure?) The receiving program should be

I know for sure that it does not. I found that out the hard way with
internet.device, where i had to jam NT_MESSAGE into messages IN THE
DRIVER, even though the messages had been PutMsg'ed.

>Yeah.. well that's quite enough for now. Nothing very exotic, but maybe
>it'll be a start on a foundation we can build a flexible IPC protocol from.
>I'll churn out my rationale for a message broker as soon as I can.
>Meanwhile, I'll sit back and wait for the flak on this lot.

Two thoughts:
1) Seems like we also need a 'port broker'. Isn't that function served
   by PIPE:, or am i missing something fundamental? I.e. rather than
   use ports, just use PIPE: which strikes me as being more reliable?
   Or do we lose something valuable by not using ports?
2) As for the message broker, make it a library, then we in effect
   have multiple copies running for each client ( i am thinking
   as opposed to having a 'port manager' task which will be
   a bottleneck)
--
ron (rminnich@udel.edu)
pete@violet.berkeley.edu (03/09/88)
[This is a response to an article in comp.sys.amiga.tech. Because of the
uncertain distribution of this group, I'm cross posting to comp.sys.amiga.
-- Note to Ron: maybe we'd better keep any further discussion over there
for now!]

In article <1401@louie.udel.EDU> rminnich@udel.EDU (Ron Minnich) writes:
> In article <7409@agate.BERKELEY.EDU> pete@violet.berkeley.edu () writes:
> >The rest of the structure should be filled in properly, too, and we should
> >be sure that the standard rules are followed. (I presume -- but haven't
> >checked -- that PutMsg sets the ln_Type to NT_MESSAGE, and ReplyMsg sets it
> >to NT_REPLYMSG. Anyone know for sure?) The receiving program should be
> I know for sure that it does not. I found that out the hard way with
> internet.device, where i had to jam NT_MESSAGE into messages IN THE
> DRIVER, even though the messages had been PutMsg'ed.

Damn. Oh well. Thanks. We'll just have to enforce the rules ourselves,
then.

> Two thoughts:
> 1) Seems like we also need a 'port broker'. Isn't that function served
>    by PIPE:, or am i missing something fundamental? I.e. rather than
>    use ports, just use PIPE: which strikes me as being more reliable?
>    Or do we lose something valuable by not using ports?

I'm not sure I follow your line of reasoning there, but there are some
fundamental problems I see in basing everything on pipes: they're basically
byte serial, which is very inconvenient for structured data, and even more
important they involve buffering and therefore COPYING of data -- a lot of
overhead. Messages on the other hand don't move -- they're just inserted
in a list, so making a lot of data available to another task can be very
fast. For a general IPC scheme, I think we have to be concerned with
speed.

I'm not saying we should discard pipes: they're especially useful when the
data is normally serial-file based... well, I don't need to tell you that:
this all started with "tools.. etc." didn't it...

> 2) As for the message broker, make it a library, then we in effect
>    have multiple copies running for each client ( i am thinking
>    as opposed to having a 'port manager' task which will be
>    a bottleneck)

I agree totally. At least in principle. I went the port route because I
thought that the concept was clearer that way, but I'm sure a library is
the way to go in the long run. For a start it removes all danger of the
FindPort hazard we've been talking about. On the other hand, though, I'm
not sure that a manager process would in fact be a bottleneck: after all
this is a one-task-at-a-time machine really (sigh)!

                                        -- Pete --
rminnich@udel.EDU (Ron Minnich) (03/09/88)
In article <7543@agate.BERKELEY.EDU> pete@violet.berkeley.edu.UUCP ( Pete Goodeve ) writes:
>I'm not sure I follow your line of reasoning there, but there are some
>fundamental problems I see in basing everything on pipes: they're basically
>byte serial, which is very inconvenient for structured data, and even more
>important they involve buffering and therefore COPYING of data -- a lot of
>overhead. Messages on the other hand don't move -- they're just inserted
>in a list, so making a lot of data available to another task can be very
>fast. For a general IPC scheme, I think we have to be concerned with
>speed.

Good point. What i am after is a PORTPIPE: i guess- a port manager with no
copying that has the attributes of files- namely, a third party manager
just as in PIPE:. Then people can safely close them, etc. without worrying
about giving other people bad pointers. I am real unhappy with taking the
existing port mechanism just as is- i think we have got to make this
reliable, if only so Jerry Pournelle won't flame us.
ron
--
ron (rminnich@udel.edu)
cmcmanis%pepper@Sun.COM (Chuck McManis) (03/10/88)
In article <7543@agate.BERKELEY.EDU> ( Pete Goodeve ) writes:
>[This is a response to an article in comp.sys.amiga.tech. Because of the

[Aren't 200 articles as good as 200 'yes' votes?]

> I'm not sure I follow your line of reasoning there, but there are some
> fundamental problems I see in basing everything on pipes: they're basically
> byte serial, which is very inconvenient for structured data, and even more
> important they involve buffering and therefore COPYING of data -- a lot of
> overhead. Messages on the other hand don't move -- they're just inserted
> in a list, so making a lot of data available to another task can be very
> fast. For a general IPC scheme, I think we have to be concerned with
> speed.

Have you considered using a new mechanism that uses a buffer pool that
your task doesn't "own"? It works like this: you call "OpenIPCPort()"
and the library does, and allocates a unique port for you. Then you
Put messages on it like a PIPE: by writing them in a block. The library
copies them to a buffer (preallocated) and signals whoever you sent it to.
Or you can allocate a buffer with "GetIPCBuffer()" and save the allocation
step.

What this buys you is that, a) there are no longer any resources that the
task has to free; they are all "owned" by the library. Thus sudden death
syndrome does not involve a loss of memory. b) If there was an MMU in
there, both tasks could map the buffer pool as shared memory and not worry
about address mismatches.

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis   ARPAnet: cmcmanis@sun.com
These opinions are my own and no one elses, but you knew that didn't you.
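[Editorial sketch] The benefit claimed for a library-owned buffer pool, that a task dying suddenly leaks nothing, comes down to the library knowing which task holds each buffer. The portable-C sketch below models only that ownership table, with integer task ids standing in for real tasks. OpenIPCPort() and GetIPCBuffer() are named in the post but never specified, so nothing here should be read as their interface; IPCPool, pool_init, pool_get, pool_send and pool_reap are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8      /* arbitrary size for the sketch */
#define NO_OWNER  (-1)

/* The library's view of its preallocated buffers: who holds each one. */
struct IPCPool {
    int owner[POOL_SIZE];   /* task id holding the buffer, or NO_OWNER */
};

void pool_init(struct IPCPool *p)
{
    assert(p != NULL);
    for (int i = 0; i < POOL_SIZE; i++)
        p->owner[i] = NO_OWNER;
}

/* Hand the first free buffer to `task`; returns its index, or -1 if the
   pool is exhausted. */
int pool_get(struct IPCPool *p, int task)
{
    assert(p != NULL && task >= 0);
    for (int i = 0; i < POOL_SIZE; i++) {
        if (p->owner[i] == NO_OWNER) {
            p->owner[i] = task;
            return i;
        }
    }
    return -1;
}

/* Pass a held buffer to another task; only the ownership tag changes. */
void pool_send(struct IPCPool *p, int buf, int to_task)
{
    assert(p != NULL && buf >= 0 && buf < POOL_SIZE && to_task >= 0);
    assert(p->owner[buf] != NO_OWNER);
    p->owner[buf] = to_task;
}

/* Reclaim every buffer a dead task still held; returns how many. */
int pool_reap(struct IPCPool *p, int task)
{
    int freed = 0;

    assert(p != NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        if (p->owner[i] == task) {
            p->owner[i] = NO_OWNER;
            freed++;
        }
    }
    return freed;
}
```

A library built this way would call something like pool_reap from whatever hook tells it a client has gone away; how it would learn of that on the Amiga is not addressed in the post and is not assumed here.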
rminnich@udel.EDU (Ron Minnich) (03/10/88)
In article <44811@sun.uucp> cmcmanis@sun.UUCP (Chuck McManis) writes:
>Have you considered using a new mechanism that uses a buffer pool that
>your task doesn't "own". It works like this, you call "OpenIPCPort()"
>and the library does, and allocates a unique port for you. Then you
>Put messages on it like a PIPE: by writing them in a block. The
> (and more good stuff )

This is pretty much what i was after with a PORTPIPE: device. Also tasks
could indicate that they are willing to give up ownership of the port in
which case the other task would get it for a while, and, ..., no data need
be copied. In other words PORTPIPE: could run in a 'data copy mode' or,
for processes that were willing to be nice and wanted to buy the
performance, they could operate in a 'quid pro quo' mode (much as messages
do now) where you give up ownership of the shared resource and gain
performance thereby.
--
ron (rminnich@udel.edu)
dillon@CORY.BERKELEY.EDU (Matt Dillon) (03/17/88)
:In article <1401@louie.udel.EDU>, rminnich@udel.EDU (Ron Minnich) writes:
:> 2) As for the message broker, make it a library, then we in effect
:> have multiple copies running for each client ( i am thinking
:> as opposed to having a 'port manager' task which will be
:> a bottleneck)
:
:A message manager would be a bottleneck. A port manager wouldn't be, since
:it's only accessed to get the port and to release it. However, Matt Dillon's
:adfe ports would allow you to put the port manager in a library relatively
:easily.

One would put in the library any functions that would otherwise take a
large amount of code, should be in the library for future compatibility
issues, or provide logical high level constructs. You ONLY get a
bottleneck for things requiring sequencing (whether it is in a library or
not).

So, for instance, the extended port structure which handles simple id'ing
and a reference count would be hidden by the functions in the library that
deal with it, allowing for future expansion.

                                        -Matt
SLMYQ@USU.BITNET (04/05/88)
All this IPC stuff seems to be pretty neato, but my question is, how will application programmers use it? It would be nice if a lowly mortal like me could access this without going to great pains. It should in a way be as simple as the AmigaDOS library routines. Also, it would be nice to be able to get on the serving end easily. Everybody who's tried knows how hard it is to make your own AmigaDOS device driver (BCPL? Blachk!) or an Exec device. I think it should be message based internally, but to the application programmer it would just be a set of routines to call using "IPC handles" returned by "IPC Open" routines or whatever. I don't know if this is the purpose you're thinking of this stuff to be for, but it would be nice for my program to call up the IPC library and say, "Hey, I need this data compressed in 12-bit Lempel-Ziv format", or "Hey, I want this FTXT file formatted into an ILBM picture".
SLMYQ@USU.BITNET (04/06/88)
I don't think we really need a *method* of interprocess communication - messages and ports are plenty for most anything. We don't need to define a standard on *how* things are passed between processes. It's *what* we pass to *who*, and how to find whoever. After all, you don't just send a message to any port in the public port list, saying, "Hi! I'm DeluxePaint!". Bryan Ford (SLMYQ@USU.BITNET)
SLMYQ@USU.BITNET (04/14/88)
OK, since this IPC discussion is not really getting anywhere, it's time to
start defining what we need. Has Usenet EVER finished defining something?
Anyway, this is just an abstraction of what I think we need. I'm posting
this to comp.sys.amiga.tech because THAT'S WHERE IT BELONGS!

I think there should be two basic types of IPC: user-controlled IPC and
program-controlled IPC.

User-controlled IPC

In user-controlled IPC, the user provides or defines the link between
processes or tasks. Piping would be considered user-controlled, because
the user sets up the pipe. The Clipboard would also be user-controlled
IPC, but indirect - a program first posts something to the clipboard, and
later another one reads it. They communicate with each other through the
clipboard, which the user controls.

These are two good examples of user-controlled IPC already built into the
Amiga. However, piping has some serious limits (stream-only, CLI-only, and
mostly text), and the Clipboard is hard to use and indirect. We need
something that can pass any kind of data, including structured data, works
*efficiently* with the Workbench, and is easy to use for the programmer.

Program-controlled IPC

This is the kind of IPC that the user doesn't need to know about, such as
devices, messages and ports, etc. This kind of IPC works between two
processes - the server and the client. The client calls on the server to
perform a service. I think there are three forms of major services -
output, conversion, and input/editing.

Output

The printer device could be an example of an output service - it outputs
data to the printer. A "show picture" service would open a screen, display
the specified picture, wait for the user to tell it that he's done, close
the screen, and signal the client.

Conversion

Conversion services would convert data from one format to another. The
Translator would be considered a conversion service - it converts from
English to phonetics. One "standard" conversion service I would like to
see is a compression service which compresses or decompresses data in a
variety of ways, like run-length and Lempel-Ziv encoding.

Editing

Most applications would go into the input/editing category. A painting
program would be a "picture editor", and a music program would be a "music
editor". Their service to the client would be to let the user edit
something under the client's supervision.

Although this may seem quite useless, consider this. Many people,
including me, are against compilers including integrated editors. I prefer
to use my own favorite editor than be confined to the limits the compiler's
editor imposes. The good thing about integrated editors is that the
compiler "controls" the editor, so it can bring up the editor at a certain
line, saying "Syntax error", and the user would know exactly where the
problem was. Certain keys could be defined in the compiler's interest, for
example, a "next-error" key. Same with menus. If we carefully define this
kind of IPC, we can create very flexible environments for the user.

Obviously, text editors would have the biggest job in this category, but
that doesn't mean other editors wouldn't want IPC. For example, an
animation system might want to have IPC so specific external programs can
create special effects, such as neato explosions or fractal animation.

Anyway, this is just a big jumble of ideas, and you won't find a single
drop of code or structure definitions here. I'm just trying to define the
various uses for IPC. We'll have to mess with it a bit, and maybe we can
come up with something useful.

Bryan Ford (SLMYQ@USU.BITNET)