gnb@bby.oz (Gregory N. Bond) (10/04/89)
This is on Sun 3/60 and Sun 3/260, running SunOS 3.5, soon to be on a Solbourne server, running SunOS 4.0.3 all round.

I have an application that occasionally requires lookups to a database. These lookups are often quite lengthy, taking up to a minute or more. As the database code is quite large, and a lot of processing code is associated with it, I'd rather not include all this code in every copy of the application. (Not to mention that a database backend for each copy of the application would murder the server in no time.)

So I have used the Sun RPC compiler to create a database server task that runs on the main machine, and an RPC client library to link into the application.

Now this works quite well for the small fast calls, as they are usually answered within the timeouts. However, on the more complex queries (and it is not easy to tell what's complex from the client, as a lot of complexity is hidden in the server), the RPC times out and resends. This banks up extra work for the server that isn't necessary, doing one difficult job 5 times. What is worse, the RPC client routines often time out totally and return an error, even though the server is happily chugging away on the request.

Also, if the server machine is busy, or the server process has been swapped etc, or there is lock contention for the tables, then even the short queries time out and fail, only to work when immediately retried. This is quite unfriendly for the (largely non-technical) user base!

I have tried using the TCP transport, which one would have thought meant that timeouts were not appropriate, but that doesn't seem to be the case. Setting very large timeouts is also not an option, as I would like quick indication that the server is unavailable.

It seems that Sun RPC is designed for many small and quick calls (e.g. NFS), rather than large slow calls (as in my example, or for things like numerical compute engines inverting large matrices). This is shown by the heritage of using UDP and timeouts rather than the flow control and reliability in TCP.

Does anyone have any idea how to approach this sort of RPC application? Is there some trick or section of TFM that I have overlooked? Does anyone have (oh joy oh bliss) some code for this type of RPC work? Or do I drop Sun RPC entirely as the wrong tool and handcraft something using "raw" TCP sockets?

Greg.

The network sort of computes!
--
Gregory Bond, Burdett Buckeridge & Young Ltd, Melbourne, Australia
Internet: gnb@melba.bby.oz.au non-MX: gnb%melba.bby.oz@uunet.uu.net
Uucp: {uunet,pyramid,ubc-cs,ukc,mcvax,prlb2,nttlab...}!munnari!melba.bby.oz!gnb
Kemp@DOCKMASTER.NCSC.MIL (10/06/89)
Gregory Bond writes:
> [description of RPC based database application that sometimes
> takes a long time for lookups]
>
> Does anyone have any idea how to approach this sort of RPC
> application? [...] Or do I drop Sun RPC entirely as the wrong
> tool and handcraft something using "raw" TCP sockets?

No need to use sockets. However, you are correct in stating that lengthening the RPC timeouts is a bad idea. The accepted solution to this very common situation is to use callback RPC's, in a scenario something like this:

1. Client requests an action
2. Server does whatever can be guaranteed to finish quickly (checking request validity, user authentication, whatever) and acknowledges the request
3. Server does the lengthy job (computation or database lookup) and when it's finished, sends a callback RPC to the client

This is analogous to asynchronous I/O, and requires a similar amount of care by the programmer to do the job right. Nonetheless, it *is* the right way to do it. I am constantly annoyed by people who write Sunview programs that do lots of computation in notifier event routines, instead of just starting a job and returning. What you are left with is a window that just sits there for seconds or minutes, refusing to refresh itself.

If you want to get fancy, you could have a 'progress meter' that gives the user some feedback as to how his database query is progressing (and lets him know if the database server or the net goes down), and gives him some opportunity to abort the job if it is taking too long or not going in the right direction.

Dave Kemp <Kemp@dockmaster.ncsc.mil>
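
[Editor's sketch] The three-step callback scenario above can be illustrated with a minimal in-process simulation. It is not Sun RPC code: `CallbackServer`, `request`, `on_done` and `demo` are hypothetical names, a background thread stands in for the server doing the lengthy job, and a plain function call stands in for the callback RPC.

```python
import threading
import time


class CallbackServer:
    """Stand-in for the database server; all names here are hypothetical."""

    def __init__(self):
        self._next_id = 0
        self._lock = threading.Lock()

    def request(self, query, on_done):
        # Step 2: do only the quick checks, then acknowledge at once.
        if not query.strip():
            raise ValueError("empty query")
        with self._lock:
            self._next_id += 1
            job_id = self._next_id
        # Step 3: the lengthy job runs in the background and reports
        # back through the callback (the "callback RPC" in the post).
        worker = threading.Thread(
            target=self._run, args=(job_id, query, on_done), daemon=True
        )
        worker.start()
        return job_id  # the acknowledgement

    def _run(self, job_id, query, on_done):
        time.sleep(0.05)  # placeholder for the slow lookup
        on_done(job_id, "rows for " + query)


def demo():
    results = {}
    finished = threading.Event()

    def on_done(job_id, rows):
        results[job_id] = rows
        finished.set()

    server = CallbackServer()
    job_id = server.request("select * from accounts", on_done)  # Step 1
    # The client is free to do other work here (redraw, progress meter).
    completed = finished.wait(timeout=5.0)
    return job_id, completed, results.get(job_id)
```

In a real deployment the client would have to register itself as an RPC server to receive the callback, which is the extra care the post refers to.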
mishkin@apollo.HP.COM (Nathaniel Mishkin) (10/06/89)
In article <GNB.89Oct4171744@baby.bby.oz> gnb@bby.oz (Gregory N. Bond) writes:
>It seems that Sun RPC is designed for many small and quick calls (e.g.
>NFS), rather than large slow calls (as in my example, or for things
>like numerical compute engines inverting large matrices). This is
>shown by the heritage of using UDP and timeouts rather than the flow
>control and reliability in TCP.
>
>Does anyone have any idea how to approach this sort of RPC
>application? Is there some trick or section of TFM that I have
>overlooked? Does anyone have (oh joy oh bliss) some code for this
>type of RPC work? Or do I drop Sun RPC entirely as the wrong tool and
>handcraft something using "raw" TCP sockets?

You could use NCS RPC (available for Suns from HP/Apollo). NCS uses UDP, but you don't supply timeout values. Instead, NCS itself pings the server periodically (and with lower frequency as the call proceeds without problems). If the server becomes unresponsive, the client finds out about it. There are various other technical differences which I won't enumerate, for fear of being accused of self-promotion.

--
Nat Mishkin
Hewlett Packard Company / Apollo Systems Division
mishkin@apollo.com
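
[Editor's sketch] The idea of probing the server rather than imposing a fixed call timeout can be sketched as below. This is not NCS's actual algorithm or API: the function name, the doubling schedule, the interval cap and the missed-ping limit are all assumptions chosen for illustration. The `is_done`, `ping` and `sleep` callables are injected so the loop can be exercised without a network.

```python
def wait_with_pings(is_done, ping, sleep, first_interval=1.0,
                    max_interval=30.0, max_missed=3):
    """Wait for a long call, probing the server instead of using a fixed timeout.

    is_done() -> bool: has the reply arrived?
    ping() -> bool: did the server answer a liveness probe?
    sleep(seconds): injected so the sketch can run without real waiting.
    Returns "done" or "server unresponsive".
    """
    interval = first_interval
    missed = 0
    while not is_done():
        sleep(interval)
        if ping():
            missed = 0
            # Server is alive and working: probe less often from now on.
            interval = min(interval * 2, max_interval)
        else:
            missed += 1
            if missed >= max_missed:
                return "server unresponsive"
    return "done"
```

With this shape a slow but healthy server is never resent the request, while a dead one is detected after a few short probes rather than after one very long timeout.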
ka@cs.washington.edu (Kenneth Almquist) (10/08/89)
> Gregory Bond writes:
>> [description of RPC based database application that sometimes
>> takes a long time for lookups]

Kemp@DOCKMASTER.NCSC.MIL replies:
> The accepted solution to this very common situation is to use callback
> RPC's, in a scenario something like this:
>
> 1. Client requests an action
> 2. Server does whatever can be guaranteed to finish quickly (checking
> request validity, user authentication, whatever) and acknowledges
> the request
> 3. Server does the lengthy job (computation or database lookup) and
> when it's finished, sends a callback RPC to the client

Accepted by whom? The technical problems of implementing RPC systems are basically solved. The political problem of keeping SUN from foisting lousy software on the world is harder to solve, but it may be that Gregory can manage to get hold of another RPC system. (Nat Mishkin mentions one such system.) Lacking a good RPC system, I would be inclined to use sockets rather than SUN RPC. Kemp acknowledges of the kludge he describes:

> This is [analogous] to asynchronous I/O, and requires a similar amount
> of care by the programmer to do the job right.

Care of course translates into programmer time. It won't necessarily take any longer to build, in effect, your own specialized RPC mechanism on top of sockets. And this latter approach won't force you to mangle your code to mesh with the IPC mechanism.

Kenneth Almquist
weiser.pa@xerox.com (10/08/89)
An alternative approach to callbacks is to keep things under the control of the client using "status" calls. For instance, we use this in a document retrieval application here, like this:

Client initiates action, gets back a session handle and a completion estimate. The client uses the completion estimate to decide when to call back and see how things are going (using that same session handle), and gets another completion estimate. And so on. Clients that want progress reports more frequently call back more often--the completion estimate is just a hint.

This way RPC clients don't have to also know how to be servers, and/or pass around procedure call handles through RPC's (which would be the extension of the usual callback method to RPCs). Server code is also relatively simple--it is generally pretty easy to keep some kind of status report about how things are going, and to return completion estimate hints that take into account not only how long the thing is actually going to take but also how busy you are (and so how often you want to hear from the client).

Completion estimates are a three part field: seconds until done, integer that increases while there is progress, and percent done. Clients that want to give progress feedback to their users use the percent done field to show work getting completed.

Naturally, clients that can't take a hint can bring the system to its knees. But that is a universal.

-mark-
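
[Editor's sketch] The status-call scheme above, with its session handle and three-part completion estimate, might look roughly like this. It is a simulation under stated assumptions, not the document retrieval application's code: `Estimate`, `StatusServer`, `tick` and `poll_until_done` are hypothetical names, and `tick()` stands in for real work happening on the server between client calls.

```python
from dataclasses import dataclass
import itertools


@dataclass
class Estimate:
    seconds_until_done: float  # hint for when to ask again
    progress_counter: int      # increases while there is progress
    percent_done: int


class StatusServer:
    """Hypothetical server: each job advances one step per tick()."""

    def __init__(self, seconds_per_step=2.0):
        self._handles = itertools.count(1)
        self._jobs = {}  # handle -> [steps_done, steps_total]
        self._seconds_per_step = seconds_per_step

    def start(self, steps_total):
        # Initiate the action; return a session handle and first estimate.
        handle = next(self._handles)
        self._jobs[handle] = [0, steps_total]
        return handle, self.status(handle)

    def tick(self):
        # Stand-in for real work happening between client calls.
        for job in self._jobs.values():
            job[0] = min(job[0] + 1, job[1])

    def status(self, handle):
        done, total = self._jobs[handle]
        return Estimate(
            seconds_until_done=(total - done) * self._seconds_per_step,
            progress_counter=done,
            percent_done=100 * done // total,
        )


def poll_until_done(server, handle, estimate, sleep):
    """Client loop: sleep for the hinted time, then ask again."""
    seen = [estimate.percent_done]
    while estimate.percent_done < 100:
        sleep(estimate.seconds_until_done)
        estimate = server.status(handle)
        seen.append(estimate.percent_done)
    return seen
```

The client never has to act as a server here; every exchange is a short call that returns at once, so ordinary RPC timeouts remain a usable signal that the server is down.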
Kemp@DOCKMASTER.NCSC.MIL (10/10/89)
Kenneth Almquist writes:
> Dave Kemp (me) replies:
>> The accepted solution is to use callback RPC's [...]
>
> Accepted by whom? The technical problems of implementing RPC systems
> are basically solved. The political problem of keeping SUN from foisting
> lousy software on the world is harder to solve, but it may be that
> Gregory can manage to get hold of another RPC system. (Nat Mishkin
> mentions one such system.) Lacking a good RPC system, I would be
> inclined to use sockets rather than SUN RPC. Kemp acknowledges of
> the kludge he describes:
>
>> This is analogous to asynchronous I/O, and requires a similar amount
>> of care by the programmer to do the job right.
>
> Care of course translates into programmer time.

Or at least programmer thought. I certainly wouldn't characterize callbacks as a "kludge", any more than device interrupts are a "kludge" to get around the problems of polled I/O. Interrupts, and signals, and callbacks (the concepts, not necessarily the particular implementations) are the elegant solution to the problem of maximizing the productivity of a multi-threaded system.

The database application can be handled synchronously without much thought:

1) Issue a query
2) Wait for the answer (for as long as it takes)
3) Continue

or it could be handled asynchronously, with a little more thought:

1) Issue a query
2) Do something else (like update the progress meter)
3) Get signalled when the query is finished
4) Continue (use the results of the query)

The programmer has to decide what to do while waiting for his answer, regardless of the mechanism used to get it (sockets, Sun RPC, HP/Apollo RPC, or whatever). The synchronous method is a no-brainer. Asynchronous is not a kludge, it's a matter of good careful human engineering.

I'll take good tools wherever I can get them. If GNU RPC :-) does the job better than SUN RPC, I'll use it. If you find raw sockets easier to use and document and port and maintain, then by all means use sockets.

What I don't understand is your crack about "the political problem of keeping Sun from foisting lousy software on the world". If you don't like SunOS, then you certainly don't have to pay for it; you can buy DEC or HP or Data General workstations instead. Or, as Mr. Mishkin suggests, buy HP RPC to run on your Sun.

The original poster was looking for an answer to his particular problem; i.e. he has an application running *now* on Suns, and he needs a solution. To say that Sun RPC is 'lousy', that the technical problem is solved, and that you would use sockets (without giving any examples) is not very constructive.

Dave Kemp <Kemp@dockmaster.ncsc.mil>
rsalz@bbn.com (Rich Salz) (10/11/89)
In <21090@adm.BRL.MIL> Kemp@DOCKMASTER.NCSC.MIL writes:
> I certainly wouldn't characterize
>callbacks as a "kludge", any more than device interrupts are a "kludge"
>to get around the problems of polled I/O.

If the RPC system decides to do time-outs, and you have to do something to circumvent that, then yes, it is a gross hack. Sun RPC should check to see if the (remote) CPU and the receiving process have died before blindly resubmitting the request. Anything else requires the application level to work around the IPC level, and that's just about the best definition of a kludge I've ever heard of.

/r$
--
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.