[comp.windows.open-look] Calendar Manager in a large environment

bigmac@erg.sri.com (Bryan McDonald) (06/07/91)

Hi,

A week or so ago this came across this newsgroup:

In article <1991May23.112737.13619@ecrc.de> cmc@ecrc.de (Chris Crampton) writes:
>Our problem is that we have about 80 diskless clients and so we have about
>80 different /var/spool/calendar directories and backing these up does
>not fit in very well with our current back-up strategy.  Am I missing the
>point or would it be okay to replace all /var/spool/calendar directories
>with a symlink to, say, /home/server/calendar which is a globally visible
>NFS'ed directory that IS backed up?  i.e. for all clients:

A day or two later I responded, stating that I was having some problems
with just such a setup.  I also said I would follow up when I knew what
was going on.

Well, now I know.  This scenario will not function reliably with the 2.0
version of the calendar manager.  Let me describe the way I perceive that
cm works (and if anyone can point to flaws in the theory, please let me know):

On any given workstation there are two processes that handle the calendar
for you, the cm program and the daemon that runs on all the machines.
The daemon by itself does not deal with any data file before the cm program asks
it to, and at startup cm will always ask for the callog.userid where
userid is that of the user starting the cm program.

Now, once you have started, as long as you never run on another machine that
also has /var/spool/calendar mounted as well, you can use cm just fine and
never worry.  But, there are problems when you start using the networked
features of the daemons.  If I am logged onto two machines, both of which have
the spool mounted, and both of which are running a cm, or if my secretary and
I are both editing my calendar from two different machines with the spool mounted,
I still run into the same old classic race condition that I saw using the calentool.
It goes like this.

MachineA has a cm started by UserA, and MachineB one by UserB.  UserA wants
to edit UserB's calendar, so UserA browses across the net to UserB@MachineB.
Now both UserA and UserB make additions, and both see their own alterations
take place.  Now, both go home.  Sometime late at night, and I am hypothesizing
that there is an "idle" time marker that triggers this, the callog.UserB file
gets updated, when no one is around.  My hypothesis: the cm waits until idle
for x amount of time, then tries to flush itself and the local daemon to be sure
that all the data is saved onto disk.  This is nice, except for one thing.
Apparently each cm/daemon pair always looks for a local data file to flush to,
even if the local daemon got its original data over the net from another
machine.  Race condition.  Whichever one idles out first loses, since the
later flush overwrites it.
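
To make the hypothesized race concrete, here is a minimal sketch in Python.
It is not how cm or its daemon is actually implemented; it only assumes, as
theorized above, that each cm/daemon pair holds a full in-memory image of
the calendar and writes that whole image to the same shared file when it
idles out.  The function names and the one-line-per-appointment file format
are invented for illustration.

```python
import os
import tempfile


def flush_full_image(path, appointments):
    """Overwrite the calendar file with one complete in-memory image."""
    with open(path, "w") as f:
        for appt in appointments:
            f.write(appt + "\n")


def read_calendar(path):
    """Load the calendar file as a list of appointment lines."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def simulate_race():
    # Stand-in for a spool directory that both machines have mounted.
    spool = tempfile.mkdtemp()
    path = os.path.join(spool, "callog.UserB")
    flush_full_image(path, ["09:00 staff meeting"])

    # Each cm/daemon pair loads its own private image of the calendar.
    image_a = read_calendar(path)
    image_b = read_calendar(path)

    # Both users add an appointment and see their own change locally.
    image_a.append("13:00 added by UserA")
    image_b.append("15:00 added by UserB")

    # MachineA idles out first and flushes; MachineB flushes later and
    # overwrites the file with an image that never saw UserA's change.
    flush_full_image(path, image_a)
    flush_full_image(path, image_b)

    return read_calendar(path)
```

Calling simulate_race() returns the staff meeting and UserB's appointment
only; UserA's addition is gone, even though UserA saw it on screen.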

There appears to be no solution to this in the current implementation of 
the cm/daemon pair.  A recommendation from Sun..."Don't do that."

Another thought from Sun: put the calendar all on one central machine,
do NOT mount them, let everyone start the local one, then browse to the
central machine.  Not really nice since cm has this tendency to show you
your original properties, not those of a calendar you have browsed to.

Another thought of my own: centralize them, but instead of browsing to them,
rsh to the central machine and run cm, displaying back on your home display.
Works, but could kill the central machine in my environment of 150+ machines.

What are we going to do?  I am not sure yet.  I am trying to find out if there
will be any substantial changes in the 3.0 versions of cm, but it will take me
a while to do so or for Sun to get back to me, since it is not released yet.
We will probably end up going back to one file on one machine, in var, but backing
these up will be a major pain (maybe a cron job that copies it to the central
directory once a day).  This also means that every user in my facility must
memorize machine names that do change on a not-so-infrequent basis.
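
The daily copy could look something like the following sketch.  It is only
an illustration under assumptions: backup_calendars is a made-up name, the
per-host subdirectory layout is my own choice, and it assumes the data
files all match callog.* in the local spool directory.

```python
import glob
import os
import shutil
import socket


def backup_calendars(spool_dir, central_dir, hostname=None):
    """Copy every callog.* file from spool_dir to central_dir/<hostname>/.

    Returns the list of destination paths that were written.
    """
    hostname = hostname or socket.gethostname()
    dest_dir = os.path.join(central_dir, hostname)
    os.makedirs(dest_dir, exist_ok=True)

    copied = []
    for src in sorted(glob.glob(os.path.join(spool_dir, "callog.*"))):
        dest = os.path.join(dest_dir, os.path.basename(src))
        # copy2 keeps the modification time, which helps when comparing
        # a backup against the live file.
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

A cron entry on each machine would call this once a day with
/var/spool/calendar as the source and the centrally backed-up directory as
the destination.  Keeping one subdirectory per host avoids two machines
overwriting each other's copies.  Note that a copy taken while the daemon
is in the middle of a flush could be incomplete, so this is a backup
convenience, not a fix for the race.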

And on top of it all, the access permission field has a hard-coded length limit that
means that no one can give bigmac@quetzalcoatl.erg.sri.com access to their
calendar.  You may laugh at the name, but I have a few users who have 8 character
user names and 6 character machine names and none of them work.  (Sun obviously
forgot that while their network uses short names first in host tables, many
others in this world do not.)

So, here I am griping.  I actually like cm, and would love to encourage my users
to use it, but I can't.  Then someone says, "Well, what does he want?"  

I want the name limit removed.

I want cm to be able to start up on a defined file name or automatically
"browsed" to another machine.

I want cm to be smarter, and avoid the race condition altogether.  Rather than
feed full images of the data around, I want it to update one file, on disk, piece
by piece as the changes happen, then push those changes out to any cm browsing
the specified file.
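
As a rough sketch of the piece-by-piece idea (not anything cm does today,
as far as I know): each change is appended to the one file as a small
record, and the current calendar is rebuilt by replaying the records in
order.  The record format, the function names, and the use of JSON are all
assumptions made for illustration, and the "push changes to browsing cm's"
half of the wish is not covered here.

```python
import json
import os


def append_change(path, op, appt):
    """Append one change record ("add" or "remove") to the calendar log."""
    record = json.dumps({"op": op, "appt": appt}) + "\n"
    # O_APPEND makes each write land at the current end of the file, so
    # one writer does not clobber records appended by another.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, record.encode("utf-8"))
    finally:
        os.close(fd)


def replay(path):
    """Rebuild the current list of appointments from the change log."""
    appts = []
    with open(path) as f:
        for line in f:
            change = json.loads(line)
            if change["op"] == "add":
                appts.append(change["appt"])
            elif change["op"] == "remove" and change["appt"] in appts:
                appts.remove(change["appt"])
    return appts
```

With this shape, UserA and UserB each append their own additions and both
survive, because nobody ever writes a full image over the file.  One
caveat: appends are only reliably atomic on a local filesystem, not over
NFS, so this would still want a single daemon on one machine owning the
file and taking changes from everyone else over the net.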

And I still want it for free.  ;-)

Hopefully Sun is listening (I know they have heard it once already, I gave them
an earful Monday), and some of these problems will be addressed shortly.  Maybe
they already have been.  When is 3.0 coming out?

--------------------------------------------------------------------
Bryan McDonald        |   Computer, Hardware, And Operations Support
Systems Administrator |                     CHAOS
bigmac@erg.sri.com    |            ITAD - SRI International