dennis%cod@NOSC.ARPA (Dennis Cottel) (07/07/86)
Managing network resources would be much easier, and the network more robust, if Aegis allowed symbolic links to have alternatives. For instance, the command

    crl /sys/help "//Main_Node/sys/help,//Alternate_Node/sys/help"

would establish an alternate link. A reference to /sys/help on some Local_Node would first try to link to Main_Node, but if that node were unavailable, it would then try Alternate_Node.

There is a sort of precedent for this in the current method of expanding environment variables to handle alternate /bin directories.

Since this seems such an obvious extension, maybe there's a basic problem. Has this been discussed here before? Has Apollo discussed such plans, or has it been brought up at ADUS meetings?

Dennis Cottel
Naval Ocean Systems Center, San Diego, CA 92152
(619) 225-2406
dennis@nosc.ARPA
sdcsvax!noscvax!dennis
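As an illustration of the lookup order proposed above, here is a minimal Python sketch. It is not Aegis code: the function names `parse_link_text` and `resolve_multilink` are hypothetical, and the reachability check is passed in as a callable because the thread does not say how the naming server would detect an unavailable node.

```python
import os
from typing import Callable, List, Optional, Sequence


def parse_link_text(link_text: str) -> List[str]:
    """Split comma-separated link text into an ordered list of targets."""
    return [part.strip() for part in link_text.split(",") if part.strip()]


def resolve_multilink(
    targets: Sequence[str],
    is_reachable: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the first reachable target in order, or None if none is."""
    for target in targets:
        if is_reachable(target):
            return target
    return None


# Hypothetical situation: Main_Node is down, Alternate_Node is up.
targets = parse_link_text("//Main_Node/sys/help,//Alternate_Node/sys/help")
reachable = {"//Alternate_Node/sys/help"}
print(resolve_multilink(targets, is_reachable=lambda t: t in reachable))
```

The order of the targets matters: the first entry is always preferred, and later entries are tried only when earlier ones fail the check.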
mishkin@apollo.UUCP (Nathaniel Mishkin) (07/08/86)
> Managing network resources would be much easier, and the network more
> robust, if Aegis allowed symbolic links to have alternatives.

I think this would be a fine idea. In fact, eons ago, when I was at Yale, I implemented this feature in the naming server. (I called the feature "multilinks".) The change was never integrated into the standard software. Now that I'm at Apollo, and conveniently, since I happen to do a lot of hacking on the naming server, I've been wanting to put this feature in for real. Unfortunately, other things have had higher priority.

Note that we'd really like to solve the underlying problem -- supporting automatic switching among multiple copies of the same file -- in a more sophisticated way. We call this the "replicated object" problem and a good solution would have tools for maintaining consistency and broadcasting to find locations (rather than forcing every node to always know about all the copies). The proposed naming hack is effective, but crude. But I'll probably do it anyway if I manage to find two free days in a row.

There's an interesting question w.r.t. this feature though. Note that it could be implemented by using environment variables rather than links. E.g.:

    $ ev := "//node1/sys/help,//node2/sys/help"
    $ catf "$(ev)/ld.hlp"

This implementation would have more in common with TOPS-20 (and I suppose VAX/VMS) "logical names" (I think VMS calls them "symbols" or something) in that the name is dynamic. This is as opposed to the solution that uses links, which is more static. Consider two nodes booted off the same disk: With the link solution, both nodes (i.e. the users at both nodes) are obliged to live with the same expansion, since it is in the file system. With the environment variable solution, each user can set his environment variables to his own liking.

The disadvantage of the environment variable solution is that it requires every user to have those variables defined.
(With the link solution, a system administrator can set up the links when he configures the node.) The problem can be lessened by defining the variables' default values in the node's startup files, but I suspect that will be "unreliable" and more work (from an administrative point of view).

I'm afraid both implementations make sense (i.e. one could want both), but I'm loath to actually do both since it seems like it would cause no end of confusion. (I wouldn't want to have to write the documentation.) I'd be interested in hearing people's comments.

-- Nat Mishkin
Apollo Computer Inc.
apollo!mishkin
-------
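To make the contrast between the two approaches concrete, here is a rough Python sketch of the environment-variable variant. It assumes a `$(name)` reference whose value is a comma-separated list, as in the `ev` example in the post. The helper names `candidates` and `first_available` are hypothetical, and nothing here reflects how the Aegis shell actually expands variables.

```python
import re
from typing import Callable, Dict, List, Optional

# Matches a $(name) variable reference.
_VAR = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def candidates(path: str, env: Dict[str, str]) -> List[str]:
    """Expand the first $(name) in path into one path per alternative."""
    match = _VAR.search(path)
    if match is None:
        return [path]
    name = match.group(1)
    if name not in env:
        # The drawback noted in the post: every user must define it.
        raise KeyError(name)
    alternatives = [a.strip() for a in env[name].split(",") if a.strip()]
    head, tail = path[: match.start()], path[match.end():]
    return [head + alt + tail for alt in alternatives]


def first_available(
    path: str, env: Dict[str, str], exists: Callable[[str], bool]
) -> Optional[str]:
    """Return the first expanded candidate that exists, else None."""
    for candidate in candidates(path, env):
        if exists(candidate):
            return candidate
    return None


# Two users on the same node can see different expansions.
user_a = {"ev": "//node1/sys/help,//node2/sys/help"}
user_b = {"ev": "//node2/sys/help"}
print(candidates("$(ev)/ld.hlp", user_a))
print(candidates("$(ev)/ld.hlp", user_b))
```

The expansion depends only on the per-user `env` mapping, which is what makes this variant dynamic. A link-based expansion would instead be fixed in the file system and shared by every user booted off that disk.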
Erstad@HI-MULTICS.ARPA.UUCP (07/09/86)
This is not a good solution, but a stop-gap with the present system. What we do is create links for /sys/help and much of our software for disk space reasons. All links off a particular node are in the form /softlink/xxx. For example, CRL /sys/help /softlink/sys/help. Then, a single link, /softlink, points to that node's entry directory. In this way, if the node is down, a single command will retarget all of the links. However, this is still a manual reconfiguration, which only solves a small part of your problem.
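The indirection described above can be tried out with ordinary POSIX symbolic links standing in for Aegis links. The sketch below is only an analogy under that assumption: the directory names `node_a` and `node_b` and the helper `retarget` are made up for the demonstration, and it runs inside a temporary directory.

```python
import os
import tempfile
from typing import Tuple


def retarget(link_path: str, new_target: str) -> None:
    """Point link_path at new_target by renaming a fresh link over it."""
    tmp = link_path + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_target, tmp)
    os.replace(tmp, link_path)  # rename replaces the old link itself


def demo() -> Tuple[str, str]:
    with tempfile.TemporaryDirectory() as root:
        # Two stand-in "nodes", each holding its own copy of sys/help.
        for node in ("node_a", "node_b"):
            help_dir = os.path.join(root, node, "sys", "help")
            os.makedirs(help_dir)
            with open(os.path.join(help_dir, "ld.hlp"), "w") as f:
                f.write(node)

        # The single indirection link: softlink -> node_a.
        softlink = os.path.join(root, "softlink")
        os.symlink(os.path.join(root, "node_a"), softlink)

        # An individual link that goes through softlink.
        help_link = os.path.join(root, "help")
        os.symlink(os.path.join(softlink, "sys", "help"), help_link)

        with open(os.path.join(help_link, "ld.hlp")) as f:
            before = f.read()
        retarget(softlink, os.path.join(root, "node_b"))  # one command
        with open(os.path.join(help_link, "ld.hlp")) as f:
            after = f.read()
        return before, after


print(demo())
```

Only `softlink` is changed, yet every link routed through it follows along. That matches the post's description of retargeting all of the links with one command, though someone still has to run that command by hand.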
ZELEZNIK@UTAH-20.ARPA.UUCP (07/11/86)
In reference to alternate links, we have developed a two-fold approach to maintaining a consistent environment (at the system, node, and user levels) across single node failures.

First, each entry directory with "critical" data has a backup location on some other node. For system level directories, each backup contains only those branches which are required to maintain the environment (e.g., /etc, /sys/tcp, ... objects which exist in only one place). For user login directories, the backup is user specified (usually top level files and links, user_data, and personal com/bin directories). In this way, both the system and user environments are preserved. A simple "node_down node" command walks the net and uncatalogs the unavailable node, replacing it with a link to its backup location, while a corresponding "node_up node" command undoes this action (execution of these commands is restricted). In this way, we forget about what lives where; once the backup locations are established, everything is handled by simply switching the root level entry for the unavailable node.

Second, backup node_data directories are provided to preserve the node environment when diskless partners are down. Each node has a primary and a secondary paging partner. The secondary is maintained with only the critical files (e.g., startup?*, etc?*, tcp info, ... and any user-specific files), with all non-essential files removed periodically. This reduces the backup size by more than an order of magnitude.

All the necessary syncing and such for all of this is done automatically through scripts running under /etc/cron. User logins, however, are left to the user.

While far from the elegance of replicated objects, this has provided a reasonably stable environment during single node failures, and has been straightforward to maintain. Contact me if anyone has more interest.

Mike Zeleznik
University of Utah
Zeleznik@Utah-20.ARPA
-------
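The root-level switch performed by "node_down" and "node_up" can be modelled in a few lines. This is a toy in-memory model, not the actual scripts, which are not shown in the thread. The class `RootCatalog`, the node names `alpha` and `beta`, and the backup path are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RootCatalog:
    """Toy network root: maps node names to where they currently resolve."""

    backups: Dict[str, str]  # node name -> backup location on another node
    redirects: Dict[str, str] = field(default_factory=dict)

    def node_down(self, node: str) -> None:
        """Replace the node's root entry with a link to its backup."""
        self.redirects[node] = self.backups[node]

    def node_up(self, node: str) -> None:
        """Undo node_down: the node's own entry is used again."""
        self.redirects.pop(node, None)

    def resolve(self, path: str) -> str:
        """Resolve a //node/... path through the current root entries."""
        if not path.startswith("//"):
            raise ValueError("expected a //node/... path")
        node, sep, rest = path[2:].partition("/")
        base = self.redirects.get(node, "//" + node)
        return base + sep + rest


catalog = RootCatalog(backups={"alpha": "//beta/backup/alpha"})
print(catalog.resolve("//alpha/sys/tcp"))
catalog.node_down("alpha")
print(catalog.resolve("//alpha/sys/tcp"))
catalog.node_up("alpha")
print(catalog.resolve("//alpha/sys/tcp"))
```

Callers keep using the same `//alpha/...` names throughout, and only the root entry changes. That is the sense in which one can "forget about what lives where" once the backup locations exist.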