acrotty@cvbnet.uucp (Art Crotty) (08/21/86)
I have a somewhat complex request for info involving file transfer between hundreds of SUN3 and SUN2 workstations. These workstations are networked together using Ethernet (802.3). Protocols: TCP/IP/UDP.

I would like to have the ability to transfer large application programs to all nodes on the network simultaneously.

Why I think it may be possible to do this:

1.) The Ethernet packet information can contain what is commonly called a multicast bit within the destination address. Thus, I should be able to set this bit to broadcast or spray my large application program (i.e., 10mb-30mb) to all nodes on the network. I also, using the multi-cast bit, should be able to set up a table of nodes that I wish to distribute the program to. Thus, if the bit is set to 0, it is a specific address; if it is 1, it is a group of nodes; and all 1's in the field indicate all nodes.

2.) Some user-level programs already do something similar to what I want. For instance, "wall" will broadcast a message to nodes on your network. The command "rcp" will copy one or a group of files to one particular destination at a time. I want a "wall" or "rwall" combined with "rcp" that can copy my file or files simultaneously to all nodes or subgroups of nodes on a network.

I know NFS allows mounting of an application to nodes and simultaneous access of that application - but that is not what I want. I want to distribute to stand-alone machines as well as file servers new copies of an application once a week and each rcp or "dread the thought" cartridge taping can consume 1/2 hour per node. Thus, 20 nodes is 20 x 1/2 hour by rcp from a master database, or less time if multiple tapes are made or rsh tarring from a server with 1/2" tape.
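The addressing rule described in point 1 can be sketched in a few lines of Python. This is only an illustration of how a destination address is classified: the individual/group ("multicast") bit is the least-significant bit of the first octet, and an address of all 1's is the broadcast address. The function name and the sample addresses are made up for the example.

```python
def classify_destination(mac):
    """Classify an Ethernet destination address written as aa:bb:cc:dd:ee:ff."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    # All 48 bits set: every station on the segment receives the frame.
    if octets == b"\xff" * 6:
        return "broadcast"
    # The individual/group bit is the least-significant bit of the first octet.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"


if __name__ == "__main__":
    for address in ("08:00:20:01:02:03", "01:00:5e:00:00:01", "ff:ff:ff:ff:ff:ff"):
        print(address, classify_destination(address))
```

Note that setting the bit only tells the hardware the frame is addressed to a group; which stations actually accept it depends on each receiver's interface and driver.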
I would like to be able to say something like:

    distribute -g <tablefile> <application>

where -g is the option for group and tablefile is the database that contains a list of nodes with names or internet addresses

    distribute -a <application>

where -a is for all nodes - no table

I am not that familiar with the networking code on SUNs and was wondering the following:

Can it be done? Is this beyond the ability of the Ethernet itself?

When "wall" does a broadcast - is it simultaneous to all nodes or consecutive?

Will I need some sort of daemon process running, all the time, on each node waiting for a signal to allow broadcast file transmissions, or can /etc/inetd already handle this type of request with little or no code tweaking?

What kind of error checking do I have to do for testing that the program was successfully transmitted without losing packets or corrupting packets - at source or destination?

Has anyone created a program that can do this? If not, can someone get me started as to the process or code that I might need to access or create to accomplish what I want? For instance:

    you can alter "inetd" or you have to create a new daemon
    you must access these libraries and change this/that
    you must use these calls
    etc., etc.

Thanks in advance for all advice!!!

+-------------------------------------------------------------------+
|                                                                   |
|        /\                Post: Art Crotty                         |
|       /  \               Computervision Corp.                     |
|      /_  _\              14 Crosby Drive                          |
|     / o  o \             Bldg. 5-1                                |
|  -mm--------mm-          Bedford, Mass. 01730                     |
|                          Ma Bell: (617) 275-1800                  |
|  The fool wanders,       UUCP: { decvax,raybed2 }!cvbnet!acrotty  |
|  the wise man travels.                                            |
+-------------------------------------------------------------------+
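For reference, the command line proposed above could be parsed roughly as follows. This is a sketch of the interface only, not of any transfer mechanism; the table-file format (one host name or internet address per line, with "#" starting a comment) is an assumption, since the post does not specify one.

```python
import argparse


def build_parser():
    """Command line for the hypothetical "distribute" tool described in the post."""
    parser = argparse.ArgumentParser(prog="distribute")
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("-g", dest="tablefile", metavar="tablefile",
                        help="file listing the nodes to update")
    target.add_argument("-a", dest="all_nodes", action="store_true",
                        help="update all nodes; no table needed")
    parser.add_argument("application", help="application to distribute")
    return parser


def parse_table(text):
    """Return the host names or addresses listed in a table file.

    Assumed format: one entry per line, blank lines ignored, "#" starts a comment.
    """
    hosts = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            hosts.append(entry)
    return hosts


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```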
hedrick@topaz.RUTGERS.EDU (Charles Hedrick) (08/25/86)
The original article asks whether it is feasible to update software on multiple systems by using a broadcast protocol. This would save you from having to do separate copies to each. Anything is possible to do with enough design work, but let me mention two serious problems.

First, the Ethernet is not a reliable medium. This means that any individual packet may be dropped. All protocols currently used to send files include some sort of acknowledgement that the packet really got there. If an ack is not received, the sender resends the packet. This is true of FTP, rcp, and NFS, though the actual details of the protocols are different for NFS and the other two. So a broadcast distribution protocol would have to keep a list of the sites that are expected to be receiving, and keep resending each packet until it has gotten an ack from every receiver. Since the acks would all be sent at the same time, you would have guaranteed collisions on the Ethernet. Probably you would want some sort of randomized delay before sending the ack. This would be a nontrivial design problem, and probably there would be other implications that I have not noticed. But an experienced protocol designer could probably solve the problem.

Second, you imply that you are going to be updating hundreds of Suns. I would be somewhat wary of the idea of hundreds of Suns on a single Ethernet. When we asked Sun about this, they recommended no more than 50 diskless Suns on a single Ethernet. Our measurements suggest that this number is about right. Of course if the machines are not diskless, more should be possible. But there is a limit. If you have hundreds of machines, they are probably going to be on more than one Ethernet, with gateways. Broadcasts do not go through gateways, unless special provisions are made. This is a good thing.
It protects networks from other networks where a machine has decided to start spraying the network with high-speed broadcasts (a failure mode that is not uncommon when you are playing with experimental network software). There are also problems in making sure that loops don't occur. If a gateway forwards broadcasts from one interface to the other, any very interesting topology will end up with broadcasts looping around the network. These problems can be solved, and indeed there is an RFC describing multi-network broadcasts, but you should realize that there are design issues involved with broadcast protocols that involve more than one Ethernet.

My suspicion is that this is not worth doing. I suggest instead using a branching tree distribution, i.e. your master sends to 10 machines and each of them to 10 more, or something like that. Note that the Ethernet should be able to support a number of simultaneous transfers, as long as they are not broadcasts. The limit on network bandwidth for most machines (including Suns) is the machine's own Ethernet hardware and software. The fastest real transfers I have seen are 1 Mbit/sec, and even that requires special care. 200 kbit/sec is more normal. Thus the Ethernet should be able to support a reasonable number of simultaneous copies, as implied by the branching tree model. Collisions would not be the problem here that they would be with the broadcast scenario, since the various copies would quickly lose any synchronization that they might have.
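The branching tree suggested above can be sketched as a small planning routine: each host is assigned a parent so that no machine feeds more than a fixed number of others. This is only an illustration under stated assumptions. The host names and function names are invented, the fan-out of 10 is the figure from the post, and the timing helper assumes the original poster's "10mb" means 10 megabytes sent at a steady rate.

```python
from collections import deque


def build_tree(master, hosts, fanout):
    """Assign each host a parent so that no node feeds more than `fanout` others."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    children = {master: []}
    ready = deque([master])  # nodes that may still take on children, oldest first
    for host in hosts:
        while len(children[ready[0]]) >= fanout:
            ready.popleft()  # this parent is full; move on to the next one
        children[ready[0]].append(host)
        children[host] = []
        ready.append(host)
    return children


def depth(children, root):
    """Number of hops from the root to the farthest host."""
    levels, frontier = 0, [root]
    while True:
        frontier = [c for node in frontier for c in children[node]]
        if not frontier:
            return levels
        levels += 1


def transfer_seconds(size_megabytes, rate_kbits_per_sec):
    """Time to push one copy over one connection at a sustained rate."""
    return size_megabytes * 8_000_000 / (rate_kbits_per_sec * 1000)


if __name__ == "__main__":
    hosts = ["sun%d" % i for i in range(110)]
    tree = build_tree("master", hosts, 10)
    print("levels:", depth(tree, "master"))
    print("seconds per 10 MB copy at 200 kbit/sec:", transfer_seconds(10, 200))
```

With a fan-out of 10, 110 hosts are reached in two levels. How many of a node's transfers can usefully run at once depends on the per-machine throughput limits mentioned above.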
jqj@gvax.cs.cornell.edu (J Q Johnson) (08/25/86)
The author of the original article proposes use of broadcast/multicast (on networks that support it) as a way of achieving multiple parallel file transfers.

The problem with this scheme, obviously, is that there is no simple way to achieve reliability (though see various papers on "reliable broadcast" by Ozalp Babaoglu et al). Most file transfer protocols use end to end ack/nack to make sure the data got there, which assumes the sender knows who is receiving the data. A multicast-based ftp is not impossible, but it certainly doesn't match the communications model of any of the popular existing protocol families (tcp/ip, sna, decnet, osi, xns, etc.). TCP/IP didn't even standardize the value of the broadcast *address* until recently!

Note that most existing broadcast applications on Ethernets assume unreliable broadcast, and are generally used for sending status information (or requests for information). In almost all cases, the amount of information to be transferred is limited to a single packet.

Conclusion: it's a good topic for research, but don't expect anyone to implement such a beast in the near future. And don't ever expect to see it layered on TCP/IP.
guy@sun.uucp (Guy Harris) (08/25/86)
> I also, using the multi-cast bit, should be able to set up a table
> of nodes that I wish to distribute the program to.

Except that the Sun driver for the "ie" interface doesn't understand multicasts. You'd have to change that driver, provide "ioctl" calls to set the multicast address group, and provide a way, in whatever protocol you used, to specify that a packet is to go to a multicast group.

> 2.) Some user-level programs already do something similar to what
> I want. For instance, "wall" will broadcast a message to nodes
> on your network.

No, it won't. The "rwall" command will send messages to other machines; however, it does not "broadcast" them in the sense the word usually has when discussing networks, i.e. it does not use Ethernet broadcast facilities for this. If it is asked to send messages to a set of machines, it does so by running through an enumeration of those machines and sending to them one at a time.

> I know NFS allows mounting of an application to nodes and simultaneous
> access of that application - but that is not what I want. I want to
> distribute to stand-alone machines as well as file servers new copies
> of an application once a week and each rcp or "dread the thought"
> cartridge taping can consume 1/2 hour per node.

You may, in this case, want to have the stand-alone machines get the application via NFS.

> I would like to be able to say something like:
>
> distribute -g <tablefile> <application>
>
> where -g is the option for group and tablefile is the database that
> contains a list of nodes with names or internet addresses

Even if IP supported multicast groups, this would not be straightforward. You can't assign a host to a multicast group; that host has to add *itself* to the multicast group. As such, you'd have to start by telling the hosts in that list to join a particular multicast group (you'd also have to either 1) reserve a multicast group for this or 2) find some way of finding an unused group and choosing it).
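The point that a host has to add *itself* to a group can be illustrated with the sockets interface found on later systems that do support IP multicast. This is a sketch under that assumption and is not something the 4.2BSD-derived code discussed here provides. The group address 239.1.2.3 and the port number are arbitrary choices for the example.

```python
import ipaddress
import socket
import struct


def membership_request(group, interface="0.0.0.0"):
    """Build the request a receiver hands to its own IP stack to join a group."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError("%s is not a multicast group address" % group)
    # Group address followed by the local interface address (0.0.0.0 = any).
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(sock, group):
    """Called on each receiving host; the sender cannot do this on its behalf."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))


if __name__ == "__main__":
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("", 5007))         # port chosen arbitrarily for the example
    join_group(receiver, "239.1.2.3")  # hypothetical group address
    print("joined; waiting for one datagram")
    print(receiver.recvfrom(2048))
```

Every receiver runs the join itself, which is why the sender would first need some other channel to tell the listed hosts which group to join.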
I think there may be some RFCs discussing the use of multicast addresses in IP, but I doubt that there are any standard implementations of this for UNIX. At best, they're probably experimental. At worst, they don't exist. There are a lot of complicated issues involved in putting multicast support into IP. Of course, as stated before, you'd have to whack on the networking code quite a bit to teach it about multicast addresses, anyway.

> Can it be done?

Maybe, if you're willing to learn a lot about IP, Ethernet, and the 4.2BSD networking code, and make *lots* of changes to it. I don't guarantee that it'd be possible even then.

> When "wall" does a broadcast - is it simultaneous to all nodes or
> consecutive?

As I mentioned above, "wall" doesn't do broadcasts at all; since "rwall" doesn't do them as Ethernet broadcasts, they are consecutive. (No, "rwall" doesn't fork off N processes, one per machine.)

> What kind of error checking do I have to do for testing that
> the program was successfully transmitted without losing packets
> or corrupting packets - at source or destination?

Lots. TCP doesn't understand broadcasts, much less multicasts, and can't really be made to. As such, you'd have to provide your own flow control and error recovery. As Charles Hedrick pointed out, this won't work well (if at all) if you want to update hosts that aren't on the same Ethernet, either.

I think the best advice is "try something else".
--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)
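As a small illustration of the "your own error recovery" part, here is one way the data could be framed so a receiver can detect corrupted or missing pieces. This is a sketch, not a protocol from the thread: the header layout (sequence number plus CRC-32 of the payload) and the 1024-byte chunk size are assumptions made for the example.

```python
import struct
import zlib

# Assumed header: 32-bit sequence number, 32-bit CRC-32 of the payload.
HEADER = struct.Struct("!II")


def make_packets(data, chunk_size=1024):
    """Split data into numbered, checksummed packets."""
    packets = []
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        payload = data[offset:offset + chunk_size]
        packets.append(HEADER.pack(seq, zlib.crc32(payload)) + payload)
    return packets


def check_packet(packet):
    """Return (seq, payload), or None if the packet is truncated or corrupted."""
    if len(packet) < HEADER.size:
        return None
    seq, crc = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:]
    if zlib.crc32(payload) != crc:
        return None
    return seq, payload


def missing(received_seqs, total):
    """Sequence numbers the receiver still has to ask for."""
    return sorted(set(range(total)) - set(received_seqs))


if __name__ == "__main__":
    packets = make_packets(b"x" * 3000)
    good = [check_packet(p) for p in packets if check_packet(p) is not None]
    print(len(good), "packets verified; missing:", missing([s for s, _ in good], len(packets)))
```

Detection is the easy half. Deciding who retransmits what, and how fast, is the flow-control and recovery work the post warns about.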
sjl@amdahl.UUCP (Steve Langdon) (08/30/86)
In article <6527@sun.uucp> guy@sun.uucp (Guy Harris) provides his normal
clear explanation of many of the issues involved in trying to use multicast
to distribute files. His explanation focused on the issues involved when
you tried to solve the problem above the Data Link Layer (or LLC in 802
terminology). A general architectural approach to multicast would be nice,
but you might find something useful in the current IEEE 802.1 work on load
protocols. It is limited to use on a single LAN and was (last time I checked)
planned to support multicast. I cannot provide any further details on how
they have designed the protocol because it has never been high on my priority
list.
If my suggestion leads you to a workable protocol, Guy is still right about
the type of skills you will need to use it. Expect to learn more than you
want to know about the messy details of the system, hardware, and LAN.
--
Steve Langdon ...!{decwrl,sun,hplabs,ihnp4,cbosgd}!amdahl!sjl +1 408 746 6970
[I speak for myself not others.]
garyf@mc0.UUCP (gary friedman) (09/05/86)
In article <6@cvbnet.uucp> acrotty@cvbnet.uucp (Art Crotty) writes:
>
>I would like to have the ability to transfer large application
>programs to all nodes on the network simultaneously.

The short answer to your question is you *can* broadcast your updated programs to your other nodes, but you shouldn't. The reason for this will take some explaining.

The Ethernet protocols you said you had, TCP/IP/UDP, are actually two separate protocols that can co-exist harmoniously: TCP/IP, which will guarantee packet delivery to one node only, and UDP, which guarantees nothing. One of UDP's features is that, since it transmits packets without waiting for any kind of acknowledgement, it is able to send to a special broadcast address and have 'billions and billions' of machines (which are also set to receive with this same broadcast address) receive them without the overwhelming overhead that would otherwise be required in such a case. Many erroneously equate "UDP" with "Broadcast", when in fact "Broadcast" is merely a special case.

As you can probably guess, if you choose to broadcast your updates to all your Sun workstations, you run the risk of randomly dropping packets or losing bits of information in other ways. This risk is even greater if the other Suns are transmitting information to each other (using TCP/IP, no doubt) in the background at the same time. An example: in my studies of UDP reliability, it was common for a Sun3 to send 100 UDP packets and have a Sun2 receive only 65 of them. (This result is amplified by the fact that the Sun3 sends them faster than the Sun2 can physically receive them. Sun2 to Sun2 generally yields better than 98% of the message when lots of other Ethernet activity is taking place.)

My recommendation is to use NFS, as it was designed for precisely your situation. (The original posting didn't state why the option was ruled out.)
If that option isn't acceptable, the next best option is to write a shell that sequentially rcp's the file to every node individually. (RCP uses the TCP/IP protocol; it's no dummy!)

Sorry about that---and good luck.
--
Gary Friedman
Jet Propulsion Laboratory
UUCP: {sdcrdcf,ihnp4,bellcore}!psivax!mc0!garyf
ARPA: ...mc0!garyf@cit-vax.ARPA
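The sequential rcp approach is simple enough to sketch. The version below is written in Python rather than as a shell script; the host names, file name, and destination path are placeholders. It builds one "rcp file host:path" command per node, runs them one after another, and collects the hosts whose copy failed.

```python
import subprocess


def rcp_commands(hosts, source, dest_path):
    """One "rcp source host:dest_path" command per host."""
    return [["rcp", source, "%s:%s" % (host, dest_path)] for host in hosts]


def distribute(hosts, source, dest_path, run=subprocess.call):
    """Copy to each host in turn; return the hosts whose copy failed.

    `run` takes a command list and returns its exit status, so the real
    subprocess.call can be swapped for a stub when trying this out.
    """
    failed = []
    for host, command in zip(hosts, rcp_commands(hosts, source, dest_path)):
        if run(command) != 0:
            failed.append(host)
    return failed


if __name__ == "__main__":
    # Placeholder names; print the commands instead of running them.
    for command in rcp_commands(["sun1", "sun2"], "cadapp.tar", "/usr/local/cadapp.tar"):
        print(" ".join(command))
```

Keeping the list of failed hosts makes it easy to rerun the copy for just those nodes.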
guy@sun.UUCP (09/08/86)
> (Explanation that TCP guarantees packet delivery, but does not support
> broadcast, while UDP supports broadcast but doesn't guarantee packet
> delivery.)

> As you can probably guess, if you choose to broadcast your
> updates to all your Sun workstations, you run the risk of randomly
> dropping packets or losing bits of information in other ways.

Well, you *could* have the receiving hosts send back acknowledgments when they received the broadcast packets. This would be an excellent way to melt down an Ethernet, though, given the number of hosts that would receive the broadcast packet. The sending host would also have to know *all* the hosts the broadcast would go to, in order to know whether it got all the acknowledgments it should. It would also have to know what to do if it didn't get acknowledgments from all the hosts: should it retransmit only to the hosts that didn't get it (if 75% of them didn't get it, this could flood the Ethernet) or to all the hosts? In addition, the code on the receiving end would have to be able to deal with packets received out of order, or duplicate packets (especially if the response to negative or missing acknowledgments is a broadcast retransmission).

> My recommendation is to use NFS, as it was designed for
> precisely your situation. (The original posting didn't state why the
> option was ruled out.) If that option isn't acceptable, the next best
> option is to write a shell that sequentially rcp's the file to every
> node individually. (RCP uses the TCP/IP protocol; it's no dummy!)

And NFS uses UDP/IP. NFS operations return a success/failure indication and, if a failure, an error code; this return message acts as the acknowledgment.
--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)
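The bookkeeping described above (know every receiver, rebroadcast until all have acknowledged, tolerate duplicates) can be tried out in a toy simulation. Nothing here touches a network. The 35% loss rate is borrowed from the Sun3-to-Sun2 figure quoted earlier in the thread, acknowledgments are assumed never to be lost, and all names are invented for the example.

```python
import random


def broadcast_with_acks(packets, receivers, loss=0.35, seed=1986, max_tries=1000):
    """Simulate rebroadcasting each packet until every known receiver has acked it.

    Returns (stores, transmissions): what each receiver ended up holding,
    keyed by sequence number, and how many broadcasts the sender made.
    """
    rng = random.Random(seed)
    stores = {name: {} for name in receivers}
    transmissions = 0
    for seq, payload in enumerate(packets):
        pending = set(receivers)  # hosts that have not yet acknowledged this packet
        tries = 0
        while pending:
            if tries == max_tries:
                raise RuntimeError("giving up on packet %d" % seq)
            tries += 1
            transmissions += 1
            # A broadcast retransmission reaches everybody, including hosts
            # that already have the packet, so receivers must accept duplicates.
            for name in receivers:
                if rng.random() < loss:
                    continue  # this host dropped the packet
                stores[name][seq] = payload  # storing by sequence number absorbs duplicates
                pending.discard(name)        # its ack arrives (assumed lossless)
    return stores, transmissions


if __name__ == "__main__":
    packets = [b"chunk-%d" % i for i in range(20)]
    hosts = ["sun%d" % i for i in range(50)]
    stores, sent = broadcast_with_acks(packets, hosts)
    print("broadcasts needed for", len(packets), "packets:", sent)
```

The simulation leaves out the ack traffic itself, which is where the real trouble lies: every successful reception generates a reply to a single sender.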