berger@well.sf.ca.us (Robert J. Berger) (07/30/90)
We are looking to make a special-purpose dedicated LAN for controlling up to 128 devices. These devices will run a real-time OS such as pSOS or VxWorks. There will be up to 16 master devices, Unix workstations running Unix System V.4. The workstations will be initiating most traffic.

We will not be using the LAN for traditional computer traffic, just for control and status exchanges. Most messages will be short and would easily fit into one datagram packet. They will probably be RPCs on top of UDP or TCP/IP.

The main time-critical requirement is a guaranteed worst case for a workstation sending a message to one or several of the slaves: the message must get to the slave within 5 milliseconds. Most other traffic needs to get from master to slave within 16 milliseconds. There will be some background traffic as well, but it is generally not time critical.

It seems that Ethernet would have problems guaranteeing the response time, particularly if the background traffic got somewhat heavy and collisions started. We are strongly considering 16 Mb/s 802.5 token ring for our LAN, based on our reading (we have no actual experience with token ring). Theoretically, token ring should be able to meet our worst-case traffic estimates and give us the token often enough to meet the 5 millisecond response time, particularly if we take advantage of the alleged ability of 802.5 to support global priorities on the network.

My questions for the net are:

1. Are we barking up the wrong tree here? Is it conceivable to get the response we need at all?
2. Does token ring really offer the kind of response time that it shows on paper?
3. Do token ring global priorities really work, and are they usable from UDP/TCP/IP?
4. Has anyone done something like this before and have some hard numbers? (With Ethernet, token ring, or something else?)

Thanks, Bob

PS. Please use the email address in the signature, not the return address of this posting!
------------------------- Bob Berger SONY Advanced Video Technology Center 677 River Oaks Parkway San Jose, CA 95134 408-944-4964 [uunet,mips]sonyusa!sfcsun!berger (soon: berger@sfc.sony.com)
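(A quick back-of-envelope sketch of the raw numbers in the post above. The 128-byte frame size is an assumption, not something Bob stated; this counts only serialization time, ignoring media access delay, which is exactly the part the rest of the thread argues about.)

```python
# Back-of-envelope check (assumed frame size, not from the post): how much of
# the 5 ms budget does merely clocking a short control frame onto the wire use?

def serialization_time_s(frame_bytes, bits_per_second):
    """Time to clock one frame onto the wire, ignoring access delay."""
    return frame_bytes * 8 / bits_per_second

FRAME_BYTES = 128          # assumed size of one short RPC message plus headers
RING_BPS = 16_000_000      # 16 Mb/s 802.5 token ring
ETHER_BPS = 10_000_000     # 10 Mb/s Ethernet
DEADLINE_S = 0.005         # the 5 ms worst-case requirement

for name, bps in [("16 Mb/s token ring", RING_BPS), ("10 Mb/s Ethernet", ETHER_BPS)]:
    t = serialization_time_s(FRAME_BYTES, bps)
    print(f"{name}: {t * 1e6:.0f} us per frame, "
          f"about {int(DEADLINE_S // t)} back-to-back frames fit in 5 ms")
```

On raw bandwidth alone, both media have plenty of headroom for short frames; the question the thread turns on is how much of that budget access delay and queueing can eat in the worst case.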
hedrick@athos.rutgers.edu (Charles Hedrick) (07/30/90)
You've got to be real careful about this "guaranteed" stuff. I think it's mostly marketing propaganda. The behavior of both Ethernet and TR has statistical components. These include instantaneous load on the network and packets being dropped.

TR propaganda talks about "guaranteeing" that the bandwidth is evenly split. But the problem is, generally there are enough hosts on the network that if they all decide to transmit at full speed at the same time, you've got an overload. So the fact that things work at all is due to the random nature of the traffic: all hosts don't transmit at once. Generally, if you wanted to make a precise mathematical model of the thing, you'd have to take the probability distributions for the offered load of each source and make sure that the combined load on the network is within acceptable bounds "almost all" of the time. Few of us can afford networks fast enough that the worst case will be within bounds. Maybe in your specific application you can. The point is that you've got to size the network so that you avoid an overload condition. The fact that TR and Ethernet handle overloads somewhat differently probably isn't going to affect you.

There are also queueing issues in the hosts. Our experience with TR is that many of the IBM TR cards have fairly small buffers -- smaller than current Ethernet devices. So in fact the servers don't handle peak loads well at all; they start losing packets. Again, you may not be able to afford enough buffers on the cards to handle all slaves sending maximum-size messages to the master at the same time. You're going to have to depend upon reasonable statistics.

Finally, we have the issue of packets dropping. Lots of things can cause a packet to drop, including momentary noise somewhere. On the TR you have in addition the fact that when you boot an IBM PC (can't speak for your workstations), its relay clicks in and interrupts the network for a while. We think the token gets lost and has to be regenerated. At any rate, the delay -- although only a fraction of a second -- is still probably longer than the longest likely delay due to collisions. Ethernet is passive, so turning a device on or off does not cause any interruption.

The net effect of all of this is that you probably want to set up a mockup or do a fairly detailed simulation. Or, if the bandwidths involved are small enough, just rely on massive overkill. If the messages are small enough and don't happen that often, you've got enough extra bandwidth that it's unlikely you'll have any problems with either technology. But if you're at all close to the limits, what actually happens as you near overload is not the simple thing that TR propaganda implies.
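(A toy illustration of the statistical sizing argument above. All the parameters here are invented, and independent stations are a deliberate simplification; the point is only that "average load is low" and "overload is improbable enough" are different claims, and the second is the one that has to be checked.)

```python
# Toy model (all parameters invented): N stations each offer traffic in a
# given time slot with probability p, independently. Even at a low average
# load, the probability that more stations transmit at once than the medium
# can absorb within the deadline is nonzero -- it just has to be small enough.
from math import comb

def prob_overload(n_stations, p_active, max_simultaneous):
    """P(more than max_simultaneous of n_stations offer traffic at once)."""
    p_ok = sum(comb(n_stations, k)
               * p_active**k * (1 - p_active)**(n_stations - k)
               for k in range(max_simultaneous + 1))
    return 1.0 - p_ok

# Say 128 slaves, each active in a given slot 2% of the time (mean load ~2.6
# simultaneous talkers), and the ring can absorb 8 within the deadline:
print(f"P(overload in a slot) = {prob_overload(128, 0.02, 8):.2e}")
```

Whether a tail probability like that is acceptable depends entirely on the application, which is Hedrick's point: you size for "almost all of the time", not for the absolute worst case.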
kwe@bu-it.bu.edu (Kent England) (07/31/90)
In article <Jul.29.17.57.05.1990.14474@athos.rutgers.edu>, hedrick@athos.rutgers.edu (Charles Hedrick) writes:
> You've got to be real careful about this "guaranteed" stuff.

How right. Even "guaranteed" begs the question of exactly what is guaranteed. I recall a discussion about FDDI that started with figuring out effective throughput and ended up talking about the effect on throughput of setting certain token rotation timers. I believe that discussion was right here on comp.dcom.lans two or three months ago. I can't recall exactly the other principals in the discussion, or I would give credit for the info.

One thing I came away with from that discussion was that some of the worst-case maximum token rotation times, for given congestion and timer settings, were positively geologic timeframes. So, token ring may be "guaranteed", but how long are you going to wait for a given target token rotation time? And what happens to the guarantee when the token is lost, or other losses occur? It does not hold.

Seems to me that our profession is beyond making these sorts of arguments for one technology over another. It's out of fashion, like saying Ethernet can't sustain more than 3 Mbps throughput.

--Kent
andrew@dtg.nsc.com (Lord Snooty @ The Giant Poisoned Electric Head ) (07/31/90)
In article <61624@bu.edu.bu.edu>, kwe@bu-it.bu.edu (Kent England) writes:
> [..] So, token ring may be "guaranteed", but how long are you going
> to wait for a given target token rotation time? And what happens to the
> guarantee when the token is lost, or other losses occur? It does not
> hold.

I more or less agree with all that's been said, except to point out that FDDI uses a "timed token" protocol whereas 802.5 token ring does not. A fellow called Werner Bux (IBM Zurich, I believe) has published many papers analysing loading vs. offered traffic (and vs. many other parameters).

Also, it was mentioned that "the token is believed to be lost" when a new station inserts and clicks in its MAU relay, thus causing a longish delay. This is not what the protocol specifies; the Initialisation Phase merely causes the new station to exchange special MAC frames with designated Server nodes on the ring. At any rate, there should be no loss of token *theoretically*.
--
...........................................................................
Andrew Palfreyman               Incidentally, in English, the name of the planet
andrew@dtg.nsc.com              is "Earth". - Henry Spencer
chris@yarra.oz.au (Chris Jankowski) (07/31/90)
In article <19300@well.sf.ca.us> berger@well.sf.ca.us (Robert J. Berger) writes:
>
> We are looking to make a special purpose dedicated lan for controlling
> up to 128 devices. These devices will run a real time os such as PSOS or
> VxWorks. There will be up to 16 master devices made up of unix workstations
> running Unix System V.4. The workstations will be initiating most traffic.
  ^^^^^^^^^^^^^^^
> The main time critical response we need is to have a guaranteed worst case of
> the workstation sending a message to one or several of the slaves, where the
> message must get to the slave within 5 milliseconds. Most other traffic needs
                                ^^^^^^^^^^^^^^^
I am just wondering about the following:

1. You say you will be using UNIX System V Release 4 on your workstations.

and

2. You require a *guaranteed* delivery time for a packet sent by those workstations within 5 milliseconds; thus the workstation has to issue those packets with a time resolution no worse than 5 milliseconds, I presume.

Now consider the following: there is an interrupt for your real-time application on the workstation, but the workstation just happens to be processing some stuff in a critical section of the kernel, which cannot be interrupted. So the kernel continues on its merry business and time flies. I remember reading an HP paper a few years ago saying that it can take up to a second before the standard UNIX kernel switches the interrupts back on. Otherwise you need a so-called preemptive kernel. I know very little about SVR4, but I think that it is not preemptive. I believe it was to be, and I vaguely remember 10 ms being mentioned sometime in 1988, but I think that it was quietly dropped in 1989.

My conclusion is that it looks as if your 5 ms may be insignificant compared to the variability of response times on the workstations.

Am I right or wrong, or maybe I do not know something important? Anybody care to comment?

 -m-------  Chris Jankowski - Senior Systems Engineer  chris@yarra.oz{.au}
 ---mmm-----  Pyramid Technology Corporation Pty. Ltd.  fax  +61 3 820 0536
 -----mmmmm---  11th Floor, 14 Queens Road              tel. +61 3 820 0711
 -------mmmmmmm-  Melbourne, Victoria, 3004 AUSTRALIA        (03) 820 0711

micron n. - a unit of length of one millionth of a meter, worth $2,000,000,000 since the fault in the Hubble space telescope has been identified.
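(A rough sketch of how one might check the concern above empirically; this is not a rigorous benchmark, and the numbers it prints depend entirely on the host and its load. The idea: ask the OS for 5 ms wakeups and see how late they actually arrive.)

```python
# Rough sketch (not a rigorous benchmark): request a sleep of one deadline
# period repeatedly and record the worst observed overshoot. On a
# time-shared, non-preemptive kernel the worst case can dwarf the period
# itself, which is exactly the variability being questioned above.
import time

def worst_wakeup_lateness_ms(period_s=0.005, iterations=200):
    """Sleep for period_s repeatedly; return worst overshoot in ms."""
    worst = 0.0
    for _ in range(iterations):
        t0 = time.monotonic()
        time.sleep(period_s)
        lateness = (time.monotonic() - t0) - period_s
        worst = max(worst, lateness)
    return worst * 1000.0

print(f"worst wakeup lateness: {worst_wakeup_lateness_ms():.2f} ms "
      f"(against a 5 ms period)")
```

If the worst-case lateness on the sending host is already comparable to 5 ms, the choice of LAN technology is not the binding constraint.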
rpw3@rigden.wpd.sgi.com (Rob Warnock) (08/04/90)
In article <61624@bu.edu.bu.edu> kwe@bu-it.bu.edu (Kent England) writes:
+---------------
| hedrick@athos.rutgers.edu (Charles Hedrick) writes:
| > You've got to be real careful about this "guaranteed" stuff.
| ... I recall a discussion about FDDI that started with figuring
| out effective throughput and ended up talking about the effect on
| throughput of setting certain token rotation timers. I believe
| that discussion was right here on comp.dcom.lans two or three months
| ago. I can't recall exactly the other principals in the discussion or
| I would give credit for info.
+---------------

I was one of them. ;-} It was in late May, as I recall.

+---------------
| One thing I came away with from that discussion was that
| some of the worst case maximum token rotation times, for given
| congestion and timer settings, were positively geologic timeframes.
+---------------

*Seconds*! Like, 82 of them! Max config ring: 500 dual-attach single-MAC stations in a 100 km circumference circle, that is, 200 km of fiber. [My previous number of 164 seconds was for 1000 single-attach stations with no concentrators, which isn't really a legal configuration.]

And, quoting further from myself in that discussion:

"Get real!", you say? O.k., let's say that since we really don't want any given file server to be able to bang out more than 90% of the net (even if we ask him to), we can run with T_Opr as small as 16 ms (or, max_bytes/rotation == 200K). That still means 16 *seconds* before you get to send, if everybody else suddenly wants to send 200K, and you happened to have just missed capturing the token. (That is, the "storm" started with the guy immediately downstream of you.) [Oops! "Only" 8 seconds if you have 500 dual-attach single-MAC stations.]

"That's still ridiculous!", you say? O.k., so we don't have 1000 nodes in the ring, we have 200. And we don't have 200 km of fiber, we have 20 km. Now we're down to something that you could easily see in a single department, if all the fiber happens to be "starred" out from a central wiring closet. (Dual-attach, so the average station is 25 meters from the closet. Real enough?) Idle TRT is down to 320 usec, so we set T_Opr to 5.66 ms (so our expensive file server can at least send a 64K window per token, with 4096 bytes of user data per packet, and thus get about 93% of FDDI), and it *still* can be as much as 1.1 seconds before you get to send if everybody else suddenly wants to send only 64K.

I'll still stand by those numbers [as modified above]...

-Rob

-----
Rob Warnock, MS-9U/510		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA 94039-7311
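(The arithmetic behind the worst-case figures above reduces, as a simplification that ignores ring latency and the finer timed-token rules, to: if you just miss the token and every other station then uses its full per-token allotment T_Opr once before the token comes back around, you wait roughly n_stations * T_Opr.)

```python
# Simplified worst-case token wait on FDDI: you just missed the token, and
# every other station uses its full token-holding allotment T_Opr once before
# it returns. Ignores ring propagation delay and timed-token subtleties.

def worst_case_wait_s(n_stations, t_opr_s):
    """Seconds before you get to send, worst case under the simplification."""
    return n_stations * t_opr_s

print(f"{worst_case_wait_s(1000, 0.016):.1f} s")    # 1000 stations, T_Opr = 16 ms
print(f"{worst_case_wait_s(500, 0.016):.1f} s")     # 500 dual-attach stations
print(f"{worst_case_wait_s(200, 0.00566):.2f} s")   # 200 stations, T_Opr = 5.66 ms
```

Those three cases reproduce the 16-second, 8-second, and 1.1-second figures quoted above, which puts them three orders of magnitude beyond a 5 ms deadline.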
ken@minster.york.ac.uk (08/09/90)
In article <64442@yarra.oz.au> chris@yarra.oz.au (Chris Jankowski) writes:
>In article <19300@well.sf.ca.us> berger@well.sf.ca.us (Robert J. Berger) writes:
>> We are looking to make a special purpose dedicated lan for controlling
>> up to 128 devices. These devices will run a real time os such as PSOS or
>> VxWorks. There will be up to 16 master devices made up of unix workstations
>> running Unix System V.4. [...]
>
>My conclusion is that it looks as if your 5 ms may be insignificant
>compared to the variability of response times on the workstations.
>
>Am I right or wrong, or maybe I do not know something important?
>Anybody care to comment?

Well pointed out. I still laugh at people selling `Real-Time' Unix, with claims like "We run real-time Unix, and you can use NFS, etc., etc." If you run NFS then you ain't running in real time. If you want guaranteed response times, don't use Unix; use a real-time operating system.

Ken
--
Ken Tindell             UUCP:     ..!mcsun!ukc!minster!ken
Computer Science Dept.  Internet: ken%minster.york.ac.uk@nsfnet-relay.ac.uk
York University,        Tel.:     +44-904-433244
YO1 5DD UK
mack@wizzle.enet.dec.com (Dick Mack) (08/16/90)
|> Also, it was mentioned that "the token is believed to be lost" when a new
|> station inserts and clicks in its MAU relay, thus causing longish delay.
|> This is not what the protocol specifies; the Initialisation Phase merely
|> causes the new station to exchange special MAC frames with designated
|> Server nodes on the ring. At any rate, there should be no loss of token
|> *theoretically*.

If what you are talking about here is 'graceful insertion', one has to be careful -- even a single-attachment end station can add enough delay to blow timers and cause a ring re-initialization. When one takes into consideration that two operational segments might be joined, there has to be some guarantee that there are no multiple tokens and that the insertion has not caused frames to be concatenated. Ensuring token loss, so that the total ring reconfigures, is an easy way to accomplish this.

Dick Mack
pat@hprnd.HP.COM (Pat Thaler) (09/01/90)
|> Also, it was mentioned that "the token is believed to be lost" when a new
|> station inserts and clicks in its MAU relay, thus causing longish delay.
|> This is not what the protocol specifies; the Initialisation Phase merely
|> causes the new station to exchange special MAC frames with designated
|> Server nodes on the ring. At any rate, there should be no loss of token
|> *theoretically*.

From IEEE 802.5:

    7.4.2 Insertion/Bypass Transfer Timing. The insertion/bypass
    mechanism shall break the existing circuit before establishing the
    new circuit. The maximum time that the ring trunk circuit is open
    shall not exceed 5 ms.

and

    6.4 Symbol Timing ....
    (2) Whenever a station is inserted into the ring or loses phase
    lock with the upstream station, it shall, upon receipt of a signal
    which is within specification from the upstream station,
    (re)acquire phase lock within 1.5 ms.
    (3) .....

So it looks to me like station insertion causes at least a 6.5 ms break in the network, perhaps longer, since all the stations between the inserting station and the master lose and reacquire lock. During that time, packets or the token can be lost.

Pat Thaler
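(Summing those two worst-case figures against the deadline from the original post; note this counts only a single station re-acquiring lock, so as observed above the real break may be longer.)

```python
# Worst-case 802.5 station-insertion break, from the spec clauses quoted
# above, compared against the 5 ms deadline in the original post. Counts
# only one station re-locking; downstream stations may add more.
TRUNK_OPEN_MAX_S = 0.005   # 7.4.2: ring trunk circuit open at most 5 ms
RELOCK_MAX_S = 0.0015      # 6.4(2): phase lock reacquired within 1.5 ms
DEADLINE_S = 0.005         # the 5 ms requirement from the original post

break_s = TRUNK_OPEN_MAX_S + RELOCK_MAX_S
verdict = "exceeds" if break_s > DEADLINE_S else "is within"
print(f"minimum worst-case insertion break: {break_s * 1000:.1f} ms, "
      f"which {verdict} the 5 ms deadline")
```

So a single station insertion alone can blow the 5 ms budget, before any queueing or token-rotation delay is counted at all.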