rmilner@zia.aoc.nrao.edu (Ruth Milner) (05/31/91)
About a month ago I posted a query about experience with Auspex NFS servers, and promised a summary. Apologies for the delay; here it is.

Of the replies I received, three were from people who have Auspex servers (or have installed them at customer sites), all after extensive evaluation. All three are *extremely* impressed with the systems, and very happy they bought one. I have included excerpts from their replies below.

The other three were from people suggesting possible alternatives. I had not heard of Sequent's potential capabilities as a server; it might be something to look at, although it still isn't clear that it could handle that number of clients on that many networks.

The second alternative is a Sun 4/75 with Legato's Prestoserve. This is reported as being able to support up to 16 SPARCstations. However, the administration involved in maintaining (and especially upgrading) five servers instead of one, with attendant duplication and the logistics of users file-sharing between them, is not something I can really consider in this case.

The third alternative was the OMNI/Interphase NC300, an Ethernet coprocessor. A fair amount of detail is given in the response below.

We met and talked with Auspex recently, and they had some interesting things to say. One was that the NFS serving (including disk requests) and IP routing are 100% independent of the host CPU, to the extent that the host itself can actually die, and everything else will keep going. They reported on a site where nobody noticed until 2 days later, when they went to log onto the Auspex. Unfortunately the host can't be rebooted independently of the other processors, but the downtime is less and can be scheduled for a more convenient time.

Here are the excerpts. Many thanks to those who sent them.

----------------------

[Configuration]

We have about 40 clients, about half Sun-4 and SPARC, most of the rest Sun-3, and a couple of Sun-2's. [...]
We're still on one net today, but will split it to two in a day or two. We intend to go to 4, then 6 as we upgrade Sun-3's to SPARCs. If we increase the number of clients, we may go to the full 8 networks.

All root and /usr is from the Auspex. All swap except for the 5 clients with local disks is from the Auspex. We use striping on the swap partition, and the -async export option on swap and tmp (var/tmp is linked to tmp, also). We don't have enough disk to put usr/local, X11, or home directories on the Auspex. Openwin is part of usr on the Auspex.

NS-5000M, pretty much the minimum configuration (16MB cache, 8MB Sun-3 host, 2 nets, 2 1GB drives, 1 FP, 1 SP). The host runs 4.0.3 as modified by Auspex. [...] the SPARC that Auspex will soon offer is the SPARCengine 1/E, the 12.5 MIPS SPARC. So it's not worth much more than the Sun-3 as a compute server. So I'm happy with the Sun-3 running 4.0.3 given the current choices.

[Performance]

Kind of early to tell. Our users are not noticing much difference from using 4 Suns as servers, but none of those Suns were 4/490-class, and we haven't had any heavy network traffic (even on our single net) yet. If it's different, it's better, though.

We will have parallel Cisco ports for non-server traffic outside the subnets, so we probably will never know about routing performance. Bruce Nelson of Auspex made the remark to us that the Cisco is about 50% faster than the NS-5000 for routing.

[Price]

Sun and Solbourne couldn't deliver the performance on so many networks. The 4/490, even with Prestoserve and Omni network boards, takes a big performance dip after two networks, by Sun's own results on the test we provided. [...]

The prices were pretty nearly identical with 2 GB of disk and enough guts to run 6 networks if the interfaces were added. Solbourne put 2 of their workgroup servers in. Sun put in all the Prestoserve and Omni Solutions stuff. Expansion of the Auspex will be lots cheaper than expansion of the others, though.
Your configuration will start out with Auspex cheaper -- 30-40 GB of IPI disk doesn't come cheap. And on Suns adding lots of SCSI disk is a nightmare.

[Reliability]

The only time it has been down since I added clients [3 weeks ago] is when I accidentally hit the power switch while fiddling with Ethernet cables -- I guess I'll epoxy a guard over the switch.

[Enhancements]

Ax_perfmon uses curses, so you can have displays running all over the network. It shows network traffic on each net by direction, processor loads, age of the data in cache, NFS traffic by function, and data operations for each drive. That's just in the summary screen -- there are more detailed displays for Ethernet, NFS, cache, disk, and virtual partition activity.

[Lengthy sample display omitted]

Note that all six networks were handling over 1MB per second. That impresses me. It can do it with eight networks, too, I think. It took 3 Sun-4 or SPARC clients on each network doing nothing but dd from an NFS file to /dev/null to push it that hard. But the key to getting that much data from the Auspex turned out to be having one disk per client, so that's interesting, too.

--------------------------------

From: jfd@octelb.octel.com (John F. Detke)
Organization: Octel Communications Inc., Milpitas Ca.

We have had an Auspex for almost a year now, and have been quite pleased. I don't have your posting in front of me, but let me take a stab at your questions.

Our configuration is an NS5000 with 15 660MB disks, 2 1GB disks and an 8mm tape drive. It has standard memory (8MB), 1 I/O processor, and 2 Ethernet boards (for a total of 4 networks), and runs AusOS 1.2 (soon to be 1.3).

We currently have 3 functioning nets: our backbone (all the servers, 7 3/80's, and 100 PC-NFS nodes), a 3/80 network (24 more 3/80's), and "my net" with just my SPARC, used for testing. The 3/80s are "dataless": they have their own root and swap on the 104MB Quantum, and get everything else from the Auspex.
We also run a few partitions for software development, our local (/usr/local) file system, a couple of disks of home directories, and that's about it for now. We are planning on moving all development directories to the Auspex.

It handles the routing fine (heavy traffic on the backbone, and bursts from the 3/80 network to the backbone during compiles).

The only major "problem" we have had is that it is too fast. We have Cabletron 10baseT equipment, and have 3 repeaters in the loop (MT8000, MMAC's), and the Auspex complains sometimes about "late collisions" and no carrier. Several Cabletron upgrades later, this only happens a few times a week. Auspex really follows the 9.6 microsecond packet spacing (not 9.61; 6-packet bursts at 9.6).

As for price/performance, the only problem I have is the $6,500 SCSI disks, which I know can be had for substantially less (but HP makes a nice, bulletproof 5 1/4" disk). Auspex compares prices with IPI disks, not SCSI. Even so, they do OK. [...]

We did some performance run-offs, and the Auspex wins, hands down. The 4/490 wasn't close, unless you were measuring single network performance. Routing really kills the 4/490. We even tried Legato's Prestoserve, which actually slowed down the 4/490 at peak performance (and Sun even admits this now!).

--------------

[Configuration]

We have much the same needs as you do for a customer of ours. I recently finished doing an evaluation of a Sun 4/490, Solbourne 5E/900, Auspex (NS3000 & NS5000) and the Epoch. For the strictly file/data serving part we picked the Auspex as the winner. We are also recommending a 470/490 class machine for some specific tasks.

[Site he visited] had a fully loaded NS5000 (a second one is on order) with 8 LANs, with each LAN supporting 17-35 Sun workstations and a 4/490 on each. The Suns have local disk strictly for swapping and temp files, except for the 4/490's, which are diskful. Everything else is snagged from the Auspex.
Additionally the workstations are running X11 or OpenWindows or SunView. They are quite pleased with what they have, and the only reason they are buying a second system is so they can have a readily available backup. They are using disk striping but not mirroring at this time.

[Price, reliability, performance]

Price was not a big issue in our determination due to the nature of the system. Availability and performance were the keys.

[Service]

The customer I visited had no complaints at all.

-------------------------

Don't know if you've taken a look at Sequent. With our parallel streams implementation, our TCP/IP and raw packet capabilities can scale from the very small to the very large. We have i486-based CPU boards (two CPUs per board) and VME Ethernet. I can't say anything about our machines serving Suns, except: "Ask your sales rep, don't assume." :->.

Andy Valencia
vandys@sequent.com

--------------------------

From: Dan Butzer <butzer@cis.ohio-state.edu>
Organization: The Ohio State University Computer & Information Science

While we do not have experience with the Auspex (Universities are pretty poor these days), we have done extensive testing on various servers to replace our aging 3/180's. We've found that the best bang for the buck is a Sun 4/75 with the Legato S-Bus Prestoserve. You can hang as many as 16 DISKLESS SPARCstations off one of these and see good performance. At that point you're beginning to approach network saturation, though the SS2 still has power to offer. 5 of these with second SCSI host adapters, 2nd Ethernet cards for a backbone net, and plenty of 1GB disks should do nicely.

Also, DEC 5000 systems will soon have Presto and should perform comparably....

--------------------

From: scp@acl.lanl.gov (Stephen C. Pope)
Organization: Advanced Computing Lab, LANL, NM

We've been using an NC300 in a Sun 4/260 which serves up home dirs and usr/local for about 25 machines, and root and swap for about 10 Sun-3s and 4 SLCs.
We've had it ever since OMNI first came out with it, before the Interphase buyout. It works great. So good I never have to think about it (except to be sure to get the new drivers every time I do an OS upgrade).

I've never wasted much time trying to benchmark it, but it definitely gives faster NFS service; the more clients, the better the speedup over not having it. Even with only 1-2 clients, it's about 25% faster than with it turned off.

Only once in the last 1 1/2 years has it failed: the code running on the coprocessor core dumped for unknown reasons. No problem though, since the usual Sun nfsd's take over right away, and your system keeps running.

You should bear in mind that (at least as of the last software release I have) promiscuous mode doesn't work, so you don't want to have to do any nit'ing on that particular interface.

-------------------

--
Ruth Milner, Systems Manager      NRAO/VLA      Socorro NM
Computing Division Head           rmilner@zia.aoc.nrao.edu