Sun-Spots-Request@RICE.EDU (Vicky Riffle) (04/23/87)
SUN-SPOTS DIGEST         Thursday, 23 April 1987         Volume 5 : Issue 9

Today's Topics:
        Summary of Experiences: Super Eagles on Suns
        68881 speeds
        Slight csh/fifo problem on 3.2
        blinking blocks and disk controllers (very long)
        TeX on Sun 3.2
        SUN WP packages Query?
        WWB for Sun?
        macpaint and digital scanner search?
        3/50 and 3/75 speeds?
        Berkeley Smalltalk for Sun 3.0?
        XNS for Sun3?
        SUN vs HP9000?
        Public Domain SPICE?
        rpc.lockd failure?
------------------------------------------------------------------------

Date: Tue, 31 Mar 87 12:03:08 MST
From: gmt@arizona.edu (Gregg Townsend)
Subject: Summary of Experiences: Super Eagles on Suns

Thanks to all those who reported their experiences with Super Eagles on Suns. The results can be summarized as follows, with addresses for identification only and not guaranteed functional:

reported by             #drives     age    cpu  ctlr    problems
----------------------  ----------  -----  ---  ------  ------------------------
ultra!shj@ames          1 drive     2 wk   280          none
hedrick@topaz.rutgers   >=4 drives  fall                none
brisco@caip.rutgers     >4 drives   4 mo   2nn  xy450   none (duplicate report?)
deutch@sam.cs.cmu       2 drives           280          1 failure; down 2.5 wks
ehrhart@spam.istc.sri   10 drives   <3 mo       xy451   none
  "                     4 drives    >3 mo       xy451   none
shel@CFI                2 drives    5 mo   280          none
hull@buffalo.csnet      4 drives                ip      4 failures; down >90 dys
ray@ssl-macc.co.uk      1 drive            160  3d pty  1 failure
ek@ame.arizona          1 drive     1 mo   160          none
pac@stl.stc.co.uk       1 drive     5 mo        xy451   none
craig@squaw-valley.UCI  2 drives    6 wk   280          none
  "                     6 drives    2 wk   280          none

Total: 6 failures at 3 sites out of about 40 drives at 10 sites.
Median age of drives reported on: about 3 months.

Sites without problems were typically enthusiastic, not just satisfied, though two people used the same phrase about keeping fingers crossed. People reporting failures were understandably upset.

My inquiry prompted a phone call from Sun's Southwest Customer Service manager. He says the problem has been identified as failure of the head locking mechanism during shipment.
All Super Eagles shipped by Sun after early March should be okay because (1) a team from Fujitsu swept through Sun's inventory and inspected every drive, pulling the bad ones; and (2) Sun is adding their own locking mechanism before shipment to supplement Fujitsu's mechanism.

All things considered, I'm guardedly optimistic about the 280 we expect in early April. I don't EXPECT it to fail but still regard it as possible.

------------------------------

Date: Wed, 1 Apr 87 14:20:20 EST
From: mnetor!utzoo!henry@seismo.CSS.GOV (Henry Spencer)
Subject: 68881 speeds

Regarding my previous comment on making sure that your 68881 is really running as fast as it can: several people have pointed out that the Floating-Point Programmer's Guide quietly mentions a handy little program "/usr/etc/mc68881version", which tells you which mask level of 68881 you have and roughly how fast it is running (the latter might not be right if you have other users active on the system, mind you).

If mc68881version says that you've got mask A93N (as opposed to A79J), and on a quiet machine it says that the clock speed is circa 12.5 MHz, then the jumpering is wrong, and fixing it will boost your floating-point speed by maybe 25%. If it says circa 16.8 MHz, you already have the fast speed. If you've got mask A79J, 12.5 is the fastest you can run. This is for 3/75s, 3/160s, and 3/180s. The 3/50 is a special case, and the new machines like the 280 are a different story entirely.

Now, the real question is, why doesn't Sun bother to check this on the machines being shipped, since they've *got* a program to do it?

Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry

------------------------------

Date: Mon, 06 Apr 87 15:43:01 EST
From: ted@braggvax.arpa (Ted Nolan)
Subject: Slight csh/fifo problem on 3.2

I was playing with fifos a little on 3.2, just because I've never had them before and thought they were neat.
I find that there's a slight problem with csh and fifos: to wit, if you have noclobber set, csh will refuse to redirect output to a fifo. Since csh seems happy enough to redirect to other devices, there was probably just an oversight in adding the new file type to the 'ok to redirect to even though it already exists' list.

Ted Nolan
ted@braggvax.arpa

------------------------------

Date: Fri, 10 Apr 87 16:02:20 est
From: dan@flash.bellcore.com (Daniel R. Strick)
Subject: blinking blocks and disk controllers (very long)

BLINKING BLOCKS

A few months ago I noticed the "blinking block" problem on a CDC 9715-515 connected to a Xylogics 451. My recollection of the details is somewhat fuzzy. This version is 90% correct:

When the drive was formatted, the diag program (release 3.0 of sun unix) noticed that a sector was bad and slipped it. The diag program then decided that the same (logical) sector was still bad and forwarded it.

After the disk was formatted, I rebooted unix and ran a home-grown generic disk testing program (writes a pattern, reads it back, checks the result, advances to the next track, etc.). The log file written by the testing program was fascinating. Pass N would complete without exception and then on pass N+1 the program would report a data miscompare and show the pattern (the incorrect data) that was written on pass N-1. The disk controller apparently found two (or more) physical sectors to be equally attractive homes for a single logical sector. The unix disk driver never reported an I/O error. Replacing the disk controller did not help.

I examined the faulty track with the diag "read headers" command. It would sometimes show a normal set of headers with one spare. Sometimes it would show the same set with one header completely garbaged.

[Skip the next two paragraphs if you understand disk formatting.]

A sector written on disk generally consists of a sector header followed by a data segment. The sector header contains some sort of sector address (e.g.
cylinder, track, and sector numbers) and some sort of error detection code. An error detection/correction code is appended to the data segment. "Formatting" a disk drive generally means writing the sector headers. The sector headers are not rewritten during normal operation.

When a sector of data is written to disk, the drive must switch from reading to writing between the header and data segments. This creates a break in the bit stream recorded on the drive. This is followed by a repetitive pattern (e.g. a couple of hundred zero bits) because the drive electronics must be synchronized with the bit stream in order to make sense out of it. The break and the synchronization pattern are called a "gap". The gap is followed by a special "sync" code (e.g. a distinctive pattern beginning with a 1 bit) that marks the beginning of the data. There is another gap before the sector header.

Most disk controllers search for a particular sector within a track mainly by reading sector headers until they find one with the desired sector number. Sector headers with impossible addresses or bad error detection codes are mostly ignored. If the correct sector is not found within a reasonable time, the controller gives up.

One technique for working around bad spots on a disk is sometimes called "sector slipping" (Xylogics terminology). Each track is formatted with an extra sector. The extra sector does not interfere with normal disk I/O because it is formatted with an invalid header, 0xDDDDDDDD. If one of the normal sectors is bad, the bad sector and all following sectors are shifted one physical sector towards the end of the track. The header of the bad sector is filled with 0xFEFEFEFE.

Another method for avoiding bad spots is sometimes called "sector forwarding". The address of the bad sector is entered into a table (stored in a more or less fixed location on the disk) that assigns a good sector to be used in place of the bad sector. The header of the bad sector is set to 0xFFFFFFFF.
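
The slipping and forwarding schemes described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the Xylogics firmware or the diag program: a header is reduced to a bare logical sector number (a real header also carries cylinder and track numbers and an error detection code), the forwarding table is a plain dictionary, and the function names (format_track, slip, forward, find_sector) are made up for this example. Only the three marker values come from the description above.

```python
SPARE = 0xDDDDDDDD      # header of the unused extra sector on each track
SLIPPED = 0xFEFEFEFE    # header left on a bad sector that was slipped
FORWARDED = 0xFFFFFFFF  # header left on a bad sector that was forwarded


def format_track(nsectors):
    """Return headers for one track: logical sectors 0..n-1 plus one spare."""
    return list(range(nsectors)) + [SPARE]


def slip(headers, bad_phys):
    """Slip a bad physical sector: it and all later sectors move down by one."""
    if headers[-1] != SPARE:
        raise ValueError("no spare left on this track; forward instead")
    shifted = headers[bad_phys:-1]      # logical sectors that have to move
    headers[bad_phys] = SLIPPED
    headers[bad_phys + 1:] = shifted    # the spare slot is consumed
    return headers


def forward(headers, logical, table, replacement):
    """Forward a logical sector: mark its header, record the replacement."""
    phys = headers.index(logical)
    headers[phys] = FORWARDED
    table[logical] = replacement        # hypothetical table layout
    return headers


def find_sector(headers, logical, table):
    """Search the track's headers first, then fall back to the table."""
    for phys, hdr in enumerate(headers):
        if hdr == logical:
            return ("track", phys)
    if logical in table:
        return ("forwarded", table[logical])
    return None
```

For example, after slip(format_track(4), 1) the headers read 0, 0xFEFEFEFE, 1, 2, 3, so logical sector 1 now lives in physical sector 2. Note that find_sector only reaches the table when no matching header is found on the track, which is why a stale header left behind for a forwarded sector matters.
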
The sector forwarding table is consulted whenever the controller cannot find a sector.

If the error detection codes are reasonably reliable and the disk formatting program does not do anything silly like format two sectors with the same header, the disk system should never return bad data. A write or read may fail, but if both succeed the data should be correct (assuming your cpu, your memory, your backplane, and the universe are not otherwise hostile).

I suspect there is a stupid bug in the diag formatting program. Consider the following scenario. Diag has just slipped sector 41 on track 329/0. Logical sector 41 is now sitting on physical sector 42. Physical sector 41 is unused (header = 0xFEFEFEFE). Diag retests the track and decides that logical sector 41 is bad. Since a sector has already been slipped, sector 41 must be forwarded. This is accomplished by entering sector 41 into the bad sector table and setting its header to 0xFFFFFFFF.

Here is one way in which the job could be botched: Diag realizes that the slipped and forwarded sectors are the same logical sector and decides to give up on the slip and just forward the bad sector from its original location. Diag should first unslip the sector and then forward it. Diag does this backwards. The result is a normal track with a spare sector header. A header for logical sector 41 remains on track 329/0 even though the sector has been entered into the bad sector table.

Suppose the problem with physical sector 41 is slightly bad media underneath the sector header. When the header reads correctly, the data in physical sector 41 is used. If the header read fails, the disk driver looks in the bad block table, finds the entry for logical sector 41, and uses the data in the sector forwarding area.

I swapped my bad disk with a similar drive from a CCI Power 6. What can you do if this happens to you and you don't have a Power 6?
If you can figure out what is wrong with a track, you can use a diag command called "whdr" to patch a header. (Which header? The one that blinks and is in the bad sector table.) If that fails, you can try reformatting the drive. If that fails and you bought your disk system from SMI, insist that the disk be replaced. If you bought it someplace else, you can try adjusting the drive sector size switches to move the sector header away from the bad spot.

(P.S. The formatting bug can be invoked artificially using the release 3.2 formatter. Just add two entries for the same sector to the flaw map table and reformat.)

---------------------------------------

XYLOGICS 451 CONTROLLERS

I bought a small heap of Xylogics 451 disk controllers a while back, tested them, and found that none are to be trusted. I hear rumors that other people have had problems but I don't know who. The people I talk to at Xylogics insist that nobody else has reported such problems.

I have a program that walks over disk drives (writing and reading back random patterns). When I install a disk system I usually run patterns over the drives for a few days. The purpose of this exercise is to flush out intermittent cabling problems and media defects. When I began testing my first 451 controllers I was only doing 30 to 50 passes over each drive and the controllers seemed to work just fine.

A few disk systems later something strange happened. A word was dropped towards the end of a long disk write. I recabled the system and tried again. The disks worked just fine. A few disks later, it happened again. I started increasing the number of patterns run on each drive. I eventually discovered that the frequency of occurrence depended mainly on the controller and perhaps somewhat on the drives (mostly CDC 9715-515s and NEC D2352s).

I tested six Rev-B 451 controllers, five Rev-Fs and several 450 controllers. Every Rev-B controller would botch a write if tested long enough.
One controller ran about 500 passes over disk drives before failing. Another seemed to fail once every 50 passes or so. The average rate was roughly one failure in 200 passes. The Rev-F 451 controllers and the 450 controllers seemed to work ok. I started to send all my Rev-B boards back to Xylogics for upgrades. (P.S. The word going around is that if you have any pre Rev-F controllers, you should have them upgraded.)

Then I started to look into another problem that surfaced when I was working on the word slipping problem. Every once in a while I would get back a few bad bits. The patterns of the bits in error suggested that the Xylogics 451 controllers were botching the ECC (correcting the wrong bits).

I reformatted one of my CDC 9715s with only one verification pass to preserve the moderate media defects and retested four controllers (two repaired Rev-Fs, one Rev-B upgraded to Rev-F, and one Rev-B). Every controller failed in essentially the same way. Sometimes the system reported read failures, but a lot of the time (perhaps 30%, depends on the media defect) bad data was returned.

Perhaps the controller just doesn't recognize bad data reliably. One of the media defects I worked over tends to cause a couple of bits at the beginning of a sector to be dropped (missing), shifting the entire sector by a bit or two (a sync problem?). The system says the data is good even though a couple of thousand bits are wrong.

Could this be a driver problem? Since I don't have the source, I can't say. I can say that it was not fixed in release 3.2.

The 451 controllers are usable if you are careful to map out nearly all of the bad spots. The controllers seem to correct trivial (single bit?) errors correctly. The media testing done by diag during formatting is not sufficient to guarantee that you will never be "bitten".

The Sun release 3.2 formatting program will read the flaw maps from a new SMD disk drive.
You should compare the paper copy of the flaw map that you got with your drive with whatever the diag program claims it read off the drive. I just formatted a new CDC-9715 disk with the new diag. Diag found 41 spurious flaws, including 8 flaws that were each listed four times. Six of these were located in sector 70 (the drive only has about 50 sectors per track). Diag was smart enough not to slip or map these sectors, but it made up for this by slipping each of the other two once and forwarding them three times.

The people at Xylogics have not been very helpful. I suspect they are more interested in their new product lines and want these problems to go away without having to fix a bunch of old boards.

Would anyone out there in sun-spots land like to repeat my tests? You should be prepared to reformat most of one or more drives (CDC 9715-515s seem to work best as mediocre media helps to expose problems) with the release 3.0 formatter and run patterns for several days. You should probably have bought your 451 controller(s), Rev-F or later, from Xylogics or one of its distributors. I can provide my disk exercising program and a telephone number in Burlington Mass to be used if you decide your controllers are flakey.

---------------------------------------

OTHER CONTROLLERS

The disk controllers of 1987/1988 are likely to be the Xylogics 752/7503 and the Interphase 3200/4200.

I don't know very much about the Xylogics boards. I just got a pair of 752s without manuals or software, so I can't do much with them yet except admire the workmanship and the chip packing density. I assume the 7503 will be important as Sun will probably sell it and support it (standard disk driver, bootstrap proms, etc.). I think I was told that the 752s and the 7503s will be essentially software compatible and that makes the 752s very attractive. I don't recall if these controllers will be media compatible with 451s. It would be nice. I was told to expect a pair of 7502s in the mail very soon...

I also have several Interphase 4200 "cheetah" controllers. I have been told that the 4200 is basically a 3200 modified to handle high transfer rate drives, so most of what I say about the 4200 probably applies to the 3200.

The 4200 is a standard size VME card with a VME style "handle" that must be removed because it gets in the way when you try to install one in a Sun VME-VME adapter. Since the card is small, there are only two SMD B-cable connectors. I understand there is some sort of 4 drive adapter but I have not seen one. The SMD connectors are mounted straight up on the board rather than on the edge. This is a bit of a pain because flat cable connectors with strain reliefs may stick out too far above the board and the A-cable must be routed over the B-cable connectors. The B-cable connectors are not interchangeable.

Formatting a drive with a 4200 is a little messier than formatting with other controllers because the 4200 must be told the sizes of the gaps before and after the sector header. If you are trying to format a type of drive you have never formatted before, experimentation may be necessary. There is no notion of drive type number. You don't have to worry about the possibility that two drives might be incompatible because they were formatted with the same drive type number but have different geometries.

The 4200 supports sector slipping (Interphase calls it "mapping") and does track forwarding in hardware. There is no way to forward a single sector, but track forwarding should not be painful if nearly all bad sectors are slipped.

The controller is very fast. It plugs (nearly) directly into a Sun 3 backplane (no multibus adapter logic in the way) and can do 32 bit dma. The 4200 also has a 128 KB cache. When you ask it to read from a disk, it will start reading as soon as the head settles on the track.
The 4200 will continue reading from the disk after all the requested data has been transferred, stopping only when it has filled up the cache, sucked up two extra tracks, reached the bottom of the cylinder, or you ask it to do something else.

The software I got with the 4200 controller is a bit primitive but usable. A new version is in the works. I have been told it will be ready very soon and support formatting under unix (via ioctl()).

The controller is a bit buggy (error detection/correction problems and such) but people at Interphase are working on these problems. I suspect my 4200s will be fixed before my 451s.

------------------------------

Date: 12 Apr 1987 14:01:48 BST
From: uucp%ux63.bath.ac.uk@Cs.Ucl.AC.UK (J. H. Davenport)
Subject: TeX on Sun 3.2

I have received a patch tape from Sun to the Pascal compiler, which corrects the problem mentioned in my previous communication (faulty compilation of division by powers of two). The reference name on the tape seems to be "pascal_mod_by_2". This tape has a prerequisite patch, which fixes the same bug in C and F77 (and the common code generator). On recompiling after applying these patches, TeX passes the trip test, whereas before the result was a lamentable failure.

J.H. Davenport
JHD1%CAMPHX%CAGA@UCL-CS.ARPA

------------------------------

Date: 31 Mar 87 00:12:45 GMT
From: hplabs!felix!john@seismo.CSS.GOV (John Gilbert)
Subject: SUN WP packages Query?

Hello out there....

A friend is looking for a word processing package to run on SUN workstations. It should be WYSIWYG, and offer all the capabilities afforded by nroff/troff. So far, Interleaf has been by to tell them how much they would charge, and it was extremely unacceptable. Their secretaries indicated they would be happier with nroff/troff, which they already use.

Given the prices for PC type word processors, there ought to be some for the SUN that are affordable. All I need is pointers to products and they will be happy to follow up.
Please E-MAIL responses to me at: .!trwrb!felix!john

Thanks for any help.

-- John Gilbert
.!trwrb!felix!john

------------------------------

Date: Wed, 1 Apr 87 23:42:18 EST
From: mark@cbpavo.mis.oh.att.com (Mark Horton)
Subject: WWB for Sun?

The laser writer interface kit for Sun 3.2 says to get ditroff from AT&T. The only ditroff product I know of for sale is a binary distribution for the 3B2. Does anyone have ordering information about where I can get an appropriate ditroff package? I'm happy to order and pay for it, but I would like to get suitable binaries quickly if possible. (We have a ditroff, but it seems to be an old version, and it doesn't work properly.)

Mark Horton
mark@cbosgd.ATT.COM
cbosgd!mark@seismo.css.gov

------------------------------

Date: Fri, 3 Apr 87 13:42:36 EST
From: dunigan@ORNL-MSR.ARPA (Tom Dunigan)
Subject: macpaint and digital scanner search?

Looking for recommendations for macpaint-like software for Sun 3s. Any experiences with SolarPaint ($1,495 -- could buy a mac for that)?

Recommendations for digitizing scanners that interface to sun and associated software? Any experiences with Microtek MS-300A scanner?

thanks
tom
dunigan@ornl-msr.arpa

------------------------------

Date: Fri, 3 Apr 87 15:05:27 CST
From: dje%datacube.UUCP@CCA.CCA.COM (Dave Erickson)
Subject: 3/50 and 3/75 speeds?

With Sun's recent price reduction on the 3/50, we are considering buying several. The only problem we have with the machine is that it is about 3/4 as fast as a 3/75, the machine we're familiar with here.

We will soon dig into the 3/50 to try to increase the performance by increasing the CPU clock oscillator, CPU speed, and replacing anything else that is an obvious difference between the 3/50 and the 3/75. I believe the 3/50 CPU runs at 12 MHz and the 3/75 is 16 MHz. Is this true?

Has anyone ever done this? Is it possible?

Dave Erickson
------------------------
Datacube Inc.
4 Dearborn Rd.
Peabody, Ma 01960
617-535-6644
------------------------
[ihnp4 | mirror]!datacube!dje

------------------------------

Date: 6 Apr 87 11:34:00 EST
From: gaudette@icst-ecf.arpa (Philip Gaudette)
Subject: Berkeley Smalltalk for Sun 3.0?

I have a release of Berkeley Smalltalk II that runs ok on version 2.0 of the operating system but not 3.0 or later. The problem, of course, is that the BS II virtual machine uses obsolete code to deal with the display. Has anyone modified BS II so that it runs properly under SunView? If I figure out these changes, does anyone else want them?

-- Philip Gaudette, Computer Scientist
-- Natl Bureau of Standards ------

------------------------------

Date: 8 Apr 87 08:47:05 GMT
From: mcvax!oce.nl!mhsc@seismo.css.gov (Maarten Schoonwater)
Subject: XNS for Sun3?

We want to set up a Sun3/160 as gateway between our TCP-IP oriented Unix systems and some other machines which have XNS networking capabilities. We have the XNS-courier sources from the public domain (Cornell University) which come with the 4.3BSD distribution. This however is not guaranteed to work.

Has anyone ported this or other XNS software to a Sun3 with Sun 3.0 operating system? Has anyone implemented XNS networking in the Sun kernel? Our main interest is to do file transfer and perhaps terminal emulation so we don't need the complete Courier set.

Maarten Schoonwater     Usenet: mhsc@oce.nl
Oce-Nederland B.V.      mail  : P.O. 101 5900MA Venlo
R&D department                  The Netherlands

------------------------------

Date: Wed, 8 Apr 87 10:21:48 -0200
From: mcvax!kark!hayek@seismo.css.gov (Rolf Hillestad)
Subject: SUN vs HP9000?

Is there anyone out there who has worked with HP 9000/300 series machines with HPUX, and could give me some hints on strengths or weaknesses? I have a special interest in how the 9000/350CX behaves compared to a SUN 3/260C.

I am considering these two machines for a development project which involves a mapping system and communications using X.25 and X.400 protocols.
Any comments on the quality of HPUX 5.2, software or hardware in general, or graphical capabilities are welcome.

Please respond by mail, and I will summarize to the net if there is enough interest.

===================================================================
Rolf A. Hillestad              Box 25
Computer Technology Group      3601 Kongsberg
Kongsberg Vaapenfabrikk        Norway
USENET: mcvax!kark!hayek
===================================================================

------------------------------

Date: 8 Apr 87 19:07 +0500
From: carlos%deervax.concordia.cdn%ubc.csnet@RELAY.CS.NET (Carlos Perez)
Subject: Public Domain SPICE?

Does anyone know of a public domain version of SPICE that will run on a Sun 3/160 V3.2? Any information will be very much appreciated.

Carlos Perez
Concordia University
Department of Electrical Engineering
Montreal, Quebec, H3G 1M8
(514)848-3107

P.S. Please reply to any of the following addresses since I do not receive the Sun-Spots stuff.

UUCP: {decvax,ihnp4,akgua,etc.}!musocs!deervax!carlos
CSNET: carlos%deervax.concordia.cdn@ubc.csnet
CDNNet: carlos@deervax.concordia.cdn.

------------------------------

Date: 10 Apr 87 5:34 +0800
From: sample%ubc.csnet@RELAY.CS.NET (Rick Sample)
Subject: rpc.lockd failure?

I have been running into a problem with /etc/rpc.lockd, in that it will run fine for a while, and then dump core. When this happens, any program calling "lockf" hangs and can't be killed. This seems like a misfeature to me. Has anyone else had the same trouble?

Rick Sample
sample@ubc.csnet
sample@cs.ubc.cdn
sample@cs.ubc.can

------------------------------

End of SUN-Spots Digest
***********************