JERAGER@AMHERST.BITNET (PROF. JOHN RAGER) (02/14/89)
For those of us who don't have quite as much access to Sun grapevines, could you summarize the performance problems observed with a 3/50 running version 4 of the OS? Who is being impacted? Is one person writing C programs, etc., going to see an unbearable slowdown? We are a small academic group with 4 3/50s and frankly, none of the upgrade/replace options are appealing (they're all too expensive). If you can't respond from your own experience, you could release this to the public and I'll summarize responses, etc. Also, is anyone worried that version 5.0 won't run on Sun3s at all? Thanks, John Rager (JERAGER@AMHERST.BITNET)
hedrick@geneva.rutgers.edu (Charles Hedrick) (02/23/89)
Yes, 4.0 has been a disaster. We use 4.0.1 on 4MB 2/50's and 3/50's. Reports are mixed. I did some side by side tests of a 4.0 one and a 3.2 one starting up suntools and lisp, and found things were OK. I use a 3/50 with 4MB as my normal machine, including building system software. It's certainly not a 3/60, but it works OK. But some users report that when they use a lot of windows, things are much slower. Our operator, who uses one window per server and switches a lot, was near to murdering us until we added another 4MB to that machine. I've also had reports from some users that things seem a lot slower. It seems like there's a different paging strategy somehow. When you go back to a window that you haven't used for a while, chances are you'll have to wait while the program pages back in. One suspicion is that there's more of a tendency to let one program take over the whole system, so once a program is working in one window, all other programs get paged out more quickly than they used to. Sort of sounds like they let the working set grow too rapidly, doesn't it? But it could also be that paging with NFS is slower than paging with ND. There are a lot of things you can do to make things better: tailoring a kernel, making sure you don't run unnecessary daemons, making sure your syslog doesn't loop, etc. Sun has a list of them, and that list has been given here several times. But even after all of that is done, there is a difference between 4.0 and 3.2. I don't find 4.0 unbearable, but I just got a 3/50 as 4.0 came up, so I haven't had much experience under 3.2. I don't think you'll find single C compiles going slower. Where you'll be affected is when trying to do things in multiple windows. Note that I use X. I may be better off than suntools users. We have been able to build a shared X library, and so take advantage of that new facility to save memory. Unfortunately, we haven't been able to do that for suntools.
The 4.0 suntools, although done with shared libraries, is too slow to be usable. We don't know what it is, but we went back to the 3.2 version. This helped, but didn't completely get rid of the difference. (I've contemplated building a 3.2 suntools with shared libraries, but this would involve some structural changes, so we haven't had time to do it.) At this point I plan to stick with 3.2 suntools until we get Sunview 2, which runs on top of X. Of course there's apparently some question whether the Sun merged X/NeWS will be usable on 4MB machines either. If not, we'll probably try to get people to migrate to the traditional X software. It does seem like things are faster on our 32MB Sun 4's. I'm not sure whether I'd switch if I had to do it again. My problem is that the university has lots of Suns. Some need the new features, largely because of new hardware. We can't really maintain two versions. So we eventually have to upgrade. I'm sort of upset that the promised performance tuning is going to wait until 4.1, though as we start seeing how many bugs there are, I begin to understand that they have their hands full fixing bugs. I think at the moment I'd rather have them not add any features or change any algorithms, but just fix bugs. 4.0 and even 4.0.1 are fairly buggy. We have problems with just about everything related to networking. Rlogin sometimes doesn't turn XON/XOFF on and off. NFS hangs. ypbind crashes or goes catatonic. lockd crashes if your host name is long. Some of these have fixes. The ypbind from the answerline may fix the yp problems. But I have been using at least one FTE person fixing bugs in the kernel and basic utilities for the last several weeks, and see no end in sight. I sent the first batch of fixes to sunbugs, and didn't even get an acknowledgement. I think if you have no strong reason to change, you might want to wait for 4.0.2.
I think it will have a lot of the problems fixed, though I'm worried about rumors that they're not going to be able to tell us in detail what changed, as they did from 4.0 to 4.0.1, which means we'll just have to throw away everything and start from scratch. If I've done enough local work to make 4.0.1 stable by then, I may ignore 4.0.2. But a site without source will probably want to do it the other way. The other approach would be to wait for 4.1, which is supposed to address the performance problems. Though given our experience with 4.0, you might want to make it 4.1.2.
wayne@ames.arc.nasa.gov (Wayne Hathaway) (03/06/89)
Relative to Charles Hedrick's observation on paging being different under SunOS 4.0, I understand that one of the changes that was made was to go to a more "global" page replacement strategy, instead of treating program areas and disk cache and so forth as separate "local" paging areas. The end result is that single applications can definitely take over more of the machine, causing more things to be paged out (and subsequently back in). Apparently this is even regarded within Sun as a somewhat suspect "optimization."
Wayne Hathaway
Ultra Network Technologies    domain: wayne@Ultra.COM
101 Daggett Drive           Internet: ultra!wayne@Ames.ARC.NASA.GOV
San Jose, CA 95134              uucp: ...!ames!ultra!wayne
408-922-0100
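To make the "global" versus "local" distinction in the post above concrete, here is a toy simulation. It is only a sketch: it assumes plain LRU replacement and made-up pool sizes, and the names `LRUPool`, `editor_faults`, "editor" and "file" are hypothetical. It is not Sun's actual paging algorithm. It counts the page faults taken by a small interactive program when a large sequential file read runs between its two bursts of activity.

```python
from collections import OrderedDict


class LRUPool:
    """A fixed number of page frames with least-recently-used replacement."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()   # (owner, page) -> True, oldest first
        self.faults = {}                # owner -> number of page faults

    def touch(self, owner, page):
        key = (owner, page)
        if key in self.resident:
            self.resident.move_to_end(key)      # hit: mark as most recent
            return
        self.faults[owner] = self.faults.get(owner, 0) + 1
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)   # evict least recently used
        self.resident[key] = True


def editor_faults(shared, total_frames=16, editor_pages=4, file_pages=100):
    """Faults seen by a hypothetical 'editor' around a big file read.

    shared=True  -> one pool for program pages and file data ("global").
    shared=False -> the frames are split into two separate pools ("local").
    """
    if shared:
        pool = LRUPool(total_frames)
        pools = {"editor": pool, "file": pool}
    else:
        half = total_frames // 2
        pools = {"editor": LRUPool(half), "file": LRUPool(half)}

    def burst(owner, count):
        for page in range(count):
            pools[owner].touch(owner, page)

    burst("editor", editor_pages)   # editor is active in its window
    burst("file", file_pages)       # another program streams a large file
    burst("editor", editor_pages)   # user switches back to the editor
    return pools["editor"].faults["editor"]


if __name__ == "__main__":
    print("shared pool:", editor_faults(shared=True))
    print("split pools:", editor_faults(shared=False))
```

With these numbers the shared pool gives 8 editor faults (every editor page has to be paged back in after the file read), while the split pools give 4 (the editor's pages survive in their own pool). That is the "wait while the program pages back in" effect in miniature; real systems use different replacement policies, so treat the numbers as illustrative only.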
guy@uunet.uu.net (Guy Harris) (03/14/89)
>Relative to Charles Hedrick's observation on paging being different under
>SunOS 4.0, I understand that one of the changes that was made was to go to
>a more "global" page replacement strategy, instead of treating program
>areas and disk cache and so forth as separate "local" paging areas.
Well, sort of. In fact, what happened is that the UFS and NFS file systems perform "read" and "write" operations by temporarily mapping the affected region of the file in question into the kernel's address space and doing copies between the mapped region and the process's buffer. The net result happens to be that all of the page frame pool is used both for paging for "read"s and "write"s, and paging for mapped-in files (e.g., demand-paged executables and "mmap"ped files), which means in effect that any page in the page frame pool can be used as what in earlier UNIX systems would be a buffer-pool block for file data. (Control information - inodes, bitmaps, indirect blocks - is still kept in a reduced-size traditional buffer cache).
>Apparently this is even regarded within Sun as a somewhat suspect
>"optimization."
It is not, as far as I know, so regarded - at least not in the OS group. The idea is considered reasonable; there may, at present, be problems with the paging algorithms that make it cause problems in some cases. (The potential performance improvement - which, at least on machines with lots of memory, translates to an actual performance improvement - wasn't the only reason why this was done; it also simplified problems of keeping file data references in "read" and "write", and references to mapped file data, consistent.)
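As a rough user-level analogue of the "read by mapping and copying" scheme described above, the sketch below maps a file and copies the requested bytes out of the mapping into a fresh buffer. This is an illustration under stated assumptions, not the SunOS kernel code: the helper name `read_via_mapping` is hypothetical, the mapping is made in the calling process rather than in the kernel's address space, and the whole file is mapped for simplicity where a kernel would map only the affected region.

```python
import mmap
import os


def read_via_mapping(path, offset, length):
    """Return up to `length` bytes starting at `offset`, via a file mapping.

    The file's pages are mapped into this process and the requested range
    is then copied out, instead of going through a separate buffer pool.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0 or offset >= size:
            return b""                      # nothing to map or to copy
        # Length 0 means "map the whole file" (a simplification).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies the bytes out of the mapped pages.
            return mapped[offset:offset + length]
```

Because the data is reached through ordinary mapped pages, the same page frames serve both this "read" and any other mapping of the file, which is the consistency benefit mentioned at the end of the post.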