[comp.sys.sun] 4 MB Suns with OS version 4.0

JERAGER@AMHERST.BITNET (PROF. JOHN RAGER) (02/14/89)

For those of us who don't have quite as much access to Sun grapevines,
could you summarize the performance problems observed with a 3/50 running
version 4 of the OS? Who is being impacted? Is one person writing C
programs, etc., going to see an unbearable slowdown? We are a small academic
group with 4 3/50s and frankly, none of the upgrade/replace options are
appealing (they're all too expensive).

If you can't respond from your own experience, you could release this to
the public and I'll summarize responses, etc. Also, is anyone worried that
version 5.0 won't run on Sun3s at all?

Thanks

John Rager (JERAGER@AMHERST.BITNET)

hedrick@geneva.rutgers.edu (Charles Hedrick) (02/23/89)

Yes, 4.0 has been a disaster.

We use 4.0.1 on 4MB 2/50's and 3/50's.  Reports are mixed.  I did some
side by side tests of a 4.0 one and a 3.2 one starting up suntools and
lisp, and found things were OK.  I use a 3/50 with 4MB as my normal
machine, including building system software.  It's certainly not a 3/60,
but it works OK.  But some users report that when they use a lot of
windows, things are much slower.  Our operator, who uses one window per
server and switches a lot, was near to murdering us until we added another
4MB to that machine.  I've also had reports from some users that things
seem a lot slower.  It seems like there's a different paging strategy
somehow.  When you go back to a window that you haven't used for a while,
chances are you'll have to wait while the program pages back in.  One
suspicion is that there's more of a tendency to let one program take over
the whole system, so once a program is working in one window, all other
programs get paged out more quickly than they used to.  Sort of sounds
like they let the working set grow too rapidly, doesn't it?  But it could
also be that paging with NFS is slower than paging with ND.

There are a lot of things you can do to make things better: tailoring a
kernel, making sure you don't run unnecessary daemons, making sure your
syslog doesn't loop, etc.  Sun has a list of them, and that list has been
given here several times.  But even after all of that is done, there is a
difference between 4.0 and 3.2.  I don't find 4.0 unbearable, but I just
got a 3/50 as 4.0 came up, so I haven't had much experience under 3.2.  I
don't think you'll find single C compiles going slower.  Where you'll be
affected is when trying to do things in multiple windows. 

Note that I use X.  I may be better off than suntools users.  We have been
able to build a shared X library, and so take advantage of that new
facility to save memory.  Unfortunately, we haven't been able to do that
for suntools.  The 4.0 suntools, although done with shared libraries, is
too slow to be usable.  We don't know what it is, but we went back to the
3.2 version.  This helped, but didn't completely get rid of the
difference.  (I've contemplated building a 3.2 suntools with shared
libraries, but this would involve some structural changes, so we haven't
had time to do it.)  At this point I plan to stick with 3.2 suntools until
we get Sunview 2, which runs on top of X.  Of course there's apparently
some question whether the Sun merged X/NeWS will be usable on 4MB machines
either.  If not, we'll probably try to get people to migrate to the
traditional X software.

It does seem like things are faster on our 32MB Sun 4's.

I'm not sure whether I'd switch if I had to do it again.  My problem is
that the university has lots of Suns.  Some need the new features,
largely because of new hardware.  We can't really maintain two versions.
So we eventually have to upgrade.  I'm sort of upset that the promised
performance tuning is going to wait until 4.1, though as we start seeing
how many bugs there are, I begin to understand that they have their hands
full fixing bugs.  I think at the moment I'd rather have them not add any
features or change any algorithms, but just fix bugs.

4.0 and even 4.0.1 are fairly buggy.  We have problems with just about
everything related to networking.  Rlogin sometimes doesn't turn XON/XOFF
on and off.  NFS hangs.  ypbind crashes or goes catatonic.  lockd crashes
if your host name is long.  Some of these have fixes.  The ypbind from the
answerline may fix the yp problems.  But I have been using at least
one FTE person fixing bugs in the kernel and basic utilities for the last
several weeks, and see no end in sight.  I sent the first batch of fixes
to sunbugs, and didn't even get an acknowledgement.  I think if you have
no strong reason to change, you might want to wait for 4.0.2.  I think it
will have a lot of the problems fixed, though I'm worried about rumors
that they're not going to be able to tell us in detail what changed, as
they did from 4.0 to 4.0.1, which means we'll just have to throw away
everything and start from scratch.  If I've done enough local work to make
4.0.1 stable by then, I may ignore 4.0.2.  But a site without source will
probably want to do it the other way.  The other approach would be to wait
for 4.1, which is supposed to address the performance problems.  Though
given our experience with 4.0, you might want to make it 4.1.2.

wayne@ames.arc.nasa.gov (Wayne Hathaway) (03/06/89)

Relative to Charles Hedrick's observation on paging being different under
SunOS 4.0, I understand that one of the changes that was made was to go to
a more "global" page replacement strategy, instead of treating program
areas and disk cache and so forth as separate "local" paging areas.  The
end result is that single applications can definitely take over more of
the machine, causing more things to be paged out (and subsequently back
in).  Apparently this is even regarded within Sun as a somewhat suspect
"optimization."
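
A toy model can make the difference between the two strategies concrete.
The Python sketch below is only an illustration under stated assumptions:
plain LRU stands in for whatever replacement algorithm SunOS really uses
(the posts do not say), and the "editor" and "bigjob" workloads, the frame
counts, and all function names are hypothetical.  It compares one shared
pool against fixed per-program partitions.

```python
from collections import OrderedDict


class LRUPool:
    """A fixed number of page frames with least-recently-used replacement.

    Assumption: plain LRU is a stand-in, not the real SunOS policy.
    """

    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()  # (owner, page) -> True, oldest first
        self.faults = {}            # owner -> number of page faults

    def touch(self, owner, page):
        key = (owner, page)
        if key in self.pages:
            # Hit: mark the page as most recently used.
            self.pages.move_to_end(key)
            return
        # Miss: count a fault and, if the pool is full, evict the
        # least recently used page regardless of who owns it.
        self.faults[owner] = self.faults.get(owner, 0) + 1
        if len(self.pages) >= self.frames:
            self.pages.popitem(last=False)
        self.pages[key] = True


def workload():
    """Hypothetical trace: a small program runs, a large one streams
    through many pages, then the user returns to the small program."""
    trace = [("editor", p) for p in range(4)]
    trace += [("bigjob", p) for p in range(40)]
    trace += [("editor", p) for p in range(4)]
    return trace


def run_global(frames=16):
    """All programs compete for one shared pool of frames."""
    pool = LRUPool(frames)
    for owner, page in workload():
        pool.touch(owner, page)
    return pool.faults


def run_partitioned(frames_per_owner=8):
    """Each program gets its own fixed-size pool (same total memory)."""
    pools = {}
    for owner, page in workload():
        pools.setdefault(owner, LRUPool(frames_per_owner)).touch(owner, page)
    return {owner: pool.faults[owner] for owner, pool in pools.items()}


if __name__ == "__main__":
    print("global:     ", run_global())
    print("partitioned:", run_partitioned())
```

In this model the editor faults on all four of its pages again when it
resumes under the shared pool, because the large job has pushed them out,
while under fixed partitions it keeps them.  That is the "wait while the
program pages back in" effect in miniature, and nothing more than that.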

  Wayne Hathaway            
  Ultra Network Technologies     domain: wayne@Ultra.COM
  101 Daggett Drive            Internet: ultra!wayne@Ames.ARC.NASA.GOV
  San Jose, CA 95134               uucp: ...!ames!ultra!wayne
  408-922-0100

guy@uunet.uu.net (Guy Harris) (03/14/89)

>Relative to Charles Hedrick's observation on paging being different under
>SunOS 4.0, I understand that one of the changes that was made was to go to
>a more "global" page replacement strategy, instead of treating program
>areas and disk cache and so forth as separate "local" paging areas.

Well, sort of.

In fact, what happened is that the UFS and NFS file systems perform "read"
and "write" operations by temporarily mapping the affected region of the
file in question into the kernel's address space and doing copies between
the mapped region and the process's buffer.  The net result happens to be
that all of the page frame pool is used both for paging for "read"s and
"write"s, and paging for mapped-in files (e.g., demand-paged executables
and "mmap"ped files), which means in effect that any page in the page
frame pool can be used as what in earlier UNIX systems would be a
buffer-pool block for file data.  (Control information - inodes, bitmaps,
indirect blocks - is still kept in a reduced-size traditional buffer
cache).
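
As a rough user-space analogy only, the "map the region, then copy" idea
can be sketched in Python.  This is not kernel code and not Sun's
implementation; read_via_mapping and demo are hypothetical names, and the
sketch maps the whole file rather than just the affected region to keep
offset alignment out of the picture.

```python
import mmap
import os
import tempfile


def read_via_mapping(path, offset, length):
    """Return up to `length` bytes at `offset` by mapping the file into
    this process's address space and copying out of the mapping.

    Hypothetical helper: a user-space analogy to the kernel technique
    described above, not the kernel's own code path.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0 or offset >= size:
            # An empty file cannot be mapped; reading past EOF yields nothing.
            return b""
        # Length 0 maps the entire file, read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing copies bytes from the mapped pages into a new buffer,
            # which plays the role of the process's read buffer.
            return bytes(m[offset:offset + length])


def demo():
    """Compare the mapped read against an ordinary seek-and-read."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789abcdef")
        path = tmp.name
    try:
        with open(path, "rb") as f:
            f.seek(4)
            expected = f.read(6)
        return read_via_mapping(path, 4, 6), expected
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Both paths return the same bytes.  The point of the design described above
is that the pages backing the mapping and the pages backing an ordinary
"read" come from the same pool, so there is only one copy of the file data
to keep consistent.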

>Apparently this is even regarded within Sun as a somewhat suspect
>"optimization."

It is not, as far as I know, so regarded - at least not in the OS group.
The idea is considered reasonable; there may, at present, be problems with
the paging algorithms that make it cause problems in some cases.  (The
potential performance improvement - which, at least on machines with lots
of memory, translates to an actual performance improvement - wasn't the
only reason why this was done; it also simplified problems of keeping file
data references in "read" and "write", and references to mapped file data,
consistent.)