[comp.sys.sun] 32MB on new Sparc Server?

dvorak@iam.unibe.ch (Jiri Dvorak) (05/06/89)

Does anybody have an idea why it isn't possible to order the new 370 and
390 servers with less than 32MB of memory? Is there at least one good
reason for 32MB on a server which does nothing but serve 10+ diskless
clients?

thanks,
Jiri Dvorak                  dvorak@iam.unibe.ch   or
Institute for Informatics    dvorak%iam.unibe.ch@relay.cs.net
University of Berne          UUCP: ..!uunet!mcvax!iam.unibe.ch!dvorak
Switzerland

dupuy@cs.columbia.edu (Alexander Dupuy) (05/11/89)

From the price list, it would seem that the Sun-4/300 series comes with 8
megabytes on the CPU card, and that expansion memory cards of 24 megabytes
are available - this is the most reasonable combination that accounts for
the two available memory sizes: 32 and 56.
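
The arithmetic behind that guess can be sketched in a few lines.  The 8
megabyte CPU card and the 24 megabyte expansion card are inferences from
the price list, not confirmed specifications, and the function name is
made up for illustration.

```python
# Sizes inferred from the price list, NOT taken from a Sun spec sheet.
CPU_BOARD_MB = 8
EXPANSION_CARD_MB = 24


def total_memory_mb(expansion_cards):
    """Total memory for a given number of (assumed) 24 MB expansion cards."""
    if expansion_cards < 0:
        raise ValueError("card count cannot be negative")
    return CPU_BOARD_MB + expansion_cards * EXPANSION_CARD_MB


if __name__ == "__main__":
    for cards in (1, 2):
        print(cards, "card(s):", total_memory_mb(cards), "MB")
```

One card gives 32 megabytes and two cards give 56, which matches the two
sizes on the price list.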

What's most interesting is that this is _not_ ECC memory like the
Sun-4/200 series uses, but apparently only parity-checked.  Parity
checking is fine when you don't have much memory, since failures are less
common, but once your memory size starts getting up there (56 megabytes is
quite a bit for a Sun) the probability of a chip failure tends to increase
proportionally.

With ECC memory, you can run for quite a while with a marginal (or even
broken) chip - certainly long enough to get a replacement board or to
schedule downtime to replace the failing chip (the ECC subsystems usually
give you enough information to pinpoint the failing chip).
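
To illustrate how an ECC syndrome points at the failing bit, here is a toy
sketch using the textbook Hamming(7,4) code.  This is an assumption chosen
for brevity: real memory boards use wider codes, and nothing here is taken
from Sun's hardware.  All function names are hypothetical.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Bit positions run 1..7; parity bits sit at positions 1, 2 and 4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def syndrome(word):
    """Return 0 for a clean word, else the 1-based position of a single flipped bit."""
    s = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            s ^= position
    return s


def correct(word):
    """Fix a single-bit error; return (corrected word, syndrome)."""
    s = syndrome(word)
    fixed = list(word)
    if s:
        fixed[s - 1] ^= 1
    return fixed, s


if __name__ == "__main__":
    good = encode([1, 0, 1, 1])
    bad = list(good)
    bad[4] ^= 1  # simulate a stuck bit at position 5
    fixed, s = correct(bad)
    print("syndrome:", s, "recovered:", fixed == good)
```

The non-zero syndrome is the position of the bad bit, and once the bit
position is known, the chip follows from the board layout.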

I guess Sun either has a lot of confidence in their memory chip suppliers,
or they're trying to increase the sales of 2-hour response maintenance
contracts.  :-) I heard from a friend that a Sun salesthing let slip that
Sun were not terribly pleased with the new memory architecture on the -300
series, and that there probably wouldn't be any other machines in that
series.  One possibility is that marketing (price/performance and
competitive) considerations played a role in the memory design.

In response to your second question, there is a good reason why you might
want to have a server with 32 megabytes just to serve 10+ diskless
clients: disk caching.  Under SunOS 4.0, almost all of memory is
potentially available for I/O buffering.  A diskless client which goes
across the Ethernet to read disk blocks that are cached in the server's
memory will almost certainly see better response times than a diskful
client going to a relatively slow local SCSI disk.

@alex
-- 
inet: dupuy@cs.columbia.edu
uucp: ...!rutgers!cs.columbia.edu!dupuy

roy@phri.nyu.edu (Roy Smith) (05/13/89)

dupuy@cs.columbia.edu (Alexander Dupuy) writes:
> the ECC subsystems usually give you enough information to pinpoint
> the failing chip

<FLAME ON!>

Alex is right: ECC can pinpoint the source of a memory error down to the
specific chip which is failing.  With the proper (fairly trivial) logic in
the memory management section of the OS, you could very well print out
error messages saying "replace chip at position N-14 on board 3".  Even if
you don't want to be that fancy, it's easy enough to go from address and
syndrome to chip location if you know the way memory is laid out.  And
everything you need to know about the chip layout could be contained in
one side of a single sheet of paper.  You don't even really need ECC to do
that; the ROM-resident diagnostics in a 3/50 can isolate memory errors
down to a specific address and bit-within-word just fine without ECC.
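
As a sketch of how little logic that mapping needs, here is the idea
under a purely HYPOTHETICAL layout: 32-bit words, rows of 256K x 1 chips,
and each chip supplying one bit of every word in its row.  The real Sun
layout is not known here, so the constants and the row/column naming are
assumptions, not Sun's.

```python
# HYPOTHETICAL board layout, for illustration only.
BYTES_PER_WORD = 4
BITS_PER_WORD = 32
WORDS_PER_ROW = 256 * 1024                   # assumes 256K x 1 DRAM chips
ROW_BYTES = BYTES_PER_WORD * WORDS_PER_ROW   # 1 Mbyte per row of chips


def chip_location(address, bit):
    """Map (physical byte address, failing bit in the word) to (row, column)."""
    if address < 0:
        raise ValueError("address cannot be negative")
    if not 0 <= bit < BITS_PER_WORD:
        raise ValueError("bit must be in 0..31")
    row = address // ROW_BYTES   # which row of chips holds this word
    column = bit                 # one chip per bit position in the row
    return row, column


def describe(address, bit):
    """Produce the kind of message the OS could print."""
    row, column = chip_location(address, bit)
    return "replace chip at row %d, column %d" % (row, column)


if __name__ == "__main__":
    print(describe(0x00123456, 14))
```

The only part that cannot be guessed is the table tying (row, column) to
the position markings on the board, which is the sheet of paper in
question.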

But Noooo, Sun won't tell you how to do this.  They claim that the details
of the memory layout are company confidential!  We had a memory chip go
bad on a 3/50 once.  I called Sun up to try and find out how to map from
address/bit to chip location and they wouldn't tell me.  After much
fighting with them, we ended up returning the board to them for repair at
a cost of $1300 and it took over a month!  If we wanted 3-day turnaround,
it would have been something like $3000, which is about 80% of what we
paid for the whole workstation new.  If they were just willing to pry
loose one piece of paper and send it to me, I could have fixed it in an
hour for $10 in parts.

So, what good does it do to have ECC memory do 99% of the job of locating
a bad chip, if Sun won't give you the information to do the critical last
1% of the job yourself?

<FLAME OFF>

Now that I'm in a good mood :-), maybe somebody can tell me why I can buy
a Mbyte of 100ns memory for a Mac-II for $160 but a Mbyte of memory for a
Sun-3 costs more like $500, even from a third party like Clearpoint or
Helios?  OK, the Mac memory isn't even parity, but surely the real
difference in price can't be that much, or anywhere near it.

Roy Smith, System Administrator
Public Health Research Institute
{allegra,philabs,cmcl2,rutgers,hombre}!phri!roy -or- roy@phri.nyu.edu
"The connector is the network"

hart@decwrl.dec.com (Howard C. Hart) (05/23/89)

In article <3772@phri.UUCP> roy@phri.nyu.edu (Roy Smith) writes:
><FLAME ON!>
>
>Alex is right: ECC can pinpoint the source of a memory error down to the
>specific chip which is failing.  With the proper (fairly trivial) logic in
>the memory management section of the OS, you could very well print out
>error messages saying "replace chip at position N-14 on board 3".  Even if
>

Now I get to show my ignorance of memory management. We've got a bad
memory board as I write. When I run Sun's sysdiag on it, I get the
physical address of the bad memory, which pins it down to the 4th Sun
memory board (i.e. 3+ Mbytes). I'd assume that to locate the bad chip, I'd
just subtract the 3 Mbytes and use the offset to calculate which of the
chips was bad. If there is some question as to which chip I start counting
from, I swap a good chip in for the suspect chip and check the memory
again until I get it right. Lots of work, but still doable. Did I miss
something fundamental?
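
The subtraction step described above can be sketched as follows.  It
assumes 1 Mbyte per memory board, which is what "4th board = 3+ Mbytes"
suggests; the function name and the sample address are made up.

```python
BOARD_BYTES = 1024 * 1024  # ASSUMPTION: each memory board holds 1 Mbyte


def split_address(phys_addr):
    """Return (board number counting from 1, byte offset within that board)."""
    if phys_addr < 0:
        raise ValueError("address cannot be negative")
    board, offset = divmod(phys_addr, BOARD_BYTES)
    return board + 1, offset


if __name__ == "__main__":
    board, offset = split_address(0x340000)  # made-up failing address
    print("board", board, "offset", hex(offset))
```

Whether the offset alone singles out one chip depends on how the board
spreads each word across its chips.  If each chip supplies one bit of
every word, the failing bit position is needed as well, which is where
the swap-and-retest step comes in.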

>Now that I'm in a good mood :-), maybe somebody can tell me why I can buy
>a Mbyte of 100ns memory for a Mac-II for $160 but a Mbyte of memory for a
>Sun-3 costs more like $500, even from a third party like Clearpoint or

You might want to check with Clearpoint again. We're on GSA pricing and
we just bought Sun 3/60 memory for $180/Mbyte. I was told the price was
still going down when we bought. I wouldn't expect to see a serious
difference between GSA and normal pricing (and that's with ECC!).

Howard C. Hart                  UUCP:{sun!sunncal,pyramid}!leadsv!laic!nova!hart
Lockheed Missiles and Space Co.
Orgn 59-53, Bldg 593            Ph: (408) 743-2253 or -7353