[mod.computers.sun] SUN-Spots Digest, v4n20

Sun-Spots-Request@RICE.EDU (Vicky Riffle) (07/02/86)

SUN-SPOTS DIGEST           Wednesday, 2 July 1986        Volume 4 : Issue 20

Today's Topics:

		       Compiling TeX on SUN2 running 3.0
			    UNDUMP for TeX for SUN3
			    Bug in 3.0 UDP checksums
		 	  Beware The Profiler! (C bug)
		      Mount deadlocks (really /etc/rc.boot)
			   Sun as a timesharing box (2)
		       Adding an Eagle-and-a-half to a Sun
				   Panic: ifree
				    SL on Sun3
			     Sun-2 and 3.0 questions?
			  3.0 strip run on 2.0 binaries?
		    Finding out your hostname when you boot?
			  Diskless SUNs with Gould server?
------------------------------------------------------------------------
Date: Thu 26 Jun 86 16:18:16-PDT
From: Pierre MacKay <MACKAY@WASHINGTON.ARPA>
Subject: compiling TeX on SUN2 running 3.0

I seem not to have been clear enough about the problems reported to me 
with compilations of TeX.

They do not stem from the attempt to run 68020 binaries on a 68010 system.  
In fact the result of that attempt is simple and obvious.  You get the message 
Cannot execute binary file
Exec format incorrect.
or something like that.  It is baffling the first time when you hit it by 
accident, but not thereafter.

The TeX compilation problem is more serious.  It involves source on a SUN2, 
compiled on a SUN2, loaded on a SUN2 and run on a SUN2.  If the principal 
compilation or loading were not done on a SUN2, I suspect that you wouldn't 
even get the appearance of the thing starting to run.  So far as I can
discover, there is no obvious way for 68020 code to get into this act, unless 
the libraries are all in 68020 based binaries.  But the general tendency is 
for people to face the problem of a SUN2 server feeding a SUN3 node, rather 
than the other way around.  

Anyway, I do not think that most of the queries I have had are from people 
trying to run 68020 binaries on a 68010 machine.  I haven't had the 
opportunity to try the trick myself yet, but I hope to. 

Incidentally, I suppose the possibility of a SUN2 compilation picking up 
sun3 library routines is going to take some serious thought, once general 
networking is assumed.
					Pierre
-------

-----------------------------

Date:     Fri, 27 Jun 86 12:49:37 CDT
From: William LeFebvre <phil@titan.rice.edu>
Subject:  UNDUMP for TeX for SUN3

I got a copy of Barry's undump and added a few more things to it so
that it will work with more than just ZMAGIC files.  I'll place a tar
file with the source on one of our machines here at Rice.  By the time
the Sun-spots readers get this, it will be available via anonymous FTP
(any password) from host "dione.rice.edu" in the tar file
"public/undump.tar".  Since our connection costs us money, please only
FTP it between 6 p.m. and 7 a.m. on weekdays, and expect the transfer
to take longer than normal.

I also intend to send it in to unix-tex in the hopes that they will put
it in the official Unix TeX distribution and make it available on one
of their machines.

			William LeFebvre
			Department of Computer Science
			Rice University
			<phil@Rice.edu>

-----------------------------

Date: Fri, 27 Jun 86 12:04:43 EDT
From: Steve D. Miller <steve@gyre.umd.edu>
Subject: Bug in 3.0 UDP checksums

   The UDP checksum code in Sun 3.0 is incorrect.  As shipped, UDP
checksums are turned off (presumably for performance reasons);
if checksums are turned on (by setting udpcksum to 1 in the kernel
using adb), however, the Sun code cannot talk to itself or any
other UDP implementation.  The problem is that the checksum code in
udp_output() is confused about the UDP/TCP ip overlay versus
the true IP header; the code sets the overlay up properly for
checksumming, incorrectly sets the header length field (changing
a zero field in the overlay to something other than zero), then
checksums the now incorrect packet.  Protocol-wise, the packet looks
fine, but the checksum will always be wrong.
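
To see why a single non-zero field spoils every checksum, here is a
minimal sketch, not the Sun kernel code.  It assumes a simplified packet
in which byte 0 stands in for the overlay field that must stay zero, and
it assumes that the header-length nibble lands there as 0x05 (big-endian
bit-field layout, version nibble still clear).  The names inet_cksum and
hl_breaks_cksum are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum: one's-complement sum of big-endian 16-bit words. */
uint16_t inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)			/* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Returns 1 if setting the header-length nibble before checksumming
 * changes the result.  Byte 0 is the stand-in for the zeroed overlay
 * field; 0x05 is sizeof (struct ip) >> 2 for a 20-byte header
 * (assumed layout, see the note above).
 */
int hl_breaks_cksum(uint8_t *pkt, size_t len)
{
	uint8_t saved;
	uint16_t good, bad;

	assert(pkt != NULL && len > 0);
	saved = pkt[0];
	pkt[0] = 0x00;
	good = inet_cksum(pkt, len);	/* what the receiver computes */
	pkt[0] = 0x05;
	bad = inet_cksum(pkt, len);	/* what the sender stored */
	pkt[0] = saved;
	return good != bad;
}
```

Because the stray nibble adds 0x0500 to a one's-complement sum, the two
results differ for any packet contents, which matches the observation
that the checksum is always wrong rather than occasionally wrong.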

   Since the header length is set in udp_output() only so that the
fast loopback code (code that detects things destined for the loopback
interface and hands such packets directly to udp_input()) will
work correctly, the fix is to set the header length in the fast
loopback code (fix for vanilla Sun 3.0 sources):

*** bad udp_usrreq.c	Fri Jun 27 11:47:57 1986
--- udp_usrreq.c	Fri Jun 27 10:57:25 1986
***************
*** 196,202
  	ui->ui_sport = inp->inp_lport;
  	ui->ui_dport = inp->inp_fport;
  	ui->ui_ulen = (u_short)ui->ui_len;
- 	((struct ip *)ui)->ip_hl = sizeof (struct ip) >> 2;
  
  	/*
  	 * Stuff checksum and output datagram.

--- 196,201 -----
  	ui->ui_sport = inp->inp_lport;
  	ui->ui_dport = inp->inp_fport;
  	ui->ui_ulen = (u_short)ui->ui_len;
  
  	/*
  	 * Stuff checksum and output datagram.
***************
*** 221,226
  				ui->ui_src.s_addr = ui->ui_dst.s_addr;
  			}
  			((struct ip *)ui)->ip_len -= sizeof (struct ipovly);
  			udp_input(m);
  			return (0);
  		}

--- 220,226 -----
  				ui->ui_src.s_addr = ui->ui_dst.s_addr;
  			}
  			((struct ip *)ui)->ip_len -= sizeof (struct ipovly);
+ 			((struct ip *)ui)->ip_hl = sizeof (struct ip) >> 2;
  			udp_input(m);
  			return (0);
  		}

   I'd guess that having UDP checksums turned off when running UDP across an
ethernet is probably OK, since (after all) most packets which arrive do so
correctly; I think that if I was running UDP over a serial line or
something, though, I'd much rather have the checksums on...

   It should be noted that the ku_fastsend routine used to send NFS/RPC/UDP
packets quickly in the NFS implementation also avoids checksums.  This
routine should (perhaps) be disabled in similar situations, depending on
one's level of paranoia.

	-Steve

Spoken: Steve Miller 	ARPA:	steve@maryland	Phone: +1-301-454-4251
CSNet:	steve@umcp-cs 	UUCP:	{seismo,allegra}!umcp-cs!steve
USPS: Computer Science Dept., University of Maryland, College Park, MD 20742

***********************

-----------------------------

Date: Wed, 2 Jul 86 12:38:36 PDT
From: gerolima@Ford-wdl1.ARPA (Mark Gerolimatos)
Subject: BEWARE THE PROFILER! (C bug)

My partner, Ron Barack (give credit where due...), 
	discovered the following problem...

Given the following program...

------------------------------------Rip Here------------------------------------
main()
{
	float func();
	float x;
	x = func();
	printf(" x = %f\n",x);
}
float func()
{
	float val;
	val = 1.e-20 * 1.e-21;
	return val;
}
------------------------------------Rip Here------------------------------------
Compiled with no options, the output looks like:

	x = 0.000000	<----which is correct.

BUT, with the -p option, it comes out as:

	x = 0.003403	<----oops.

With a little more investigation, I found out that
the answer was correct all the way up to the printf. "x = func()"
sets x to the correct value. Sew, vhatz going on, here?

The problem is with "libc_p.a". Fstod (Floating-point Single TO Double),
to be exact (we used the -fsoft, the default, option). When func()
returns, its value is Fdtos'd, and placed in x, and then x is
Fstod'd, and passed to printf (Gosh, how C can be inefficient when you 
don't use -fsingle!). I checked out libc.a's Fstod, and libc_p.a, and 
couldn't find much difference between the two (not including the
mcount code, of course) EXCEPT for "UNLK A6" just before an RTS
(when the value == 0).

To wit:	(libc.a)
			cmpl    #0,a0
                        bges    Fstod+0x36
                        orl     #0x80000000,d0
                        rts

		(libc_p.a, Fstod+0x40 - Fstod+0x4e)
                        cmpl    #0,a0
                        bges    Fstod+0x4e
                        orl     #0x80000000,d0
                        unlk    a6  <--this DOES get called
                        rts

	Could that be the problem?
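
For reference, the conversion chain described above can be followed in
plain C.  This is only a sketch: the casts stand in for the Fdtos and
Fstod library calls, the function names are invented here, and it says
nothing about what libc_p.a actually does internally.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Follow the value of func() through the conversions described above.
 * The casts stand in for the Fdtos and Fstod library calls.
 */
double x_as_passed_to_printf(void)
{
	double product = 1.e-20 * 1.e-21;	/* constants are double: about 1e-41 */
	float x = (float)product;		/* narrowed to single and stored in x */
	double widened = (double)x;		/* float argument promoted for printf */

	/* Far too small to show in six decimal places. */
	assert(widened >= 0.0 && widened < 0.0000005);
	return widened;
}

void print_x(void)
{
	printf(" x = %f\n", x_as_passed_to_printf());
}
```

Whether the narrowed value is flushed to zero or kept as a very small
number, it is far below what %f can show, so 0.000000 is the only
reasonable output and 0.003403 has to come from the conversion routine.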

		-Mark

	"For almost a quarter of a century..."

"...Change Baby, Don't Worry!...	Mark Gerolimatos
 ...Welcome! Welcome!... 		ARPA:	gerolima@ford-wdl1.arpa
 ...Change Baby, Don't Worry!...	UUCP:	{sun,fortune}!wdl1!gerolima
 ...Box! Box! Box! Box!...		AT&T:	(415) 852-4105
 ...Now, We Say Good-Bye...		Mail:	c/o Ford Aerospace
 ...Welcome to the GALATT...			3939 Fabian Way
 ...G-A-L-A-T-T We're GALATT..."		Palo Alto CA 94306
 -English phrases from a Japanese song		Mail Stop X20

-----------------------------

Date: Fri, 27 Jun 86 01:47:02 EDT
From: Broadway's Streetsweeper <dupuy%amsterdam@columbia.edu>
Subject: Mount deadlocks (really /etc/rc.boot)

>John Bruner <jdb@s1-c.arpa> writes:
>
>Another problem Sun UNIX has with mounts occurs in "/etc/rc.boot".
>This file disobeys the rule that no changes be made to a corrupted
>filesystem until it is cleaned.  Specifically, zeroing "/etc/mtab",
>adding an entry for "/", and mounting "/pub" is a no-no if the root
>hasn't been salvaged yet.
>
>  { hacked /etc/rc.boot included here}
>
>[I think a better solution would be a path to single-user which avoids
>"rc.boot" entirely.]

An undocumented feature of 2.x /etc/init is the "-b" flag which causes the
/etc/rc.boot file to be skipped.  I don't know if this is also in 3.x, but I
expect it is.  All you need to do if your mtab is busted is "b vmunix -bs".

This is perhaps easier than patching rc.boot, especially under 2.x, which lacks
the System V style "sh" with builtin "test" that comes with 3.0.  If you want
to patch rc.boot for a 2.x system anyhow, you can

	% set noglob
	% mount localhost:/ /mnt
	% cp /bin/test /mnt/pub/bin
	% ln /mnt/pub/bin/test /mnt/pub/bin/[
	% umount /mnt
	% unset noglob

and your tests will work, even before you mount /pub.  

By the way, what is the point of moving all the hostname and ifconfig stuff
into rc.boot, instead of leaving it in rc.local where it belongs?  Do you
*really* need this to make the "umount -at nfs" work (it sends a clnt_broadcast
to all rpc.mountd's to clear their rmtabs)?  Maybe someone could clarify this.

@alex

-----------------------------

Date: Sat, 28 Jun 86 14:01:13 EDT
From: Barry Shein <bzs%bu-cs.bu.edu@CSNET-RELAY.ARPA>
Subject: Sun as a timesharing box (1)

Since January BU-CS.BU.EDU has been two SUN3/180s running as time-sharing
systems (the other which doesn't much show to the outside world is
BUCSD.BU.EDU.)

Not only is it possible, we are quite pleased. We replaced a VAX/780
(8MB, RP07, RA81, FPA etc) with the two systems and the large loads
we used to suffer with on the 780 are gone, it's rare to see it go
above 2.00 or 3.00 and even then it doesn't seem to much matter.

The configuration is 8MB, 2 Eagles each. One has a 16-port ALM and
the other has one now but will soon have 2 X 16. We have a SUN
6250 drive, etc. Until our 4MB expansions arrived we ran with 4MB
each, that was pretty mediocre, 8MB seems to work very well though.
I think if I did it again I would order 16MB but it doesn't seem
very critical, we'll probably expand one of these days.

All logical disks are cross-mounted through NFS such that it doesn't
matter which you log onto, your home directory will be transparently
accessible through your passwd entry. Fortunately our terminal switch
(Ungermann/Bass Net/1 broadband) randomizes which port it grabs so
all ports have the same system name, users just land on one or the
other.

I found performance is better if most binaries are locally available
so each system has its own /usr/ucb, /usr/bin etc, there are a few
cross-links here and there. These are just tuning issues and can
be handled on-line. The mail system is all set up to help transparency.

We typically have 15-20 logins per system. I must say that many of
our users are *not* doing development, this is a faculty machine.
A lot of word processing, a lot of idle time. In contrast, I would
expect a student community to be different. On the other hand, we
have a lot of general systems work, grad students etc, usually this
huge lisp job running in the background on at least one machine
perpetually, BU-CS is the campus mail relay and USENET relay, it's
not totally idle...maybe it just does the job well?

The nice thing of course is that I'll probably soon add a third
SUN3/180 to the 'cluster' to absorb the Math department here, maybe a
fourth later as users start to acquire diskless nodes, how to grow for
a while is obvious and should be painless. I am also looking ahead
at the new 2.2(?)MB/s 451 SMD disk controllers and the rumored
25MHZ SUN/? board as relatively inexpensive performance boosters
tho right now I wouldn't need them really.

Yes, it works fine, I can't tell you what your upper limits will
be but certainly it handles 16 users very well.

If you're on the ARPAnet feel free to finger @bu-cs.bu.edu/bucsd.bu.edu
during the day, our finger shows load average, logged in users, idle
times and what people are running (TWENEX style.)

	-Barry Shein, Boston University

-----------------------------

Date: Tue, 1 Jul 86 10:26:13 edt
From: Ken Mandelberg <km%emory.csnet@CSNET-RELAY.ARPA>
Subject: Sun as a timesharing Box (2)

I want to thank the several people who responded to my
note about using Suns for timesharing. Most of the
responses suggested other Unix machines that are more
traditional for timesharing (Pyramid, Sequent, etc).

However, I would like to both clarify the motivation behind
my original request, and ask a technical question.

Motivation: We currently use two Vax 780's to handle classes,
assigning half of each class to each machine. Loads peak at
about 20 students on each machine. Often the loads are quite
acceptable with lav's under 5. There are also bad times
when the lav soars between 10-20. It of course depends on
the class, the assignment, and how close the due date is.
Timesharing on dumb terminals is a good match for this
particular set of classes, although we use (and will increase our
use of) Sun (and other) workstations for other class and research
applications.

Right now we are looking at replacing the 780's since they cost
a mint to maintain. I was not suggesting that we run 40 users
on one Sun. I was probably thinking more of handling the 40
users equally well (or badly) with say 3 Sun 3's. Maybe one
Sun as a NFS file server with student files and big disks,
and two others as CPU servers for logins with smaller (but fast) local
disks for /,/usr, /tmp , and paging. The fileserver could do  
some other things like batch troff at low priority and handle
some peripherals (printers, tapes, etc).

This may not be a good solution, but I am wondering how it
would compare to two 780s doing the same thing. I have nothing
against other (and probably better) timesharing solutions.
All things being equal I like Sun because they are clearly
a leader in the Unix community, and the task of keeping 
current with applications software, compiler and OS eccentricities,
and system administration is easier with a single CPU family. 

Unfortunately, I suspect the answer is that all things are not
equal.

**

Technical Question: I got several responses to the effect
that Suns are workstations and just not well designed for
timesharing. No real reasons were given. 

I would like comments on comparing a Sun well configured for
timesharing with say a Vax 780. My thoughts are:

1) The 68020 is a faster CPU than the 780 CPU, and seems to
be as reasonable a Unix processor as the Vax CPU.

2) The Sun 3 VME bus is faster than the Unibus. 780s have a
fast Massbus, but recent DEC configurations tend to emphasize
Unibus peripherals.

3) The DEC terminal multiplexors are nothing special. The
DZ's are terrible, the DH/DMF's do DMA but do little in the
way of FEP processing to unburden the CPU. I don't know how
good or bad the Sun multiplexors are. Presumably they do
DMA and are similar to DH's.

4) The DEC disks are nothing special either. It seems that
people that use third party controllers and Fujitsu disks
on Vaxen are at least as happy as those that use the UDA/RA
Dec disks. Figure the Sun can use a fast eagle, though I don't
know how good or bad the controller for it is.

5) The DEC ethernet boards sit on the slow Unibus and contend
with other peripherals. The Sun ethernet controller is on the
cpu/board (I think), and doesn't contend for the VME bus. The
DEC DEUNA is slow (though more recent replacements are faster).

On the other side I recognize that 780's can be equipped with
multiple Unibus adaptors, and I guess (??) the Sun can't.
It also appears that most Sun peripherals end up using a
Multibus adapter which presumably slows it down.

Discussion is solicited.


Ken Mandelberg
Emory University
Dept of Math and CS
Atlanta, Ga 30322

{akgua,sb1,gatech,decvax}!emory!km   USENET
km@emory                      CSNET
km.emory@csnet-relay          ARPANET

-----------------------------

From: telesoft!pilotti@sdcsvax.ucsd.edu
Subject: Re: Adding an Eagle-and-a-half to a Sun?

    Roy,
    
    I can verify that third-party Eagles can be successfully connected to a
    Sun-3/160,180.  Simply match the jumpers on the three Eagle PC cards to
    the jumpers on the Eagle that came from Sun.  You can use standard SMD
    command & data cables to daisy-chain the disks together.  It isn't as
    pretty as getting the fancy connectors and modified VME-adapter
    faceplate (if such a beast exists for two drives), but it works. 

    Note that this is for "standard" Eagles; we haven't tried mixing
    flavors on one controller.
    
    Diag> away! 
    +KP
    ------------------------------------------------------------------------
    /+\ Keith P  \ <Pilotti@TeleSoft.COM> (Internet) /  10639 Roselle Street
    \+/ TeleSoft  \  1+(619) 457-2700 x172 (Voice)  /   San Diego, CA  92121
    ------------------------------------------------------------------------

-----------------------------

Date: Tue, 1 Jul 86 13:06:06 PDT
From: fluke!jeff@uw-beaver.arpa (Jeff Stearns)
Subject: panic: ifree

In SUN-Spots Digest v4n19, John Bruner writes:

> BTW, ever since we switched to 3.0FCS we've been having frequent crashes
> in "ifree".  "fsck" refuses to recover these because the link count in
> some inode is too small (i.e. there are more links than the inode
> indicates).  Usually one of the links is "/etc/mtab" or "/etc/rmtab".
> Without source code I've been unable to figure out what is happening.
> This does not happen in Sun 2.0 or on our NFS VAXes (where I do have
> source code).  Has anyone else experienced this problem?
> --
>   John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
>   MILNET: jdb@mordor [jdb@s1-c.ARPA]    (415) 422-0758
>   UUCP: ...!ucbvax!decwrl!mordor!jdb    ...!seismo!mordor!jdb

Yup.  We've had this problem too.  And it gets worse as the disks get busier.

It will also happen in release 2.0 and 2.2 if your fileservers get busy enough.

For releases 2.0 and 2.2, the fix was to put each of our fileserver's disks
on a separate xylogics 450 controller.  I strongly believe that this will
prove to be the case with release 3.0.

I also suspect that Sun may not tell you this.  They seem to have some real
blind spots with respect to this nasty bug.

    Jeff Stearns	John Fluke Mfg. Co, Inc.	(206) 356-5064

-----------------------------

Date: Fri, 27 Jun 86 11:43:59 PDT
From: guy@SUN.COM (Guy Harris)
Subject: SL on Sun3?

>   I am trying to bring SL (Serial Line IP) up on a Sun 3/52 running 
> Sun Unix 3.0.  When i make a kernel with the mods, the sun seems to behave
> alright, but when i bring up Suntools, and type a character into a window,
> i get this error : 'ws_read_indev length error 1', while everything else
> seems alright...

I suspect you installed the non-Sun version of SLIP.  What line discipline
number does it use?  Line disciplines 5 and 6 in Sun UNIX are not available
for user-supplied line disciplines such as SLIP; they are used for mouse and
keyboard ports, respectively.  Sounds like you may have installed SLIP as
line discipline 6 (or maybe 5).
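
A trivial check along these lines can be made before picking a slot for
an added line discipline.  It is a sketch only: the slot numbers are the
ones given above, while the macro and function names are made up and do
not come from any Sun header.

```c
#include <assert.h>

/* Slot numbers from the message above; these names are hypothetical. */
#define LDISC_SUN_MOUSE		5
#define LDISC_SUN_KEYBOARD	6

/*
 * Returns 1 if `slot` is one of the line disciplines that Sun UNIX
 * keeps for its mouse and keyboard ports, 0 otherwise.  A result of 0
 * does not by itself mean the slot is free; the standard disciplines
 * still have to be checked in "tty_conf.c".
 */
int ldisc_taken_by_sun(int slot)
{
	assert(slot >= 0);
	return slot == LDISC_SUN_MOUSE || slot == LDISC_SUN_KEYBOARD;
}
```

A SLIP port that wants slot 5 or 6 would then be moved to some other
number before the kernel is rebuilt.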

>   The other end of the ttyline (which is what SL uses) is connected to a
> VAX 11/750 with 4.2 BSD which already has SL running.  I JUST got SL from
> seismo.arpa, so it is definitely the 'latest' version.

Last time I looked, Rick supplied Sun versions of some of the SLIP source
files, including "tty_conf.c" (there are other differences in Sun UNIX which
require other modules to change, as well).

-----------------------------

Date: Thu, 26 Jun 86 18:43:08 mst
From: "Roger Hayes" <rogerh@arizona.edu>
Subject: Sun-2 and 3.0 questions

Should we convert our Sun-2's to 3.0?

Is performance adequate?  What configuration is required?  How much
memory is useful?

Will the Lucasfilms/Pixar software recompile?  (See Hawley, Portland
Usenix Proceedings, for description).  Are they still using Suns?
Is anyone else using their software? 

	Thanks,
	Roger Hayes

	rogerh@arizona.edu
	...!{ihnp4,ucbvax}!arizona!rogerh

-----------------------------

From kgk%brown.csnet@CSNET-RELAY.ARPA  Tue Jul  1 11:31:05 1986
From: dbo%textset@eecs.umich.edu
Date: Tue, 1 Jul 86 19:18:27 EDT
Subject: 3.0 strip run on 2.0 binaries

Here's a good one.  Take an unstripped 2.0 binary and strip it with 3.0 strip.
Every case I try generates an unusable binary.  Trying to execute the
resultant binary causes the famous

pid xxx: killed due to swap problems in getxfile: i/o error mapping page.

error.

Is this written down some place that I didn't notice?

	-Doug

-----------------------------

Date: Tue, 1 Jul 86 22:16:42 edt
From: seismo!allegra!phri!roy@SALLY.UTEXAS.EDU (Roy Smith)
Subject: Finding out your hostname when you boot

	There is something I don't understand about what happens when a
diskless client boots.  When you turn the power on, the only thing a client
knows is its ethernet address.  Then it does some arp stuff and finds out
its internet address.  I don't quite understand how arp works, but I can
deal with the fact that it does.

	What I don't understand is why in /etc/rc.boot, the hostname is
hard-wired into the file system.  First thing it does is "hostname=foo" and
then later it runs "/bin/hostname $hostname".  Since the machine has
managed to find out its internet number, isn't there some way for it to
find out its hostname over the network as well?  I could envision something
like:

	hostname=`ypcat hosts.byaddr | egrep $inaddr | sed "s/[0-9 .]//g"`

being the first line of rc.boot, but I can't see any way to get the
internet address available to a user program.  I suppose you could do some
poking around in /dev/kmem to dig it out, but that seems too disgusting to
think about.
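
The lookup half of that pipeline can be sketched in C, assuming the
address is already in hand and that the table has the usual hosts
layout of "address name [aliases...]" on each line.  The function name
name_for_addr is made up, and this does not answer the real question of
how a user program gets the internet address in the first place.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up `addr` in a hosts-style table and copy the first name on the
 * matching line into `name`.  Returns 1 on success, 0 if the address is
 * not listed.  The address field must match exactly, so 192.9.200.1
 * does not also match 192.9.200.12, and digits in the name are kept.
 */
int name_for_addr(const char *table, const char *addr, char *name, size_t n)
{
	const char *line = table;
	char buf[256], a[64], h[64];

	assert(table != NULL && addr != NULL && name != NULL && n > 0);
	while (*line != '\0') {
		size_t len = strcspn(line, "\n");
		size_t copy = len < sizeof buf - 1 ? len : sizeof buf - 1;

		memcpy(buf, line, copy);	/* work on one line at a time */
		buf[copy] = '\0';
		if (sscanf(buf, "%63s %63s", a, h) == 2 && strcmp(a, addr) == 0) {
			strncpy(name, h, n - 1);
			name[n - 1] = '\0';
			return 1;
		}
		line += len;
		if (*line == '\n')
			line++;			/* step past the newline */
	}
	return 0;
}
```

Compared with the one-liner, the exact match on the first field avoids
picking up a neighbour whose address merely starts the same way, and
taking the second field whole avoids the sed expression stripping
digits out of a name such as "sun3a".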

-----------------------------

Date: Mon, 30 Jun 86 12:50:41 -0300
From: Leonid Rosenboim <leonid@taurus.BITNET>
Subject: Diskless SUNs with Gould server

Has anyone experienced using diskless SUN with a GOULD PN/6000 as a server ?
In particular, does GOULD run ND ? Are there any problems with NFS ?
Two binary copies are needed, what about manuals etc. ?
What is the performance ratio between a Gould and a SUN-2/180S server ?

                        Leonid Rosenboim
                        CS. Department, System group, Tel-Aviv Univ.
                UUCP:   humus!taurus!leonid
                ARPA:   leonid%taurus@wiscvm.ARPA

-----------------------------

End of SUN-Spots Digest
***********************