[comp.unix.aix] network and 3005

lixj@acf3.nyu.edu (mr. lixj) (06/02/91)

I recently had the 3005 upgrade installed (RS6000/320). Since then I have had
a lot of trouble with the ethernet. The network crashes
frequently (about twice a day; unable to rlogin, telnet, etc.),
and the only way I know to recover
is rebooting. Do other users have the same problem? I'd like to hear any
advice (even on how to complain to IBM).

Xiaojian

gs26@prism.gatech.EDU (Glenn R. Stone) (06/04/91)

In <1991Jun2.053220.12638@cmcl2.nyu.edu> lixj@acf3.nyu.edu (mr. lixj) writes:

>Recently I have the 3005 upgrade installed(RS6000/320). Since then I got
>a lot of trouble with the ethernet. The network crashes 
>frequently(about twice a day, unable to rlogin, telnet, etc), 
>and the only way I know to recover
>is rebooting. Do other users have the same problem? I'd like to hear any
>advices(even how to complain to IBM).

Sounds familiar.  Do the folks in Austin have any comments?  
Systems affected are 320 with 16mb/320mb, and 520 with 16mb/600mb.
My 530 (64mb/2.5gb) doesn't seem to be affected... is this a hint?

Inquiring minds want to know... BAD.

-- Glenn R. Stone
gs26@prism.gatech.edu

dbeedle@rs6000.cmp.ilstu.edu (Dave Beedle) (06/04/91)

>
>>Recently I have the 3005 upgrade installed(RS6000/320). Since then I got
>>a lot of trouble with the ethernet. The network crashes 
>>frequently(about twice a day, unable to rlogin, telnet, etc), 
>
>Sounds familiar.  Do the folks in Austin have any comments?  
>Systems affected are 320 with 16mb/320mb, and 520 with 16mb/600mb.
>My 530 (64mb/2.5gb) doesn't seem to be affected... is this a hint?
>
>Inquiring minds want to know... BAD.

     Add a model 530 with 16meg to the list.  I posted about this (or
something very similar) a couple of weeks ago.  I did talk with software
support but they were unable to help.  They did suggest having our CE check
trace data from the network card (token ring in our case) but I never got
around to calling.  Our solution:  the problem "disappeared" after the
system crashed while installing 3005 updates.  I restored from backups (to
3003) and the problem was gone!  I'm assuming it was some kind of software
glitch.  We are now at 3005 with no problems.
     If someone does figure it out, this inquiring mind would like to know
what was/is going on as well.  For now...the AIX gremlins (with daemons
there's gotta be gremlins, right?) fixed it!  I hate/love when that happens!


-- 
  Dave Beedle                                    Office of Academic Computing
                                                    Illinois State University
  Internet:  dbeedle@rs6000.cmp.ilstu.edu                    136A Julian Hall  
    Bitnet:  dbeedle@ilstu.bitnet                          Normal, Il   61761

paw@eleazar.dartmouth.edu (Pat Wilson) (06/05/91)

dbeedle@rs6000.cmp.ilstu.edu (Dave Beedle) writes:

>>>Recently I have the 3005 upgrade installed(RS6000/320). Since then I got
>>>a lot of trouble with the ethernet. The network crashes 
>>>frequently(about twice a day, unable to rlogin, telnet, etc), 

>>Sounds familiar.  Do the folks in Austin have any comments?  
>>Systems affected are 320 with 16mb/320mb, and 520 with 16mb/600mb.
>>My 530 (64mb/2.5gb) doesn't seem to be affected... is this a hint?

>     Add a model 530 with 16meg to the list.  I post about this (or
>something very similar) a couple of weeks ago.  I did talk with software
>support but they were unable to help.  They did suggest having our CE check
>trace data from the network card (token ring in our case) but I never got
>around to calling.  

Odd.  Just thought I'd add (to the confusion of all) that I've seen 
no such problems while upgrading _my_ 3005 systems (320 16/320s, with
either thick or thin ethernet).  I was coming from 3003 - I wonder if that
made a difference?


-- 
Pat Wilson
Systems Manager, Project NORTHSTAR
paw@northstar.dartmouth.edu

walter@hyper.hyper.com (Walter R. Steinemann) (06/07/91)

In article <1991Jun5.131358.18905@dartvax.dartmouth.edu> paw@eleazar.dartmouth.edu (Pat Wilson) writes:
>dbeedle@rs6000.cmp.ilstu.edu (Dave Beedle) writes:
>
>>>>Recently I have the 3005 upgrade installed(RS6000/320). Since then I got
>>>>a lot of trouble with the ethernet. The network crashes 
>>>>frequently(about twice a day, unable to rlogin, telnet, etc), 
>
>>>Sounds familiar.  Do the folks in Austin have any comments?  
>>> ...
>>     Add a model 530 with 16meg to the list.  I post about this (or
>>something very similar) a couple of weeks ago. ...
>
>Odd.  Just thought I'd add (to the confusion of all) that I've seen 
>no such problems while upgrading _my_ 3005 systems (320 16/320s, with
>either thick or thin ethernet).  I was coming from 3003 - I wonder if that
>made a difference?

Why not a bit more confusion ...
We have a 520 which was recently upgraded to 3005.  Our rlogins seem to be fine
(I don't think we use telnet or ftp very often), BUT we have PCs running FTP
Inc.'s PC-TCP software to NFS mount parts of the 520's file systems, and these
CRASH our IBM.  My co-worker is talking to FTP to figure out what the problem
is; it may have something to do with symlinks.  I suspect there is also
something wrong on our 520 - it should NOT CRASH, and running the same thing
on the PC and mounting from our Personal IRIS does not CRASH it (it generates
a few NFS errors, but it still works).

-- WalteR --
-- 
Walter R. Steinemann                    HYPERCUBE, INC.
                                        #7-419 Phillip Street
Phone:  (519) 725-4040                  Waterloo, Ontario
email:  walter@hyper.com                N2L 3X2

jet@karazm.math.uh.edu (J Eric Townsend) (06/09/91)

Another datapoint:  Our 320s w/16Mb under 3005 won't accept incoming
telnet sessions from our terminal servers (of all things).  They will
accept rlogin connections, however.

--
J. Eric Townsend - jet@uh.edu - bitnet: jet@UHOU - vox: (713) 749-2126
Skate UNIX! (curb fault: skater dumped)

   --  If you're hacking PowerGloves and Amigas, drop me a line. --

jcburt@ipsun.larc.nasa.gov (John Burton) (06/10/91)

In article <1991Jun8.213817.13113@menudo.uh.edu> jet@karazm.math.uh.edu (J Eric Townsend) writes:
>
>
>
>Another datapoint:  Our 320s w/16Mb under 3005 won't accept incoming
>telnet sessions from our terminal servers (of all things).  They will
>accept rlogin connections, however.
>
>--
>J. Eric Townsend - jet@uh.edu - bitnet: jet@UHOU - vox: (713) 749-2126
>Skate UNIX! (curb fault: skater dumped)
>
>   --  If you're hacking PowerGloves and Amigas, drop me a line. --


Hmmmm...This is exactly the problem I was having with AIX 3.1.3 running
on a 16 meg 520. Since we upgraded to 3.1.5 this problem has gone away;
now both rlogin AND telnet from terminal servers work just fine...

John

irwin@uvmark.uucp (Frank Irwin) (06/14/91)

In article <676362343.53@egsgate.FidoNet.Org> Dave.Beedle@f98.n250.z1.FidoNet.Org (Dave Beedle) writes:
>
>>
>>>Recently I have the 3005 upgrade installed(RS6000/320). Since then I got
>>>a lot of trouble with the ethernet. The network crashes 
>>>frequently(about twice a day, unable to rlogin, telnet, etc), 
>>
>>Sounds familiar.  Do the folks in Austin have any comments?  
>>Systems affected are 320 with 16mb/320mb, and 520 with 16mb/600mb.
>>My 530 (64mb/2.5gb) doesn't seem to be affected... is this a hint?
>
>     Add a model 530 with 16meg to the list.  I post about this (or
>something very similar) a couple of weeks ago.  I did talk with software

We had this problem on 3003 and 3005 on our 520 w/32MB.  John Kubiak
of Tech Support finally figured out that it was an mbuf deficiency.

To see if you have the same problem, run the command:

# netstat -m

The results that I got (this was before rebooting the system, which always
cleared things up) were:

912/1024 mbufs in use:
	479 mbufs allocated to data
	77 mbufs allocated to packet headers
	124 mbufs allocated to socket structures
	215 mbufs allocated to protocol control blocks
	2 mbufs allocated to routing table entries
	13 mbufs allocated to socket names and addresses
	2 mbufs allocated to interface addresses
469/469 mapped pages in use
2004 Kbytes allocated to network  (99% in use)
293646 requests for memory denied

The numbers of mbufs in use, mapped pages in use, Kbytes allocated to network,
and requests for memory denied were the clues.
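
For anyone who wants to check several machines at once, here is a minimal
Python sketch of the same check. It assumes the `netstat -m` output has exactly
the layout shown above (other AIX levels may print it differently), and the
function names and the 90% threshold are made up for illustration, not
anything from IBM.

```python
import re

# Summary lines copied from the netstat -m output quoted in this post.
SAMPLE = """\
912/1024 mbufs in use:
469/469 mapped pages in use
2004 Kbytes allocated to network  (99% in use)
293646 requests for memory denied
"""


def parse_netstat_m(text):
    """Extract the summary figures from `netstat -m` output.

    Assumes the line layout shown in the post; lines that do not
    match are simply skipped.
    """
    stats = {}
    m = re.search(r"(\d+)/(\d+) mbufs in use", text)
    if m:
        stats["mbufs_used"] = int(m.group(1))
        stats["mbufs_total"] = int(m.group(2))
    m = re.search(r"(\d+)/(\d+) mapped pages in use", text)
    if m:
        stats["pages_used"] = int(m.group(1))
        stats["pages_total"] = int(m.group(2))
    m = re.search(r"(\d+) Kbytes allocated to network\s+\((\d+)% in use\)", text)
    if m:
        stats["kbytes"] = int(m.group(1))
        stats["pct_in_use"] = int(m.group(2))
    m = re.search(r"(\d+) requests for memory denied", text)
    if m:
        stats["denied"] = int(m.group(1))
    return stats


def looks_starved(stats, pct_threshold=90):
    """Hypothetical rule of thumb: denied requests plus a nearly full pool."""
    return stats.get("denied", 0) > 0 and stats.get("pct_in_use", 0) >= pct_threshold


if __name__ == "__main__":
    print(looks_starved(parse_netstat_m(SAMPLE)))
```

Feeding it the text of a real `netstat -m` run instead of SAMPLE gives a quick
yes/no, but the raw numbers are still what Tech Support will ask for.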

To fix this, I changed the "Maximum Kbytes of real memory allowed for MBUFS"
from 2000 to 4000 using SMIT ("Change/Show Operating System Parameters").

I also had to run the following commands:

# no -o thewall=4000
# chdev -l sys0 -a maxmbuf=4000

The system has been working fine since then, but before the change it would
always work for a couple of weeks after a reboot, and it's only been a couple
of days, so it's too early to be sure.
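
One way to tell sooner whether the change is holding is to save `netstat -m`
output at two different times and compare the "requests for memory denied"
line. A minimal sketch, again assuming the output format shown above; the
function names are hypothetical.

```python
import re


def denied_count(text):
    """Return the 'requests for memory denied' figure from netstat -m output."""
    m = re.search(r"(\d+) requests for memory denied", text)
    if m is None:
        raise ValueError("no 'requests for memory denied' line found")
    return int(m.group(1))


def denied_since(earlier, later):
    """Denied requests accumulated between two saved netstat -m outputs.

    A positive result means requests were still being denied in that
    interval; a negative one suggests the counter was reset in between
    (for example by a reboot).
    """
    return denied_count(later) - denied_count(earlier)
```

If the difference stays at zero over a busy day or two, the larger limit is
probably enough for the current load.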

If you want to call Tech Support, the problem # is 5859.
-- 
  ===========================================================================
  Frank Irwin                   |  "The game was nothing to write home about,
  Vmark Software, Inc.          |   Unless you were writing home for money."
   ..uunet!merk!uvmark!irwin    |        - Jesse Grant  (thnx Mike)

mengel@dcdmwm.uucp (Marc Mengel) (06/26/91)

In article <1991Jun13.223015.70199@uvmark.uucp> irwin@uvmark.uucp (Frank Irwin) writes:
>In article <676362343.53@egsgate.FidoNet.Org> Dave.Beedle@f98.n250.z1.FidoNet.Org (Dave Beedle) writes:
>>
>>>>Recently I have the 3005 upgrade installed(RS6000/320). Since then I got
>>>>a lot of trouble with the ethernet. The network crashes 
>>>>frequently(about twice a day, unable to rlogin, telnet, etc), 
>>>Sounds familiar.  Do the folks in Austin have any comments?  
>>     Add a model 530 with 16meg to the list.  I post about this (or
>>something very similar) a couple of weeks ago.  I did talk with software
>We had this problem on 3003 and 3005 on our 520 w/32MB.  John Kubiak
>of Tech Support finally figured out that it was an mbuf deficiency.

Hmm... that's interesting; one of the folks here at Fermilab was sent
a patched ethernet driver that is supposed to fix a "receiver hang"
for the same sort of symptoms (although his was hanging roughly
once every two days).

The patch is labeled:

	APAR#: a16808

	SOFTWARE LEVEL(S):
	...[install instructions deleted]
	DATE: 06/17/91

We've installed it on his machine that was having the trouble, and one or
two others, and so far no repeats.  The mbuf problem sounds like a possibility
too, though...
-------
Marc Mengel
mengel@fnal.fnal.gov