[comp.unix.i386] ESIX has a bug.

jb@aablue.UUCP (John B Scalia) (11/15/89)

Hello, a short while ago I posted about a problem I had encountered with
ESIX's HDB uucp after I loaded its TCP/IP software. Thanks to all those
who responded and made suggestions. Yesterday, the ESIX tech folks
responded about this problem. (Yeah!) They confirmed that I wasn't
imagining the problem and that there is something happening with either
sendmail or rwho under TCP/IP, i.e., a bug. Now that they know
about it, they've said they'll try to get it fixed.

In case others might be considering ESIX, the problem caused HDB i/o to
drop by a factor of 8 when TCP/IP was loaded. HDB did NOT fail altogether!
Running Uutry with -x 9 showed that packets were either getting corrupted or
not being acknowledged after receipt. Removing TCP/IP and making no
other changes, either hardware or software, made the problem disappear.
BTW, the problem was much worse when ESIX was receiving.
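
For anyone who wants to put a number on a slowdown like this, here is a
minimal Python sketch of the arithmetic. The byte count and timings are
made-up illustration values, not measurements from the system described
above, and cps and slowdown_factor are hypothetical helper names.

```python
def cps(bytes_transferred, seconds):
    """Return throughput in characters (bytes) per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred / seconds


def slowdown_factor(baseline_cps, degraded_cps):
    """Return how many times slower the degraded transfer is."""
    if degraded_cps <= 0:
        raise ValueError("degraded_cps must be positive")
    return baseline_cps / degraded_cps


# Hypothetical example: the same 96,000-byte file moved in 80 seconds
# without TCP/IP loaded and in 640 seconds with it loaded.
without_tcpip = cps(96_000, 80)
with_tcpip = cps(96_000, 640)
print(slowdown_factor(without_tcpip, with_tcpip))
```

Timing the same file in both configurations keeps the comparison fair,
since uucp throughput also depends on file size and line conditions.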

All in all, I'd still recommend ESIX. The technician I spoke with said
that it did take a little while to duplicate the problem I had and then
figure out why it was occurring. (BTW, they DO have net access, but it
seems that they mostly lurk silently, i.e., they still prefer voice calls.)

jb@aablue
-- 
A A Blueprint Co., Inc. - Akron, Ohio +1 216 794-8803 voice
UUCP:	   {uunet!}aablue!jb	(John B. Scalia)

Just a little more nonsense to clutter up the net.

palowoda@megatest.UUCP (Bob Palowoda) (11/17/89)

From article <613@aablue.UUCP>, by jb@aablue.UUCP (John B Scalia):
> Hello, a short while ago I posted about a problem I had encountered with
> ESIX's HDB uucp after I loaded its TCP/IP software. Thanks to all those
> who responded and made suggestions. Yesterday, the ESIX tech folks
> responded about this problem. (Yeah!) They confirmed that I wasn't
> imagining the problem and that there is something happening with either
> sendmail or rwho under TCP/IP, i.e., a bug. Now that they know
> about it, they've said they'll try to get it fixed.



  Hmm, I'm running about the same configuration as you, with sendmail, rwho,
  etc., and my typical throughput on news is about 1300 to 1500 cps.
  Please give more detail about a bug before posting it to the net.
  Just saying "Yeah, I loaded it, it don't work, they said it doesn't work,
  they said it had a bug, but I don't know what the bug is" is not enough.
  It's really not fair to any vendor's software product. What people want
  to read is "Yeah, it's got a bug, this is a list of my hardware, this
  is what the bug does, this is a list of all my config files that
  were used, and most of all, if you agree with the vendor on what the
  bug is, write it down."


> 
> In case others might be considering ESIX, the problem caused HDB i/o to
> drop by a factor of 8 when TCP/IP was loaded. HDB did NOT fail altogether!
> Running Uutry with -x 9 showed that packets were either getting corrupted or
> not being acknowledged after receipt. Removing TCP/IP and making no
> other changes, either hardware or software, made the problem disappear.
> BTW, the problem was much worse when ESIX was receiving.

  Interesting. What interrupt are you using on the Ethernet card?
> 
> All in all, I'd still recommend ESIX. The technician I spoke with said
> that it did take a little while to duplicate the problem I had and then
> figure out why it was occurring.

  I'm confused. In your first paragraph you said "They had confirmed
  that you were not imagining the problem". Now you're saying they had
  to figure out a way to duplicate the problem?  What state do you live in?



  ---Bob


-- 
 Bob Palowoda    *Home of Fiver BBS*                   login: bbs               
 Work: {sun,decwrl,pyramid}!megatest!palowoda                           
 Home: {sun}ys2!fiver!palowoda   (A XBBS System)       2-lines   
 BBS:  (415)623-8809 2400/1200 (415)623-8806 1200/2400/9600/19200

jde@everex.UUCP (Jeff Ellis) (11/22/89)

In article <613@aablue.UUCP> jb@aablue.UUCP (John B Scalia) writes:
>figure out why it was occurring. (BTW, they DO have net access, but it
>seems that they mostly lurk silently, i.e., they still prefer voice calls.)
>jb@aablue

Hello from ESIX....
Well, our Usenet system is now up full time.  The last sysadm of this system
had an experimental filesystem on the computer, which for the last
8 weeks brought the system down almost daily.  Now that we have a production
Rev C system up in its place, we will no longer miss your postings and your
e-mail.

About our sendmail .... It was written so that it did all of its processing
(starting jobs... etc) and only then checked to see if there was anything for
it to do... you can see how that messes up performance :-) It has been fixed,
but it has not been decided how the patch will be handled
(i.e. put the fix in Rev D?, or send it out?)
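
To illustrate the ordering problem described above, here is a minimal
Python sketch. It is not the actual sendmail source, and the fix that was
made is not shown here; start_jobs, deliver, run_work_first and
run_check_first are hypothetical names used only to show why doing the
startup work before checking for work costs time on every pass.

```python
def start_jobs(log):
    """Stand-in for the expensive startup work (hypothetical)."""
    log.append("start_jobs")


def deliver(message, log):
    """Stand-in for handling one queued message (hypothetical)."""
    log.append("deliver " + message)


def run_work_first(queue, log):
    """Ordering as described: do the startup work, then check the queue."""
    start_jobs(log)
    if not queue:
        return
    for message in queue:
        deliver(message, log)


def run_check_first(queue, log):
    """Alternative ordering: check the queue, then do the startup work."""
    if not queue:
        return
    start_jobs(log)
    for message in queue:
        deliver(message, log)


old_log, new_log = [], []
run_work_first([], old_log)    # empty queue still pays for start_jobs
run_check_first([], new_log)   # empty queue costs nothing
print(old_log, new_log)
```

With an empty queue the first version still records the startup step and
the second records nothing; with a non-empty queue both do the same work.
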
If you do not need sendmail, comment it out in /etc/rc2.d/S90netsetup so that
it does not start.
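
A minimal Python sketch of that edit follows, for illustration only. The
contents of S90netsetup are not shown in this thread, so the sample text
below is invented, and comment_out is a hypothetical helper. It works on a
string rather than on the real file, and it assumes sendmail is started by
a single line; a multi-line shell construct would need editing by hand.

```python
def comment_out(script_text, keyword):
    """Prefix every uncommented line containing keyword with '# '."""
    result = []
    for line in script_text.splitlines():
        if keyword in line and not line.lstrip().startswith("#"):
            result.append("# " + line)
        else:
            result.append(line)
    return "\n".join(result)


# Hypothetical script contents; the real S90netsetup will differ.
sample = "\n".join([
    "/usr/etc/rwhod",
    "/usr/lib/sendmail -bd -q1h",
    "# an already disabled sendmail line",
])
print(comment_out(sample, "sendmail"))
```

Keep a copy of the original script before changing it, so the line can be
restored once a fixed sendmail is available.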

jb@aablue.UUCP (John B Scalia) (11/22/89)

Although Bob and I have had some discussions privately (because he's got
running what I want to have running), I will respond further publicly,
just in case I confused anyone with my original postings.

In article <10123@megatest.UUCP> palowoda@megatest.UUCP (Bob Palowoda) writes:
>From article <613@aablue.UUCP>, by me (John B Scalia):
>> [a quicky about a confirmed problem with ESIX's tcp/ip and HDB]
>
>  Hmm, I'm running about the same configuration as you, with sendmail, rwho,
>  etc., and my typical throughput on news is about 1300 to 1500 cps.
>  Please give more detail about a bug before posting it to the net.
>  Just saying "Yeah, I loaded it, it don't work, they said it doesn't work,
>  they said it had a bug, but I don't know what the bug is" is not enough.
>  It's really not fair to any vendor's software product. What people want
>  to read is "Yeah, it's got a bug, this is a list of my hardware, this
>  is what the bug does, this is a list of all my config files that
>  were used, and most of all, if you agree with the vendor on what the
>  bug is, write it down."

Sorry, I had posted my equipment information in an earlier posting,
when I asked if anyone knew what was causing the problem. I'm still not
sure, however, what EXACTLY is causing the problem. All the ESIX
technician would tell me was that the problem was in either sendmail or
rwho. His suggestion was to comment out both of those in rc.d. For the
record, my hardware is nothing special. The one thing that
seems to have been missed is that I said I don't even have
a network card of any type in my system yet.

Several people suggested an interrupt problem caused by the network
card. I can see how that would cause a problem, but with no card
installed, you can see why I dismissed those suggestions.

As to config files, I'm sorry I did not make it clear that the failure
was occurring after using installpkg to put the tcp/ip stuff in place. This
was, to a degree, in my original posting. I had asked if anyone knew where
I could start looking because I have had no experience with networks of
this type. The ESIX manual is not explicit in its explanation as to what
files installpkg modifies or what files are vital for smooth operation.
To that end, I still can't say for sure what is causing the system's
throughput to fall. I must still rely on what the folks at ESIX told me
which is what I relayed.

I did my best to at least document what was happening...
>> In case others might be considering ESIX, the problem caused HDB i/o to
>> drop by a factor of 8 when TCP/IP was loaded. HDB did NOT fail altogether!
>> Running Uutry with -x 9 showed that packets were either getting corrupted or
>> not being acknowledged after receipt. Removing TCP/IP and making no
>> other changes, either hardware or software, made the problem disappear.
>> BTW, the problem was much worse when ESIX was receiving.
>
>  Interesting. What interrupt are you using on the Ethernet card?

See above for that.

And finally I said...
>> All in all, I'd still recommend ESIX. The technician I spoke with said
>> that it did take a little while to duplicate the problem I had and then
>> figure out why it was occurring.
>
>  I'm confused. In your first paragraph you said "They had confirmed
>  that you were not imagining the problem". Now you're saying they had
>  to figure out a way to duplicate the problem?  What state do you live in?

In my first posting I asked for information and described the problems
I had encountered. When I posted about the bug, approximately one week
had elapsed. When I reported the bug, as described by the tech at ESIX,
what I thought I had conveyed was that during that week they
confirmed that I hadn't hallucinated, i.e., they managed to duplicate what
I was seeing. In this case, I assumed they saw the same EFFECT, which
they then attempted to isolate to a specific piece of code that was
causing it. Obviously, once you determine the piece of code causing the
problem, you can fix it, but then who knows what else may be affected?

The state I live in? I just call it confusion :-)

jb@aablue
-- 
A A Blueprint Co., Inc. - Akron, Ohio +1 216 794-8803 voice
UUCP:	   {uunet!}aablue!jb	(John B. Scalia)

Just a little more nonsense to clutter up the net.