[comp.sys.intel] Tuning Streams

peter@ficc.uu.net (Peter da Silva) (03/13/90)

We're experiencing some problems with a network implementation on an intel
320 running System V/386. The performance is dog slow, and we suspect that
the fact that it goes through streams may have something to do with it. Has
anyone any suggestions for how to go about tuning the system to optimise
streams throughput?
-- 
 _--_|\  `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \  'U`
\_.--._/
      v

pb@idca.tds.PHILIPS.nl (Peter Brouwer) (03/13/90)

 In article <4Y62V32xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>We're experiencing some problems with a network implementation on an intel
>320 running System V/386. The performance is dog slow, and we suspect that
>the fact that it goes through streams may have something to do with it. Has
>anyone any suggestions for how to go about tuning the system to optimise
>streams throughput?

Yes, you are right to suspect streams. But actually it's not streams
itself but the spl calls that are initiated by the streams library functions.

In our experience the overhead varies between 8% and 30%, depending on
the streams modules used. For streams tty it is about 10%.
In one case we measured an overhead of 30%: this was a dc test program
generating a 100% CPU load, and 30% of that was due to spl calls. The big
spender is the function that changes the interrupt level in the PIC chip.

There is nothing to tune for this. The thing to do is to check how the
source code uses streams calls.
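
As a rough illustration of why the calling pattern matters, here is a
user-space model (not kernel code) that counts how often the interrupt
level would be changed. Every name in it (set_level, per_message_cost,
batched_cost, the level value 5) is made up for the sketch, and it assumes
each level change costs one PIC mask write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are real kernel API. */
static unsigned long pic_mask_writes;  /* simulated PIC reprogramming count */
static int current_level;              /* simulated interrupt priority level */

/* Model of an spl-style call: set a new level and return the old one.
   Each call is charged one PIC mask write (an assumption of this sketch). */
static int set_level(int new_level)
{
    int old = current_level;
    current_level = new_level;
    pic_mask_writes++;
    return old;
}

/* Raise and restore the level around every message: 2 writes per message. */
unsigned long per_message_cost(size_t nmsgs)
{
    size_t i;

    pic_mask_writes = 0;
    for (i = 0; i < nmsgs; i++) {
        int s = set_level(5);   /* enter critical section */
        /* ... handle one message ... */
        set_level(s);           /* leave critical section */
    }
    assert(current_level == 0); /* level is back where it started */
    return pic_mask_writes;
}

/* Raise once, handle the whole batch, restore once: 2 writes in total. */
unsigned long batched_cost(size_t nmsgs)
{
    size_t i;
    int s;

    pic_mask_writes = 0;
    s = set_level(5);
    for (i = 0; i < nmsgs; i++) {
        /* ... handle one message ... */
    }
    set_level(s);
    assert(current_level == 0); /* level is back where it started */
    return pic_mask_writes;
}
```

Batching like this keeps interrupts blocked for longer, so whether it is
acceptable depends on the driver in question.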
-- 
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcvax!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE:ext [+31] [0]55 432523, # 

dbrown@apple.com (David Brown) (03/15/90)

In article <4Y62V32xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> We're experiencing some problems with a network implementation on an intel
> 320 running System V/386. The performance is dog slow, and we suspect that
> the fact that it goes through streams may have something to do with it. Has
> anyone any suggestions for how to go about tuning the system to optimise
> streams throughput?

One easy thing to do is run "crash" and type "strstat" to get streams 
statistics, and look for failures and/or maximums near the limits (you do 
not always get any sort of notification of failures - just unusual 
behavior).  If you find anything, then up those streams parameters and 
rebuild your kernel.
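
The check described above can be written down as a small rule. The sketch
below assumes you copy the figures for each resource by hand into a
record; the struct layout, the field names and the percentage threshold
are all hypothetical and are not the actual strstat output format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one resource, filled in by hand from the
   statistics; field names are illustrative only. */
struct strres {
    const char *name;      /* label for the resource */
    unsigned configured;   /* limit built into the kernel */
    unsigned max_used;     /* highest usage seen so far */
    unsigned failures;     /* allocation failures seen so far */
};

/* Returns 1 if the figures suggest raising the parameter: any failure
   at all, or a maximum at or above pct_threshold percent of the limit. */
int needs_raising(const struct strres *r, unsigned pct_threshold)
{
    assert(r != NULL);
    if (r->failures > 0)
        return 1;
    if (r->configured == 0)
        return 0;          /* nothing configured, nothing to compare */
    /* max_used / configured >= pct / 100, without floating point */
    return (unsigned long)r->max_used * 100UL >=
           (unsigned long)r->configured * (unsigned long)pct_threshold;
}
```

A threshold of around 90 is only a guess at what "near the limits" means;
pick whatever margin suits your load.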

David Brown        415-649-4000
Orion Network Systems
(a subsidiary of Apple Computer)
1995 University Ave. Suite 350
Berkeley, CA 94704

thinman@cup.portal.com (Lance C Norskog) (03/16/90)

Ummmmm, I just remembered something.  The reason you see lots of time
charged to the splx() kernel routine is because kernel profiling is
very screwy in regards to interrupts, and all time spent in device 
interrupts is 'adjusted' (in a peculiar way) right after the splx()
routine drops interrupts to 0.  It's not really spending all that time
fiddling the PIC's.

Lance Norskog
Sales Engineer
Streamlined Networks
408-727-9909

pb@idca.tds.PHILIPS.nl (Peter Brouwer) (03/19/90)

 In article <27916@cup.portal.com> thinman@cup.portal.com (Lance C Norskog) writes:
>Ummmmm, I just remembered something.  The reason you see lots of time
>charged to the splx() kernel routine is because kernel profiling is
>very screwy in regards to interrupts, and all time spent in device 
>interrupts is 'adjusted' (in a peculiar way) right after the splx()
>routine drops interrupts to 0.  It's not really spending all that time
>fiddling the PIC's.
>
I think this is a reaction to a previous posting of mine, stating that a
lot of time is spent in spl routines (varying from 10% to 30% depending on
the drivers' usage of streams).
I did not measure this with the kernel profiler; you are correct that it
gives inaccurate results.
I measured it with what's called a soft analyst. Simply put, this is a
very clever logic analyser. It looks at the pins of the chip, in this
case a 386, and samples the events there at 200 ns intervals.
You load the symbol table of the software to be measured (/unix in this
case) into the analyser software and specify which functions you want
to measure.
In performance mode it lists the number of times each function is
called and the total time spent in it (average, min and max are options).
This is where I got my info from. These figures are very accurate.
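
To make the kind of per-function bookkeeping concrete, here is a minimal
sketch of such an accumulator. It is not the analyser's real data format:
the struct, the function names and the idea of counting consecutive
samples per call are assumptions; only the 200 ns interval comes from the
description above.

```c
#include <assert.h>

#define SAMPLE_NS 200UL   /* sampling interval quoted above */

/* Hypothetical per-function record. */
struct fn_stats {
    unsigned long calls;     /* number of calls observed */
    unsigned long total_ns;  /* total time spent in the function */
    unsigned long min_ns;    /* shortest single call */
    unsigned long max_ns;    /* longest single call */
};

void fn_stats_init(struct fn_stats *s)
{
    assert(s != 0);
    s->calls = 0;
    s->total_ns = 0;
    s->min_ns = 0;
    s->max_ns = 0;
}

/* Record one call that was seen executing for `samples` samples. */
void fn_stats_record(struct fn_stats *s, unsigned long samples)
{
    unsigned long ns = samples * SAMPLE_NS;

    assert(s != 0);
    if (s->calls == 0 || ns < s->min_ns)
        s->min_ns = ns;
    if (ns > s->max_ns)
        s->max_ns = ns;
    s->total_ns += ns;
    s->calls++;
}

/* Average time per call, or 0 if no call was recorded. */
unsigned long fn_stats_average_ns(const struct fn_stats *s)
{
    assert(s != 0);
    return s->calls ? s->total_ns / s->calls : 0;
}
```

With one record per function of interest, the share of CPU time is the
function's total_ns divided by the length of the measurement.
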
I also did a test by patching spltty to splhi in the kernel.
I measured response times of an order entry application (16 users)
with the patched kernel. They went down from an average of 1 second to
0.82 seconds. The CPU time went down by 8%.
This application does terminal I/O on streams-based terminals.
So you see the influence of the setpicmask function in the spl handling.
-- 
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcsun!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE:ext [+31] [0]55 432523, #