peter@ficc.uu.net (Peter da Silva) (03/13/90)
We're experiencing some problems with a network implementation on an intel 320 running System V/386. The performance is dog slow, and we suspect that the fact that it goes through streams may have something to do with it. Has anyone any suggestions for how to go about tuning the system to optimise streams throughput?
-- 
 _--_|\  `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \  'U`
\_.--._/
      v
pb@idca.tds.PHILIPS.nl (Peter Brouwer) (03/13/90)
In article <4Y62V32xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>We're experiencing some problems with a network implementation on an intel
>320 running System V/386. The performance is dog slow, and we suspect that
>the fact that it goes through streams may have something to do with it. Has
>anyone any suggestions for how to go about tuning the system to optimise
>streams throughput?

Yes, you are in the right place to suspect streams. Actually, though, it is not streams itself but the spl calls that are initiated by the streams library functions. In our experience the overhead varies between 8% and 30%, depending on the stream modules used. For the streams tty it is about 10%. In one case we measured an overhead of 30%: that was a dc test program generating a 100% CPU load, and 30% of that load was due to spl calls. The big spender is the function that changes the interrupt level in the PIC chip. There is nothing to tune for this. The thing to do is to check how the source code uses the streams calls.
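As a rough illustration of where those spl calls come from, here is a minimal sketch of a write-side put routine in the usual SVR3 style. It is not taken from any real driver: the name xxwput is invented, and only splstr/splx, putq and the queue_t/mblk_t types are standard DDI names.

/*
 * Illustrative sketch only -- not from any actual driver source.
 * A typical SVR3 STREAMS put routine raises the processor priority
 * around its queue manipulation; on the 386 port each spl transition
 * ends up rewriting the 8259 PIC mask, which is where the measured
 * overhead comes from.
 */
#include <sys/types.h>
#include <sys/stream.h>

int
xxwput(q, mp)
	queue_t *q;
	mblk_t *mp;
{
	int s;

	s = splstr();	/* block STREAMS interrupts: one PIC rewrite */
	putq(q, mp);	/* defer the message to the service routine  */
	splx(s);	/* restore the old level: another PIC rewrite */
	return 0;
}

-- 
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcvax!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE:ext [+31] [0]55 432523, #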
dbrown@apple.com (David Brown) (03/15/90)
In article <4Y62V32xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> We're experiencing some problems with a network implementation on an intel
> 320 running System V/386. The performance is dog slow, and we suspect that
> the fact that it goes through streams may have something to do with it. Has
> anyone any suggestions for how to go about tuning the system to optimise
> streams throughput?

One easy thing to do is run "crash" and type "strstat" to get streams statistics, and look for failures and/or maximums near the limits (you do not always get any sort of notification of failures - just unusual behavior). If you find anything, then up those streams parameters and rebuild your kernel.

David Brown                    415-649-4000
Orion Network Systems (a subsidiary of Apple Computer)
1995 University Ave.  Suite 350
Berkeley, CA  94704
thinman@cup.portal.com (Lance C Norskog) (03/16/90)
Ummmmm, I just remembered something. The reason you see lots of time charged to the splx() kernel routine is that kernel profiling is very screwy with regard to interrupts: all time spent in device interrupts is 'adjusted' (in a peculiar way) right after the splx() routine drops the interrupt level to 0. It's not really spending all that time fiddling with the PICs.

Lance Norskog
Sales Engineer
Streamlined Networks
408-727-9909
carroll@m.cs.uiuc.edu (03/18/90)
/* Written 6:32 pm Mar 14, 1990 by dbrown@apple.com in m.cs.uiuc.edu:comp.unix.i386 */
In article <4Y62V32xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>> We're experiencing some problems with a network implementation on an intel
>> 320 running System V/386. The performance is dog slow, [ ... ]
>One easy thing to do is run "crash" and type "strstat" to get streams
>statistics, and look for failures and/or maximums near the limits [ ... ]
/* End of text from m.cs.uiuc.edu:comp.unix.i386 */

I'm having very slow response from the network, under 386/ix 2.0.2. My stats from crash look like:

ITEM                  CONFIG   ALLOC    FREE    TOTAL     MAX    FAIL
streams                   96      48      48       81      51       0
queues                   300     238      62      216     252       0
message blocks          2150     106    2044   266571     139       0
data block totals       1720     106    1614   238030     139       0
data block size 4        256       0     256    17618       3       0
data block size 16       256      14     242    26723      19       0
data block size 64       256       8     248   152742      39       0
data block size 128      512      84     428    15986      91       0
data block size 256      128       0     128    24795       3       0
data block size 512      128       0     128       27       2       0
data block size 1024      64       0      64       42       1       0
data block size 2048      64       0      64       97       4       0
data block size 4096      56       0      56        0       0       0
Count of scheduled queues:  0

Additionally, the problem often manifests itself under NFS when trying to read files. If the file is longer than a certain small size (roughly a few K), nothing will be read, while small files are read just fine. I will get an "NFS server not responding" error, while telnet/rlogin/ping all report that everything is fine.

P.S. I looked through old notes, but I didn't see anything on this topic. I thought I remembered such a discussion a while back - if anyone has it, please email it to me. Thanks.

Alan M. Carroll                "Like the north wind whistling down the sky,
carroll@cs.uiuc.edu             I've got a song, I've got a song.
Conversation Builder:           I carry it with me and I sing it loud
+ Tomorrow's Tools Today +      If it gets me nowhere, I'll go there proud"
Epoch Development Team
CS Grad / U of Ill @ Urbana    ...{ucbvax,pur-ee,convex}!cs.uiuc.edu!carroll
plocher@sally.Sun.COM (John Plocher) (03/19/90)
+--
| >> We're experiencing some problems with a network implementation on an intel
| >> 320 running System V/386. The performance is dog slow, [ ... ]
| Additionally, the problem often manifests itself under NFS when trying to read
| files. If the file is longer than a certain small size (roughly a few K), nothing
| will be read, while small files are read just fine. I will get an
+--

Aha! It sounds like you need to set rsize=1024,wsize=1024 in your /etc/fstab file for all your NFS devices. Most (all?) 386 TCP/IP implementations on ethernet have a maximum packet size of 1K, while most other NFS systems assume 8K. Files under 1K work OK, but big files fail.

Example (this is actually only ONE line in /etc/fstab!):

sun:/usr/spool/news /usr/spool/news nfs ro,soft,bg,intr,timeo=70,wsize=1024,rsize=1024,retrans=5 0 0

This *is* mentioned in Wollongong's 386 TCP/IP & NFS manuals; I don't know about LAI, ISC, Everex, or Intel...

  -John Plocher
pb@idca.tds.PHILIPS.nl (Peter Brouwer) (03/19/90)
In article <27916@cup.portal.com> thinman@cup.portal.com (Lance C Norskog) writes:
>Ummmmm, I just remembered something. The reason you see lots of time
>charged to the splx() kernel routine is that kernel profiling is
>very screwy with regard to interrupts: all time spent in device
>interrupts is 'adjusted' (in a peculiar way) right after the splx()
>routine drops the interrupt level to 0. It's not really spending all
>that time fiddling with the PICs.

I think this is a reaction to a previous posting of mine, stating that a lot of time is spent in the spl routines (varying from 10% to 30%, depending on the drivers' usage of streams). I did not measure this with the kernel profiler; you are correct that it gives inaccurate results. I measured it with what is called a "soft analyst". Simply put, this is a very clever logic analyser: it looks at the pins of the chip, in this case a 386, and samples the events there at a 200 ns interval. You load the symbol table of the software to be measured (/unix in this case) into the analyser's software and specify which functions you want to measure. In performance mode it lists the number of times each function is called and the total time spent in it (average, min and max are options). This is where I got my info from. These figures are very accurate.

I also did a test by patching spltty to splhi in the kernel. I measured response times of an order entry application (16 users) with the patched kernel. They went down from an average of 1 second to 0.82 seconds, and the CPU time went down by 8%. This application does terminal I/O on streams-based terminals. So you see the influence of the setpicmask function in the spl handling.
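To put that in context: the expensive part of an spl transition on the 386 is reprogramming the two 8259A interrupt controllers. The following is a conceptual sketch only of a setpicmask-style routine, assuming the standard AT PIC mask ports (0x21 and 0xa1); apart from the setpicmask name used above, every identifier here is invented for illustration, and this is not the actual System V/386 source.

/*
 * Conceptual sketch only -- NOT the actual System V/386 code.
 * Each spl transition ends up writing the interrupt mask into both
 * 8259A controllers; those I/O port writes per call are what make
 * spl-heavy code such as STREAMS show up in the measurements above.
 */
#define MASTER_PIC_MASK	0x21	/* 8259A OCW1 port, master PIC */
#define SLAVE_PIC_MASK	0xa1	/* 8259A OCW1 port, slave PIC  */

extern void outb();			/* outb(port, byte): write one byte to an I/O port */
extern unsigned short picmask[8];	/* hypothetical per-level mask table */

static int curlevel;

int
setpicmask(level)
	int level;
{
	int old = curlevel;
	unsigned short mask = picmask[level];

	outb(MASTER_PIC_MASK, mask & 0xff);
	outb(SLAVE_PIC_MASK, (mask >> 8) & 0xff);
	curlevel = level;
	return old;
}

-- 
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcsun!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE:ext [+31] [0]55 432523, #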