[comp.sys.sgi] TCP/IP process distribution

amedeo@DCDLAA.FNAL.GOV (Lisa Amedeo) (01/22/91)

      I am a systems manager and consultant at FERMILAB.  One of the
      departments I work with wants to set up a file server in a 
      heterogeneous environment - SGI, SUN, IBM and DEC "UNIX" boxes
      involved.  One of the SGI's is a 4D/240 and it is currently being
      decided if this box should be the file server. While discussing this
      issue, a co-worker recalled being told, by an SGI rep, that TCP/IP
      would not distribute its processes on a multiple-CPU system.  In
      other words all TCP/IP processes run on a single CPU no matter how
      many CPU's are available.  The advantages of using the 4D/240 as the 
      file server would be drastically reduced if this information is
      correct. 

      If this is true, what are the implications for file serving and 
      for X applications run remotely?  


      FYI the 4D/240 is currently running 3.3.1 of the OS.  

                                            

                                              Lisa







*****************************************************************************
*         *          *          *          *          *          *          *
 *       * *        * *        * *        * *        * *        * *        *
*         *          *          *          *          *          *          *

                                                     LISA AMEDEO
      /       /  ----   ----            FERMI NATIONAL ACCELERATOR LABORATORY
     /       /  /      /   /       =                 CD/DCD/DSG 
    /       /   ----  /---/   ====== =  UNIX SYSTEM ADMINISTRATION CONSULTANT 
   /_____  /   ____/ /   /         =            phone: (708) 840-8023
                                           e-mail: amedeo@dcdlaa.fnal.gov

*         *          *          *          *          *          *          *
 *       * *        * *        * *        * *        * *        * *        *
*         *          *          *          *          *          *          *
*****************************************************************************

vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (01/22/91)

In article <9101212009.AA10711@dcdlaa.fnal.gov>, amedeo@DCDLAA.FNAL.GOV (Lisa Amedeo) writes:
>       I am a systems manager and consultant at FERMILAB.  One of the
>       departments I work with wants to set up a file server in a 
>       heterogeneous environment - SGI, SUN, IBM and DEC "UNIX" boxes
>       involved.  One of the SGI's is a 4D/240 and it is currently being
>       decided if this box should be the file server. While discussing this
>       issue, a co-worker recalled being told, by an SGI rep, that TCP/IP
>       would not distribute its processes on a multiple-CPU system.  In
>       other words all TCP/IP processes run on a single CPU no matter how
>       many CPU's are available.  The advantages of using the 4D/240 as the 
>       file server would be drastically reduced if this information is
>       correct. 
> 
>       If this is true, what are the implications for file serving and 
>       for X applications run remotely?  

Unfortunately, things are not simple enough for a plain "yes, it is true"
or "no, it is not true" to be a valid answer.

1. It is true in IRIX 3.3.2 that "TCP/IP system calls" execute on the
  "network processor."  For example, a process executing a read(2) on a
  socket immediately forces itself to the CPU that fields network device
  interrupts.  Thereafter, it will tend to stay on that processor, as a
  result of the general "processor affinity" mechanism.

2. Thus, gr_osview or other similar utilities tend to show that the
  network processor is busier.

3. Other than a process that is now or was very recently executing
  network-ish system calls, there is no such thing as a "TCP/IP process."

4. It is not clear one way or another what effects this policy has on
  performance, whether measured in TCP Bytes/second, X-window-system
  packets/second, or in any other technical metric.

  If there is 0.9 CPUs' worth of work to be done, and it is all "serial" by
  virtue of having to pass through a network device and the network
  queues, then you will get the best performance by using a single CPU to
  do it.  Serial work is done fastest when you do not have to pay for
  multi-process synchronization.  MP's are great if you have a job or
  jobs that have parts that can be done in parallel.

  It is not clear just how much of TCP/IP, NFS, or UDP/IP can be usefully
  parallelized.  We think we have a modest understanding of our code, and
  some idea how many CPU cycles the pieces require.  The current and next
  major releases spend very few cycles on individual TCP/IP packets.  The
  number of cycles per packet is not very different from the number of
  cycles required to synchronize its processing with other CPU's.  It would
  be too easy to have 4 CPU's 100% loaded, but processing the same number of
  bytes/second.  (I should not write publicly about our numbers.  Doesn't
  Van Jacobson claim that 4.3BSD takes about 200 instructions including
  interrupt overhead to process a TCP/IP packet on some kind of Sun?)

  We were studying MP vs. TCP long before the first clover2's (SGI MP
  systems) were shipped.  Over the years, we built at least one prototype
  MP implementation, but did not pursue it because it was slower.

5. It is not clear how much advantage multiple processors have for
  file serving.  If you're careful enough (we are not yet), then the CPU
  does not touch the data, and is no more than a switchman among the disk,
  the bulk RAM, and the network hardware.  The IRIS MP bus and memory make
  a very nice switching yard for shuffling NFS packets, but it is not clear
  whether having more than one switchman would help or hinder.  In other
  words, a 240 makes a better than average server, more because of its
  memory architecture than because of its many CPUs.

6. Other companies sell "multiprocessor TCP/IP", and people think it is
  a good thing.  This business fact has nothing to do with technical
  details like MBytes/sec.  Silicon Graphics currently has very competitive
  Ethernet, FDDI, Ultranet, and other speeds.  Still, people demand "MP."

  I've grown tired of trying to argue technically with marketeers and
  managers.  As a result, you might expect a future release to have the
  same or slightly worse TCP/IP benchmark performance, but look "good" to
  gr_osview.  Because of processor affinity, it might even be faster as an
  X client.

The best answer is to measure things, and do whatever is fastest,
commensurate with what you want to pay.
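
As a starting point for "measure things", here is a minimal sketch of a TCP
throughput test.  It is an assumption-laden example, not a benchmark
anyone in this thread used: the function name and default sizes are
hypothetical, and it runs over the loopback interface, so it exercises the
protocol code but no network device.  For a real comparison the receiving
half would run on the candidate server.

```python
import socket
import threading
import time


def measure_tcp_throughput(total_bytes=4 * 1024 * 1024, chunk_size=64 * 1024):
    """Send total_bytes over loopback TCP; return (bytes_received, bytes_per_second)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = [0]

    def sink():
        # Accept one connection and count bytes until the sender closes.
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:
                    break
                received[0] += len(data)

    receiver = threading.Thread(target=sink)
    receiver.start()

    payload = b"x" * chunk_size
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        sent = 0
        while sent < total_bytes:
            client.sendall(payload)
            sent += len(payload)
    receiver.join()  # wait until every byte has been read
    elapsed = time.perf_counter() - start
    server.close()
    return received[0], received[0] / elapsed


if __name__ == "__main__":
    nbytes, rate = measure_tcp_throughput()
    print(f"{nbytes} bytes at {rate / 1e6:.1f} MB/s")
```

Running it several times while watching per-CPU load shows whether the
extra CPUs buy any bytes/second or merely look busy.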

Vernon Schryver,   vjs@sgi.com

(This is not an announcement of future products.)