vanandel@stout.ucar.edu (Joe Van Andel) (09/09/89)
I am working on a real-time radar signal processing system that will use multiple 68020 processors running VxWorks on a VME backplane. Because of the large amount of data we are processing (400Kbytes/second), I can't afford to use TCP/IP or RPC interprocess communication methods.

As I read the manual, I conclude that VxWorks doesn't provide facilities for managing shared memory or semaphores between tasks executing on different processors. I very much like VxWorks, but I feel it needs more multi-processor support.

Has anyone else written code (that you would be willing to share) to provide these facilities? Do other real-time operating system vendors offer better multi-processor support?

Joe VanAndel            Internet: vanandel@ncar.ucar.edu
NCAR - RSG
P.O Box 3000            Fax: 303-497-2044
Boulder, CO 80307-3000  Voice: 303-497-2071
drk@athena.mit.edu (David R Kohr) (09/11/89)
In article <4252@ncar.ucar.edu> vanandel@ncar.ucar.edu (Joe Van Andel) writes:
>[...]
>communication methods. As I read the manual, I conclude that VxWorks
>doesn't provide facilities for managing shared memory or semaphores
>between tasks executing on different processors. I very much like
>VxWorks, but I feel it needs more multi-processor support.
>[...]
> Joe VanAndel Internet:vanandel@ncar.ucar.edu
> NCAR - RSG
> P.O Box 3000 Fax: 303-497-2044
> Boulder, CO 80307-3000 Voice: 303-497-2071

I'm using both pSOS (the well-known real-time single-processor kernel) and pRISM (an extension of the pSOS interprocess communication primitives to multi-CPU systems) from Software Components Group (who originally designed pSOS, I believe). I was wondering if VxWorks supports anything like the pRISM primitives. Can you at least buy an add-on package for VxWorks to get these primitives?

David R. Kohr  M.I.T. Lincoln Laboratory  Group 45 ("Radars 'R' Us")
email: KOHR@LL.LL.MIT.EDU or DRK@ATHENA.MIT.EDU
phone: (617)981-0775 (work), (617)527-3908 (home)
projoe@crim.eecs.umich.edu (Joseph A. Dionise) (09/11/89)
In article <4252@ncar.ucar.edu> vanandel@ncar.ucar.edu (Joe Van Andel) writes:
>I am working on a real-time radar signal processing system that will
>use multiple 68020 processors running VxWorks on a VME backplane.
>As I read the manual, I conclude that VxWorks
>doesn't provide facilities for managing shared memory or semaphores
>between tasks executing on different processors.

The support is there for "low-level" communication between processors using shared memory. We recently set up a shared-memory buffer between a pair of 68020's (MVME-133A) and a single 68030 (MVME-147A). I'll outline our setup.

A single 68020 has a 64K segment (excuse the Intel slip) of shared memory located at the upper bounds of its onboard memory. We tricked vxWorks into not using this memory by modifying the sysMemTop routine (in sysLib.c) to return the total amount of onboard RAM minus the amount of shared memory. Specifically, sysMemTop returns 0x0f0000 = 0x100000 - 0x010000. This cpu sets up the shared memory data structures, initializes the semaphores, etc.

We used a very simple queue in the shared memory. In this case, we guarded both reads from the queue and writes to the queue, since both are destructive. The other processors use the sysBusTas routine to perform a test-and-set across the bus on the global semaphores. If they "win", then reading/writing to the queue takes place.

Our boards enable the RMW (Read-Modify-Write) sequence through the use of jumpers (68020) and software (68030). Hence, the sysBusTas routine is really just a call to the 68K tas instruction.

Note that a cold boot will zero all of the onboard memory (including any shared memory segments). If this is not acceptable, then the assembly routine romInit must be modified.

Joseph A. Dionise
Robot Systems Division   Internet : projoe@crim.eecs.umich.edu
University of Michigan   uucp     : {..}!umich.uucp!crim.eecs.umich.edu!projoe
1101 Beal Avenue         BIX      : jdionise
Ann Arbor, MI 48109      (313) 936-2830
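The guarded queue described in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the post: C11 `atomic_flag` stands in for the bus-level test-and-set that sysBusTas performs, and the names `shq_t`, `shq_put`, `shq_get`, `SHQ_SLOTS` and `SHQ_MSG_BYTES` are hypothetical. On real hardware the structure would be placed at the start of the reserved shared-memory segment rather than in ordinary memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SHQ_SLOTS     8   /* hypothetical queue depth */
#define SHQ_MSG_BYTES 16  /* hypothetical fixed message size */

/* Layout that would live in the reserved shared-memory segment. */
typedef struct {
    atomic_flag   lock;   /* stand-in for the global test-and-set semaphore */
    unsigned      head;   /* index of the oldest occupied slot */
    unsigned      count;  /* number of occupied slots */
    unsigned char slot[SHQ_SLOTS][SHQ_MSG_BYTES];
} shq_t;

/* Stand-in for a bus test-and-set: true means the caller "won". */
static bool shq_try_lock(shq_t *q) {
    return !atomic_flag_test_and_set(&q->lock);
}

static void shq_unlock(shq_t *q) {
    atomic_flag_clear(&q->lock);
}

/* Run once, by the CPU that owns the memory, before anyone else touches it. */
void shq_init(shq_t *q) {
    assert(q != NULL);
    atomic_flag_clear(&q->lock);
    q->head = 0;
    q->count = 0;
}

/* Append one message; returns false if the queue is full. */
bool shq_put(shq_t *q, const void *msg) {
    assert(q != NULL && msg != NULL);
    while (!shq_try_lock(q)) { /* spin until we win the semaphore */ }
    bool ok = q->count < SHQ_SLOTS;
    if (ok) {
        unsigned tail = (q->head + q->count) % SHQ_SLOTS;
        memcpy(q->slot[tail], msg, SHQ_MSG_BYTES);
        q->count++;
    }
    shq_unlock(q);
    return ok;
}

/* Remove the oldest message; returns false if the queue is empty. */
bool shq_get(shq_t *q, void *msg) {
    assert(q != NULL && msg != NULL);
    while (!shq_try_lock(q)) { /* spin until we win the semaphore */ }
    bool ok = q->count > 0;
    if (ok) {
        memcpy(msg, q->slot[q->head], SHQ_MSG_BYTES);
        q->head = (q->head + 1) % SHQ_SLOTS;
        q->count--;
    }
    shq_unlock(q);
    return ok;
}
```

Both operations take the lock because, as the post notes, reading and writing each modify the queue state. The spin loop consumes bus bandwidth while waiting, which is the usual cost of a pure test-and-set scheme.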
ksh@vine.VINE.COM (Kent S. Harris) (09/11/89)
I have had as many as 11 processors communicating across a VME bus under pSOS. At the lowest level the communication model was one of shared memory. The application interface was a complete device driver in the usual pSOS sense, which included an exclusion exchange (special message) and a synchronization exchange for supervisor and Interrupt Service Routine (ISR) synchronization.

To keep life simple, I did not implement packet fragmenting, so applications were limited in the size of the message they could send, but this would be simple to do. I did implement a stream-style interface as an application library so an application could do byte stream i/o. All in all, no big ditty.
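As a rough illustration of the packet fragmenting mentioned above as left undone, the sketch below splits a message into fixed-size pieces. It is not taken from the pSOS driver in the post; `FRAG_PAYLOAD`, `frag_count` and `frag_copy` are hypothetical names, and the 64-byte limit is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAG_PAYLOAD 64  /* hypothetical per-packet payload limit, in bytes */

/* Number of packets needed to carry msg_len bytes (rounds up). */
size_t frag_count(size_t msg_len) {
    return (msg_len + FRAG_PAYLOAD - 1) / FRAG_PAYLOAD;
}

/* Copy fragment number `index` of the message into `out`.
 * Returns the number of bytes copied, or 0 if index is past the end. */
size_t frag_copy(const unsigned char *msg, size_t msg_len,
                 size_t index, unsigned char *out) {
    assert(msg != NULL && out != NULL);
    size_t offset = index * FRAG_PAYLOAD;
    if (offset >= msg_len) {
        return 0;
    }
    size_t n = msg_len - offset;
    if (n > FRAG_PAYLOAD) {
        n = FRAG_PAYLOAD;
    }
    memcpy(out, msg + offset, n);
    return n;
}
```

A sender would loop over `frag_count(len)` fragments and pass each one to the existing fixed-size message path; the receiver would need a sequence number or length header to reassemble them, which is omitted here.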
hmp@cive.ri.cmu.edu (Henning Pangels) (09/14/89)
In article <4252@ncar.ucar.edu>, vanandel@stout.ucar.edu (Joe Van Andel) writes:
> I am working on a real-time radar signal processing system that will
> use multiple 68020 processors running VxWorks on a VME backplane.
> Because of the large amount of data we are processing
> (400Kbytes/second), I can't afford to use TCP/IP or RPC interprocess
> communication methods. As I read the manual, I conclude that VxWorks
> doesn't provide facilities for managing shared memory or semaphores
> between tasks executing on different processors. I very much like
> VxWorks, but I feel it needs more multi-processor support.
>

Rather than mucking around with the sysLib routines, I modified the Makefile and usrConfig.c used to build VxWorks. In the Makefile, I define USER_CFLAGS = -DRESERVE_MEM=0x100000, which is appended to the regular CFLAGS. Then, in usrConfig.c, I change the kernel initialization to read

    kernelInit (TRAP_KERNEL, usrRoot, ROOT_STACK_SIZE,
                FREE_RAM_ADRS, sysMemTop () - RESERVE_MEM,
                ISR_STACK_SIZE, INT_LOCK_LEVEL);

Of course, you only want to do this for the processor board on which the shared memory actually resides. On all other processors in the system, your application code will have to know where you've mapped your memory spaces in order to correctly share the memory which you've reserved above.

To coordinate access to the shared memory region, I use the vxTas() routine (which is all that is called by sysBusTas()). We have implemented a rudimentary "backplane pipe" using this mechanism, which uses mailbox or backplane interrupts (depending on the processor board used). Even the very first un-optimized experimental version is almost 10 times faster than going through the overhead of TCP/IP sockets. As usual, it's possible to trade off some portability and generality in favor of performance.

To anyone from WRS who might be listening: I agree with comments made by others that some mechanism like this should be made part of the vxWorks package.

As an aside: Be careful about mapping several processors' memory spaces contiguously -- some versions of the sysMemTop() routines work by probing for live memory, so if there's no memory gap between boards, one processor might actually claim another's memory for itself.

--
Henning Pangels            Field Robotics Center
ARPAnet/Internet: hmp@cive.ri.cmu.edu
Robotics Institute         (412) 268-6557
Carnegie-Mellon University
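The address bookkeeping implied by the post above can be sketched as plain arithmetic. This is a minimal illustration, not VxWorks code: the function names are hypothetical, and `bus_base` (the bus address at which the owning board's local address 0 appears to other boards) is an assumption that depends entirely on how the boards are jumpered or configured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local address where the reserved region starts on the owning board.
 * The kernel's memory pool ends here instead of at the true top of RAM. */
uint32_t shm_local_base(uint32_t mem_top, uint32_t reserve_bytes) {
    assert(reserve_bytes <= mem_top);
    return mem_top - reserve_bytes;
}

/* Address another board would use to reach the same byte.
 * bus_base is hypothetical: the bus address at which the owning
 * board's local address 0 is visible. */
uint32_t shm_bus_addr(uint32_t bus_base, uint32_t local_addr) {
    return bus_base + local_addr;
}

/* True if board B's window starts exactly where board A's ends, i.e.
 * there is no gap that would stop a memory-probing sysMemTop(). */
bool windows_contiguous(uint32_t a_base, uint32_t a_size, uint32_t b_base) {
    return a_base + a_size == b_base;
}
```

For example, with 4 MB of onboard RAM (an assumed figure) and RESERVE_MEM of 0x100000, the reserved region would start at local address 0x300000. The same function reproduces the figure quoted earlier in the thread: 0x100000 - 0x010000 = 0x0f0000.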
projoe@crim.eecs.umich.edu (Joseph A. Dionise) (09/15/89)
In article <6143@pt.cs.cmu.edu> hmp@cive.ri.cmu.edu (Henning Pangels) writes:
>
> Rather than mucking around with the sysLib routines, I modified the
>Makefile and usrConfig.c used to build VxWorks. In the Makefile, I define
>USER_CFLAGS = -DRESERVE_MEM=0x100000, which is appended to the regular
>CFLAGS. Then, in usrConfig.c, I change the kernel initialization ...
>

I agree. This method is better than the approach that I outlined.

>
>As an aside: Be careful about mapping several processor's memory spaces
>contiguously -- some versions of the sysMemTop() routines work by probing
>for live memory, so if there's no memory gap between boards, one processor
>might actually claim another's memory for itself.
>

We encountered this problem. In fact, this is why I initially modified the sysMemTop() routine. I hard-coded it to return the amount of onboard RAM, instead of probing for the first "open" byte.

The moral to this story: become familiar with the sysLib library. It is the gateway to your processor.

Joseph A. Dionise
Robot Systems Division   Internet : projoe@crim.eecs.umich.edu
University of Michigan   uucp     : {..}!umich.uucp!crim.eecs.umich.edu!projoe
1101 Beal Avenue         BIX      : jdionise
Ann Arbor, MI 48109      (313) 936-2830
topper@mcgill-vision.UUCP (Anthony Topper) (09/16/89)
>Has anyone else written code (that you would be willing to share) to provide
>these facilities? Do other real-time operating system vendors offer
>better multi-processor support?

It seems that a number of vendors are weak in this area. VxWorks is no exception. We had an application that required very fast interprocessor communication and vxWorks didn't have it, so I wrote one. However, our application required such raw speed that the package I created was not used, but I believe it would fit many people's needs.

Some features (long):

o interprocessor, interprocess communication on the same backplane.

o Very fast. It fits in between vxWorks pipes and sockets. Half the speed of pipes but 50-100 times faster than sockets.

o uses vxWorks file level to be as seamless as possible. The same user code can be used for sockets between vxWorks "boxes", my shared mem for same backplane, and vxWorks pipes within a CPU. So you choose what is most appropriate.

o Has many modes of operation: queue, ring buffer, mailbox, plain buffer. Each of these can be blocking or non-blocking, and the blocking mode can use interrupt demons to wake processes or use "test-and-set" loops. (The former uses mailbox or backplane interrupts to wake up processes on other cpus and requires interrupt processing; the latter takes up bus bandwidth but no interrupt overhead.) All of these modes are available for each block of memory requested by the user, so the appropriate mode can be used where needed.

o Supports memory in contiguous or non-contiguous blocks anywhere on VME or VSB bus.

o Does auto-synchronization on boot-up and simultaneous memory allocation.

o Has semaphore mode for counting semaphores (minimizes allocation of memory).

o Uses about 2.2k memory per memory partition requested for overhead. Base overhead is about 32k. All in shared memory. Requires about 30K of code memory on CPU board.

o can configure for cold-boot wipe contents of memory or not.

o performance:

                        1 byte   64 bytes   255 bytes
                        ------   --------   ---------
    VxWorks pipe:          360        390         490
    shMem tas:             700        780        1150
    shMem demon:          1700       1700        2800
    sockets (TCP/IP):   200000     200000      200000   (datagrams are faster)

  All times in microseconds. This test used vxWorks timer functions between two Heurikon-V2F's (Mc68020 @ 20Mhz, 0 wait) and Micro-Memory's MM6300 shared memory board (200 ns access, I think).

o for example user does something like:

  CPU producer:

    fd = open("/dev/sharedMem/VME/myBlock",
              "SIZE=1000, ELSIZE=10, QUEUE, BLOCK");
    for (i = 1; i < however_many; i++)
    {
        do some processing
        get some data
        ...
        write(fd, buffer, 10);
    }

  CPU consumer:

    fd = open("/dev/sharedMem/VME/myBlock",
              "SIZE=1000, ELSIZE=10, QUEUE, BLOCK");
    for (i = 1; i < however_many; i++)
    {
        read(fd, buffer, 10);
        do some processing with the data
        ...
    }

  The open requests are automatically synchronized, the rest follows naturally.

o Current implementation only runs on the vw3.2 and Heurikon V2F cpus. A port to vw4.x and generic vxWorks CPUs is quite straightforward to do, though supported CPUs would require mailbox interrupts. I started to do it, but I just don't have the time. The code is about a year old and I haven't looked at it in quite some time. I did have a neat demo of five cpus doing a classic consumer/producer problem.

o I also did a port of curses and unix level 3 file I/O for vw3.2 which is now obsolete.

o If people want it they can send me a tape and I'll copy it for you. I'll be away all October, so be patient.

Are you listening Wind River? I'll do a no cash deal for the rights if you are interested.

Tony Topper
McGill University, EE Dept.
Montreal, Canada
smart mailers: topper@mcgill-vision.uucp
usa:           {ihnp4,decvax,akgua,utzoo,etc}!utscri
                   !musocs!mcgill-vision!topper
               or think!mosart!mcgill-vision!topper
ARPAnet:       topper@larry.mcrcim.mcgill.edu
bitnet:        mcgill-vision!topper@musocs.bitnet
Bell Canada:   (514) 398-3788
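To relate the timing figures in the post above to the 400Kbytes/second requirement that opened the thread, a per-message time can be converted into a sustained byte rate. This is only a back-of-the-envelope sketch: it assumes each figure is the full cost of one transfer and that transfers run back to back with no other work, neither of which the post states. The function name `bytes_per_second` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sustained rate in bytes/second if every message of msg_bytes takes
 * usec_per_msg microseconds and messages are sent back to back.
 * The result is truncated toward zero. */
uint64_t bytes_per_second(uint64_t msg_bytes, uint64_t usec_per_msg) {
    assert(usec_per_msg > 0);
    return msg_bytes * 1000000u / usec_per_msg;
}
```

Under those assumptions, 255-byte messages at 1150 microseconds each (the "shMem tas" row) work out to roughly 222 Kbytes/second, and at 490 microseconds (the pipe row) to roughly 520 Kbytes/second. Larger elements would amortize the fixed per-call cost, so these numbers are a floor for small messages rather than a limit of the approach.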