raymond@uunet.UU.NET (Raymond Leung) (11/08/88)
I have done the following experiment on a Encore Multimax machine with 8 processors running UMAX 4.2 (BSD4.2 like with some System V features) operating system and I could not explain the system time observed. The code I wrote is a parallel sorting program base on Batcher's bitonic sorters. The parent forks off a number of processes and distribute the input numbers to these processes. After a initial sort, each process partitions the set of numbers it got and sends them to some other processs (according to the algorithm). Each process receives two sets of numbers from two other processes, which are merged, partitioned and sent to other processes. This merge-split step is repeated a number of times (depending on the number of processes) and the sorting is completed. (Never mind the exact algorithm) I used the message system (msgsnd, msgrcv - system V features) to pass the numbers between processes. In order to see the various behaviour of the program and system, I put a loop around the comparison routine so that the cost of a comparison could be varied relative to other costs (e.g. communication cost). The comparison routine looks like: cmp(x,y) int x,y; { int i, result; for (i=0; i < repeat; i++) { if (x > y) result = 1; else if (y > x) result = -1; else result = 0; } return result; } I ran experiments with 3200 numbers and 16 processes. The only varying parameter was the cost of the comparison routines (the variable "repeat"). The following are the real, user and system time for different "repeat factor"s. The user and system are the accumulated times for all the processes (parent and children). Repeat Real User System ============================================ 1 4.50 2.43 11.77 <-- 10 4.68 5.33 10.49 <-- 100 7.30 34.42 8.14 1000 46.20 325.17 8.69 10000 435.60 3234.14 23.77 20000 845.93 6466.99 41.42 They all ran on exactly the same set of data. This was repeated several times. The separate runs agree within 2-3%. The system was almost idle when the experiments were carried out. If the system measures the time spent in carrying out systems calls (messages transfers etc.) on behalf of the processes, I expect a constant system time. I also learnt that when a process got interrupted, the time spent in handling the interrupt was also charged to the process being interrupted. The system time was expected to rise with the real time and this was so. The point I cannot explain is why the system time droped when the repeat factor went from 1 to 100? I ran the same experiment with 2, 4, 8, and 32 processes, the initial drop in system time became more and more prominent as there were more processes. I have also used the getrusage system call to get some statistics about the resources usage for the cases repeat=10 and repeat=100. The various statistics were similar except for the number of voluntary context switches which were slightly higher for the case repeat=100 (yes, the one that took less system time). However, when I ran this a few times, a fairly large variance was produced. For your information, the results with 8 and 32 processes are as follows: With 8 processes: Repeat Real User System ============================================ 1 3.27 2.09 4.75 10 3.54 4.61 4.29 100 7.13 30.54 4.19 1000 44.29 290.16 5.65 10000 424.70 2896.34 22.44 20000 816.20 5765.93 41.18 With 32 processes: Repeat Real User System ============================================ 1 9.86 3.23 45.13 <-- (also unexplained) 10 10.48 6.47 47.59 100 12.68 40.01 37.33 1000 55.08 373.11 30.39 10000 498.31 3703.73 33.76 20000 965.62 7400.84 51.44 Can someone shed some light? Some information about the architecture of the Multimax and the kernal of UMAX? Does it handle all system calls on one particular processor? Someone suggested maybe spin locks are present in the kernel for handling the message passing. Anyone know how and why they are used? Thanks in advance. Raymond Leung VLSI and System Technology Laboratory, School of Electrical Engineering and Computer Science, The University of New South Wales, PO Box 1, Kensington, Sydney, NSW 2033, AUSTRALIA. ACSnet: raymond@cad.jmrc.eecs.unsw.oz ARPA: raymond%cad.jmrc.eecs.unsw.oz@uunet.css.gov UUCP: uunet!munnari!cad.jmrc.eecs.unsw.oz!raymond