[comp.os.os2.programmer] Horses!

TURGUT@TREARN.BITNET (Turgut Kalfaoglu) (09/28/90)

I was trying to get Peter Norton's HORSES.C program to work. This program
creates 5 threads, each of which is responsible for drawing a simple horse
picture and running its horse across the screen, checking when column 75
is reached, and then declaring the winner. It's a very simple program.
The main program simply passes these 'horses' a number from 1 to 5
so that the horses can determine at which line to draw their pictures,
and at the end, which one is the winner.

I had to make some changes to the code since Norton uses DosCreateThread,
with manual creation of a stack with malloc(). So all of that went away,
replaced by a _beginthread call. Here is the curiosity: if I put a
DosSleep(0L) in the main()'s loop where the beginthread is called five
times to create the five horses, everything is fine. If I remove that,
five horses are created, but they each receive the same index number (5),
and they move together too! (so only one horse appears on display). It
roughly looks like this:

for (j=0;j<5;j++)  {
  if (_beginthread(horse,NULL,4096,&j) == 0)  {
     printf("Error");
     exit(1);
  }
  DosSleep(0L);
}
I remove the DosSleep and it no longer works!

Any ideas?  Is it a bug in OS/2 1.1 that got corrected with 1.2 ?
Also I noticed that if the main() ends, even with DosExit(0,0),
all threads are killed, which is contrary to Peter Norton's book..

Regards, -turgut

ballard@cheddar.ucs.ubc.ca (Alan Ballard) (09/28/90)

In article <90270.152726TURGUT@TREARN.BITNET> TURGUT@TREARN.BITNET (Turgut Kalfaoglu) writes:

>... Here is the curiosity: if I put a
>DosSleep(0L) in the main()'s loop where the beginthread is called five
>times to create the five horses, everything is fine. If I remove that,
>five horses are created, but they each receive the same index number (5),
>and they move together too! (so only one horse appears on display). It
>roughly looks like this:
>
>for (j=0;j<5;j++)  {
>  if (_beginthread(horse,NULL,4096,&j) == 0)  {
>     printf("Error");
>     exit(1);
>  }
>  DosSleep(0L);
>}
>I remove the DosSleep and it no longer works!
>
>Any ideas?  Is it a bug in OS/2 1.1 that got corrected with 1.2 ?

Nope, it's a bug in your version of the program.  You're passing the address
of the variable j.  There are no guarantees about when the new threads
will actually get to execute.  Without the DosSleep, the main thread
keeps running for a while, and by the time the other threads
get to pick up the parameter it has already changed back in the main
thread.  DosSleep with a parameter of zero gives the threads a chance
to run for one time slice, so they pick up the value at that time and each
gets the appropriate value.

To do this properly, you need to either pass the parameter by value, as it
was in the original version, or use semaphores to properly synchronize
the startup so that each thread gets its parameter before the main
thread goes on to start the next.

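Norton/LaFore's version of this program contains a C coding bug in the
stack initialization, by the way (which may be the reason you switched
to _beginthread?)
The line
   stkptr = (int far *)malloc(STACK_SIZE) + STACK_SIZE;
should be replaced with some variant of
   stkptr = (int far *)((char far *)malloc(STACK_SIZE) + STACK_SIZE);
The cast binds more tightly than the addition, so the original adds
STACK_SIZE ints rather than STACK_SIZE bytes and leaves stkptr well past
the end of the allocated block.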
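
The difference between the two forms can be checked without OS/2 or far
pointers. What follows is only a sketch in portable C: the STACK_SIZE value
and the function names are made up for illustration, and the buffer is
deliberately oversized so that neither pointer leaves the allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STACK_SIZE 1024   /* bytes; hypothetical value for illustration */

/* Book's form: the cast applies first, so "+ STACK_SIZE" counts ints. */
size_t offset_book_form(char *base)
{
    int *p = (int *)base + STACK_SIZE;
    return (size_t)((char *)p - base);
}

/* Corrected form: add bytes first, then cast the result. */
size_t offset_fixed_form(char *base)
{
    int *p = (int *)(base + STACK_SIZE);
    return (size_t)((char *)p - base);
}

/* Computes both byte offsets inside one oversized buffer.
   Returns 0 on success, -1 if the allocation fails. */
int compare_offsets(size_t *book, size_t *fixed)
{
    char *buf = malloc(STACK_SIZE * sizeof(int));

    if (buf == NULL)
        return -1;
    *book = offset_book_form(buf);
    *fixed = offset_fixed_form(buf);
    free(buf);
    /* The book's form overshoots by a factor of sizeof(int). */
    assert(*book == *fixed * sizeof(int));
    return 0;
}
```

Calling compare_offsets() yields a "fixed" offset of STACK_SIZE bytes and a
"book" offset of STACK_SIZE * sizeof(int) bytes, which with a buffer of only
STACK_SIZE bytes would point beyond the block.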

I've found Norton/LaFore good for getting the basic idea of the OS/2
kernel services, but it is rather casual about IPC issues in many
of the multi-threaded processes.  The required synchronization is only
done in places where it is the subject of the example; lots of other
critical sections are really needed to make the programs
behave as described.

>Also I noticed that if the main() ends, even with DosExit(0,0),
>all threads are killed, which is contrary to Peter Norton's book..
 
I seem to recall this was a change between 1.0 and 1.1 (or maybe
between 1.1 and 1.2).  When the main thread exits, all threads exit.



Alan Ballard                   | Internet: ballard@ucs.ubc.ca
University Computing Services  |   Bitnet: USERAB1@UBCMTSG
University of British Columbia |    Phone: 604-228-3074
Vancouver B.C. Canada V6R 1W5  |      Fax: 604-228-5116

db3l@ibm.com (David Bolen) (09/29/90)

In article <90270.152726TURGUT@TREARN.BITNET> TURGUT@TREARN.BITNET (Turgut Kalfaoglu) writes:

>           [...]                Here is the curiosity: if I put a
>DosSleep(0L) in the main()'s loop where the beginthread is called five
>times to create the five horses, everything is fine. If I remove that,
>five horses are created, but they each receive the same index number (5),
>and they move together too! (so only one horse appears on display). It
>roughly looks like this:
>
>for (j=0;j<5;j++)  {
>  if (_beginthread(horse,NULL,4096,&j) == 0)  {
>     printf("Error");
>     exit(1);
>  }
>  DosSleep(0L);
>}
>I remove the DosSleep and it no longer works!

The last parameter that you are supplying to _beginthread is a pointer,
which is simply copied as a parameter to the starting thread.  The pointer
points to the location of the local variable "j" on the main stack.  I
haven't seen the full program, but I presume as each thread starts, it
dereferences the pointer to determine which horse it is.

However - imagine that the main thread of the program keeps running for
a little bit after calling _beginthread before the new thread actually
starts up, which is certainly possible if the main thread runs fast
enough not to have run out of its timeslice after calling _beginthread.
Then, the for loop will increment the value of j on the stack, and when
the new thread actually begins executing, since it just reads the current
value of j on the main stack, it will see the incremented value and not the
value that j held when _beginthread was originally called.  In your case,
it looks like _beginthread is called all 5 times before any of the threads
begin executing, so they all see the value of j as being the final value
of the loop.
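
The scenario described above can be reproduced on purpose. This is only a
sketch, assuming POSIX threads as a stand-in for _beginthread (the original
HORSES.C is not shown in this thread); late_horse, run_late_horses and the
gate variables are made-up names. Each thread is held at a gate until the
loop has finished, which forces the "main keeps running" ordering every time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define HORSES 5

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open;

struct late_arg {
    int *shared;   /* pointer to main's loop counter, as in the post */
    int  seen;     /* what this thread read through that pointer */
};

static void *late_horse(void *p)
{
    struct late_arg *a = p;

    assert(a->shared != NULL);
    /* Stand-in for "the new thread has not been scheduled yet". */
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);

    a->seen = *a->shared;   /* dereference only now, after the loop is done */
    return NULL;
}

/* Fills seen[] with what each thread read. Returns 0 on success. */
int run_late_horses(int seen[HORSES])
{
    pthread_t tid[HORSES];
    struct late_arg args[HORSES];
    int j, k, created = 0, rc = 0;

    gate_open = 0;
    for (j = 0; j < HORSES; j++) {
        args[j].shared = &j;   /* same mistake as in the original loop */
        args[j].seen = -1;
        if (pthread_create(&tid[j], NULL, late_horse, &args[j]) != 0) {
            rc = -1;
            break;
        }
        created++;
    }

    /* Only now let the threads run. */
    pthread_mutex_lock(&gate_lock);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);

    for (k = 0; k < created; k++) {
        pthread_join(tid[k], NULL);
        seen[k] = args[k].seen;
    }
    return rc;
}
```

When all five threads are created, every entry of seen[] comes back as 5,
the value j holds once the loop has ended, matching the symptom in the
original post.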

In fact, passing a pointer to the loop counter as a parameter is pretty
dangerous - it's possible that the new thread will access the value of
the counter in the middle of the main thread changing its value, and thus
the new thread will get a value that is entirely wrong.  True, when the
increment is done with "++", it's probably a single "inc" instruction and
won't be interrupted, but in general this is not a nice way to do things.

This "seems" to be fixed by adding the DosSleep(0) because you force the
main thread to give up the remainder of its timeslice after creating each
new thread, which gives each new thread a chance to start executing and
grab the value of j before it changes.  I say "seems" because other
factors might keep the new thread from executing before main gets control
again, although if they all have the same priority, the round robin
scheduling of OS/2 will probably let each thread run before main gets 
another timeslice.

How to fix this?  I'd suggest passing the value of j as the parameter,
rather than a pointer to it.  Cast j into a (void *), or whatever the last
parameter of _beginthread is, and cast it back to an integer at the start
of the thread.  This does assume that sizeof(int) <= sizeof(void *), but
for an example program that's probably not too bad.
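
A minimal sketch of the pass-by-value approach, assuming POSIX threads in
place of _beginthread and using intptr_t for the cast; horse, starts and
run_horses_by_value are hypothetical names, not taken from HORSES.C.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define HORSES 5

int starts[HORSES];   /* starts[i] counts threads that received number i */

static void *horse(void *arg)
{
    /* The argument is the number itself, not a pointer to main's j. */
    int id = (int)(intptr_t)arg;

    assert(id >= 0 && id < HORSES);
    starts[id]++;     /* each thread touches only its own slot */
    return NULL;
}

/* Starts HORSES threads, passing j by value. Returns 0 on success. */
int run_horses_by_value(void)
{
    pthread_t tid[HORSES];
    int j, created = 0, rc = 0;

    for (j = 0; j < HORSES; j++)
        starts[j] = 0;

    for (j = 0; j < HORSES; j++) {
        /* No DosSleep-style yield needed: j is copied into the argument. */
        if (pthread_create(&tid[j], NULL, horse, (void *)(intptr_t)j) != 0) {
            rc = -1;
            break;
        }
        created++;
    }
    for (j = 0; j < created; j++)
        pthread_join(tid[j], NULL);
    return rc;
}
```

After run_horses_by_value() returns 0, every starts[i] is 1, so each thread
got a distinct number regardless of when it was scheduled.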

A more "proper" alternative is to use a semaphore to let each thread tell
the main thread when it is done starting up, and has the proper value of
j.  Have main DosSemSet the semaphore before calling _beginthread, and
then have the thread function DosSemClear it when it gets its numeric
code.  Then rather than DosSleep(0) in main, use DosSemWait on the
semaphore.
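
The handshake could look roughly like this. It is a sketch under stated
assumptions: POSIX counting semaphores stand in for the OS/2 calls (a
semaphore initialized to 0 plays the role of DosSemSet, sem_post of
DosSemClear, sem_wait of DosSemWait), which is not an exact match for OS/2
semantics, and all function and variable names are made up.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define HORSES 5

static sem_t copied;   /* posted by a thread once it has its own copy of j */
int starts[HORSES];    /* starts[i] counts threads that saw number i */

static void *horse(void *arg)
{
    int id = *(int *)arg;   /* take a private copy first ... */

    sem_post(&copied);      /* ... then let main move on */
    assert(id >= 0 && id < HORSES);
    starts[id]++;
    return NULL;
}

/* Passes &j as in the original loop, but waits for each thread to
   copy the value before changing j. Returns 0 on success. */
int run_horses_handshake(void)
{
    pthread_t tid[HORSES];
    int j, created = 0, rc = 0;

    if (sem_init(&copied, 0, 0) != 0)
        return -1;
    for (j = 0; j < HORSES; j++)
        starts[j] = 0;

    for (j = 0; j < HORSES; j++) {
        if (pthread_create(&tid[j], NULL, horse, &j) != 0) {
            rc = -1;
            break;
        }
        created++;
        /* Replaces DosSleep(0): block until the new thread has read j.
           Retry if the wait is interrupted by a signal. */
        while (sem_wait(&copied) == -1 && errno == EINTR)
            continue;
    }

    for (j = 0; j < created; j++)
        pthread_join(tid[j], NULL);
    sem_destroy(&copied);
    return rc;
}
```

Unlike the DosSleep(0) version, this does not depend on how the scheduler
happens to order the threads: main cannot increment j until the thread it
just started has signalled.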

To finish off a "fair" horserace - each thread should, after using
DosSemClear to tell main it is up and running, do a DosSemWait on another
semaphore.  When main finishes creating all 5 threads, it should clear
the second semaphore, thus allowing all 5 threads to begin executing at
the same time (all 5 threads will be waiting on the same semaphore).
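
A sketch of such a starting gate, again assuming POSIX semaphores rather
than the OS/2 calls. One difference is an assumption worth flagging: a POSIX
sem_post releases a single waiter, so main posts the gate once per horse
here, whereas the text above has one DosSemClear release every waiter. The
numbers are passed by value to keep the example short; all names are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>

#define HORSES 5

static sem_t ready;        /* posted once by each horse when it is up */
static sem_t start_gate;   /* posted HORSES times by main to start the race */
static int ran[HORSES];    /* ran[i] is set once horse i is past the gate */

static void wait_on(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        continue;          /* retry if interrupted by a signal */
}

static int count_ran(void)
{
    int i, n = 0;

    for (i = 0; i < HORSES; i++)
        n += ran[i];
    return n;
}

static void *horse(void *arg)
{
    int id = (int)(intptr_t)arg;

    assert(id >= 0 && id < HORSES);
    sem_post(&ready);      /* tell main this horse is in the stalls */
    wait_on(&start_gate);  /* hold until main opens the gate */
    ran[id] = 1;           /* the race itself would go here */
    return NULL;
}

/* Reports how many horses had run before and after the gate opened.
   Returns 0 on success. */
int run_fair_race(int *before, int *after)
{
    pthread_t tid[HORSES];
    int j, created = 0, rc = 0;

    if (sem_init(&ready, 0, 0) != 0)
        return -1;
    if (sem_init(&start_gate, 0, 0) != 0) {
        sem_destroy(&ready);
        return -1;
    }
    for (j = 0; j < HORSES; j++)
        ran[j] = 0;

    for (j = 0; j < HORSES; j++) {
        if (pthread_create(&tid[j], NULL, horse, (void *)(intptr_t)j) != 0) {
            rc = -1;
            break;
        }
        created++;
    }
    for (j = 0; j < created; j++)
        wait_on(&ready);             /* every horse is now at the gate */
    *before = count_ran();
    for (j = 0; j < created; j++)
        sem_post(&start_gate);       /* open the gate for all of them */
    for (j = 0; j < created; j++)
        pthread_join(tid[j], NULL);
    *after = count_ran();

    sem_destroy(&ready);
    sem_destroy(&start_gate);
    return rc;
}
```

With all five threads created, "before" is 0 and "after" is 5: no horse
moves until main has finished starting every thread.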

>Any ideas?  Is it a bug in OS/2 1.1 that got corrected with 1.2 ?

Nope - I think it's just that your process is getting scheduled slightly
differently under 1.2 than under 1.1.  It is possible that the scheduler
has changed a little between versions, and your main procedure is just
running longer under 1.2 than it did under 1.1, which lets it modify j
before the new thread begins.  Perhaps you are also using different
values of TIMESLICE or MAXWAIT in your CONFIG.SYS which can affect the
maximum time a single thread can have control of the CPU.  (I don't know
if the defaults for those settings changed between 1.1 and 1.2)

>Also I noticed that if the main() ends, even with DosExit(0,0),
>all threads are killed, which is contrary to Peter Norton's book..

Well, I've never read Peter Norton's book, but if the primary thread of
an application exits, the application will (or definitely should) exit.
The main thread of an application is a special thread, and not really
equivalent to those that you create - especially in the way that signals
are always sent to the main thread, so one must exist for any running
process.  If this disagrees with the book, I think the book is wrong.

To quote from the IBM toolkit documentation, Control Program Programming
Reference, DosExit function:

   "Do not terminate thread 1 without terminating the process.  Thread 1
    is the initial thread of execution, not a thread started by a
    DosCreateThread request.  When thread 1 ends, any monitors or signal
    processing routines set for this process also end.  To avoid
    unpredictable results, DosExit should be specified with ActionCode=1
    to ensure the process ends."

I suppose the application ending falls under "unpredictable results" :-)

-- David
--
/-----------------------------------------------------------------------\      
 \                             David Bolen                             /
  |    Laboratory Automation, IBM Thomas J. Watson Research Center    |
 /              P.O. Box 218, Yorktown Heights, NY  10598              \
| - - - - - - - - - - - -  M i t h r a n d i r  - - - - - - - - - - - - |
| Internet : db3l@ibm.com                    | Bitnet : db3l@watson     |
| Usenet   : uunet!bywater!arnor!larios!db3l | Phone  : (914) 945-1940  |
\-----------------------------------------------------------------------/

TURGUT@TREARN.BITNET (Turgut Kalfaoglu) (10/01/90)

(Slap on the forehead)   Ah! Thank you, yes of course your answers
make it very clear!  I am still not too good with multiple threads
of execution (born and bred MS-DOS programmer).  The reason why my 5 threads
were getting the same index must have been that they do not get dispatched
until that loop ends.  About DosExit(0,0) ending everything: I guess OS/2
1.0 did not make it a requirement that the main thread must be present
for the child threads to live.

Regards, -turgut