[comp.unix.misc] Process supervision in large SW systems.

thomas@uppsala.telesoft.se (Thomas Tornblom) (10/31/90)

What method do people use to control/create/kill/supervise different
processes in large software systems?

People are asking me how to implement a supervision system. It should be
responsible for checking that processes are alive and are feeling well.

It should also be able to restart, in some intelligent way, a process
that has died or is misbehaving.

Problem areas are interprocess communication, how to detect status changes
(a strict hierarchy of processes? catching SIGCHLD?).
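
To make the question concrete, here is a rough sketch of the simplest
shape such a supervisor might take: a single parent that starts every
process itself (the strict-hierarchy option) and polls for exits instead
of installing a SIGCHLD handler. It is written in Python for brevity.
The Supervisor class, the max_restarts limit and the "flaky" child are
all made up for illustration. It only notices a process that has died,
not one that is alive but doing the wrong thing; that would need some
kind of heartbeat over IPC on top.

```python
import subprocess
import sys
import time


class Supervisor:
    """Hypothetical minimal supervisor: start commands, restart those that die."""

    def __init__(self, max_restarts=3):
        self.max_restarts = max_restarts
        self.children = {}  # name -> [argv, Popen object, restarts so far]

    def start(self, name, argv):
        self.children[name] = [argv, subprocess.Popen(argv), 0]

    def poll_once(self):
        """Check each child once and return a list of (name, event) tuples."""
        events = []
        for name, entry in list(self.children.items()):
            argv, proc, restarts = entry
            status = proc.poll()  # None while the child is still running
            if status is None:
                continue
            if status == 0:
                # Clean exit: treat as finished, do not restart.
                del self.children[name]
                events.append((name, "finished"))
            elif restarts < self.max_restarts:
                # Non-zero exit or death by signal (negative status on POSIX).
                entry[1] = subprocess.Popen(argv)
                entry[2] = restarts + 1
                events.append((name, "restarted"))
            else:
                # Restart budget used up: stop supervising this child.
                del self.children[name]
                events.append((name, "gave up"))
        return events


if __name__ == "__main__":
    sup = Supervisor(max_restarts=2)
    # A stand-in "service" that always fails, to exercise the restart path.
    sup.start("flaky", [sys.executable, "-c", "import sys; sys.exit(1)"])
    while sup.children:
        for name, event in sup.poll_once():
            print(name, event)
        time.sleep(0.1)
```

The restart limit is there so that a process that dies immediately on
every start does not get restarted forever. Polling keeps the sketch
short; a SIGCHLD-driven version would react sooner but has to be careful
about what it does inside the signal handler.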

People must have done this before in systems that require high reliability.

The system is going to be used on fault-tolerant hardware in the future, so
we need fault-tolerant software.

E-mail preferred.

Thanks 
Thomas
-- 
Real life:      Thomas Tornblom             Email:  thomas@uppsala.telesoft.se
Snail mail:     Telesoft Uppsala AB         Phone:  +46 18 189406
                Box 1218                    Fax:    +46 18 132039
                S - 751 42 Uppsala, Sweden

thomas@uppsala.telesoft.se (Thomas Tornblom) (11/01/90)

In article <THOMAS.90Oct31145908@uplog.uppsala.telesoft.se> thomas@uppsala.telesoft.se (Thomas Tornblom) writes:

   What method do people use to control/create/kill/supervise different
   processes in large software systems?

[text deleted]

Is there any work going on in the Unix world addressing things like this?
I heard a rumor that OSF has a Request For Technology touching these areas.
Is there any substance to it, and are there any candidates?

Again, E-mail preferred.

Znks,
Thomas
-- 
Real life:      Thomas Tornblom             Email:  thomas@uppsala.telesoft.se
Snail mail:     Telesoft Uppsala AB         Phone:  +46 18 189406
                Box 1218                    Fax:    +46 18 132039
                S - 751 42 Uppsala, Sweden