[comp.sys.sgi] news_server dies on VGX dropping core

randy@tessa.iaf.uiowa.edu (randy frank) (02/05/91)

	Has anyone seen this before...
System: 4D/210VGX, 32 MB RAM, IRIX 3.3.1
Running a program which seems to run fine on GTX and PI 4Ds.  Program
uses four shared memory segments (Two are 2K, one is 64K, and the fourth
is ~16M).  Program is actually three programs.  Each is spawned via system().
Each has its own graphics window(s). A total of 5 windows are used when the
programs run.  There is a launcher program which sits in the background
until one of the programs stops.  Then it waits for all others to detach
from shared memory and it takes control.  Problem: news_server drops core
when the programs try to quit.  Here's the SYSLOG entry (we don't have
enough space to hold the core; od identified what was there as news_server):

Feb  4 14:52:17 IRIS grcond[530]: CIO: NeWS: bus error signal received
Feb  4 14:52:20 IRIS unix: dks0d1s0 (/): Out of space
Feb  4 14:52:20 IRIS last message repeated 5 times
Feb  4 14:52:20 IRIS grcond[530]: Child process /bin/news_server terminated by signal 6
Feb  4 14:52:23 IRIS grcond[530]: Restoring PROM textport microcode
Feb  4 14:52:25 IRIS unix: dks0d1s0 (/): Out of space
Feb  4 14:52:26 IRIS unix: dks0d1s0 (/): Out of space
Feb  4 14:52:26 IRIS grcond[603]: In limbo
Feb  4 14:52:32 IRIS grcond[603]: Alive
Feb  4 14:52:32 IRIS grcond[603]: CIO: ace
Feb  4 14:52:32 IRIS grcond[603]: CIO: dks0d1s0: Out of space
Feb  4 14:52:32 IRIS last message repeated 24 times
Feb  4 14:52:32 IRIS grcond[603]: CIO: gm3.u - Mon Aug 6 13:00:57 PDT 1990
Feb  4 14:52:37 IRIS unix: dks0d1s0 (/): Out of space
Feb  4 14:52:37 IRIS grcond[603]: CIO: dks0d1s0: Out of space
Feb  4 14:52:48 IRIS unix: dks0d1s0 (/): Out of space

Note: this problem does not seem to occur when the dataset size is smaller.
Visually, when it occurs, the windows seem to 'stick' and not close.  Sometimes
the launcher window even reappears before the others close (or the background
is refreshed...).
Anyway, what my code does is basically this:
	launcher closes its window
	launcher uses system("xxx&") to run other programs.
	launcher uses system("xxx") to run one last program

	(programs run... and work fine...)

	last program puts a QUIT token in one of the shared mem segs
	That program exit()s       
	Other programs see the token and exit() also.
	launcher takes note of FIRST exit and wakes up.
	launcher waits until all processes have detached from shmem 
	launcher reopens its window
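
The "wait until all processes have detached" step could be sketched
roughly as below. This is only an illustration, not the launcher's
actual code: the helper names attached_count() and wait_for_detach(),
the one-second polling interval, and the max_polls limit are all
assumptions. It relies on the standard System V call shmctl() with
IPC_STAT, which reports the current attach count in shm_nattch.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>

/* Hypothetical helper: number of processes currently attached to the
 * segment, or -1 if shmctl() fails (e.g. the id is not valid). */
long attached_count(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return (long)ds.shm_nattch;
}

/* Hypothetical helper: poll once per second until at most `expected`
 * attachments remain (use 1 if the launcher itself stays attached).
 * Returns 0 on success, -1 on error or after max_polls attempts. */
int wait_for_detach(int shmid, long expected, int max_polls)
{
    int i;

    for (i = 0; i < max_polls; i++) {
        long n = attached_count(shmid);

        if (n < 0)
            return -1;          /* segment gone or not readable */
        if (n <= expected)
            return 0;           /* everyone else has detached */
        sleep(1);
    }
    return -1;                  /* gave up waiting */
}
```

The launcher would call something like wait_for_detach(shmid, 1, 30)
for each of its segments before reopening its window.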

	Anyone have any ideas???
	Thanks in advance...
--
rjf.
Randy Frank, Engineer                       |  (319) 335-6712       
University of Iowa, Image Analysis Facility |  73 EMRB              
randy@tessa.iaf.uiowa.edu                   |  Iowa City, IA 52242  

operator@IRIS.KTH.DK (Martin Liversage) (02/06/91)

In article <4291@ns-mx.uiowa.edu> randy frank
<uunet.uu.net!ns-mx!tessa.iaf.uiowa.edu> writes:

> Running a program which seems to run fine on  GTX and PI 4Ds.  Program
> uses four shared memory segments (Two are 2K, one is 64K, and the fourth
> is ~16M).  Program is actually three programs.  Each is spawned via system().
> Each has its own graphics window(s). A total of 5 windows are used when the
> programs run.  There is a launcher program which sits in the background
> until one of the programs stop.  Then it waits for all others to detach
> from shared memory and it takes control.  Problem: news_server drops core
> when the programs try to quit.

I just want to point out that the NeWS server uses a shared memory
segment (at least on my machine: 4D/20, IRIX 3.2). It has a shared
memory id of 0 and has permission 0555 (so anyone can write to it).
When programming in C, it is very easy to forget to initialize a variable,
and if it has storage class static or extern it will be initialized to 0.

Maybe your program by mistake attaches to segment 0 and writes to
it. This will cause the NeWS server to bomb, and you will probably
find that the behaviour in general becomes erratic. There is probably
also a difference between hardware platforms in how the shared
memory segment is used, causing your program to behave differently
on different machines.
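
One way to guard against this kind of mistake might look like the
sketch below. It is not taken from anyone's program: the names
SHMID_UNSET, data_shmid, create_segment() and attach_segment() are made
up for illustration. The idea is simply to give the id variable an
explicit "unset" value of -1 instead of the implicit 0, and to refuse
to call shmat() until shmget() has actually supplied an id.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHMID_UNSET (-1)    /* hypothetical sentinel; shmget() also returns -1 on failure */

/* Explicitly initialized, so it is never the implicit 0 that a static
 * or extern variable would otherwise get. */
static int data_shmid = SHMID_UNSET;

/* Create a private segment and remember its id; returns the id or -1. */
int create_segment(size_t size)
{
    data_shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    return data_shmid;
}

/* Attach only if an id has really been obtained; returns NULL otherwise. */
void *attach_segment(void)
{
    void *p;

    if (data_shmid == SHMID_UNSET)
        return NULL;                    /* never touch segment 0 by accident */
    p = shmat(data_shmid, NULL, 0);
    return (p == (void *)-1) ? NULL : p;
}
```

With this arrangement, a code path that forgets to call create_segment()
gets a NULL pointer back instead of silently attaching to whatever
segment happens to have id 0.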

I once made this mistake and my machine went on crashing in strange
ways all day long until I solved the problem. I think that a shared
memory segment writable by anyone is pretty dangerous to programmers
fiddling with shared memory (and giving it an id of 0 is really
inviting trouble).

Hope this helps.

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
\                                                                             \
\ Martin Liversage                      8616 m  /\                            \
\ Royal Dental College Copenhagen              /  \_   K2 - Mountain of Fate  \
\ Department of Pediatric Dentistry           / \   \       and Dreams        \
\ Norre Alle 20                              /   | | \                        \
\ DK-2200 Kobenhavn N                      /\    |  \ \                       \
\ +45 31 37 17 00 - 4276                  /  \ ^     | \                      \
\                                                                             \
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\