[comp.databases] Strange problems with Informix Perform

jaa@codas.att.com (James Anderson) (01/31/89)

This is a problem that has been bugging me for a while.
Maybe someone else has run into it and solved it.

This affects both Informix 3.30 Perform screens and
Informix SQL 2.10 Perform screens.

A user brings up the perform screen, enters the Add or Update mode.
Something happens to the dialin port they are on and they get disconnected
from the system. The Informix process starts to run wild grabbing as many
usr and sys ticks as it can get (can tell by reading sar).
A who shows the user still logged in with no idle time. A stat on the tty
usually shows 0 idle read time and a write idle time (sometimes no write idle).

The getty defs for the tty have HUPCL in them. There is no changing of this
in any of the profile scripts.

Question: What is going on, and what can be done to prevent it?

The current solution is to look at ps for a perform or sperform that has a
large time (>10.00) and kill them and their parents.
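
The check described above could be sketched in C roughly as follows. This is only an illustration: the function names cpu_seconds and is_runaway are made up, the TIME column is assumed to look like "MM:SS" or "HH:MM:SS", and the command name is assumed to be a bare "perform" or "sperform". Reading the ps output and doing the actual kill are left out.

```c
#include <stdio.h>
#include <string.h>

/* Convert a ps-style TIME field ("MM:SS" or "HH:MM:SS") to seconds.
 * Returns -1 if the field does not match either assumed layout. */
long cpu_seconds(const char *field)
{
    int a, b, c;
    char extra;

    /* Exactly three numbers and nothing after them: HH:MM:SS. */
    if (sscanf(field, "%d:%d:%d%c", &a, &b, &c, &extra) == 3)
        return a * 3600L + b * 60L + c;

    /* Exactly two numbers and nothing after them: MM:SS. */
    if (sscanf(field, "%d:%d%c", &a, &b, &extra) == 2)
        return a * 60L + b;

    return -1;
}

/* Return 1 if cmd is perform or sperform and its CPU time is above
 * limit_seconds, otherwise 0. The limit is the caller's choice. */
int is_runaway(const char *cmd, const char *time_field, long limit_seconds)
{
    if (strcmp(cmd, "perform") != 0 && strcmp(cmd, "sperform") != 0)
        return 0;
    return cpu_seconds(time_field) > limit_seconds;
}
```

A caller would pass each process's command name and TIME field, with a limit such as 600 seconds if ">10.00" is read as ten minutes of CPU time.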

Thanks for any help
James Anderson
jaa@codas.att.com

jdt@sfsup.UUCP (J Tais) (01/31/89)

In article <34891@codas.att.com>, jaa@codas.att.com (James Anderson) writes:
> A user brings up the perform screen, enters the Add or Update mode.
> Something happens to the dialin port they are on and they get disconnected
> from the system. The Informix process starts to run wild grabbing as many
> usr and sys ticks as it can get (can tell by reading sar).
> A who shows the user still logged in with no idle time. A stat on the tty
> usually shows 0 idle read time and a write idle time (sometimes no write idle).

I remember a similar problem on my last project, but it happened when users
were rlogin'ed over TCP/IP and running perform on the remote machine.  We
had a persistent problem with rogue perform processes grabbing all kinds of
cpu time when users disconnected in abnormal ways or were terminated by the
idle-line watcher.  However, I think the problem went away when we upgraded
to the next release of [Wollongong] TCP/IP.

I don't really know the answer, but you're definitely not alone.

> The getty defs for the tty have HUPCL in them. There is no changing of this
> in any of the profile scripts.
> 
> Question: What is going on, and what can be done to prevent it?
> 
> The current solution is to look at ps for a perform or sperform that has a
> large time (>10.00) and kill them and their parents.

Yes, we set up a shell script to do just that. Actually, it identified pseudo-ttys
with no logged-in user and killed their procs.

Might be worth hacking up a C function to call using 'on beginning' and make
sure signal handling for SIGHUP is set the way you want it.  Since they seem
to trap SIGINT, maybe there's some other signal handling going on.  I dunno.
Perform is a strange beast.  Seems to handle boundary conditions very poorly.
I suspect there are fixed size tables for user functions, lookups, etc; when
you exceed them unpredictable things start to happen!  I have seen perform
screens stop functioning when new user-defined functions were added; also
seen lookups fail to work when a screen had too many of them.  Never took
the time to diagnose exactly when these things stop working.

Also, never could get ESQL to work from within sperform.  Had to run SQL 
calls in a child.  Is this a feature or a bug?

John Tais
jdt@sfsup.att.com

mjm@attibr.UUCP (Mike Matthews) (02/04/89)

> Perform is a strange beast.  Seems to handle boundary conditions very poorly.
> I suspect there are fixed size tables for user functions, lookups, etc; when
> you exceed them unpredictable things start to happen!  I have seen perform
> screens stop functioning when new user-defined functions were added; also
> seen lookups fail to work when a screen had too many of them.  Never took
> the time to diagnose exactly when these things stop working.
	Perform will not allow lookups of more than 12 tables, which is described
in the 2.10 manual.
	Perform and Ace have definite limits that neither compiler (saceprep or
formbuild) is intelligent enough to warn about. The aftereffects can be quite
painful and often do not appear until later versions are installed.
We had a Perform application where a programmer had ignored the above-mentioned
limit, but the application ran anyway until the Informix version was upgraded.
The perform screen started exhibiting the same behavior as described in the
HUPCL problem mentioned, essentially bringing a 3B2 to its knees, much to the
surprise of the system administrators doing the upgrade.
	An ace report that results in a row greater than PAGE_SIZE -
( 32 + 4 ) will bomb with any one of many strange messages. Why can't saceprep
tally the row size and produce a warning giving a clue to the subsequent
run-time problem?
> 
> Also, never could get ESQL to work from within sperform.  Had to run SQL 
> calls in a child.  Is this a feature or a bug?
> 
The perform language is explicitly defined in the manual. I don't think
any extra-language constructs or references are supported. Don't you have
any manuals???

Mike Matthews
ATT International
Tech Support

prc@maxim.ERBE.SE (Robert Claeson) (02/05/89)

In article <4723@sfsup.UUCP>, jdt@sfsup.UUCP (J Tais) writes:
> In article <34891@codas.att.com>, jaa@codas.att.com (James Anderson) writes:
> > A user brings up the perform screen, enters the Add or Update mode.
> > Something happens to the dialin port they are on and they get disconnected
> > from the system. The Informix process starts to run wild grabbing as many
> > usr and sys ticks as it can get (can tell by reading sar).
> > A who shows the user still logged in with no idle time. A stat on the tty
> > usually shows 0 idle read time and a write idle time (sometimes no write idle).

> I remember a similar problem on my last project, but it happened when users
> were rlogin'ed over TCP/IP and running perform on the remote machine.  We
> had a persistent problem with rogue perform processes grabbing all kinds of
> cpu time when users disconnected in abnormal ways or were terminated by the
> idle-line watcher.

I've seen this behaviour in far too many software packages -- Informix
and Oracle are just two of them. What I think happens is that these
packages ignore SIGHUP and rely on the return code from the write()
and read() system calls to determine when a user has been disconnected.

On many machines, the return code is 0 when the disconnect occurs on a
dialup port (meaning "0 characters read/written") and -1 when a network
connection is disconnected (meaning "error"; errno is set to some reasonable
value). So these packages examine the return code, see a 0 or a -1,
and the program logic decides "heck, sumthin' went wrong, let's try
it again".  And off we go.

Some packages interpret the 0 return code as a hangup indication, while
they think that -1 is some kind of error and the fix is to retry the
operation until it succeeds.
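
To make the distinction concrete, here is a rough sketch in C of a read wrapper that treats both a 0 return and a real error as reasons to stop, and retries only when the call was interrupted by a signal (EINTR). The names read_input and enum input_status are invented for this example; it says nothing about what Informix or Oracle actually do internally.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum input_status {
    INPUT_DATA,    /* at least one byte was read */
    INPUT_HANGUP,  /* read() returned 0: end of file or line dropped */
    INPUT_ERROR    /* read() returned -1 for a reason other than EINTR */
};

/* Read up to len bytes (len must be > 0) from fd into buf.
 * *nread gets the byte count on INPUT_DATA, 0 on INPUT_HANGUP,
 * and -1 on INPUT_ERROR (errno is left as read() set it). */
enum input_status read_input(int fd, char *buf, size_t len, ssize_t *nread)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);

        if (n > 0) {
            *nread = n;
            return INPUT_DATA;
        }
        if (n == 0) {
            *nread = 0;
            return INPUT_HANGUP;
        }
        if (errno == EINTR)
            continue;          /* interrupted by a signal: the only retry */

        *nread = -1;
        return INPUT_ERROR;    /* anything else: give up, do not spin */
    }
}
```

A caller that exits on both INPUT_HANGUP and INPUT_ERROR cannot fall into the retry-forever loop described above, whichever of the two return codes the disconnect produces.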

Note that I don't say that this is what happens in all packages. I just
happen to know that this is the way it happens in some packages. In fact,
I haven't got the faintest idea about what Oracle and Informix do.
I've just seen it happen to both of them, but in Oracle's case only when
the disconnect occurred on a TELNET/rlogin connection.
-- 
Robert Claeson, ERBE DATA AB, P.O. Box 77, S-175 22 Jarfalla, Sweden
"No problems." -- Alf
Tel: +46 758-202 50  EUnet:    rclaeson@ERBE.SE  uucp:   uunet!erbe.se!rclaeson
Fax: +46 758-197 20  Internet: rclaeson@ERBE.SE  BITNET: rclaeson@ERBE.SE

jdt@sfsup.UUCP (Happy Informix User) (02/08/89)

In article <130@attibr.UUCP>, mjm@attibr.UUCP (Mike Matthews) writes:
> >I write:
> > Perform is a strange beast. Seems to handle boundary conditions very poorly.
> > I suspect there are fixed size tables for user functions, lookups, etc; when
> > you exceed them unpredictable things start to happen!  I have seen perform
> > screens stop functioning when new user-defined functions were added; also
> > seen lookups fail to work when a screen had too many of them.  Never took
> > the time to diagnose exactly when these things stop working.
> 	Perform will not allow lookups of more than 12 tables, which is described
> in the 2.10 manual.

We were definitely looking up less than 12 tables.  12 fields?  I dunno.
Besides, it's not in the 3.3 manual.  Not much is.

> 	Perform and Ace have definite limits that neither compiler (saceprep or
> formbuild) is intelligent enough to warn about. The aftereffects can be quite
> painful and often do not appear until later versions are installed.
> We had a Perform application where a programmer had ignored the above-mentioned
> limit, but the application ran anyway until the Informix version was upgraded.
> The perform screen started exhibiting the same behavior as described in the
> HUPCL problem mentioned, essentially bringing a 3B2 to its knees, much to the
> surprise of the system administrators doing the upgrade.

I find the causal relationship here suspicious, but with Informix, who knows.

> 	An ace report that results in a row greater than PAGE_SIZE -
> ( 32 + 4 ) will bomb with any one of many strange messages. Why can't saceprep
> tally the row size and produce a warning giving a clue to the subsequent
> run-time problem?

I would assume that the form compilers and the form interpreters were written
by separate groups to a given design spec...unfortunately, one or both of them
took shortcuts and imposed undocumented limitations on the language.

> > Also, never could get ESQL to work from within sperform.  Had to run SQL 
> > calls in a child.  Is this a feature or a bug?

> The perform language is explicitly defined in the manual. I don't think
> any extra-language constructs or references are supported. Don't you have
> any manuals???

Have you ever attempted to develop something with Informix, or are you just
some kind of tech support parrot who keeps saying "RTFM"?  Well, let me tell
you, I have full Informix 3.3/SQL/4GL manuals, and even so, you tend to get
beyond what they cover pretty easily. 

As for the above, you obviously do not understand the question I am asking.  
I said 'sperform', not 'perform screens' or 'forms' or '.frm files' or 
whatever you care to call them.  We wished to execute certain DB operations 
in our C functions called from the form; however ESQL wouldn't work.  You DO 
know that you can link your own C functions into [s]perform, right?  Haven't
YOU ever done this?  Check your manual...

> Mike Matthews
> ATT International
> Tech Support

I don't need you telling me RTFM; I can always get that from Informix! :-)

John Tais
AT&T-BL Summit NJ
jdt@sfsup.att.com

P.S. I DO like Informix.  Honest.