cbf@toro.UUCP (10/28/89)
We're currently experiencing some problems with Informix's RDBMS products, and we're hoping there may be some out there who can be of more help to us than Informix themselves have been so far.

We're using their 4GL rel 1.10.00A and ESQL/C rel 2.10.00A on NCR Towers 32/600 running Sys V rel 1.03.02. We're also using the same two products, rel 1.10.03D and 2.10.03D respectively, on an NCR Tower 32/650 running Sys V rel 2.01.00. We use both Informix's standard back-end and their "Turbo" product, but we suspect that the choice of back-end has nothing to do with our problems in this particular instance.

As part of our trading system, we have two large applications written in a mixture of 4GL, EC and C. Our front-end client (a little over 1M according to "size") does not contain any SQL statements and takes advantage only of 4GL's forms package. The clients communicate over Sys V IPC with our database manager (about 600K according to "size"), which conversely does not use any of Informix's visual interface (save for the occasional "display" statement) and is exclusively in charge of applying transactions to our DB using SQL. A typical transaction might consist of looking up an object in a table, inserting a row for the object in that table if necessary, and inserting related rows for the object in three other tables and/or updating existing rows in those three tables.

Both applications have this unfortunate tendency to hang or die prematurely, and always in circumstances related to memory allocation.

The clients will typically die with a segmentation violation while trying to open a new (sub)window, and a core trace will inevitably end in a call to malloc() reached through assorted Informix library calls seemingly related to opening a window. However, the frequency of those faults changes from one incarnation of the executable to another. For example, decreasing the size of a global array might mean that our clients will die every hour instead of every two... Weird, huh?
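Faults that surface inside malloc() and whose frequency shifts when the data layout changes are often a sign that something overwrote memory earlier on, though that is only a guess here and nothing above proves it. One way to test the guess is to put guard bytes around the blocks the application itself allocates and check them before malloc() gets a chance to trip. What follows is a minimal sketch in C under that assumption, not code from our applications: guard_malloc(), guard_check(), guard_free(), HEADER_LEN, TAIL_LEN and GUARD_BYTE are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical, for illustration only. */
#define HEADER_LEN 16    /* size field plus front guard; keeps user data aligned */
#define TAIL_LEN   8     /* back guard after the user data */
#define GUARD_BYTE 0xA5  /* arbitrary fill pattern */

/* Block layout: [size_t n][front guard ...][n user bytes][back guard] */
void *guard_malloc(size_t n)
{
    /* Sketch only: no check for overflow of the size computation. */
    unsigned char *raw = malloc(HEADER_LEN + n + TAIL_LEN);

    if (raw == NULL)
        return NULL;
    memcpy(raw, &n, sizeof n);                          /* remember the size */
    memset(raw + sizeof n, GUARD_BYTE, HEADER_LEN - sizeof n);
    memset(raw + HEADER_LEN + n, GUARD_BYTE, TAIL_LEN);
    return raw + HEADER_LEN;
}

/* Returns 0 if both guards are intact, -1 if the front guard was
 * overwritten (underrun), 1 if the back guard was (overrun).
 * If the stored size itself was clobbered, the result is unreliable. */
int guard_check(const void *p)
{
    const unsigned char *user = p;
    const unsigned char *raw;
    size_t n, i;

    assert(p != NULL);
    raw = user - HEADER_LEN;
    memcpy(&n, raw, sizeof n);
    for (i = sizeof n; i < HEADER_LEN; i++)
        if (raw[i] != GUARD_BYTE)
            return -1;
    for (i = 0; i < TAIL_LEN; i++)
        if (user[n + i] != GUARD_BYTE)
            return 1;
    return 0;
}

/* Checks the guards, releases the block, and reports what the check found. */
int guard_free(void *p)
{
    int status;

    if (p == NULL)
        return 0;
    status = guard_check(p);
    free((unsigned char *)p - HEADER_LEN);
    return status;
}
```

Calling guard_check() on long-lived buffers just before each window is opened would narrow down when the damage happens. Note the limits of the idea: it only covers blocks allocated through the wrapper, so allocations made inside the Informix libraries and overruns of global arrays would go unseen.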
Our server, on the other hand, is more varied in its modes of death. It used to be that the only one -- and we even convinced ourselves of a firm correlation between it and the number of transactions processed -- was the failure of an SQL statement with the code -257. The 4GL manual states that in this case the "System limit on cursors [was] exceeded, maximum is NUM", although what NUM is, the 4GL error statement neglects to mention. But at least it was predictable, and for 18 months or so we learned to live with it, once prolonged and anguished interaction with Informix failed to amount to anything.

However, we recently did some extensive rewriting of, and additions to, our transaction processing code and we now have Humpty Dumpty on our hands. Our server has now taken to dying early, often and in non-deterministic fashion. We no longer get the -257, although that may be because it never runs long enough to get there. Instead we get the occasional -406 ("Memory allocation failed.") and the quite frequent segmentation violation. Typically the core trace will list, in reverse order of invocation: malloc(), _sqrealloc(), _sqgcdesc(), _iqslct(), our_function().

A final variation on that theme is the hanging SQL or string concatenation statement. After inducing a core dump in the latter case, we might see something like: malloc(), _doconcat(), our_function().

Now to be fair, there have been some instances in the past where people at Informix have been of tremendous help to us (the names Chris Radcliffe and Gilbert Dibbs come immediately to mind here). But as is wont to be the case whenever the answer to a question isn't to be found on page nnn of the manual, trying to get this resolved has been like running into a brick wall (and almost as much fun).

Additional (possibly relevant) information:

We don't use the transaction logging feature with our databases.

malloc(3X) fared no better than malloc(3C).
Our kernel is configured for a 2M process size and we've experimented with the shmbrk parameter and the brk() and sbrk() system calls (both clients and server do attach to shared memory but we've ascertained that this has no discernible impact). All to no avail.

Also, for various reasons (mostly related to networking issues), an upgrade to a more recent version of the OS and the associated Informix software is not really an option right now, although it would be interesting to know whether this is a known problem which has been fixed in later releases.

Any comments or suggestions would be most welcome. And I'd be happy to answer any further questions on the matter. If I receive any trenchant correspondence in e-mail, I will (absolutely!) summarize back to the net.

Thanks in advance.

--
Charles B. Francois
UUCP:...!uunet!toro!cbf