sysop@comhex.UUCP (Joe E. Powell) (07/26/89)
Has anyone else ever noticed that very large (over 300K) files
sometimes tend to core dump when they are invoked?  They usually
work fine, but every now and again, the program will just refuse
to start up.  Is it just me or have other people had this happen?

I've noticed this occasionally on nethack and moria, but more
often with gcc (esp gcc 1.35).

I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.
--
Joe E. Powell
unf7!comhex!sysop@bikini.cis.ufl.edu
lenny@icus.islp.ny.us (Lenny Tropiano) (07/27/89)
In article <211@comhex.UUCP> sysop@comhex.UUCP (Joe E. Powell) writes:
|>Has anyone else ever noticed that very large (over 300K) files
|>sometimes tend to core dump when they are invoked? They usually
|>work fine, but every now and again, the program will just refuse
|>to start up. Is it just me or have other people had this happen?
|>
|>I've noticed this occasionally on nethack and moria, but more
|>often with gcc (esp gcc 1.35).
|>
|>I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.

I would check the output (if any) in your /usr/adm/unix.log file.  It
is possible you are getting NMI parity errors (if your program core
dumps with a "Memory Fault", this kinda sounds like the symptoms).
This usually signifies bad memory (sorry to say) ... Run memory
diagnostics, although that doesn't always find the problem.  I'd also
suggest pulling the .5MB of RAM out and seeing if the problem goes
away ... (possibly the bad memory is on the expansion board).

Good luck!
-Lenny
--
Lenny Tropiano          ICUS Software Systems        [w] +1 (516) 589-7930
lenny@icus.islp.ny.us   Telex: 154232428 ICUS        [h] +1 (516) 968-8576
{ames,talcott,decuac,hombre,pacbell,sbcs}!icus!lenny     attmail!icus!lenny
ICUS Software Systems -- PO Box 1; Islip Terrace, NY 11752
jcm@mtunb.ATT.COM (was-John McMillan) (07/28/89)
In article <929@icus.islp.ny.us> lenny@icus.islp.ny.us (Lenny Tropiano) writes:
>In article <211@comhex.UUCP> sysop@comhex.UUCP (Joe E. Powell) writes:
>|>Has anyone else ever noticed that very large (over 300K) files
>|>sometimes tend to core dump when they are invoked? They usually
>|>work fine, but every now and again, the program will just refuse
>|>to start up. Is it just me or have other people had this happen?
>|>
>|>I've noticed this occasionally on nethack and moria, but more
>|>often with gcc (esp gcc 1.35).
>|>
>|>I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.

Uhhhhh.  Finally, a chance to disagree with Lenny!?-)

Sounds to me like our once-a-week visit to SWAP land.  If you've
exhausted SWAP space, it is presented to the program as an ENOMEM
error in some phase of forking/execing/malloc-ing/stack-extending.
And programs often presume there's lotza mem, so why check return
values!

	***** NOTE WELL: _MY_ code never does this ]8-) *****

If so: run less, or allocate more.

john mcmillan -- att!mtunb!jcm -- "What NEVER? ... Hardly EVER ..."
					Gilbert & Sullivan (Pinafore)
sysop@comhex.UUCP (Joe E. Powell) (07/28/89)
In article <929@icus.islp.ny.us>, lenny@icus.islp.ny.us (Lenny Tropiano) writes:
> In article <211@comhex.UUCP> sysop@comhex.UUCP (Joe E. Powell) writes:
> |>Has anyone else ever noticed that very large (over 300K) files
> |>sometimes tend to core dump when they are invoked? They usually
> |>work fine, but every now and again, the program will just refuse
> |>to start up. Is it just me or have other people had this happen?
> |>
> |>I've noticed this occasionally on nethack and moria, but more
> |>often with gcc (esp gcc 1.35).
> |>
> |>I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.
>
> I would check the output (if any) in your /usr/adm/unix.log file.  It

I mail myself a copy of the unix.log file every night, so I know I'm
not having anything like that show up.

> is possible you are getting NMI Parity errors (if your program core dumps
> with a "Memory Fault" this kinda sounds like the symptoms) This usually

Hmmm... but doesn't it say "Memory Fault" every time a program dumps
core?  No matter what?

> signifies bad memory (sorry to say) ... Run memory diagnostics, although
> that doesn't always find the problem.  I'd also suggest pulling the .5MB
> of RAM out, and see if it goes away ... (possibly the bad memory is on
> the expansion board).

This happens on two different machines.  One with 1MB on the
motherboard w/1.5MB combo card, and another with 2MB on the
motherboard w/.5MB RAM card.

Let me clarify what happens:

	$ nethack
	Memory fault - core dumped
	$ nethack
	Memory fault - core dumped
	$ nethack
	Memory fault - core dumped
	$ nethack
	[the nethack screen starts up]

Do you still think I'm having memory problems?
--
Joe E. Powell
unf7!comhex!sysop@bikini.cis.ufl.edu
res@cbnews.ATT.COM (Robert E. Stampfli) (07/30/89)
In article <211@comhex.UUCP> sysop@comhex.UUCP (Joe E. Powell) writes:
>Has anyone else ever noticed that very large (over 300K) files
>sometimes tend to core dump when they are invoked? They usually
>work fine, but every now and again, the program will just refuse
>to start up. Is it just me or have other people had this happen?
>
>I've noticed this occasionally on nethack and moria, but more
>often with gcc (esp gcc 1.35).
>
>I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.
>
>--
>Joe E. Powell
>unf7!comhex!sysop@bikini.cis.ufl.edu

Yes, I have noticed this also with gcc 1.35.  We recently added more
memory to one of our machines (a 2-meg expansion card), making our
configuration remarkably similar to yours: 2.5 meg RAM (.5 on the
motherboard, 2.0 on the expansion card), 40 meg disk, 3.51 (not the
"a" revision).  We have a serial card, also.

Now, all of a sudden, I notice that gcc 1.35 dumps core more than half
of the time, but when it doesn't, it works fine.  This is when I run
it from the tty002 line connected to my terminal.  Now the kicker:
when I run gcc from the console, it *always* works.

My hypothesis up to this point, without looking at the problem in
detail, was that there is probably some interaction with the number of
bytes of exported variables on the stack -- perhaps a bug in gcc that
caused it to use more stack than was allocated.  This would explain it
working sometimes and not others, but after your posting I am less
convinced it is a gcc problem.

I am curious: do you run gcc from the console or a tty line?  If the
latter, which tty?  (Before the upgrade, my tty was on tty001, and it
worked fine from there, although I have not had the opportunity to
recable and try it since.)  Also, is your machine a .5/2.0
configuration?

BTW, the memory card is an upgraded .5 meg card which was run thru
numerous passes of the diagnostics without a glitch.  I forget the
signal number, but gcc dies with a segmentation fault, which is
unlikely to be due to a hardware problem.
Rob Stampfli att!cbnews!res (work) osu-cis!n8emr!kd8wk!res (home)
bob@rush.cts.com (Bob Ames) (08/03/89)
In article <211@comhex.UUCP> sysop@comhex.UUCP (Joe E. Powell) writes:
>Has anyone else ever noticed that very large (over 300K) files
>sometimes tend to core dump when they are invoked? They usually
>work fine, but every now and again, the program will just refuse
>to start up. Is it just me or have other people had this happen?
>
>I've noticed this occasionally on nethack and moria, but more
>often with gcc (esp gcc 1.35).
>
>I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.

I don't have gcc, but I have the same thing happen on moria.  I have
never seen nethack 2.2 have a problem.

On a slightly related subject, I've been trying, for about 1 YEAR, to
get a version of nethack higher than 2.2.  I've never seen kitchen
sinks...  Does *anybody* have a working 2.3x version that wouldn't be
too hard to port to the unix-pc?  I can now FTP, I think, if provided
exact instructions after the 'ftp site.name.xxf.fgr' command...  Or I
could call you...  Or I could send disks with SASE...  Or I could ....
Or I could ....

Also, what's happened to killer?  I finally got a DOS card for this
beast and now can't get to killer to get ibm-pc archives |-(

Sorry for all these subjects, I only get unix-pc.all currently.

Bob Ames
--
Bob Ames
The National Organization for the Reform of Marijuana Laws, NORML
"Pot is the world's best source of complete protein, alcohol fuel, and
paper, is the best fire de-erosion seed, and is america's largest cash
crop." - USDA
bob@rush.cts.com or ncr-sd!rush!bob@nosc.mil or rutgers!ucsd!ncr-sd!rush!bob
619-741-UN2X  "We each pay a fabulous price for our visions of paradise," Rush
jcm@mtunb.ATT.COM (was-John McMillan) (08/03/89)
In article <211@comhex.UUCP> sysop@comhex.UUCP (Joe E. Powell) writes:
>Has anyone else ever noticed that very large (over 300K) files
>sometimes tend to core dump when they are invoked? They usually
>work fine, but every now and again, the program will just refuse
>to start up. Is it just me or have other people had this happen?

A final (HAhahaha...) muttering from me on this:

1) While I posted (or E-mailed) earlier that this sounds like classic
   outta-SWAP trouble, another possibility occurred to me.

2) Until approx. the 3.51C kernel, there was an error in the
   'getcontext()' code.  Despite mis-leading COMMENTS, the code failed
   to properly change context between two processes.  Specifically, if
   the OUTGOING process had SHARED MEMORY [SHM], it was left mapped-in
   in the 'MMU'.

   This caused gross pain when that SHM was at a low enough address
   that another process attempted to use the same Virtual Address [VA]
   space.  Since the 'MMU' indicated the page was PRESENT, the new
   process didn't fault-in its own page on 1st access -- ie., start-up,
   usually.  Since it's typically difficult to execute another
   program's shared data, death by illegal-instruction was common if
   the new program had a LARGE TEXT image.

   The error remained as long as it did because:
   a) the coincidence of low-VA SHM & a concurrent large TEXT process
      is rare;
   b) the kernel code was 'correct' and the comments were mis-leading:
      the CONCEPT was flawed because the stated process was not
      'in-context' when the code was executing;
   c) the entire, creaking/ancient VM base for the 3B1 -- based on
      some Berkeley model -- is obscure and barely even patchable due
      to its anomalies.

In brief, there are SOME cases where program start-up failures may
reflect the above problem.  I presume IPCS(1) would indicate if you've
SHM in use, but WHERE it's mapped is another Q.

john mcmillan -- att!mtunb!jcm
wolfer@cbnewse.ATT.COM (paul.d.wolfson) (08/05/89)
In article <211@comhex.UUCP>, sysop@comhex.UUCP (Joe E. Powell) writes:
> Has anyone else ever noticed that very large (over 300K) files
> sometimes tend to core dump when they are invoked? They usually
> work fine, but every now and again, the program will just refuse
> to start up. Is it just me or have other people had this happen?
>
> I've noticed this occasionally on nethack and moria, but more
> often with gcc (esp gcc 1.35).
>
> I'm running 3.51a, with a 40 MB drive and 2.5 MB of RAM.
>
> --
> Joe E. Powell
> unf7!comhex!sysop@bikini.cis.ufl.edu

I've been running umoria (4.85, all patches applied) for about two
years.  I have just the basic system (20M HD, 3.5, 1 meg RAM).  Umoria
has a few panic-save routines to prevent you from losing your
character, but I haven't tried the latest version of nethack.

With umoria, the few times it crashed (very few) I tracked the problem
down to overindexing arrays, of all things.  Some larger Unix machines
seem to let you get away with this to a certain degree, but not the
unixpc.  Umoria is compiled with the -g option in the makefile, so
just run sdb to find where in the code it's blowing up.  With nethack,
check to see if the -g option is used, and check the code for the
overindexing problem.  I have found no indications of core dumps being
caused by the unixpc itself for any of the large games I've run.

Also, it's always a good idea to run lint on these big games.  They
are usually written for, and run on, the big mainframes on college
campuses.  Most of them are written for BSD with hooks for SYSV.  I'm
not sure where the unixpc fits into these unix versions...  It seems
to be a mongrel composed of a little of both.  Anyway, I've found some
very strange things in the lint output for some of these games, and
lint output may give you some leads as to where to search for your
core dump problems.
________________________________________________________________________
P. Wolfson
student@unf7.UUCP (student account) (08/07/89)
In article <1586@mtunb.ATT.COM>, jcm@mtunb.ATT.COM (was-John McMillan) writes:
>> In article <211@comhex.UUCP> I write:
>> [ description about large programs dumping core on startup -- deleted ]
> [ description of shared memory page conflict deleted ]

That was it!  I took out all the programs that were using shared
memory and everything is working fine now.

You say the problem has been fixed?  I assume we'll see the fixes in
the upcoming fix disk?
--
Joe E. Powell
unf7!comhex!sysop@bikini.cis.ufl.edu
jcm@mtunb.ATT.COM (was-John McMillan) (08/08/89)
In article <212@unf7.UUCP> unf7!comhex!sysop@bikini.cis.ufl.edu writes:
>
>That was it! I took out all the programs that were using shared memory
>and everything is working fine now.
>
>You say the problem has been fixed? I assume we'll see the fixes in the
>upcoming fix disk?

I say: the problem is fixed in the currently available fix-disk -- to
the best of MY recollection.  Regardless, the fixed source has been
submitted & accepted, so far as I know.

If this is a problem, and you will accept a *3.51* experimental
                                            ^^^^^^
kernel, submit an E-mail address to moi:	att!mtunb!jcm

john mcmillan -- as above -- Growling on forever....