lenny@icus.islp.ny.us (Lenny Tropiano) (11/09/89)
Well, my machine was up for 30 days, 20 hours, 14 minutes and it just
flipped out! Things were running slowly. Does memory get fragmented
like disks do? Should you reboot frequently, and what is the frequency?

I was running a news unbatch (compress running...). I was in elm, and
I have a few drivers loaded (just a few) :-)

DEVNAME  ID  BLK  CHAR  LINE  SIZE     ADDR      FLAGS
wind      0   -1     7    -1  0x9000   0x54000   ALLOC BOUND
lipc      1   -1    -1    -1  0x7000   0x360000  ALLOC BOUND
cmb       2   -1    -1    -1  0x3000   0x5d000   ALLOC BOUND
voice     3   -1     9    -1  0xa000   0x367000  ALLOC BOUND
tp        4   -1    10    -1  0x3000   0x371000  ALLOC BOUND
starlan   5   -1    12    -1  0x14000  0x3de000  ALLOC BOUND

Then of course I have the typical StarLAN daemons running, my own
daemons, and whatever else (probably a uucico & vi).

Am I just asking for too much out of the machine for 3.5MB? I know this
has been discussed before: shouldn't the processes just swap out to
disk if there isn't enough memory? I do have the standard 5MB swap
partition.

The last dying word of my machine was:

sysinfo: cannot read /dev/rfp002

(NOTE: there are no HDERRs in unix.log)

The machine was very quiet; it sounded like the news unbatching had
halted. The LEDs were normal. The mouse responded and tried to select
windows. Then I got a prompt back:

[654 Filecabinet] ps -ef
Killed
[655 Filecabinet] ps -ef
Killed

Would an extra .5MB help? I'm going to do the 4MB upgrade (1.5MB on
the combo card, a hardware patched 512K card, and 2MB on the
motherboard).

Should I decrease the available memory to compress (USERMEM) and
recompile? What major effects will this have on compressing (time
wise)?

-Lenny
--
| Lenny Tropiano          ICUS Software Systems     [w] +1 (516) 589-7930 |
| lenny@icus.islp.ny.us   Telex; 154232428 ICUS     [h] +1 (516) 968-8576 |
| {ames,pacbell,decuac,hombre,sbcs,attctc}!icus!lenny  attmail!icus!lenny |
+------- ICUS Software Systems -- PO Box 1; Islip Terrace, NY 11752 -------+
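For a rough sense of how much memory the loaded drivers in the listing above account for, the SIZE column can simply be added up. The sketch below is in Python purely for illustration, and it assumes the SIZE values are byte counts, which the listing itself does not state.

```python
# Sizes copied from the SIZE column of the driver listing in the post.
# ASSUMPTION: the values are in bytes; the listing does not say.
DRIVER_SIZES = {
    "wind": 0x9000,
    "lipc": 0x7000,
    "cmb": 0x3000,
    "voice": 0xA000,
    "tp": 0x3000,
    "starlan": 0x14000,
}


def total_driver_bytes(sizes):
    """Return the sum of the per-driver sizes."""
    return sum(sizes.values())


total = total_driver_bytes(DRIVER_SIZES)
print(f"{total} bytes = {total // 1024} KB ({hex(total)})")
# prints "212992 bytes = 208 KB (0x34000)"
```

Under that assumption the six drivers together take about 208 KB, a small fraction of the 3.5MB installed.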
gil@limbic.UUCP (Gil Kloepfer Jr.) (11/09/89)
In article <1020@icus.islp.ny.us> lenny@icus.islp.ny.us (Lenny Tropiano) writes:
>Well my machine was up for 30 days, 20 hours, 14 minutes and it just
>flipped out! Things were running slowly. Does memory get fragmented
>like disks do?

Okay, I'll bite. Note that the following is definitely based on
theory, and not on documented experience, but it sounds reasonable.

The way the UNIX-pc MM works, memory will probably be "fragmented" 99%
of the time, but the hardware page tables should map the memory into
what appears to be a "contiguous" section of memory. What it can't get
in "real" memory, it will page off to disk as "virtual" memory by
virtue of the memory management system.

Now someone with a good working knowledge of the internals of the
UNIX-pc kernel (you know who you are ;-) could probably check the way
that pages are allocated and freed and whether the page table entries
are being maintained properly. Someone mentioned in an earlier article
that he never saw his machine (via sysinfo) go below .5 meg free. This
might (??) be a related problem.

> Should you reboot frequently, and what is the frequency?

I would say that most of us would say that you should NEVER have to
reboot your machine. The AT&T hotline would most certainly say you
should, every day if possible ;-) [if that doesn't work, you could
always reformat the hard disk and reload the OS ;-) ;-) ]

Considering the number of daemons running on your machine, and the
nature of the devices you have, it might be a good idea to do
a "ps -lef" and check the SZ and RSZ fields (I think those are memory,
right folks?!) and see if any of them continuously increase from day to
day. One of these daemons might be eating your memory to oblivion!

>The last dying word of my machine was:
>sysinfo: cannot read /dev/rfp002

Hmmm.... System buffers maybe?

For those who find it necessary to flame for incorrect information, my
disclaimer here is that I don't claim to know all about this, but I'm
hoping that these comments will encourage some thought about what might
be happening.

Gil.
-----
| Gil Kloepfer, Jr.
| ICUS Software Systems/Bowne Management Systems (depending on where I am)
| ...ames!limbic!gil
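The day-to-day check of process sizes suggested above could be sketched roughly as follows. This is an illustration in Python, not anything from the original post: the function name `steadily_growing` is hypothetical, the readings are made-up numbers, and collecting the SZ values from real "ps -lef" output is left out because the column layout varies between systems.

```python
def steadily_growing(snapshots):
    """Return the keys whose size rose in every successive snapshot.

    `snapshots` is a list of dicts mapping a process key (for example a
    PID) to its SZ value, oldest first. Processes that are missing from
    any snapshot are ignored.
    """
    if len(snapshots) < 2:
        return []
    common = set(snapshots[0])
    for snap in snapshots[1:]:
        common &= set(snap)
    growing = []
    for key in sorted(common):
        sizes = [snap[key] for snap in snapshots]
        if all(later > earlier for earlier, later in zip(sizes, sizes[1:])):
            growing.append(key)
    return growing


# HYPOTHETICAL SZ readings on three consecutive days (not real data).
day1 = {101: 40, 102: 64, 103: 20}
day2 = {101: 40, 102: 71, 103: 18}
day3 = {101: 41, 102: 80, 103: 22}
print(steadily_growing([day1, day2, day3]))  # prints "[102]"
```

Only a process that grows in every interval is reported, so one that merely fluctuates (like 103 here) is not flagged.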
spear@druco.ATT.COM (Steve Spearman) (11/10/89)
> In article <1020@icus.islp.ny.us> lenny@icus.islp.ny.us (Lenny Tropiano) writes:
>>Well my machine was up for 30 days, 20 hours, 14 minutes and it just
>>flipped out! Things were running slowly. Does memory get fragmented
>>like disks do?

As far as paging goes, the UnixPC definitely can page correctly. I was
running a 512K system for a while (yes, it was very painful) and
EVERYTHING would page out. I really think the problem you are
experiencing is related to a system fault, not insufficient memory
size. I run for months with 2.5 meg with no problems, but have nowhere
near the number of drivers and the load it sounds like you have.
Sounds like either a driver bug or a real kernel bug.

At this point, a planned restart might be a reasonable option rather
than pursuing something that takes so long to reproduce. Weekly or
monthly would probably be fine. I've seen others who believe this is a
good idea.

Steve Spearman
spear@booboo.att.com
jbm@uncle.UUCP (John B. Milton) (11/11/89)
In article <577@limbic.UUCP> gil@limbic.UUCP (Gil Kloepfer Jr.) writes:
>In article <1020@icus.islp.ny.us> lenny@icus.islp.ny.us (Lenny Tropiano) writes:
>>Well my machine was up for 30 days, 20 hours, 14 minutes and it just
>>flipped out! Things were running slowly. Does memory get fragmented
>>like disks do?

The slowdown sounds like the clist problem. Was your disk doing lots of
recals? (unusual buzzing or humming on some drives)

>the nature of the devices you have, it might be a good idea to do
>a "ps -lef" and check the SZ and RSZ fields (I think those are memory,
>right folks?!) and see if any of them continuously increase from day to
>day. One of these daemons might be eating your memory to oblivion!

But that is still the size(s) of the process, not the working set (the
part that wants to be in RAM all the time). Various RAM-based
fragmentation in the kernel can slow the system down a little, but not
tremendously.

>>The last dying word of my machine was:
>>sysinfo: cannot read /dev/rfp002

Disk error, bud; there's no other reason for a read failure. Check
/usr/adm/unix.log for the bad news.

John
--
John Bly Milton IV, jbm@uncle.UUCP, n8emr!uncle!jbm@osu-cis.cis.ohio-state.edu
(614) h:252-8544, w:785-1110; N8KSN, AMPR: 44.70.0.52; Don't FLAME, inform!
wtm@neoucom.UUCP (Bill Mayhew) (11/12/89)
I have 2 meg on the motherboard and a 67 meg drive on my 3b1.
According to sysinfo, the lowest I've seen the free store drop is
497K, even when I've tried to excessively load the machine. I'm not
sure if the machine is sensing a low water mark at 512K and starting
to page to disk at that point or what. Heaven only knows what
algorithm the kernel is using.

At work, we use a 7300 with a 40 meg disk with 1 meg on the
motherboard and 1.5 on cards. We have three serial ports going. We use
tty000 to drive a trailblazer and tty000 connected to a dz port on a
vax 750. The 7300 is our news server since we've had a lot of problems
with SILO overruns on the vax ports talking to the trailblazer.

I've noticed that about every two weeks the 7300 at work goes
brain-dead with a normal console display and working mouse, but
ignores any input via the keyboard or tty. It would seem that all the
jobs in the ready queue, except for smgr, are hung, because anything
such as sysinfo does not update the display, but the time and date in
W5 at the top of the screen keep updating. We're running HDB uucp.
Unfortunately, since the machine is locked up, it is rather difficult
to tell what is going on.

At home, I had a LOT of problems running version 3.51 and the stock
uucp. My machine crashed at least once a day. After considerable
consultation with ye olde hotline, AT&T replaced the motherboard after
trying a recompiled uucico that they downloaded to my machine. Even
with the new motherboard, I still had crashing. The HDB uucp solved
the problems I was having.

The crashes that I got at home had the same symptoms as the ones we
get on occasion at work. The difference is that at work we transfer
about 4 megs in and 4 megs of data out of the 7300 every day. At home,
my input is about 400 to 750K per day.

We can live with the occasional crashes at work, since once every two
weeks or so isn't that awful :-). I built a little box with a 6502 CPU
chip that monitors the serial line going to the vax. If the monitor
doesn't see a uucico from the 7300 at least once an hour, it picks a
relay to reboot the 7300.

What I have noticed on my machine is that big chunks of memory erode
over the course of two or three days and then reappear. The free store
is about 900K right after booting the machine, and this usually drops
to about 625K over a couple of days, after which the memory seems to
return from limbo. As someone mentioned a while ago, the clist buffers
seem to slowly go away too. My machine starts out with about 130, and
that drops to about 120 after a while, but seems to stabilize at 120.

One last item. I've talked to people who, like me, had tons of
problems with the 3.51 uucp, while other people have had no problems
at all. One person speculated that use of /dev/ph* for uucp was a
factor in the crashing.

Bill
wtm@neoucom.edu
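The reboot box described above is a classic watchdog timer. Its logic can be modelled in a few lines; the sketch below is only a software illustration in Python, not the 6502 firmware, and the class and method names are hypothetical. It also assumes the timer restarts after a reboot is triggered, which the post does not say.

```python
class UucicoWatchdog:
    """Software model of a watchdog that fires after a quiet period."""

    def __init__(self, timeout_seconds=3600):
        # One hour matches the "at least once an hour" rule in the post.
        self.timeout_seconds = timeout_seconds
        self.last_seen = 0.0

    def saw_activity(self, now):
        """Record that a uucico session was observed at time `now`."""
        self.last_seen = now

    def check(self, now):
        """Return True (and restart the timer) once the timeout expires."""
        if now - self.last_seen > self.timeout_seconds:
            # ASSUMPTION: give the machine a full period to come back up
            # before the relay can be picked again.
            self.last_seen = now
            return True
        return False


wd = UucicoWatchdog()
wd.saw_activity(0)
print(wd.check(3600))  # prints "False": exactly one hour, not yet expired
print(wd.check(3601))  # prints "True": more than an hour of silence
print(wd.check(3602))  # prints "False": timer was restarted by the reboot
```

Times are plain numbers of seconds passed in by the caller, so the model can be exercised without waiting an hour.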
karl@zip.UUCP (Karl F. Fox) (11/13/89)
In article <577@limbic.UUCP> gil@limbic.UUCP (Gil Kloepfer Jr.) writes:
>In article <1020@icus.islp.ny.us> lenny@icus.islp.ny.us (Lenny Tropiano) writes:
>>The last dying word of my machine was:
>>sysinfo: cannot read /dev/rfp002
>
>Hmmm.... System buffers maybe?

Nope, /dev/rfp002 is a character device, not a block device, so it
doesn't use kernel block buffers.
--
Karl F. Fox, Morning Star Technologies, Inc.
karl@MorningStar.COM
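Whether a device node is a character or a block device is recorded in its file mode, so it can be checked directly. The following is a small Python sketch for illustration only (the function names are hypothetical); it uses the standard `os.stat` and `stat.S_ISCHR` / `stat.S_ISBLK` calls.

```python
import os
import stat


def device_kind(mode):
    """Classify a file mode as a character device, block device, or neither."""
    if stat.S_ISCHR(mode):
        return "character"
    if stat.S_ISBLK(mode):
        return "block"
    return "not a device"


def device_kind_of_path(path):
    """Classify an existing path, e.g. a node under /dev."""
    return device_kind(os.stat(path).st_mode)


# Synthetic modes, so the example runs without touching any real device.
print(device_kind(stat.S_IFCHR | 0o600))  # prints "character"
print(device_kind(stat.S_IFBLK | 0o600))  # prints "block"
```

On a system where the node exists, `device_kind_of_path("/dev/rfp002")` would be the corresponding check for the device named in the error message.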