KFL@AI.AI.MIT.EDU (Keith F. Lynch) (07/25/87)
We have had a lot of problems with the Sun 3/260 (3.3) (4.2BSD) we have had for two months. The file system has gotten totalled several times.

Our Sun representative has told us that if either disk partition becomes more than 90% full, it is normal for all files on both partitions to be trashed without warning. Is this right? If it is, is there a way to prevent more than 90% of a partition from being used?

He also said it could be trashed if a program tries to use too much memory, for instance with large arrays of real numbers. Is this true? If so, how can we prevent this?

He has also said that after doing a restore of a zero level dump, it is necessary to immediately do another zero level dump or the file system will get hosed again. Is this really needed? If so, can it be done overnight, to /dev/null?

Please reply to me. I am not on both of these lists.
...Keith
barry@mind.UUCP (Barry Lustig) (07/26/87)
In article <8467@brl-adm.ARPA> KFL@AI.AI.MIT.EDU (Keith F. Lynch) writes:
>We have had a lot of problems with the Sun 3/260 (3.3) (4.2BSD) we have
>had for two months. The file system has gotten totaled several times.
>
>Our Sun representative has told us that if either disk partition becomes
>more than 90% full, it is normal for all files on both partitions to be
>trashed without warning. Is this right? If it is, is there a way to
>prevent more than 90% of a partition from being used?

That has got to be one of the most pathetic explanations I've ever heard. There is no reason for any file to get trashed because the file system is 90% full. If it were true, 75% of the filesystems in UNIX land would be trashed.

90% is an interesting figure though. In the Berkeley fast filesystem, only root can allocate the last 10% of a filesystem (changeable with tunefs(8)).

>He also said it could be trashed if a program tries to use too much
>memory, for instance with large arrays of real numbers. Is this true?
>If so, how can we prevent this?

More garbage from your Sun representative.

>He has also said that after doing a restore of a zero level dump,
>it is necessary to immediately do another zero level dump or the file
>system will get hosed again. Is this really needed? If so, can it be
>done overnight, to /dev/null?

And even more garbage.

Do you by any chance have either a Xylogics 451 controller or a Fuji SuperEagle? If so, that is where your problem probably is. Under very heavy loads with 2 drives hanging off of it, the 451 has been known to write data with bits shifted. The Fuji SuperEagles have also been known to have problems.

I recommend that you call 1-800-USA-4SUN (Sun's technical support number) and demand some competent help with your problem.

Barry Lustig
Cognitive Science Lab
Princeton University
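The 10% reserve mentioned above can be pictured with a small toy model. This is only a sketch of the policy as described in the post (non-root users are refused once an allocation would eat into the reserve, root is not); it is not the kernel's allocation code, and the function name, parameters, and block counts are invented for illustration.

```python
def can_allocate(total_blocks, free_blocks, request, minfree_pct=10, is_root=False):
    """Toy model of the fast filesystem's free-space reserve.

    minfree_pct stands in for the reserve percentage that tunefs(8) can
    change; everything else here is a hypothetical simplification.
    """
    if request > free_blocks:
        return False              # nobody can allocate space that does not exist
    if is_root:
        return True               # root is allowed to dip into the reserve
    reserved = total_blocks * minfree_pct // 100
    # Ordinary users must leave the reserved blocks untouched.
    return free_blocks - request >= reserved


# A 1000-block filesystem with 150 blocks free and the default 10% reserve.
print(can_allocate(1000, 150, 40))                 # ordinary user, stays above the reserve
print(can_allocate(1000, 150, 60))                 # ordinary user, would cut into the reserve
print(can_allocate(1000, 150, 60, is_root=True))   # root
```

In this model a full-looking filesystem simply refuses new allocations from ordinary users; nothing in the reserve mechanism damages existing files.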
jpn@teddy.UUCP (John P. Nelson) (07/27/87)
>In article <8467@brl-adm.ARPA> KFL@AI.AI.MIT.EDU (Keith F. Lynch) writes:
>>He has also said that after doing a restore of a zero level dump,
>>it is necessary to immediately do another zero level dump

In article <1052@mind.UUCP> barry@mind.UUCP (Barry Lustig) writes:
>And even more garbage.

Most of what the "Sun representative" is supposed to have said was just that: garbage. Interestingly, this part is NOT garbage. Oh, not doing another level 0 dump will not trash the filesystem, but it COULD render all subsequent incremental backups useless. To quote from the "restore" manual page:

     A level zero dump must be done after a full restore.
     Because restore runs in user mode, it has no control over
     inode allocation; this means that restore repositions the
     files, although it does not change their contents. Thus, a
     full dump must be done to get a new set of directories
     reflecting the new file positions, so that later incremental
     dumps will be correct.
john@xanth.UUCP (John Owens) (07/28/87)
In article <1052@mind.UUCP>, barry@mind.UUCP (Barry Lustig) writes:
> In article <8467@brl-adm.ARPA> KFL@AI.AI.MIT.EDU (Keith F. Lynch) writes:
> >He has also said that after doing a restore of a zero level dump,
> >it is necessary to immediately do another zero level dump or the file
> >system will get hosed again. Is this really needed? If so, can it be
> >done overnight, to /dev/null?
>
> And even more garbage.

Your other comments are good, but in this case, the Sun person was somewhat correct, even if he didn't really know what he was talking about. The filesystem certainly won't get "hosed" if you don't dump it, but future incremental dumps will.

If you do a complete filesystem restoration (level 0 and any incrementals), it's good practice to do a fresh level 0 dump. You *must* do this before any future incrementals on that filesystem. If you know the next scheduled backup of that filesystem is a level 0 dump, it's safe not to worry about it.

--
John Owens          Old Dominion University - Norfolk, Virginia, USA
john@ODU.EDU        old arpa: john%odu.edu@RELAY.CS.NET
+1 804 440 4529     old uucp: {decuac,harvard,hoptoad,mcnc}!xanth!john
gordon@sneaky.UUCP (08/01/87)
> /* Written 6:53 pm Jul 25, 1987 by mind.UUCP!barry in sneaky:comp.unix.ques */
> In article <8467@brl-adm.ARPA> KFL@AI.AI.MIT.EDU (Keith F. Lynch) writes:
...
> >He has also said that after doing a restore of a zero level dump,
> >it is necessary to immediately do another zero level dump or the file
> >system will get hosed again. Is this really needed? If so, can it be
> >done overnight, to /dev/null?
>
> And even more garbage.
...

I'm not so sure this is pure garbage. The 4.2/4.3 BSD restore program restores files by going through the file system, not the disk device, and the inodes of the files restored do not necessarily have the same numbers as they had on the dump.

If you don't do a full level zero dump, and later do an incremental dump, and your file system gets trashed again (not because of not doing a level zero dump, but because of Murphy's Law), and you try to restore the OLD level zero dump and the NEW incremental on top of it, you will probably get garbage.

The first dump of the restored file system should be level 0.

Gordon L. Burditt
...!ihnp4!sys1!sneaky!gordon
...!convex!infoswx!hal6000!sneaky!gordon
mangler@cit-vax.Caltech.Edu (System Mangler) (08/01/87)
In article <1052@mind.UUCP>, barry@mind.UUCP (Barry Lustig) writes:
> Under very
> heavy loads with 2 drives hanging off of it, the 451 has been known to
> write data with bits shifted.

Our old Xylogics 450 did the same thing, and this was without overlapped seeks, so I don't understand how it can matter how many drives are on the controller. (Yes, we had two on ours).

I guess this makes the 451 "bug-for-bug" compatible with the 450? Do they use the same microcode?

In our case this particular problem went away after a controller swap.

Don Speck   speck@vlsi.caltech.edu   {ll-xn,rutgers,amdahl}!cit-vax!speck
mangler@cit-vax.UUCP (08/01/87)
>In article <8467@brl-adm.ARPA> KFL@AI.AI.MIT.EDU (Keith F. Lynch) writes:
>>He has also said that after doing a restore of a zero level dump,
>>it is necessary to immediately do another zero level dump

In article <4221@teddy.UUCP>, jpn@teddy.UUCP (John P. Nelson) writes:
> not doing another level 0 dump [...] COULD render
> all subsequent incremental backups useless. To quote from the "restore"
> manual page:
> [...] Thus, a
> full dump must be done to get a new set of directories
> reflecting the new file positions, so that later incremental
> dumps will be correct.

[BSD-specific discussion]

Could someone explain to me why this should be true? After a full restore, the st_ctime of every file/directory has been updated, so the next dump, no matter what level, should dump every single file, right? How is running restore on a fresh filesystem different from cleaning off the current filesystem with "rm -rf" and creating a bunch of files?

Admittedly, since the incremental will be just as large as a full, you might as well do a full, but it seems like something is basically wrong if an incremental doesn't work just because you changed *everything*. [Yes, I know that restore bombs if you changed *nothing*, but so what].

Perhaps the comments in the man page are a holdover from 4.1bsd dump, which did, in fact, need this restriction because restor [sic] wrote the raw disk and thus did not update st_ctime?

Don Speck   speck@vlsi.caltech.edu   {ll-xn,rutgers,amdahl}!cit-vax!speck
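The reasoning about st_ctime above can be sketched as a toy model of how an incremental dump picks files. It assumes the rule is "dump anything whose mtime or ctime is at or after the most recent dump at a lower level"; the function names, the dictionary layout standing in for the dumpdates file, and the timestamps are all invented for illustration, and this is not dump's actual code.

```python
def last_lower_level_dump(dumpdates, level):
    """Time of the most recent recorded dump at a level below `level`, or 0."""
    times = [when for lvl, when in dumpdates.items() if lvl < level]
    return max(times, default=0)


def files_to_dump(files, dumpdates, level):
    """Names of files an incremental at `level` would pick up in this model.

    `files` maps name -> (mtime, ctime); `dumpdates` maps level -> time.
    """
    since = last_lower_level_dump(dumpdates, level)
    return [name for name, (mtime, ctime) in files.items()
            if mtime >= since or ctime >= since]


dumpdates = {0: 100}                      # level 0 dump taken at t=100

# Normal case: only "b" changed after the level 0 dump.
before = {"a": (50, 50), "b": (150, 150), "c": (60, 60)}
print(files_to_dump(before, dumpdates, 9))

# After a full restore at t=200: mtimes are put back, but every ctime is 200.
after = {"a": (50, 200), "b": (150, 200), "c": (60, 200)}
print(files_to_dump(after, dumpdates, 9))
```

Under this assumption the first call selects only the changed file, while the second selects every file, which is the "incremental as large as a full" effect described in the post.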
jerry@oliveb.UUCP (Jerry F Aguirre) (08/29/87)
In article <8467@brl-adm.ARPA> KFL@AI.AI.MIT.EDU (Keith F. Lynch) writes:
>He has also said that after doing a restore of a zero level dump,
>it is necessary to immediately do another zero level dump or the file
>system will get hosed again. Is this really needed? If so, can it be
>done overnight, to /dev/null?

Half true. Under 4.1BSD and before, restor (sic) worked on the raw file system. This meant that the restored files had the same inode number and ctime. (The only thing different would be the actual data block addresses, and that is desirable.)

As of 4.2BSD, "restore" works on the mounted file system and does creat, write, link, etc. calls to put the files on disk. Because of this the inode numbers will almost certainly be different. Also, while restore can reset the atime and mtime using utimes(2), there is no system call to reset the ctime. So, the inodes are different and the ctime is the time of the restore.

The updated ctimes will force the next dump, of whatever level, to dump every file that was restored. If you are planning a small level 9 dump and the entire file system gets dumped, this can cause confusion. The best way to avoid confusion is to do another level 0 dump. The only rush is to do it before, or in place of, the next regular dump of that file system.

People have suggested playing with the system date or editing dumpdates. This will not fix anything, because the inode numbers have changed. That is the kind of thing that can cause corrupted files if you have to do another restore later.

The important thing to remember is that all the files really have been changed (as far as dump/restore is concerned) and will get dumped.

Jerry Aguirre
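The point about utimes(2) and ctime can be checked on a modern Unix-like system with a few lines of Python. This is a minimal sketch, assuming a POSIX filesystem where st_ctime is the inode change time; os.utime is used as a stand-in for the utimes(2) call that restore makes, and the helper name and temporary file are invented for the demonstration.

```python
import os
import tempfile
import time


def restore_like_touch(path, old_time):
    """Put atime and mtime back to old_time, then return the file's stat."""
    os.utime(path, (old_time, old_time))
    return os.stat(path)


# Create a scratch file to play the part of a freshly restored file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"restored contents\n")
    path = f.name

old_time = time.time() - 365 * 24 * 3600      # pretend the file is a year old
st = restore_like_touch(path, old_time)
os.remove(path)

# mtime follows the value we asked for; ctime cannot be set and stays "now".
mtime_reset = abs(st.st_mtime - old_time) < 5
ctime_is_recent = (st.st_ctime - old_time) > 300 * 24 * 3600
print("mtime reset to old value:", mtime_reset)
print("ctime still recent:", ctime_is_recent)
```

The file ends up looking a year old by mtime while its ctime still records the moment it was written, which is why a later dump treats it as changed.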