cwc@daisy.UUCP (C.W. Chung) (06/06/85)
(This may have already been posted; please excuse if so.)

Taking a full dump is a pain in the neck. Our VAX/750 has 2 Eagle drives and one 80 Mbyte removable drive. It takes literally a whole day to take a full dump of the entire system. Even dumping one file system at a time still requires more than 3 hours for a large partition (the h or g partition).

I would like to do the backup when the system is quiet, so how do I squeeze in the few hours needed? Usually late at night or on weekends; however, many folks on the system love to work nights and weekends. I could shut the system down, but I would rather not, considering that it would have to stay down for a few hours.

Does anyone out there have a much faster 'dump' program, or a faster way to run 'dump'? The existing 'dump' could clearly be improved significantly, since the Cipher tape drive is not streaming and most of the time the VAX is idle waiting for I/O.

Any ideas, comments, pointers? Please mail directly to me; I'll summarize if requested. Thanks.

C.W.Chung	Tel: (415)960-6976
{cbosgd,hplabs,ihnp4,seismo}!nsc!daisy!cwc
--
chris@umcp-cs.UUCP (Chris Torek) (06/09/85)
net.news.sa is not the right place for this, so I've stuck a Followup-To header in...

> Taking a full dump is a pain in the neck. Our VAX/750 has 2 Eagle drives
> and one 80 Mbyte removable drive. It takes literally a whole day to take
> a full dump of the entire system. [...] I guess I could shut down the
> system, but I would rather do it otherwise considering that I have to
> shut down the system for a few hours.

Running dump on active file systems is not a good idea. Dump scans the entire file system once first to see what to do, then assumes that nothing changes as it goes along. Locally, we compromise by doing full backups with the machine in single user mode and incrementals with the system active. At least we'll never have to go back more than two weeks....

> Has anyone out there with a much faster 'dump' programs, or a faster
> way to do 'dump'? I can see the existing 'dump' can be improved
> significantly since the Cipher Tape is not streaming and most of the
> time, the VAX is idle waiting for I/O.

Don Speck (at CalTech) and I had this one out fairly recently. He's got some changes to /etc/dump that make it use N processes; this does a good job of keeping the disk and tape drives active.

I solved the same problem a different way, by sticking a hack in the kernel that does pseudo-asynchronous I/O on character devices. (I call my hack the ``mass driver''. I think I'll bring something on it to Portland.) With my changes, we have cut the time to do backups literally in half on the big 780s, and down to a third on the 750s. (The 750s have streamers---TU80s---which can run 4 times as fast in streaming mode; that's what gives them their edge.) We used to take umcp-cs down on Wednesdays at 7:30 and have it back up by 1:30 or so; now it's back up around 10:30 (when all goes well; occasionally one of the two tape drives craps out).

I've also used the mass driver to make a version of ``dd'' that runs much faster; we use this for making distribution tapes.
(Makes quite a difference to have 20-level buffering when the load is around 7....)

For those who missed it the first time, I still have a copy of the original mass driver distribution kit. It's available via anonymous FTP from host MARYLAND.ARPA (grab the file mass_driver).

By the way, I discovered (quite by accident) that after increasing MAXBSIZE, stdio sometimes breaks on /dev/null: there are two fixed-size buffers (_sibuf and _sobuf) which are used for stdin and stdout, and the stat system call puts MAXBSIZE in the st_blksize field. Recompiling the C library fixes that. (I did consider this when increasing MAXBSIZE, but decided it wouldn't hurt, since no one was going to make 16k/2k file systems on their disks, so I figured st_blksize would always be <= 8K.) [I think Berkeley should do what Sun did: remove _sobuf and _sibuf from stdio and just use malloc.]

Also by the way, Don Speck's code is available from him (I believe). It takes a lot less work to install. I don't know for sure how it compares to my mass driver for performance.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland
cwc@daisy.UUCP (C.W. Chung) (06/18/85)
This is a summary of what I have heard from the net. Thanks to all the netters who have sent me info on this subject; I really appreciate your responses and thoughtful suggestions.

There are several ways to speed up the dumping: use concurrent processes to overlap the disk and tape I/O, modify the kernel to support higher throughput, increase the record size of dump/restore, or dump to disk instead of tape.

The 'caltech mod' by Don Speck (now at BRL?) makes use of concurrent processes to overlap the disk and tape I/O. The number of concurrent processes is apparently a tunable parameter; it is set to 3 by default: one process writes to the tape and two processes read from the disk. As Jeff Gilliam and others pointed out, Don Speck posted the source modification to the /etc/dump program a while ago. I went back to net.sources and found it (id: 8339brl-tgr.arpa, posted 2/24/85). It replaces dumptape.c of the 4.2 /usr/src/etc/dump. Making and installing it is quite simple, and I had it up and running very quickly.

I have not tried it on a large file system, but the results of dumping the root partition did not measure up to my expectation: I found no substantial improvement at all (4:39 with the 4.2 dump vs. 4:33 with the modified dump, for 4996 tape blocks). This is probably due to the small size of the file system. With an Eagle disk, the root partition spreads over 100 cylinders, although a minimum of 16 cylinders is enough to hold 7.8 Mbyte. With such a small file system there is no significant advantage in creating several processes to overlap I/O. I expect the time saving on a larger file system to be substantial, however.

Don's mod does a good job of overlapping the I/O activity. However, it does not solve the overhead problem in the UNIX file system; the kernel has to be modified to support a higher sustained I/O transfer rate. Apparently Don is also working on that (the forthcoming raw I/O speedup?). Chris Torek of Univ. of Maryland has a mod called the 'mass driver' (to move huge blocks of data at a high rate?), which he said he may bring to USENIX. That should be a welcome speedup.

Another way is to use a bigger blocking factor (the default is 1K blocks, 10 blocks per record). There is an undocumented option '-b' to do just that; however, 'restore' does not support this option!! Dave Barto has a version of restore that understands '-b'.

Roy Smith makes a suggestion which I have also been contemplating for some time: dump to disk and then write the image to tape afterwards. This shortens the time that the system must be quiet or shut down. Obviously you need independent seek arms; the disk you are writing to had better not be the one you are dumping from. You also need to allocate a reasonably big space for the dump; one candidate is the swap partition. Another possibility is to use a different disk pack for the dump if you have a removable disk drive. The tape density (-d) and tape length (-s) options can be used to control how many blocks are dumped to the device, and so should be useful for multi-reel dumps as well.

This mode of dumping is particularly useful for incremental dumps, but I am more interested in shortening the time to do full, multi-reel dumps. If the dump fits on one tape, I can load the tape and kick off the dump in the early morning without human assistance. Backup is annoying only because I have to sit around waiting for the dump to finish one tape so that I can put in another reel. I have hacked up dump to give me an estimate of the tape to be used, so that I can decide whether to take the dump or not.

Looks like with a combination of concurrent processes, a raw I/O speedup, and a larger blocking factor, we could really speed up dump/restore a lot. I am looking forward to seeing the postings from Don Speck, Chris Torek and Dave Barto.
I am also looking forward to seeing actual measurements of the various dump speedups. I only have the original posting from Don Speck; I'll be happy to forward it to anyone who missed it. Thanks.

C.W.Chung
-- 
{cbosgd,fortune,hplabs,ihnp4,seismo}!nsc!daisy!cwc
Chi Wo Chung, Daisy System Corp, 700 Middlefield Road, Mountain View CA 94039. (415)960-6976