krowitz@RICHTER.MIT.EDU (David Krowitz) (12/02/88)
Can a node running SR9.7 do backups for nodes running SR10 that are on the same ringnet? The SR10 transition class I took could not come up with a conclusive answer. SR9.7 nodes are supposed to be able to have file access to SR10 nodes and vice versa, but the class supervisor wasn't certain whether the SR9.7 version of WBAK/RBAK could handle the SR10 ACLs correctly, and he could not find out whether there was a version of the SR10 WBAK/RBAK that had been compiled to run under SR9.7. I have not found anything in the SR10 release notes (as far as I've read so far) that gives me any hints, and my tape drive is attached to a DSP80, which cannot be upgraded to run SR10. Can anyone give me a definitive answer as to whether I'll be able to run backups on my SR10 master node?

-- David Krowitz
krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
donp@CAEN.ENGIN.UMICH.EDU (Don Peacock) (12/02/88)
> Can a node running SR9.7 do backups for nodes running SR10 that are on the
> same ringnet? <deleted text> Can anyone give me a definitive answer as to
> whether I'll be able to run backups on my SR10 master node?
>    -- David Krowitz

We have 400 SR9.7 nodes and maybe 6-10 SR10 nodes. Our backups are done exclusively on SR9.7 and we have noticed NO problems.

Don Peacock
University of Michigan
donp@caen.engin.umich.edu
lnz@LUCID.COM (Leonard N. Zubkoff) (12/02/88)
I have been having an SR9.7 node back up my SR10 nodes for several months now without any problems. Should a restore be necessary, however, the ACLs created on the SR10 node would not be quite correct; the SR10 required entries for person, group, and organization would not be set, and the SR9.7 tape ACL would show up as an SR10 extended ACL. My understanding is that the SR9.7 node only "sees" SR9.7-format ACLs; when 9.7 accesses 10, or vice versa, appropriate mappings take place so that each node is happy examining the other's ACLs. The bottom line is that the data will be preserved just fine, but the ACL information will need some fixup on a restore.

Leonard
wicinski@nrl-cmf.UUCP (Tim Wicinski) (12/02/88)
Will the 4.3 that Apollo ships with SR10 include "dump" and "rdump"? Will they ever fix their compiler (re NFS), or will we be forced to abandon them for other vendors? Remember, their "4.2" was anything but...

tim
jec@iuvax.cs.indiana.edu (James E. Conley) (12/02/88)
You'll have to abandon them for another vendor... SR10 does not include dump, restore, or their remote versions. This makes some sense if you allow for Apollo's different file system formats, but it is still a pain since wbak and rbak are pretty primitive (and slow). I'm still waiting for Mach (not from Apollo, I hope!).

James E. Conley
Indiana University, Dept. of Computer Science
004 Lindley Hall, Bloomington, IN 47405
Usenet: iuvax!jec   ARPANet: jec@iuvax.cs.indiana.edu   Phone: (812) 855-7729
dclemans.falcon@mntgfx.mentor.com (Dave Clemans) (12/03/88)
From article <152@nrl-cmf.UUCP>, by wicinski@nrl-cmf.UUCP (Tim Wicinski):
> Will they ever fix their compiler (re NFS) or will we be forced to abandon
> them for other vendors?

Presumably you are talking about the ability to run Apollo binaries stored on non-Apollo disks via NFS (or something similar). The problem is not the compiler; the "problem" is the high degree of intelligence in the program loader. In contrast to typical Unix systems, the Apollo loader just pages the program in from the disk on the remote node. The system is unable to do virtual memory paging over NFS, so the program can't be loaded. Other problems involve file typing (something that doesn't exist in NFS).

The only way I can think of to implement this ability is:

1) If, while going through directories looking for a program to execute, you cross an NFS boundary, don't check the Apollo file type; just assume that it is a COFF-format file.
2) Create a temporary file on the local node and copy the file from the remote node to the local temporary file.
3) Execute the program from the temporary file, arranging to delete the file when the program exits.

This would let you execute programs from NFS disks, though at a performance cost proportional to the size of the program.

dgc
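[A minimal sketch of the copy-then-exec idea Dave describes, using only standard POSIX calls; the function name, buffer size, and /tmp location are illustrative, not anything Apollo ships:]

/*
 * Copy a program from an NFS-mounted path to a local temporary file,
 * run it, and delete the copy when it exits.  Error handling is
 * minimal; this is a sketch, not production code.
 */
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>

int run_nfs_program(const char *remote_path, char *const argv[])
{
    char tmp[] = "/tmp/nfsexecXXXXXX";
    char buf[8192];
    ssize_t n;
    pid_t pid;
    int in, out, status = -1;

    if ((in = open(remote_path, O_RDONLY)) < 0)
        return -1;
    if ((out = mkstemp(tmp)) < 0) {        /* local temporary copy */
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0)
        write(out, buf, n);                /* sketch: no write checks */
    close(in);
    fchmod(out, 0700);                     /* make the copy executable */
    close(out);

    if ((pid = fork()) == 0) {
        execv(tmp, argv);                  /* child: run the local copy */
        _exit(127);
    }
    if (pid > 0)
        waitpid(pid, &status, 0);
    unlink(tmp);                           /* delete the copy afterwards */
    return status;
}

[The performance cost Dave mentions is visible here: the whole binary crosses the network once before a single instruction runs.]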
krowitz@RICHTER.MIT.EDU (David Krowitz) (12/03/88)
A DSP80 can only have 1.5MB of memory on it, which is not enough to run SR10 (which won't boot unless you have 2MB or more, according to my instructions). The DSP80A and the DSP90 can have 3MB of memory, so they are OK.

-- David Krowitz
krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
achille@cernvax.UUCP (achille) (12/04/88)
In article <15471@iuvax.cs.indiana.edu> jec@iuvax.UUCP (James E. Conley) writes:
> You'll have to abandon them for another vendor... SR10 does not include
> dump, restore, or their remote versions. This makes some sense if you allow
> for Apollo's different file system formats, but it is still a pain since wbak
> and rbak are pretty primitive (and slow).

Actually, I'm using both (dump+restore) and (r/wbak) almost daily. While I agree that w/rbak are painfully slow, they actually allow you to save AND restore your files in a PREDICTABLE way. The same is not true for dump and restore. One of the biggest problems with dump is that it does not produce a list of saved files; how do you know, a few weeks after the backup, which file is on which tape? Also, dump stores files in 'random' order on tape (I think it is disk block ordering); if you want to restore 2 files that were in the same directory, chances are they are NOT on the same tape (yes, we have dumps that span up to 20 IBM 3480 cartridges). Can you imagine the pleasure of mounting 20 cartridges to find out the file you want is on the last tape? With the wbak output, I can just pick up the right tape and read only that one!

I care about backup speed, but I also think that speed is not everything! Your information should not only be saved, it should also be retrievable! It should be an easy task to do that; operators should be able to do it! Do you know that after restoring a full dump, you are supposed to run another full dump a.s.a.p.? That's what the Cray doc says, without a decent explanation of this requirement. Why? I think dump and restore are pretty awkward pieces of software; they are just fast. I would stop using them tomorrow if I had w/rbak on the Cray. Probably Apollo hasn't produced very remarkable software, but w/rbak are real neat (in functionality if not in speed), and that should be said.

Achille Petrilli
Cray & PWS Operations
pha@CAEN.ENGIN.UMICH.EDU (Paul H. Anderson) (12/04/88)
wbak and rbak are sufficient to do daily backups of a ring of 450 nodes with around 80 gigs of disk storage. rbak and wbak are hardly "primitive". Slow, yes; but primitive, no way.

Paul Anderson
CAEN
jec@IUVAX.CS.INDIANA.EDU (James E. Conley) (12/05/88)
Well, by primitive I meant that they lacked some very useful features of dump and restore, namely multiple levels of backup and tape retries if you put a bad tape on. It is better than nothing, but I'd prefer dump and restore any day. Not to mention the ability to do dumps on other machines (a VAX to an Alliant, not just from Apollo to Apollo) using rdump and rrestore. Also, dump dates aren't updated until the backup finishes. And of course, wbak/rbak are slow.

I am curious, though, what method you use to back up all that data. We have only about 4GB here, and it takes several hours to do an incremental of that much data even when it only really writes about 1/5 of a tape. I'll admit that I'm not fluent in AEGIS, but since they are supposed to run UNIX I would hope that I wouldn't need to be.
krowitz@RICHTER.MIT.EDU (David Krowitz) (12/05/88)
The reason that incremental backups with WBAK are so slow is that you wind up having to touch every file on the disk to check the date/time it was last modified. Unfortunately, the Apollo file system does not store the DTM in the directory entry of the file, so WBAK must open each file in addition to opening and reading the parent directory.

One way in which backups can be sped up is the method used by Workstation Solutions' backup product. They start clients on several nodes which all feed data back to a server which writes the tape. Since the clients run independently of each other, they can process several disks simultaneously and send the server buffers of data which have already been formatted for the backup tape. The only drawback to this approach is that you wind up with files from multiple disks all interleaved in a single backup file on the tape rather than in separate backups. It is easier to retrieve files from a backup when you know for certain which tape it is on. Incremental backups, however, are frequently done with several disks all on a single tape, in which case the method used by Workstation Solutions gives the same results a whole lot faster.

-- David Krowitz
krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
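[The cost David describes is easy to picture: an incremental pass has to stat every file just to learn whether it changed. A minimal POSIX illustration of that scan follows; this is not wbak's actual code, and nftw() and the day-based cutoff are just for the sketch:]

/*
 * Walk a tree and report files modified since a cutoff.  Every file
 * costs one stat() -- the per-file touch that makes incremental
 * passes slow when the directory entry doesn't carry the DTM.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <ftw.h>

static time_t since;   /* time of the previous backup */

static int check(const char *path, const struct stat *st,
                 int type, struct FTW *ftwbuf)
{
    if (type == FTW_F && st->st_mtime > since)
        printf("needs backup: %s\n", path);
    return 0;           /* keep walking */
}

int main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s dir days-ago\n", argv[0]);
        return 1;
    }
    since = time(NULL) - atol(argv[2]) * 24L * 60 * 60;
    return nftw(argv[1], check, 16, FTW_PHYS) != 0;
}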
markley@celece.ucsd.edu (Mike Markley) (12/06/88)
In article <8812051455.AA06261@richter.mit.edu> krowitz@RICHTER.MIT.EDU (David Krowitz) writes:
> One way in which backups can be sped up is the method used by Workstation
> Solutions' backup product. They start clients on several nodes which all
> feed data back to a server which writes the tape. <deleted text>

I have read in the SR10 documentation that rbak/wbak will write to a file, so it would be possible to create a backup directory on every node and then run wbak as a server that writes to the backup directory. You could set it up so that wbak ran at some odd hour, and then every morning it would only be necessary to copy the backup directories to tape. This would be faster since you could always do a full backup, and then wbak would not have to check the dates on the files. This is the strategy that I plan to implement when I upgrade to SR10 some time in the future.

Mike Markley
markley@celece.ucsd.edu
jec@iuvax.cs.indiana.edu (James E. Conley) (12/06/88)
That sounds like an excellent idea (backing up to files). I think it will probably solve most of my objections about the speed of backups (since doing a distributed search will undoubtedly speed things up), but there is still the problem that wbak writes slowly to the tape. I believe 4.3 dump speed should be achievable for a single file, at least as far as keeping a streaming tape drive busy enough to occasionally stream.

I guess with a fast network you could always just rcp these files to some machine with a faster tape system.

James E. Conley
Indiana University, Dept. of Computer Science
004 Lindley Hall, Bloomington, IN 47405
Usenet: iuvax!jec   ARPANet: jec@iuvax.cs.indiana.edu   Phone: (812) 855-7729
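[For the copy-the-backup-file-to-a-faster-tape idea, the copy itself is simple; what matters for streaming is writing large fixed-size blocks, much as dd does. A rough sketch assuming a generic Unix tape device -- the device name and the 64KB block size are illustrative:]

/*
 * Copy a backup image to a tape device in large blocks so the drive
 * has a chance to stream.  Minimal error handling; a sketch only.
 */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

#define BLK (64 * 1024)   /* large blocks help keep the drive moving */

int main(int argc, char *argv[])
{
    static char buf[BLK];
    ssize_t n;
    int in, out;

    if (argc != 3) {
        fprintf(stderr, "usage: %s backup-file tape-device\n", argv[0]);
        return 1;
    }
    if ((in = open(argv[1], O_RDONLY)) < 0 ||
        (out = open(argv[2], O_WRONLY)) < 0) {
        perror("open");
        return 1;
    }
    while ((n = read(in, buf, BLK)) > 0)
        if (write(out, buf, n) != n) {
            perror("write");
            return 1;
        }
    return 0;
}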
mkhaw@teknowledge-vaxc.ARPA (Mike Khaw) (12/06/88)
<15567@iuvax.cs.indiana.edu>, by jec@iuvax.cs.indiana.edu (James E. Conley):
[regarding backup ...]
> I guess with a fast network you could always just rcp these files to
> some machine with a faster tape system.

But if SR10 behaves like SR9.x, when you rcp back from the remote system/tape drive, certain files like binary executables won't have their file type set to "obj" anymore, so you'll have to /com/obty them.

Mike Khaw
--
internet: mkhaw@teknowledge.arpa
uucp:     {uunet|sun|ucbvax|decwrl|ames|hplabs}!mkhaw%teknowledge.arpa
hardcopy: Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303
donp@CAEN.ENGIN.UMICH.EDU (Don Peacock) (12/06/88)
From: krowitz@richter.MIT.EDU
Subject: Re: more SR10 questions

> One way in which backups can be sped up is the method used by Workstation
> Solutions' backup product. <deleted text> the method used by Workstation
> Solutions gives the same results a whole lot faster.

As Paul Anderson (pha@caen.engin.umich.edu) stated, we have 450 Apollos and do daily incrementals (around 2 gigs/day) and weekly full backups. We use some home-grown software to keep up with this mess, and I will quickly try to explain how it works. I have left out most of the specifics, but it really does work, and better than I had expected while designing it.

Incrementals:

1) We have a bank of nodes (6 DN4000's with 329MB formatted disks) which are used for storing the incremental trees (more about this later).

2) Each node tries to do an incremental backup every half hour through cron. What it actually tries to do is a cpt of the appropriate trees to an incremental node; this is where the date/time stamp is checked.

3) We have a locking mechanism for limiting the number of nodes cpt'ing to an incremental node at one time (currently this is set at 6; a sketch of one such lock appears after this post). Simple math tells us that this gives us a maximum of 36 concurrent cpt's at any given time. The incremental code also watches the incremental node's disk space and aborts if/when it thinks it can no longer finish and leave a certain amount of disk space on the incremental node (this is a buffer zone which is needed when the incremental node later goes to tape with its data; currently 10MB).

4) Currently backup operators dump these incremental nodes to magtape, and by around 10 or 11:00 am we have our incrementals for the day done and on tape. We are currently completing all but 6-10 nodes per day, and those nodes are not getting done due to hardware problems etc.

5) We can easily monitor which disks have not done their incrementals because it mails us a list every morning of the disks that have not done their backups for three consecutive days (this morning there were 11 nodes in this category for one reason or another). There is a backup person responsible for checking out these problem nodes each morning and responding to the rest of the backup group with the cause and status.

6) We use our full backup code to clean off the incremental disks each day, automatically deleting the incremental trees once they are safely put to tape and logged. Our logging automatically keeps listings of wbaks and creates the labels for the tapes, so our restore program (rest_req) can easily let a backup operator know which tapes need to be mounted, etc.

7) We have bought a couple of 8mm tape drives and are going to automate our incrementals further, by allowing the backup operators to simply swap tapes once a day for incrementals instead of using 10-20 magtapes each day.
Full backups:

1) A network-wide logging scheme is used (similar to incrementals) so we can keep track of ANY node that has not had a full backup in the last 7 days.

2) To run a full backup, a backup operator simply crp's onto a node with a magtape and runs our backup code. He then simply follows the instructions (i.e. load tape, swap tape, label tape with xxx.xx..., etc.).

3) ALL the logging etc. is taken care of automatically.

Now that I have tried to explain how we do our backups, I would like to make some comments that don't necessarily relate to backups but relate to this newsgroup, and that I feel sure many of you will take issue with:

1) Although wbak and rbak are slow, it is because of what they do and how they intelligently interact with a VERY ROBUST network file system (NOT NFS).

2) The tools for managing a LARGE network of Apollos are NOT there, but the underlying capabilities for creating these tools are available, and they are taken for granted by the majority of those people who constantly flame Apollo for not being a vanilla Unix machine. (Thank GOD it's not, because we couldn't keep 450 vanilla Unix machines happy without at least ten times the effort that it takes us to manage the Apollos.)

3) I don't agree with everything Apollo has done over the past couple of years, but I do know that my job is easier because of their capabilities and Apollo's efforts not to be pulled backwards into the stone age by a group of people worshipping an operating system that was never intended for anything more than a standalone machine. I do like the Unix interface, but this beauty is only skin deep and needs a STRONG underlying structure to give us the ability to manage an entire network as a single machine.

Don Peacock
University of Michigan
donp@caen.engin.umich.edu
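[The locking mechanism in step 3 of Don's incremental scheme can be pictured with a toy version: a fixed number of slot files per incremental node, grabbed with atomic exclusive creates. This is only a sketch of the idea -- the paths, slot count, and function names are made up, not Don's actual code:]

/*
 * Each incremental node offers SLOTS lock files; a client may start
 * its cpt only if it can exclusively create one of them.  O_EXCL
 * creation is atomic, so two clients can't grab the same slot.
 */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

#define SLOTS 6   /* max concurrent cpt's per incremental node */

/* Returns a slot fd >= 0 on success, -1 if the node is busy. */
int grab_slot(const char *lockdir, char *path, size_t pathlen)
{
    int i, fd;

    for (i = 0; i < SLOTS; i++) {
        snprintf(path, pathlen, "%s/slot.%d", lockdir, i);
        fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd >= 0)
            return fd;        /* got a slot */
    }
    return -1;                /* all slots in use; try again later */
}

void release_slot(int fd, const char *path)
{
    close(fd);
    unlink(path);             /* free the slot for the next client */
}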
crgabb@sdrc.UUCP (Rob Gabbard) (12/06/88)
In article <5627@sdcsvax.UCSD.EDU>, markley@celece.ucsd.edu (Mike Markley) writes:
> I have read in the SR10 documentation that rbak/wbak will
> write to a file so it would be possible to create a backup
> directory on every node and then run wbak as a server that
> writes to the backup directory.

This would be a nice solution except for the fact that you would have to have as much free space on each disk as you have used space (for a complete backup), or be sure that you have as much free space as the amount that has changed since the last backup (for an incremental).

At the ADUS conference I sat through a VERY interesting talk on NBS, the Apollo Network Backup System. With NBS, Apollo seems to be addressing all of these backup complaints and much more. It has its own scheduling language and is designed to live in a heterogeneous world. I'm not sure about release info.

--
Rob Gabbard (uunet!sdrc!crgabb)          _ /|
Workstation Systems Programmer           \'o.O'
Structural Dynamics Research Corporation =(___)=
                                            U
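[Rob's free-space objection can at least be tested up front, before a backup-to-file run begins. A rough POSIX sketch, assuming statvfs() is available; the 10MB cushion echoes the buffer zone Don Peacock described and is otherwise arbitrary:]

/*
 * Check whether a backup directory's volume has room for a backup
 * image of a given size, plus a safety cushion.
 */
#include <stdio.h>
#include <sys/statvfs.h>

#define CUSHION (10L * 1024 * 1024)   /* leave 10MB spare */

/* Returns 1 if `dir` has room for `need` bytes plus the cushion. */
int room_for_backup(const char *dir, long long need)
{
    struct statvfs vfs;
    long long avail;

    if (statvfs(dir, &vfs) != 0) {
        perror("statvfs");
        return 0;   /* can't tell; play it safe */
    }
    avail = (long long)vfs.f_bavail * vfs.f_frsize;
    return avail > need + CUSHION;
}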
giebelhaus@hi-csc.UUCP (Timothy R. Giebelhaus) (12/13/88)
In article <8812041914.AA20493@umix.cc.umich.edu> jec@IUVAX.CS.INDIANA.EDU (James E. Conley) writes:
> Well, by primitive I meant that they lacked some very useful features
> of dump and restore, namely multiple levels of backup and tape retries if you
> put a bad tape on. It is better than nothing, but I'd prefer dump and restore
> any day. Not to mention the ability to do dumps on other machines (a VAX to
> an Alliant, not just from Apollo to Apollo) using rdump and rrestore. Also,
> dump dates aren't updated until the backup finishes.

I assume what you are looking for here is a set of levels of dumps. You can do this with wbak. Simply do your full backup just as you would your level 0 with dump. Then do each incremental with the -nhi switch. The wbak facility will forget that it did the incremental, so the next time wbak does an incremental backup, it will do the incremental since the last full backup. You can get much more sophisticated by manipulating the backup_history files. For example, you can get multiple levels of dumps by saving multiple backup_history files. Though this is more complicated, it can be handled through scripts.

> And of course, wbak/rbak are slow.

It is not wbak that is slow, it is the hardware. If you had a tape drive connected which would stream, you would see much faster backups of the local disk. Many sites I have been to will not use rdump because it is too slow to back up over the network. The wbak program is also bound by the speed of the network. It has already been brought up that one can back up to a file. The file can be over the network. Finally, there is the new product which was mentioned.

--
UUCP: uunet!hi-csc!giebelhaus         UUCP: tim@apollo.uucp
ARPA: hi-csc!giebelhaus@umn-cs.arpa   ARPA: tim@apollo.com
Tim Giebelhaus, Apollo Computer, Regional Software Support Specialist.
My comments and opinions have nothing to do with work.
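[To make the backup_history juggling concrete, here is a rough sketch of the stash-and-restore step. The backup_history location below is a guess -- check where your release actually keeps it -- and in practice this would likely be a script rather than C; the point is only the pattern: save a copy before a higher-level incremental, and put it back to rewind wbak's record of the last backup.]

#include <stdio.h>

#define HISTORY "/sys/backup_history"   /* hypothetical location */

/* Copy src to dst a block at a time; returns 0 on success. */
static int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    FILE *out = fopen(dst, "wb");
    char buf[8192];
    size_t n;

    if (!in || !out) {
        if (in) fclose(in);
        if (out) fclose(out);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        fwrite(buf, 1, n, out);
    fclose(in);
    fclose(out);
    return 0;
}

/* Before a level-N incremental: stash the current history. */
int save_history(int level)
{
    char stash[64];
    snprintf(stash, sizeof stash, "%s.level%d", HISTORY, level);
    return copy_file(HISTORY, stash);
}

/* To dump "since level N" again later: restore that stash. */
int restore_history(int level)
{
    char stash[64];
    snprintf(stash, sizeof stash, "%s.level%d", HISTORY, level);
    return copy_file(stash, HISTORY);
}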
ross@sword.ulowell.edu (Ross Miller) (12/16/88)
> I think it will probably solve most of my objections about the speed of
> backups (since doing a distributed search will undoubtedly speed things up),
> but there is still the problem that wbak writes slowly to the tape. I believe
> 4.3 dump speed should be achievable for a single file, at least as far as
> keeping a streaming tape drive busy enough to occasionally stream.
>
> I guess with a fast network you could always just rcp these files to
> some machine with a faster tape system.

What I find works is running wbak on a machine that is very lightly loaded and has the right configuration. I can occasionally stream a Cipher 6250 9-track off of a DSP80 if I run from the single-user shell. The tape will not stop moving otherwise; but if I attempt to run that node with spm, ethernet, llbd, and other stuff, then the 3.0MB system just won't keep up and the drive stops moving.

Ross