[comp.unix.questions] Checkpointing For Unix?

bill@inls1.ucsd.edu (Bill Reynolds) (04/23/91)

Greetings,
	We are a computational physics group running a network of Sun
and SGI workstations. We often have long-running jobs on many of our
machines. This causes problems when a machine that has to be taken
down is running a job in the third day of a five-day run. What we
would like is a routine that checkpoints a job to a disk file so it
can later be reloaded into memory. I've looked at undump, but it isn't
adequate: we need to restart the job where it was interrupted. I've
also looked at Condor, but it seems to be a
fly-with-a-sledgehammer type of solution. I'm wondering if there are
any simple Unix/Sun/SGI utilities for checkpointing. (I know that
such facilities exist for Crays.)
							Thank you very much,
--
_______________________________________________________________________
						|  Bill Reynolds
	  				 	|  bill@inls1.ucsd.edu