[net.unix-wizards] restarting core images

tom@puff.UUCP (Tom Christiansen) (04/04/86)

OS:   BSD4.3 UNIX w/ Sun NFS
CPU:  MicroVax II WorkStations, Vax 7[58]0's

Does anyone out there have any experience with restarting core images as
generated either by the core() function out of sys/kern_sig.c or by the
gcore program?   The end goal is to be able to checkpoint a process for
eventual restarting, possibly but not necessarily on a different machine.
I see two major difficulties: retaining open files and restoring all 
three segments.

I think I can deal with retaining open files, provided we're talking
about a real file rather than a socket or a device: just record where
the current r/w pointer is, seek to the start of the file and write its
contents somewhere, like core.pid.fd0, core.pid.fd1, etc.  Then when you
want to restart the core image, open these files and seek to the correct
point.  This way you don't even have to know the names of the files.
Anyone think of any reason that would require knowledge of the file
names themselves?
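
A minimal sketch of the open-file idea above, assuming POSIX-style
open/read/write/lseek.  The function names (snapshot_name, checkpoint_fd,
restore_fd) are made up for illustration and are not from any existing
program.  The sketch only works for seekable descriptors that are open for
reading, and the restored descriptor is not guaranteed to get its old
number (a dup2() would be needed for that).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Build a snapshot name of the form core.<pid>.fd<N> (hypothetical helper). */
int snapshot_name(char *buf, size_t len, pid_t pid, int fd)
{
    assert(buf != NULL);
    return snprintf(buf, len, "core.%ld.fd%d", (long)pid, fd);
}

/*
 * Copy the whole file behind fd into the snapshot file `path`.
 * Returns the descriptor's original offset, or -1 on failure.
 * The descriptor's offset is put back where it was before returning.
 */
off_t checkpoint_fd(int fd, const char *path)
{
    char buf[4096];
    ssize_t n;
    int ok = 1;

    assert(path != NULL);

    /* Record the current r/w pointer; this fails for pipes and sockets. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    if (pos == (off_t)-1)
        return -1;

    int out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0)
        return -1;

    /* Seek to the start of the file and copy everything out. */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
        close(out);
        return -1;
    }
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        if (write(out, buf, (size_t)n) != n) {
            ok = 0;
            break;
        }
    }
    if (n < 0)
        ok = 0;
    if (close(out) < 0)
        ok = 0;

    /* Put the original offset back so the running process is undisturbed. */
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        ok = 0;

    return ok ? pos : -1;
}

/* On restart: reopen the snapshot and seek to the saved offset. */
int restore_fd(const char *path, off_t pos)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

One thing the sketch makes visible: the restored descriptor refers to the
copy, not the original file, so later writes land in the snapshot.  That is
one case where the original names might matter after all.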

I know how rogue saves things; it just writes out its entire data
segment, does an sbrk() of the right size on restart, and just reads
the old data segment back in there.  But how do you restore the old stack?
Could you use the same trick but instead of sbrk(), just call a
recursive function until your stack is big enough?
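
A rough sketch of the recursive-function idea, under the assumption that
the stack grows toward lower addresses (true on the VAX and most current
machines, but not guaranteed by C).  The names grow_stack_to, note_depth
and deepest_seen are made up for illustration.  It only shows how to push
the stack down to a target address; copying the saved stack contents over
the live one is a separate and much more delicate step that is not
attempted here.

```c
#include <assert.h>
#include <stdint.h>

/* Callback run from the deepest frame; gets that frame's padding address. */
typedef void (*deep_fn)(uintptr_t frame_addr);

/* Example callback: remember how deep the recursion got. */
uintptr_t deepest_seen;

void note_depth(uintptr_t frame_addr)
{
    deepest_seen = frame_addr;
}

/*
 * Recurse until a local in the current frame lies at or below `target`,
 * then call `then` from that depth.  Assumes a downward-growing stack;
 * on an upward-growing stack this would recurse until it overflowed.
 */
void grow_stack_to(uintptr_t target, deep_fn then)
{
    volatile char pad[1024];   /* each call claims roughly 1K of stack */

    assert(then != 0);
    pad[0] = 0;

    if ((uintptr_t)&pad[0] > target)
        grow_stack_to(target, then);
    else
        then((uintptr_t)&pad[0]);

    /* Touch the frame after the call so it cannot become a tail call. */
    pad[1023] = pad[0];
}
```

A caller would take the address of one of its own locals, subtract the
size of the saved stack, and pass the result as the target.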


Thanks,
tom

-- 
Thomas Scott Christiansen, Project Assistant
Computer Systems Lab
Department of Computer Sciences
University of Wisconsin, Madison

UUCP: ...!{harvard,ihnp4,seismo,topaz}!uwvax!tom
ARPA: tom@{rsch,crys,limburger,puff,gumby,pokey,devo}.wisc.edu

ksh@rtgvax.UUCP (Kent S. Harris) (04/08/86)

In article <741@puff.UUCP>, tom@puff.UUCP (Tom Christiansen) writes:
> OS:   BSD4.3 UNIX w/ Sun NFS
> CPU:  MicroVax II WorkStations, Vax 7[58]0's
> 
> Does anyone out there have any experience with restarting core images as
> generated either by the core() function ...

There is a simple program (sorry, I don't have it on the machine I'm
currently on, but I can get you a copy if you ask me by E-mail) distributed
in the public domain by the folks who turn out the TeX distribution
tape (it's called undump).  `Undump' takes an original a.out and a core
file (generated by abort() for example) and builds a new a.out image
with all of the original .text as .data.  Be advised, not all vendors
have remained true to the ZMAGIC (413) a.out format.  Sun (vax, etc) allocates
an entire UPAGE for the exec structure on the front of a 413 file, whereas ISI
allocates just the exec struct immediately followed by the text segment (the
text size includes this header).  ISI should have cobbed up a new magic number,
as there is nothing sacred about "413".
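
To make the layout difference concrete, here is a small sketch of how the
file offsets of text and data come out under the two conventions described
above.  The struct is modeled on the classic a.out header with 32-bit
fields assumed; the enum, the function names and the page size passed in
are illustrative assumptions, not taken from any vendor's a.out.h.

```c
#include <assert.h>
#include <stdint.h>

#define ZMAGIC 0413   /* demand-paged executable */

/* Modeled on the classic a.out exec header; 32-bit fields are assumed. */
struct exec_hdr {
    uint32_t a_magic;   /* magic number */
    uint32_t a_text;    /* size of text segment */
    uint32_t a_data;    /* size of initialized data */
    uint32_t a_bss;     /* size of uninitialized data */
    uint32_t a_syms;    /* size of symbol table */
    uint32_t a_entry;   /* entry point */
    uint32_t a_trsize;  /* size of text relocation */
    uint32_t a_drsize;  /* size of data relocation */
};

/* The two 413 conventions described in the post (names are hypothetical). */
enum hdr_layout {
    HDR_PADDED_TO_PAGE, /* header occupies a whole page by itself */
    HDR_INSIDE_TEXT     /* header is counted as the start of the text */
};

int has_zmagic(const struct exec_hdr *h)
{
    assert(h != 0);
    return h->a_magic == ZMAGIC;
}

/* File offset of the first byte of program text. */
uint32_t text_file_offset(enum hdr_layout layout, uint32_t page_size)
{
    assert(page_size > 0);
    return layout == HDR_PADDED_TO_PAGE
        ? page_size
        : (uint32_t)sizeof(struct exec_hdr);
}

/* File offset of the first byte of initialized data. */
uint32_t data_file_offset(enum hdr_layout layout, uint32_t page_size,
                          uint32_t a_text)
{
    assert(page_size > 0);
    /* When a_text already includes the header, data starts at a_text. */
    return layout == HDR_PADDED_TO_PAGE ? page_size + a_text : a_text;
}
```

Since both layouts carry the same magic number, a tool like undump cannot
tell them apart from the header alone and has to be built knowing which
convention the target system uses.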