andrew@comp.vuw.ac.nz (Andrew Vignaux) (06/01/89)
I've hit a small problem trying to port KCl to our 835.

KCl uses dynamic loading [ld -A] to load its object files into
memory. However, it has appended some text to the object file which
it loads separately. All the lds that I have seen before, allow extra
rubbish on the end of object files, but the 835 loader says

    /bin/ld: foo.o: Not a valid object file (invalid system id)

if the length of the data is >= 128 bytes. Interestingly, it works
fine if the length is < 128 bytes. [The system-id is valid in both
cases!] Any ideas?

Another problem I am having, is what to do with the object file after
I have loaded it. I read the object module's header to determine how
much space I should allocate in memory. I allocate the space, and
pass the starting address to "ld -A bar -R %x -o baz". The object file
that I get back has a number of interesting properties

 - the starting address of the text segment has been rounded up to a
   page boundary -- is there anything in the architecture that
   requires this?
 - the starting address of the data segment has also been rounded up
   to a page boundary. Again is there any real reason for this?
 - I want to branch to the first routine in the file. The
   inter-space stub seems to be at TEXT+4 (*TEXT is a break). Is
   there a better way to find this?
 - the header gives a different size than the one I worked out
   earlier [Surprise, surprise]. Any suggestions on a better size
   predictor (I am currently using size+PAGESIZE).

I guess I should write my own linker :-(

While I've got your attention :-), are there any 9000/800 assembler
gurus out there?

I've got the following C declaration:

    extern struct character character_table[];

which KCl indexes with a character. Because characters are signed on
some machines, the space for the array is defined in an assembler file

    .globl _character_table
    .space 1024
    _character_table:
    .space 1024

in the appropriate syntax for the particular machine. Note: the label is in
the middle of the space. I've been able to put this in the DATA
subspace but not in the BSS subspace. Any thoughts on how to get this in
the BSS subspace? Is it possible to get the assembler to define a
symbol that is the value of another symbol + an offset?

Andrew
--
Domain address: andrew@comp.vuw.ac.nz
Path address: ...!uunet!vuwcomp!andrew
shankar@hpclscu.HP.COM (Shankar Unni) (06/03/89)
> I've got the following C declaration:
>
>     extern struct character character_table[];
>
> which KCl indexes with a character. Because characters are signed on
> some machines, the space for the array is defined in an assembler file
>
>     .globl _character_table
>     .space 1024
>     _character_table:
>     .space 1024
>
> in the appropriate syntax for the particular machine. Note: the label is in
> the middle of the space. I've been able to put this in the DATA
> subspace but not in the BSS subspace. Any thoughts on how to get this in
> the BSS subspace?

        .space  $PRIVATE$       ; this is how you specify spaces
        .subspa $BSS$           ; this is how you specify subspaces
_character_table
        .comm   1024

Better still, consider the following:

A source file called chartab.c, whose contents are the single line

    char _character_table[1024];

cc -c this file, and you get an object file that has a "common definition"
for your space. Portably.

----
Shankar.
andrew@comp.vuw.ac.nz (Andrew Vignaux) (06/05/89)
In article <1340054@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) writes:
>In article <14870@comp.vuw.ac.nz> I wrote:
>>     .globl _character_table
>>     .space 1024
>>     _character_table:
>>     .space 1024
>>
>> in the appropriate syntax for the particular machine. Note: the label is in
>> the middle of the space. I've been able to put this in the DATA
>> subspace but not in the BSS subspace. Any thoughts on how to get this in
>> the BSS subspace?
>
>        .space  $PRIVATE$       ; this is how you specify spaces
>        .subspa $BSS$           ; this is how you specify subspaces
> _character_table
>        .comm   1024
>Shankar.

I'm afraid that's not what I meant (I guess my paragraph was
ambiguous). What I need is a "common" definition that is in the MIDDLE of
the data, so the program can use a signed char to access it (pretty
weird, huh!). I can't get this to happen in the BSS subspace.

BTW: the 800 assembler requires a label for the .COMM directive.

On a related issue (yes, still KCl), is there any way to use the value of
$global$ in a C routine? (The C compiler doesn't like the $.) I could
get an assembler routine to put the value in a global, but that means
a memory dereference every time a certain macro is used (rather than
constant folding done at compile/link time). I'll probably just hard
code 0x40000000.

Andrew
--
Domain address: andrew@comp.vuw.ac.nz
Path address: ...!uunet!vuwcomp!andrew
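The "label in the middle of the space" layout can be illustrated in plain C. This is only a sketch of the idea, not KCl's code: the contents of struct character, the names character_storage and character_entry, and the 128-entry bias are assumptions made for illustration. It also uses a pointer into the middle of a zero-filled array rather than an assembler label, so it does not provide a true array symbol that would satisfy the extern declaration quoted above.

```c
#include <stddef.h>

/* Hypothetical element type; KCl's real struct character is not shown
   in the thread. */
struct character {
    int code;
};

/* 128 entries for negative char values followed by 128 for non-negative
   ones. File-scope objects without an initializer start out zero-filled. */
static struct character character_storage[256];

/* Points at the middle entry, so indices -128..127 stay inside the storage. */
static struct character *const character_table = character_storage + 128;

/* Look up the entry for a possibly negative character value. */
struct character *character_entry(signed char c)
{
    return &character_table[c];
}
```

The cost of this variant is the one Andrew mentions for $global$: every access goes through a pointer held in memory instead of an address fixed at link time.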
jmorris@hpsemc.HP.COM (John V. Morris) (06/07/89)
Unlike most lds, the 835 linker loads multiple modules from a single file.
Thus, if you append extra stuff at the end of an object file, the 835 linker
thinks there is another module present and attempts to load it.

The workaround is fairly simple. You have to add a valid object module header
in front of the extra stuff. I've attached a program that shows how to do it.
Coincidentally, the header is 128 bytes long.

Concerning the address rounding: I believe the addresses are rounded in order
to support load on demand and shared code. There ought to be a way to turn it
off, but I don't know how to do it. Does the -N option help?

John Morris
HP Technology Access Center
(415) 725-3871

---------------------------- dummy_som.c -----------------------------
/*********************************************************************
dummy_som file >>object.o

dummy_som creates a Standard Object Module (SOM) containing the given file.
The dummy SOM can be appended to a conventional object file and will
be ignored by the linker. This program is useful for applications
that wish to append their own data to object files.

Written for the HP 9000 S800 at the HP Technology Access Center.
**********************************************************************/

#include <stdio.h>
#include <string.h>
#include <filehdr.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

main(argc, argv)
/********************************************************************
dummy_som creates a null Standard Object Module containing the given file
************************************************************************/
int argc;
char **argv;
{
    struct header hdr;
    struct stat file_status;
    char buffer[8192];
    int fd, len;

    /* get the arguments (a usage error has no errno, so don't use perror) */
    if (argc < 2) {
        fprintf(stderr, "usage: dummy_som file >> obj.o\n");
        exit(1);
    }

    /* open the file to append to header */
    fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("dummy_som: Can't open input file");
        exit(1);
    }

    /* get information about the file */
    if (fstat(fd, &file_status) < 0) {
        perror("dummy_som: Can't get status of input file");
        exit(1);
    }

    /* create a dummy header that reserves the extra space */
    memset(&hdr, 0, sizeof(hdr));
    hdr.system_id = HP9000S800_ID;
    hdr.version_id = VERSION_ID;
    hdr.a_magic = RELOC_MAGIC;
    hdr.som_length = sizeof(hdr) + file_status.st_size;
    hdr.checksum = compute_checksum(&hdr);

    /* output the dummy header */
    write(1, &hdr, sizeof(hdr));

    /* append the data file to the header */
    while ((len = read(fd, buffer, sizeof(buffer))) > 0)
        write(1, buffer, len);

    return 0;
}

compute_checksum(hdr)
/****************************************************************
compute_checksum calculates the checksum of an object module header
******************************************************************/
struct header *hdr;
{
    int sum, *ptr, i;

    /* start at beginning of header */
    sum = 0;
    ptr = (int *)hdr;

    /* XOR together every word of the header except the last one */
    for (i = 0; i < sizeof(*hdr)/4 - 1; i++)
        sum = sum ^ ptr[i];

    /* done */
    return(sum);
}
mar@hpclmar.HP.COM (Michelle Ruscetta) (06/08/89)
> I've hit a small problem trying to port KCl to our 835.
>
> KCl uses dynamic loading [ld -A] to load its object files into
> memory. However, it has appended some text to the object file which
> it loads separately. All the lds that I have seen before, allow extra
> rubbish on the end of object files, but the 835 loader says
>
>     /bin/ld: foo.o: Not a valid object file (invalid system id)

[ correctly answered in previous response ]

>
> Another problem I am having, is what to do with the object file after
> I have loaded it. I read the object module's header to determine how
> much space I should allocate in memory. I allocate the space, and
> pass the starting address to "ld -A bar -R %x -o baz". The object file
> that I get back has a number of interesting properties
>
>  - the starting address of the text segment has been rounded up to a
>    page boundary -- is there anything in the architecture that
>    requires this?
>  - the starting address of the data segment has also been rounded up
>    to a page boundary. Again is there any real reason for this?
>

Yes, the HPUX loader requires page alignment of both the text and data
segments. This is primarily because memory protection is done on a
page basis. Even though you will essentially be 'loading' your own code,
this alignment is still performed.

>
>  - I want to branch to the first routine in the file. The
>    inter-space stub seems to be at TEXT+4 (*TEXT is a break). Is
>    there a better way to find this?
>

You MUST use the "exec_entry" field in the HPUX auxiliary header (which
immediately follows the standard file header), or use the entry_offset
field in the file header.

>  - the header gives a different size than the one I worked out
>    earlier [Surprise, surprise]. Any suggestions on a better size
>    predictor (I am currently using size+PAGESIZE).
>

Sorry, no good size predictor -- it is very difficult to determine the size of
an a.out file, given a relocatable object file, unless you know that the
a.out doesn't include any code from other objects.

> I guess I should write my own linker :-(
>

Good luck! -- The linker for the series 800 is much more complex than
the linkers I have seen for CISC architectures -- due to some RISCY
requirements.

There are some other things that complicate dynamic linking on the s800
architecture:

 1) HP-UX on the s800 still does not support non-sharable, writable text,
    so dynamically-loaded code must be placed in the data space. This means
    that inter-space "stubs" must be created in order to support branching
    between the code and the data space (this is because the standard
    procedure call and return sequence cannot branch across spaces).

 2) The process of "stack unwinding" cannot handle dynamically-loaded
    code, so getting a stack trace from a debugger will be impossible when
    executing within the dynamically loaded code -- this is also why the
    Pascal try/recover (escape()) feature will not work.

 3) Address relocation is complicated by the instruction format, which is
    not a typical "add a constant to a full word" type of patching (in
    fact, for fun take a look at the a.out manual page to see what the
    fixup formats look like).

 4) You have to be careful about flushing the instruction/data caches
    (due to #1 above), before executing the code that has been 'loaded'
    into memory.

Below, I have an example of a program which uses dynamic linking; this
might give you some help/insight as to what's involved with dynamic linking
on the series 800, using the ld -A option.

The -A option was implemented in the s800, HPUX 3.0 release.

The -A option is used when you want to dynamically link a file from
an existing 'main' program. The link command is called from within the
main program (using 'system()' or 'exec()'), using the main program as the
basefile (ld -A basefile ...) so that any symbols defined in the basefile
will be used to resolve references from the file which is being dynamically
linked (for example if you want to make calls from a dynamically linked
function to routines which are defined in the main program).

Normally, space is allocated in the main program's data area using
malloc(), but since you don't know the size of the executable file that
you will be placing into the data area, the malloc size is just a guess.
The address returned from malloc must be page-aligned, and then can be
used in the link (ld -A basefile -R data_address ...) command to inform
the linker to link the file using that address for code placement. The
link command should also specify the -N option to tell the linker to place
the data immediately following the code, since we want code and data to be
contiguous when we read it into the main program's data area.

The executable file resulting from the link can then be read into the
space allocated using information from the HPUX auxiliary header record,
such as size of text, the file location of the program entry point, and
the size of data.

The executable file is read into data, and then can be executed by
dereferencing a function pointer which has been set to the address of the
entry point (found in the HPUX auxiliary header).

There are other details to be taken care of as well, such as doing a
memset for BSS (to initialize all of bss to zero), since the loader
(exec()) usually does that for you, and we are bypassing the loader.
Basic steps: (Note: this is not necessarily a complete nor syntactically
correct C program but serves for illustration only):

main()
{
    char *x;
    int (*funcptr)();
    char cmd_buf[256];
    FILE *fileptr;
    /* aux header; struct som_exec_auxhdr from <aouthdr.h> is assumed here */
    struct som_exec_auxhdr filhdr;
    long dynfunc_size;

    x = malloc(some_large_size);

    /* page align since ld expects page-align value for -R
       (page_align() stands for whatever rounds x up to a page boundary) */
    x = page_align(x);

    /* get the value of 'x' into the ld command that we are going to call */
    sprintf(cmd_buf, "ld -A basefile -R %x -N dynfunc.o -o dynfunc -e foo",
            (unsigned) x);

    /* call the linker to link the file */
    system(cmd_buf);

    /* now we open the resulting executable for reading */
    fileptr = fopen("dynfunc", "r");

    /* seek to and read the auxiliary header record */
    fseek(fileptr, sizeof(struct header), 0);
    fread(&filhdr, sizeof(filhdr), 1, fileptr);

    /* determine the size of the executable -- and see if we allocated
       enough space. The image runs from the start of text to the end of
       BSS, and BSS starts at exec_dmem + exec_dsize (see the memset below). */
    dynfunc_size = filhdr.exec_dmem + filhdr.exec_dsize
                   + filhdr.exec_bsize - filhdr.exec_tmem;
    if (dynfunc_size > some_large_size) {
        /* do something -- either error, or realloc and relink */
    }

    /* seek to and read in the text area of the dynamically linked file */
    fseek(fileptr, filhdr.exec_tfile, 0);
    fread((char *) filhdr.exec_tmem, filhdr.exec_tsize, 1, fileptr);

    /* seek to and read in the data area of the dynamically linked file */
    fseek(fileptr, filhdr.exec_dfile, 0);
    fread((char *) filhdr.exec_dmem, filhdr.exec_dsize, 1, fileptr);

    /* init the BSS area to zero */
    memset((char *) (filhdr.exec_dmem + filhdr.exec_dsize), 0,
           filhdr.exec_bsize);

    /* set the function ptr to the entry point of the dynamically linked
       file */
    funcptr = (int (*)()) (filhdr.exec_entry);

    /* flush the data and instruction caches -- note this must be done on
       the series 800 ! -- see the flush_cache assembly routine below,
       which takes the start address and length of the loaded region */
    flush_cache(filhdr.exec_tmem, dynfunc_size);

    /* call the dynamically linked function */
    (* funcptr)();
}

/* END OF PROGRAM */

The following is the routine that can be used to flush the caches,
courtesy of Cary Coutant:

;
; Routine to flush and synchronize data and instruction caches
; for dynamic loading
;
; Copyright Hewlett-Packard Co. 1985
;

        .code

; flush_cache(addr, len) - executes FDC and FIC instructions for every cache
; line in the text region given by the starting address in arg0 and
; the length in arg1. When done, it executes a SYNC instruction and
; the seven NOPs required to assure that the cache has been flushed.
;
; Assumption: the cache line size must be at least 16 bytes.

        .proc
        .callinfo
        .export flush_cache,entry
flush_cache
        .enter
        ldsid   (0,%arg0),%r1
        mtsp    %r1,%sr0
        ldo     -1(%arg1),%arg1
        fdc     %arg1(0,%arg0)
loop    fic     %arg1(%sr0,%arg0)
        addib,>,n -16,%arg1,loop        ; decrement by cache line size
        fdc     %arg1(0,%arg0)
        ; flush first word at addr, to handle arbitrary cache line boundary
        fdc     0(0,%arg0)
        fic     0(%sr0,%arg0)
        sync
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        .leave
        .procend
        .end
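The outline above calls page_align() without defining it. A minimal sketch of such a helper follows, assuming the page size is a power of two and is supplied by the caller (the thread does not state the 835's page size); the name align_up is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address up to the next multiple of page_size.
   page_size must be a power of two; that is an assumption of this sketch. */
char *align_up(char *p, size_t page_size)
{
    uintptr_t addr = (uintptr_t) p;
    uintptr_t mask = (uintptr_t) page_size - 1;

    assert(page_size != 0 && (page_size & mask) == 0);
    return (char *) ((addr + mask) & ~mask);
}
```

Rounding up can skip almost a full page at the start of the buffer, so the malloc() request would need that much slack on top of the guessed image size.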
cary@hpcllak.HP.COM (Cary Coutant) (06/08/89)
A few comments to the previous responses:

1. An easier way to put your own junk at the end of an object file
   is to put 128 bytes of zeroes before the junk. The linker will
   not attempt to read any more object modules from the file if it
   sees a header full of zeroes.

2. The flush_cache() routine in the previous response was an old
   version that may not work correctly on some HP-PA implementations.
   I've included the correct version below.

3. One way to guarantee that you have to do the ld -A link only once
   is to call sbrk(0) before the link to obtain the starting address
   (for the -R option), then after the link use sbrk() to allocate
   enough space. This technique assumes that you don't do anything
   that would cause a call to malloc() in between the two calls to
   sbrk().

4. The linker does indeed round both text and data addresses to page
   boundaries because of loader (i.e., exec()) requirements. For -N
   links, this rounding should probably be eliminated, and we may
   fix this in a future release. For this reason, you should always
   make sure you look in the aux header exec_tmem and exec_dmem fields
   for the actual addresses.

Cary Coutant, Hewlett-Packard Computer Language Lab

;
; Routine to flush and synchronize data and instruction caches
; for dynamic loading
;
; Copyright Hewlett-Packard Co. 1985
;

        .code

; flush_cache(addr, len) - executes FDC and FIC instructions for every cache
; line in the text region given by the starting address in arg0 and
; the length in arg1. When done, it executes a SYNC instruction and
; the seven NOPs required to assure that the cache has been flushed.
;
; Assumption: the cache line size must be at least 16 bytes.

        .proc
        .callinfo
        .export flush_cache,entry
flush_cache
        .enter
        ldsid   (0,%arg0),%r1
        mtsp    %r1,%sr0
        ldo     -1(%arg1),%arg1
        copy    %arg0,%arg2
        copy    %arg1,%arg3
        fdc     %arg1(0,%arg0)
loop1   addib,>,n -16,%arg1,loop1       ; decrement by cache line size
        fdc     %arg1(0,%arg0)
        ; flush first word at addr, to handle arbitrary cache line boundary
        fdc     0(0,%arg0)
        sync
        fic     %arg3(%sr0,%arg2)
loop2   addib,>,n -16,%arg3,loop2       ; decrement by cache line size
        fic     %arg3(%sr0,%arg2)
        ; flush first word at addr, to handle arbitrary cache line boundary
        fic     0(%sr0,%arg2)
        sync
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        .leave
        .procend
        .end
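Point 1 above (a header-sized run of zeroes in front of the appended data) can be sketched in a few lines of C. The 128-byte figure comes from this thread; the function name and the use of stdio are illustrative assumptions, and the sketch has not been checked against the 835 linker.

```c
#include <stdio.h>
#include <string.h>

#define SOM_HEADER_SIZE 128   /* header length quoted in the thread */

/* Write SOM_HEADER_SIZE zero bytes and then len bytes of data to obj.
   Returns 0 on success, -1 on a short write. */
int append_after_zero_header(FILE *obj, const void *data, size_t len)
{
    char zeros[SOM_HEADER_SIZE];

    memset(zeros, 0, sizeof zeros);
    if (fwrite(zeros, 1, sizeof zeros, obj) != sizeof zeros)
        return -1;
    if (len > 0 && fwrite(data, 1, len, obj) != len)
        return -1;
    return 0;
}
```

The stream would typically be the object file opened for appending. Whatever later reads the appended data then has to skip the 128 zero bytes as well as the object module itself.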
andrew@comp.vuw.ac.nz (Andrew Vignaux) (06/09/89)
This is great -- thanks to everyone who responded. However, there are
a few comments I would like to make.

In article <1340056@hpclmar.HP.COM> mar@hpclmar.HP.COM (Michelle Ruscetta) writes:
> You MUST use the "exec_entry" field in the HPUX auxiliary header (which
> immediately follows the standard file header), or use the entry_offset
> field in the file header.

At least in the version of the loader I am using (A.01.04??) both of
the fields point at the "main" program's $START$. I can't use the -e
option because I don't know the name of the initial function. Is it
unreasonable to get ld to default to the "first" function in the
dynamically loaded file when -A is used?

BTW: My tmem+4 hack doesn't work if the loaded object does any
indirect function calls. I should probably search around in the
symbol table to find the correct address & 03.

I had not realised that I needed a flush_cache() routine after my
load. I had read the note after the SYNC instruction, thought "You'll
never catch me writing self-modifying code", and promptly forgot it.

I don't think I would have guessed about ld's multiple module feature.
I was using a wrapper around ld to strip the "data" off while the load
was going on, which was a bit slow and a little dangerous. Things are
working a lot better (and faster) now.

My latest incremental loading problem is trying to use function
pointers in the main program that have been computed in the
dynamically loaded routine (hereinafter referred to as fred).
Shouldn't the address of a routine be the address of the export stub?

I suspect function pointers from the main program, being used in fred,
would have the same problem -- but, fortunately, I don't think I need
to do that.

Does setjmp/longjmp cope with multiple space programs?

BTW: adb doesn't like my incrementally loaded objects (segmentation
fault).

Does the loader need to generate a different import stub for every
call for the same routine?

Thanks,
Andrew
--
Domain address: andrew@comp.vuw.ac.nz
Path address: ...!uunet!vuwcomp!andrew