frank@zen.co.uk (Frank Wales) (10/12/89)
Okay, HP, I give in.  This problem's been annoying me for some time, and
was logged as a bug in s800 HP-UX 3.01 (and is still in 3.1), but my
curiosity as to its cause has got the better of me.

xdb(1) takes a *long* time single-stepping the return of a large
structure.  Here is the example program I sent to our Customer Response
Centre (instructions start in main's comment block):

xdbslow.c:

    /* big structure to pass around...also try it as:
     *
     *   struct big {char Message[1024];}
     *
     * Decide what you think should happen beforehand.
     */
    struct big {enum {A,B,C} Class; char Message[1024];}

    foo()
    {
      struct big a;  /* uninitialised, but who cares? */

      /* ...now, try bu, followed by c, and note the speed;
       * ...then try bx, followed by c, and note the speed;
       * ...finally, try s, then go for a coffee...
       */
      return a;
    }

    main()
    {
      struct big b;

      /* First time here, type S to macro-step this function call;
       * note the time it takes.  Then, on each subsequent
       * pass, type s to step through the whole function...
       * [further instructions in foo()]
       */
      b=foo();
      return 0;
    }

Compile with `cc -g xdbslow.c'.  Enter xdb and follow the instructions in
the comments, noting the execution time of s when stepping over the
return statement in foo().

My question is simply: why does this happen?
[And is there a tweak I can apply *now* to get rid of it? ;-)]

[By the way, why are the "Procedures:" and "Files:" counts presented on
xdb entry both one too large under 3.1?  (They were the *same number*
under 3.01, so at least that got fixed :-).)  Are they counting
the startup module that gets linked in?  Whyever (sp!) they are too large,
is it documented?  Just curious.]
--
Frank Wales, Systems Manager,        [frank@zen.co.uk<->mcvax!zen.co.uk!frank]
Zengrange Ltd., Greenfield Rd., Leeds, ENGLAND, LS9 8DB. (+44) 532 489048 x217
jmn@hpfcso.HP.COM (John Newman) (10/14/89)
Greetings:

> xdb(1) takes a *long* time single-stepping the return of a large
> structure.  Here is the example program I sent to our Customer Response
> Centre (instructions start in main's comment block):
>
>     ...
> foo()
> {
>   struct big a;  /* uninitialised, but who cares? */
>     ...
>   return a;
> }
>
> My question is simply: why does this happen?

Here we have an example of what it means to debug on a RISC machine :-)

The single statement

    return a;

when viewed in disassembly mode (using xdb's "td" command), looks like this:

    0x0000085c  foo +0004  LDO       -1076(30),1
    0x00000860  foo +0008  OR        28,0,31
    0x00000864  foo +000c  LDO       1028(1),19
    0x00000868  foo +0010  LDWS,MA   4(0,1),20
    0x0000086c  foo +0014  COMBF,<=  19,1,foo+0010
    0x00000870  foo +0018  STWS,MA   20,4(0,31)

For those of us who don't speak PA-RISC fluently, it translates to
something like this (I've taken some liberties here for the sake of
clarity):

          R1  <- address_of('a')
          R31 <- contents(R28)         # R28 is pre-set return-value pointer
          R19 <- address_of('a') + sizeof('a')
    loop: R20 <- memory[contents(R1)],  R1 += 4
          memory[contents(R31)] <- R20, R31 += 4
          if (R1 <= R19) goto loop
          goto contents(R2)            # R2 contains return address

The structure is being copied to the caller's stack-frame one word (4
bytes) at a time.  This means that the 3 instruction loop gets executed
256 times.

When in single-step mode, there is a considerable amount of
context-switching overhead, and the debugger actually single-steps at the
*instruction* level.  So for this one C statement, the debugger is actually
single-stepping the process thru approx. 770 instructions, one at a time,
with control passing back and forth between the debugger and the child for
each instruction executed.  This takes enough time to be noticeable to
the user.

Being a RISC machine, there is no "blockmove" instruction, so I guess you
could call this an architectural limitation (feature).

> And is there a tweak I can apply *now* to get rid of it? ;-)

If you must single-step thru a return statement that returns a large
structure, no there isn't.  As a workaround, there's a couple of things you
could do:

    1.  Upon reaching the return statement, use the "bu" command to set
        an uplevel breakpoint (which will be at the 1st instruction
        following the return), and continue ("c") to it.  This takes the
        debugger out of single-step mode, and all 770 instructions
        execute at full speed, all at once.

                -or-

    2.  Return a pointer to the structure, rather than the entire
        structure.  This means the return statement effectively compiles
        into 2 (two) instructions, which are stepped thru quickly and
        easily.

BTW, Frank, we have seen the SR you filed concerning this, which should be
appearing in a Software Status Bulletin (SSB) near you soon.  Unfortunately,
because of the process overhead inherent in using single-step mode, and
because such a large amount of data must be moved with this sort of return
statement, there really isn't a "fix" possible (not on a RISC architecture,
anyway).

> [By the way, why are the "Procedures:" and "Files:" counts presented on
> xdb entry both one too large under 3.1?  (They were the *same number*
> under 3.01, so at least that got fixed :-).)  Are they counting
> the startup module that gets linked in?  Whyever (sp!) they are too large,
> is it documented?  Just curious.]

When debuggable programs are linked, there is an extra object file that is
added: on the s800, it's named /usr/lib/xdbend.o.  This file contains a
single subroutine _end_(), which is used for controlled returns from
command-line procedure calls, and a data buffer __buffer[], which is used
for storing character data in the user's address space when a string
assignment, or a cmd-line procedure call with string arguments, is made.
Inclusion of this file (which is itself debuggable) increases both the
"Procedures:" and "Files:" counts by 1.

I don't think this is clearly documented anywhere.  I believe it should be,
and will endeavor to make it so.
> Frank Wales, Systems Manager
> Zengrange Ltd.

John Newman
Hewlett-Packard, Colorado Language Lab
Symbolic Debuggers
jmn%hpfcrt@hplabs.HP.COM

CMA: All opinions expressed are strictly my own.
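The arithmetic behind the figure of roughly 770 single-stepped
instructions can be sketched in portable C.  This is only a model under
stated assumptions, not PA-RISC code and not anything xdb itself does: the
names copy_by_words, stops_single_stepping and stops_with_breakpoint are
made up for illustration, and the cost of 3 instructions per 4-byte word
is taken from the load / store / compare-and-branch loop described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as struct big in the xdbslow.c test program. */
struct big { enum { A, B, C } Class; char Message[1024]; };

/* Assumed cost model: 4-byte words, 3 instructions per word copied
 * (load, store, compare-and-branch), as in the disassembled loop. */
enum { WORD_BYTES = 4, INSNS_PER_WORD = 3 };

/* Copy src to dst one word at a time, the way the compiled return
 * statement does, and return the number of loop instructions modeled. */
unsigned long copy_by_words(void *dst, const void *src, size_t nbytes)
{
    unsigned long insns = 0;
    size_t off;

    assert(nbytes % WORD_BYTES == 0);   /* model handles whole words only */
    for (off = 0; off < nbytes; off += WORD_BYTES) {
        memcpy((char *)dst + off, (const char *)src + off, WORD_BYTES);
        insns += INSNS_PER_WORD;
    }
    return insns;
}

/* Instruction-level stepping: the debugger regains control once per
 * instruction executed in the copy loop. */
unsigned long stops_single_stepping(size_t nbytes)
{
    return (unsigned long)(nbytes / WORD_BYTES) * INSNS_PER_WORD;
}

/* Uplevel breakpoint plus continue: the loop runs at full speed and
 * the debugger regains control once, at the breakpoint. */
unsigned long stops_with_breakpoint(void)
{
    return 1;
}
```

For a 1024-byte structure the model gives 768 debugger stops.  If the enum
occupies 4 bytes, so that the structure is 1028 bytes, it gives 771, which
is in line with the "approx. 770" quoted above.  With the "bu" and "c"
workaround the same copy costs a single stop.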
frank@zen.co.uk (Frank Wales) (10/18/89)
In article <7370014@hpfcso.HP.COM> jmn@hpfcso.HP.COM (John Newman) writes:
>In article <1728@zen.co.uk> I wrote:
>> xdb(1) takes a *long* time single-stepping the return of a large
>> structure.  Here is the example program I sent to our Customer Response
>> Centre (instructions start in main's comment block):
>> [code fragments deleted]
>>
>> My question is simply: why does this happen?
>
>Here we have an example of what it means to debug on a RISC machine :-)
>
>[detailed explanation deleted]
>When in single-step mode, there is a considerable amount of
>context-switching overhead, and the debugger actually single-steps at the
>*instruction* level.  So for this one C statement, the debugger is actually
>single-stepping the process thru approx. 770 instructions, one at a time,
>with control passing back and forth between the debugger and the child for
>each instruction executed.  This takes enough time to be noticeable to
>the user.

Yep.  :-)  Thanks for the info, John, it's just what I wanted to know (and
roughly what I'd guessed).  [BTW, why does removing the enum from the
struct definition slow things down further?  Just curious. :-)]

>> And is there a tweak I can apply *now* to get rid of it? ;-)
>
>If you must single-step thru a return statement that returns a large
>structure, no there isn't.  As a workaround, there's a couple of things you
>could do:
>
>    1.  Upon reaching the return statement, use the "bu" command to set
>        an uplevel breakpoint (which will be at the 1st instruction
>        following the return), and continue ("c") to it.  This takes the
>        debugger out of single-step mode, and all 770 instructions
>        execute at full speed, all at once.

This (or "bx", depending on the mood I'm in) is what I do use, when I
remember to not just keep tapping [return].

>    2.  Return a pointer to the structure, rather than the entire
>        structure.  This means the return statement effectively compiles
>        into 2 (two) instructions, which are stepped thru quickly and
>        easily.
Not practical, given the current software design (mallocing result values
complicates things hideously, and having them statically allocated defeats
re-entrancy, which is one of the requirements for many of the routines
which return these objects).

>BTW, Frank, we have seen the SR you filed concerning this, which should be
>appearing in a Software Status Bulletin (SSB) near you soon.  Unfortunately,
>because of the process overhead inherent in using single-step mode, and
>because such a large amount of data must be moved with this sort of return
>statement, there really isn't a "fix" possible (not on a RISC architecture,
>anyway).

[sticks finger in air and puts on his best air of confident ignorance]

But wait!  Since this can be made to work quickly by the general strategy
of "set a temp. breakpoint on the next instruction and continue to it," why
can't xdb do this for me?  Debugging at the source level, I don't care
about possible side-effects or assertion hits *within* the execution of a
single source statement.  If I did, I could toggle into assembly mode and
force instruction-by-instruction debugging.

So, is this feasible?  I.e., does the set/continue approach to source-level
stepping work equivalently to the current approach for all (common) cases?
If so, could I have it, please? ;-)

>> [By the way, why are the "Procedures:" and "Files:" counts presented on
>> xdb entry both one too large under 3.1?]
>
>When debuggable programs are linked, there is an extra object file that is
>added: on the s800, it's named /usr/lib/xdbend.o.
>Inclusion of this file (which is itself debuggable) increases both the
>"Procedures:" and "Files:" counts by 1.
>
>I don't think this is clearly documented anywhere.  I believe it should be,
>and will endeavor to make it so.

Thank you, John.
--
Frank Wales, Systems Manager,        [frank@zen.co.uk<->mcvax!zen.co.uk!frank]
Zengrange Ltd., Greenfield Rd., Leeds, ENGLAND, LS9 8DB. (+44) 532 489048 x217
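One variant of workaround 2 that needs neither malloc nor static storage
is to have the caller pass in the address of its own result object.  The
sketch below is hypothetical: foo_into and caller_demo are illustrative
names that appear nowhere in the thread, and it changes the calling
convention of every such routine, so whether it would fit the software
design described above cannot be judged from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as struct big in the xdbslow.c test program. */
struct big { enum { A, B, C } Class; char Message[1024]; };

/* Hypothetical replacement for foo(): instead of returning the whole
 * structure by value, fill in an object the caller already owns.
 * Nothing is copied on return, so there is no long copy loop to step. */
void foo_into(struct big *out)
{
    assert(out != NULL);
    out->Class = B;
    strcpy(out->Message, "filled in place");
}

/* Each caller keeps the result in its own stack frame, so the routine
 * uses no heap and no static data and stays re-entrant.
 * Returns 1 if the result was filled in as expected, 0 otherwise. */
int caller_demo(void)
{
    struct big b;

    foo_into(&b);
    return b.Class == B && strcmp(b.Message, "filled in place") == 0;
}
```

The trade-off is that `b=foo();` becomes `foo_into(&b);` at every call
site, which may be a larger change than simply remembering to use "bu" or
"bx" at the return statement.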