[comp.sys.ibm.pc] NOP

cjdb@sphinx.UUCP (09/25/87)

In debugging code written in assembly-language, I notice that MASM
periodically inserts NOP's in places where they don't appear in the
source. I imagine this is done to slow down the processor, but how
does MASM "decide" the appropriate circumstances?

Thanks in advance.



-- 
Bitnet:	  	 lib.cb@uchicago.bitnet
Internet:      lib.cb@chip.uchicago.edu
uucp:	  ..!ihnp4!gargoyle!sphinx!cjdb

jack@csccat.UUCP (Jack Hudler) (09/25/87)

In article <2306@sphinx.uchicago.edu>, cjdb@sphinx.uchicago.edu (Charles Blair) writes:
> 
> In debugging code written in assembly-language, I notice that MASM
> periodically inserts NOP's in places where they don't appear in the
> source. I imagine this is done to slow down the processor, but how
> does MASM "decide" the appropriate circumstances?


MASM places NOPs in certain places to align some instructions on WORD 
boundaries or to pad EVEN directives.
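
A minimal sketch of the EVEN padding described above, modeled in Python rather
than taken from MASM itself. The helper name pad_to_even is made up for
illustration; the only hard fact used is that NOP is the single byte 90h.

```python
NOP = 0x90  # single-byte 8086 NOP opcode

def pad_to_even(code: bytearray) -> int:
    """Append one NOP if the current offset is odd; return bytes added.

    Hypothetical helper modeling what an EVEN directive asks for."""
    if len(code) % 2 == 1:
        code.append(NOP)
        return 1
    return 0

code = bytearray([0xFC])    # CLD is one byte, so the next offset is 1 (odd)
added = pad_to_even(code)   # models an EVEN directive at this point
print(added, code.hex())    # prints: 1 fc90
```

At an offset that is already even, the same call adds nothing.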
-- 
See above 	 (214)661-8960

michael@cit-vlsi.Caltech.Edu (Michael Lichter) (09/25/87)

In article <2306@sphinx.uchicago.edu> cjdb@sphinx.uchicago.edu (Charles Blair) 
writes:
>
>In debugging code written in assembly-language, I notice that MASM
>periodically inserts NOP's in places where they don't appear in the
>source. I imagine this is done to slow down the processor, but how
>does MASM "decide" the appropriate circumstances?

Unlike some other assemblers that force/allow you to decide whether you
want 8-bit or 16-bit relative calls and jumps, MASM decides for itself.  Not
being able to determine whether it can get away with the shorter instruction
on the first pass, MASM reserves three bytes.  If the offset is small enough,
you get [instruction] [offset] [nop], and if it's not, you get
[instruction] [offset] [offset].
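
The two layouts above can be sketched in Python as follows. This is only a
model of the behavior described, not MASM's code; the function name
fill_reserved_jmp is an assumption. The opcode bytes (EBh for the 8-bit form,
E9h for the 16-bit form, 90h for NOP) are the standard 8086 encodings.

```python
JMP_SHORT, JMP_NEAR, NOP = 0xEB, 0xE9, 0x90  # 8086 opcode bytes

def fill_reserved_jmp(jmp_addr: int, target_addr: int) -> bytes:
    """Fill the 3-byte slot reserved at jmp_addr for a JMP to target_addr."""
    # An 8-bit displacement is relative to the end of the 2-byte form.
    disp8 = target_addr - (jmp_addr + 2)
    if -128 <= disp8 <= 127:
        return bytes([JMP_SHORT, disp8 & 0xFF, NOP])   # [instr][offset][nop]
    # A 16-bit displacement is relative to the end of the 3-byte form.
    disp16 = (target_addr - (jmp_addr + 3)) & 0xFFFF
    return bytes([JMP_NEAR, disp16 & 0xFF, disp16 >> 8])  # little-endian

print(fill_reserved_jmp(0x100, 0x110).hex())  # prints: eb0e90
print(fill_reserved_jmp(0x100, 0x300).hex())  # prints: e9fd01
```

Either way the slot is three bytes long, so nothing after it moves.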

There are other occasions when MASM will insert instructions that you might
not expect, such as WAIT instructions before any co-processor instructions.

Michael

perkins@bnrmtv.UUCP (Henry Perkins) (09/25/87)

In article <2306@sphinx.uchicago.edu>, cjdb@sphinx.uchicago.edu (Charles Blair) writes:
> In debugging code written in assembly-language, I notice that MASM
> periodically inserts NOP's in places where they don't appear in the
> source. I imagine this is done to slow down the processor, but how
> does MASM "decide" the appropriate circumstances?

It's not done to "slow down the processor"; it's done to make the
code size/location unambiguous.

There are two reasons why MASM will insert a NOP that isn't in the
original code: assembler directives and short jumps.  Directives
like EVEN and ORG specify where the next instruction starts; the
intervening space is padded with NOPs.  A single NOP is used after
a short jump when the assembler source did not specify (with the
SHORT pseudo-op) that the JMP was the shorter, 8-bit displacement
type.  Rather than use the slower 16-bit displacement jump, MASM
will use a short jump and add a NOP to preserve the offset of the
next instruction.  If you specify that the jump is to be short, of
course, no NOP is inserted.
-- 
{hplabs,amdahl,ames}!bnrmtv!perkins         --Henry Perkins

It is better never to have been born.  But who among us has such luck?
One in a million, perhaps.

johnl@ima.ISC.COM (John R. Levine) (09/25/87)

In article <2306@sphinx.uchicago.edu> cjdb@sphinx.uchicago.edu (Charles Blair) writes:
>In debugging code written in assembly-language, I notice that MASM
>periodically inserts NOP's in places where they don't appear in the
>source. I imagine this is done to slow down the processor, but how
>does MASM "decide" the appropriate circumstances?

No, the 8088 is plenty slow already without any help from the assembler.
You'll probably find that the NOPs always follow a two-byte jump that is
jumping to an address later in the program. On the first pass, the assembler
assumes that it'll need a three-byte jump, but finds on the second pass that
the target is close enough that a two-byte jump, which is faster, will do. It
puts in the NOP to avoid changing the addresses of all later code.

It is possible to write assemblers that get the short vs. long jumps correct
in almost all cases. Every Unix assembler I've seen on machines with such
jumps does so, at least as far back as the PDP-11 fifth edition ca. 1974.

It turns out that getting the correct jump in every case is an NP complete
problem, so barring a major advance in complexity theory, it's too hard.
Getting the right answer in most cases is pretty easy. There's a classic paper
by Szymanski in the CACM about 10 years ago that explains it all.
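
For a feel of the "most cases" approach, here is a rough Python sketch of one
simple scheme: start every jump at 2 bytes and grow any that do not reach,
repeating until nothing changes. This is a plain grow-until-stable loop and is
not claimed to be Szymanski's algorithm or what any particular Unix assembler
does. The tuple format and the name size_jumps are invented for illustration.

```python
def size_jumps(program):
    """program: list of ('label', name) | ('jmp', target) | ('code', nbytes).
    Returns the chosen size (2 or 3) of each 'jmp', in program order."""
    sizes = {i: 2 for i, item in enumerate(program) if item[0] == "jmp"}
    while True:
        # Lay out the program using the current size guesses.
        addr, labels, starts = 0, {}, {}
        for i, (kind, arg) in enumerate(program):
            if kind == "label":
                labels[arg] = addr
            elif kind == "jmp":
                starts[i] = addr
                addr += sizes[i]
            else:
                addr += arg
        # Grow any 2-byte jump whose displacement does not fit in 8 bits.
        grew = False
        for i, start in starts.items():
            if sizes[i] == 2:
                disp = labels[program[i][1]] - (start + 2)
                if not -128 <= disp <= 127:
                    sizes[i] = 3
                    grew = True
        if not grew:   # jumps only ever grow, so this loop must terminate
            return [sizes[i] for i in sorted(sizes)]

program = [("jmp", "far_away"), ("jmp", "close"), ("code", 10),
           ("label", "close"), ("code", 200), ("label", "far_away")]
print(size_jumps(program))  # prints: [3, 2]
```

No NOP padding is needed here because addresses are recomputed after every
change instead of being frozen after the first pass.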
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
The Iran-Contra affair:  None of this would have happened if Ronald Reagan
were still alive.

ching@amd.AMD.COM (Mike Ching) (09/25/87)

In article <2306@sphinx.uchicago.edu> cjdb@sphinx.uchicago.edu (Charles Blair) writes:
>
>In debugging code written in assembly-language, I notice that MASM
>periodically inserts NOP's in places where they don't appear in the
>source. I imagine this is done to slow down the processor, but how
>does MASM "decide" the appropriate circumstances?
>

More likely that these NOPs follow forward branches. MASM leaves
space for a 3-byte jump instruction during its first pass and fills
the third byte with a NOP if a two byte instruction is all that is
necessary. The NOPs are never executed and don't slow down the
processor.

mike ching

ayac071@ut-ngp.UUCP (William T. Douglass) (09/26/87)

In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>In article <2306@sphinx.uchicago.edu> cjdb@sphinx.uchicago.edu (Charles Blair) writes:
>         The NOPs are never executed and don't slow down the
>processor.

Actually, I thought that NOPs did an OR of the AX register on itself (or some
similar operation that causes no change to data.)  As such, they DO affect the
execution time involved.

Could someone knowledgeable confirm or refute this?

Bill Douglass
ayac071@ngp.UUCP

bright@dataio.Data-IO.COM (Walter Bright) (09/28/87)

In article <2306@sphinx.uchicago.edu> cjdb@sphinx.uchicago.edu (Charles Blair) writes:
>In debugging code written in assembly-language, I notice that MASM
>periodically inserts NOP's in places where they don't appear in the
>source. I imagine this is done to slow down the processor, but how
>does MASM "decide" the appropriate circumstances?

The only place I have seen it done is following a JMP instruction where
the target address is a forward reference. The reason is that MASM is a
two-pass assembler, and on the first pass the values of all symbols must be
determined.
JMP can be 2 or 3 bytes, depending on how far away the target is. Since
the target is forward referenced, MASM assumes worst case and allocates
enough space for a 3 byte JMP. On pass 2, if the target is close enough
that it can be a 2 byte JMP, MASM creates a 2 byte JMP followed by a NOP
to fill in the 'hole'.

If you want only a 2 byte JMP, use the JMP SHORT mnemonic.
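
The whole sequence can be modeled with a toy two-pass assembler in Python.
This is a sketch under assumptions, not MASM's implementation: the item
format ('label', 'jmp', 'jmp_short', 'nop') and the function name assemble
are hypothetical, and only JMP and NOP are handled.

```python
JMP_SHORT, JMP_NEAR, NOP = 0xEB, 0xE9, 0x90  # 8086 opcode bytes

def assemble(program):
    """program: list of ('label', name), ('jmp', target),
    ('jmp_short', target) or ('nop',). Returns the assembled bytes."""
    # Pass 1: fix every address. A not-yet-seen target gets the worst case.
    addr, labels, sizes = 0, {}, []
    for item in program:
        if item[0] == "label":
            labels[item[1]] = addr
            size = 0
        elif item[0] == "jmp_short":
            size = 2
        elif item[0] == "jmp":
            known = item[1] in labels            # backward reference?
            fits = known and labels[item[1]] - (addr + 2) >= -128
            size = 2 if fits else 3
        else:
            size = 1                             # 'nop'
        sizes.append(size)
        addr += size
    # Pass 2: emit bytes without moving anything.
    out = bytearray()
    for item, size in zip(program, sizes):
        if item[0] == "label":
            continue
        if item[0] == "nop":
            out.append(NOP)
            continue
        start = len(out)
        disp = labels[item[1]] - (start + 2)
        if -128 <= disp <= 127:
            out += bytes([JMP_SHORT, disp & 0xFF])
            out += bytes([NOP] * (size - 2))     # fill the 'hole', if any
        elif size == 2:
            raise ValueError("target out of range for a short jump")
        else:
            disp = labels[item[1]] - (start + 3)
            out += bytes([JMP_NEAR, disp & 0xFF, (disp >> 8) & 0xFF])
    return bytes(out)

body = [("nop",), ("label", "done"), ("nop",)]
print(assemble([("jmp", "done")] + body).hex())        # prints: eb02909090
print(assemble([("jmp_short", "done")] + body).hex())  # prints: eb019090
```

The first result carries the extra 90h after the jump; the second, with the
short form requested up front, does not.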

brian@ncrcan.UUCP (Brian Onn) (09/29/87)

In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>.... The NOPs are never executed and don't slow down the
>processor.
>
>mike ching

Sorry, but the NOPS are always executed (how would the processor know that
it is in fact a NOP, if it didn't execute it) and each execution takes 
a finite, albeit minimal amount of execution time.

frisk@krafla.UUCP (09/29/87)

>In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>
>Actually, I thought that NOPs did an OR of the AX register on itself (or some
>similar operation that causes no change to data.)  As such, they DO affect the
>execution time involved.

>Could someone knowledgeable confirm or refute this?

The NOP operation is really XCHG AX,AX  and takes just as long to execute
as XCHG AX,BX (3 clock cycles).

This - however - is less than the instruction fetch time.
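
To see why NOP and XCHG AX,AX are the same thing, note that the one-byte form
of XCHG AX,reg16 is 90h plus the register number. A small Python sketch (the
table and function names are just for illustration):

```python
# Standard 8086 16-bit register numbers.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}

def xchg_ax(reg: str) -> int:
    """One-byte opcode for XCHG AX,reg16: 0x90 plus the register number."""
    return 0x90 + REG16[reg]

print(hex(xchg_ax("AX")))  # prints: 0x90  (the NOP opcode)
print(hex(xchg_ax("BX")))  # prints: 0x93
```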

-- 
         Fridrik Skulason          University of Iceland
         UUCP  frisk@rhi.uucp      BIX  frisk

     This line intentionally left blank ...................

dalegass@dalcsug.UUCP (09/30/87)

I think people are sorta missing the point about the NOP never being
executed: NOP's are *of course* executed when the program passes over them,
but the point about this NOP stuck in after a JMP, is that the NOP is never
even passed over, because of the JMP right before it.

-dalegass@dalcsug.uucp

halvers@italy.UUCP (09/30/87)

In article <309@ncrcan.UUCP> brian@ncrcan.UUCP () writes:
>In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>>.... The NOPs are never executed and don't slow down the
>>processor.
>>
>>mike ching
>
>Sorry, but the NOPS are always executed (how would the processor know that
>it is in fact a NOP, if it didn't execute it) and each execution takes 
>a finite, albeit minimal amount of execution time.

Not in this case, since Mike was talking about NOP's padding
unconditional jump instructions, i.e.

        mov     bx,ax
        jmp     somewhere_else -----+
        nop <--- never reached      |
                                    v
next:   lea     addr

*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Pete Halverson                       ARPA: halverson@ge-crd.ARPA
General Electric Company             UUCP: halvers@desdemona.steinmetz.UUCP
Corporate R & D                            halvers@desdemona.steinmetz.ge.com
Schenectady, NY

"Money for nothin' and your MIPS for free"

jvc@prism.UUCP (09/30/87)

>>(Charles Blair) writes:
>>         The NOPs are never executed and don't slow down the
>>processor.
>Bill Douglass writes:
>Actually, I thought that NOPs did an OR of the AX register on itself (or some
>similar ... )  As such, they DO affect the execution time involved. Could
>someone knowledgeable confirm or refute this?

Yes, NOPs are executed when encountered.  They are actually XCHG AX,AX
which requires 3 clock cycles to execute.  HOWEVER, when the NOP is used
as a pad when the assembler decides that it only needs a 2-byte JMP opcode
(it decides this on the second pass after already reserving 3 bytes for a 
forward jump) , the NOP will never (under normal circumstances) be executed.
This should be obvious since the JMP isn't going to be a JMP to that NOP
(the very next byte).  The only way THAT NOP will ever be executed is if
you jump or branch to its address (which you'd probably never do).
Think about it.

FYI --  Use of SHORT operator:
   "The SHORT operator is used to tighten up code.  Two different JMP
instructions will perform identically in many cases.  The 3-byte JMP
opcode can force execution to any place within the current code
segment.  A 2-byte JMP opcode can be used to transfer control to 
within +127 to -128 bytes of the jump instruction."
   "The assembler will use the shorter form whenever it knows that the
short jump is valid, that is, when the jump is backward to a label it
has already processed.  But the assembler has no way of knowing if a 
forward jump can use the 2-byte opcode.  The purpose of the SHORT
operator is to tell the assembler that the short form of the JMP
opcode is possible."

The above two paragraphs taken from Dan Rollins' "IBM PC 8088 Macro
Assembler Programming", C1985.

jvc@mirror.tmc.com

smvorkoetter@watmum.UUCP (09/30/87)

In article <309@ncrcan.UUCP> brian@ncrcan.UUCP () writes:
>In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>>.... The NOPs are never executed and don't slow down the
>>processor.
>>
>>mike ching
>
>Sorry, but the NOPS are always executed (how would the processor know that
>it is in fact a NOP, if it didn't execute it) and each execution takes 
>a finite, albeit minimal amount of execution time.

I believe what Mr. Ching means is that NOPs after a jump are never
executed.  Thus, if you have an unconditional jump followed by a NOP
like:
		jmp	foo
		nop
	bar:	more-instructions

Then the NOP after the jump will never be executed since there is no
way to get to it.

Stefan Vorkoetter
Symbolic Computation Group
University of Waterloo

lotto@wjh12.UUCP (09/30/87)

In article <309@ncrcan.UUCP> brian@ncrcan.UUCP () writes:
>In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>>.... The NOPs are never executed and don't slow down the
>>processor.
>Sorry, but the NOPS are always executed (how would the processor know that

Mild flame...

I wish people would think before posting. The first article stated:
"NOPs are inserted after JMPs that were found not to require the full
three bytes." NOPs after JMPs are NOT going to be executed because IP
is probably never going to point to them. If you DO execute a NOP, of
course it will take (some) time. You did prefetch the instruction, but
this always happens for the bytes following a JMP anyway, and that was
not what the argument was about.

It is dangerous to quote someone out of context, but if you are going
to take them to task, at least READ the whole article!

Mild flame off...
-- 
Gerald Lotto - Harvard Chemistry Dept.
UUCP:  {seismo,harpo,ihnp4,linus,allegra,ut-sally}!harvard!lotto
ARPA:  lotto@harvard.harvard.edu

feg@clyde.UUCP (09/30/87)

In article <309@ncrcan.UUCP>, brian@ncrcan.UUCP (Brian Onn) writes:
> In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
> >.... The NOPs are never executed and don't slow down the
> >processor.
> >
> >mike ching
> 
> Sorry, but the NOPS are always executed (how would the processor know that
> it is in fact a NOP, if it didn't execute it) and each execution takes 
> a finite, albeit minimal amount of execution time.

True enough, IF the processor executes the NOP. However, what was discussed
was the substitution by the assembler of a two byte unconditional jmp for
the three bytes set aside on pass 1. Obviously, the processor never gets
to execute the NOP in such a case.

Forrest Gehrke

ching@amd.UUCP (09/30/87)

Read the article. I was talking about a NOP following a branch.

mike ching

carsten@cernvax.UUCP (09/30/87)

Even if the NOP is not executed, it has to be loaded into the instruction
register (the CPU's cannot foresee the NOP (yet))! Hence it slows down
the CPU!

Carsten Andersen

carsten@cernvax

.....  hamlet() { if ((be) || (!be)) question(); }

perkins@bnrmtv.UUCP (09/30/87)

In article <309@ncrcan.UUCP>, brian@ncrcan.UUCP (Brian Onn) writes:
> In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
> >.... The NOPs are never executed and don't slow down the
> >processor.
> >
> >mike ching
> 
> Sorry, but the NOPS are always executed (how would the processor know that
> it is in fact a NOP, if it didn't execute it) and each execution takes 
> a finite, albeit minimal amount of execution time.

Here's the original text that Brian abridged:

     More likely that these NOPs follow forward branches. MASM leaves
     space for a 3-byte jump instruction during its first pass and fills
     the third byte with a NOP if a two byte instruction is all that is
     necessary. The NOPs are never executed and don't slow down the
     processor.

     mike ching

The short jump transfers execution to some instruction other than
the NOP; the NOP is NEVER executed.  The processor DOESN'T "know"
that it's a NOP; it never has a chance to execute it (although it
does put it in its prefetch queue).  Execution of a NOP takes a
finite but NON-minimal amount of time; a NOP (same as XCHG AX,AX)
takes 3 clock cycles, whereas the minimum for an instruction is 2
clock cycles (example: CLD, which I generally use in preference to
NOP because I keep the direction flag cleared anyway).

Sorry, Brian, but I think that's a first: a one-sentence posting
with THREE errors.  Better luck next time, and for now I recommend
"The 8086 Book" by Rector & Alexy (Osborne/McGraw-Hill Books).
-- 
{hplabs,amdahl,ames}!bnrmtv!perkins         --Henry Perkins

It is better never to have been born.  But who among us has such luck?
One in a million, perhaps.

farren@gethen.UUCP (10/01/87)

In article <309@ncrcan.UUCP> brian@ncrcan.UUCP () writes:
>In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>>.... The NOPs are never executed and don't slow down the
>>processor.
>>
>>mike ching
>
>Sorry, but the NOPS are always executed (how would the processor know that
>it is in fact a NOP, if it didn't execute it) and each execution takes 
>a finite, albeit minimal amount of execution time.

Sigh.  Given an instruction sequence like:

	jmp    label
	nop
label:

the NOP will NOT be executed, and, in fact, barring certain aspects of
pipelining, will never even be SEEN by the processor.  Control flow
will jump right over that instruction every time.  The processor, in fact,
does NOT know that the instruction is a NOP - it knows nothing about it
at all - it could be any value, but is a NOP for convenience and
readability.

-- 
----------------
Mike Farren             "... if the church put in half the time on covetousness
unisoft!gethen!farren   that it does on lust, this would be a better world ..."
gethen!farren@lll-winken.arpa             Garrison Keillor, "Lake Wobegon Days"

hollen@mana.UUCP (10/01/87)

In article <309@ncrcan.UUCP> brian@ncrcan.UUCP () writes:
>In article <4478@amd.AMD.COM> ching@amd.UUCP (Mike Ching) writes:
>>.... The NOPs are never executed and don't slow down the
>>processor.
>>
>>mike ching
>
>Sorry, but the NOPS are always executed (how would the processor know that
>it is in fact a NOP, if it didn't execute it) and each execution takes 
>a finite, albeit minimal amount of execution time.

Sorry, Brian, but in almost all cases, the NOP's ARE NEVER EXECUTED.  This
is because the assembler places them just after a JMP instruction.  Since
the path of execution goes AROUND the NOP, it is not executed.  Reading the
several replies to the original question would have told you this.

	Dion Hollenbeck             (619) 455-5590 x2814
	Megatek Corporation, 9645 Scranton Road, San Diego, CA  92121
			{sdcsvax,hplabs}!hp-sdd!megatek!hollen
			{sdcsvax,seismo}!esosun!

perkins@bnrmtv.UUCP (10/02/87)

In article <6375@ut-ngp.UUCP>, ayac071@ut-ngp.UUCP (William T. Douglass) writes:
> Actually, I thought that NOPs did an OR of the AX register on itself (or some
> similar operation that causes no change to data.)

ORing a register with itself sets the flags.  OR AX,AX is the
simplest way of determining if AX is positive, negative, or zero.

NOP is XCHG AX,AX.  Since XCHG doesn't affect flags, this actually
causes no data to change.
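
The difference can be sketched with a tiny Python model of the two
instructions. This is an illustration only: the function names are made up,
and the parity flag (also set by OR) is left out to keep it short.

```python
def or_ax_ax(ax: int) -> dict:
    """Flags after OR AX,AX on a 16-bit value. AX itself is unchanged.

    OR clears CF and OF and sets ZF/SF from the result."""
    ax &= 0xFFFF
    return {"ZF": ax == 0, "SF": bool(ax & 0x8000), "CF": False, "OF": False}

def xchg_ax_ax(ax: int, flags: dict):
    """XCHG AX,AX (NOP): neither the value nor the flags change."""
    return ax & 0xFFFF, dict(flags)

print(or_ax_ax(0x0000)["ZF"])   # prints: True   (AX is zero)
print(or_ax_ax(0x8000)["SF"])   # prints: True   (AX is negative)
old = {"ZF": False, "SF": True, "CF": True, "OF": False}
print(xchg_ax_ax(0x1234, old) == (0x1234, old))  # prints: True
```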
-- 
{hplabs,amdahl,ames}!bnrmtv!perkins         --Henry Perkins

It is better never to have been born.  But who among us has such luck?
One in a million, perhaps.

perkins@bnrmtv.UUCP (10/03/87)

In article <539@cernvax.UUCP>, carsten@cernvax.UUCP (carsten) writes:
> Even if the NOP is not executed, it has to be loaded into the instruction
> register (the CPU's cannot foresee the NOP (yet))! Hence it slows down
> the CPU!

Nope.  Instruction pre-fetch operates at least as fast (2 bus
cycles per byte -- one each for segment and offset) as instruction
execution.  The only penalty occurs on the first instruction after
a branch: that instruction has to be fetched before it can be
executed.  (There are actually some exceptions, such as when a
co-processor hogs the bus and prevents the Bus Interface Unit from
fetching instructions, but these are rare.)
-- 
{hplabs,amdahl,ames}!bnrmtv!perkins         --Henry Perkins

It is better never to have been born.  But who among us has such luck?
One in a million, perhaps.

kad@ttrdc.UUCP (Keith Drescher) (10/06/87)

In article <2306@sphinx.uchicago.edu> cjdb@sphinx.uchicago.edu (Charles Blair) writes:
>
>In debugging code written in assembly-language, I notice that MASM
>periodically inserts NOP's in places where they don't appear in the
>source. I imagine this is done to slow down the processor, but how
>does MASM "decide" the appropriate circumstances?
>

If MASM does insert NOPs, it is probably not to slow down the processor
(why would anyone want to slow down the processor?) but instead to "pad" a
branch instruction that was assumed to need the longer, 3-byte form by pass 1
and then found to fit the shorter, 2-byte form by pass 2.

Just a thought -KD

-- 
Keith Drescher (kad@ttrdc)          	   | ... You can check out any      
AT&T                                       | time you like - but you can
Computer Systems Division, Skokie, Il.     | never leave ...              
PATH: ...!ihnp4!ttrdc!kad                  |          - Hotel California