[comp.sys.ibm.pc.programmer] Wanted -- experiences designing with NEC V25 processor

pec@necntc.nec.com (Paul Cohen) (05/31/90)

In Article 1570 of comp.sys.ibm.pc.programmer, jhuang@sci.ccny.cuny.edu 
(Jian Huang) comments:

> NEC 25 is actually a microcontroller which has some useful
> features such as macro service (kind of DMA) and bank
> switch. 

The V25 has eight macro-service channels in addition to two independent
DMA channels.  Macro-service is like DMA except that it services internal 
peripherals whereas DMA is for external accesses.  The internal 
peripherals of the V25/35 include two timers, two full-duplex UARTs,
a programmable interrupt controller and a wait state generator.
As mentioned, the parts support register bank switching.  They 
provide 24 I/O lines and eight analog comparator inputs as well.

These processors have a prefetch pointer, separate from the
instruction pointer, to support their six-byte instruction prefetch
queue and to reduce branch penalty.

>         After using it in our PC-based motion controller
> card we found that its main problem is the same as one in
> intel 8088 --- " branch penalty " which got improved in
> 80188. So if you don't care speed then it is a good chip
> for compact motherboard design. 

The manuals for these parts show that the execution time for a far
branch for the 80188 is 14 cycles, in sharp contrast to the 15 cycles
for the V-Series parts and the 8088.  

Branch penalty is only one of many factors that affect performance.
Another is memory fetch time: two cycles per byte for the V25 and
four for the 8088 and 80188.  The most widely accepted way to judge
performance is with benchmarks, so I submit the following data from
NEC's V-Series benchmark report (Stock Number 501270), which shows
execution time in seconds:

                            0 Wait States               2 Wait States
Benchmark                 V25         80188           V25         80188

EDN A                      18            16.5          26            22.375
EDN E                     220           280.5         269           337
EDN F                     240           274           354           395
EDN H                     290           377.25        456           532.5
EDN I                   51961         57856         74635         78204
EDN K                    1094          1391          1636          1991.5
IEEE Digital Filter       318           363.875       447           537
Sieve                  307236        328403        509934        534263
Fibonacci             2804014       2700922       4363825       4474487
The V25 measurements shown above are for the ROMless 70320 which is slightly 
slower than the 70322 ROM part.  For these nine benchmarks the execution
time for the V25 at zero wait states ranges from 76.8% to 109.1% of the
execution time for the 80188 with the average being 89.5%.  At two wait
states the range is from 79.8% to 116%, the average being 91.7%.
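
For anyone who wants to check these percentages, here is a minimal
Python sketch that recomputes them from the table.  The figures are
simply retyped from the table above; the names ZERO_WAIT, TWO_WAIT and
summarize are my own and do not come from the NEC report.

```python
# Execution times retyped from the benchmark table, as (V25, 80188) pairs.
ZERO_WAIT = {
    "EDN A": (18, 16.5),
    "EDN E": (220, 280.5),
    "EDN F": (240, 274),
    "EDN H": (290, 377.25),
    "EDN I": (51961, 57856),
    "EDN K": (1094, 1391),
    "IEEE Digital Filter": (318, 363.875),
    "Sieve": (307236, 328403),
    "Fibonacci": (2804014, 2700922),
}
TWO_WAIT = {
    "EDN A": (26, 22.375),
    "EDN E": (269, 337),
    "EDN F": (354, 395),
    "EDN H": (456, 532.5),
    "EDN I": (74635, 78204),
    "EDN K": (1636, 1991.5),
    "IEEE Digital Filter": (447, 537),
    "Sieve": (509934, 534263),
    "Fibonacci": (4363825, 4474487),
}


def summarize(times):
    """Return (min, max, mean) of V25 time as a percentage of 80188 time."""
    ratios = [100.0 * v25 / i80188 for v25, i80188 in times.values()]
    return min(ratios), max(ratios), sum(ratios) / len(ratios)


for label, table in (("0 wait states", ZERO_WAIT), ("2 wait states", TWO_WAIT)):
    low, high, mean = summarize(table)
    print(f"{label}: {low:.1f}% to {high:.1f}%, average {mean:.1f}%")
```

The average here is the plain arithmetic mean of the nine per-benchmark
ratios, which is what appears to match the quoted averages; the results
should agree with the figures above to within rounding.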

These benchmarks were measured using exactly the same code for the two
processors.  8 MHz versions of both parts were measured.  The V-Series
parts have some additional instructions which speed up some of the
benchmarks, as shown below (V25 times at zero and two wait states).

EDN F                     167                         247
EDN K                     718                        1120
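
To put a number on that gain, the sketch below divides the V25 time
for the common code by the V25 time for the code using the additional
instructions.  It assumes the two columns of the short table are zero
and two wait states, in the same order as the main table; the names
TIMES and speedup are hypothetical.

```python
# V25 times as (common code, code using V-Series-only instructions),
# keyed by (benchmark, wait states).  Values retyped from the two tables;
# the column order of the second table is assumed to match the first.
TIMES = {
    ("EDN F", 0): (240, 167),
    ("EDN F", 2): (354, 247),
    ("EDN K", 0): (1094, 718),
    ("EDN K", 2): (1636, 1120),
}


def speedup(common, tuned):
    """Ratio of common-code time to tuned-code time (above 1 means faster)."""
    return common / tuned


for (name, waits), (common, tuned) in TIMES.items():
    print(f"{name}, {waits} wait states: {speedup(common, tuned):.2f}x")
```

On these four data points the ratio comes out somewhere between
roughly 1.4 and 1.5 in each case.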