[sci.electronics] DCO's revisited - an inquiry into hardware implementation

mark@cogent.UUCP (Captain Neptune) (07/06/87)

I recently posted an article discussing the design of a totally digital
oscillator for use in a musical synthesizer.  It was posted in the
"rec.music.synth" group, I forget what the distribution was.  If anyone
did not receive it and wishes for a copy, email me.  Anyway, in an attempt
to actually work out the hardware implementation, I have run into a problem.

The key part of the DCO is the adder-latch feedback loop.  Theoretically,
as my article described, you can run it at a clock rate of your choosing and
use the number of bits of your choosing, in order to achieve the desired
range and accuracy.  The problem, however, is getting circuitry that will
run fast enough.  Specifically:

The only binary adder that I can locate that is reasonably simple and
readily available is the 7483 IC.  For faster speed, the CMOS version,
the 74HC283, is available (slightly different pinout, but same stuff).
The 74HC283 is plenty fast, BUT it is only 4 bits wide.  To handle a
large number of bits (between 20 and 32), this means cascading a bunch
of them (so their carry-bits can be connected together).  The propagation
delays for the carry bits add up to a large amount.  The bottom line is
this: the adder gets REAL slow due to the required rippling of so many
carry bits.
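
A minimal sketch of the adder-latch loop and the ripple problem described
above, written as a functional model in Python rather than as hardware.
The function names and the per-slice carry delay parameter are assumptions
made for illustration; the delay estimate is a simplification (one carry
delay per 4-bit slice) and no datasheet figure is implied.

```python
# Functional model of a wide adder built from cascaded 4-bit slices,
# feeding a latch (the DCO accumulator).  All names are illustrative.

def add4(a, b, carry_in):
    """One 4-bit adder slice: returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + carry_in
    return total & 0xF, total >> 4

def ripple_add(a, b, width_bits):
    """Add two width_bits-wide numbers by chaining 4-bit slices.

    The carry out of each slice feeds the next one, so the last slice
    cannot settle until every slice before it has settled.
    """
    assert width_bits % 4 == 0
    result, carry = 0, 0
    for slice_index in range(width_bits // 4):
        shift = 4 * slice_index
        nibble, carry = add4(a >> shift, b >> shift, carry)
        result |= nibble << shift
    return result  # the final carry is dropped: the accumulator wraps

def run_dco(increment, width_bits, clocks):
    """Adder-latch loop: on each clock, latch <- latch + increment."""
    latch = 0
    history = []
    for _ in range(clocks):
        latch = ripple_add(latch, increment, width_bits)
        history.append(latch)
    return history

def worst_case_carry_delay_ns(width_bits, slice_carry_delay_ns):
    """Rough ripple estimate: one carry delay per 4-bit slice."""
    return (width_bits // 4) * slice_carry_delay_ns
```

For example, run_dco(0x4000, 16, 4) steps a 16-bit latch through 0x4000,
0x8000, 0xC000 and back around to 0, which is one full output cycle in
four clocks.  A 32-bit adder has eight slices in the carry chain, so under
this simple model the worst-case carry path is eight times the per-slice
figure.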

I see 3 possible solutions, and request feedback on them (or any other ideas):

1. Get an adder that handles more bits.  An 8-bit adder IC would approximately
   double the speed of the overall adder.  An adder that handled more than
   8 bits would be even better.  But where to find one?????

2. Use PALs.  A PAL could be programmed to act as an adder that handles many
   more bits.  An 8-bit adder could be made easily.  Some PALs even have
   latched outputs, so the whole adder-latch loop could be burned into the
   PAL.  The problem: which PAL is most suitable and where to get it
   programmed for less than an arm and a leg.

3. Use a Microprocessor Slice chip (an ALU, that is).  The 7400 series
   has a cruddy 4-bit ALU, which is no better off.  There is a place called
   Cypress Semiconductor (sp?) which makes 16-bit slices.  Two of those
   cascaded would give you a damn-good 32-bit ALU.  The problem:  fitting
   it into a reasonably simple DCO design (probably require microcode to
   run the ALU) and availability (will they sell me less than lots of 1000?)

Well, net?  There you have it.  What do YOU think is a good course of action?

Like I said, email if you want to see the original article.  It may clarify
my questions for you.

Thanks in advance.
-- 
+----------------------------------------------------------------------------+
|     Mark Steven Jeghers: the terrorist smuggling CIA weapons to Libya      |
|                                                                            |
|     {ihnp4,cbosgd,lll-lcc,lll-crg}|{dual,ptsfa}!cogent!mark                |
|                                                                            |
|     Standard Disclaimer:  Contents may have settled during shipment.       |
+----------------------------------------------------------------------------+

thompson@calgary.UUCP (07/07/87)

Off the top, it's been a while since I played with hardware, but here's my
$0.05 (after inflation :-) ) worth.

	It seems to me that the fastest method would be to either purchase
or construct a look-ahead carry generator to feed into half-adders. This
should give you the least possible number of gate-delays for any word size
you wish to use. Of course a PAL will always be faster, but the construction
is more of a pain.
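
A rough sketch of the look-ahead idea in Python, as a logic model only.
Each carry is written as a flat function of the generate and propagate
terms and the incoming carry, so no carry waits on the one before it.  The
function names are made up for this example, and real parts usually apply
this in small groups because of gate fan-in limits.

```python
def lookahead_carries(a, b, carry_in, width_bits):
    """Return carries c[0..width_bits] from generate/propagate terms.

    g[i] = a[i] AND b[i]   (this bit generates a carry by itself)
    p[i] = a[i] XOR b[i]   (this bit passes an incoming carry along)
    c[i+1] = g[i] OR (p[i] AND g[i-1]) OR ... OR (p[i]..p[0] AND c[0])
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(width_bits)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width_bits)]
    carries = [carry_in]
    for i in range(width_bits):
        # Each carry is a flat sum of products of g, p and carry_in only;
        # earlier carries are never reused, which is the whole point.
        carry = 0
        for j in range(i, -1, -1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            carry |= term
        term = carry_in
        for k in range(i + 1):
            term &= p[k]
        carry |= term
        carries.append(carry)
    return carries

def lookahead_add(a, b, width_bits, carry_in=0):
    """Sum bits are a XOR b XOR carry; returns (sum, carry out)."""
    carries = lookahead_carries(a, b, carry_in, width_bits)
    total = 0
    for i in range(width_bits):
        sum_bit = ((a >> i) ^ (b >> i) ^ carries[i]) & 1
        total |= sum_bit << i
    return total, carries[width_bits]
```

Checking lookahead_add against ordinary addition for every pair of 4-bit
inputs is a quick way to confirm the carry equations.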

	Hope this helps a little,
	Bruce Thompson.

P.S.: if you look in any good digital circuit design book there should be
a mention of look-ahead carry generation.

------------------------------------------------------------------------------
Bruce Thompson				| Disclaimer? But...but... I didn't
University of Calgary,			| say anything....really! Well,
Computer Science Department		| nothing of any interest anyways.
(403)220-3538 or (403)220-5109 (office)	|

cmcmanis%pepper@Sun.COM (Chuck McManis) (07/07/87)

In article <235@cogent.UUCP.> mark@cogent.UUCP (Mark Steven Jeghers) writes:
.>
.>I recently posted an article discussing the design of a totally digital
.>oscillator for use in a musical synthesizer.  It was posted in the
.>"rec.music.synth" group, I forget what the distribution was.  If anyone
.>did not receive it and wishes for a copy, email me.  Anyway, in an attempt
.>to actually work out the hardware implementation, I have run into a problem.
.>
.>The key part of the DCO is the adder-latch feedback loop.  Theoretically,
.>as my article described, you can run it at a clock rate of your choosing and
.>use the number of bits of your choosing, in order to achieve the desired
.>range and accuracy.  The problem, however, is getting circuitry that will
.>run fast enough.  Specifically:
.>
.>The only binary adder that I can locate that is reasonably simple and
.>readily available is the 7483 IC.  For faster speed, the CMOS version,
.>the 74HC283, is available (slightly different pinout, but same stuff).
.>The 74HC283 is plenty fast, BUT it is only 4 bits wide.  To handle a
.>large number of bits (between 20 and 32), this means cascading a bunch
.>of them (so their carry-bits can be connected together).  The propagation
.>delays for the carry bits add up to a large amount.  The bottom line is
.>this: the adder gets REAL slow due to the required rippling of so many
.>carry bits.

The secret (one of them) is to use an adder with carry 'look ahead' so 
that each adder only adds one gate delay to the loop. The second secret
is to use smaller wave tables. A DCO built with a 256 byte lookup table
needs a clock rate of 5.12 MHz to generate a 20 kHz tone. Since 20K
is at the top of the audio spectrum and 5.12 MHz is well within TTL
limits you can do reasonably well at that rate. Now for generating 
music (with frequencies accurate to say three decimal digits) you will
need to 'overclock' the circuit to get the appropriate frequencies. 
Waving the hands here and throwing out some math, let's see, add three
orders of magnitude to 20K, which gives you 20Meg, multiply by 256
and wow we're up to 5 gigahertz. Fortunately if you are using the
DCO for music you can cheat because you 'know' that A in one octave is
half the frequency of the next octave up, or twice the frequency of the
next octave down. And that means you simply set up 12 oscillators
that have the desired frequencies and switch between them depending on
the note. Say your top note is a *very* high A at 28,160 Hz then your
A oscillator would be running at 7.208960 MHz. And although that particular
value may be hard to find you can find one close and use a trimmer capacitor
to detune it to the appropriate frequency. 
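
The arithmetic above can be checked with a short Python sketch.  The table
length and the 28,160 Hz top A come from the post; the equal-tempered
spacing of the other eleven notes and all function names are assumptions
added for illustration.

```python
TABLE_LENGTH = 256      # samples per waveform cycle, from the post
TOP_A_HZ = 28160.0      # the "very high A" from the post (440 Hz * 64)

# The 12 top-octave notes, listed downward from the top A.
NOTES_DOWN_FROM_A = ["A", "G#", "G", "F#", "F", "E",
                     "D#", "D", "C#", "C", "B", "A#"]

def table_clock_hz(tone_hz, table_length=TABLE_LENGTH):
    """Clock needed to step through the whole table once per tone cycle."""
    return tone_hz * table_length

def top_octave_clocks(top_a_hz=TOP_A_HZ):
    """Oscillator frequency for each top-octave note.

    Assumes equal temperament: each semitone down divides the
    frequency by 2 ** (1 / 12).
    """
    return {name: table_clock_hz(top_a_hz * 2 ** (-k / 12))
            for k, name in enumerate(NOTES_DOWN_FROM_A)}

def divided_note_hz(top_clock_hz, octaves_down, table_length=TABLE_LENGTH):
    """Tone heard when a top-octave clock is halved octaves_down times."""
    return top_clock_hz / (2 ** octaves_down) / table_length

if __name__ == "__main__":
    for name, clock in top_octave_clocks().items():
        print(f"{name:2s} {clock / 1e6:.6f} MHz")
```

Halving the A clock six times gives divided_note_hz(7208960.0, 6), which
is 440 Hz, so one oscillator per note name plus a chain of divide-by-two
stages covers every octave.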


--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that didn't you.

prs@oliveb.UUCP (Philip Stephens) (07/08/87)

In article <22755@sun.uucp> cmcmanis@sun.UUCP (Chuck McManis) writes:
>In article <235@cogent.UUCP.> mark@cogent.UUCP (Mark Steven Jeghers) writes:
>.>
>.>I recently posted an article discussing the design of a totally digital
>.>oscillator for use in a musical synthesizer.  It was posted in the

>.>The key part of the DCO is the adder-latch feedback loop.  Theoretically,

>.>of them (so their carry-bits can be connected together).  The propagation
>.>delays for the carry bits add up to a large amount.  The bottom line is
>.>this: the adder gets REAL slow due to the required rippling of so many
>.>carry bits.
>
>The secret (one of them) is to use an adder with carry 'look ahead' so 
>that each adder only adds one gate delay to the loop. The second secret
>is to use smaller wave tables. A DCO built with a 256 byte lookup table
>needs a clock rate of 5.12 MHz to generate a 20 kHz tone. Since 20K

Lookahead, yes... or use a DSP chip with a 32 bit accumulator.  BTW, I
meant to reply to same article as this one did, that there are now MMI
PALs with 10 ns thruput, called 'D-PALS'.  Eight outputs, with 8, 6, 4, 
or none registered.

As for speed, you need about 40 kHz regardless of currently selected
tone; vary the amount that gets added to the wrap-around pointer each
25 microseconds.  No need to go into MHz, let alone GHz, except for
doing 8 or 16 voices at once using the same RAM; whether same address
space or not, need separate accumulators or swap in and out... much
higher bandwidth requirement, but less than 2 orders of magnitude 
above 40 kHz, i.e. less than 4 MHz -- quite a low-tech frequency.
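
A minimal Python sketch of this fixed-rate scheme: the sample clock stays
at 40 kHz and only the increment changes with the note.  The 32-bit
accumulator width, the 256-entry sine table and every name here are
assumptions chosen for the example, not details taken from the posts.

```python
import math

SAMPLE_RATE_HZ = 40_000   # one output sample every 25 microseconds
ACC_BITS = 32             # accumulator width (assumed for this sketch)
TABLE_BITS = 8            # 256-entry wave table (assumed)

# One cycle of a sine wave, scaled to signed 8-bit values.
WAVE_TABLE = [round(127 * math.sin(2 * math.pi * i / (1 << TABLE_BITS)))
              for i in range(1 << TABLE_BITS)]

def increment_for(tone_hz):
    """Amount added to the pointer each sample to produce tone_hz."""
    return round(tone_hz * (1 << ACC_BITS) / SAMPLE_RATE_HZ)

def actual_tone_hz(increment):
    """Tone actually produced by a given integer increment."""
    return increment * SAMPLE_RATE_HZ / (1 << ACC_BITS)

def render(tone_hz, num_samples):
    """Step the wrap-around pointer; index the table with its top bits."""
    increment = increment_for(tone_hz)
    mask = (1 << ACC_BITS) - 1
    pointer = 0
    samples = []
    for _ in range(num_samples):
        samples.append(WAVE_TABLE[pointer >> (ACC_BITS - TABLE_BITS)])
        pointer = (pointer + increment) & mask
    return samples

def aggregate_rate_hz(voices):
    """Accumulator updates per second when voices share one adder."""
    return voices * SAMPLE_RATE_HZ
```

With these numbers the frequency step is 40000 / 2**32 Hz, a little under
0.00001 Hz, and sixteen voices sharing one adder need 640,000 additions
per second, well under the 4 MHz bound mentioned above.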

I happen to have borrowed documentation on the TI DSP chip TMS320C25,
which has 100 ns instruction time, 40 or 32 MHz clock, 32 bit ALU and
accumulator, 544 'words' of onboard ram (16 bits each)... I haven't 
actually worked with it, but I think it's adequate (I borrowed the
info for looking at doing my own effects onboard; I think I'll need
separate DSP for that if I use one for sound-table indexing, as the
'effects' will need multiplication as well as addition).  I'm sure
other DSP chips and plain vanilla CPU's with 32 bit addition will do
for this indexing chore, although some might not have processing
power left over for MIDI management... any experts want to comment???


	- Phil		prs@oliveb.UUCP (Phil Stephens)     
	or:	(hplabs,ihnp4,sun,allegra)oliveb!oliven!prs 
	Mail welcome, but I'm too lazy to always answer   8-}  8-}
	Work phone: 408 996 3867 x2224	(approx 10a-6p PST/PDT)

mj@elmgate.UUCP (07/09/87)

	I've been thinking lately about doing digital music
	synthesis with a TI 32010 DSP chip.  As I understand it,
	the 32010 can perform a multiply/accumulate in 160 ns,
	and the 32035 can do it in 60!  The nice thing about
	the 32010 is that they only cost about $10, so it would
	be cheap to cascade them.

	I haven't read ANY of this anywhere, it's just info
	I got from friends, so it may be completely wrong.
	I hope to do some 320 work in the future, but right
	now I'm too busy.  If anyone tries this, drop me a line,
	I'd be very interested.

-------------------------------------------------------------------------------
Mark A Johnson - Eastman Kodak Co. - Dept 646 - KEEPS - Rochester, NY 14650
The opinions expressed above are not necessarily those of Eastman Kodak Company
and are entirely my responsibility.
UUCP ...rutgers!rochester!kodak!elmgate!mj

prs@oliveb.UUCP (Philip Stephens) (07/10/87)

In article <678@elmgate.UUCP> mj@elmgate.UUCP (Mark A. Johnson) writes:
>
>	I've been thinking lately about doing digital music
>	synthesis with a TI 32010 DSP chip.  As I understand it,
>	the 32010 can perform a multiply/accumulate in 160 ns,
>	and the 32035 can do it in 60!  The nice thing about
>	the 32010 is that they only cost about $10, so it would
>	be cheap to cascade them.

Me too!  Not sure which DSP to use, but I have borrowed doc. book
from a coworker for the TI chips.  Hard to find which commands
take more than one cycle, but 16 bit multiply is single cycle in
320- C10, 20, 25.  The C10 and 20 have 200 ns cycle (unless there is
a faster version than 1986 book lists), and the 32025 can have
cycle of 100 ns.  (Note that input clock is 4 times faster, and
has a maximum of 150 ns in all 3 cases).

Correction, I found the 1 vs 2 cycle info; mostly branches, push/pop,
and subroutine stuff take 2; all math is single instruction cycle;
there are two table-oriented instructions that take 3 cycles.  (I'm 
looking at the C10 data; the other two don't say cycles but do list
word length of instruction, and unlike the C10 they have some 2-word
math instructions, which I'm guessing take two cycles).

My guess for fetch one of several 32 bit indexes from on-chip ram, add a
16 bit increment, update index, use result as indirect address, and
output fetched word to an I/O port (ie, cycles per voice):

load word to low accumulator		1
load word to high accumulator		1
add word to low accumulator		1
store word from low accumulator		1
store word from high accumulator	1
load from offchip, with indirect addr	1??
output to port				2

total					8 (or 9?)
	ie, less than 2 microseconds with the 320C10, < 1 with 20 or C25!
(looks like can probably do about 12 voices per C10 at 40 kHz, or more
like 8 voices per chip if also do envelopes.  Can easily afford two chips,
or several.  Additional chip(s) for reverb, flanging, etc).
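
The budget above can be worked through in a few lines of Python.  The
step list and cycle counts are the guesses from the table, and the 200 ns
and 100 ns cycle times are the figures quoted earlier in this post; the
function names are invented for the example.  The result is only an upper
bound, since it ignores loop overhead, envelopes and mixing.

```python
# Per-voice steps and guessed cycle counts, copied from the table above.
VOICE_STEPS = [
    ("load word to low accumulator", 1),
    ("load word to high accumulator", 1),
    ("add word to low accumulator", 1),
    ("store word from low accumulator", 1),
    ("store word from high accumulator", 1),
    ("load from offchip, with indirect addr", 1),  # marked "1??" above
    ("output to port", 2),
]

def cycles_per_voice(steps=VOICE_STEPS):
    """Total instruction cycles to service one voice once."""
    return sum(cycles for _, cycles in steps)

def per_voice_time_us(cycles, cycle_ns):
    """Time to service one voice, in microseconds."""
    return cycles * cycle_ns / 1000.0

def max_voices(cycles, cycle_ns, sample_rate_hz=40_000):
    """Whole voice updates that fit into one sample period."""
    sample_period_ns = 1_000_000_000 // sample_rate_hz
    return sample_period_ns // (cycles * cycle_ns)

if __name__ == "__main__":
    for cycle_ns in (200, 100):
        for cycles in (cycles_per_voice(), cycles_per_voice() + 1):
            print(cycle_ns, "ns cycle,", cycles, "cycles:",
                  per_voice_time_us(cycles, cycle_ns), "us per voice,",
                  max_voices(cycles, cycle_ns), "voices max")
```

At a 200 ns cycle the bound comes out at 15 voices for 8 cycles and 13
for 9 cycles, so the estimate of about 12 voices leaves a little headroom.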

I'm just speculating from the data sheet; I hope someone else can give
more practical feedback on this or any other DSP chip family (including
Motorola, hint hint.  And I've sent for OKI info; anyone else already
have it?  Any other interesting chips to suggest?  DSP or other micro
processor with fast 16 or 32 bit add and multiply.)  Of course, being
cheap like the C10 helps!!!!!

BTW, you can probably also use the bugger as a UART for MIDI in and out;
silly in some applications, but handy if you put it in a gimmick box
that is just combining or splitting midi channels, etc.  And if you 
bother to learn to program the 320C10, you might as well use it instead
of a Z80 or whatever for simple tasks, even if they waste most of the
chip's ability.  Why learn 2 or 3 different processors?  My idea of
efficiency, anyway.

Is there enough interest for a DSP or C10 group, or rather, a mailing list?
(Email me, and I will report; not sure I want to moderate, but maybe).

	- Phil		prs@oliveb.UUCP (Phil Stephens)     
	or:	(hplabs,ihnp4,sun,allegra)oliveb!oliven!prs 
	Mail welcome, but I'm too lazy to always answer   8-}  8-}
	Work phone: 408 996 3867 x2224	(approx 10a-6p PST/PDT)

news@santra.UUCP (news) (07/13/87)

In article <1948@oliveb.UUCP> prs@oliven.UUCP (Philip Stephens) writes:
>
>My guess for fetch one of several 32 bit indexes from on-chip ram, add a
>16 bit increment, update index, use result as indirect address, and
>output fetched word to an I/O port (ie, cycles per voice):
>
>load word to low accumulator		1
>load word to high accumulator		1
>add word to low accumulator		1
>store word from low accumulator	1
>store word from high accumulator	1
>load from offchip, with indirect addr	1??
>output to port				2
>
>total					8 (or 9?)
>	ie, less than 2 microseconds with the 320C10, < 1 with 20 or C25!
>(looks like can probably do about 12 voices per C10 at 40 kHz, or more
>like 8 voices per chip if also do envelopes.  Can easily afford two chips,
>or several.  Additional chip(s) for reverb, flanging, etc).
>

The calculations above are obviously of the right order of magnitude. I guess
you want to do some filtering, too, and...

I guess it might be easier to have one or two complete voices on a chip
and then one for combining them all together, with some effects, maybe, than
to have a long chain of chips. The processing power doesn't increase,
of course, but it may well be easier to control and program in a parallel
configuration. This way you will more easily get different sounds from
different voices.

>I'm just speculating from the data sheet; I hope someone else can give
>more practical feedback on this or any other DSP chip family (including
>Motorola, hint hint.  And I've sent for OKI info; anyone else already

Other names are National (32something), NEC, AT&T etc. Basically no real
differences in performance (given the same clock speed), at least in
this application. Texas has been around long and has support available.
(Compilers (for non-real-time stuff), evaluation boards, application 
notes ...) If you are starting from the beginning, Texas is a good choice (as
could be many others, too; no flames, please.)

>Is there enough interest for a DSP or C10 group, or rather, a mailing list?
>
>	- Phil		prs@oliveb.UUCP (Phil Stephens)     

A few years ago there was some discussion for x.x.x.DSP, but no newsgroup.
If a mailing list is set up, include me.

The DSP-based music synthesizer sounds like a very interesting project. I
volunteer to do some work for it, if a team can be assembled. I have quite
good resources, which I can use for hobby projects in my spare time.

Let's go for it!

Juha Kuusama, jku@kolvi ( ...!mcvax!tut!kolvi!jku )
 Helsinki University of Technology, Otakaari 5 I, 02150 Espoo, Finland

phd@speech1.cs.cmu.edu (Paul Dietz) (07/14/87)

In article <6641@santra.UUCP>, news@santra.UUCP (news) writes:
> In article <1948@oliveb.UUCP> prs@oliven.UUCP (Philip Stephens) writes:
> >I'm just speculating from the data sheet; I hope someone else can give
> >more practical feedback on this or any other DSP chip family (including
> >Motorola, hint hint.  And I've sent for OKI info; anyone else already
> 
> Other names are National (32something), NEC, AT&T etc...

The National part WAS the LM32900. Apparently National suddenly realized
that this was not a very profitable undertaking, and killed its entire
DSP effort. (After many years, and MANY millions...) Watch this space 
for further announcements of other DSP chips falling by the wayside...

Paul H. Dietz
Carnegie Mellon University