Whistle While You Work
A Look at a Modern DSP
Remember your father’s DSP? Well, the Texas Instruments Piccolo
isn’t it. Tom presents this “DSP in disguise” and describes how you
can use it to handle signal-processing applications such as
switcher power supplies and other handy applications.
Labels and acronyms have always been part of the silicon game.
Shorthand can be helpful, but sometimes it can be misleading,
especially as the underlying technology changes over time.
RISC and CISC are examples of labels whose meanings have devolved
to the point of meaninglessness. Yes, when originally coined many
years ago, the terms clearly defined distinct architectures. But
over time, as each camp adopted the best features of the other, the
differences have become blurred to the point that what a chip is
called has little to do with its “instruction set complexity.”
“Microcontroller” (i.e., MCU) and “microprocessor” (i.e., MPU) are
other schizophrenic labels, though still meaningful at the
extremes. For example, an 8051 is clearly an MCU, while the ’x86
under the hood of your PC is clearly an MPU. But in the middle is
a vast array of parts with aspects of both MCUs (e.g., on-chip
flash memory and I/O) and MPUs (e.g., external bus).
This month, let’s contemplate another acronym du jour, “DSP.” The
term itself has semantic ambiguity. After all, don’t MCUs and MPUs
process digital signals? Historically, DSPs were differentiated by
their multiple busses and high-speed math capabilities, as typified
by the classic MAC (multiply and accumulate) operation at the heart
of signal-processing (e.g., filter) inner loops. But these days,
virtually every 32-bit MCU or MPU has a measure of those
capabilities as well.
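For readers who have not written one, a MAC-based inner loop can be sketched in a few lines of portable C. This is only an illustration: the function name `fir_sample`, the Q15 fixed-point format, and the assumption that the coefficient magnitudes sum to no more than 1.0 are mine, not anything taken from a TI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One output sample of an N-tap FIR filter in Q15 fixed point.
 * coeff[] and hist[] hold Q15 values; hist[0] is the newest input.
 * Assumes the coefficient magnitudes sum to <= 1.0 so the 32-bit
 * accumulator cannot overflow. */
int16_t fir_sample(const int16_t *coeff, const int16_t *hist, size_t ntaps)
{
    assert(coeff != NULL && hist != NULL);
    int32_t acc = 0;                          /* wide accumulator */
    for (size_t i = 0; i < ntaps; i++) {
        acc += (int32_t)coeff[i] * hist[i];   /* the multiply-accumulate */
    }
    /* Scale the Q30 sum back to Q15 (arithmetic shift assumed). */
    return (int16_t)(acc >> 15);
}
```

With two taps of 0.5 each (16384 in Q15) and inputs of 1000 and 3000, the function returns their average, 2000. A DSP earns its keep by executing the body of that loop in a single cycle.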
Indeed, if anything, the trend has seen classic DSPs on the
defensive. For instance, long-time Circuit Cellar contributor
Professor Michael Smith wrote an article titled “To DSP or Not To
DSP: Will a RISC Chip Do It Better?” in Circuit Cellar 28. That was
way back in 1992!
But DSPs aren’t dead. They’ve just been hiding, retro-marketed with
new labels such as Digital Signal Controller. Along the way DSP
suppliers have diligently worked to overcome historic DSP
objections: high-price, high power consumption, hard-to-program,
needs extra chips and glue logic, and more.
Let’s put all the labels and preconceptions aside and take an
impartial look at a modern DSP in disguise, the Piccolo from Texas
Instruments. Yes, it’s a natural for classic DSP apps such as fancy
(e.g., sensorless) motor controls and smart (e.g., power factor
correction) switcher power supplies. But I think you’ll be
surprised to see the potential Piccolo offers general-purpose
applications as well. In many, many respects it definitely isn’t
your father’s DSP.
PIPE DREAM
Figure 1—The TMS320F280xx Piccolo family is the latest addition to TI’s
venerable line of C2000 DSPs (er, make that MCUs).
Piccolo comprises a family of parts with roots in the venerable
Texas Instruments ’C2000 DSP line-up (see Figure 1). Nevertheless,
from 50,000 feet, you’d be hard-pressed to tell the difference between
Piccolo and a traditional 32-bit MCU (see Figure 2). Notably, it’s
got on-chip memory, peripherals, and glue logic and is fully
capable of stand-alone, single-chip operation.
Figure 2—With on-chip flash memory and RAM, glue logic, and a full
complement of I/O, Piccolo is a contender for low-cost single-chip
applications.
But dive down to treetop level and the DSP difference becomes more
apparent. The processing core itself, with an eight-stage pipeline,
is more complicated than the three-to-five-stage unit you’ll find on
a typical 32-bit MCU. It’s arguably a bit of architectural overkill
because the slowest Piccolo runs at a blue-collar 40 MHz, but it makes
more sense when you realize Piccolo should, and does, maintain a
measure of compatibility (i.e., assembly source) with higher-end
’C2000 parts that run at hundreds of megahertz.
Longer pipelines can be more hazardous (for example, when one
instruction tries to read an operand not yet written by the
preceding one), and Piccolo is no exception. However, the TI design
features a measure of hardware interlocking that will keep you out
of trouble. Ideally, you, or more likely the C compiler, will
schedule instructions to avoid hazards; but if not, the pipeline
will automatically stall. Notably, this does not incur the code
bloat of delay slots (i.e., NOP insertions) that purely
software-scheduled pipelines incur.
Figure 3—RISC, CISC, or DSP? The Piccolo architecture combines aspects of
all three.
Piccolo features an interesting mix of architectural styles, as
seen in the layout of the main registers (see Figure 3). The XT, P,
and ACC registers do the DSP heavy lifting for the single-cycle 32 ×
32 multiplier and barrel shifter. On the other hand, the eight 32-bit
XAR general-purpose registers lend a RISC-like feel. To streamline
direct addressing, the 16-bit data pointer (DP) register fronts a
6-bit offset contained in the instruction to reach any location in
the lower 4M words of data space. Of course, much of the big-iron
addressing capability is underused in single-chip devices.
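The direct-addressing arithmetic is easy to check: 16 bits of page plus 6 bits of offset gives 22 bits, and 2^22 is 4M words. The sketch below assumes the DP value simply forms the upper bits of the address with the offset appended below it; the function name `direct_address` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Form a 22-bit data address from the 16-bit data pointer (DP) value
 * and the 6-bit offset carried in the instruction.
 * Assumption: DP supplies the upper 16 bits, the offset the lower 6. */
uint32_t direct_address(uint16_t dp, uint8_t offset6)
{
    assert(offset6 < 64);                  /* offset must fit in 6 bits */
    return ((uint32_t)dp << 6) | offset6;
}
```

The largest result, from a DP of 0xFFFF and an offset of 63, is 0x3FFFFF, the last word of a 4M-word space.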
While Piccolo is a Harvard design capable of simultaneous
instruction fetch, memory read, and memory write, access is made
simpler by the fact that memories are mapped into both program and data
spaces. That’s helpful because it makes it easy to access data
stored in flash memory or run programs stored in RAM.
Instruction set-wise, Piccolo has something for everyone with RISC,
CISC, and DSP all rolled into one. As noted above, the XAR
registers support a measure of load/store processing, yet also can
serve as pointers that allow instructions to operate directly on
memory. The DSP DNA is apparent in all manner of number-crunching
embellishments, such as saturation, rounding, and so on. But
overall, Piccolo is quite CISCy, both as a matter of principle and
to maintain software compatibility with earlier ’C2000 parts.
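To see what saturation and rounding buy you, here is roughly what a general-purpose C programmer has to write by hand when the hardware does not help. Both function names are made up for this sketch, and nothing here is specific to the Piccolo instruction set.

```c
#include <assert.h>
#include <stdint.h>

/* Add two 32-bit signed values, clamping at the limits instead of
 * wrapping around. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + b;          /* 64-bit sum cannot overflow */
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Drop the low n bits of x, rounding to nearest rather than truncating. */
int32_t round_shift(int32_t x, unsigned n)
{
    assert(n >= 1 && n <= 30);
    int64_t biased = (int64_t)x + ((int64_t)1 << (n - 1)); /* add half an LSB */
    return (int32_t)(biased >> n);         /* arithmetic shift assumed */
}
```

A wrapped sum in a control loop can flip a large positive drive signal into a large negative one, which is why DSPs build the clamp into the datapath.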
EASY DOES IT
Historically, one of the knocks on DSPs was that they paid little
attention to the details beyond their number-crunching mission,
burdening a design (and the designer) with the need for extra
peripheral chips, clock generation, multiple power supplies, and
sundry glue logic. It’s here that Piccolo stands out from its
predecessors with features like an on-chip oscillator with
PLL and missing clock detection, a single 3.3-V power supply with
an on-chip regulator, power-on/brownout/watchdog RESET, and a
vectored interrupt controller, all in tidy surface-mount
packages. Unlike traditional DSPs, Piccolo is downsized with
MCU-like 38-, 48-, 64-, and 80-pin package options.
Peripheral-wise, there’s a full complement comprising the usual
suspects: GPIO pins (with an input glitch filtering feature),
serial ports (UART, SPI, I2C—have it your way), and three
general-purpose 32-bit timers. Piccolo’s signal-centric aspirations
are served by a multichannel, high-speed (up to 4.6 Msps) 12-bit ADC
with flexible triggering and auto-sequencing options.
These features are all well and good, but except for the formidable
number-crunching capability, they’re otherwise little different
from those found on a typical 32-bit flash memory MCU (which is,
after all, the point). All else being equal, for those who aren’t
already in a committed relationship with Piccolo’s C2000
predecessors, it would seem there’s little compelling reason to
switch.
But maybe all else isn’t equal, considering the advanced I/O
capabilities embodied in Piccolo’s on-chip enhanced control
peripherals, which include enhanced (ePWM) and high-resolution
(HRPWM) PWMs, input capture (eCAP) and quadrature encoder (eQEP).
Naturally, the enhanced I/O modules are a big plus for traditional
DSP apps (e.g., motor control) but may find favor in other
high-frequency, timing-centric applications. For example, using a
fancy “Micro-Edge Positioning” technique, the HRPWM offers edge
timing resolution on the order of 150 ps. Yes, that’s “picoseconds”
with a “p.” Try that with your run-of-the-mill MCU.
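A little arithmetic shows why edge resolution matters. The number of distinct duty-cycle steps is the PWM period divided by the edge resolution. In the sketch below, the 1-MHz switching frequency is my own example figure, and the 25-ns comparison case assumes a conventional PWM counter clocked at the 40-MHz rate of the slowest Piccolo.

```c
#include <assert.h>

/* Number of distinct duty-cycle steps in one PWM period:
 * period divided by edge-timing resolution. */
double pwm_steps(double pwm_hz, double edge_resolution_s)
{
    assert(pwm_hz > 0.0 && edge_resolution_s > 0.0);
    return (1.0 / pwm_hz) / edge_resolution_s;
}
```

Under those assumptions, `pwm_steps(1e6, 25e-9)` works out to about 40 steps, while `pwm_steps(1e6, 150e-12)` is roughly 6,667. That is the difference between a coarse control knob and a fine one.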
Table 1—The control law accelerator (CLA) in higher-end Piccolos can handle
closed-loop control by itself, freeing the main processor for other
tasks. In this example provided by TI, the combination of the C28x
processor core and the CLA is nearly three times faster than the
processor core alone.
Higher-end Piccolos will also include what TI calls a control law
accelerator (CLA). I don’t have specs yet, but sifting through the
press tea leaves reveals the CLA is an independent 32-bit
floating-point coprocessor that autonomously runs control loops
(e.g., PID). Able to communicate directly with peripherals (e.g.,
ADC, ePWM) and deal with interrupt requests, the CLA is said to
significantly reduce overhead for the main processor (see Table 1).
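For reference, the kind of loop in question looks something like the textbook PID step below. This is generic floating-point C, not CLA code; TI had not published the CLA programming details when I wrote this, so the structure and names (`pid_state`, `pid_step`) are purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* State and gains for a textbook PID controller. */
typedef struct {
    float kp, ki, kd;     /* proportional, integral, derivative gains */
    float integral;       /* running integral of the error */
    float prev_error;     /* error from the previous sample */
} pid_state;

/* One control-loop iteration: compute the error, update the state,
 * and return the new actuator command. dt is the sample period in
 * seconds. */
float pid_step(pid_state *pid, float setpoint, float measured, float dt)
{
    assert(pid != NULL && dt > 0.0f);
    float error = setpoint - measured;
    pid->integral += error * dt;
    float derivative = (error - pid->prev_error) / dt;
    pid->prev_error = error;
    return pid->kp * error + pid->ki * pid->integral + pid->kd * derivative;
}
```

In a typical design, something like `pid_step` runs once per ADC conversion and its result is written to a PWM compare register. Offloading exactly that chore is what the CLA is said to do.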
SYMPHONY IN C
Of course, if you’re already into DSPs, TI has you covered when it
comes to apps like motor control, smart power supplies, and such.
They’ve got an effective development and prototyping regime
comprising DIMM-like processor modules that plug into various
application-specific motherboards and plenty of software, app
notes, and more to go with. There’s even something called DSP/BIOS,
kind of a mini-me modular RTOS, comprising a library with hundreds
of basic data acquisition, storage, and control functions.
But you don’t need to be a rocket scientist, or have a budget like
NASA, to kick the tires. Let’s check out the Piccolo MCU
controlSTICK, which definitely qualifies as an impulse buy at just
$39.
Photo 1—If it walks like an MCU and talks like an MCU? The MCU controlSTICK
highlights the fact Piccolo (shown here along with a Future
Technology Devices International chip that handles the USB
interface) is well-suited for low-cost single-chip applications.
If you haven’t caught on to the fact Piccolo can impersonate an
MCU, Photo 1 should make it clear. There, you can see that the
controlSTICK comprises little more than the Piccolo itself along
with a Future Technology Devices International chip to handle the
USB interface. If it walks like an MCU and talks like an MCU?
Photo 2—Programmers will feel right at home with Code Composer Studio, TI’s
full-featured C/C++ IDE for Piccolo.
Photo 3—With a single-chip MCU, all the action is hidden inside behind the
pins. Piccolo includes an on-chip bus analyzer that goes far beyond
a simple “breakpoint” so you can see what's going on.
The same goes for the Texas Instruments Code Composer Studio, which,
at least at first glance, looks like a typical MCU toolchain with a
dizzying array of menus and windows to play with (see Photo 2).
Indeed, if anything, Code Composer Studio takes it a step further
with advanced analysis capabilities that leverage Piccolo’s on-chip
debug hardware (see Photo 3). Scratch a bit further under the
surface and you’ll find unique vestiges of Piccolo’s DSP heritage.
For example, you can certainly monitor data with the conventional
Watch Window. But you can also capture data as a graph, and Code
Composer Studio can even run it through an FFT for you (see Photo
4). Pretty cool.
Photo 4—TI may not call Piccolo a “DSP,” but if you should happen to
stumble across a signal, Code Composer Studio has got you covered
with unique signal-centric debug features.
I’ll be frank and admit that, with the complexity of modern toolchains,
kicking the tires is just that. It would take quite a while to get
up to speed and actually test drive all of the advanced features.
The best I can do is tell you I ran through some of the demo
projects and everything worked as advertised.
As someone who has always been interested in computer architecture,
within the thousands of pages of chip, tool, and application
documentation, my attention was captured by the “Optimizing Your
Code” chapter in the C/C++ compiler manual.[1] I learned long ago
that any discussion of an architecture’s merits is moot unless
compiler quality is factored in.
It was fun and interesting to see all the hoops the compiler jumps
through to tweak your code. I mean it’s almost as though the
compiler folks would just as soon get rid of the application
programmer and do it all themselves.
There’s a laundry list of dozens of potential optimizations. Many
of these are pretty obvious and old-school, such as dead-code
removal (i.e., remove code that is not reachable) and common
sub-expression elimination (eliminates redundant calculations). And
as you might imagine, there are a number of low-level instruction
scheduling optimizations to keep that long pipeline from stalling
while preserving the intent (e.g., ordering) of the programmer.
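A before-and-after pair makes the two old-school optimizations concrete. The functions below are hand-written illustrations of the idea, not actual output from the TI compiler, and the names are invented.

```c
#include <assert.h>

/* As the programmer wrote it. */
int before(int a, int b, int c)
{
    int x = (a + b) * c;
    int y = (a + b) * 2;      /* (a + b) is computed a second time */
    if (0) {
        x = -1;               /* can never execute: dead code */
    }
    return x + y;
}

/* What an optimizer might effectively turn it into. */
int after(int a, int b, int c)
{
    int t = a + b;            /* common sub-expression computed once */
    return t * c + t * 2;     /* unreachable branch removed */
}

/* Sanity check that both versions agree for one set of inputs. */
void check_same_result(void)
{
    assert(before(1, 2, 3) == after(1, 2, 3));
}
```

The two functions always return the same value; the second simply does less work to get there.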
Actually, the DSP gurus have been at the front of the pack leading
the charge to fancier compilers, and it shows in CCStudio with some
truly Poindexter high-level optimizations. You’re probably familiar
with ones like loop unrolling (eliminates branches) and function
inlining (ditto), but how about Alias Disambiguation?
C programmers love pointers, and C compiler writers hate them.
Here’s the “alias” dilemma. Compiler writers want the flexibility
to move instructions around willy-nilly. And they could if it
weren’t for those darn data dependencies. For example, if the
program is:
a = a1 * a2
…
b = b1 + b2
…
z = a + b
The first two statements can be moved around and even their order
can be reversed. The only restriction is the last statement has to
remain last because z’s value depends on the prior setting of a and
b.
Easy enough. But what if instead of referencing the variable by
name, the programmer referenced it using a dynamically calculated
pointer to (i.e., address of) the variable?
Now the compiler has to try to establish the possible run-time
values a pointer can, or can’t, take on in order to guarantee
dependencies aren’t compromised. Two different pointers may point
to the same variable (the “alias”) and the compiler has to figure
out whether that is, or isn’t, possible (the “disambiguation”).
I’m no expert, so it all seems like magic to me. I do remember
discussing the subject of alias disambiguation with a compiler
expert once. He told me there are ways to deal with the challenge
theoretically, a minor caveat being possibly infinite compile time.
An effective compromise is for you, the programmer, to give the
compiler some hints. For instance, the TI compiler has
“aliased_variable” options that enable you to tell the compiler
that you’re sure a particular pointer is safe from aliasing.
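Standard C offers a related hint of its own, the C99 `restrict` qualifier. The sketch below shows why aliasing matters and how the promise is written. The function names are invented, and how a particular TI compiler release treats `restrict` is something to confirm in its manual rather than assume from this example.

```c
#include <assert.h>

/* Without further information the compiler must assume out may point
 * at the same int as a, so *a has to be reloaded after the first store. */
void sum_twice(const int *a, const int *b, int *out)
{
    *out = *a + *b;
    *out += *a;           /* *a may have just changed if out aliases a */
}

/* restrict is the programmer's promise that, inside this function,
 * the three pointers refer to distinct objects. The compiler is then
 * free to keep *a in a register and reorder the accesses. */
void sum_twice_restrict(const int *restrict a, const int *restrict b,
                        int *restrict out)
{
    assert(a != out && b != out);   /* the promise, checked in debug builds */
    *out = *a + *b;
    *out += *a;
}
```

With x = 1 and y = 2, calling `sum_twice(&x, &y, &z)` leaves 4 in z, but `sum_twice(&x, &y, &x)` leaves 6 in x because the first store changes what `*a` reads. That difference is exactly what the compiler has to protect unless you tell it the pointers cannot overlap.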
Just keep in mind that optimizers, especially ones that move
instructions around, can make debugging more mind-numbingly
complicated than it already is. Consider the common technique of
viewing your compiled program as C source mixed with assembly
language. The problem is that a piece of the assembly language
associated with a particular line of C code may be moved to a
different location. The compiler has an interlist option that keeps the
listing sane by restricting the optimizations.
Similarly, watch out if you mix inline ASM with your C code,
especially if it messes with C variables, functions, and so on. For
example, if your ASM code calls a C function, you may find the
compiler didn’t know that function was needed and optimized it
away. The compiler offers a “FUNC_EXT_CALLED” pragma you can
use to explicitly mark functions that should be preserved. There’s
also a “CALL ASSUMPTIONS” option that tells the compiler whether
your ASM code does, or doesn’t, call C functions or modify C
variables.
Photo 5—Code Composer Studio includes a Piccolo simulator that lets you see
what’s going on under the hood.
In short, debugging compiler-optimized code is a pain, although
you’re welcome to try. If you get really deep into it, CCStudio
includes a pipeline simulator that may come in handy (see Photo 5).
THREE Ps
So how does Piccolo measure up to other 32-bit flash memory MCUs
when it comes to the “three Ps”—performance, power, and price?
Of course, performance depends on the application being performed.
Clearly, Piccolo will excel in traditional DSP number-crunching
applications, all the more true by taking advantage of the unique
control law accelerator. But as well, there’s every reason to
believe Piccolo can hold its own in general-purpose applications
too. Indeed, TI claims Piccolo delivers 25% better Dhrystone
performance than a Cortex-M3-based MCU.
Power consumption is equally system- and application-dependent. The
specs show active power consumption of about 2 mA/MHz, which is a
little higher than generic 32-bit flash memory MCUs, but not bad
considering the horsepower on tap. Besides the clock rate, a key
factor is which, if any, of the Piccolo peripherals aren’t used and
can be powered down. For example, turning off the CAN module saves
11 mA and in fact simply disabling the CPU clock output pin
(XCLKOUT) saves a whopping 15 mA. Standby current consumption in
the lowest power mode (i.e., everything off) is on the order of 100
μA, which is somewhat higher than typical MCUs, but certainly not a
showstopper. My conclusion is that the slightly higher Piccolo
power consumption would only be an issue in the most
battery-life-sensitive applications.
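To put rough numbers on that, the back-of-the-envelope sketch below combines the figures quoted above (about 2 mA/MHz active, about 100 μA standby) with a duty cycle. The 40-MHz clock is the slowest Piccolo speed grade, the 1% duty cycle is purely my assumption, and the function names are invented.

```c
#include <assert.h>

/* Active current in mA from a per-MHz figure and the clock rate. */
double active_ma(double ma_per_mhz, double clock_mhz)
{
    assert(ma_per_mhz >= 0.0 && clock_mhz >= 0.0);
    return ma_per_mhz * clock_mhz;
}

/* Average current in mA when the chip is active for a fraction `duty`
 * of the time and in standby for the rest. */
double average_ma(double active, double standby, double duty)
{
    assert(duty >= 0.0 && duty <= 1.0);
    return duty * active + (1.0 - duty) * standby;
}
```

At 40 MHz, `active_ma(2.0, 40.0)` gives 80 mA. If the chip is awake only 1% of the time, `average_ma(80.0, 0.1, 0.01)` comes to roughly 0.9 mA, at which point the standby current is already more than a tenth of the total.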
Finally, there is price. I don’t have an official price quote, but
the press release says Piccolo starts at less than $2 in volume. In
terms of historic pricing for DSPs, that’s quite a bargain
considering high-end parts (such as TI’s own C6x line) can have
triple-digit price tags. At the same time, we’ve all seen the
headlines for $1 chips from other 32-bit flash memory MCU
suppliers.
There’s no free lunch, and you get what you pay for. If you need
only a plain-vanilla MCU, that’s probably what you should use. But
if you can take advantage of even just one of its advanced
features—notably the number-crunching capability or the advanced
peripherals (e.g., HRPWM, CLA)—Piccolo may hit just the right note
in your application.
Tom Cantrell has been working on chip, board, and systems design
and marketing for several years. You may reach him by e-mail at
tom.cantrell@circuitcellar.com.
REFERENCE
[1] Texas Instruments, “TMS320C28x Optimizing C/C++ Compiler,” 2007,
http://focus.ti.com/lit/ug/spru514c/spru514c.pdf.
SOURCE
controlSTICK evaluation kit and Piccolo microcontroller
Texas Instruments, Inc. | www.ti.com