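FPGAs are attracting growing attention as processing engines in high-performance digital signal processing applications such as wireless base stations. In these applications, FPGAs frequently work side by side with DSP processors. Having more choices is good, but it means that system designers need a clear picture of the signal processing performance of FPGAs, both relative to each other and relative to high-end DSP processors. Unfortunately, the most commonly used performance figures are unreliable and confusing.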
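For example, because digital signal processing applications often rely heavily on multiply-accumulate (MAC) operations, DSP processor vendors and FPGA vendors sometimes use peak MACs per second as a simple metric for comparing digital signal processing performance. But MAC throughput alone is a poor predictor of digital signal processing performance, for FPGAs and DSPs alike. Here are a few reasons why.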
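The MAC performance figures quoted for FPGAs often assume that the hard-wired digital signal processing elements run at their highest possible clock rate. In practice, typical FPGA designs run at lower speeds. On the other hand, hard-wired elements are not the only way to implement MACs on an FPGA; additional MAC throughput can be obtained from programmable logic resources and distributed arithmetic. Furthermore, not all signal processing algorithms are MAC-intensive. Viterbi decoding, for example, is a key DSP algorithm in telecommunications applications that makes no use of MACs at all.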
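Another approach to assessing signal processing performance is to use common DSP functions (such as FIR filters). But this approach has drawbacks too. One problem is that each vendor typically implements these functions differently, perhaps with different data widths, a different algorithm, or different implementation parameters (such as latency). This means that results from different vendors are generally not comparable. In addition, small kernel functions are typically not effective FPGA benchmarks, because the way a function is implemented within a full FPGA application is often quite different from the way it is implemented in isolation. (For processors, in contrast, these small benchmarks usually predict overall DSP application performance well.) Finally, benchmarks implemented by processor or FPGA vendors often lack independent verification, which makes it hard for engineers to compare devices with confidence.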
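Several years ago BDTI recognized the increasingly urgent need for independent, accurate, apples-to-apples performance comparisons among FPGAs and processors targeting digital signal processing applications. (See sidebar: Who is BDTI?) To address this need, BDTI developed a new application-oriented benchmark, the BDTI Communications Benchmark (OFDM)™, which is based on an orthogonal frequency division multiplexing (OFDM) receiver.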
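Recently BDTI used the BDTI Communications Benchmark (OFDM) to evaluate several new high-performance FPGAs and DSP processors. The full set of benchmark results and analysis is published in BDTI’s report, “FPGAs for DSP: Second Edition.” Figure 1 shows sample normalized, low-cost results for a Xilinx SX25 and a typical high-performance DSP processor.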
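As the figure shows, BDTI’s benchmark results dramatically demonstrate the potential cost advantages of using FPGAs in high-performance DSP applications: on this benchmark, the SX25 is more than an order of magnitude more cost-effective than a typical high-performance DSP processor.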
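Designers also need to understand how the choice of processing engine will affect their development flow, implementation effort, and system design. For this reason, BDTI’s report explores the qualitative factors that influence the decision of whether to use an FPGA, a DSP, or both, and provides guidance on how to make an informed choice. The report highlights key open questions that will affect the long-term success of FPGAs in high-end DSP applications, such as FPGA energy efficiency and the effectiveness of new high-level synthesis tools for FPGAs.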
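Who is BDTI?
BDTI (www.BDTI.com) is the most respected source for signal processing benchmarks. BDTI has been benchmarking the signal processing performance of processors for nearly 15 years, and in recent years has expanded its benchmarking activities to include FPGAs, multi-core chips, and other technologies.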
Original text:
FPGAs’ DSP Performance, Revealed
FPGAs are increasingly being considered for use as processing
engines in high-performance digital signal processing applications,
such as wireless base stations. In these applications, FPGAs
frequently work side-by-side with DSP processors.
Having more choices is good, but it means that system designers
need a clear picture of the signal processing performance of
FPGAs–both relative to each other, and relative to high-end DSP
processors. Unfortunately, the most commonly used performance
figures are unreliable and confusing.
For example, because digital signal processing applications often
rely heavily on multiply-accumulate (MAC) operations, DSP processor
vendors and FPGA vendors sometimes use peak MACs per second as a
simple metric for comparing digital signal processing performance.
But MAC throughput is a lousy predictor of digital signal processing
performance, for FPGAs and DSPs alike. Here are a few reasons why.
The MAC performance numbers for FPGAs often assume the hard-wired
digital signal processing elements are operating at their highest
possible clock rate. In practice, typical FPGA designs will operate
at lower speeds.
On the other hand, using hard-wired elements is not the only way to
implement MACs on FPGAs; additional MAC throughput can be achieved
using programmable logic resources and distributed arithmetic.
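The gap between a quoted peak figure and what a design actually achieves can be illustrated with a minimal sketch. It assumes the usual back-of-the-envelope model (one MAC per hard-wired unit per clock cycle); the unit count, clock rates, and utilization below are hypothetical placeholders, not figures for any real device.

```python
def peak_macs_per_second(num_mac_units: int, max_clock_hz: float) -> float:
    """Datasheet-style peak: every MAC unit busy on every cycle at max clock."""
    return num_mac_units * max_clock_hz


def achieved_macs_per_second(num_mac_units: int,
                             design_clock_hz: float,
                             utilization: float) -> float:
    """Rough estimate for a real design.

    design_clock_hz: clock rate the complete design actually closes timing at.
    utilization:     fraction (0..1) of MAC units doing useful work per cycle.
    """
    return num_mac_units * design_clock_hz * utilization


# Hypothetical device: 100 hard-wired MAC units rated at 500 MHz,
# used in a design that runs at 250 MHz and keeps 80% of the units busy.
peak = peak_macs_per_second(100, 500e6)
achieved = achieved_macs_per_second(100, 250e6, 0.8)
fraction_of_peak = achieved / peak
```

Under these assumed numbers the design delivers well under half of the quoted peak. The model ignores MACs built from programmable logic, which would push the achievable figure in the other direction.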
Furthermore, not all signal processing algorithms are
MAC-intensive.
Viterbi decoding, for example, is a key DSP algorithm used in
telecommunications applications that makes no use of MACs at all.
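To see why Viterbi decoding needs no MACs, consider a minimal hard-decision decoder sketch. The code parameters (rate 1/2, constraint length 3, generators 7 and 5 octal) are an assumption chosen for brevity and are not taken from the article; the point is that the inner loop consists only of XOR, add, compare, and select operations.

```python
G = (0b111, 0b101)   # assumed generator polynomials (7, 5 octal)
NUM_STATES = 4       # 2 bits of encoder memory


def parity(x: int) -> int:
    return bin(x).count("1") & 1


def encode(bits):
    """Convolutional encoder: one (c0, c1) output pair per input bit."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state            # newest bit in the top position
        out.append((parity(reg & G[0]), parity(reg & G[1])))
        state = reg >> 1
    return out


def viterbi_decode(symbols):
    """Hard-decision Viterbi decoder built from add-compare-select steps."""
    inf = float("inf")
    metrics = [0] + [inf] * (NUM_STATES - 1)   # encoder starts in state 0
    history = []
    for r0, r1 in symbols:
        new_metrics = [inf] * NUM_STATES
        decisions = [None] * NUM_STATES
        for state in range(NUM_STATES):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << 2) | state
                nxt = reg >> 1
                # Branch metric: Hamming distance (XOR and add, no multiply).
                bm = (parity(reg & G[0]) ^ r0) + (parity(reg & G[1]) ^ r1)
                cand = metrics[state] + bm         # add
                if cand < new_metrics[nxt]:        # compare
                    new_metrics[nxt] = cand        # select
                    decisions[nxt] = (state, b)
        metrics = new_metrics
        history.append(decisions)
    # Trace back from the best final state.
    state = min(range(NUM_STATES), key=lambda s: metrics[s])
    bits = []
    for decisions in reversed(history):
        state, b = decisions[state]
        bits.append(b)
    return bits[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]   # last two zeros flush the encoder
received = encode(message)
received[1] = (received[1][0] ^ 1, received[1][1])   # inject one bit error
decoded = viterbi_decode(received)
```

A peak-MAC figure says nothing about how quickly a device executes a loop like this, which is dominated by comparisons and data movement.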
Another approach for assessing signal processing performance is to
use common DSP functions (like FIR filters). But this approach can
have drawbacks, too. One problem is that each vendor typically uses
a different implementation of these functions–perhaps using
different data widths, a different algorithm, or different
implementation parameters (such as latency). This means that
results from different vendors are generally not comparable.
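The effect of one such implementation choice, data width, can be sketched as follows. This is an illustrative direct-form FIR filter with hypothetical coefficients, not any vendor's implementation; it quantizes the same coefficients to 8 and 16 bits and computes the impulse response of each version.

```python
def quantize(coeffs, bits):
    """Round coefficients in [-1, 1) to signed fixed point with `bits` bits."""
    scale = 1 << (bits - 1)
    return [max(-scale, min(scale - 1, round(c * scale))) for c in coeffs]


def fir(samples, coeffs):
    """Direct-form FIR: one multiply-accumulate per tap per output sample."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]   # the MAC operation
        out.append(acc)
    return out


# Hypothetical low-pass coefficients, chosen only for illustration.
COEFFS = [0.1, 0.2, 0.4, 0.2, 0.1]
coeffs_8 = quantize(COEFFS, 8)
coeffs_16 = quantize(COEFFS, 16)

impulse = [1, 0, 0, 0, 0]
# Rescale the integer outputs back to the nominal coefficient range.
resp_8 = [y / (1 << 7) for y in fir(impulse, coeffs_8)]
resp_16 = [y / (1 << 15) for y in fir(impulse, coeffs_16)]
```

Both versions perform the same number of MACs per output sample, yet they produce different numerical results and would need multipliers of different sizes in hardware, so a "MACs per second" or "FIR taps per second" figure for one says little about the other.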
In addition, small kernel functions typically aren’t effective for
FPGA benchmarking, because the way you’d implement a function
within a full FPGA application is often quite different from the
way you’d implement the function alone. (For processors, in
contrast, these little benchmarks are usually pretty good at
predicting overall DSP application performance.) In addition,
benchmarks implemented by processor or FPGA vendors often lack
independent verification, making it difficult for engineers to make
confident comparisons between devices.
Several years ago BDTI recognized the increasingly urgent need for
independent, accurate, apples-to-apples performance comparisons
among FPGAs and processors targeting digital signal processing
applications. (See sidebar: Who is BDTI?) To address this need,
BDTI developed a new application-oriented benchmark, the BDTI
Communications Benchmark (OFDM)™, which is based on an orthogonal
frequency division multiplexing (OFDM) receiver.
Recently BDTI used the BDTI Communications Benchmark (OFDM) to
evaluate several new high-performance FPGAs and DSP processors. The
full set of benchmark results and analysis is published in BDTI’s
report, “FPGAs for DSP: Second Edition.” Figure 1 shows sample
normalized, low-cost results for a Xilinx SX25 and a typical
high-performance DSP processor.
As shown in this figure, BDTI’s benchmark results provide a
dramatic demonstration of the potential cost advantages of using
FPGAs for high-performance DSP applications–the SX25 is more than
an order of magnitude more cost-effective than a typical
high-performance DSP processor on this benchmark.
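One way to read a cost-effectiveness comparison of this kind is as cost per unit of benchmark work. The sketch below assumes the metric is device price divided by the number of benchmark channels the device can sustain; that definition and every number in it are made-up placeholders for illustration, not BDTI's results or real device prices.

```python
def cost_per_channel(device_price: float, channels_supported: float) -> float:
    """Price paid for each benchmark channel the device can sustain."""
    return device_price / channels_supported


def cost_effectiveness_ratio(price_a, channels_a, price_b, channels_b):
    """How many times cheaper per channel device A is than device B."""
    return cost_per_channel(price_b, channels_b) / cost_per_channel(price_a, channels_a)


# Placeholder values only: device A costs more but sustains far more channels.
ratio = cost_effectiveness_ratio(price_a=40.0, channels_a=20,
                                 price_b=30.0, channels_b=1)
more_than_order_of_magnitude = ratio > 10
```

With these placeholder inputs, the more expensive device is still the cheaper one per channel, which is the kind of result a normalized cost figure is meant to expose.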
Designers also need to understand how the choice of processing
engine will affect their development flow, implementation effort,
and system design. For this reason, BDTI’s report explores the
qualitative factors that influence the decision of whether to use
an FPGA, a DSP, or both, and provides guidance on how to make an
informed choice. The report highlights key open questions that will
affect the long-term success of FPGAs in high-end DSP applications,
such as FPGA energy efficiency and the effectiveness of new
high-level synthesis tools for FPGAs.
Who is BDTI?
BDTI (www.BDTI.com) is the most respected source for signal
processing benchmarks. BDTI has been benchmarking the signal
processing performance of processors for nearly 15 years, and in
recent years has expanded its benchmarking activities to
include FPGAs, multi-core chips, and other technologies.
Author: Jeff Bier, BDTI. Translated and edited by: 韦职英, 与非网
Original source: http://eecatalog.com/dsp/2007/09/20/fpgasdsp-performancerevealed/