
SystemDesign's Blog

FPGAs' DSP Performance, Revealed  2009-05-25 17:44
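FPGAs are drawing growing attention as processing engines for high-performance digital signal processing applications such as wireless base stations. In these applications, FPGAs often work side by side with DSP processors. Having more choices is welcome, but it means system designers need a clear picture of the signal processing performance of FPGAs, both relative to each other and relative to high-end DSP processors. Unfortunately, the most commonly used performance figures are unreliable for this purpose.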

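For example, because digital signal processing applications rely heavily on multiply-accumulate (MAC) operations, DSP processor vendors and FPGA vendors often quote peak MACs per second as the simplest yardstick of digital signal processing performance. But MAC throughput alone is a poor predictor of that performance, for FPGAs and DSPs alike. Here are a few reasons why.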

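MAC performance figures for FPGAs usually assume that the hard-wired DSP elements run at their maximum clock rate. In practice, typical FPGA designs run at lower speeds. On the other hand, hard-wired elements are not the only way to implement MACs on an FPGA; additional MAC throughput can be obtained from programmable logic resources and distributed arithmetic. Furthermore, not all signal processing algorithms are MAC-intensive. Viterbi decoding, for example, is a key DSP algorithm in telecommunications applications that uses no MACs at all.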

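Another way to assess signal processing performance is to use common DSP functions (such as FIR filters). But this approach has drawbacks too. One problem is that each vendor typically implements these functions differently, perhaps with different data widths, a different algorithm, or different implementation parameters (such as latency). As a result, figures from different vendors are generally not comparable. In addition, small kernel functions are usually not effective FPGA benchmarks, because the way a function is implemented inside a full FPGA application is often quite different from the way it is implemented in isolation. (For processors, in contrast, these small benchmarks are usually quite good at predicting overall DSP application performance.) Finally, benchmarks implemented by processor or FPGA vendors often lack independent verification, which makes it hard for engineers to compare devices with confidence.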

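Several years ago BDTI recognized the increasingly urgent need for independent, accurate, apples-to-apples performance comparisons among FPGAs and processors targeting digital signal processing applications. (See sidebar: Who is BDTI?) To address this need, BDTI developed a new application-oriented benchmark, the BDTI Communications Benchmark (OFDM)™, which is based on an orthogonal frequency division multiplexing (OFDM) receiver.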

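Recently BDTI used the BDTI Communications Benchmark (OFDM) to evaluate several new high-performance FPGAs and DSP processors. The full set of benchmark results and analysis is published in BDTI's report, "FPGAs for DSP: Second Edition." Figure 1 shows sample normalized, low-cost results for a Xilinx SX25 and a typical high-performance DSP processor.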
[Figure 1: sample normalized, low-cost results for a Xilinx SX25 and a typical high-performance DSP processor]

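As the figure shows, BDTI's benchmark results provide a dramatic demonstration of the potential cost advantages of using FPGAs in high-performance DSP applications: on this benchmark, the SX25 is more than an order of magnitude more cost-effective than a typical high-performance DSP processor.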

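Designers also need to understand how the choice of processing engine will affect their development flow, implementation effort, and system design. For this reason, BDTI's report explores the qualitative factors that influence the decision of whether to use an FPGA, a DSP, or both, and offers guidance on making an informed choice. The report also highlights key open questions that will affect the long-term success of FPGAs in high-end DSP applications, such as FPGA energy efficiency and the effectiveness of new high-level synthesis tools for FPGAs.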

Who is BDTI?

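BDTI (www.BDTI.com) is the most respected source for signal processing benchmarks. BDTI has been benchmarking the signal processing performance of processors for nearly 15 years, and in recent years has expanded its benchmarking to include FPGAs, multi-core chips, and other technologies.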

Original text:

FPGAs’ DSP Performance, Revealed

FPGAs are increasingly being considered for use as processing engines in high-performance digital signal processing applications, such as wireless base stations. In these applications, FPGAs frequently work side-by-side with DSP processors.

Having more choices is good, but it means that system designers need a clear picture of the signal processing performance of FPGAs–both relative to each other, and relative to high-end DSP processors. Unfortunately, the most commonly used performance figures are unreliable and confusing.

For example, because digital signal processing applications often rely heavily on multiply-accumulate (MAC) operations, DSP processor vendors and FPGA vendors sometimes use peak MACs per second as a simple metric for comparing digital signal processing performance. But MAC throughput is a lousy predictor of digital signal processing performance, for FPGAs and DSPs alike. Here are a few reasons why.

The MAC performance numbers for FPGAs often assume the hard-wired digital signal processing elements are operating at their highest possible clock rate. In practice, typical FPGA designs will operate at lower speeds.
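
To make the gap concrete, here is a minimal sketch of how a peak MACs-per-second figure is typically derived and how it shrinks at a lower clock. The unit count and both clock rates are made-up illustrative values, not figures for any real device or from BDTI's report.

```python
def mac_throughput(num_mac_units: int, clock_hz: float) -> float:
    """MACs per second, assuming every unit completes one MAC per clock cycle."""
    return num_mac_units * clock_hz


# Hypothetical device: 128 hard-wired MAC units, 500 MHz datasheet maximum.
peak = mac_throughput(128, 500e6)

# The same hypothetical device at an assumed 300 MHz achieved design clock.
achieved = mac_throughput(128, 300e6)

# Fraction of the advertised peak that the design can actually reach.
ratio = achieved / peak
print(f"peak={peak:.3g} MAC/s, achieved={achieved:.3g} MAC/s, ratio={ratio:.2f}")
```

Even this simple model ignores data movement and utilization, which is part of why a peak figure says little about application performance.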

On the other hand, using hard-wired elements is not the only way to implement MACs on FPGAs; additional MAC throughput can be achieved using programmable logic resources and distributed arithmetic. Furthermore, not all signal processing algorithms are MAC-intensive.

Viterbi decoding, for example, is a key DSP algorithm used in telecommunications applications that makes no use of MACs at all.
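
As an illustration of why Viterbi decoding is not MAC-bound, the following sketch decodes a small rate-1/2, constraint-length-3 convolutional code with hard decisions. The code parameters (generators 7 and 5 octal), the function names, and the test message are assumptions chosen for brevity; they are not taken from the article or from BDTI's benchmark. The inner loop consists only of XORs, additions, and comparisons (the add-compare-select step), with no multiplications.

```python
G = (0b111, 0b101)  # assumed generator polynomials (K=3, rate 1/2)


def parity(x):
    return bin(x).count("1") & 1


def encode(bits):
    """Convolutionally encode a bit list into (bit, bit) symbols."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state  # newest bit on top of the 2-bit state
        out.append((parity(reg & G[0]), parity(reg & G[1])))
        state = reg >> 1
    return out


def viterbi_decode(symbols):
    """Hard-decision Viterbi decoder built from add-compare-select steps."""
    INF = float("inf")
    metrics = [0, INF, INF, INF]  # the encoder starts in state 0
    paths = [[], [], [], []]
    for r in symbols:
        new_metrics = [INF] * 4
        new_paths = [None] * 4
        for state in range(4):
            if metrics[state] == INF:
                continue  # state not reachable yet
            for b in (0, 1):
                reg = (b << 2) | state
                expected = (parity(reg & G[0]), parity(reg & G[1]))
                # Branch metric: Hamming distance to the received symbol.
                branch = (expected[0] ^ r[0]) + (expected[1] ^ r[1])
                nxt = reg >> 1
                cand = metrics[state] + branch  # add
                if cand < new_metrics[nxt]:  # compare
                    new_metrics[nxt] = cand  # select
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(4), key=lambda s: metrics[s])
    return paths[best]


# Usage: encode a message plus two zero tail bits, flip one channel bit, decode.
message = [1, 0, 1, 1, 0, 0, 1, 0]
tx = encode(message + [0, 0])
rx = list(tx)
rx[2] = (rx[2][0] ^ 1, rx[2][1])  # inject a single bit error
decoded = viterbi_decode(rx)
print(decoded == message + [0, 0])
```

A device with many hard-wired multipliers gains nothing on this kernel, which is one reason a MAC-only metric can mislead.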

Another approach for assessing signal processing performance is to use common DSP functions (like FIR filters). But this approach can have drawbacks, too. One problem is that each vendor typically uses a different implementation of these functions–perhaps using different data widths, a different algorithm, or different implementation parameters (such as latency). This means that results from different vendors are generally not comparable.
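
The effect of one such implementation choice, data width, can be sketched as follows. This is an illustrative fixed-point FIR filter, not any vendor's implementation; the coefficients, the input, and the two coefficient widths (8 and 15 fractional bits) are assumed values. The same nominal filter produces different outputs at the two widths, and on real hardware the two versions would also differ in resource use and speed.

```python
def quantize(coeffs, frac_bits):
    """Round real-valued coefficients to integers with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return [round(c * scale) for c in coeffs]


def fir_fixed(samples, coeffs, frac_bits):
    """Direct-form FIR filter using integer (fixed-point) coefficients."""
    q = quantize(coeffs, frac_bits)
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(q):
            if n - k >= 0:
                acc += c * samples[n - k]  # one MAC per tap
        out.append(acc / (1 << frac_bits))  # rescale for comparison
    return out


# Usage: the same impulse through the "same" filter at two coefficient widths.
coeffs = [0.1, 0.2, 0.4, 0.2, 0.1]
impulse = [100, 0, 0, 0, 0]
narrow = fir_fixed(impulse, coeffs, 8)
wide = fir_fixed(impulse, coeffs, 15)
print(narrow)
print(wide)
```

Two vendors reporting results for "a FIR filter" at different widths are therefore measuring different workloads.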

In addition, small kernel functions typically aren’t effective for FPGA benchmarking, because the way you’d implement a function within a full FPGA application is often quite different from the way you’d implement the function alone. (For processors, in contrast, these little benchmarks are usually pretty good at predicting overall DSP application performance.) In addition, benchmarks implemented by processor or FPGA vendors often lack independent verification, making it difficult for engineers to make confident comparisons between devices.

Several years ago BDTI recognized the increasingly urgent need for independent, accurate, apples-to-apples performance comparisons among FPGAs and processors targeting digital signal processing applications. (See sidebar: Who is BDTI?) To address this need, BDTI developed a new application-oriented benchmark, the BDTI Communications Benchmark (OFDM)™, that is based on an orthogonal frequency division multiplexing (OFDM) receiver.

Recently BDTI used the BDTI Communications Benchmark (OFDM) to evaluate several new high-performance FPGAs and DSP processors. The full set of benchmark results and analysis is published in BDTI’s report, “FPGAs for DSP: Second Edition.” Figure 1 shows sample normalized, low-cost results for a Xilinx SX25 and a typical high-performance DSP processor.

As shown in this figure, BDTI’s benchmark results provide a dramatic demonstration of the potential cost advantages of using FPGAs for high-performance DSP applications–the SX25 is more than an order of magnitude more cost-effective than a typical high-performance DSP processor on this benchmark.

Designers also need to understand how the choice of processing engine will affect their development flow, implementation effort, and system design. For this reason, BDTI’s report explores the qualitative factors that influence the decision of whether to use an FPGA, a DSP, or both, and provides guidance on how to make an informed choice. The report highlights key open questions that will affect the long-term success of FPGAs in high-end DSP applications, such as FPGA energy efficiency and the effectiveness of new high-level synthesis tools for FPGAs.

Who is BDTI?

BDTI (www.BDTI.com) is the most respected source for signal processing benchmarks. BDTI has been benchmarking the signal processing performance of processors for nearly 15 years, and in recent years has expanded its benchmarking activities to include FPGAs, multi-core chips, and other technologies.

Author: Jeff Bier, BDTI   Translated and compiled by: 与非网, 韦职英

Original source: http://eecatalog.com/dsp/2007/09/20/fpgasdsp-performancerevealed/