Chinese name: 现场可编程门阵列 (field-programmable gate array)
English abbreviation: FPGA
English origin: Field Programmable Gate Array
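A gate array in which the basic circuits of digital logic are arranged in advance as an array, in a semi-finished state.
FPGA is the abbreviation of Field Programmable Gate Array. It is a further development of earlier programmable devices such as PAL, GAL, and EPLD. It emerged as a semi-custom circuit within the field of application-specific integrated circuits (ASICs), addressing the shortcomings of fully custom circuits while overcoming the limited gate count of earlier programmable devices. Among ASICs, the FPGA offers the highest level of integration. Users can reconfigure its internal logic blocks and I/O blocks to implement their own logic, which is why FPGAs are also used to emulate CPUs. The user's configuration data is stored in a Flash chip and loaded into the FPGA at power-up to initialize it. The device can also be programmed in-system, so that the system can be reconfigured while it is running. This property makes it possible to build a CPU that is customized in real time for the computing task at hand, which is currently an active research area.
Like other programmable logic devices, an FPGA consists of uncommitted logic arrays that are connected together to perform a given function. As in a PAL, the interconnections between the array cells are programmable. The development of the FPGA is described below.
In 1985, Xilinx introduced the world's first FPGA. The product comprised two devices and design tools supporting placement and routing. FPGAs developed very quickly: in less than 10 years, clock frequencies rose from under 10 MHz to 100 MHz, design rules reached the sub-micron level, and device capacity grew from a few thousand gates to more than 20,000 equivalent gates. A large number of powerful, easy-to-use software tools were released in the same period, and FPGAs quickly captured a large share of the electronic design market.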
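The FPGAs introduced in the 1980s can be seen as a continuation of the first commercial microprocessor, which Intel introduced in 1971. At that time, a typical microprocessor system consisted of a microprocessor, memory, and a number of special-function small- and medium-scale integration (SSI/MSI) devices. In pursuit of better performance, smaller size, lower cost, faster recovery from errors, higher reliability, and faster and easier prototyping, integrated circuit designers realized that some kind of device would have to replace the SSI/MSI circuits of the day. The first attempt at this concept was the 82S100 FPLA (field programmable logic array), introduced by Signetics in 1975. This programmable device was in effect a PLA. It consisted of 16 inputs, an AND array of 48 product terms, 8 outputs, and an OR array over the same 48 product terms, with connections made or broken by Ni-Cr fuses. At the cost of lower speed and higher power consumption, this approach gave designers a great deal of freedom in circuit design. However, the fuses had to be set open or closed by hand, which made the device complicated to use and error-prone.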
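The AND-array/OR-array structure described above can be illustrated with a small behavioural model. The following Python sketch is not tied to any real device: the function name `evaluate_pla`, the data layout, and the 2-input half-adder example are assumptions chosen for illustration, and a real FPLA of the kind described would have 16 inputs, 48 product terms, and 8 outputs.

```python
from typing import List, Optional, Sequence

# Each product term lists, per input: 1 (true input used), 0 (complemented
# input used) or None (input not connected to this term).
ProductTerm = Sequence[Optional[int]]


def evaluate_pla(inputs: Sequence[int],
                 and_plane: Sequence[ProductTerm],
                 or_plane: Sequence[Sequence[int]]) -> List[int]:
    """Evaluate a PLA: the AND plane forms product terms, the OR plane sums them."""
    products = [
        int(all(want is None or bit == want for bit, want in zip(inputs, term)))
        for term in and_plane
    ]
    # or_plane[o] holds the indices of the product terms wired to output o.
    return [int(any(products[p] for p in terms)) for terms in or_plane]


# Hypothetical 2-input configuration: output 0 = a XOR b, output 1 = a AND b.
AND_PLANE = [(1, 0), (0, 1), (1, 1)]
OR_PLANE = [(0, 1), (2,)]


def half_adder(a: int, b: int) -> List[int]:
    """Return [sum, carry] computed by the example PLA configuration."""
    return evaluate_pla((a, b), AND_PLANE, OR_PLANE)
```

In this model, "programming" the device amounts to choosing the contents of `AND_PLANE` and `OR_PLANE`, which plays the role of the fuse pattern.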
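Because the speed and architecture of the programmable devices then available did not meet market needs well, Xilinx introduced the 2000 series of FPGAs in 1985. This series was the world's first SRAM-based programmable FPGA and comprised two devices. The first consisted of an 8x8 array (64 in total) of configurable logic blocks (CLB, Configurable Logic Block), with 58 input/output blocks (IOB, I/O Block) around the periphery of the chip. The second consisted of a 10x10 array of CLBs and provided a total of 74 IOBs. After Xilinx introduced the first FPGA, other companies released FPGA products of their own, such as the distinctive antifuse FPGAs from Actel. Competition in the FPGA market intensified, and IC manufacturers realized that they had to offer more powerful and easier-to-use products in order to hold a share of the market. Against this background, Xilinx introduced its second product line, the 3000 series, in 1987, only two years after the first FPGA. At about the same time, AT&T obtained a license to use the design of this FPGA and began to offer its own chips and development system, the AT&T 3000 series FPGA.
Once the second generation of FPGAs had appeared, FPGA applications multiplied and circuit complexity rose with them. Xilinx then began developing its third generation of FPGA products, and AT&T began developing its own next-generation FPGA. Xilinx's third-generation FPGA appeared in 1991, while AT&T's next-generation product was not completed until 1992. Recognizing the large potential of the FPGA market, many IC and software vendors also moved into the FPGA field, including well-known companies such as Actel, AMD, Altera, Intel, Mentor Graphics, Texas Instruments, and Toshiba.
Xilinx's success was due not only to its hardware but also, to a significant degree, to its software. Faster, smarter, and easier-to-use tools for schematic entry, design implementation, and verification have consistently been one of the keys to Xilinx's success in the FPGA field.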
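To give readers who are less familiar with FPGAs an overall impression, the Xilinx 4000 series is used here as a brief example of design capacity. The XC4025 contains about 1024 CLBs, arranged on the chip as a 32x32 matrix, which corresponds to about 25,000 equivalent gates. The device contains 422 Kbit of RAM, used mainly to hold the configuration. A single CLB can run at up to 250 MHz, but once the delays introduced by the interconnect network and more complex logic such as adders are taken into account, clock frequencies of 20-50 MHz can still be achieved. Intuitively, logic such as an adder is built from a large number of CLBs; for example, a 32-bit adder uses 62 CLBs.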
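The figures quoted for the XC4025 can be put in proportion with a little arithmetic. The short Python sketch below uses only the numbers given above; the derived quantities (equivalent gates per CLB, and how many 32-bit adders would fit if nothing else were placed on the chip) are rough illustrations, not vendor specifications.

```python
# Figures quoted in the text for the XC4025.
rows, cols = 32, 32
equivalent_gates = 25_000
adder_clbs = 62  # CLBs used by one 32-bit adder

# Derived quantities (illustrative only).
clbs = rows * cols                       # total CLBs in the 32x32 matrix
gates_per_clb = equivalent_gates / clbs  # average equivalent gates per CLB
adder_share = adder_clbs / clbs          # fraction of the chip used by one adder
max_adders = clbs // adder_clbs          # adders that fit, ignoring routing limits

print(f"{clbs} CLBs, about {gates_per_clb:.1f} equivalent gates per CLB")
print(f"one 32-bit adder uses {adder_share:.1%} of the CLBs; "
      f"at most {max_adders} such adders fit")
```

On these numbers, each CLB stands for roughly 24 equivalent gates, and a single 32-bit adder occupies about 6% of the device.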
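The FPGA adopts a new concept, the logic cell array (LCA, Logic Cell Array), which internally comprises three parts: configurable logic blocks (CLB, Configurable Logic Block), input/output blocks (IOB, Input Output Block), and the interconnect. The main characteristics of FPGAs are outlined below.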
A hierarchy of programmable interconnects allows the logic blocks of an FPGA to be interconnected as needed by the system designer, somewhat like a one-chip programmable breadboard. These logic blocks and interconnects can be programmed after the manufacturing process by the customer/designer (hence the term "field programmable") so that the FPGA can perform whatever logical function is needed.
FPGAs are generally slower than their application-specific integrated circuit (ASIC) counterparts, can't handle as complex a design, and draw more power. However, they have several advantages such as a shorter time to market, ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can sell cheaper, less flexible versions of their FPGAs which cannot be modified after the design is committed. The development of these designs is made on regular FPGAs and then migrated into a fixed version that more resembles an ASIC. Complex programmable logic devices, or CPLDs, are another alternative.
Ross Freeman, Xilinx co-founder, invented the field programmable gate array in 1984.
The historical roots of FPGAs are in complex programmable logic devices (CPLDs) of the early to mid 1980s. CPLDs and FPGAs include a relatively large number of programmable logic elements. CPLD logic gate densities range from the equivalent of several thousand to tens of thousands of logic gates, while FPGAs typically range from tens of thousands to several million.
The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.
Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories. A related, important difference is that many modern FPGAs support full or partial in-system reconfiguration, allowing their designs to be changed "on the fly" either for system upgrades or for dynamic reconfiguration as a normal part of system operation. Some FPGAs have the capability of partial re-configuration that lets one portion of the device be re-programmed while other portions continue running.
A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture. An alternate approach is to make use of "soft" processor cores that are implemented within the FPGA logic. These cores include the Xilinx MicroBlaze and PicoBlaze, the Altera Nios and Nios II processors, and the open source LatticeMico32 and LatticeMico8, as well as third-party (either commercial or free) processor cores.
As previously mentioned, many modern FPGAs have the ability to be reprogrammed at "run time," and this is leading to the idea of reconfigurable computing or reconfigurable systems — CPUs that reconfigure themselves to suit the task at hand. Current FPGA tools, however, do not fully support this methodology.
It should be noted here that new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip. Other devices (such as Mathstar's Field Programmable Object Array, or FPOA) provide arrays of higher-level programmable objects that lie somewhere between an FPGA's logic block and a more complex processor.
Applications of FPGAs include DSP, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation and a growing range of other areas. FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions to the state where some are now marketed as full systems on chips (SOC). FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture.
The typical basic architecture consists of an array of configurable logic blocks (CLBs) and routing channels. Multiple I/O pads may fit into the height of one row or the width of one column. Generally, all the routing channels have the same width (number of wires).
An application circuit must be mapped into an FPGA with adequate resources.
The typical FPGA logic block consists of a 4-input lookup table (LUT), and a flip-flop, as shown below.
There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
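The logic block just described can be modelled behaviourally in a few lines. The Python sketch below is an illustration under simplifying assumptions: the class name `LogicBlock`, the encoding of the 16 LUT configuration bits as one integer, and the input bit order are all hypothetical and do not correspond to any vendor's bitstream format.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a D flip-flop."""

    def __init__(self, truth_table: int, registered: bool) -> None:
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table  # bit i is the LUT output for input index i
        self.registered = registered    # selects registered or unregistered output
        self.q = 0                      # flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the combinational output for the four inputs."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: flip-flop state or raw LUT output."""
        return self.q if self.registered else self.lut(a, b, c, d)


# 0x8000 sets only bit 15, so the LUT implements a 4-input AND.
and4 = LogicBlock(0x8000, registered=False)
# 0x6996 is the 4-input parity (XOR) function, here with a registered output.
parity_reg = LogicBlock(0x6996, registered=True)
```

Changing `truth_table` changes the function the block computes without changing the block itself, which is the sense in which the logic is configurable.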
For this example architecture, the locations of the FPGA logic block pins are shown below.
Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.
Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.
Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.
Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.
Whenever a vertical and a horizontal channel intersect there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2 and so on. The figure below illustrates the connections in a switch box.
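The planar (domain-based) connection rule described above can be written down compactly. In the Python sketch below, the side names, the 1-based track numbering, and both function names are assumptions made for illustration. The switch count assumes that each programmable switch is shared by the two wire ends it joins.

```python
from typing import List, Tuple

SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side: str, track: int,
                          channel_width: int) -> List[Tuple[str, int]]:
    """Wires reachable from a wire entering the switch box on `side`.

    In the planar topology a wire on track t can only reach track t
    on each of the three other sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side!r}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number outside the channel")
    return [(other, track) for other in SIDES if other != side]


def planar_switch_count(channel_width: int) -> int:
    """Programmable switches in one switch box.

    Each track has four wire ends and every pair of ends is joined by
    one shared switch, giving 4 * 3 / 2 = 6 switches per track.
    """
    return 6 * channel_width
```

For example, `planar_switch_targets("west", 1, 4)` lists track 1 on the north, east, and south sides, matching the three switches per incoming wire mentioned in the text.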
Modern FPGA families expand upon the above capabilities to include higher level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed IO logic and embedded memories.
FPGAs are also widely used for systems validation including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.
To define the behavior of the FPGA, the user provides a design written in a hardware description language (HDL) or drawn as a schematic. Common HDLs are VHDL and Verilog. Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.
In an attempt to reduce the complexity of designing in HDLs, which have been compared to assembly languages, there are moves to raise the abstraction level of the design. Companies such as Cadence, Synopsys and Celoxica are promoting SystemC as a way to combine high level languages with concurrency models to allow faster design cycles for FPGAs than is possible using traditional HDLs. Approaches based on standard C or C++ (with libraries or other extensions allowing parallel programming) are found in the Catapult C tools from Mentor Graphics, and in the Impulse C tools from Impulse Accelerated Technologies. Annapolis Micro Systems, Inc.'s CoreFire Design Suite provides a graphical dataflow approach to high-level design entry. Languages such as SystemVerilog, SystemVHDL, and Handel-C (from Celoxica) seek to accomplish the same goal, but are aimed at making existing hardware engineers more productive rather than at making FPGAs more accessible to existing software engineers.
To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores.org (typically "free", and released under the GPL, BSD or similar license), and other sources.
In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to stimulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
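The idea of re-running the same test bench against the RTL description and the post-synthesis gate-level description can be illustrated outside any HDL simulator. The Python sketch below is only an analogy: the one-bit full adder, the function names, and the exhaustive stimulus are assumptions for illustration, and real flows use HDL test benches with timing back-annotation that this model does not attempt.

```python
from itertools import product
from typing import Callable, List, Tuple

Bits = Tuple[int, ...]


def rtl_full_adder(a: int, b: int, cin: int) -> Bits:
    """Behavioural ("RTL-like") description: just add the three bits."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry out)


def gate_full_adder(a: int, b: int, cin: int) -> Bits:
    """Gate-level description built from XOR, AND and OR primitives."""
    a_xor_b = a ^ b
    return a_xor_b ^ cin, (a & b) | (a_xor_b & cin)


def run_testbench(reference: Callable[..., Bits],
                  dut: Callable[..., Bits],
                  n_inputs: int) -> List[Bits]:
    """Apply every input vector to both models and collect mismatches."""
    return [
        vector
        for vector in product((0, 1), repeat=n_inputs)
        if reference(*vector) != dut(*vector)
    ]


# An empty list means the two descriptions agree on every stimulus.
mismatches = run_testbench(rtl_full_adder, gate_full_adder, 3)
```

Exhaustive stimulus is only practical for very small blocks; for larger designs the test bench applies a chosen set of vectors instead.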
Some engineering applications have used a single FPGA device to replace the function of a simple embedded microcontroller. More recently, a complete 32-bit CPU (Central Processing Unit) core can be implemented through the programmable logic of a high-capacity FPGA. Such CPU cores are known as "soft CPU cores"; examples are MicroBlaze™, Nios II™ and LatticeMico32™ by Xilinx, Altera and Lattice respectively.
Beyond this, some FPGA devices contain dedicated hardware CPU core(s). Selected Virtex parts from Xilinx contain one or more embedded IBM PowerPC 405 CPU cores, in addition to the FPGA's own programmable logic. For a given CPU architecture, a hard (embedded) CPU core will outperform a soft-core CPU (i.e., a programmable-logic implementation of the CPU). The embedded CPU contains exactly the logic structures needed for the CPU's function, and only those, and its logic is optimized for that specific task, whereas a soft-core CPU must live within the FPGA's general-purpose logic fabric. Embedded CPUs can also be easier to integrate into an FPGA-based application, because the fixed nature of the embedded CPU gives it predictable timing characteristics, whereas an equivalent programmable-logic CPU consumes much more of the FPGA's scarce programmable-logic resources and so complicates the placement and routing of the rest of the design. Use of embedded CPUs can limit the choice of available devices, vendors, and design tools, and as such requires careful justification over the soft-core CPU approach.
As of late 2005, the FPGA market has mostly settled into a state where there are two major "general-purpose" FPGA manufacturers and a number of other players who differentiate themselves by offering unique capabilities.