Application Note: How to Get the Timing Right in Critical FPGA Applications

As the capacity of FPGA devices increases, the quantity and complexity of application tasks deployed on a single PMC FPGA module increase as well. As a result, accurately managing all the unique clocks associated with these tasks becomes a significant challenge for the application developer. In this article, we discuss the tools and constructs available to help an FPGA application designer get the timing right.

All logic executing within the fabric of an FPGA must be based upon one or more timeframe references, more commonly called “clocks.” On the FPGA module, clocks are used to synchronize read/write operations, synchronize data transmission and capture, control the timing of data processing, and prepare data for storage. For example, it may take some multiple of 8 clock cycles to process and prepare a single 8-bit byte of data for storage in a memory device. To meet these needs, Acromag’s PMC FPGA modules include on-board crystal clocks for essential tasks such as driving the PCI-X bus and on-board memory devices. Acromag modules with Xilinx® Virtex-4® FPGAs have 200MHz and 66MHz crystals; modules with Virtex-5® FPGAs have 200MHz and 133MHz crystals. External clock lines, however, are necessary when external signals require highly accurate synchronization for transmission or for capture and processing.
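The relationship between clock frequency and processing time in the example above can be sketched with a little arithmetic. The following Python snippet is illustrative only: the function names are hypothetical, and the figure of 8 cycles per byte is simply the smallest multiple mentioned in the text, not a measured value for any particular design.

```python
from fractions import Fraction

def clock_period_ns(freq_mhz):
    """Return the period of one clock cycle in nanoseconds."""
    return Fraction(1000) / Fraction(freq_mhz)

def processing_time_ns(freq_mhz, cycles_per_byte, num_bytes=1):
    """Time to process num_bytes when each byte costs cycles_per_byte cycles."""
    return clock_period_ns(freq_mhz) * cycles_per_byte * num_bytes

# Crystal frequencies named in the text (MHz).
for freq in (200, 133, 66):
    period = float(clock_period_ns(freq))
    per_byte = float(processing_time_ns(freq, cycles_per_byte=8))
    print(f"{freq} MHz: period {period:.2f} ns, 8 cycles = {per_byte:.2f} ns")
```

With the 200MHz crystal, one cycle lasts 5 ns, so an 8-cycle operation on a single byte takes 40 ns; the same operation clocked from the 66MHz crystal takes roughly three times as long.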

Although any input point can be used as an external clock, some input points are specifically routed by Xilinx for low-skew, low-power operation to deliver low-jitter signals at high frequencies. Regional clock lines on Acromag’s Virtex-5 FPGA modules are routed to a specific clock region and its neighbors. The functional equivalents on Virtex-4 FPGA modules are called local clocks. Global clock lines are those routed for availability throughout the FPGA fabric.

To interface external signals, Acromag’s Virtex-4 and Virtex-5 FPGA PMC modules feature a front I/O mezzanine and rear I/O access. Front mezzanine I/O is accessed via a plug-in AXM module, which contains transceivers to match the field I/O signaling requirements and provide FPGA isolation. Up to four of these transceivers attach to regional/local clocks within the FPGA. Rear I/O enters the FPGA device directly from the PMC module’s P4 connector. To maximize the speed and accuracy of signal capture and processing, Acromag makes four local clock lines available on the P4 port of its Virtex-4 FPGA modules. Five regional clock lines plus eight global clock lines are available on its Virtex-5 FPGA modules. Any of these clock lines can also be used as a standard I/O point for field wiring when necessary.

All of these raw clock inputs, whether from on-board crystals or a field I/O point, may require some level of conditioning. It may be necessary to correct symmetry, fan-out, or skew, or to adjust phase or frequency. To simplify this process, Xilinx provides a number of constructs. These clock management constructs include the Digital Clock Manager (DCM), Frequency Synthesis, Phase Locked Loop (PLL), Phase Matched Clock Divider (PMCD) and Delay Locked Loop (DLL).

DCMs can act as a zero-delay clock buffer and can also generate new system clocks based upon their input clock signal. When passed through a DCM, externally or internally derived system clocks are phase-aligned to the input clock to eliminate clock distribution delays. A simple set of such clocks can be generated by the PMCD, which provides output clocks derived from the input clock divided by 1, 2, 4 and 8 respectively. By multiplying and dividing the input clock with flexible ratios, a wide range of output clock frequencies can be generated (i.e., frequency synthesis). This is useful to accommodate segments of application logic that require either higher sampling rates or more processing cycles to maintain application task synchronization. Fine-resolution phase shifting of clocks is also possible, in increments as small as 1/256th of a clock period.
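The divider outputs, frequency synthesis, and phase-shift resolution described above reduce to simple ratios, sketched below in Python. This is a numerical illustration only, not Xilinx tooling or HDL: all function names are hypothetical, and the multiplier and divider limits of 32 are placeholder assumptions that should be checked against the data sheet for the specific device and construct.

```python
from fractions import Fraction

PMCD_DIVISORS = (1, 2, 4, 8)  # divide ratios named in the text

def pmcd_outputs_mhz(f_in_mhz):
    """Output clocks of a divide-by-1/2/4/8 block for a given input clock."""
    return [Fraction(f_in_mhz) / d for d in PMCD_DIVISORS]

def synthesized_mhz(f_in_mhz, multiply, divide):
    """Frequency synthesis: f_out = f_in * multiply / divide."""
    return Fraction(f_in_mhz) * multiply / divide

def find_ratio(f_in_mhz, f_target_mhz, max_multiply=32, max_divide=32):
    """Return the lowest-terms (multiply, divide) pair that hits the target
    exactly, or None if it exceeds the assumed (placeholder) limits."""
    ratio = Fraction(f_target_mhz) / Fraction(f_in_mhz)
    m, d = ratio.numerator, ratio.denominator
    if m <= max_multiply and d <= max_divide:
        return (m, d)
    return None

def phase_step_ps(f_mhz, steps_per_period=256):
    """Smallest phase increment in picoseconds at 1/256th of a period."""
    period_ps = Fraction(1_000_000) / Fraction(f_mhz)
    return period_ps / steps_per_period

print([float(f) for f in pmcd_outputs_mhz(200)])  # [200.0, 100.0, 50.0, 25.0]
print(find_ratio(200, 125))                       # (5, 8)
print(float(phase_step_ps(200)))                  # 19.53125
```

For a 200MHz input, the divider yields 200, 100, 50 and 25MHz clocks, a 125MHz clock would need a multiply-by-5, divide-by-8 ratio, and one phase step of 1/256th of the 5 ns period is about 19.5 ps.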

Used alone or in conjunction with other clock synthesis and management capabilities, the PLL module can remove input clock jitter or clean up output clocks for enhanced symmetry. By monitoring a sample of a DCM output clock, the DLL compensates for delays on the routing network by accounting for skew between the external input port and the individual clock loads within the device.

Just as in discrete hardware design, ensuring the accurate and timely clocking of inter-related logic segments is imperative for design success. Using FPGA modules and the available clock management tools makes the task of getting the timing right a lot easier.