Memory interface design for integrating accelerators with Xilinx Zynq platform

Bibliographic Details
Main Author: Srivastava, Shreya
Other Authors: Douglas Leslie Maskell
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/74341
Institution: Nanyang Technological University
Description
Summary: Field-Programmable Gate Arrays (FPGAs) are silicon chips containing configurable logic blocks connected by programmable interconnects, which allows many operations to run in parallel. This report presents a set of experiments on interfacing accelerators (HLS-generated RTL) with the Xilinx Zynq platform using Xillybus. To characterize Xillybus, the first experiment is a single-pipe loopback in which FIFOs are connected in a loopback; it shows that the throughput becomes nearly constant for larger input data sizes. The second experiment explores FPGA coprocessing, using calculation of the trigonometric sine function as an example, and introduces the concepts of a synthesized function, a wrapper and a host program. Sending a single set of data for processing proves inefficient because of the I/O overhead and the hardware and software latencies; in practical host programs, an array of structures is usually sent as the input instead. The third experiment is a color conversion that takes color values as input and returns the processed output values. A two-step optimization process is implemented to improve performance and efficiency. The saturation value for the number of inputs was close to 204,000 32-bit integers. The final experiment applies the color conversion example to a practical case: it takes an image as the input and generates the processed output image. Processing takes a short time given the size of the input data, which shows the importance of the optimization steps used in the experiment.
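
The two throughput observations in the summary (near-constant throughput for large inputs, and the inefficiency of sending a single data set) are both consistent with a simple fixed-overhead model. The sketch below is illustrative only: the function name `effective_throughput` and the values `OVERHEAD_S` and `PEAK_BPS` are assumptions chosen for the example, not measurements or code from the report.

```python
def effective_throughput(n_bytes, overhead_s, peak_bytes_per_s):
    """Bytes per second achieved when one transfer of n_bytes pays a fixed overhead.

    Model: total time = fixed per-transfer overhead + payload time at peak rate.
    """
    transfer_time = overhead_s + n_bytes / peak_bytes_per_s
    return n_bytes / transfer_time


# Hypothetical constants, NOT taken from the report.
OVERHEAD_S = 50e-6   # assumed fixed I/O and latency cost per transfer, in seconds
PEAK_BPS = 100e6     # assumed peak link rate, in bytes per second

if __name__ == "__main__":
    # Input sizes in 32-bit words, from a single value up to a large batch.
    for n_words in (1, 1_000, 204_000, 1_000_000):
        n_bytes = 4 * n_words
        rate = effective_throughput(n_bytes, OVERHEAD_S, PEAK_BPS)
        print(f"{n_words:>9} words: {rate / 1e6:8.2f} MB/s")
```

Under this model a single 32-bit word is dominated by the fixed overhead, while large batches approach the assumed peak rate, so the throughput curve flattens as the input grows.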