Master Thesis MSTR-2017-67

Bibliography: Qureshi, Kashif Wajid: A Scalable FFT Processor Architecture Based on Dynamic Partial Reconfiguration.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 67 (2017).
91 pages, English.
Abstract

Partial reconfiguration in Field Programmable Gate Arrays (FPGAs) was brought to market almost two decades ago, yet many designs still do not use this feature to its full potential. Designing systems with dynamic partial reconfiguration makes it possible to use resources more efficiently and to reduce the static power consumption of the device. This thesis proposes a scalable Fast Fourier Transform (FFT) architecture based on dynamic partial reconfiguration. The architecture addresses the demand for an FFT hardware architecture that supports various sample rate parameter values with optimal hardware resource utilization. The proposed architecture is realized with dynamic partial reconfiguration and a fixed number of Basic Processing Elements (BPEs) decided at design time. Dynamic reconfiguration is used to implement the interconnect network between this fixed number of BPEs. An algorithm that automatically generates the reconfiguration structure for the interconnect network is also implemented. The aim is to enable custom, scalable FFT implementations that trade the amount of resources against computation time. The symmetric structure of the FFT computation is exploited to reduce the dynamic reconfiguration overhead. The architecture is designed to use dynamic partial reconfiguration at the micro level. The results are compared with a conventional multiplexer-based BPE interconnect network. Compared with this classical architecture, the proposed design uses 40% fewer resources and dissipates 46% less power, making it more power efficient than its predecessor. These savings in resources and power come at the cost of throughput: the proposed system is considerably slower and has a very high latency compared with the classical implementation.

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Parallel Systems
Supervisor(s): Simon, Prof. Sven; Guhathakurta, Jajnabalkya
Entry date: May 29, 2019