Over the past few years there has been increasing interest in building custom computing machines (CCMs) as a way of achieving very high performance on specific problems. The advent of high-density field programmable gate arrays (FPGAs), in combination with new synthesis tools, has made it relatively easy to produce programmable custom machines without building application-specific hardware. In many cases, the performance achieved by an FPGA-based custom computer is attributed to the exploitation of massive concurrency in the underlying application. In this paper we explore the sources of speedup for irregular problems in which it is difficult to exploit such parallelism. We highlight five main sources of speedup that we have observed: the provision of high memory bandwidth, the use of flexible address-generation hardware, the use of gather-scatter array operations, the use of lookup tables, and the use of multiple tailored arithmetic units. By considering representative examples of such irregular problems, the paper illustrates that good performance is possible with the current generation of FPGA devices and RISC processors. The paper then explores whether this performance gain will persist with the next generation of RISC processors and FPGAs. It concludes that the only way to maintain the speedup is to alter the architecture of CCMs in combination with architectural changes to the FPGAs themselves.
|Number of pages||10|
|Publication status||Published - 1 Jan 1998|
|Event||Proceedings of the 1998 4th International Symposium on High-Performance Computer Architecture, HPCA - Las Vegas, United States|
Duration: 31 Jan 1998 → 4 Feb 1998
|Conference||Proceedings of the 1998 4th International Symposium on High-Performance Computer Architecture, HPCA|
|Period||31/01/98 → 04/02/98|