Emulating a large memory sequential machine with a collection of small memory ones

James Hanlon, Simon J. Hollis, David May

Research output: Contribution to journal, Article (Academic Journal)

Abstract

Sequential computation is well understood but does not scale well with current technology. Within the next decade, systems will contain large numbers of processors, potentially thousands per chip. Despite this, many computational problems exhibit little or no parallelism, and many existing formulations are sequential. It is therefore essential that highly parallel architectures can support sequential computation by emulating large memories with collections of smaller ones, thus supporting efficient execution of sequential programs or of sequential algorithms included as part of parallel programs. This paper presents a novel tiled parallel architecture that can scale to thousands of processors per chip and can deliver this ability. An interconnect with scalable, low-latency communication is essential for this, and a realistic construction of such a system with a high-degree switch and a Clos-based network is presented. Experimental evaluation shows that sequential programs can be executed with a slowdown of only a factor of two to three compared to a conventional sequential machine, and that the area is only roughly a factor of two larger. This seems an acceptable price to pay for an architecture that can switch between executing highly parallel programs and sequential programs with large memory requirements.
Original language: English
Article number: 1210.1158
Journal: arXiv
Issue number: 1210.1158
Publication status: Published - 3 Oct 2012

Keywords

  • cs.AR
  • cs.DC
  • C.1.4
