FEASIBLE MESH ARCHITECTURES WITH BROADCAST BUSES FOR HIGH PERFORMANCE PARALLEL COMPUTING

Sotirios G. Ziavras

Mesh-connected parallel computers have received tremendous emphasis in recent years, because their scalability facilitates the construction of massively parallel systems that could potentially achieve impressive peak performance. The conventional mesh architecture, however, cannot efficiently implement distant data transfers, so it is considered a special-purpose architecture suitable only for algorithms that involve local data transfers.

Several enhanced versions of the conventional mesh architecture have been proposed to improve the performance of distant data transfers. The most common enhancements in the literature are a single broadcast bus that covers all processors in the mesh, broadcast buses that cover each row and each column, and programmable switches around the processors. However, constructing parallel systems that employ these enhanced meshes is a Herculean or even impossible task because of their very high VLSI complexity. Additionally, the common assumption of single-cycle broadcasts is often unrealistic because the broadcast buses are long and have a large number of devices attached.

My motivation was to introduce enhanced mesh architectures of low VLSI complexity that could support the efficient implementation of distant data transfers. This project has resulted in the introduction of two mesh-connected architectures that contain sparse broadcast buses. These architectures are obtained by superimposing a p×p mesh of broadcast buses on the regular n×n processor mesh, where p < n. Both of the proposed architectures have lower cost than the very popular mesh with multiple broadcast, which has buses spanning each row and each column. Nevertheless, the proposed architectures retain to a large extent the powerful properties of the mesh with multiple broadcast. The new architecture that employs switches was even shown to often perform better than the higher-cost mesh with multiple broadcast. The new architectures were evaluated with respect to cost and to efficiency in implementing several important operations and application algorithms. The results show that these architectures are very promising alternatives to the mesh with multiple broadcast: they achieve impressive performance for algorithms that employ local and/or global data transfers, and their construction is feasible.
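
The cost argument above can be illustrated with a rough count of buses and bus attachments. The following Python sketch is only an illustration under stated assumptions, not the layout or cost model used in this work: it assumes that the sparse design has p row buses and p column buses, evenly spaced, each spanning a full row or column of the n×n mesh, and that p divides n. The function names (bus_lines, bus_count, bus_taps, processors_on_a_bus) are hypothetical.

```python
def bus_lines(n, p):
    """Rows (equivalently, columns) that carry a bus.

    ASSUMPTION: the p buses are evenly spaced and p divides n.
    """
    if not (0 < p <= n) or n % p != 0:
        raise ValueError("expected 0 < p <= n with p dividing n")
    stride = n // p
    return [i * stride for i in range(p)]


def bus_count(n, p):
    """Total number of buses: p row buses plus p column buses."""
    return 2 * len(bus_lines(n, p))


def bus_taps(n, p):
    """Total processor-to-bus attachments (each bus spans n processors).

    ASSUMPTION: every processor lying on a bus line is attached to it.
    """
    return bus_count(n, p) * n


def processors_on_a_bus(n, p):
    """Number of processors attached to at least one bus."""
    lines = set(bus_lines(n, p))
    return sum(1 for r in range(n) for c in range(n)
               if r in lines or c in lines)


if __name__ == "__main__":
    n = 8
    for p in (2, 4, n):  # p == n corresponds to a bus on every row and column
        print(p, bus_count(n, p), bus_taps(n, p), processors_on_a_bus(n, p))
```

Under these assumptions, setting p equal to n reproduces the bus count of the mesh with multiple broadcast (2n buses), while smaller p reduces both the number of buses and the number of attachments in proportion to p/n.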




Last updated 11/02/98, SGZ