FEASIBLE MESH ARCHITECTURES WITH BROADCAST BUSES FOR HIGH PERFORMANCE PARALLEL
COMPUTING
Sotirios G. Ziavras
Tremendous emphasis has been put in recent years on mesh-connected parallel
computers, because their scalability facilitates the construction of massively
parallel systems that could potentially achieve impressive peak performance.
The conventional mesh architecture, however, cannot implement distant data
transfers efficiently, so it is considered a special-purpose architecture
suitable only for algorithms that involve local data transfers.
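
To make the cost of distant transfers concrete, the following minimal sketch counts the hops a message needs in a conventional nxn mesh, assuming no wraparound links and one hop per neighbor-to-neighbor transfer. The function names are illustrative and do not come from the original work.

```python
def mesh_distance(src, dst):
    """Minimum number of hops between two processors, given as (row, col),
    in a conventional mesh without wraparound links (assumed here)."""
    (r1, c1), (r2, c2) = src, dst
    return abs(r1 - r2) + abs(c1 - c2)


def mesh_diameter(n):
    """Worst-case hop count in an n x n mesh: opposite corners."""
    return mesh_distance((0, 0), (n - 1, n - 1))
```

Under these assumptions the worst-case distance is 2(n-1), so it grows linearly with the side of the mesh, while a transfer between neighbors always costs a single hop.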
Several enhanced versions of the conventional mesh architecture
have been proposed to improve the performance of distant data transfers.
A single broadcast bus that covers all processors in the mesh, broadcast
buses that cover each row and each column, and programmable switches around
processors in the mesh are among the most common proposals
in the literature. However, the construction of parallel systems that employ
these enhanced meshes is a Herculean, if not impossible, task because of their
very high VLSI complexity. Additionally, the common assumption of single-cycle
broadcasts is often unrealistic due to the large length of the broadcast
buses and the large number of devices attached.
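
As a rough illustration of why bus length and loading matter, the sketch below tallies the number of buses and the number of processors attached to each bus for two of the enhanced meshes described above, for an nxn processor mesh. It is a simple count under the stated descriptions, not a VLSI cost model; the function and key names are hypothetical, and the switch-based variant is omitted because its cost depends on details not given here.

```python
def bus_loads(n):
    """Bus count and per-bus processor load for an n x n processor mesh.

    Covers only the two bus-based enhancements described in the text:
    one global bus, and one bus per row plus one bus per column.
    """
    return {
        "single_global_bus": {"buses": 1, "processors_per_bus": n * n},
        "row_and_column_buses": {"buses": 2 * n, "processors_per_bus": n},
    }
```

For n = 32, for example, the single global bus would have 1024 processors attached, and the row and column scheme would need 64 buses with 32 processors each.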
My motivation was to introduce enhanced mesh architectures of
low VLSI complexity that could support the efficient implementation of
distant data transfers. This project has resulted in the introduction of
two mesh-connected architectures that contain sparse broadcast buses. These
architectures are obtained by superimposing a pxp mesh of broadcast buses
on the regular nxn processor mesh, where p
Both of the proposed architectures have lower cost than the very popular
mesh with multiple broadcast, which has buses spanning each row and each
column. Nevertheless, they retain to a large extent the powerful
properties of that mesh. The new architecture that employs
switches was even shown to often perform better than the higher-cost mesh
with multiple broadcast. The new architectures were evaluated with respect
to cost and to efficiency in implementing several important operations and
application algorithms. The results show that these architectures are very
promising alternatives to the mesh with multiple broadcast. They achieve
impressive performance for algorithms that employ local and/or global data
transfers and their construction is feasible.
Last updated 11/02/98, SGZ