Ryuta Kawano
Keywords: Interconnection Networks, Deadlock-free Routing, High Performance Computing
End-to-end network latency has become an important issue for parallel applications on large-scale High Performance Computing (HPC) systems. For large parallel applications executed on next-generation HPC systems, MPI communication latency should be below one microsecond. Switch delays (e.g., about 100 nanoseconds in InfiniBand QDR) are typically larger than the wire and flit injection delays, even when serial and parallel converters are included. Inter-switch topologies should therefore have a low diameter and a low average shortest path length, both of which are measured in switch hops.
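As an illustration of the two hop-count metrics named above, the following minimal sketch computes the diameter and the average shortest path length of an inter-switch graph with breadth-first search. The function names `hop_metrics` and `ring`, and the adjacency-dictionary representation, are assumptions made for this example only; the graph is assumed to be undirected and connected.

```python
from collections import deque


def hop_metrics(adj):
    """Return (diameter, average shortest path length) in switch hops.

    adj maps each switch to an iterable of its neighbouring switches.
    The graph is assumed to be undirected and connected.
    """
    nodes = list(adj)
    diameter = 0
    total_hops = 0
    pairs = 0
    for src in nodes:
        # Breadth-first search gives hop counts from src to every switch.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(nodes):
            raise ValueError("graph is not connected")
        for dst, hops in dist.items():
            if dst != src:
                diameter = max(diameter, hops)
                total_hops += hops
                pairs += 1
    if pairs == 0:
        return 0, 0.0  # a single switch has no pairs to measure
    return diameter, total_hops / pairs


def ring(n):
    """Hypothetical baseline: n switches connected in a ring."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    print(hop_metrics(ring(8)))
```

Adding a single shortcut link to the ring lowers the average shortest path length, which is the effect that random links exploit on a larger scale.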
To meet this requirement, researchers have recently developed random topologies for inter-switch networks. Compared with conventional Torus or Fat-tree networks, these topologies drastically reduce the number of hops and can be applied efficiently to HPC systems, data centers, and many-core systems.
However, these irregular topologies often lack the scalability needed for feasible interconnection networks. First, they tend to increase the total cable length between cabinets and thus the cost. Second, they increase the number of routing table entries consumed on each switch. Third, they increase the time complexity of computing the assignment of Virtual Channels (VCs) to routing paths for deadlock freedom.
To improve their scalability, my research focuses on layout-conscious random topologies, in which the randomly connected links are subject to a length limit. These networks drastically reduce the total cable length with only a minimal increase in the number of hops compared with completely random networks.
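The idea of random links with a length limit can be sketched as follows. This is a simplified illustration and not the construction used in this research: it assumes switches placed on a hypothetical `width` x `height` grid, Manhattan distance as the cable length, and a per-switch degree cap. All function and parameter names are invented for the example, and the result is not guaranteed to be regular or connected.

```python
import random
from itertools import combinations


def manhattan(a, b):
    """Cable length between two grid positions (assumed Manhattan metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def length_limited_random_topology(width, height, degree, max_len, seed=0):
    """Randomly link switches on a grid, rejecting links longer than max_len.

    Returns an adjacency dict mapping each (x, y) switch to a set of
    neighbours. Each switch gets at most `degree` links.
    """
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(width) for y in range(height)]
    # Only pairs within the length limit are candidate links.
    candidates = [
        (a, b) for a, b in combinations(nodes, 2)
        if manhattan(a, b) <= max_len
    ]
    rng.shuffle(candidates)
    adj = {n: set() for n in nodes}
    for a, b in candidates:
        # Accept a link only while both endpoints still have free ports.
        if len(adj[a]) < degree and len(adj[b]) < degree:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def total_cable_length(adj):
    """Sum of link lengths; each undirected link is counted once."""
    return sum(manhattan(a, b) for a in adj for b in adj[a]) // 2


if __name__ == "__main__":
    topo = length_limited_random_topology(6, 6, degree=4, max_len=3, seed=1)
    print(total_cable_length(topo))
```

Raising `max_len` toward the grid diameter approaches a completely random network, so this parameter is what trades cable length against hop count in the sketch.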
My research aims to explore a scalable routing methodology that can be applied to layout-conscious random topologies to achieve feasible low-latency interconnection networks. The methodology consists of
a scalable routing method for layout-conscious random topologies, which exploits the irregularity and locality of these topologies to achieve both a small number of hops between nodes and small routing tables, and
an advanced VC assignment method, which has a small time complexity yet requires the same number of VCs as conventional VC assignment methods.
The proposed routing methodology provides better trade-offs between the achieved network latency and the number of required routing table entries, which improves scalability and implementation flexibility.