Efficient Fault-Tolerant Adaptive Routing under an unconstrained Set of Node and Link Failures for Many Cores System On Chip

Author(s): M. Dimopoulos, Yi Gang, M. Benabdenbi, L. Anghel

Doc. Source: Workshop on Dependable Multicore and Transactional Memory Systems (DMTM'14), (held jointly with the HIPEAC event)

Pages : 1-2

This work presents an online fault-tolerant routing algorithm for 2D mesh Networks-on-Chip. It combines an adaptive routing algorithm with neighbor fault-awareness and a new traffic-balancing metric. To cope with runtime permanent and temporary failures that may result in message corruption, message loss, or deadlocks, the routing algorithm is enhanced with packet retransmission and a new message recovery scheme. Simulation results for various network sizes and traffic patterns, under an unconstrained number of temporary and/or permanent node and link faults, demonstrate the scalability and efficiency of the proposed algorithm in tolerating the multiple failures likely to be encountered in deep submicron technologies. In the experiments, the proposed algorithm maintains a reliability of more than 99.38% for a 16x16 2D mesh network in the presence of 384 simultaneous link faults.