Efficient collective communication on heterogeneous networks of workstations

Bibliographic Details
Main Author: Banikazemi, Mohammad
Other Authors or Contributors: Moorthy, Vijay; Panda, Dhabaleswar K.
Format: Book chapter
Language: English
Topics:
Online Access: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=708518
Abstract: Networks of Workstations (NOW) have become an attractive alternative platform for high performance computing. Due to the commodity nature of workstations and interconnects and due to the multiplicity of vendors and platforms, the NOW environments are being gradually redefined as Heterogeneous Networks of Workstations (HNOW) environments. This paper presents a new framework for implementing collective communication operations (as defined by the Message Passing Interface (MPI) standard) efficiently for the emerging HNOW environments. We first classify different types of heterogeneity in HNOW and then focus on one important characteristic: communication capabilities of workstations. Taking this characteristic into account, we propose two new approaches (Speed-Partitioned Ordered Chain (SPOC) and Fastest-Node First (FNF)) to implement collective communication operations with reduced latency. We also investigate methods for deriving optimal trees for broadcast and multicast operations. Generating such trees is shown to be computationally intensive. It is shown that the FNF approach, in spite of its simplicity, can deliver performance within 1% of the performance of the optimal trees. Finally, these new approaches are compared with the approach used in the MPICH implementation on experimental as well as on simulated testbeds. On a 24-node existing HNOW environment with SGI workstations and ATM interconnection, our approaches reduce the latency of broadcast and multicast operations by a factor of up to 3.5 compared to the approach used in the existing MPICH implementation. On a 64-node simulated testbed, our approaches can reduce the latency of broadcast and multicast operations by a factor of up to 4.5. Thus, these results demonstrate that there is significant potential for our approaches to be applied towards designing scalable collective communication libraries for current and future generation HNOW environments.
Notes: File format: PDF. -- Available online via BECyT subscription (accessed 25-04-2008)
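
Illustrative sketch: the abstract names the Fastest-Node First (FNF) approach but does not describe its steps. The Python below is one plausible greedy reading of that name, not the authors' published algorithm. It assumes a simplified cost model in which each node has a fixed per-message send cost and receive overhead and network delay are ignored. The function name `fnf_broadcast_schedule` and the `send_cost` list are hypothetical.

```python
import heapq


def fnf_broadcast_schedule(send_cost, root):
    """Greedy 'fastest node first' broadcast schedule (illustrative only).

    send_cost[i] is the ASSUMED time node i needs to send one message.
    Returns (schedule, completion_time); schedule is a list of
    (sender, receiver, arrival_time) tuples in the order they are chosen.
    """
    # Nodes still waiting for the message, fastest (lowest send cost) first.
    waiting = sorted(
        (n for n in range(len(send_cost)) if n != root),
        key=lambda n: (send_cost[n], n),
    )
    # Min-heap of (time at which this node's next send would complete, node).
    ready = [(send_cost[root], root)]
    schedule = []
    completion = 0.0
    for receiver in waiting:
        # Pick the informed node whose next send finishes earliest.
        arrival, sender = heapq.heappop(ready)
        schedule.append((sender, receiver, arrival))
        completion = max(completion, arrival)
        # The sender may start another send as soon as this one finishes.
        heapq.heappush(ready, (arrival + send_cost[sender], sender))
        # The receiver may start forwarding once it holds the message.
        heapq.heappush(ready, (arrival + send_cost[receiver], receiver))
    return schedule, completion
```

With hypothetical send costs `[1, 1, 4, 4]` and node 0 as root, this sketch informs every node by time 2. With a slow root (`[4, 1, 1, 4]`, root 0) it takes until time 6, which shows why the order in which fast nodes receive the message matters on a heterogeneous network.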

MARC

LEADER 00000naa a2200000 a 4500
003 AR-LpUFIB
005 20250423182951.0
008 230201s1998 xxu o 000 0 eng d
024 8 |a DIF-M2341  |b 2428  |z DIF002240 
040 |a AR-LpUFIB  |b spa  |c AR-LpUFIB 
100 1 |a Banikazemi, Mohammad  |9 46102 
245 1 0 |a Efficient collective communication on heterogeneous networks of workstations 
500 |a Formato de archivo: PDF. -- Disponible en línea vía suscripción BECyT (Cons. 25-04-2008) 
520 |a Networks of Workstations (NOW) have become an attractive alternative platform for high performance computing. Due to the commodity nature of workstations and interconnects and due to the multiplicity of vendors and platforms, the NOW environments are being gradually redefined as Heterogeneous Networks of Workstations (HNOW) environments. This paper presents a new framework for implementing collective communication operations (as defined by the Message Passing Interface (MPI) standard) efficiently for the emerging HNOW environments. We first classify different types of heterogeneity in HNOW and then focus on one important characteristic: communication capabilities of workstations. Taking this characteristic into account, we propose two new approaches (Speed-Partitioned Ordered Chain (SPOC) and Fastest-Node First (FNF)) to implement collective communication operations with reduced latency. We also investigate methods for deriving optimal trees for broadcast and multicast operations. Generating such trees is shown to be computationally intensive. It is shown that the FNF approach, in spite of its simplicity, can deliver performance within 1% of the performance of the optimal trees. Finally, these new approaches are compared with the approach used in the MPICH implementation on experimental as well as on simulated testbeds. On a 24-node existing HNOW environment with SGI workstations and ATM interconnection, our approaches reduce the latency of broadcast and multicast operations by a factor of up to 3.5 compared to the approach used in the existing MPICH implementation. On a 64-node simulated testbed, our approaches can reduce the latency of broadcast and multicast operations by a factor of up to 4.5. Thus, these results demonstrate that there is significant potential for our approaches to be applied towards designing scalable collective communication libraries for current and future generation HNOW environments. 
534 |a Proceedings of International Conference on Parallel Processing, pp. 460-467, 1998. 
650 4 |a ARQUITECTURAS PARALELAS  |9 45380 
650 4 |a RENDIMIENTO DE LOS SISTEMAS  |9 43890 
650 4 |a ESTACIONES DE TRABAJO  |9 46103 
700 1 |a Moorthy, Vijay  |9 46104 
700 1 |a Panda, Dhabaleswar K.  |9 46105 
856 4 0 |u http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=708518 
942 |c CP 
952 |0 0  |1 0  |4 0  |6 A0004  |7 3  |8 BD  |9 76449  |a DIF  |b DIF  |d 2025-03-11  |l 0  |o A0004  |r 2025-03-11 17:02:37  |u http://catalogo.info.unlp.edu.ar/meran/getDocument.pl?id=404  |w 2025-03-11  |y CP 
999 |c 52136  |d 52136