The increasing computational and communication demands of the scientific and industrial communities require a clear understanding of the performance trade-offs involved in multi-core computing platforms. Such an analysis can help application and toolkit developers design better, topology-aware communication primitives suited to the needs of various high-end computing applications. In…