Next: Conclusion
Up: AE 597: HW #3
Previous: Program listing
Here are the timings for 16-processor runs on the NPACI
IBM SP2 (sp.npaci.edu).
| Case | Parallel run time (sec) | Elements comm. | Bytes comm. |
|------|-------------------------|----------------|-------------|
| 1 | 0.03526 | 16384 | 131072 |
| 2 | 0.03004 | 0 | 0 |
| 3 | 0.05720 | 65536 | 524288 |
| 4 | 0.03524 | 16384 | 131072 |
| 5 | 0.03516 | 16384 | 131072 |
The NPACI SP2 has 128 thin-node POWER2 Super Chip (P2SC) processors
running at 160 MHz, each with 256 MBytes of memory and a peak
performance of 640 MFLOPS.
The peak bi-directional data transfer rate between each node pair is
110 MB/second.
Cases 1, 2 and 3 are for CSHIFT(A,1,1) with (block,block),
(*,block) and (block,*) distributions respectively.
Case 4 is for CSHIFT(A,3,1) with (block,block) distribution
and case 5 is for triplet notation with (block,block)
distribution. The program uses double-precision array elements of
8 bytes each, so the number of bytes communicated is
simply the number of array elements communicated multiplied by
8.
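As a quick check of this conversion, the following sketch recomputes the byte counts from the element counts in the timing table. The names `BYTES_PER_ELEMENT`, `elements_comm` and `bytes_communicated` are illustrative and are not taken from the program listing.

```python
# Size of one double-precision element, as stated in the text.
BYTES_PER_ELEMENT = 8

# Elements communicated per case, copied from the timing table.
elements_comm = {1: 16384, 2: 0, 3: 65536, 4: 16384, 5: 16384}


def bytes_communicated(n_elements):
    """Convert a count of double-precision elements to bytes."""
    return n_elements * BYTES_PER_ELEMENT


bytes_comm = {case: bytes_communicated(n) for case, n in elements_comm.items()}

for case, nbytes in bytes_comm.items():
    print(f"Case {case}: {nbytes} bytes")
```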
| Case | Comm. time (msec) | Bytes | MB/sec | % of peak MB/s |
|------|-------------------|-------|--------|----------------|
| 1 | 5.22 | 131072 | 25.11 | 22.8 |
| 2 | 0.00 | 0 | - | - |
| 3 | 27.16 | 524288 | 19.30 | 17.5 |
| 4 | 5.20 | 131072 | 25.21 | 22.9 |
| 5 | 5.12 | 131072 | 25.60 | 23.3 |
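The communication time is calculated as the difference between
the run times of cases 1, 3, 4 and 5 and the run time of case 2,
which involves no communication. The transfer rate is the number of
bytes communicated divided by this time, and the last column expresses
it as a percentage of the 110 MB/second peak.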
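The calculation can be reproduced with a short script. This is a minimal sketch, assuming that 1 MB = 10^6 bytes and that the peak rate is the 110 MB/second quoted above; the variable and function names are illustrative and do not come from the program listing.

```python
PEAK_MB_PER_SEC = 110.0   # peak node-pair transfer rate quoted in the text
BASELINE_CASE = 2         # the case that involves no communication

# Parallel run time (sec) and bytes communicated, copied from the timing table.
run_time_sec = {1: 0.03526, 2: 0.03004, 3: 0.05720, 4: 0.03524, 5: 0.03516}
bytes_comm = {1: 131072, 2: 0, 3: 524288, 4: 131072, 5: 131072}


def comm_stats(case):
    """Return (comm. time in sec, MB/sec, percent of peak) for one case."""
    comm_time = run_time_sec[case] - run_time_sec[BASELINE_CASE]
    # Assumption: 1 MB = 10**6 bytes.
    mb_per_sec = bytes_comm[case] / comm_time / 1.0e6
    return comm_time, mb_per_sec, 100.0 * mb_per_sec / PEAK_MB_PER_SEC


# Skip the baseline case: its communication time is zero by definition.
stats = {case: comm_stats(case) for case in run_time_sec if case != BASELINE_CASE}

for case, (t, rate, pct) in stats.items():
    print(f"Case {case}: {t * 1e3:.2f} msec, {rate:.2f} MB/sec, {pct:.1f}% of peak")
```

Under these assumptions the computed MB/sec and percent-of-peak values agree with the corresponding columns of the table above.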
Anirudh Modi
3/20/1998