Clouds, Grids and Virtualisation
Learning Outcome 1: Characterise and critically evaluate high performance computing based architectures and their suitability for given applications.
Learning Outcome 2: Implement and execute applications using shared and distributed memory programming paradigms.
Learning Outcome 3: Describe and critically discuss the roles and applications of cloud and grid computing.
Complete all the tasks required:
Task 1
You are required to compute the temperature distribution for a rectangular 2D heat conduction problem simulating a plate, with boundary conditions set at top 10°C, bottom 30°C, left 40°C and right 50°C, over a range of problem sizes. To do this you are required to modify the codes to:
reflect the boundary conditions described above
report the execution time
Record the run-time of your code under a range of problem sizes using different levels of compiler optimization (e.g. -O1, -O2 etc).
Document your changes to your code in your report. Though you submit your code, you do not receive marks for it separately, so it is important that you highlight the changes in your report. Explain your changes.
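As an illustration of the two modifications above, the following is a minimal sketch in C and not the codes referred to in this brief. It assumes the grid is stored row-major in a flat array with row 0 as the top edge; the helper names set_boundaries and elapsed_seconds are hypothetical. Corner cells take the left/right values here, which is an arbitrary choice: with the usual four-neighbour update the corners are never read.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Boundary temperatures from the task description (degrees C). */
#define T_TOP    10.0
#define T_BOTTOM 30.0
#define T_LEFT   40.0
#define T_RIGHT  50.0

/* Hypothetical helper: fill the four edges of a rows x cols grid
 * stored row-major in t. Row 0 is assumed to be the top edge.
 * The left/right loop runs last, so corners hold left/right values. */
void set_boundaries(double *t, size_t rows, size_t cols)
{
    assert(t != NULL && rows >= 2 && cols >= 2);
    for (size_t j = 0; j < cols; j++) {
        t[j] = T_TOP;                        /* top edge    */
        t[(rows - 1) * cols + j] = T_BOTTOM; /* bottom edge */
    }
    for (size_t i = 0; i < rows; i++) {
        t[i * cols] = T_LEFT;                /* left edge   */
        t[i * cols + cols - 1] = T_RIGHT;    /* right edge  */
    }
}

/* Hypothetical helper: processor time in seconds since start,
 * where start was obtained from clock(). */
double elapsed_seconds(clock_t start)
{
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A typical use would be to call clock() before the solver loop and print elapsed_seconds(start) after it. Note that clock() reports processor time rather than wall-clock time, so it suits the serial code only.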
Task 2
You are then required to modify the applications you created in step 1 to produce a basic parallel version of the codes using OpenMP. The following commands will compile your parallel version on a platform that has OpenMP installed:
gcc -fopenmp jacobiOpenmp.c -o jacobiOpenmp
gcc -fopenmp gaussOpenmp.c -o gaussOpenmp
The parallel codes must include timers to report the parallel run-time of the code. This version must be tested to establish correct operation using 1, 2, 4, 8 and 16 threads, regardless of performance.
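The sketch below shows one possible shape for a basic parallel Jacobi sweep with a wall-clock timer. It is an illustration under stated assumptions, not the required solution: it assumes an n x n grid stored row-major, the usual four-neighbour average update, and separate old and new arrays. The names wall_seconds and jacobi_sweep are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds: omp_get_wtime() when built with -fopenmp,
 * otherwise a clock() fallback so the file still compiles serially. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: one Jacobi sweep over the interior cells.
 * Every iteration reads only from old and writes only to new_t,
 * so the rows can be divided among threads without conflicts. */
void jacobi_sweep(const double *old, double *new_t, int n)
{
    assert(old != NULL && new_t != NULL && old != new_t && n >= 3);
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            new_t[i * n + j] = 0.25 * (old[(i - 1) * n + j]
                                     + old[(i + 1) * n + j]
                                     + old[i * n + j - 1]
                                     + old[i * n + j + 1]);
        }
    }
}
```

The number of threads can be chosen at run time through the OMP_NUM_THREADS environment variable, for example OMP_NUM_THREADS=4 ./jacobiOpenmp, or in code with omp_set_num_threads(). Timing would wrap the whole iteration loop: record wall_seconds() before and after, then print the difference.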
Include in your report the printout of the temperatures for a 20x20 problem size for 1, 2, 4, 8 and 16 threads to demonstrate that the code works correctly.
Document your changes to your code in your report. Though you submit your code, you do not receive marks for it so it is important you highlight changes in your report. Explain the changes made.
Run the Gauss-Seidel code for only 1 iteration using 1 and 2 threads for a 20x20 problem size. Output the temperatures along with the timings and include this output in your report. Discuss the reasons for the differences in the solutions.
Task 3
Using the university HPC, you are to run performance tests with the OpenMP implementation you created in step 2. This will require that you remove most of the print output from the code and increase the problem size to provide sufficient work to demonstrate useful speedup. You are expected to provide speedup results:
for at least three problem sizes. You are unlikely to see much speedup for small domains, so use at least a 100x100 grid and a consistent tolerance, with a maximum of 10^-3.
for a range of thread counts (from 2 up to 8 threads)
When calculating the speedup of your parallel code, you should compare against the optimized single-processor version of your code that you produced in step 1. You will need to apply similar compiler optimizations to your parallel code. Please list your runtimes in a suitable unit.
Please report both your timings and the speedup from the serial version. Comment on the speedup: how does it compare to the theoretical maximum?
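For reference, the standard definitions of speedup and parallel efficiency can be written as below. The symbols are introduced here for illustration and are not part of the brief: T_serial is the run-time of the optimized serial code, T_p the run-time on p threads, and f the fraction of the serial run-time that can be parallelised (Amdahl's law).

```latex
\[
S(p) = \frac{T_{\mathrm{serial}}}{T_p}, \qquad
E(p) = \frac{S(p)}{p}, \qquad
S(p) \le \frac{1}{(1 - f) + f/p} \le p
\]
```

As a purely illustrative calculation (these are not measured values), f = 0.9 and p = 8 give a bound of 1 / (0.1 + 0.9/8) = 1 / 0.2125, or about 4.71, well below the ideal linear speedup of 8.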
Task 4
Using different OpenMP directives and clauses you are to further modify your OpenMP application to improve the parallel performance. You are expected to provide results that permit comparison with those you obtained in Step 3. Comment on the differences between optimising the Jacobi and Gauss-Seidel Methods. Make sure you document the changes made to the code and explain why you have done them.
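One example of the kind of clause that could be explored is a reduction, which lets the convergence check be computed inside the same parallel loop as the update. The sketch below is one option among many and not a prescribed solution. It assumes the same row-major n x n layout and four-neighbour Jacobi update as a basic version might use; the name jacobi_sweep_residual is hypothetical, and the max reduction requires OpenMP 3.1 or later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one Jacobi sweep that also returns the largest
 * absolute change in any interior cell, for use in a tolerance test.
 * schedule(static) gives each thread a contiguous block of rows;
 * reduction(max : max_diff) combines the per-thread maxima safely. */
double jacobi_sweep_residual(const double *old, double *new_t, int n)
{
    double max_diff = 0.0;
    assert(old != NULL && new_t != NULL && old != new_t && n >= 3);
    #pragma omp parallel for schedule(static) reduction(max : max_diff)
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            double prev = old[i * n + j];
            double v = 0.25 * (old[(i - 1) * n + j] + old[(i + 1) * n + j]
                             + old[i * n + j - 1] + old[i * n + j + 1]);
            double d = (v > prev) ? (v - prev) : (prev - v);
            if (d > max_diff) {
                max_diff = d;   /* thread-private copy inside the loop */
            }
            new_t[i * n + j] = v;
        }
    }
    return max_diff;
}
```

The solver loop would then stop once the returned value falls below the chosen tolerance. This pattern relies on the Jacobi update reading only the previous iterate; a Gauss-Seidel update writes in place, so the same directive does not carry over unchanged.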
Your report is required to provide details of your implementation of steps 1 to 4 as described above. The report should include discussion of your solutions and provide a clear description of: the code changes you have implemented, your compilation and execution processes, and your test cases. For steps 3 and 4 you are expected to provide tabular and graphical results. Comment on the differences between the two methods and the effect on parallelisation. Your zip file should provide suitably named source code files for each of your implementations. The report should be approximately 2000 words, excluding code snippets and results tables.