http://www.ccs.uky.edu/~douglas/cs521

Syllabus    CSEP    Mailing List    Survey  ssh   

 

Homework 2

Introduction

Please read the introduction to sparse matrices.  When emailing me your solutions to the parts of this assignment, please use the email hyperlink associated with each part.

Part 1 Due January 22

Join a group of up to 5 and email me information (these groups are different from the talking teams)Loners beware.  Send only one message per group.

1. Who is in your group?

2. Cite a source for computing a triple matrix multiplication:  Author(s), Title, Journal or Book Publisher, Year, and Pages.

Part 2 Due February 14

Write a Matlab code that computes the nonzero structure of RAP, where each matrix is sparse.  A sample set of matrices is given by hw2-2.m.  Use simply linked lists to determine the nonzero structure.  Do not use full matrices.  In Matlab, a 2 by something array is useful in storing linked lists.  An ordered list of {(i,j) | Cij != 0} should be you solution to be emailed to me.

Part 3 Due February 19 (February 15th, too)

This is a group mini-swap assignment.  First, post your group's code(s) on your web site no later than 7am on February 15.  If you post more than one, indicate which is your primary one.

Find examples that break the primary code of your counterpart group (1/4 and 2/3).  Send the examples to the other group as you find them.  Fix your own code if you receive counterexamples to the correctness of your code.

Part 4 Due February 28

Write the numeric calculation in Matlab in a form that will be easy to translate into C, C++, or Fortran.  Use the compressed row storage scheme described in the introduction to sparse matrices.

Part 5 Due March 8

Write your code to work on a single processor in C.  Please demonstrate it on Linux and HP-UX.  Windows would be nice, too.  (More details on this part will be posted soon.)  If you want C++, write a wrapper class that calls your C implementation.

Part 6 Due April 11

You will use OpenMP to parallelize your code from Part 5.  There are a number of ways of parallelizing the multiplication.  Since the data is in shared memory on a single node of a computer (albeit with many processors), assigning rows to processors is fairly easy.  Load balancing is a fundamental concept in parallel computing.  It is also time consuming to do well the first time anyone does it.  Getting the right answer is more important, particularly when the deadline is immediate.

You should first assign the same number of rows to processors.  Start small: one processor.  Then see if you can get two processors to work.  Then try three to eight.  If you are ambitious, try 32 processors.

Part 7 Due April 25

You will attempt to use MPI.  This is far more difficult than OpenMP.  The data must be assigned to processors by rows.  You will be passing large chunks of data back and forth between processors in an unpleasant manner.  Load balancing is really crucial in this part, but once again, getting the right answer is far more important than near perfect load balancing.

 
 

Cheers,
Craig C. Douglas

Last modified: