Here are a few tips on coding in Matlab.
First, Matlab can represent matrices in both dense and sparse formats. The dense format is the first you would think of: it stores everything. The sparse format just stores the non-zeros. To an example of the difference in space required, try this:
>> a = eye(1000);
>> as = sparse(a); % this creates a sparse version
>> whos
Name Size Bytes Class Attributes
a 1000x1000 8000000 double
as 1000x1000 24008 double sparse
You can extract a list of the non-zero entries of a sparse (or dense) matrix by using the find command:
>> a = diag([1 1 1],1); a(1,4) = 1; a(4,4) = 0; a = a + a'
a =
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0
>> [ai,aj,av] = find(a); [ai,aj,av]
ans =
2 1 1
4 1 1
1 2 1
3 2 1
2 3 1
4 3 1
1 4 1
3 4 1
The row "4 1 1" in this output corresponds to ai(2)=4, aj(2) = 1, av(2) =1.
It means that the entry in the 4th row and 1st column is a 1.
To learn more about the find command, just type "help find" or "doc find" in Matlab.
To just get the entries in the upper-triangular part of a, you would type
>> [ai,aj,av] = find(triu(a)); [ai,aj,av]
ans =
1 2 1
2 3 1
1 4 1
3 4 1
You can also assemble a sparse matrix from lists like this by using the sparse command:
>> b = sparse(ai,aj,av)
b =
(1,2) 1
(2,3) 1
(1,4) 1
(3,4) 1
>> full(b) % so you can see what you are getting
ans =
0 1 0 1
0 0 1 0
0 0 0 1
If you want to control the size of the matrix that sparse produces, you do it like this:
>> b = sparse(ai,aj,av,4,4); full(b)
ans =
0 1 0 1
0 0 1 0
0 0 0 1
0 0 0 0
Using sparse to create a matrix is much more efficient than filling in the entries one-by-one. I'll time both ways of doing this using tic and toc, while creating a random choose-1 graph. First, the fast way:
>> n = 10000;
>> a = sparse(n,n)
a =
All zero sparse: 10000-by-10000
>> ai = 1:n; aj = randperm(n);
>> tic; a = sparse(ai,aj,1,n,n); a = a + a'; toc
Elapsed time is 0.004473 seconds.
And, the slow way.
>> a = sparse(n,n)
a =
All zero sparse: 10000-by-10000
>> tic; for i = 1:n, a(i,aj(i)) = 1; a(aj(i),i) = 1; end; toc
Elapsed time is 0.289366 seconds.
If you ever want to figure out which parts of you code are taking the longest to run, the easiest way is to use the matlab profiler.