Decreasing Computational Time with Parfor and variable slicing

Question

Good day,

I am fairly new to parallel computing, and so far I feel like I have been successful. However, I have written code that runs in parallel with parfor, and I am using a HUGE data set on the order of 34000 x n. I was wondering if there is a way to make my computations even more efficient. I also get a message saying that a variable is indexed but not sliced in a parfor loop, which might result in unnecessary communication overhead. Here is a copy of my code:

softTFIDFMat = zeros(n,n);
parfor i=1:n
  temp = zeros(1,n);
  for j=i:n
      score = tfidfn(i,:)'*tfidfn(j,:).*jMat;
      score = sum(score(:));
      temp(j) = score;
  end
  softTFIDFMat(i,:) = temp;
end

tfidfn is a sparse matrix that is 34303 x n, where n is generally > 2000 and jMat is also a sparse double. Any help would be appreciated. Computational time is a little under 24 hours as of now.

Answer 1

Jan on 17 Apr 2013

tfidfn(i,:)'*tfidfn(j,:)

This consumes much more time than column-oriented indexing:

tfidfnT = transpose(tfidfn);
...
score = tfidfnT(:, i) * tfidfnT(:, j)' .* jMat;

Adam Filion on 17 Apr 2013

Edited: Adam Filion on 17 Apr 2013

It has to do with how MATLAB stores data in memory. For a matrix, it stores it column-wise, meaning that a matrix like

1 2 3
4 5 6

is stored in memory as

1 4 2 5 3 6

So when you grab a column like tfidfnT(:,i), that is a contiguous chunk in memory. A row, like tfidfnT(i,:), is non-contiguous, which is more time-consuming to work with, particularly for larger data sets.

EDIT

I didn't notice that your data was sparse. Are you using the sparse data type? If you are then I'm not sure why it would make a difference, as I believe the sparse data type is stored differently than a normal matrix.

Ryan on 17 Apr 2013

I am, yes, and it actually did make a small difference.
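On the sparse question: MATLAB stores sparse matrices in compressed column form, so pulling out a column is generally cheaper than pulling out a row for sparse data as well, which fits the small difference reported here. The sketch below only checks that the row-indexed and column-indexed expressions produce the same outer product. The sizes and the names T, Tt, rowVersion and colVersion are illustrative stand-ins, not values from the thread.

```matlab
% Small random stand-in for tfidfn (hypothetical sizes, not the real data).
rng(2);
T  = sprand(4, 6, 0.5);
Tt = T.';                 % transpose once, outside any loop

i = 2;
j = 3;

% Row-oriented form from the question.
rowVersion = T(i,:)' * T(j,:);

% Column-oriented form from Answer 1.
colVersion = Tt(:,i) * Tt(:,j)';

% Both are the same 6-by-6 sparse outer product.
sameResult = isequal(rowVersion, colVersion);
```

Only the access pattern changes, so any speed-up would come from indexing rather than from the arithmetic. How large it is on the real data would need to be timed.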

Answer 2

Edric Ellis on 17 Apr 2013

It looks as though each iteration of your PARFOR loop accesses every element of "tfidfn", so you cannot slice it. Even if "tfidfn" were dense, it's still "only" about 0.5GB, and so the transfer time for that to each worker is very likely to be completely insignificant compared to 24 hours for the complete computation.
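The "about 0.5GB" figure can be reproduced with a quick calculation. This assumes n = 2000 (the lower bound given in the question) and 8 bytes per double; the variable names are illustrative.

```matlab
% Size of tfidfn if it were stored as a dense double matrix.
numRows        = 34303;   % from the question
numCols        = 2000;    % assumed: the question says n is generally > 2000
bytesPerDouble = 8;

denseBytes = numRows * numCols * bytesPerDouble;   % 548848000 bytes
denseGB    = denseBytes / 1e9;                     % roughly 0.55 GB

fprintf('Dense size: %.2f GB\n', denseGB);
```

The sparse matrix actually used would normally be smaller than this, depending on how many nonzeros it holds.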

Ryan on 17 Apr 2013

Thank you, Edric, for taking the time to read and comment on my code. It does, however, take a long while to run, and that is with matlabpool 12.

Ryan on 17 Apr 2013

I saw a post by you some time ago that suggested a solution that looked like this

for idx = 1:n*n
  [k,j] = ind2sub([n n],idx);
  % assumes tfidfn has been transposed so that columns are indexed;
  % the second factor is transposed to form an outer product
  A(idx) = sum(sum(tfidfn(:,k)*tfidfn(:,j)'.*jMat)); % in my case
end

I think this might actually work fairly well if I can loop over only the upper triangular values. Do you know how I could do that?
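One possible way to restrict the loop to the upper triangle, sketched with small random stand-in data: build the list of upper-triangular linear indices once with find(triu(true(n))) and loop over that list instead of 1:n*n. The names Tt, J, upperIdx, rowIdx, colIdx and vals are illustrative and do not come from the thread. Tt is assumed to hold one vector per column, as with the transposed tfidfnT from Answer 1.

```matlab
% Hypothetical small sizes and random data standing in for the real inputs.
rng(1);
n  = 5;                     % number of vectors being compared
p  = 6;                     % length of each vector
Tt = sprand(p, n, 0.5);     % one vector per column (like tfidfnT)
J  = sprand(p, p, 0.5);     % stand-in for jMat

% Linear indices of the entries with row <= column, and their subscripts.
upperIdx = find(triu(true(n)));
[rowIdx, colIdx] = ind2sub([n n], upperIdx);

% One loop iteration per upper-triangular entry.
vals = zeros(numel(upperIdx), 1);
parfor t = 1:numel(upperIdx)
    k = rowIdx(t);
    j = colIdx(t);
    score = Tt(:,k) * Tt(:,j)' .* J;
    vals(t) = full(sum(score(:)));
end

% Scatter the results back into an n-by-n matrix; the lower triangle stays 0.
A = zeros(n, n);
A(upperIdx) = vals;
```

This visits n*(n+1)/2 pairs instead of n*n, and every iteration does the same amount of work, which may balance better across workers than an outer loop whose inner loop shrinks as i grows. Tt and J are still sent whole to each worker, since every iteration can touch any column.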

Answer 3

Ryan on 17 Apr 2013

Since the output matrix is symmetric, perhaps I can break up the indexing for loop i, or would this not help at all? For a tfidfn matrix that is 253x500, the computational time is about 2.39 seconds.
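On the symmetry point, it may help to note that the score computed in the loop, sum(sum(tfidfn(i,:)'*tfidfn(j,:).*jMat)), equals tfidfn(i,:)*jMat*tfidfn(j,:)'. Entry (i,j) of the result is therefore entry (i,j) of the product tfidfn*jMat*tfidfn', restricted to the rows the loop visits, and the result is symmetric whenever jMat is symmetric. The sketch below checks this on small random data. The names T, J, loopResult and productResult and the sizes are illustrative assumptions, and this has not been timed on data of the size described in the question.

```matlab
% Hypothetical stand-ins: T plays the role of the rows of tfidfn that the
% loop visits, J plays the role of jMat.
rng(0);
numRows = 5;
numCols = 7;
T = sprand(numRows, numCols, 0.5);
J = sprand(numCols, numCols, 0.5);

% Reference: the double loop from the question, run serially.
loopResult = zeros(numRows, numRows);
for i = 1:numRows
    for j = i:numRows
        score = T(i,:)' * T(j,:) .* J;
        loopResult(i,j) = full(sum(score(:)));
    end
end

% Same values from two sparse products, keeping only the upper triangle.
productResult = full(triu(T * J * T'));

% Largest absolute difference between the two approaches.
maxAbsDiff = max(abs(loopResult(:) - productResult(:)));
```

If the product form fits in memory for the real data, it would replace the explicit loops with two sparse multiplications. Whether that is faster in practice depends on the sparsity of tfidfn and jMat and would need to be measured.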

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Decreasing Computational Time with Parfor and variable slicing

0 Comments
Show -2 older comments Hide -2 older comments

Accepted Answer

3 Comments
Show 1 older comment Hide 1 older comment

More Answers (2)

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Tags

Community Treasure Hunt

Decreasing Computational Time with Parfor and variable slicing

0 Comments Show -2 older comments Hide -2 older comments

Accepted Answer

3 Comments Show 1 older comment Hide 1 older comment

More Answers (2)

2 Comments Show None Hide None

0 Comments Show -2 older comments Hide -2 older comments

Categories

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

3 Comments
Show 1 older comment Hide 1 older comment

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments