Indexing string in loop

RP on 28 Jun 2021
Commented: Cris LaPierre on 29 Jun 2021
Hi everyone!
I have a problem with a loop. I am trying to write a script that runs a few tasks for several files, so I tried to index each name in a list. But when I print out
fullfile(datapath,sprintf('%s',author),'A',sprintf('%s',mask))
I see that a = 1 returns only the first letter, not the whole name. I tried different versions of this, but I can't manage to index the whole name. Any help would be much appreciated!
datapath = 'D:\DATA\test\';
author_list = ['Peterson', 'Jacobs'];
mask_list = ['mask_peterson.nii,1','mask_jacobs.nii,1'];
for a = 1:2
for m = 1:2
author = author_list(a)
mask = mask_list(m)
matlabbatch{1}.spm.util.imcalc.input = {fullfile(datapath,sprintf('%s',author),'A',sprintf('%s',mask))};
matlabbatch{1}.spm.util.imcalc.output = sprintf('1%s_thr',author);
matlabbatch{1}.spm.util.imcalc.outdir = {fullfile(datapath,sprintf('%s',author),'A')};
%...
%...
%...
end
end
  2 Comments
Stephen23 on 28 Jun 2021
Edited: Stephen23 on 28 Jun 2021
Square brackets are a concatenation operator: they concatenate arrays together. So your code:
['Peterson', 'Jacobs']
concatenates two character vectors into one vector, and so is exactly equivalent to writing this:
'PetersonJacobs'
I doubt that is very useful for you. Most likely you should be storing those character vectors in a cell array:
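A minimal sketch of the difference, using the names from the question. The variable names joined, first_char, first_name, and first_cell are illustrative only and do not appear in the original post.

```matlab
% Square brackets join character vectors into one long vector.
joined = ['Peterson', 'Jacobs'];       % 1x14 char: 'PetersonJacobs'
first_char = joined(1);                % 'P', a single character

% Curly braces build a cell array that keeps each name separate.
author_list = {'Peterson', 'Jacobs'};  % 1x2 cell
first_name = author_list{1};           % 'Peterson', the whole character vector
first_cell = author_list(1);           % 1x1 cell that still wraps 'Peterson'
```

Note that parentheses on a cell array return a smaller cell array, while curly braces return the content itself, which is what sprintf and fullfile expect here.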
RP on 29 Jun 2021
Thank you for the answer and the link, you are right, this works perfectly!


Answers (2)

Walter Roberson on 28 Jun 2021
datapath = 'D:\DATA\test\';
% Curly braces create cell arrays, so each name stays a separate character vector.
author_list = {'Peterson', 'Jacobs'};
mask_list = {'mask_peterson.nii,1','mask_jacobs.nii,1'};
for a = 1:2
    for m = 1:2
        % Curly-brace indexing returns the stored character vector itself.
        author = author_list{a}
        mask = mask_list{m}
        matlabbatch{1}.spm.util.imcalc.input = {fullfile(datapath,sprintf('%s',author),'A',sprintf('%s',mask))};
        matlabbatch{1}.spm.util.imcalc.output = sprintf('1%s_thr',author);
        matlabbatch{1}.spm.util.imcalc.outdir = {fullfile(datapath,sprintf('%s',author),'A')};
        %...
        %...
        %...
    end
end
  3 Comments
Stephen23 on 29 Jun 2021
@Rose Potter: good question. In general there is no big noticeable difference in speed, so you should pick whichever one is best for your data processing. A few distinctions:
  • If you are manipulating characters, character arrays/vectors give direct access to the character codes.
  • The string class is a container class: each string element contains a character vector of any length.
  • The string class has a number of overloaded operators and convenience syntaxes designed to make it easier to work with.
  • The string class was introduced in R2016b, so if you need to run your code on earlier versions, use a cell array of character vectors.
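The distinctions above can be sketched roughly as follows. This is only an illustration: all variable names are hypothetical, and the double-quote literal syntax assumes a release that supports it.

```matlab
c = 'Peterson';                 % character vector (1x8 char)
s = "Peterson";                 % string scalar (1x1 string)

code = double(c(1));            % 80, the character code of 'P'
n_char = numel(c);              % 8, one element per character
n_str = numel(s);               % 1, a single string element holds all the text
len_str = strlength(s);         % 8, length of the text inside the string

out_name = s + "_thr";          % "Peterson_thr": + is overloaded as concatenation
as_char = char(s);              % convert back to a character vector when needed
names = cellstr(["Peterson", "Jacobs"]);  % cell array of character vectors
```

Converting with char or cellstr is one way to hand string data to code that expects the older character-vector types.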
Walter Roberson on 29 Jun 2021
N = 10000;
C = cell(1,N);
for K = 1 : N
C{K} = char(randi([32 127], 1, randi([5 200])));
end
fprintf('conversion speed\n');
conversion speed
tic
S = string(C);
toc
Elapsed time is 0.017607 seconds.
fprintf('basic indexing speed, cells\n');
basic indexing speed, cells
tic
for K = 1 : N
C{K};
end
toc
Elapsed time is 0.011222 seconds.
fprintf('last character indexing speed, cells\n');
last character indexing speed, cells
tic
for K = 1 : N
C{K}(end);
end
toc
Elapsed time is 0.011354 seconds.
fprintf('basic indexing speed, strings\n')
basic indexing speed, strings
tic
for K = 1 : N
S(K);
end
toc
Elapsed time is 0.019268 seconds.
fprintf('last character indexing speed, strings using {}\n')
last character indexing speed, strings using {}
tic
for K = 1 : N
S{K}(end);
end
toc
Elapsed time is 0.022054 seconds.
fprintf('last character indexing speed, strings extractAfter\n');
last character indexing speed, strings extractAfter
tic
for K = 1 : N
extractAfter(S(K), strlength(S(K)));
end
toc
Elapsed time is 0.034476 seconds.
fprintf('last character, cell regexp\n');
last character, cell regexp
tic
regexp(C, '.$', 'once', 'match');
toc
Elapsed time is 0.030094 seconds.
fprintf('last character, string regexp\n');
last character, string regexp
tic
regexp(S, '.$', 'once', 'match');
toc
Elapsed time is 0.032894 seconds.
fprintf('last character, string regexppattern\n');
last character, string regexppattern
tic
extractAfter(S, regexpPattern('.$'));
toc
Elapsed time is 0.032256 seconds.
So cell arrays are anywhere from just barely faster to two to three times as fast (e.g. when the string code uses the official extractAfter with strlength() instead of the lower-level {}(end) indexing).
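Single tic/toc measurements like the ones above can vary noticeably from run to run. As a hedged alternative that was not used in the original comment, the built-in timeit function runs a function handle several times and reports a typical time. The sketch below re-times only the two regexp variants; the names t_cell and t_str are illustrative.

```matlab
% Build the same kind of test data as above: N random character vectors.
N = 10000;
C = cell(1, N);
for K = 1:N
    C{K} = char(randi([32 127], 1, randi([5 200])));
end
S = string(C);

% timeit takes a zero-argument function handle and returns seconds.
t_cell = timeit(@() regexp(C, '.$', 'once', 'match'));
t_str  = timeit(@() regexp(S, '.$', 'once', 'match'));
fprintf('cell: %.6f s, string: %.6f s\n', t_cell, t_str);
```

The absolute numbers depend on the machine and MATLAB release, so only the ratio between the two is worth comparing.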



Cris LaPierre on 28 Jun 2021
One way is to use strings instead of char arrays.
datapath = "D:/DATA/test";
author_list = ["Peterson", "Jacobs"];
mask_list = ["mask_peterson.nii,1","mask_jacobs.nii,1"];
for a = 1:length(author_list)
    for m = 1:length(mask_list)
        input = fullfile(datapath,author_list(a),'A',mask_list(m))
    end
end
input = "D:/DATA/test/Peterson/A/mask_peterson.nii,1"
input = "D:/DATA/test/Peterson/A/mask_jacobs.nii,1"
input = "D:/DATA/test/Jacobs/A/mask_peterson.nii,1"
input = "D:/DATA/test/Jacobs/A/mask_jacobs.nii,1"
  3 Comments
Walter Roberson on 29 Jun 2021
A = ['PQR'; 'STU']
A = 2×3 char array
    'PQR'
    'STU'
class(A)
ans = 'char'
size(A)
ans = 1×2
     2     3
A(:,1)
ans = 2×1 char array
    'P'
    'S'
B = ["PQR"; "STU"]
B = 2×1 string array
    "PQR"
    "STU"
class(B)
ans = 'string'
size(B)
ans = 1×2
     2     1
B(:,1)
ans = 2×1 string array
    "PQR"
    "STU"
Apostrophes create character vectors, which are vectors of class char. Indexing with a scalar gets you one character. A(:,1) asks for the first column of A, which is a 2 x 1 array of char. Character arrays can be used as if they were numeric in a number of different contexts. For example,
A(1,1)+0
ans = 80
Here the single character code stored at A(1,1), corresponding to 'P', is automatically converted to double, and then 0 is added to the result. So 'P' is character #80.
Double quotes, on the other hand, create string objects. Using () indexing on string objects gets you entire strings. B(:,1) gets you the first column of string objects. There are a number of operations defined for string objects that behave differently than for character arrays, such as
B(1,1)+0
ans = "PQR0"
What has happened here is that for string objects, the + operator is defined as concatenation. In addition, + between a string object and a numeric value first converts the numeric value to its string representation, so the numeric value 0 becomes "0". Then "PQR" + "0" appends, giving "PQR0".
There are uses for both methods of operating with characters.
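To put the two behaviours side by side, here is a small sketch that reuses A and B from above. The other variable names are illustrative only.

```matlab
A = ['PQR'; 'STU'];           % 2x3 char array
B = ["PQR"; "STU"];           % 2x1 string array

code = A(1,1) + 0;            % 80: the char is converted to double before adding
shifted = char(A(1,:) + 1);   % 'QRS': arithmetic on the codes, then back to char
joined_char = [A(1,:) '0'];   % 'PQR0': square brackets concatenate char vectors
joined_str = B(1,1) + 0;      % "PQR0": + on a string converts 0 to "0" and appends
last_char = B{1}(end);        % 'R': curly braces reach the char vector inside a string
```

With character arrays, concatenation needs square brackets because + performs arithmetic on the codes; with strings, + itself concatenates.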
Cris LaPierre
Cris LaPierre on 29 Jun 2021
"One way is to use strings instead of char arrays."

