# Extract number after specific words

Rachele Franceschini on 19 Aug 2021
Edited: DGM on 27 Aug 2021
I have an Excel file with text and numbers. Some cells contain text that includes latitude and longitude values. Is there a way to extract the numbers by specifying the text that precedes them?
For example: extract the number after the word "latitude" and the number after the word "longitude".
Rachele Franceschini on 19 Aug 2021
For example, the column name is: entities.
Each cell contains text that also includes latitude 45,0000 and longitude 2,00000.
I would like to get the numbers that follow those words.

DGM on 19 Aug 2021
Edited: DGM on 19 Aug 2021
Something like this?
C = {'latitude 45,0000','longitude 2,0000';
'latitude 47,5000','longitude 5,0000';
'latitude 50,8000','longitude 10,0000'};
% lookbehind: match a number only when it directly follows 'latitude ' or 'longitude '
D = regexp(C,'(?<=(?:latitude|longitude) )\d+,?\d*','match');
D = str2double(reshape(strrep(vertcat(D{:}),',','.'),size(C)))
D = 3×2
   45.0000    2.0000
   47.5000    5.0000
   50.8000   10.0000
That's assuming that all the lines are formatted the same and that the comma is the decimal separator.
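If the text sits in a table column (for instance after importing the spreadsheet with readtable), the same idea can be applied to that column directly. This is only a sketch: the table below is built in memory as a stand-in for the imported file, the column name `entities` is taken from the question, and the new column names `latitude` and `longitude` are my own choice. It also assumes at most one latitude and one longitude per cell.

```matlab
% Stand-in for something like T = readtable('myfile.xlsx');
% (the file name is hypothetical, so the table is built by hand here)
T = table({'some text latitude 45,0000 and longitude 2,0000'; ...
           'other text latitude 47,5000 and longitude 5,0000'}, ...
          'VariableNames', {'entities'});

C = T.entities;   % cell array of character vectors

% first number directly after each keyword; '' if the keyword is absent
lat = regexp(C, '(?<=latitude\s)\d+[.,]?\d*', 'match', 'once');
lon = regexp(C, '(?<=longitude\s)\d+[.,]?\d*', 'match', 'once');

% convert the decimal comma to a dot, then to numeric (NaN where nothing matched)
T.latitude  = str2double(strrep(lat, ',', '.'));
T.longitude = str2double(strrep(lon, ',', '.'));
```

Using one expression per keyword means the result does not depend on which of the two comes first in the text.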
DGM on 20 Aug 2021
Edited: DGM on 27 Aug 2021
Without formatting, this is ambiguous.
C = {aa bbb cccccc dddddd, latitude 45,0000 longitude 2,0000;
aaa bbbb cc, latitude 46,00000 longitude 2,00000;
aaaaa bbbbb cccc ddd eeee fffffff, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000}
I'm going to assume it's just a 3x1 cell array with two latitude/longitude pairs on the third line.
% extra prefix chars per line
% mixed comma/dot decimal sep
% multiple lat/lon per row
C = { 'aa bbb cccccc dddddd, latitude 45.0500 longitude 2.0500';
'aaa bbbb cc, latitude 46,00000 longitude 2,00000';
'aaaaa bbbbb cccc ddd eeee fffffff, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000'};
% lookbehind on the keyword; [,.] accepts either decimal separator
D = regexp(C,'(?<=(?:latitude|longitude) )\d+[,.]?\d*','match');
D = cellfun(@(x) reshape(x.',2,[]).',D,'uniform',false);
D = str2double(reshape(strrep(vertcat(D{:}),',','.'),[],2))
D = 4×2
   45.0500    2.0500
   46.0000    2.0000
   46.0000    2.0000
   49.0000    9.0000
Note that there are more rows in D than in C, since some lines have multiple entries.
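Since the rows of D no longer line up with the rows of C, it may help to keep track of which line each pair came from. One possible way, as a sketch rather than the only approach, is to use named tokens and record the source row. It assumes each latitude is immediately followed by its longitude; the token names `lat` and `lon` and the variable `rowIdx` are just names I picked for illustration.

```matlab
C = {'aa bbb cccccc dddddd, latitude 45.0500 longitude 2.0500';
     'aaaaa bbbbb, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000'};

% one named token per value; a match is one complete latitude/longitude pair
expr = 'latitude\s+(?<lat>\d+[.,]?\d*)\s+longitude\s+(?<lon>\d+[.,]?\d*)';
S = regexp(C, expr, 'names');           % one struct array per row of C

n = cellfun(@numel, S);                 % number of pairs found in each row
rowIdx = repelem((1:numel(C)).', n);    % source row in C for every pair

S = [S{:}];                             % flatten to a single struct array
lat = str2double(strrep({S.lat}, ',', '.')).';
lon = str2double(strrep({S.lon}, ',', '.')).';

D = [rowIdx lat lon];                   % columns: row in C, latitude, longitude
```

Here the first column of D says which line of C each latitude/longitude pair belongs to, so the second line contributes two rows that both carry the index 2.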

