How to extract locations from strings of addresses?
3 views (last 30 days)
Show older comments
Hi all,
I have more than a thousand of strings containing addresses. How can I programmatically extract the locations, like cities (e.g. Rome, London, etc) and countries (e.g. Italy, UK, etc)?
Thanks
Pietro
8 Comments
Akira Agata
on 13 Jan 2019
Edited: Akira Agata
on 13 Jan 2019
How abou the following approach?
- Split each line with ';' character into 1 or 2 segments
- Split each segment(s) with ',' mark into several fragments
- Extract the last two fragments
The following is my try:
str = [...
"Sustainable Industrial Systems, School of Chemical Engineering and Analytical Science, The University of Manchester, Manchester, United Kingdom; Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy";
"Department of Agricultural and Environmental Sciences - Production, Landscape, Agroenergy, Università degli Studi di Milano, Via G. Celoria 2, Milano, Italy; Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, Milano, Italy";
"Department of Land and Agriculture and Forestry Systems, University of Padua, Viale dell'Università 16, 35020 Legnaro (PD), Italy";
"Dipartimento di Agraria, University of Sassari, Sassari, Italy"];
for kk1 = 1:numel(str)
strSplit = strsplit(str(kk1),';');
strOut = repelem("",1,numel(strSplit));
for kk2 = 1:numel(strSplit)
strTmp = strsplit(strSplit(kk2),', ');
strOut(kk2) = strjoin(strTmp(end-1:end),', ');
end
disp(strjoin(strOut,'; '))
end
The output is as follows:
Manchester, United Kingdom; Milan, Italy
Milano, Italy; Milano, Italy
35020 Legnaro (PD), Italy
Sassari, Italy
Accepted Answer
Stephen23
on 13 Jan 2019
Edited: Stephen23
on 13 Jan 2019
Using one simple regular expression:
>> C = {...
'Sustainable Industrial Systems, School of Chemical Engineering and Analytical Science, The University of Manchester, Manchester, United Kingdom; Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy';
'Department of Agricultural and Environmental Sciences - Production, Landscape, Agroenergy, Università degli Studi di Milano, Via G. Celoria 2, Milano, Italy; Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, Milano, Italy';
'Department of Land and Agriculture and Forestry Systems, University of Padua, Viale dell''Università 16, 35020 Legnaro (PD), Italy';
'Dipartimento di Agraria, University of Sassari, Sassari, Italy'};
>> M = regexp(C,'[^,]+,[^,]+(?=;|$)','match');
>> M{:}
ans =
' Manchester, United Kingdom'
' Milan, Italy'
ans =
' Milano, Italy'
' Milano, Italy'
ans =
' 35020 Legnaro (PD), Italy'
ans =
' Sassari, Italy'
0 Comments
More Answers (0)
See Also
Categories
Find more on Data Import in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!