Ismember and ways to implement it

Hi, I have two string arrays, S1 and S2, in which every message carries an arrival time. I have to compare the messages in S1 with those in S2. Because both arrays are very big, comparing everything with ismember is not practical, so I have to do it another way: compare each message from S1 only with the messages from S2 that arrived at the same time or up to 1 minute before or after. For example, if a message from S1 arrived at 5 min 40 s, it should be compared with the messages from S2 that arrived between 4:40 and 6:40. The code that extracts the time from each message is:
N = size(AIS1,1)
p = size(AIS2,1)
TimeAIS1 = [];
TimeAIS2 = [];
for i=1:1:N
seq1=AIS1(i);
TimeAIS1 = [TimeAIS1,extractAfter(seq1,strlength(seq1)-4)];
DN = str2double(TimeAIS1);
dur1 = minutes(floor(DN/100)) + seconds(mod(DN,100));
end
for j=1:1:p
seq2=AIS2(j);
TimeAIS2 = [TimeAIS2,extractAfter(seq2,strlength(seq2)-4)];
DN2 = str2double(TimeAIS2);
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
end

7 Comments

Jan on 15 Sep 2021
Edited: Jan on 15 Sep 2021
What does "very big" mean in absolute values? Why is this a limitation for ismember?
"Imagine one message from S1 has arrived at 5 min 40 sec" - I have no idea what this means.
What does the shown code do? How is it connected to the problem you have? How do the inputs look like and what do you want to achieve?
Growing arrays iteratively is extremely expensive. Avoid this strictly. Your current code calls STR2DOUBLE on all values in each iteration. Do this only once:
TimeAIS1 = [];
for i=1:1:N
seq1=AIS1(i);
TimeAIS1 = [TimeAIS1,extractAfter(seq1,strlength(seq1)-4)];
DN = str2double(TimeAIS1); % Repeatedly for all former values?!
dur1 = minutes(floor(DN/100)) + seconds(mod(DN,100)); % Overwritten in each iteration?!
end
% Much better without a loop:
TimeAIS1 = extractAfter(AIS1, strlength(AIS1) - 4)
DN = str2double(TimeAIS1);
dur1 = minutes(floor(DN/100)) + seconds(mod(DN,100));
Well, that code gives the time at which every message from AIS1 and AIS2 was sent.
Then I have to compare the first message from AIS1 with the ones from AIS2 that fall within a range of one minute.
Would you recommend doing this comparison another way? I am stuck and do not know how to continue.
And "very big" means about 20000 lines in each string array.
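One way to check whether the size alone is a problem is to time ismember on synthetic data of about the same size. This is only a rough sketch: the random 28-character strings, the alphabet, and the variable names A and B are made up for the test and are not the real AIS data.

```matlab
% Synthetic stand-in for the real data (assumption: ~20000 short strings per set).
rng(0);                                   % reproducible random strings
n = 20000;
alphabet = ['0':'9', 'A':'Z', 'a':'z'];

% Each row of the char matrix becomes one string element.
A = string(alphabet(randi(numel(alphabet), n, 28)));

% B contains every other element of A plus n/2 unrelated random strings.
B = [A(1:2:end); string(alphabet(randi(numel(alphabet), n/2, 28)))];

tic
[tf, loc] = ismember(A, B);               % tf: logical match flags, loc: index into B (0 if none)
toc
```

If the elapsed time reported by toc is acceptable on your machine, the size is not the obstacle and only the time restriction needs extra work.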
It feels like you've jumped into the description of your problem in the middle.
You have some strings and somehow a time is incorporated into each string. Do they look like this?
secondsPerDay = seconds(hours(24));
dt = datetime('now') + seconds(randi(secondsPerDay, 10, 1));
s = string(dt)
s = 10×1 string array
    "16-Sep-2021 08:23:32"
    "16-Sep-2021 03:55:35"
    "16-Sep-2021 04:24:21"
    "16-Sep-2021 01:46:47"
    "16-Sep-2021 09:27:10"
    "15-Sep-2021 21:04:35"
    "16-Sep-2021 16:52:20"
    "16-Sep-2021 10:54:09"
    "16-Sep-2021 03:47:01"
    "16-Sep-2021 09:09:38"
If not, please post a small set of strings (no more than 10-20) that are representative of the strings with which you're working.
You have a second set of strings (same format?) that you want to compare with the first set (or vice versa.) Please show us a small set of those strings.
Now that we know the format of the data with which you're working, for those two small data sets describe what you want as output. Don't worry about how to implement it in MATLAB; explain your process for choosing what appears in the output. Is it just the timestamp information that's relevant for deciding whether something from a data set ends up in the output or are other parts of the string relevant?
Okay, let's see. The data I am working with looks like this:
AIS1:
"!AIVDM,2,1,3,B,54hG=R82FP2e`LQc:208E8<v1HuT4LE:2222220U1pI446b;070PDPiC3kPH,0*720000"
"!AIVDM,2,2,3,B,88888888880,2*240000"
"!AIVDO,1,1,,,B3EkBN03wk?8mP=18D3Q3wv5sP06,0*230000"
"!AIVDM,1,1,,B,13GPhM0P01P9rGNGast>4?wn2@S7,0*7D0000"
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,A,33iMjv5P00P9wKdGcEOv4?v02DU:,0*460000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,A,13MAj;P00<P<:hJGQecr`K820@0J,0*280000"
AIS2:
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
As you can see, the last four digits give the time at which each message arrived. They run from 0000 to 5959 (59 min 59 s). I have to compare both sets of strings, but since they are huge (18000 or 20000 messages in each set), I have to reduce the number of comparisons. My plan was to do it with a window of one minute: the first message from set 1, which ends with 0000, would be compared with all the messages from set 2 that end with 0000 to 0100, and likewise for every other message from set 1. Did I explain myself correctly?
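The windowed comparison described above could be sketched roughly as follows. Several things here are assumptions and not part of the original question: the window is taken as symmetric (plus or minus windowSec seconds), only the part of the message before the 4-digit MMSS suffix is compared, the wrap-around from 5959 to 0000 is ignored, and the suffixes 0045 and 0230 in AIS2 were changed by hand so that the window has a visible effect.

```matlab
% Sample messages from the thread; the last 4 characters are MMSS.
AIS1 = ["!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"; ...
        "!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"; ...
        "!AIVDO,1,1,,,B3EkBN03wk?8mP=18D3Q3wv5sP06,0*230000"];
% Hypothetical: the first two suffixes were edited to 0045 and 0230 for illustration.
AIS2 = ["!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0045"; ...
        "!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070230"; ...
        "!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"];
windowSec = 60;   % assumed window: +/- one minute

% Split every message into body (without suffix) and arrival time in seconds.
mmss1 = str2double(extractAfter(AIS1, strlength(AIS1) - 4));
mmss2 = str2double(extractAfter(AIS2, strlength(AIS2) - 4));
t1 = 60*floor(mmss1/100) + mod(mmss1, 100);
t2 = 60*floor(mmss2/100) + mod(mmss2, 100);
body1 = extractBefore(AIS1, strlength(AIS1) - 3);
body2 = extractBefore(AIS2, strlength(AIS2) - 3);

% Preallocate the ismember-style outputs.
tf  = false(numel(AIS1), 1);
loc = zeros(numel(AIS1), 1);

for k = 1:numel(AIS1)
    cand = find(abs(t2 - t1(k)) <= windowSec);   % AIS2 indices inside the time window
    hit  = cand(body2(cand) == body1(k));        % candidates whose body is identical
    if ~isempty(hit)
        tf(k)  = true;
        loc(k) = hit(1);                         % lowest matching index, like ismember
    end
end
```

With this data only the second message of AIS1 finds a partner: its body also appears in AIS2 45 seconds later, while the body of the first message appears in AIS2 only 150 seconds later, which is outside the window.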
So this message from your first set:
"!AIVDM,2,1,3,B,54hG=R82FP2e`LQc:208E8<v1HuT4LE:2222220U1pI446b;070PDPiC3kPH,0*720000"
would be compared with each of these messages from your second set:
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
In this case that first message is much longer than any of the messages from the second set, so there's no possibility of a match. Correct?
The second message in the first set also gets compared with the whole second set:
"!AIVDM,2,2,3,B,88888888880,2*240000"
By inspection, there's no match. A quick scan suggests the first message from the first set that matches is the 5th:
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
This matches the first message from the second set. So what do you want the output to look like? If you're emulating ismember you'd want the first four elements of the first output to be false and the 5th true? For the second output you'd want the first four elements to be 0 and the 5th 1?
Yes, that's it: the same as ismember, but with the restriction on the last 4 digits that I explained.
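For reference, the two outputs being discussed are the ones ismember itself returns. A minimal example on a few of the messages posted above, without any time restriction (the names A and B are just placeholders for this example):

```matlab
% Three messages from the first set and three from the second set, as posted above.
A = ["!AIVDO,1,1,,,B3EkBN03wk?8mP=18D3Q3wv5sP06,0*230000"; ...
     "!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"; ...
     "!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"];
B = ["!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"; ...
     "!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"; ...
     "!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"];

% tf(k) is true if A(k) occurs anywhere in B;
% loc(k) is the lowest index in B where it occurs, or 0 if it does not.
[tf, loc] = ismember(A, B)
```

A time-restricted version would return outputs of the same shape, just with matches outside the window reported as false and 0.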

Answers (0)

Asked: 15 Sep 2021

Commented: 15 Sep 2021
