Extract rows of matrices with nearest days record: MATLAB - matlab

I have two matrices, A and B, of different sizes. In both, the first three columns hold the year, month, and day, and the last column holds a value. I need to extract the rows of A whose date falls in the same year and month as a row of B and within +/-6 days of it, together with the matching rows from B. If two or more dates are that close, I should choose the rows with the highest value in each matrix.
A = 1954 1 16 2,3042
1954 12 5 2,116
1954 12 21 1,9841
1954 12 22 2,7411
1955 1 13 1,8766
1955 10 16 1,4003
1955 12 29 1,4979
1956 1 19 2,1439
1956 1 21 1,7666
1956 11 26 1,7367
1956 11 27 1,8914
1957 1 27 1,151
1957 2 2 1,1484
1957 12 29 1,1906
1957 12 30 1,3157
1958 1 10 1,6186
1958 1 20 1,1637
1958 2 6 1,1639
1958 10 16 1,1444
1959 1 3 1,7784
1959 1 24 1,1871
1959 2 20 1,2264
1959 10 25 1,2194
1960 6 29 1,2327
1960 12 4 1,7213
1960 12 5 1,373
1961 3 21 1,7149
1961 3 27 1,4404
1961 11 3 1,3934
1961 12 5 1,777
1962 2 12 2,1813
1962 2 16 3,5776
1962 2 17 1,9236
1963 9 27 1,6164
1963 10 13 1,786
1963 10 14 1,9203
1963 11 22 1,7575
1964 2 2 1,4402
1964 11 15 1,437
1964 11 17 1,7588
1964 12 4 1,6358
1965 2 13 1,874
1965 11 2 2,6468
1965 11 26 1,7163
1965 12 11 1,8283
1966 12 1 2,1165
1966 12 19 1,6672
1966 12 24 1,8173
1966 12 25 1,4923
1967 2 23 2,3002
1967 3 1 1,9614
1967 3 18 1,673
1967 11 12 1,724
1968 1 4 1,6355
1968 1 15 1,6567
1968 3 6 1,1587
1968 3 18 1,212
1969 9 29 1,5613
1969 10 1 1,5016
1969 11 20 1,9304
1969 11 29 1,9279
1970 10 3 1,9859
1970 10 28 1,4065
1970 11 4 1,4227
1970 11 9 1,7901
B = 1954 12 28 774
1954 12 29 734
1955 3 26 712
1955 3 27 648
1956 7 18 1030
1956 7 23 1090
1957 2 17 549
1957 2 28 549
1958 2 27 759
1958 2 28 798
1959 1 10 421
1959 1 24 419
1960 12 5 762
1960 12 8 829
1961 2 12 788
1961 2 13 776
1962 2 15 628
1962 4 9 628
1963 3 12 552
1963 3 13 552
1964 2 12 260
1964 2 13 253
1965 12 22 862
1965 12 23 891
1966 1 5 828
1966 12 27 802
1967 1 1 777
1967 1 2 787
1968 1 17 981
1968 1 18 932
1969 3 15 511
1969 3 16 546
1970 2 25 1030
1970 2 26 1030
The expected output is a new matrix C:
C = 1954 12 22 2,7411 1954 12 28 774
1959 1 3 1,7784 1959 1 10 421
1959 1 24 1,1871 1959 1 24 419
1960 12 4 1,7213 1960 12 8 829
1962 2 12 2,1813 1962 2 15 628
1966 12 24 1,8173 1966 12 27 802
1968 1 15 1,6567 1968 1 17 981
Any help how to code this?

I think the following should do what you want -
To deal with overlaps at year and month boundaries, it's useful to map the dates to a number of days since an epoch. The first function finds the earliest date in either dataset and formats it so that the 'daysact' function can interpret it.
function epoch_date_str = get_epoch_datestr(A,B)
% sortrows orders [year month day] rows chronologically, so row 1 is the earliest date
dates = sortrows([A(:,1:3); B(:,1:3)]);
epoch_str = int2str(dates(1,:));
% Replace each run of whitespace with '/' to get a 'yyyy/m/d' string
epoch_date_str = regexprep(epoch_str,'\s+','/');
end
This function then calculates the number of days from the epoch to each date in the dataset. It's basically just wrangling data into a format accepted by the daysact function.
function ndays = days_since_epoch(A, epoch_date_str)
ndays = zeros(size(A,1),1);
Astr = int2str(A(:,1:3));
for i=1:size(Astr,1)
ndays(i) = daysact(epoch_date_str, regexprep(Astr(i,:),'\s+','/'));
end
end
And now we can get on with the actual calculations. I was a bit confused by the fifth column in the 'A' matrix you presented. I assume that it is the score, but if not, the score column is configured by the A_MATRIX_SCORE_COL variable. Similarly, the 6-day window is configured by WINDOW_SIZE.
ep_str = get_epoch_datestr(A,B);
ndaysA = days_since_epoch(A, ep_str);
ndaysB = days_since_epoch(B, ep_str);
C = [];
WINDOW_SIZE = 6;
A_MATRIX_SCORE_COL = 5;
for i=1:size(B,1)
% Find dates within the date window
overlaps = find((ndaysA >= (ndaysB(i) - WINDOW_SIZE)) & (ndaysA <= (ndaysB(i) + WINDOW_SIZE)));
% If there are multiple matches, choose the highest and append to C
if ~isempty(overlaps)
[~, max_idx] = max(A(overlaps,A_MATRIX_SCORE_COL));
match_row = overlaps(max_idx);
C = [C; A(match_row,:) B(i,:)];
end
end
C = unique(C,'rows');
The output I get differs from yours:
C =
1954 12 22 2 7411 1954 12 28 774
1959 1 24 1 1871 1959 1 24 419
1960 12 4 1 7213 1960 12 5 762
1960 12 4 1 7213 1960 12 8 829
1962 2 16 3 5776 1962 2 15 628
1966 12 24 1 8173 1966 12 27 802
1968 1 15 1 6567 1968 1 17 981
1968 1 15 1 6567 1968 1 18 932
But your second row has a difference of 7 days, so I wouldn't expect it to be found. It can be included by increasing WINDOW_SIZE to 7.
As you can see, it's possible for a row in A to be included twice in C if it matches more than one date in B. This could be easily filtered from C if you want:
D = [];
for i = 1:size(C,1)
% Find rows of C with the same date from A. Due to the way C was built, there won't be duplicates from B.
dupes = find(C(:,1) == C(i,1) & C(:,2) == C(i,2) & C(:,3) == C(i,3));
% If there's only one match (i.e. it matches itself), then add to D
if (length(dupes) == 1)
D = [D; C(i,:)];
else
% If there are duplicates, then compare the scores from B and only add the highest score to D.
best = true;
for j=1:length(dupes)
if C(i,end) < C(dupes(j),end)
best = false;
end
end
if best
D = [D; C(i,:)];
end
end
end
The matrix 'D' is then your de-duplicated output.
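For readers without the Financial Toolbox (which provides daysact), the same window search can be sketched with datenum from base MATLAB. This is a minimal sketch rather than the code above: the small A and B are a hypothetical subset of the question's data with the value stored as a single decimal column, so the score is assumed to be column 4, and the same-year/same-month requirement is not enforced.

```matlab
% Hypothetical subset of the question's data: [year month day value]
A = [1954 12 21 1.9841
     1954 12 22 2.7411
     1959  1  3 1.7784
     1959  1 24 1.1871];
B = [1954 12 28 774
     1959  1 10 421
     1959  1 24 419];
WINDOW_SIZE = 6;

% datenum maps [y m d] to serial day numbers, so month and year
% boundaries need no special handling.
dA = datenum(A(:,1), A(:,2), A(:,3));
dB = datenum(B(:,1), B(:,2), B(:,3));

% gap(i,j) = number of days between A(i,:) and B(j,:)
gap = abs(bsxfun(@minus, dA, dB.'));

C = zeros(0, size(A,2) + size(B,2));
for j = 1:size(B,1)
    idx = find(gap(:,j) <= WINDOW_SIZE);
    if ~isempty(idx)
        % Several rows of A in the window: keep the one with the highest value
        [~, k] = max(A(idx,4));
        C(end+1,:) = [A(idx(k),:) B(j,:)];
    end
end
disp(C)
```

With this subset, 1959-01-10 in B finds no partner because its nearest date in A is 7 days away, which matches the observation above about the 7-day gap.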

Related

How does imfilter 'replicate' in matlab work for bigger filters?

I am reading this:
https://uk.mathworks.com/help/images/imfilter-boundary-padding-options.html
And I am trying to understand how it works for 5x5 or 7x7 kernels. Say a 5x5 kernel has an extra row and column on the top and right side compared to the kernel shown in the image in the link. What value will those take? Just the closest one it can find? And what about the diagonal values (the ones in the corners)?
From the documentation for the 'replicate' option in imfilter,
Input array values outside the bounds of the array are assumed to equal the nearest array border value.
You can actually see the exact array that imfilter uses by calling padarray with the proper arguments. Say we have a 5x5 array:
im = reshape(1:25, 5, 5)
im =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
We can pad this array by 2 on each side (the equivalent of using a 5x5 kernel):
padarray(im, [2 2], 'replicate')
ans =
1 1 1 6 11 16 21 21 21
1 1 1 6 11 16 21 21 21
1 1 1 6 11 16 21 21 21
2 2 2 7 12 17 22 22 22
3 3 3 8 13 18 23 23 23
4 4 4 9 14 19 24 24 24
5 5 5 10 15 20 25 25 25
5 5 5 10 15 20 25 25 25
5 5 5 10 15 20 25 25 25
Spacing out the rows/columns so you can see the original array more easily:
1  1     1  6 11 16 21    21 21
1  1     1  6 11 16 21    21 21

1  1     1  6 11 16 21    21 21
2  2     2  7 12 17 22    22 22
3  3     3  8 13 18 23    23 23
4  4     4  9 14 19 24    24 24
5  5     5 10 15 20 25    25 25

5  5     5 10 15 20 25    25 25
5  5     5 10 15 20 25    25 25
You can also verify this by creating a kernel with a single 1 value in one of the corners:
im = reshape(1:25, 5, 5)
im =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
k = zeros(5); k(1,1) = 1
k =
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
imfilter(im, k, 'replicate')
ans =
1 1 1 6 11
1 1 1 6 11
1 1 1 6 11
2 2 2 7 12
3 3 3 8 13
Naturally, this only shows the top-left 5x5 subarray of the 9x9 padded array, but by repeating the process with the 1 in different corners you can see the whole array.
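The "nearest border value" rule can also be reproduced without the Image Processing Toolbox by clamping indices, which makes the corner behaviour explicit. This is only a sketch of the padding step, not of imfilter itself; pad is a name introduced here for half the kernel size minus one half (2 for 5x5, 3 for 7x7).

```matlab
im = reshape(1:25, 5, 5);
pad = 2;                         % (kernel size - 1)/2, so 2 for a 5x5 kernel
[rows, cols] = size(im);

% Clamp every out-of-range index to the nearest valid row or column.
r = min(max((1-pad):(rows+pad), 1), rows);   % [1 1 1 2 3 4 5 5 5]
c = min(max((1-pad):(cols+pad), 1), cols);

padded = im(r, c);               % 9x9, same layout as the padarray result
disp(padded)
```

Because rows and columns are clamped independently, each corner block of the padded array is filled with the corresponding corner pixel of the image.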

Accumulate sliding blocks into a matrix

In MATLAB, we can use im2col and col2im to transform from columns to blocks and back, for example
>> A = floor(30*rand(4,6))
A =
8 5 2 13 15 11
22 11 27 13 24 24
5 18 23 9 23 15
20 23 14 15 19 10
>> B = im2col(A,[2 2],'distinct')
B =
8 5 2 23 15 23
22 20 27 14 24 19
5 18 13 9 11 15
11 23 13 15 24 10
>> col2im(B,[2 2],[4,6],'distinct')
ans =
8 5 2 13 15 11
22 11 27 13 24 24
5 18 23 9 23 15
20 23 14 15 19 10
My question is: after using im2col in sliding mode
>> B = im2col(A,[2 2],'sliding')
B =
8 22 5 5 11 18 2 27 23 13 13 9 15 24 23
22 5 20 11 18 23 27 23 14 13 9 15 24 23 19
5 11 18 2 27 23 13 13 9 15 24 23 11 24 15
11 18 23 27 23 14 13 9 15 24 23 19 24 15 10
I wish to get a 4-by-6 matrix C from B (without knowing A) in which each entry equals the original value multiplied by the number of times it was sampled.
In other words, C(1,1)=A(1,1), C(1,2)=A(1,2)*2, C(2,2)=A(2,2)*4.
This is easy to implement with a for-loop, but that is very slow. How can I vectorize it?
If I'm understanding correctly, your desired output is
C = [ 8 10 4 26 30 11
44 44 108 52 96 48
10 72 92 36 92 30
20 46 28 30 38 10 ]
which I got by computing C = A.*S where
S = [ 1 2 2 2 2 1
2 4 4 4 4 2
2 4 4 4 4 2
1 2 2 2 2 1 ]
The entries in S represent how many sliding blocks each entry is a member of.
I believe your question boils down to how to construct the matrix S.
Solution:
S = min(min(1:M,M:-1:1),x)'*min(min(1:N,N:-1:1),y)
C = A.*S
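where A is size M-by-N, and your sliding block is size x-by-y. This assumes x <= (M+1)/2 and y <= (N+1)/2; for larger blocks, replace x with min(x,M-x+1) and y with min(y,N-y+1).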
Explanation:
In the given example, M=4, N=6, x=2, and y=2.
Notice the solution S can be written as the outer product of two vectors:
S = [1;2;2;1] * [1,2,2,2,2,1]
We construct each of these two vectors using the values of M,N,x,y:
min(1:M,M:-1:1)' == min(1:4,4:-1:1)'
== min([1,2,3,4], [4,3,2,1])'
== [1,2,2,1]'
== [1;2;2;1]
In this case, the extra min(...,x) does nothing since all entries are already <=x.
min(1:N,N:-1:1) == min(1:6,6:-1:1)
== min([1,2,3,4,5,6],[6,5,4,3,2,1])
== [1,2,3,3,2,1]
This time the extra min(...,y) does matter.
min(min(1:N,N:-1:1),y) == min([1,2,3,3,2,1],y)
== min([1,2,3,3,2,1],2)
== [1,2,2,2,2,1]
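Since the question asks for C from B alone, here is a hedged sketch that accumulates the columns of B back into place with accumarray, using only base MATLAB. The index matrix idx is built by hand to mimic the column order of im2col(...,'sliding'); within and starts are names introduced for this example, and A appears only to generate B and to check the result.

```matlab
M = 4; N = 6; x = 2; y = 2;
A = [ 8  5  2 13 15 11
     22 11 27 13 24 24
      5 18 23  9 23 15
     20 23 14 15 19 10];

% Linear-index offsets of the x-by-y entries inside one block (column-major)
within = bsxfun(@plus, (0:x-1).', (0:y-1)*M);
% Linear index of the top-left entry of every sliding block
starts = bsxfun(@plus, (1:M-x+1).', (0:N-y)*M);
% idx(:,k) holds the linear indices covered by the k-th sliding block
idx = bsxfun(@plus, within(:), starts(:).');

B = A(idx);          % same layout as im2col(A,[x y],'sliding')

% Sum every sample of B back into the position it came from
C = reshape(accumarray(idx(:), B(:), [M*N 1]), M, N);

% Cross-check: block membership counts via a full 2-D convolution
S = conv2(ones(M-x+1, N-y+1), ones(x, y));
disp(isequal(C, A .* S))
```

The conv2 line is an alternative way to get S that also holds for large blocks, because it simply stamps one x-by-y patch of ones per block position.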

MATLAB: How to make 2D binary mask from mesh?

I'm using MESH2D in Matlab in order to mesh ROI (Region Of Interest) from images. Now I would like to make binary masks from these triangular meshes. The outputs from [p,t] = mesh2d(node) are:
p = Nx2 array of nodal XY co-ordinates.
t = Mx3 array of triangles as indices into p, defined with a counter-clockwise node ordering.
Example of an initial code (feel free to improve it!):
mask= logical([0 0 0 0 0; 0 1 1 0 0; 0 1 1 1 1; 0 1 1 0 0]) %let's say this is my ROI
figure, imagesc(mask)
lol=regionprops(mask,'all')
[p,t] = mesh2d(lol.ConvexHull); %it should mesh the ROI
How to make masks from this triangular mesh?
Thank you in advance!
This is p:
1,50000000000000 2
1,50000000000000 2,50000000000000
1,50000000000000 3
1,50000000000000 3,50000000000000
1,50000000000000 4
1,93703949778653 2,56171771423604
1,96936200278303 3,98632617574682
2 1,50000000000000
2 4,50000000000000
2,00975325040940 3,53647067507122
2,01137717786904 2,05700769275495
2,05400996239344 3,03376821385856
2,41193753423879 2,49774899749798
2,45957145752038 3,46313210038859
2,50000000000000 1,50000000000000
2,50000000000000 4,50000000000000
2,51246316199066 3,99053096338726
2,56500321259084 1,97186739050944
2,64423955240966 2,98576823004855
3 1,50000000000000
3 4,50000000000000
3,00248771086621 2,47385860181019
3,01650848812758 3,52665319517610
3,08981230082503 3,98949609178151
3,12731558449295 2,02370031640169
3,36937385842331 2,99811446160210
3,50000000000000 1,75000000000000
3,50000000000000 4,25000000000000
3,85193739480358 3,46578962137238
3,85353024582881 2,53499308989903
4 2
4 4
4,42246720814684 3,00037409439956
4,50000000000000 2,25000000000000
4,50000000000000 3,75000000000000
4,97304775909580 2,99999314296989
5 2,50000000000000
5 3,50000000000000
5,50000000000000 3
and t:
9 5 7
20 18 15
1 8 11
8 15 11
11 15 18
11 2 1
6 2 11
20 27 25
25 18 20
27 30 25
17 10 14
7 10 17
24 21 17
9 7 17
29 35 32
26 30 29
23 19 26
14 19 23
26 29 23
23 29 24
23 17 14
24 17 23
6 11 13
13 11 18
34 30 31
31 30 27
3 2 6
12 19 14
14 10 12
6 13 12
12 13 19
12 3 6
28 21 24
28 29 32
24 29 28
9 17 16
16 17 21
38 35 33
35 29 33
33 29 30
34 37 33
33 30 34
19 13 22
26 19 22
18 25 22
22 13 18
22 30 26
22 25 30
4 7 5
4 10 7
4 12 10
3 12 4
38 33 36
36 33 37
39 38 36
36 37 39
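To get the mask for the ix-th triangle, use:
poly2mask(p(t(ix,:),1),p(t(ix,:),2),height,width)
Here t(ix,:) indexes p to get the vertex coordinates of one triangle, and height and width are the number of rows and columns of the image (poly2mask takes the row count first).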
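To combine all triangles into one mask without the Image Processing Toolbox, a rough alternative is to test pixel centres with inpolygon from base MATLAB. This is a sketch under assumptions: the two-triangle mesh below is hypothetical (not the p and t listed above), and inpolygon's inside-or-on-boundary rule is not guaranteed to match poly2mask pixel for pixel.

```matlab
% Hypothetical mesh: a rectangle split into two triangles
p = [0.5 0.5; 5.5 0.5; 5.5 4.5; 0.5 4.5];   % [x y] node coordinates
t = [1 2 3; 1 3 4];                          % triangles as rows of node indices
height = 5; width = 6;                       % image size in rows and columns

[X, Y] = meshgrid(1:width, 1:height);        % pixel-centre coordinates
mask = false(height, width);
for ix = 1:size(t,1)
    % Mark every pixel centre that lies inside the ix-th triangle
    mask = mask | inpolygon(X, Y, p(t(ix,:),1), p(t(ix,:),2));
end
disp(mask)
```

Keeping the per-triangle result of inpolygon instead of OR-ing it into mask gives one mask per triangle, which is the analogue of the poly2mask call above.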

How to Store Matrix/Vector and values in Matlab

I am trying to store vectors. When I run the program in the loop I see all the values, but when referred outside the loop only the last vector is evaluated and stored (the one that ends with prime number 953, see below). Any calculations done with the PVX vector are done only with the last entry. I want PVX to do calculations with all the results not just the last entry. How can I store these results to do calculations with?
This is the code:
PV=[2 3 5 7 11 13 17 19 23 29];
for numba=2:n
if mod(numba,PV)~=0;
xp=numba;
PVX=[2 3 5 7 11 13 17 19 23 29 xp]
end
end
The first few results looks like this:
PVX: Prime Vectors (Result)
PVX =
2 3 5 7 11 13 17 19 23 29 31
PVX =
2 3 5 7 11 13 17 19 23 29 37
PVX =
2 3 5 7 11 13 17 19 23 29 41
PVX =
2 3 5 7 11 13 17 19 23 29 43
PVX = ...........................................................
PVX =
2 3 5 7 11 13 17 19 23 29 953
If you want to store all PVX values, use a different row for each:
PV = [2 3 5 7 11 13 17 19 23 29];
PVX = [];
for numba=2:n
if mod(numba,PV)~=0;
xp = numba;
PVX = [PVX; 2 3 5 7 11 13 17 19 23 29 xp];
end
end
Of course, it would be better to initialize the PVX matrix to the appropriate size, but the number of rows is hard to predict.
Alternatively, build the PVX without loops:
xp = setdiff(primes(n), primes(29)).'; % all primes > 29 and <= n
PVX = [ repmat([2 3 5 7 11 13 17 19 23 29], numel(xp), 1) xp ];
As an example, for n=100, either of the above approaches gives
PVX =
2 3 5 7 11 13 17 19 23 29 31
2 3 5 7 11 13 17 19 23 29 37
2 3 5 7 11 13 17 19 23 29 41
2 3 5 7 11 13 17 19 23 29 43
2 3 5 7 11 13 17 19 23 29 47
2 3 5 7 11 13 17 19 23 29 53
2 3 5 7 11 13 17 19 23 29 59
2 3 5 7 11 13 17 19 23 29 61
2 3 5 7 11 13 17 19 23 29 67
2 3 5 7 11 13 17 19 23 29 71
2 3 5 7 11 13 17 19 23 29 73
2 3 5 7 11 13 17 19 23 29 79
2 3 5 7 11 13 17 19 23 29 83
2 3 5 7 11 13 17 19 23 29 89
2 3 5 7 11 13 17 19 23 29 97
I'm assuming you were going for this:
PVX=[2 3 5 7 11 13 17 19 23 29];
for numba=2:n
if mod(numba,PVX)~=0;
xp=numba;
PVX(end+1) = xp;
% Or alternatively PVX = [PVX, xp];
end
end
but if you could get an estimate of how large PVX will be in the end, you should pre-allocate the array first for a significant speed up.
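One way to pre-allocate when the final length is unknown is to reserve a safe upper bound and trim afterwards. The sketch below assumes n = 100 purely for illustration and uses n itself as the bound, since there can never be more than n primes up to n; count and seed are names introduced here.

```matlab
n = 100;                               % hypothetical upper limit
seed = [2 3 5 7 11 13 17 19 23 29];

PVX = zeros(1, n);                     % upper bound on the final length
PVX(1:numel(seed)) = seed;
count = numel(seed);                   % number of valid entries so far

for numba = 2:n
    % Keep numba only if none of the stored values divides it
    if all(mod(numba, PVX(1:count)) ~= 0)
        count = count + 1;
        PVX(count) = numba;
    end
end

PVX = PVX(1:count);                    % drop the unused tail
disp(PVX)
```

The array is written in place and never grows inside the loop, at the cost of some unused memory until the final trim.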
So it looks like you need all primes up to n.
As Dan said, use this:
PVX=[2 3 5 7 11 13 17 19 23 29 ];
for numba=2:n
if mod(numba,PVX)~=0
xp=numba;
PVX=[ PVX xp];
end
end
Or why not simply use the primes function?
PVX = primes( n ) ;

In matlab, how to calculate elapsed time between rows in a matrix

I have a 21128x9 matrix in the following format:
x = ['Participant No.' 'yyyy' 'mm' 'dd' 'HH' 'MM' 'SS' 'question No.' 'response']
e.g.
x =
Columns 1 through 5
18 2011 10 26 15
18 2011 10 26 15
18 2011 10 26 15
18 2011 10 26 15
18 2011 10 26 15
19 2011 10 31 13
19 2011 10 31 13
19 2011 10 31 13
19 2011 10 31 13
19 2011 10 31 13
Columns 6 through 9
42 33 27 4
42 39 17 2
42 45 52 2
42 47 45 3
42 50 12 3
6 5 36 1
6 20 27 4
6 22 34 5
6 33 43 3
6 42 42 1
where columns 2-7 are date vectors.
The data are sorted by date/time.
I'd like to calculate the time taken to answer each question for each participant - i.e. the time elapsed between row 1 and 2, 2 and 3, 3 and 4, 4 and 5, and then 6 and 7, 7 and 8 etc. - to end up with a matrix, sorted by participant number, where I can then work out the mean time taken per question.
I've tried using the etime function, but to no avail.
EDIT: With regards to etime, just to see if it would work in practice, I tried to write:
etime(x(2,5:7),x(1,5:7))
to compare just columns 5-7 of rows 1 and 2, but I keep getting back:
??? Index exceeds matrix dimensions.
Error in ==> etime at 41
t = 86400*(datenummx(t1(:,1:3)) - datenummx(t0(:,1:3))) + ...
You were almost there! You needed to change the 5s to 2s, that's all:
etime(x(2,2:7),x(1,2:7))
Now, to get them all, let's make two matrices of the date vectors, one row out of sync with each other.
First set up x:
x =[ 18 2011 10 26 15 42 33 27 4
18 2011 10 26 15 42 39 17 2
18 2011 10 26 15 42 45 52 2
18 2011 10 26 15 42 47 45 3
18 2011 10 26 15 42 50 12 3
19 2011 10 31 13 6 5 36 1
19 2011 10 31 13 6 20 27 4
19 2011 10 31 13 6 22 34 5
19 2011 10 31 13 6 33 43 3
19 2011 10 31 13 6 42 42 1]
Now extract the times:
Tn = x(1:end-1, 2:7);
Tnplus1 = x(2:end, 2:7);
And now get a vector of the differences in seconds between consecutive rows:
etime(Tnplus1, Tn)
Which results in:
ans =
6
6
2
3
422595
15
2
11
9
Also, if you don't care about the year, month, and day data, just set them to zero, i.e.
Tn(:, 1:3) = 0;
Tnplus1(:, 1:3) = 0;
etime(Tnplus1, Tn)
ans =
6
6
2
3
-9405
15
2
11
9
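The differences above still include the jump from one participant to the next (the 422595 entry). A possible way to get the mean time per participant is to keep only differences between rows with the same participant number and group them with accumarray. This is a sketch, not part of the answer above; same, ids, and meanTime are names introduced here.

```matlab
x = [18 2011 10 26 15 42 33 27 4
     18 2011 10 26 15 42 39 17 2
     18 2011 10 26 15 42 45 52 2
     18 2011 10 26 15 42 47 45 3
     18 2011 10 26 15 42 50 12 3
     19 2011 10 31 13  6  5 36 1
     19 2011 10 31 13  6 20 27 4
     19 2011 10 31 13  6 22 34 5
     19 2011 10 31 13  6 33 43 3
     19 2011 10 31 13  6 42 42 1];

dt = etime(x(2:end,2:7), x(1:end-1,2:7));   % seconds between consecutive rows
same = diff(x(:,1)) == 0;                   % true where both rows share a participant
ids = x(2:end,1);

% Group the valid differences by participant and average them
[participants, ~, g] = unique(ids(same));
meanTime = accumarray(g, dt(same), [], @mean);
disp([participants meanTime])
```

For the sample rows this averages 6, 6, 2, 3 for participant 18 and 15, 2, 11, 9 for participant 19.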
Here are some simple steps:
Calculate the difference between the two rows that you want to compare
Multiply with a vector that contains the number of seconds per unit
Small scale example:
% Hours Mins Secs:
difference = ([23 12 4] - [23 11 59]);
secvec = difference .* [3600 60 1];
secdiff = sum(secvec)
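The same two steps can be applied to every consecutive pair of rows at once with diff and a matrix product. The sketch below uses the hour, minute, and second columns of the first participant from the question; hms is a name introduced here, and the approach ignores date changes, so it is only valid when all rows fall on the same day.

```matlab
% Hours Mins Secs for five consecutive answers on the same day
hms = [15 42 33
       15 42 39
       15 42 45
       15 42 47
       15 42 50];

% diff gives the row-to-row differences; the product with the
% seconds-per-unit column vector multiplies and sums in one step
secdiff = diff(hms) * [3600; 60; 1];
disp(secdiff)
```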