Convert hourly data to daily data in Matlab

I have two matrices, one named "Date" and the other named "Data".
The Date matrix has the following columns (the fourth, "julusi", is the day of the year):
year month day julusi hour
1951 1 1 1 0
1951 1 1 1 3
1951 1 1 1 6
1951 1 1 1 9
1951 1 1 1 12
1951 1 1 1 15
1951 1 1 1 18
1951 1 1 1 21
1951 1 2 2 0
1951 1 2 2 3
1951 1 2 2 6
1951 1 2 2 9
1951 1 2 2 12
1951 1 2 2 15
1951 1 2 2 18
1951 1 2 2 21
.... . . . .
.... . . . .
1951 12 30 364 0
1951 12 30 364 3
1951 12 30 364 6
1951 12 30 364 9
1951 12 30 364 12
1951 12 30 364 15
1951 12 30 364 18
1951 12 30 364 21
1951 12 31 365 0
1951 12 31 365 3
1951 12 31 365 6
1951 12 31 365 9
1951 12 31 365 12
1951 12 31 365 15
1951 12 31 365 18
1951 12 31 365 21
.... .. . .. .
2018 12 31 365 0
2018 12 31 365 3
2018 12 31 365 6
2018 12 31 365 9
2018 12 31 365 12
2018 12 31 365 15
2018 12 31 365 18
2018 12 31 365 21
My Data matrix has 410 columns (198696*410). The Date matrix has the same number of rows (198696). I want to convert the Data matrix to daily data based on the Date matrix.
I use the following code:
N=0;
for year=1951:2018
    for Juliusi=1:365
        cxa=(Date(:,4)==Juliusi);
        cxb=(Date(:,1)==year);
        a=cxa & cxb;
        N=N+1;
        dayy(N,:)=nanmean(Data(a,:));
    end
end
The converted values are correct, but the result has the wrong number of rows.
198696/8 = 24837 rows are expected, but my matrix has only 24820.
Where is the problem?
How do I account for leap days?

Since I recently learned from Luis Mendo that convolution is the key to success, I came up with the following idea. If your data is complete, i.e. you can guarantee that there are always 8 entries for each day, you can simply use the following approach:
% Some test data.
Date = [
1951 1 1 1 0;
1951 1 1 1 3;
1951 1 1 1 6;
1951 1 1 1 9;
1951 1 1 1 12;
1951 1 1 1 15;
1951 1 1 1 18;
1951 1 1 1 21;
1952 1 2 2 0;
1952 1 2 2 3;
1952 1 2 2 6;
1952 1 2 2 9;
1952 1 2 2 12;
1952 1 2 2 15;
1952 1 2 2 18;
1952 1 2 2 21]
% Temporary result for convolution.
temp = conv2(Date, ones(8, 1)) / 8;
% Extract values of interest.
dayy = temp(8:8:end, :)
Output:
Date =
1951 1 1 1 0
1951 1 1 1 3
1951 1 1 1 6
1951 1 1 1 9
1951 1 1 1 12
1951 1 1 1 15
1951 1 1 1 18
1951 1 1 1 21
1952 1 2 2 0
1952 1 2 2 3
1952 1 2 2 6
1952 1 2 2 9
1952 1 2 2 12
1952 1 2 2 15
1952 1 2 2 18
1952 1 2 2 21
dayy =
1951.0000 1.0000 1.0000 1.0000 10.5000
1952.0000 1.0000 2.0000 2.0000 10.5000
If you need the year and day information, it can be obtained separately, but in your original post this information did not seem to be needed.
Just to be clear: I know that I used the Date matrix in my example. Since Date has the same layout as Data, and the result of the mean operation is easy to verify on it, I used it as the test input.
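A note on the leap-day part of the question: 68 years times 365 days is 24820, and the 17 leap years between 1951 and 2018 account for the 17 missing rows, because the loop never visits day 366. The convolution approach also differs from nanmean in that a NaN in any of the eight records makes that daily mean NaN. A minimal sketch that avoids hard-coding the number of days is to group rows by the (year, day-of-year) pairs that actually occur. The small Date and Data matrices below are made up for illustration (two records per day for brevity), and mean(..., 'omitnan') assumes R2015a or later; nanmean would do the same job.

```matlab
% Made-up sample: days 365 and 366 of a leap year, then day 1 of the next year.
Date = [1952 12 30 365  0;
        1952 12 30 365 12;
        1952 12 31 366  0;
        1952 12 31 366 12;
        1953  1  1   1  0;
        1953  1  1   1 12];
Data = [1 10; 3 NaN; 5 30; 7 50; 9 70; 11 90];

% One group per distinct (year, day-of-year) pair, in order of appearance.
[days, ~, g] = unique(Date(:, [1 4]), 'rows', 'stable');

% Average each column within its group, ignoring NaN values.
dayy = zeros(size(days, 1), size(Data, 2));
for c = 1:size(Data, 2)
    dayy(:, c) = accumarray(g, Data(:, c), [], @(v) mean(v, 'omitnan'));
end
```

Here days holds the year and day of year for each row of dayy, so day 366 is kept whenever it occurs. If every day from 1951 to 2018 is present in the full data set, dayy should end up with 24837 rows.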

Related

My for loop won't output anything (Matlab)

I'm trying to get this for loop to work in Matlab so I can plot these three histograms. I suspect it produces no output because Matlab warns that variables such as a_M_S1 change size on every loop iteration, which makes the process inefficient. Any help? The code is below.
I'm basically trying to generate 500 samples of 100 readings so I can then plot a histogram using estimated parameter values.
clear
clc
% Importing Data
%a = 0.9575
for m=1:500
seed=m;
rng(seed);
syms x
F=((1/atanh(0.9575))*((0.9575^(2*x-1))/(2*x-1)));
for n=1:100
data_1(n)=ceil(vpasolve(F==rand(1)));
end
Data_1(m,:)=data_1;
end
clear
clc
Data_1=[49 1 3 17 13 3 5 51 7 1
9 3 67 1 3 1 1 1 1 99
5 13 21 17 41 1 1 9 23 1
1 5 1 1 41 1 13 1 5 27
5 37 99 1 1 33 1 1 9 1
1 3 47 11 7 1 1 41 21 27
5 1 1 11 45 7 3 5 1 17
13 5 3 3 1 99 1 59 1 13
3 5 1 35 1 1 1 1 5 19
5 1 1 1 79 3 1 1 1 1
31 3 1 1 1 21 69 39 1 29
3 3 1 1 5 1 3 1 1 15
1 1 9 1 7 1 1 1 1 11
27 9 1 3 39 5 1 5 7 1
1 1 7 5 1 1 3 1 3 23
5 1 21 1 1 7 1 17 1 3
11 11 5 1 9 1 1 1 1 37
33 1 9 7 1 1 31 27 1 1
5 5 1 17 3 31 1 45 37 1
1 1 19 47 9 7 5 1 9 1
11 1 61 5 29 1 95 1 1 1
13 19 1 1 13 1 23 7 73 1
1 1 11 1 5 1 3 1 7 1
15 1 9 53 3 7 3 21 7 3
1 7 1 1 23 7 5 1 3 1
1 7 1 3 1 1 1 7 3 5
1 1 1 43 7 3 1 1 21 5
1 39 1 5 13 3 1 5 1 3
1 11 1 1 29 17 25 1 9 1
17 9 13 11 1 5 29 3 3 1
65 5 63 1 1 3 5 1 7 1
21 3 7 1 1 1 27 11 15 3
1 1 1 1 21 1 5 3 1 11
5 1 3 7 1 5 43 5 7 75
29 7 83 1 3 5 15 1 1 3
1 1 9 1 13 1 17 23 1 5
99 1 1 1 5 7 9 3 7 1
1 11 1 11 21 1 5 9 5 1
33 49 3 9 15 1 1 5 1 1
1 17 1 1 1 1 13 1 1 9
5 13 1 1 5 3 1 1 67 1
5 1 1 1 7 27 1 21 47 1
1 1 1 21 3 17 1 5 5 1
1 1 17 29 99 1 9 1 5 15
17 5 1 13 1 1 1 1 1 21
1 21 1 1 1 11 9 35 31 15
99 15 1 1 9 3 1 21 1 1
1 1 9 33 1 1 31 9 29 47
41 99 1 7 17 5 9 3 3 13
1 29 9 5 11 1 1 7 37 15];
Data_2=[1 1 3 3 5 7 1 3 1 1
1 1 1 1 1 1 1 1 1 13
5 1 5 1 1 1 1 3 1 1
1 1 3 1 1 1 1 3 1 1
1 1 13 5 1 3 1 1 5 1
3 3 1 7 3 5 3 1 3 1
1 1 1 1 1 3 3 5 1 1
1 1 1 9 1 1 1 1 5 1
1 1 1 1 1 11 7 1 5 1
17 1 1 7 3 7 3 5 5 1];
for o=1:500
syms a
%Method of Moments (MM)
mean_S1 = mean(transpose(Data_1(o,:)));
a_MM_S1(o) = vpa(vpasolve((a)/((atanh(a))*(1-a.^2)) == mean_S1,a),4);
mean_S2 = mean(transpose(Data_2(o,:)));
a_MM_S2(o) = vpa(vpasolve((a)/((atanh(a))*(1-a.^2)) == mean_S2,a),4);
%Using Lower Quantile (OS)
lower_S1 = floor(quantile(Data_1(o,:),0.25));
a_LQ_S1(o) = vpa(vpasolve((a)/(atanh(a)) == 0.25,a),4);
lower_S2 = floor(quantile(Data_2(o,:),0.25));
a_LQ_S2(o) = vpa(vpasolve((a)/(atanh(a)) == 0.25,a),4);
%Using Median (OSM)
median_S1 = floor(quantile(Data_1(o,:),0.5));
a_M_S1(o) = vpa(vpasolve((a)/(atanh(a)) == 0.5,a),4);
median_S2 = floor(quantile(Data_2(o,:),0.5));
a_M_S2(o) = vpa(vpasolve((a)/(atanh(a)) == 0.5,a),4);
end
a_MM_S1=transpose(a_MM_S1);
a_LQ_S1=transpose(a_LQ_S1);
a_M_S1=transpose(a_M_S1);
a_MM_S2=transpose(a_MM_S2);
a_LQ_S2=transpose(a_LQ_S2);
a_M_S2=transpose(a_M_S2);
figure(1)
histogram([double(a_MM_S1),double(a_MM_S2)],20),title('Method of Moments')
figure(2)
histogram([double(a_LQ_S1),double(a_LQ_S2)],20),title('Using Lower Quartile as Estimator')
figure(3)
histogram([double(a_M_S1),double(a_M_S2)],20),title('Using Median as Estimator')

How to remove all the rows from a matrix that match values in another vector?

I am building an exclude vector so that every row of the matrix user whose second-column value appears in exclude is removed. How do I do that efficiently, without a for loop that goes through user once for each item in exclude?
My code below does not work:
count=0;
% Just showing how I am constructing `exclude`, to show that it can be long.
% So, manually removing each item from `exclude` is not an option.
% And using a for loop to iterate through each element in `exclude` can be inefficient.
for b=1:size(user_cat,1)
if user_cat(b,4)==0
count=count+1;
exclude(count,1) = user_cat(b,1);
end
end
% This is the important line of focus. You can ignore the previous parts.
user = user(user(:,2)~=exclude(:),:);
The last line gives the following error:
Error using ~=
Matrix dimensions must agree.
So, I am having to use this instead:
for b=1:size(exclude,1)
user = user(user(:,2)~=exclude(b,1),:);
end
Example:
user=[1433100000.00000 26 620260 7 1433100000000.00 0 0 2 1 100880 290 23
1433100000.00000 26 620260 7 1433100000000.00 0 0 2 1 100880 290 23
1433100000.00000 25 620160 7 1433100000000.00 0 0 2 1 100880 7274 22
1433100000.00000 21 619910 7 1433100000000.00 24.1190000000000 120.670000000000 2 0 100880 53871 21
1433100000.00000 19 620040 7 1433100000000.00 24.1190000000000 120.670000000000 2 0 100880 22466 21
1433100000.00000 28 619030 7 1433100000000.00 24.6200000000000 120.810000000000 2 0 100880 179960 16
1433100000.00000 28 619630 7 1433100000000.00 24.6200000000000 120.810000000000 2 0 100880 88510 16
1433100000.00000 28 619790 7 1433100000000.00 24.6200000000000 120.810000000000 2 0 100880 12696 16
1433100000.00000 7 36582000 7 1433100000000.00 0 0 2 0 100880 33677 14
1433000000.00000 24 620010 7 1433000000000.00 0 0 2 1 100880 3465 14
1433000000.00000 4 36581000 7 1433000000000.00 0 0 2 0 100880 27809 12
1433000000.00000 20 619960 7 1433000000000.00 0 0 2 1 100880 860 11
1433000000.00000 30 619760 7 1433000000000.00 25.0060000000000 121.510000000000 2 0 100880 34706 10
1433000000.00000 33 619910 7 1433000000000.00 0 0 2 0 100880 15060 9
1433000000.00000 26 619740 6 1433000000000.00 0 0 2 0 100880 52514 8
1433000000.00000 18 619900 6 1433000000000.00 0 0 2 0 100880 21696 8
1433000000.00000 16 619850 6 1433000000000.00 24.9910000000000 121.470000000000 2 0 100880 10505 1
1433000000.00000 16 619880 6 1433000000000.00 24.9910000000000 121.470000000000 2 0 100880 1153 1
1433000000.00000 28 619120 6 1433000000000.00 0 0 2 0 100880 103980 24
1433000000.00000 21 619870 6 1433000000000.00 0 0 2 0 100880 1442 24];
exclude=[ 3
4
7
10
17
18
19
28
30
33 ];
Desired output:
1433100000.00000 26 620260 7 1433100000000.00 0 0 2 1 100880 290 23
1433100000.00000 26 620260 7 1433100000000.00 0 0 2 1 100880 290 23
1433100000.00000 25 620160 7 1433100000000.00 0 0 2 1 100880 7274 22
1433100000.00000 21 619910 7 1433100000000.00 24.1190000000000 120.670000000000 2 0 100880 53871 21
1433000000.00000 24 620010 7 1433000000000.00 0 0 2 1 100880 3465 14
1433000000.00000 20 619960 7 1433000000000.00 0 0 2 1 100880 860 11
1433000000.00000 26 619740 6 1433000000000.00 0 0 2 0 100880 52514 8
1433000000.00000 16 619850 6 1433000000000.00 24.9910000000000 121.470000000000 2 0 100880 10505 1
1433000000.00000 16 619880 6 1433000000000.00 24.9910000000000 121.470000000000 2 0 100880 1153 1
1433000000.00000 21 619870 6 1433000000000.00 0 0 2 0 100880 1442 24
Use ismember to get a logical index marking the rows of user whose second-column value appears in exclude; these are the rows to be removed. Negate that index to mark the rows to keep, and use logical indexing to keep only those rows.
user = user(~ismember(user(:,2),exclude),:);
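A small self-contained check of this idea, using a made-up three-column user matrix rather than the one from the question. The line that builds exclude is an assumption about how the loop in the question could be replaced (first column of user_cat wherever the fourth column is 0); the user_cat values are also made up.

```matlab
% Made-up data: the second column of user is tested against exclude.
user = [101 26 1;
        102 19 2;
        103 28 3;
        104 21 4];
user_cat = [19 0 0 0;
            21 0 0 1;
            28 0 0 0];

% Loop-free equivalent of the loop that fills exclude in the question.
exclude = user_cat(user_cat(:, 4) == 0, 1);

% Keep only the rows whose second-column value is not in exclude.
user = user(~ismember(user(:, 2), exclude), :);
```

With this input, exclude is [19; 28], and user keeps only the rows with 26 and 21 in the second column.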

How can I select rows with specific column values from a matrix?

I have a matrix train3.
1 2 3 4 5 6 7
2 12 13 14 15 16 17
3 62 53 44 35 26 17
4 52 13 24 15 26 37
I want to select only those rows whose first column contains specific values (in my case 1 and 2).
I have tried the following:
>> train3
train3 =
1 2 3 4 5 6 7
2 12 13 14 15 16 17
3 62 53 44 35 26 17
4 52 13 24 15 26 37
>> ind1 = train3(:,1) == 1
ind1 =
1
0
0
0
>> ind2 = train3(:,1) == 2
ind2 =
0
1
0
0
>> mat1 = train3(ind1, :)
mat1 =
1 2 3 4 5 6 7
>> mat2 = train3(ind2, :)
mat2 =
2 12 13 14 15 16 17
>> mat3 = [mat1 ; mat2]
mat3 =
1 2 3 4 5 6 7
2 12 13 14 15 16 17
>>
Is there any better way to do this?
Presumably you are trying to get mat3 in a single step, which you can do with:
mat3 = train3(train3(:,1)==1 | train3(:,1)==2,:)
A more general way to do this would be to use ismember to get all of the rows that match the values in a list:
train3 =[
1 2 3 4 5 6 7
2 12 13 14 15 16 17
3 62 53 44 35 26 17
4 52 13 24 15 26 37];
chooseList = [1 2];
colIndex = ismember(train3(:, 1), chooseList);
subset = train3(colIndex, :);
subset =
1 2 3 4 5 6 7
2 12 13 14 15 16 17

Change orientation of buffer function

I need a function that splits a vector into smaller frames with an overlap, like buffer, but row-wise instead of column-wise.
This is how buffer works:
x = 1:20
x = buffer(x, 10, 5);
x =
0 1 6 11
0 2 7 12
0 3 8 13
0 4 9 14
0 5 10 15
1 6 11 16
2 7 12 17
3 8 13 18
4 9 14 19
5 10 15 20
What I want would be this though:
x =
0 0 1 2
1 2 3 4
3 4 5 6
5 6 7 8
7 8 9 10
9 10 11 12
11 12 13 14
13 14 15 16
15 16 17 18
17 18 19 20
Is there any function or way to achieve that? Maybe combination of buffer + some rearranging?
First figure out the answer in columns, then transpose the resulting matrix:
buffer(x, 4, 2).'
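buffer is part of the Signal Processing Toolbox. For readers without it, the following is a rough sketch that rebuilds the same row-wise frames for this example with plain indexing. It is not how buffer is implemented; the variable names are made up, and the zero padding at the start mirrors what the output above shows.

```matlab
x = 1:20;
n = 4;                  % frame length
p = 2;                  % overlap between consecutive frames
step = n - p;           % how far each frame advances

xp = [zeros(1, p), x];                          % pad the start with p zeros
numFrames = ceil(numel(x) / step);
xp(end+1 : (numFrames-1)*step + n) = 0;         % zero-pad the tail if needed

% Row k of idx holds the positions in xp that make up frame k.
idx = bsxfun(@plus, (0:numFrames-1).' * step, 1:n);
frames = xp(idx);
```

For x = 1:20 this gives a 10-by-4 matrix whose first row is 0 0 1 2 and whose last row is 17 18 19 20, matching the desired output in the question.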

comparing lines and columns for zero data

I have a file containing the following data.
File1:
Server counter 1:00 2:00 3:00 4:00
site1 serverdowntime 15 0 3 500
site1 serverdowntimesuc 15 0 3 500
...
site12 serverdowntime 2 7 8 5
site12 serverdowntimesuc 2 7 8 5
...
site50 serverdowntime 2 12 8 45
site50 serverdowntimesuc 2 0 0 45
...
site57 serverdowntime 2 12 8 45
site57 serverdowntimesuc 2 0 0 0
Every two lines belong to the same site. The first column is the equipment, the second is the problem, and from the third column onward there is one column per hour pulled. I'm looking for a way to scan the time data and find each pair of lines that contains a value of exactly 0.
Output after parsing data:
site57 serverdowntime 2 12 8 45
site57 serverdowntimesuc 2 0 0 0
site1 serverdowntime 15 0 3 500
site1 serverdowntimesuc 15 0 3 500
site50 serverdowntime 2 12 8 45
site50 serverdowntimesuc 2 0 0 45
$ awk 'NR==1{next} !(NR%2){line1=$0;next} {$0=line1"\n"$0} /\<0\>/' file
site1 serverdowntime 15 0 3 500
site1 serverdowntimesuc 15 0 3 500
site50 serverdowntime 2 12 8 45
site50 serverdowntimesuc 2 0 0 45
site57 serverdowntime 2 12 8 45
site57 serverdowntimesuc 2 0 0 0
This might work for you (GNU sed):
sed -r '$!N;/^(\S+)\s.*\n\1/!D;/(^|\n)(\S+\s+){2}[^\n]*\s0(\s+|\n|$)/p;d' file
This reads a pair of lines that share the same first field (the key) and then searches for a standalone 0 from the third field onwards.
perl -ne '($k)=/^(\w+)/; if (/\b0\b/){ print $v{$k}, $_ }else{ $v{$k}=$_ }' file