Resampling multiple data columns from minutes to hours in matlab - matlab

I have a large data set of minute-by-minute data with multiple columns that needs to be converted from minutes to hours.
I am new to MATLAB and tried
data_minute = rand(data); % synthetic data
data_hour = mean(reshape(data_minute, 60, []))
which only gives me the hourly data from one row.
I wasn't able to work through every column with something like:
for i = 1:n_columns
data_hour(:,i) = mean(reshape(data_minute(:,i),60, []));
end
Trying a for-loop to sample every 60 data points also didn't work out.
Searching Google didn't turn up a solution I understood.
Update:
For clarification the data looks something like this:
minute value
1 501
2 479
3 449
4 463
5 404
6 173
7 141
8 141
9 141
10 140
11 140
12 140
13 140
14 202
15 206
16 206
.. ...
525604 120

This sounds like a job for timetable and retime. First make a timetable, using a duration for the "time" variable - it's easy to create a duration array using the minutes function. For example:
>> tt = timetable(minutes(0:1000)', rand(1001, 1));
>> % Just look at the first few rows of 'tt':
>> head(tt)
ans =
8×1 timetable
Time Var1
_____ ________
0 min 0.31907
1 min 0.98605
2 min 0.71818
3 min 0.41318
4 min 0.09863
5 min 0.73456
6 min 0.63731
7 min 0.073842
>> % use 'retime' to get the hourly means:
>> rt = retime(tt, 'hourly', 'mean')
rt =
17×1 timetable
Time Var1
_______ _______
0 min 0.47755
60 min 0.47877
120 min 0.48007
180 min 0.55399
240 min 0.5142
300 min 0.5656
360 min 0.50957
420 min 0.48986
480 min 0.49568
540 min 0.55133
600 min 0.49981
660 min 0.53677
720 min 0.49343
780 min 0.53409
840 min 0.47901
900 min 0.55287
960 min 0.48173
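If the data has no gaps and the number of rows is an exact multiple of 60, the reshape idea from the question can also be extended to every column at once. A minimal sketch, assuming a synthetic 180-by-3 matrix (the sizes and the names nCols, nHours and blocks are made up for illustration):

```matlab
% Synthetic data: 3 hours of minute values for 3 columns (assumed sizes)
nCols = 3;
data_minute = rand(180, nCols);
nHours = size(data_minute, 1) / 60;

% Split the rows into blocks of 60: dimensions are minute x hour x column
blocks = reshape(data_minute, 60, nHours, nCols);

% Average over the minute dimension, then drop it to get hours x columns
data_hour = reshape(mean(blocks, 1), nHours, nCols);
```

Unlike retime, this does not cope with missing minutes or a partial last hour, so it is only suitable for regular, complete data.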

We want to downsample the data with an aggregation or an interpolation of all the measurements grouped by hour.
Take this example data matrix:
M = [10, 3,4,5,6;
2000, 3,4,3,5;
5000, 4,4,4,4]
Say that the first column corresponds to the time in seconds, and the other columns correspond to your measurements.
Solution 1: Aggregation with accumarray
% We start by calculating the time in hours (3600 seconds in one hour).
hour = ceil(M(:,1)/3600)
% We extract the measurements
val = M(:,2:end)
% nrow = number of different measurements (columns of val)
nrow = size(val,2);
% How many unique hours?
[uid,~,id] = unique(hour);
% creation of a sub index grouping the measurements by hour and by column
sub = [repmat(id,nrow,1),kron(1:nrow,ones(1,length(id))).']
sub =
1 1
1 1
2 1
1 2
1 2
2 2
1 3
1 3
2 3
1 4
1 4
2 4
% We calculate the result using accumarray (first column = hour):
RES = [uid,accumarray(sub,val(:),[],@median)] % if you want the mean, use @mean
RES =
1.0000 3.0000 4.0000 4.0000 5.5000
2.0000 4.0000 4.0000 4.0000 4.0000
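The same grouping can also be written with one accumarray call per column, which avoids building the two-column subscript matrix. This is only a sketch of an alternative, here using @mean; the names hourIdx and res are introduced for illustration.

```matlab
M = [10, 3,4,5,6;
     2000, 3,4,3,5;
     5000, 4,4,4,4];

hourIdx = ceil(M(:,1)/3600);      % hour of each sample
val = M(:,2:end);                 % measurements
[uid,~,id] = unique(hourIdx);     % id maps each row to its hour group

% Aggregate each measurement column separately
res = zeros(numel(uid), size(val,2));
for c = 1:size(val,2)
    res(:,c) = accumarray(id, val(:,c), [numel(uid) 1], @mean);
end
RES = [uid, res];                 % first column = hour
```

With this small matrix the mean and the median coincide, because the first hour contains only two samples.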
Solution 2: Interpolation with interp1
You can interpolate your data with interp1
interp_second = unique(floor(M(:,1)/3600))*3600
% creation of a unique index
uid = unique(ceil(M(:,1)/3600))
% We extract the measurements
val = M(:,2:end)
% Result (first column = hour)
RES = [uid,interp1(M(:,1),val,interp_second)]
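One thing to watch with this approach: interp1 returns NaN for query points outside the range of the sample times unless extrapolation is requested. A small sketch with the example matrix, where the query point at 0 s lies before the first sample at 10 s (the names t, tq, R_default and R_extrap are made up here):

```matlab
M = [10, 3,4,5,6;
     2000, 3,4,3,5;
     5000, 4,4,4,4];
t = M(:,1);            % sample times in seconds
val = M(:,2:end);      % measurements
tq = [0; 3600];        % query times on the hour

% Default linear interpolation: rows outside [10, 5000] become NaN
R_default = interp1(t, val, tq);

% Linear interpolation with extrapolation enabled
R_extrap = interp1(t, val, tq, 'linear', 'extrap');
```

Whether extrapolated values are meaningful depends on the data, which is one reason the aggregation approach may be the safer choice.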
Conclusion
I would recommend solution 1, because the method is more robust.

Related

Reshaping a vector into a larger matrix with arbitrary m and n

I'm attempting to create a function that takes a vector of any length and uses its entries to generate a matrix of size mxn, where m and n are arbitrary numbers. If the matrix has a greater number of entries than the original vector, the entries should repeat. For example, the vector (1,2,3,4) would make the 3x3 matrix (1,2,3;4,1,2;3,4,1).
So far I have this function:
function A = MyMatrix(Vector,m,n)
A = reshape([Vector,Vector(1:(m*n)-length(Vector))],[m,n]);
end
which is successful in some cases:
>> m=8;n=5;Vector=(1:20);
>> A = MyMatrix(Vector,m,n)
A =
1 9 17 5 13
2 10 18 6 14
3 11 19 7 15
4 12 20 8 16
5 13 1 9 17
6 14 2 10 18
7 15 3 11 19
8 16 4 12 20
However this only works for values of m and n that multiply to a number less than or equal to twice the number of entries in 'Vector', so 40 in this case. When mn is larger than 40, this code yields:
>> m=8;n=6;Vector=(1:20);
>> A = MyMatrix(Vector,m,n)
Index exceeds the number of array elements (20).
Error in MyMatrix (line 3)
A = reshape([Vector,Vector(1:(m*n)-length(Vector))],[m,n]);
I have tried to create a workaround using functions such as repmat, however, so far I have not been able to create a matrix with larger m and n.
You only need to:
1. index the vector using "modular", 1-based indexing;
2. reshape it taking into account that Matlab is column-major, so you need to swap m and n;
3. transpose to swap m and n back.
V = [10 20 30 40 50 60]; % vector
m = 4; % number of rows
n = 5; % number of columns
A = reshape(V(mod(0:m*n-1, numel(V))+1), n, m).';
This gives
A =
10 20 30 40 50
60 10 20 30 40
50 60 10 20 30
40 50 60 10 20
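The one-liner can be wrapped in a reusable helper. A minimal sketch, where the name cyclicMatrix is hypothetical, applied to the 3x3 example from the question and to the 8x6 size that raised the indexing error:

```matlab
% Hypothetical helper: fill an m-by-n matrix row by row, cycling through V
cyclicMatrix = @(V, m, n) reshape(V(mod(0:m*n-1, numel(V)) + 1), n, m).';

A = cyclicMatrix(1:4, 3, 3);    % example from the question
B = cyclicMatrix(1:20, 8, 6);   % size that failed in the question
```

Note that this fills the matrix row by row, as in the (1,2,3;4,1,2;3,4,1) example. The original MyMatrix filled column by column; for that layout, reshape directly to m-by-n and drop the transpose.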

Calculate Interval mean of column in matlab (interval not fixed)

I have an array (2000x2) with two variables and want to calculate the mean of column 2 at intervals determined by column 1. How can I do this?
speed=(:,1); %values range from 0-100 cm/s
press=(:,2);
I want to calculate the mean pressure at 5 cm/s intervals of speed, so that I get 20 values for pressure that correspond to 20 intervals of speed.
Should be simple, but I'm still a beginner in Matlab.
The accumarray function does just that:
data = [0 20 33 44 22 56 25 47 81 90; 3 5 4 3 2 4 5 5 6 0].';
speed = data(:,1);
press = data(:,2);
sz = 5; % interval size
fill = NaN; % fill value in the result, for empty groups
group = floor(speed/sz)+1; % compute index of group for each value
result = accumarray(group, press, [], @mean, fill); % compute mean of each group
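To always get exactly the 20 values asked for in the question, the output size can be fixed and a speed of exactly 100 folded into the last interval. A sketch under those assumptions (nBins and the sample value 100 are introduced here for illustration):

```matlab
data = [0 20 33 44 22 56 25 47 81 100; 3 5 4 3 2 4 5 5 6 0].';
speed = data(:,1);
press = data(:,2);

sz = 5;       % interval size in cm/s
nBins = 20;   % number of intervals covering 0-100 cm/s

% Group index per sample; a speed of exactly 100 goes into the last bin
group = min(floor(speed/sz) + 1, nBins);

% Mean pressure per bin, NaN where a bin has no samples
result = accumarray(group, press, [nBins 1], @mean, NaN);
```

Without the explicit size, the result is only as long as the largest group index that occurs in the data.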

Weighted Random number?

How can I generate random numbers from 0 to 50 in a 5x5 matrix, with the sum of each row printed on the right side?
Also, is there any way to give weight to individual numbers before generating the numbers?
Please help
Thanks!
To generate a random matrix of integers between 0 and 50 (sampled with replacement) you could use
M = randi([0 50], 5, 5) % randi replaces the obsolete randint
To print the matrix with the sum of each row execute the following command
[M sum(M,2)]
To use a different distribution there are a number of techniques, but one of the easiest is to use the datasample function from the Statistics and Machine Learning Toolbox.
% sample from a truncated Normal distribution. No need to normalize
x = 0:50;
weights = exp(-0.5*(x-25).^2 / 5^2);
M = reshape(datasample(x,25,'Weights',weights),[5,5])
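If that toolbox is not available, weighted sampling with replacement can be approximated in base MATLAB by inverting the cumulative distribution. This is a sketch of a swapped-in technique, not what datasample does internally; the names cdf, u and idx are made up here.

```matlab
x = 0:50;
weights = exp(-0.5*(x-25).^2 / 5^2);

% Cumulative distribution, scaled so that the last entry is exactly 1
cdf = cumsum(weights);
cdf = cdf / cdf(end);

% For each uniform draw, count how many cdf entries lie below it
u = rand(25, 1);
idx = sum(u > cdf, 2) + 1;   % implicit expansion, needs R2016b or later

M = reshape(x(idx), [5,5]);
```

On older releases the comparison can be written as bsxfun(@gt, u, cdf) instead.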
Edit:
Based on your comment you want to perform random sampling without replacement. You can perform such a random sampling without replacement if the weights are non-negative integers by simulating the classic ball-urn experiment.
First create an array containing the appropriate number of each value.
Example: If we have the values 0,1,2,3,4 with the following weights
w(0) = 2
w(1) = 3
w(2) = 5
w(3) = 4
w(4) = 1
Then we would first create the urn array
>> urn = [0 0 1 1 1 2 2 2 2 2 3 3 3 3 4];
then, we would shuffle the urn using randperm
>> urn_shuffled = urn(randperm(numel(urn)))
urn_shuffled =
2 0 4 3 0 3 2 2 3 3 1 2 1 2 1
To pick 5 elements without replacement we would simply select the first 5 elements of urn_shuffled.
Rather than typing out the entire urn array, we can construct it programmatically given an array of weights for each value. For example
weight = [2 3 5 4 1];
urn = [];
v = 0;
for w = weight
urn = [urn repmat(v,1,w)];
v = v + 1;
end
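On releases that provide repelem (R2015a and later), the loop can likely be replaced by a single call. A brief sketch using the same weights, with the name values introduced here:

```matlab
values = 0:4;            % the values to place in the urn
weight = [2 3 5 4 1];    % how many copies of each value

% Repeat each value according to its weight
urn = repelem(values, weight);

% Shuffle and draw 5 elements without replacement
urn_shuffled = urn(randperm(numel(urn)));
drawn = urn_shuffled(1:5);
```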
In your case, the urn will contain many elements. Once you shuffle you would select the first 25 elements and reshape them into a matrix.
>> M = reshape(urn_shuffled(1:25),5,5)
To draw uniformly distributed random integers, you can use the randi function. Note that randi(50,[5,5]) draws from 1 to 50; use randi([0 50],[5,5]) to include 0:
>> randi(50,[5,5])
ans =
34 48 13 28 13
33 18 26 7 41
9 30 35 8 13
6 12 45 13 47
25 38 48 43 18
Printing the sum of each row can be done by using the sum function with 2 as the dimension argument:
>> sum(ans,2)
ans =
136
125
95
123
172
For weighting the various random numbers, see this question.

Matlab loop to convert N x 1 matrix to 60 x 4718 matrix

I am a novice at Matlab and am struggling a bit with creating a loop that will convert a 283080 x 2 matrix into a 60 x 4718 matrix. Column 1 of the input lists all stockID numbers (each repeated 60 times) and column 2 contains all lagged monthly returns (60 observations for each stock). The output should have a column for each stockID, with its corresponding lagged returns in the 60 rows underneath each ID number.
My aim is to then try to calculate a variance-covariance matrix of the returns.
I believe I need a loop because I will be repeating this process over 70 times, as I have multiple data sets in this same format.
Thanks so much for the help!
Let data denote your matrix. Then:
aux = sortrows(data,1); %// sort rows according to value in column 1
result = reshape(aux(:,2),60,[]); %// reshape second column as desired
If you need to insert the stockID values as headings (first row of result), add this as a last line:
result = [ unique(aux(:,1)).'; result ];
A simple example, replacing 60 by 2:
>> data = [1 100
2 200
1 101
2 201
4 55
3 0
3 33
4 56];
>> aux = sortrows(data,1);
>> result = reshape(aux(:,2),2,[]);
>> result = [ unique(aux(:,1)).'; result ]
result =
1 2 3 4
100 200 0 55
101 201 33 56
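For the variance-covariance matrix mentioned in the question, the reshaped returns (without the ID heading row) can be passed to cov, which treats rows as observations and columns as variables. A sketch on the small example above; the names returns and C are made up here:

```matlab
data = [1 100; 2 200; 1 101; 2 201; 4 55; 3 0; 3 33; 4 56];

aux = sortrows(data, 1);              % group rows by ID
returns = reshape(aux(:,2), 2, []);   % 2 observations per ID here, 60 in the question

% Covariance between the return series: one row and column per ID
C = cov(returns);
```

For the full data set, returns would be 60 x 4718 and C would be 4718 x 4718.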

How can I divide each row of a matrix by a fixed row?

Suppose I have a matrix like:
100 200 300 400 500 600
1 2 3 4 5 6
10 20 30 40 50 60
...
I wish to divide each row by the second row (each element by the corresponding element), so I'll get:
100 100 100 100 100 100
1 1 1 1 1 1
10 10 10 10 10 10
...
How can I do it (without writing an explicit loop)?
Use bsxfun:
outMat = bsxfun(@rdivide, inMat, inMat(2,:));
The 1st argument to bsxfun is a handle to the function you want to apply, in this case right-division.
Here's a couple more equivalent ways:
M = [100 200 300 400 500 600
1 2 3 4 5 6
10 20 30 40 50 60];
%# BSXFUN
MM = bsxfun(@rdivide, M, M(2,:));
%# REPMAT
MM = M ./ repmat(M(2,:),size(M,1),1);
%# repetition by multiplication
MM = M ./ ( ones(size(M,1),1)*M(2,:) );
%# FOR-loop
MM = zeros(size(M));
for i=1:size(M,1)
MM(i,:) = M(i,:) ./ M(2,:);
end
The best solution is the one using BSXFUN (as posted by @Itamar Katz)
Since R2016b, MATLAB supports implicit expansion, so element-wise operators work directly between a matrix and a row vector.
This will do the trick:
mat = [100 200 300 400 500 600
1 2 3 4 5 6
10 20 30 40 50 60];
result = mat ./ mat(2,:)
which will output :
result =
100 100 100 100 100 100
1 1 1 1 1 1
10 10 10 10 10 10
This will work in Octave and Matlab since R2016b.
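As a quick check that the approaches agree, the bsxfun form and the implicit-expansion form can be compared directly. A small sketch (the names viaBsxfun, viaExpansion and same are introduced for illustration):

```matlab
M = [100 200 300 400 500 600
     1 2 3 4 5 6
     10 20 30 40 50 60];

% bsxfun form, works on older releases as well
viaBsxfun = bsxfun(@rdivide, M, M(2,:));

% Implicit expansion, needs R2016b or later (or Octave)
viaExpansion = M ./ M(2,:);

same = isequal(viaBsxfun, viaExpansion);
```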