Problem with loop MATLAB - matlab

no time scores
1 10 123
2 11 22
3 12 22
4 50 55
5 60 22
6 70 66
. . .
. . .
n n n
Above a the content of my txt file (thousand of lines).
1st column - number of samples
2nd column - time (from beginning to end ->accumulated)
3rd column - scores
I wanted to create a new file which will be the total of every three sample of the scores divided by the time difference of the same sample.
e.g. (123+22+22)/ (12-10) = 167/2 = 83.5
(55+22+66)/(70-50) = 143/20 = 7.15
new txt file
83.5
7.15
.
.
.
n
so far I have this code:
fid=fopen('data.txt')
data = textscan(fid,'%*d %d %d')
time = (data{1})
score= (data{2})
for sample=1:length(score)
..... // I'm stucked here ..
end
....

If you are feeling adventurous, here's a vectorized one-line solution using ACCUMARRAY (assuming you already read the file in a matrix variable data like the others have shown):
NUM = 3;
result = accumarray(reshape(repmat(1:size(data,1)/NUM,NUM,1),[],1),data(:,3)) ...
./ (data(NUM:NUM:end,2)-data(1:NUM:end,2))
Note that here the number of samples NUM=3 is a parameter and can be substituted by any other value.
Also, reading your comment above, if the number of samples is not a multiple of this number (3), then simply discard the remaining samples by doing this beforehand:
data = data(1:fix(size(data,1)/NUM)*NUM,:);
Im sorry, here's a much simpler one :P
result = sum(reshape(data(:,3), NUM, []))' ./ (data(NUM:NUM:end,2)-data(1:NUM:end,2));

%# Easier to load with importdata
data = importdata('data.txt',' ',1);
%# Get the number of rows
n = size(data,1);
%# Column IDs
time = 2;score = 3;
%# The interval size (3 in your example)
interval = 3;
%# Pre-allocate space
new_data = zeros(numel(interval:interval:n),1);
%# For each new element in the new data
index = 1;
%# This will ignore elements past the closest (floor) multiple of 3 as requested
for i = interval:interval:n
%# First and last elements in a batch
a = i-interval+1;
b = i;
%# Compute the new data
new_data(index) = sum( data(a:b,score) )/(data(b,time)-data(a,time));
%# Increment
index = index+1;
end

For what it's worth, here is how you would go about to do that in Python. It is probably adaptable to Matlab.
import numpy
no, time, scores = numpy.loadtxt('data', skiprows=1).T
# here I assume that your n is a multiple of 3! otherwise you have to adjust
sums = scores[::3]+scores[1::3]+scores[2::3]
dt = time[2::3]-time[::3]
result = sums/dt

I suggest you use the importdata() function to get your data into your variable called data. Something like this:
data = importdata('data.txt',' ', 1)
replace ' ' by the delimiter your file uses, the 1 specifies that Matlab should ignore 1 header line. Then, to compute your results, try this statement:
(data(1:3:end,3)+data(2:3:end,3)+data(3:3:end,3))./(data(3:3:end,2)-data(1:3:end,2))
This worked on your sample data, should work on the real data you have. If you figure it out yourself you'll learn some useful Matlab.
Then use save() to write the results back to a file.
PS If you find yourself writing loops in Matlab you are probably doing something wrong.

Related

Only Import File when it contains certain numbers from a Table

I got a couple 100 sensor measurement files all containing the date and time of measurement. All the files have names that include date and time. Example:
07-06-2016_17-58-32.wf
07-06-2016_18-02-32.wf
...
...
08-06-2016_17:48-26.wf
I have a function (importfile) and a loop that imports my data. The loop looks like this:
Files = dir('C:\Osci\User\*.waveform');
numFiles = length(Files);
Data = cell(1, numFiles);
for fileNum = 1:numFiles
Data{fileNum} = importfile(Files(fileNum).name);
end
Not all of these waveform files are useful. The measurement files are only useful if they were generated in a certain time period. I got a table that shows my allowed time periods:
07-Jun-2016 18:00:01
07-Jun-2016 18:01:31
07-Jun-2016 18:02:01
...
I want to modify my loop, so that the files (.waveform files) are only imported if the numbers for day (first number), hour (4th number) and minute (5th number) from the files match the numbers of the table containing the allowed time periods.
EDIT: Rather than a scalar hour, minute, and second, there is a vector of each. In my case, MyDay, MyHour and MyMinute are 1100x1 matrices while fileTimes only consists of 361 rows.
So, using the provided example the loop should only import file
07-06-2016_18-02-32.wf
since it is the only one where the numbers match (in this case 7, 18, 02).
EDIT2: Using #erfan's answer (and changing some directories and variable names) I have the following working code:
fmtstr = 'O:\\Basic_Research_All\\Lange\\Skripe ISAT\\Rohdaten\\*_%02i-*-*_%02i-%02i-*.wf';
Files = struct([]);
n = size(MyDayMyHourMyMinute);
for N = 1:n;
Files = [Files; dir(sprintf(fmtstr, MyDayMyHourMyMinute(N,:)))];
end
numFiles = length(Files);
WaveformData = cell(1, numFiles);
for fileNum = 1:numFiles
WaveformData{fileNum} = importfile(Files(fileNum).name);
end
Since your filenames are pretty well defined as dates and times, you can prefilter your list by turning them into actual dates and times:
% Get the file list
Files = dir('C:\Osci\User\*.waveform');
% You only need the names
Files = {Files.name};
% Get just the filename w/o the extension
[~, baseFileNames] = cellfun(#(x) fileparts(x), Files, 'UniformOutput', false);
% Your filename is just a date, so parse it as such
fileTimes = datevec(baseFileNames, 'mm-dd-yyyy_HH-MM-SS');
% Now pick out the files you want
% goodFiles = fileTimes(:, 4) == myHour & fileTimes(:, 5) == myMinute & fileTimes(:, 6) == mySecond;
goodFiles = ismember(fileTimes(:, 4:6), [myHour(:), myMinute(:), mySecond(:)], 'rows');
% Pare down your list of filenames
Files = Files(goodFiles);
% Preallocate your data cell
Data = cell(1, numel(Files));
% Now do your loop
for idx = 1:numel(Data)
Data{idx} = importfile(Files{idx});
end
You will, of course, need to define myHour, myMinute and mySecond. Of course, using the logical indexing in goodFiles, you could impose any sort of time criteria, like time or date range. If you find that your filenames aren't so well defined, you could parse out the filename using textscan or strfind to get the bits you want. The important thing is that cell arrays can be indexed into in much the same way as numerical or string arrays and it's often better to vectorize your filter criteria and then only do the loop on the parts you have to.
The OP indicated in a comment below that rather than a scalar hour, minute, and second, there is a vector of each. In that case, use ismember to match the two time vectors and return a logical index vector. With 2015a, MathWorks introduced the function ismembertol, which allows one to check membership within a certain tolerance.
You can apply your selection from the beginning. Imagine the acceptable values for day, hour and minute are saved in acc as an n*3 matrix. If you replace the first line of your code with:
fmtstr = 'C:\Osci\User\%02i-*-*_%02i-%02i-*.wf';
Files = struct([]);
for ii = 1:n
Files = [Files; dir(sprintf(fmtstr, acc(ii,:)))];
end
Then you have already applied your criteria to Files. The rest is the same.

MATLAB data file when it overs its memory

There is a Matrix of 500000000 X 5.
And the sample of this data is like this :
1 01 06:0 48407
1 01 06:1 48407
.
.
.
865850 31 23:5 1586884493
Each column means [area_number date hour:minute amount_of_data]
I want to load them entirely, after that make another 865850 X 4464 matrix from their 5th column values. In this new matrix, row insists area_number. And each column means amout_of_data according to time priority.
This is what I wrote.
clear all; close all;
fileID=fopen('data2.txt','r');
Data=fscanf(fileID, '%d %d %d:%d %d',[5 100000]);
Data = Data';
Zeros = zeros(4000, 4464);
DataA = Data(:,1); % indexs
DataB = Data(:,2); % dates
DataC = Data(:,3); % hours
DataD = Data(:,4); % minutes
DataE = Data(:,5); % data
for m=1:40000
r = DataA(m);
c = (DataB(m)-1)*24*6 + DataC(m)*6 + DataD(m);
Zeros(r,c) = DataE(m);
end
I can't finish it because the matrix too big to load it at once.
It overs memory limitation of MATLAB.
Please help me...
Thank you~!
To solve your problem, using the matfile command is probably the best choice. It allows you to write data directly to a mat-file on the filesystem but access it like a variable.
If I understood your data right, all lines with the same index are next to each other, and all data with the same index is small enough to fit your memory.
Read all data with index 1
process it like you did above, creating one row of your intended matrix
Write this row to your matfile
Proceed with the next index until you reach the end

Average across many files on matlab

I was wondering if I can get help with this. I have many .mat files with an array in each and I want to average each cell individually (average all (1,1)s, all (1,2)s, ... (2,1)s etc.) and store them.
Thanks for the help!
I'm not quite sure how your data is organised, but you can do something like this:
% Assume you know the size of the arrays and that the variables r and c
% hold the numbers of rows and columns respectively.
xTotals = zeros(r, c);
xCount = 0;
% for each file: assume the data is loaded into a variable called x, which is
% r rows by c columns
for ...
xTotals = xTotals + x;
xCount = xCount + 1;
end
xAvg = xTotals / xCount;
And xAvg will contain the average for each array cell. Note that you probably know xCount without having to count each time you go round the loop, but it depends on where you are getting your data. Hopefully you get the idea!

MATLAB: Creating a matrix from for loop values?

I have the following code:
for i = 1450:9740:89910
n = i+495;
range = ['B',num2str(i),':','H',num2str(n)];
iter = xlsread('BrokenDisplacements.xlsx' , range);
displ = iter;
displ = [displ; iter];
end
Which takes values from an Excel file from a number of ranges I want and outputs them as matricies. However, this code just uses the final value of displ and creates the total matrix from there. I would like to total these outputs (displ) into one large matrix saving values along the way, how would I go about doing this?
Since you know the size of the block of data you are reading, you can make your code much more efficient as follows:
firstVals = 1450:9740:89910;
displ = zeros((firstVals(end) - firstVals(1) + 1 + 496), 7);
for ii = firstVals
n = ii + 495;
range = sprintf('B%d:H%d', ii, ii+495);
displ((ii:ii+495)-firstVals(1)+1,:) = xlsread('BrokenDiplacements.xlsx', range);
end
Couple of points:
I prefer not to use i as a variable since it is built in as sqrt(-1) - if you later execute code that assumes that to be true, you're in trouble
I am not assuming that the last value of ii is 89910 - by first assigning the value to a vector, then finding the last value in the vector, I sidestep that question
I assign all space in iter at once - otherwise, as it grows, Matlab keeps having to move the array around which can slow things down a lot
I used sprintf to generate the string representing the range - I think it's more readable but it's a question of style
I assign the return value of xlsread directly to a block in displ that is the right size
I hope this helps.
How about this:
displ=[];
for i = 1450:9740:89910
n = i+495;
range = ['B',num2str(i),':','H',num2str(n)];
iter = xlsread('BrokenDisplacements.xlsx' , range);
displ = [displ; iter];
end

matlab averages of cell array

I have the following script for importing text files into matlab which include hourly data, where I am then trying to convert them into daily averages:
clear all
pathName = ...
TopFolder = pathName;
dirListing = dir(fullfile(TopFolder,'*.txt'));%Lists the folders in the directory specified by pathName.
for i = 1:length(dirListing);
SubFolder{i} = dirListing(i,1).name;%obtain the name of each folder in
%the specified path.
end
%import data
for i=1:length(SubFolder);
rawData1{i} = importdata(fullfile(pathName,SubFolder{i}));
end
%convert into daily averages
rawData2=cell2mat(rawData1);
%create one matrix for entire data set
altered=reshape(rawData2,24,(size(rawData2,2)*365));
%convert into daily values
altered=mean(altered)';
%take the average for each day
altered=reshape(altered,365,size(rawData2,2));
%convert back into original format
My problem lies in trying to convert the data back into the same format as 'rawData1' which was a cell for each variable (where each variable is denoted by 'SubFolder'. The reason for doing this is that all but one of the variables are vectors, where the remaining variable is a matrix (8760*11).
So, an example of this would be:
clear all
cell_1 = rand(8760,1);
cell_2 = rand(8760,1);
cell_3 = rand(8760,1);
cell_4 = rand(8760,1);
cell_5 = rand(8760,1);
cell_6 = rand(8760,11);
cell_7 = rand(8760,1);
cell_8 = rand(8760,1);
cell_9 = rand(8760,1);
data = {cell_1,cell_2,cell_3,cell_4,cell_5,cell_6,cell_7,cell_8,cell_9};
Where I need to convert each cell in 'data' from hourly values into daily averages (i.e. 365 rows).
Any advice would be much appreciated.
I think this does what you want.
data = cellfun(#(x) reshape(mean(reshape(x,24,[]))',365,[]),data,'uniformoutput',false);
However that is kind of confusing so I will explain a little.
This part mean(reshape(x,24,[]))' inside of the cellfun will reshape each cell in data into a 24 by 365, compute the mean, then turn it back into a single column. This works fine when the original data only has 1 column ... but for cell_6 with 11 columns it puts all the data end to end. So I added an addition reshape(...) wrapper around the mean(...) part to put it back into the original 11 columns ... or more precises N columns that are 365 rows in length.
Note: This is going to give you errors if you ever have data sets dimensions are not 8760 by X.