Read all .csv-files in folder and plot their content - matlab

Following an old post (https://stackoverflow.com/a/13744310/3900582), I have been able to read all the .csv-files in my folder into a cell array. Each .csv-file has the following structure:
0,1024
1,427
2,313
3,492
4,871
5,1376
6,1896
7,2408
8,2851
9,3191
Where the left column is the x-value and the right column is the y-value.
In total, there are almost 200 files and they are each up to 100 000 lines long. I would like to plot the contents of the files in one figure, to allow the data to be more closely inspected.

I was able to use the following code to solve my problem:
dd = dir('*.csv');
fileNames = {dd.name};
data = cell(numel(fileNames),2);
data(:,1) = regexprep(fileNames, '\.csv$',''); % escape the dot and anchor to the end of the name
for i = 1:numel(fileNames)
data{i,2} = dlmread(fileNames{i});
end
fig=figure();
hold on;
for j = 1:numel(fileNames)
XY = data{j,2};
X = XY(:,1);
Y = XY(:,2);
plot(X,Y);
end
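With close to 200 curves in one figure it can help to label each line with its file name. The following is only a sketch under assumptions: the temporary folder and the two small files a.csv and b.csv are hypothetical stand-ins written here so the example runs on its own.

```matlab
% Hypothetical demo data: two small comma-separated files in a temporary folder
tmp = tempname;
mkdir(tmp);
dlmwrite(fullfile(tmp, 'a.csv'), [(0:4)', [1024 427 313 492 871]']);
dlmwrite(fullfile(tmp, 'b.csv'), [(0:4)', [10 20 30 40 50]']);

dd = dir(fullfile(tmp, '*.csv'));
figure;
hold on;
for i = 1:numel(dd)
    XY = dlmread(fullfile(tmp, dd(i).name));       % column 1 is x, column 2 is y
    [~, label] = fileparts(dd(i).name);            % file name without extension
    plot(XY(:,1), XY(:,2), 'DisplayName', label);  % label is used by the legend
end
lgd = legend('show');
set(lgd, 'Interpreter', 'none');                   % keep underscores in file names literal
```

With the full data set the legend gets long, so it may be more practical to show it only for a subset of the files.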

Related

Open multiple folders which have about 500 files under them and extract vtk files under them

I am trying to open multiple folders which have about 500 files under them and then use a function called vtkread to read the files in those folders. I am not sure how to set that up.
So here is my function, but I am struggling to set up the main script to select files from a folder:
function [Z_displacement,Pressure] = Processing_Code2_Results(filename, reduce_time, timestep_total)
fid = fopen(filename,'r');
Post_all = [];
vv=[1:500];
DANA0 = vtkRead('0_output_000000.vtk'); %extract all data from the vtk file including disp, pressure, points, times
C = [DANA0.points,reshape(DANA0.pointData.displacements,size(DANA0.points)),reshape(DANA0.pointData.pressure,[length(DANA0.points),1])];
disp0 = reshape(DANA0.pointData.displacements,[1,size(C,1),3]);
points = DANA0.points; % This is a matrix of the xyz points
for i = 1:reduce_time:timestep_total %34
DANA = vtkRead(sprintf('0_output_%06d.vtk',i)); % read in each successive timestep
disp(i,:,:) = DANA.pointData.displacements; % store displacement for multiple timesteps
pressure(i,:) = DANA.pointData.pressure; % store pressure for multiple timesteps
% press = pressure';
end
...
I have tried something like this:
clc; clear;
timestep_total = 500;
reduce_time = 100;
cd 'C:\Users\Admin\OneDrive - Kansas State University\PhD\Project\Modeling\SSGF_Model\New_Model_output'
for i = 1:3
filename = sprintf("Gotherm_%d",i)
[Z_displacement_{i},Pressure_{i}] = Processing_Code2_Results(filename, reduce_time, timestep_total);
end
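Note that the function as posted never uses filename when it calls vtkRead, so it reads '0_output_%06d.vtk' from whatever the current folder is. One possible way to set up the main script is to build full paths per folder with dir and fullfile. This is a sketch only: the temporary root folder and the empty placeholder files are assumptions made so the example runs, and the vtkRead call is left as a comment because that function is not part of MATLAB.

```matlab
% Hypothetical layout: <rootDir>/Gotherm_1 ... Gotherm_3, each holding 0_output_NNNNNN.vtk
rootDir = tempname;                                % stand-in for the real output folder
mkdir(rootDir);
for i = 1:3
    folder = fullfile(rootDir, sprintf('Gotherm_%d', i));
    mkdir(folder);
    for t = [0 100 200]
        % create empty placeholder files so that dir has something to list
        fclose(fopen(fullfile(folder, sprintf('0_output_%06d.vtk', t)), 'w'));
    end
end

vtkPaths = cell(3, 1);
for i = 1:3
    folder  = fullfile(rootDir, sprintf('Gotherm_%d', i));
    listing = dir(fullfile(folder, '0_output_*.vtk'));
    % full path of every vtk file in this folder
    vtkPaths{i} = cellfun(@(n) fullfile(folder, n), {listing.name}, ...
                          'UniformOutput', false);
    % each vtkPaths{i}{k} could then be passed on, e.g. DANA = vtkRead(vtkPaths{i}{k});
end
```

Passing the folder (or the full paths) into the processing function avoids relying on cd and the current folder.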

Matlab: Error using readtable (line 216) Input must be a row vector of characters or string scalar

I got the error "Error using readtable (line 216) Input must be a row vector of characters or string scalar" when I tried to run this code in MATLAB:
clear
close all
clc
D = 'C:\Users\Behzad\Desktop\New folder (2)';
filePattern = fullfile(D, '*.xlsx');
file = dir(filePattern);
x={};
for k = 1 : numel(file)
baseFileName = file(k).name;
fullFileName = fullfile(D, baseFileName);
x{k} = readtable(fullFileName);
fprintf('read file %s\n', fullFileName);
end
% allDates should be out of the loop because it's not necessary to be in the loop
dt1 = datetime([1982 01 01]);
dt2 = datetime([2018 12 31]);
allDates = (dt1 : calmonths(1) : dt2).';
allDates.Format = 'MM/dd/yyyy';
% 1) pre-allocate a cell array that will store
% your tables (see note #3)
T2 = cell(size(x)); % this should work, I don't know what x is
% the x is xlsx files and have different sizes, so I think it should be in
% a loop?
% creating loop
for idx = 1:numel(x)
T = readtable(x{idx});
% 2) This line should probably be T = readtable(x(idx));
sort = sortrows(T, 8);
selected_table = sort (:, 8:9);
tempTable = table(allDates(~ismember(allDates,selected_table.data)), NaN(sum(~ismember(allDates,selected_table.data)),size(selected_table,2)-1),'VariableNames',selected_table.Properties.VariableNames);
T2 = outerjoin(sort,tempTable,'MergeKeys', 1);
% 3) You're overwriting the variabe T2 on each iteration of the i-loop.
% to save each table, do this
T2{idx} = fillmissing(T2, 'next', 'DataVariables', {'lat', 'lon', 'station_elevation'});
end
Here x holds each xlsx file read in the first loop. My xlsx files have different numbers of columns and rows. I want the second loop to process every xlsx file in the directory.
Do you know what the problem is, and how to fix it?
readtable takes a filename as its input and returns a table. In your code you have the following:
x{k} = readtable(fullFileName);
All fine: you are reading the tables and storing their contents in x. Later in your code you continue with:
T = readtable(x{idx});
You have already read the table, so what you wrote is basically T = readtable(readtable(fullFileName)), which passes a table where a filename is expected and triggers the error. Just use T = x{idx}.
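A minimal sketch of how the second loop could look once it reuses the tables that are already in x. The two small tables and their variable names key and value are hypothetical stand-ins for the spreadsheets, and the loop body only sorts, to keep the example short.

```matlab
% Stand-in data: two tables of different sizes, as if returned by readtable
x = {table([3;1;2], [30;10;20], 'VariableNames', {'key','value'}), ...
     table([2;1],   [5;6],      'VariableNames', {'key','value'})};

T2 = cell(size(x));               % one result per input table
for idx = 1:numel(x)
    T = x{idx};                   % already a table, so no second readtable call
    sortedT = sortrows(T, 1);     % sort by the first variable; avoid naming a variable "sort"
    T2{idx} = sortedT;            % store per iteration instead of overwriting T2
end
```

Using a separate name for the intermediate result keeps T2 a cell array for the whole loop, which is the overwriting problem the comments in the question point at.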

How to add standard deviation and moving average

What I want to do is:
I have a folder with 32 txt files and 1 Excel file; each file contains data in two columns: time, level.
I already managed to pull the data from the folder, open each file in MATLAB, and get the data from it. What I need to do is create a plot for each data file.
Each of the 32 plots should have:
Change in average over time
Standard deviation
I am struggling with both of these and can't make them work.
I also need to make another plot, which should show the average over each minute across all 32 files.
Here is my code so far:
clc,clear;
myDir = 'my path';
dirInfo = dir([myDir,'*.txt']);
filenames = {dirInfo.name};
N = numel(filenames);
data=cell(N,1);
for i=1:N
fid = fopen([myDir,filenames{i}] );
data{i} = textscan(fid,'%f %f','headerlines',2);
fclose(fid);
temp1=data{i,1};
time=temp1{1};
level=temp1{2};
Average(i)=mean(level(1:find(time>60)));
AverageVec=ones(length(time),1).*Average(i);
Standard=std(level);
figure(i);
plot(time,level);
xlim([0 60]);
hold on
plot(time, AverageVec);
hold on
plot(time, Standard);
legend('Level','Average','Standard Deviation')
end
The main problem with this code is that I only get the average over the whole 60 seconds, not a moving average, and the standard deviation plot shows nothing.
A few things you need to know:
*temp1 is a 1x2 cell
*time and level are 22973x1 double.
Apparently you need an alternative to movmean and movstd, since they were introduced in R2016a. I combined the suggestion from @bla with two loops that correct for the edge effects.
function [movmean,movstd] = moving_ms(vec,k)
if mod(k,2)==0,k=k+1;end
L = length(vec);
movmean=conv(vec,ones(k,1)./k,'same');
% correct edges
n=(k-1)/2;
movmean(1) = mean(vec(1:n+1));
N=n+2; % the window 1:(ct+n) holds n+2 samples at ct=2
for ct = 2:n
movmean(ct) = movmean(ct-1) + (vec(ct+n) - movmean(ct-1))/N;
N=N+1;
end
movmean(L) = mean(vec((L-n):L));
N=n+2; % the window (ct-n):L holds n+2 samples at ct=L-1
for ct = (L-1):-1:(L-n)
movmean(ct) = movmean(ct+1) + (vec(ct-n) - movmean(ct+1))/N;
N=N+1;
end
%mov variance
movstd = nan(size(vec));
for ct = 1:n
movstd(ct) = sum((vec(1:n+ct)-movmean(ct)).^2);
movstd(ct) = movstd(ct)/(n+ct-1);
end
for ct = n+1:(L-n)
movstd(ct) = sum((vec((ct-n):(ct+n))-movmean(ct)).^2);
movstd(ct) = movstd(ct)/(k-1);
end
for ct = (L-n):L
movstd(ct) = sum((vec((ct-n):L)-movmean(ct)).^2);
movstd(ct) = movstd(ct)/(L-ct+n);
end
movstd=sqrt(movstd);
Someone with matlab >=2016a can compare them using:
v=rand(1,1E3);m1 = movmean(v,101);s1=movstd(v,101);
[m2,s2] = moving_ms(v,101);
x=1:1E3;figure(1);clf;
subplot(1,2,1);plot(x,m1,x,m2);
subplot(1,2,2);plot(x,s1,x,s2);
It should show a single red line since the blue line is overlapped.
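For the plotting step in the question, a sketch of how a moving average and a standard deviation band could be drawn per file. It assumes R2016a or later so that the built-in movmean and movstd are available, and it uses synthetic time and level vectors and a window length of 101 samples as hypothetical stand-ins for the file contents.

```matlab
% Synthetic stand-in for one file: 60 s of noisy level data
time  = (0:0.01:60)';
level = sin(time/5) + 0.1*randn(size(time));

k = 101;                      % window length in samples (assumption)
m = movmean(level, k);        % moving average, same length as level
s = movstd(level, k);         % moving standard deviation, same length as level

figure;
hold on;
plot(time, level);
plot(time, m);
plot(time, m + s, '--');      % band of one standard deviation around the average
plot(time, m - s, '--');
xlim([0 60]);
legend('Level', 'Moving average', 'Average + std', 'Average - std');
```

Because s is a vector of the same length as time, it can be plotted as a curve, unlike the scalar returned by std(level).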

Using MATLAB to stack several 2D plots generated from .csv into a 3D plot

I have code to generate a 2D plot from data stored in several .csv files:
clearvars;
files = dir('*.csv');
name = 'E_1';
set(groot, 'DefaultLegendInterpreter', 'none')
set(gca,'FontSize',20)
hold on;
for file = files'
csv = xlsread(file.name);
[n,s,r] = xlsread(file.name);
des_cols = {'Stress','Ext.1(Strain)'};
colhdrs = s(2,:);
[~,ia] = intersect(colhdrs, des_cols);
colnrs = flipud(ia);
file.name = n(:, colnrs);
file.name = file.name(1:end-500,:);
plot(file.name(:,2),file.name(:,1),'DisplayName',s{1,1});
end
ylabel({'Stress (MPa)'});
xlabel({'Strain (%)'});
title({name});
legend('show');
What I would like to do is modify the code to stack the 2D plots made from the .csv data into a 3D plot where one of the axes is the index of the .csv in files, kind of like the picture at the top of this post. I got the idea of using plot3 from that post, but I'm not sure how to get it to work.
From what I understood, I need to create 3 new matrices xMat, yMat, zMat. The columns of each matrix contain the data from one csv file, and yMat contains columns that are just the index of the csv, but I'm not entirely sure where to go from here.
Thanks for any help!
You could call plot3 in the loop, something like the following. Basically, change the Y-values to Z-values, then increment Y by one for each iteration of the loop.
figure;
a = axes;
grid on;
hold(a,'on');
x = 0:.1:4*pi;
for ii = 1:10
plot3(a,x,ones(size(x))*ii,sin(x));
end
view(40,40)
Modifying your code would look something like the following. Note that since I don't have your CSVs I can't test any of this.
clearvars;
files = dir('*.csv');
name = 'E_1';
set(groot, 'DefaultLegendInterpreter', 'none')
set(gca,'FontSize',20)
a = gca;
hold on;
ii = 1;
for file = files'
csv = xlsread(file.name);
[n,s,r] = xlsread(file.name);
des_cols = {'Stress','Ext.1(Strain)'};
colhdrs = s(2,:);
[~,ia] = intersect(colhdrs, des_cols);
colnrs = flipud(ia);
file.name = n(:, colnrs);
file.name = file.name(1:end-500,:);
plot3(a,file.name(:,2),ones(size(file.name(:,2))).*ii,file.name(:,1),'DisplayName',s{1,1});
ii = ii+1;
end
view(40,40);
ylabel({'File index'});
zlabel({'Stress (MPa)'});
xlabel({'Strain (%)'});
title({name});
legend('show');

How to store .csv data and calculate average value in MATLAB

Can someone help me understand how I can load a group of .csv files in MATLAB, select only the columns I am interested in, and get as output a final file containing the average value and standard deviation of the y columns? I am not very experienced with MATLAB, so I would appreciate any help with this question.
Here is what I have tried so far:
clear all;
clc;
which_column = 5;
dirstats = dir('*.csv');
col3Complete=0;
col4Complete=0;
for K = 1:length(dirstats)
[num,txt,raw] = xlsread(dirstats(K).name);
col3=num(:,3);
col4=num(:,4);
col3Complete=[col3Complete;col3];
col4Complete=[col4Complete;col4];
avgVal(K)=mean(col4(:));
end
col3Complete(1)=[];
col4Complete(1)=[];
%columnavg = mean(col4Complete);
%columnstd = std(col4Complete);
% xvals = 1 : size(columnavg,1);
% plot(xvals, columnavg, 'b-', xvals, columnavg-columnstd, 'r--', xvals, columnavg+columstd, 'r--');
B = reshape(col4Complete,[5000,K]);
m=mean(B,2);
C = reshape (col4Complete,[5000,K]);
S=std(C,0,2);
Now I know that I should compute the mean and standard deviation inside the for loop, using the mean() function, but I am not sure how to use it.
which_column = 5;
dirstats = dir('*.csv');
col3Complete=[]; % Initialise as empty matrix
col4Complete=[];
avgVal = zeros(length(dirstats),2); % initialise as a two-column matrix: [mean, std] per file
for K = 1:length(dirstats)
[num,txt,raw] = xlsread(dirstats(K).name);
col3=num(:,3);
col4=num(:,4);
col3Complete=[col3Complete;col3];
col4Complete=[col4Complete;col4];
avgVal(K,1)=mean(col4(:)); % 1st column contains mean
avgVal(K,2)=std(col4(:)); % 2nd column contains standard deviation
end
%columnavg = mean(col4Complete);
%columnstd = std(col4Complete);
% xvals = 1 : size(columnavg,1);
% plot(xvals, columnavg, 'b-', xvals, columnavg-columnstd, 'r--', xvals, columnavg+columstd, 'r--');
B = reshape(col4Complete,[5000,K]);
meanVals=mean(B,2);
I didn't change much: I initialised your arrays as empty arrays so you do not have to delete the first entry later on, and made avgVal a two-column matrix with the mean in column 1 and the standard deviation in column 2. You can of course add two more columns if you want to collect those statistics for the 3rd column in the csv as well.
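If the goal is a mean and standard deviation across files for each row, one option is to store each file's column 4 as its own column of a matrix and reduce along the second dimension. This is a sketch with made-up numbers: nRows, nFiles and the generated col4 values are hypothetical stand-ins for the 5000 rows per file and the data returned by xlsread.

```matlab
nRows  = 5;                      % hypothetical; the question reshapes to 5000 rows per file
nFiles = 3;                      % hypothetical number of csv files
B = zeros(nRows, nFiles);        % one column per file
for K = 1:nFiles
    col4 = K * (1:nRows)';       % stand-in for num(:,4) read from file K
    B(:, K) = col4;
end
rowMean = mean(B, 2);            % average across files, one value per row
rowStd  = std(B, 0, 2);          % standard deviation across files, one value per row
```

Filling B column by column gives the same layout as the reshape in the code above, but it fails immediately with a size error if a file does not have the expected number of rows.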
As a side note: xlsread is rather heavy for reading files, since Excel is horribly inefficient. If you want to read a structured file such as a csv, it's faster to use a plain-text reader such as importdata or csvread.
Create some random matrix to store in a file with a header:
A = rand(1e3,5);
out = fopen('output.csv','w');
fprintf(out,'ColumnA,ColumnB,ColumnC,ColumnD,ColumnE\n');
fclose(out);
dlmwrite('output.csv', A, 'delimiter',',','-append');
Load it using csvread, skipping the header row:
data = csvread('output.csv',1,0);
data now contains your five columns, without any headers.