Read text file in MATLAB for data analysis - matlab

I have uploaded the file here. These are some lines from my txt file:
RSN1146_KOCAELI_AFY000 1.345178e-02
RSN1146_KOCAELI_AFY090 1.493577e-02
RSN1146_KOCAELI_AFYDWN 5.350641e-03
RSN4003_SANSIMEO_25862-UP 4.869095e-03
RSN4003_SANSIMEO_25862090 1.199087e-02
RSN4003_SANSIMEO_25862360 1.181286e-02
I would like to remove the data with DWN on 3rd line and -UP in 4th line. So the data will only have:
RSN1146_KOCAELI_AFY000 1.345178e-02
RSN1146_KOCAELI_AFY090 1.493577e-02
RSN4003_SANSIMEO_25862090 1.199087e-02
RSN4003_SANSIMEO_25862360 1.181286e-02
Then, I want to obtain the maximum value for RSN1146 & RSN4003.
I tried to read the file with the code below:
Data=fopen('maxPGA.txt','r');
readfile=fscanf(Data,'%c %s')
It is weird as I cannot perform further analysis as the data is not imported as 2 column in MATLAB, any solution for this?
I tried:
Data= importdata('maxPGA.txt')
as well, but the data are grouped into 2 different table in this case.

Related

How to perform processinng on .csv files used by my code for each person in one go without defining functions for each person in Matlab?

I have a main code named as process.m in which I define the path to 4 different .csv files for calculating values for each person. If I have a list of 30 persons and I don't want to define process.m as a function for each person, how can I do the processing for all the persons in one go. I want some idea by which process.m itself picks files for one person, then generate the results, then move to other person, pick his .csv files, generate the result and so on.
A breif outline of my code is attached here that would project the problem.
file1='Joseph_front.csv';
file2='Joseph_side.csv';
file3='Joseph_knee.csv';
file4='Joseph_back.csv';
A1=initiate2(file1); %initiate2 function reads the csv and perfoms some filtering operations on image in .csv format
A2=initiate2(file2);
A3=initiate2(file3);
A4=initiate2(file4);
%%omitted large part of code
cal(1) = p+q+r*s;
cal(2) = p+q+r+s;
cal(3) = p+q+r-s;
cal=cal'
%code to write the calculation in excel file
excelfile= 'test.xlsx';
xlswrite(excelfile,ValuesInInches,'Joseph_data',posColumn);
Describing more i want my code to process for 30 people all at once by selecting and picking the files itself, although i have done this operation by making the same code as a function for each person, but that is not very efficient as when I have to make a small change I have to make it in every function that means one change needs to be edited in 30 functions. Please suggest an efficient way to do it.
Note: All persons .csv files are named in the same manner and exist in the current folder.
I am assuming all of your files in one directory and there's no other files .
This portion of code will get the available filenames.
listFiles = dir('path of the directory');
filenames = strings; % an empty string array to save the filenames
j = 1;
for i = 1:1:length(listFiles)
if ~listFiles(i).isdir % to avoid the directory names
filenames(j,1) = listFiles(i).name;
j = j+1;
end
end
Now, there's 4 file for each person. So the loop should take 4 files at a time.
for ii = 1:4:length(filenames)
file1=filename(i);
file2=filename(i+1);
file3=filename(i+2);
file4=filename(i+3);
%% continue with your code
end

Export data to same CSV file for multiple runs of .m file

I am running a MATLAB program and storing the results in two matrices. For each run of the program, those matrices are written to the same .csv file.
How can I continue to store data to the same file for future runs of the program? Is there a function that checks for data already being present to avoid overwriting cells?
t = 0.0001*[0:70];
v = B_2*R_R.*exp(-alpha.*t).*sin(omega_d.*t);
tv = [t; v].';
csvwrite('thedata.csv',tv,3,0)
I couldn't resist rewriting your code a bit.
This should be equivalent to what you have, and print both vectors to thedata.csv.
t = 0.0001*[0:70];
v = B_2*R_R.*exp(-alpha.*t).*sin(omega_d.*t);
tv = [t; v].';
csvwrite('thedata.csv',tv,3,0)
Due to the way csv-files are stored, you can only append data at the end of the file, which happens to be the bottom row, not the last column. What you should do, is concatenate all data before writing to the csv-file. That way you avoid multiple calls to csvwrite or dlmwrite (they are time consuming).
If that's impossible, then I suggest reading the data from the csv-file, using csvread, append the new data to the data you retrieve, and write it all back again.
csvwrite('thedata.csv',tv)
mydata = csvread('thedata.csv');
mydata2 = [mydata, tv2];
csvwrite('thedata.csv',mydata2)

Matlab save sequence of mat files from convertTDMS stored in cell array to sequence of mat files

I have data stored in the .tdms format, gathering the data of many sensors, measured every second, every day. A new tdms file is created every day, and stored in a folder per month. Using the convertTDMS function, I have converted these tdms files to mat files.
As there are some errors in some of the measurements(e.g. negative values which can not physically occur), I have performed some corrections by loading one mat file at a time, do the calculations and then save the data into the original .mat file.
However, when I try to do what I described above in a loop (so: load .mat in folder, do calculations on one mat file (or channel therein), save mat file, repeat until all files in the folder have been done), I end up running into trouble with the limitations of the save function: so far I save all variables (or am unable to save) in the workspace when using the code below.
for k = 1:nFiles
w{k,1} = load(wMAT{k,1});
len = length(w{k,1}.(x).(y).(z));
pos = find(w{k,1}.(x).(y).(z)(1,len).(y)<0); %Wind speed must be >0 m/s
for n = 1:length(pos)
w{k,1}.(x).(y).(z)(1,len).(y)(pos(n)) = mean([w{k,1}.(x).(y).(z)(1,len).(y)(pos(n)+1),...
w{k,1}.(x).(y).(z)(1,len).(y)(pos(n)-1)],2);
end
save( name{k,1});
%save(wMAT{k,1},w{k,1}.(x),w{k,1}.ConvertVer,w{k,1}.ChanNames);
end
A bit of background information: the file names are stored in a cell array wMAT of length nFiles in the folder. Each cell in the cell array wMAT stores the fullfile path to the mat files.
The data of the files is loaded and saved into the cell array w, also of length nFiles.
Each cell in "w" has all the data stored from the tdms to mat conversion, in the format described in the convertTDMS description.
This means: to get at the actual data, I need to go from the
cell in the cell array w{k,1} (my addition)
to the struct array "ConvertedData" (Structure of all of the data objects - part of convertTDMS)
to the struct array below called "Data" (convertTDMS)
to the struct array below called "MeasuredData" (convertTDMS) -> at this level, I can access the channels which store the data.
to finally access/manipulate the values stored, I have to select a channel, e.g. (1,len), and then go via the struct array to the actual values (="Data"). (convertTDMS)
In Matlab format, this looks like "w{1, 1}.ConvertedData.Data.MeasuredData(1, len).Data(1:end)" or "w{1, 1}.ConvertedData.Data.MeasuredData(1, len).Data".
To make typing easier, I took
x = 'ConvertedData';
y = 'Data';
z = 'MeasuredData';
allowing me to write instead:
w{k,1}.(x).(y).(z)(1,len).(y)
using the dot notation.
My goal/question: I want to load the values stored in a .mat file from the original .tdms files in a loop to a cell array (or if I can do better than a cell array: please tell me), do the necessary calculations, and then save each 'corrected' .mat file using the original name.
So far, I have gotten a multitude of errors from trying a variety of solutions, going from "getfieldnames", trying to pass the name of the (dynamically changing) variable(s), etc.
Similar questions which have helped me get in the right direction include Saving matlab files with a name having a variable input, Dynamically Assign Variables in Matlab and http://www.mathworks.com/matlabcentral/answers/4042-load-files-containing-part-of-a-string-in-the-file-name-and-the-load-that-file , yet the result is that I am still no closer than doing manual labour in this case.
Any help would be appreciated.
If I understand your ultimate goal correctly, I think you're pretty much there. I think you're trying to process your .mat files and that the loading of all of the files into a cell array is not a requirement, but just part of your solution? Assuming this is the case, you could just load the data from one file, process it, save it and then repeat. This way you only ever have one file loaded at a time and shouldn't hit any limits.
Edit
You could certainly make a function out of your code and then call that in a loop, passing in the file name to modify. Personally I'd probably do that as I think it's neater solution. If you don't want to do that though, you could just replace w{k,1} with w then each time you load a file w would be overwritten. If you wanted to explicitly clear variables you can use the clear command with a space separated list of variables e.g. clear w len pos, but I don't think that this is necessary.

Reading structured variable from MAT file

I am performing an analysis which involves simulation of over 1000 cases. I extracting lots of data for each case as well (about 70MB). Currently I am saving the results for each case as:
Vessel.TotalForce
Vessel.WindForce
Vessel.CurrentForce
Vessel.WaveForce
Vessel.ConnectionForce
...
Line1.EffectiveTension
Line1.X
Line1.Y
Line2.EfectiveTension
Line2.X
Line2.Y
...
save('CaseNo1.mat')
Now, I need to perform my analysis for CaseNo1.mat to CaseNo1000. Initially I planned to create a Database.mat file by loading all cases in it and then accessing any variable using h5read. This way Matlab doesn't need to load all the data at a time. However, I am concerned now that my database file will be too big.
Is there any way I can read the structured variables from individual case files for example CaseNo1.mat without loading the CaseNo1.mat file in memory.
Matlab examples shows loading just the variables directly from MAT file without loading the whole MAT file. But I am not sure how to read structures data the same way.
x=load('CaseNo1.mat','Line1.X')
says Line1.X not found. But it's there. The command is not correct to access the data. Also tried using h5read, but it says CaseNo1.mat is not an HDF5 file.
Can anyone help with this.
Apart from this, I would also appreciate if there is any suggestion about performing such data intensive analysis.
I was wrong! I'm leaving my old answer for context, though I've edited it to reference this one. I thought I had used matfile() in that way before, but I hadn't. I just did a thorough search and ran a few test cases. You've actually run into a limitation of the way Matlab handles and references structures stored in .mat files. There is, however, a solution. It does involve some refactoring of your original code, but it shouldn't be too egregious.
Vessel_TotalForce
Vessel_WindForce
Vessel_CurrentForce
Vessel_WaveForce
Vessel_ConnectionForce
...
Line1_EffectiveTension
Line1_X
Line1_Y
Line2_EfectiveTension
Line2_X
Line2_Y
...
save('CaseNo1.mat')
Then to access, just use matfile (or load) as you were before. Like so:
Vessel_WaveForce = load('CaseNo1.mat'', 'Vessel_WaveForce')
It's important to note that this restriction doesn't appear to be caused by anything you've chosen to do in your program, but rather is imposed by the way Matlab interacts with it's native storage files when they contain structures.
EDIT: This answer works, but doesn't actually solve the problem posed in OP's question. I thought I had used matfile to generate a handle that I could access, but I was wrong. See my other answer for details.
You could use matfile, like so:
myMatFileHandle = matfile('caseNo1.mat');
thisVessel = myMatFileHandle.vessel;
Also, from the little bit I can see, you seem to be on the right track for high-volume analysis. Just remember to use sparse when applicable, and generally avoid conditionals inside of loops if possible.
Good luck!
The objective of storing data in structured format is:
To be organized
Easy scripting post processor where looping through data under one data set it required.
To store structured dataset containing integer, floating and string variables in MAT file and to be able to read just the required variable using h5read command was sought. Matlab load command is not able to read variable beyond first level from stored data in a MAT file. The h5write couldn't write string variables. Hence needed a work around to solve this problem.
To do this I have used following method:
filename = 'myMatFile';
Vessel.TotalForce = %store some data
Vessel.WindForce = %store some data
Vessel.CurrentForce = %store some data
Vessel.WaveForce = %store some data
Vessel.ConnectionForce = %store some data
...
Lin1.LineType = 'Wire'
Line1.ArcLength_0.EffectiveTension = %store some data
Line1.ArcLength_50.EffectiveTension= %store some data
Line1.ArcLength_100.EffectiveTension= %store some data
Lin2.LineType = 'Chain'
Line2.ArcLength_0.EffectiveTension= %store some data
Line2.ArcLength_50.EffectiveTension= %store some data
Line2.ArcLength_100.EffectiveTension= %store some data
save([filename '_temp.mat']);
PointToMat=matfile([filename '.mat'],'Writable',true);
PointToMat.(char(filename)) = load([filename '_temp.mat']);
delete([filename '_temp.mat']);
Now to read from the MAT file created, we can use h5read as usual. To extract the EffectiveTension for Line1, ArcLength_0:
EffectiveTension = h5read([filename '.mat'],['/' filename '/Line1/ArcLength_0/EffectiveTension']);
For string variables, h5read returns decimal values corresponding to each character. To obtain the actual string I used:
name = char(h5read([filename '.mat'],['/' filename '/Line1/LineType']));
Tried this method on my data set which is about 200MB and I could process them pretty fast. Hope this would help someone someday.
Short answer:
Having saved the data into a MAT file with the '-v7.3' option, use something like h5read(filename, '/Line2/X') to read just one structure field. You can even read an array partially, for example:
s.a = 1:100;
save('test.mat', '-v7.3', 's');
clear
h5read('test.mat', '/s/a', [1 10], [1 5], [1 3])
returns each third element of the 1:100 array, starting with the 10th element and returning 5 values:
10 13 16 19 22
Long answer:
See answer by #Amitava for the more elaborate code and topic coverage.

extracting data from within xml files using MATLAB

I am a complete programming beginner trying to learn MATLAB. I want to extract numerical data from a bunch of different xml files. The numerical data items are bounded by the tags and . How do I write a program in MATLAB?
My algorithm:
1. Open the folder
2. Look into each of 50 xml files, one at a time
3. Where the tag <HNB.1></HNB.1> exists, copy numerical contents between said tag and write results into a new file
4. The new file name given for step 3 should be the same as the initial file name read in Step 2, being appended with "_data extracted"
example:
FileName = Stewart.xml
Contents = blah blah blah <HNB.1>2</HNB.1> blah blah
NewFileName = Stewart_data extracted.txt
Contents = 2
The fundamental function in MATLAB to read xml data is xmlread; but if you're a complete beginner, it can be tricky to work with just that. Try this series of blog postings that show you how to put it all together.
Suppose you want to read this file:
<PositiveSamples numImages="14">
<image numSubRegions="2" filename="TestingScene.jpg">
<subregion yStart="213" yEnd="683" xStart="1" xEnd="236"/>
<subregion yStart="196" yEnd="518" xStart="65" xEnd="226"/>
</image>
</PositiveSamples>
Then in matlab, read the file contents as follows:
%read xml file
xmlDoc = xmlread('PositiveSamples.xml');
%Get root element
root = xmlDoc.getDocumentElement();
%Read attributevale
numOfImages = root.getAttribute('numImages');
numOfImages = char(numOfImages);
numOfImages = uint16(eval(numOfImages));