Writing .mat files to .nc - matlab

I have code that creates a bunch of .mat files, but I want to save them as netCDF files (CSV or TXT would be fine as well) so people who can't use MATLAB can access them. This is what I have so far:
%% Use function to read in
data = read_mixed_csv(filename,'"'); % Creates cell array of data
data = regexprep(data, '^"|"$',''); % Gets rid of double quotes at the start and end of the string
data = data(:,2:2:41); % Keep only the even cells because the odd ones are just commas
%% Sort data based on date (Column 1)
[Y,I] = sort(data(:,1)); % Create 1st column sorted
site_sorted = data(I,:); % Sort the entire array
%% Find unique value in the site data (Column 2)
% Format of site code is state-county-site
u_id = unique(site_sorted(:,2)); % get unique id
for i = 1:length(u_id)
idx=ismember(site_sorted(:,2),u_id{i}); % extract index where the second column matches the current id value
site_data = site_sorted(idx,:);
save([u_id{i} '.mat'],'site_data');
cdfwrite([u_id{i} '.nc'], 'site_data');
end
Everything works until the second-to-last line. I want to write each site_data as a netCDF file with the same base name as the .mat file saved by save([u_id{i} '.mat'],'site_data');, which is a string from the second column.

Try
cdfwrite([u_id{i}],{'site_data',site_data})
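The extension will be '.cdf'. I am not sure whether this can be changed when using cdfwrite. Note that cdfwrite produces a CDF (Common Data Format) file, which is a different format from netCDF.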
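If netCDF (or CSV) is the actual target, a minimal sketch might look like the following. It uses writecell for the whole cell array and nccreate/ncwrite for a numeric column. The sample rows, the variable name 'ozone', the dimension name 'time', and the assumption that column 3 is numeric are all made up for illustration; they are not taken from the question's data.

```matlab
% Hypothetical stand-in for one site's rows (cell array of character vectors)
site_data = {'2007-01-01', '19-001-0001', '0.031'; ...
             '2007-01-02', '19-001-0001', '0.028'};
site_id = '19-001-0001';                 % plays the role of u_id{i}

% Option 1: CSV of the whole cell array (writecell needs R2019a or newer)
csv_name = fullfile(tempdir, [site_id '.csv']);
writecell(site_data, csv_name);

% Option 2: netCDF for a numeric column (assumed here to be column 3)
values  = str2double(site_data(:,3));
nc_name = fullfile(tempdir, [site_id '.nc']);
if isfile(nc_name)
    delete(nc_name);                     % start from a clean file on reruns
end
nccreate(nc_name, 'ozone', 'Dimensions', {'time', numel(values)});
ncwrite(nc_name, 'ozone', values);

readback = ncread(nc_name, 'ozone');     % read back to check the round trip
```

Text columns are harder to store in netCDF than numbers, so for mixed data the CSV route is probably the simpler one.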

Related

Save the data in a form of three columns in text files

This function reads the data from multiple .mat files and saves them in multiple .txt files. However, each value ends up in its own column. I want to save the data as three columns (coordinates) in the text files, so that each row has three values separated by a space. Reshaping the data before saving it to a text file doesn't work. I know that dlmwrite should be modified in a way that starts a new line after every three values, but how?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar.Bodies.Positions(1,:,:);
%pos=reshape(pos,2,3,2000);
filename = sprintf('data%d.txt', q);
dlmwrite(filename , pos(q,:,:), 'delimiter','\t','newline','pc')
end
My data structure:
These data should be extracted from each mat file and stored in the corresponding text files like this:
332.68 42.76 42.663 3.0737
332.69 42.746 42.655 3.0739
332.69 42.75 42.665 3.074
A MathWorks trainer once told me that there is almost never a good reason or need to use eval. Here is a snippet that should solve your writing problem using writematrix, since dlmwrite is no longer recommended.
It also makes the file handling and loading more robust. You can access struct fields dynamically with the .(fieldName) notation, which is convenient when you know your field names. With who you can list the variables in the workspace, and also those stored in a .mat file.
Have a look:
% path to folder
pFldr = pwd;
% get a list of all mat-files (returns an array of structs)
Lst = dir( fullfile(pFldr,'*.mat') );
% loop over files
for Fl = Lst.'
% create path to file
pFl = fullfile( Fl.folder, Fl.name );
% variable to load
[~, var2load, ~] = fileparts(Fl.name);
% get names of variables inside the file
varInfo = who('-file',pFl);
% check if it contains the desired variables
if ~all( ismember(var2load,varInfo) )
% display some kind of warning/info
disp(strcat("the file ",Fl.name," does not contain all required variables and is therefore skipped."))
% skip / continue with loop
continue
end
% load | NO NEED TO USE eval()
Dat = load(pFl, var2load);
% DO WHATEVER YOU WANT TO DO
pos = squeeze( Dat.(var2load)(1,:,1:2000) );
% create file name for text file
pFl2save = fullfile( Fl.folder, strrep(Fl.name,'.mat','.txt') );
writematrix(pos,pFl2save,'Delimiter','\t')
end
To turn your 3-D array into a 2-D matrix that can be written cleanly to a file, use the function squeeze. It removes singleton dimensions (dimensions of length 1, here the first one) and returns a lower-dimensional array.
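As a small illustration of what squeeze does, here is a sketch on a made-up 2x3x4 array (the array, the variable names, and the output file name are hypothetical). Transposing the squeezed slice gives one row per sample with three values each, which is the layout asked for in the question.

```matlab
A     = reshape(1:24, [2 3 4]);   % hypothetical 2x3x4 array
slice = A(1,:,:);                 % size 1x3x4: leading singleton dimension
pos   = squeeze(slice);           % size 3x4: singleton dimension removed
rows  = pos.';                    % size 4x3: one row per sample, three values each

% Write one row of three space-separated values per line
out_name = fullfile(tempdir, 'squeeze_example.txt');
writematrix(rows, out_name, 'Delimiter', 'space');
```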
Why don't you use the writematrix() function?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar(1,:,1:2000);
filename = sprintf('data%d.txt', q);
writematrix(pos(q,:,:),filename,'Delimiter','space');
end
You can find more details here:
https://www.mathworks.com/help/matlab/ref/writematrix.html

Importing a txt file into Matlab and reading the data

I need to import a .txt datafile into Matlab. The file has been made into 3 columns. Each column has specific numbers for a given variable. The script code must be able to do the following,
Requirement
1) import the data from txt into Matlab
2) Matlab should remove the values from the columns if the values are out of a certain range
3) Matlab should tell which line and what type of error.
My Approach
I have tried using the following approach,
function data = insertData(filename)
filename = input('Insert the name of the file: ', 's');
data = load(filename);
Column1 = data(:,1);
Column2 = data(:,2);
Column3 = data(:,3);
%Ranges for each column
nclm1 = Column1(Column1>0);
nclm2 = Column2(Column2 >= 10 & Column2 <= 100);
nclm3 = Column3(Column3>0);
%Final new data columns within the ranges
final = [nclm1, nclm2, nclm3];
end
Problem
The above code has the following problems:
1) Matlab is not saving the imported data as 'data' after the user inserts the name of the file. Hence I don't know why my code is wrong.
filename =input('Insert the name of the file: ', 's');
data = load(filename);
2) The columns in the end do not have the same dimensions because I can see that Matlab removes values from the columns independently. Therefore is there a way in which I can make Matlab remove values/rows from a matrix rather than the three 'vectors', given a range.
1) Not sure what you mean by this. I created a sample text file and Matlab imports the data as data just fine. However, you are only returning the original unfiltered data so maybe that is what you mean??? I modified it to return the original data and the filtered data.
2) You need to OR the bad indices together so that the same rows are removed from every column, like this. Note that I made some other edits; see the comments in the code below:
function [origData, filteredData]= insertData(filename)
% You pass in filename then overwrite it ...
% Modified to only prompt if not passed in.
if ~exist('filename','var') || isempty(filename)
filename = input('Insert the name of the file: ', 's');
end
origData = load(filename);
% Range check for each column
% Note: return these if you want to know which rows were filtered for
% which reason
% A row is bad if a value falls OUTSIDE the valid range
badIdx1 = origData(:,1) <= 0;
badIdx2 = origData(:,2) < 10 | origData(:,2) > 100;
badIdx3 = origData(:,3) <= 0;
totalBad = badIdx1 | badIdx2 | badIdx3;
%Final new data columns within the ranges
filteredData = origData(~totalBad,:);
end
Note: you mentioned that you want to know which line has which type of error. That information is contained in badIdx1, badIdx2, and badIdx3, so you can return them, print a message to the screen, or do whatever you need to display that information.
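One way to report the line and the type of error could look like the sketch below. It flags values outside the ranges given in the question (column 1 > 0, column 2 between 10 and 100, column 3 > 0). The sample matrix, the message wording, and the variable names masks, reasons, and msgs are assumptions made for illustration.

```matlab
% Hypothetical data: rows 2, 3 and 4 each violate one range
origData = [ 1 50  2; ...
            -1 20  3; ...
             4  5  6; ...
             7 80 -2];

% One logical column per rule; true means the value is out of range
masks = [origData(:,1) <= 0, ...
         origData(:,2) < 10 | origData(:,2) > 100, ...
         origData(:,3) <= 0];
reasons = {'column 1 must be > 0', ...
           'column 2 must be in [10, 100]', ...
           'column 3 must be > 0'};

badRows = find(any(masks, 2)).';      % row numbers with at least one error
msgs = cell(1, numel(badRows));
for n = 1:numel(badRows)
    r = badRows(n);
    msgs{n} = sprintf('Line %d: %s', r, strjoin(reasons(masks(r,:)), '; '));
    fprintf('%s\n', msgs{n});
end

filteredData = origData(~any(masks, 2), :);   % keep only fully valid rows
```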

Adding size information of dataset to file name

I have several datasets, called '51.raw' '52.raw'... until '69.raw' and after I run these datasets in my code the size of these datasets changes from 375x91x223 to sizes with varying y-dimensions (i.e. '51.raw' output: 375x45x223; '52.raw' output: 375x50x223, ... different with each dataset).
I want to later save the '.raw' file name with this information (i.e. '51_375x45x223.raw') and also want to use the new dataset size to later reshape the dataset within my code. I have attempted to do this but need help:
for k=51:69
data=reshape(data,[375 91 223]); % from earlier in the code after importing data
% then executes code with the dimensions of 'data' changing to 375x45x223, ...
length=size(data); dimensions.([num2str(k)]) = length; %save size in 'dimensions'.
path=['C:\Example\'];
name= sprintf('%d.raw',k);
write([path name], data);
% 'write' is a function that saves the data to the specified path and name (value of k). I don't know how to add the size of the dataset to the name.
Also later I want to reshape the dataset 'data' for this iteration and do a reshape with the new y dimensions value.
i.e. data=reshape(data,[375 new y-dimension 223]);
Your help will be appreciated. Thanks.
You can easily convert the dimensions to a string and include it in the file name.
% Create a string of the form: dim1xdim2xdim3x...
dims = num2cell(size(data));
dimstr = sprintf('%dx', dims{:});
dimstr = dimstr(1:end-1);
% Append this to your "normal" filename
folder = 'C:\Example\';
filename = fullfile(folder, sprintf('%d_%s.raw', k, dimstr));
write(filename, data);
That being said, it is far better to include this dimension information within the file itself rather than relying on the filename.
As a side note, avoid using names of internal functions as variable names such as length, and path. This can potentially result in strange and unexpected behavior in the future.
Update
If you need to parse the filename, you could use textscan to do that:
filename = '1_2x3x4.raw';
nDims = sum(filename == 'x') + 1; % named nDims to avoid shadowing the built-in ndims
fspec = repmat('%dx', [1 nDims]);
parts = textscan(filename, ['%d_', fspec(1:end-1)]);
% Then load your data
% Now reshape it based on the filename
data = reshape(data, parts{2:end});
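As an alternative to textscan, the round trip from array size to file name and back can be sketched with strsplit and str2double. This is only an illustration: the small 3x4x5 array stands in for the real 375x45x223 data, and the names dimstr, base, and dims are hypothetical.

```matlab
data = zeros(3, 4, 5);                 % small stand-in for the real dataset
k    = 51;                             % file index, as in the question

% Build a name such as 51_3x4x5.raw
dimstr = sprintf('%dx', size(data));
dimstr(end) = [];                      % drop the trailing 'x'
name = sprintf('%d_%s.raw', k, dimstr);

% Later: recover the dimensions from the name and reshape
[~, base] = fileparts(name);           % '51_3x4x5'
tokens = strsplit(base, '_');          % {'51', '3x4x5'}
dims   = str2double(strsplit(tokens{2}, 'x'));

flat     = data(:);                    % pretend this was read from the file
restored = reshape(flat, dims);
```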

Name each variable differently in a loop

I have created a .dat file of file names. I want to read into MATLAB each file in that list and give the data a different name. Currently, each iteration just overwrites the last one.
I found that a lot of people give this answer:
for i=1:10
A{i} = 1:i;
end
However, it isn't working for my problem. Here's what I am doing
flist = fopen('fnames.dat'); % Open the list of file names
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:6 % Number of filenames in the .dat file
% For each file
fname = fgetl(flist); % Reads next line of list, which is the name of the next data file
disp(fname); % Stores name as string in fname
nt = nt+1; % Time index
% Save data
data{i} = read_mixed_csv(fname, '\t'); % Reads in the CSV file
data{i} = data(2:end,:); % Replace header row
end
end
The code runs with no errors, but only one data variable is saved.
My fnames.dat contains this:
IA_2007_MDA8_O3.csv
IN_2007_MDA8_O3.csv
MI_2007_MDA8_O3.csv
MN_2007_MDA8_O3.csv
OH_2007_MDA8_O3.csv
WI_2007_MDA8_O3.csv
If possible, I would really like to name data something more intuitive. Like IA for the first file, IN for the second and so on. Is there any way to do this?
The last line of the loop is the problem:
data{i} = data(2:end,:);
I did not run your code, so I don't know exactly what happens, but data(2:end,:) indexes the cell array data itself (the container holding all datasets), not the rows of the file you just read.
Try:
thisdata = read_mixed_csv(fname, '\t');
data{i} = thisdata(2:end,:);
If you want to keep track of what data came from which file, save out a second cell array with the names:
thisdata = read_mixed_csv(fname, '\t');
data{i} = thisdata(2:end,:);
names{i} = fname(1:2); % presuming you only need first two letters.
If you need a specific part of the filename that's not always the same length look into strtok or fileparts. Then you can use things like strcmp to check the cell array names for where the data labelled IA or whichever is stored.
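For example, a short sketch of extracting the label with strtok and looking it up with strcmp, using two of the file names from the question (the variable names fnames, names, and idx are just for illustration):

```matlab
fnames = {'IA_2007_MDA8_O3.csv', 'IN_2007_MDA8_O3.csv'};

names = cell(size(fnames));
for i = 1:numel(fnames)
    % strtok returns everything before the first '_' delimiter
    names{i} = strtok(fnames{i}, '_');
end

% Find which cell holds the data labelled 'IN'
idx = find(strcmp(names, 'IN'));
```

Here idx can then be used as data{idx} to fetch the matching dataset.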
As mentioned by @Daniel, the simple way to store data of various sizes is a cell array.
data{1} = thisdata(2:end,:)
However, if the names are really important, you could consider using a struct instead. For example:
dataStruct(1).numbers= thisdata(2:end,:);
dataStruct(1).name= theRelevantName
Of course you could also just add them to the cell array:
dataCell{1,1} = thisdata(2:end,:);
dataCell{1,2} = theRelevantName
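If you want to refer to each dataset by name (IA, IN, and so on), one more option is a struct with dynamic field names. The sketch below is only an illustration: the placeholder cell array stands in for the output of read_mixed_csv, and the name dataByState is made up. Field names must be valid MATLAB identifiers, which the two-letter state codes are.

```matlab
fnames = {'IA_2007_MDA8_O3.csv', 'IN_2007_MDA8_O3.csv'};

dataByState = struct();
for i = 1:numel(fnames)
    state = strtok(fnames{i}, '_');          % 'IA', 'IN', ...
    % Placeholder for read_mixed_csv(fnames{i}, '\t'); row 1 is a header
    thisdata = {'Date', 'Value'; '2007-01-01', sprintf('%d', i)};
    dataByState.(state) = thisdata(2:end,:); % dynamic field name
end
```

After the loop, dataByState.IA and dataByState.IN hold the data for each file without any need for eval.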

Converting a comma separated file to a matlab matrix

I have a comma separated file in the format:
Col1Name,Col1Val1,Col1Val2,Col1Val3,...Col1ValN,Col2Name,Col2Val1,...Col2ValN,...,ColMName,ColMVal1,...,ColMValN
My question is, how can I convert this file into something Matlab can treat as a matrix, and how would I go about using this matrix in a file? I suppose I could use some scripting language to format the file into Matlab matrix format and copy it, but the file is rather large (~7 MB).
Thanks!
Sorry for the edit:
The file format is:
Col1Name;Col2Name;Col3Name;...;ColNName
Col1Val1;Col2Val1;Col3Val1;...;ColNVal1
...
Col1ValM;Col2ValM;Col3ValM;...;ColNValM
Here is some actual data:
Press;Temp.;CondF;Cond20;O2%;O2ppm;pH;NO3;Chl(a);PhycoEr;PhycoCy;PAR;DATE;TIME;excel.date;date.time
0.96;20.011;432.1;431.9;125.1;11.34;8.999;134;9.2;2.53;1.85;16.302;08.06.2011;12:01:52;40702;40702.0.5
1;20.011;433;432.8;125;11.34;9;133.7;8.19;3.32;2.02;17.06;08.06.2011;12:01:54;40702;40702.0.5
1.1;20.012;432.7;432.4;125.1;11.34;9;133.8;8.35;2.13;2.2;19.007;08.06.2011;12:01:55;40702;40702.0.5
1.2;20.012;432.8;432.5;125.2;11.35;9.001;133.8;8.45;2.95;1.95;21.054;08.06.2011;12:01:56;40702;40702.0.5
1.3;20.012;432.7;432.4;125.4;11.37;9.002;133.7;8.62;3.17;1.87;22.934;08.06.2011;12:01:57;40702;40702.0.5
1.4;20.007;432.1;431.9;125.2;11.35;9.003;133.7;9.48;4.17;1.6;24.828;08.06.2011;12:01:58;40702;40702.0.5
1.5;19.997;432.3;432.2;124.9;11.33;9.003;133.8;8.5;3.84;1.79;27.327;08.06.2011;12:01:59;40702;40702.0.5
1.6;20;432.8;432.6;124.5;11.29;9.003;133.6;8.57;3.22;1.86;30.259;08.06.2011;12:02:00;40702;40702.0.5
1.7;19.99;431.9;431.9;124.4;11.28;9.002;133.6;8.79;3.7;1.81;35.152;08.06.2011;12:02:02;40702;40702.0.5
1.8;19.994;432.1;432.1;124.4;11.28;9.002;133.6;8.58;3.41;1.84;39.098;08.06.2011;12:02:03;40702;40702.0.5
1.9;19.993;433;432.9;124.6;11.3;9.002;133.6;8.59;3.45;5.53;45.488;08.06.2011;12:02:04;40702;40702.0.5
2;19.994;433;432.9;124.8;11.32;9.002;133.5;8.6;2.76;1.99;50.646;08.06.2011;12:02:05;40702;40702.0.5
If you don't know the number of rows and columns up front, you can't use the previous solution. Use this instead.
7 Mb is not large, it is small. This is the 21st century.
To read in to a matlab matrix:
text = fileread('file.name'); % a string with the entire file contents in it. 7 Mb is no big deal.
NAMES = {}; % we'll record column names here
VALUES = []; % this will be the matrix of values
while text(end) == ','
text(end)=[]; % eliminate any trailing commas
end
commas = find(text==','); % Index all the commas
commas = [0;commas(:);length(text)+1]; % put fake commas before and after text to simplify loop
col = 0; % which column are we in
I = 1;
while I<length(commas)
txt = text(commas(I)+1:commas(I+1)-1);
I = I+1;
num = str2double(txt);
if isnan(num) % this means it must be a column name
NAMES{end+1,1} = txt;
col = col+1; % can you believe Matlab doesn't support col++ ???
row = 1; % back to the top at each new column
continue % we have dealt with this txt, its not a num so ... next
end
% if we made it here we have a number
VALUES(row,col) = num;
row = row+1; % advance to the next row of the current column
end
You can then save the matrix VALUES, and the column names NAMES if you want them, to a .mat file:
save('mymatrix.mat','VALUES','NAMES'); % saves matrix and column names to .mat file
You get stuff back in to matlab when you want it from the file by:
load mymatrix.mat; % loads VALUES and NAMES from .mat file
Some limitations:
You can't use commas in your column header names.
You cannot "name" a column something like "898.2" or anything which can be read as a double number, it will be read in as a number.
If your columns have different lengths, the shorter ones will be padded with zeros to the length of the longest column.
That's all I can think of.
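For the semicolon-separated layout shown in the edited question (one header line, then one row per sample), a shorter sketch using readmatrix (R2019a or newer) may be enough. The file below is a tiny made-up sample with numeric columns only; the real file also has date and time columns, which would need separate handling.

```matlab
% Write a small hypothetical sample file: header line plus two data rows
fname = fullfile(tempdir, 'profile_example.csv');
fid = fopen(fname, 'w');
fprintf(fid, 'Press;Temp;CondF\n0.96;20.011;432.1\n1;20.011;433\n');
fclose(fid);

% Column names from the first line
fid = fopen(fname, 'r');
headerLine = fgetl(fid);
fclose(fid);
NAMES = strsplit(headerLine, ';');

% Numeric values, skipping the header line
VALUES = readmatrix(fname, 'Delimiter', ';', 'NumHeaderLines', 1);
```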