Constructing a structure in MATLAB

Could someone please help me in constructing this structure?
As one can see, I have to import this file from Excel and convert it into a structure as shown. I have been able to write code that creates a structure when I enter an ID number manually. For instance, if I enter 150, it gives a structure for 150 as shown.
However, I have to automate the process, i.e. I want MATLAB to identify the unique IDs and create the entire structure, with the data and timing for each ID.
Here's my code
function [ DynData ] = myfunction( filename )
[dat1, dat2, dat3] = xlsread(filename);
flds = dat3(1,:);
InputData = cell2struct(dat3(2:end,:),flds,2);
uIDs = unique( cell2mat(dat3(2:end, 2))) ;
for j = 1:length(dat3)
uIDs = dat3(j);
i=1;
for k = 2:length(dat3(:,1))
if dat3{k,2} == {uIDs}
IDnumber = ['ID',num2str(uIDs)];
DynData.(IDnumber).time(1,i) = dat3{k,1};
DynData.(IDnumber).ID(1,i) = dat3{k,2};
DynData.(IDnumber).data(1,i) = dat3(k,3);
i=i+1;
end
end
end
end
Any help will be really appreciated!

I prefer to load my .xlsx into tables for easier access.
Try this function:
UPDATE
If id is a number, you cannot use it directly as a field name in your structure, because the first character of a field name must be a letter. You can fix this by using a prefix:
function dynData=tableImport(file)
% Imports data from .xls file with columns 'Time' 'Id' 'Data' into
% structure dynData with fieldnames=unique(id) each of which is in turn a
% struct array with fields Time and Data.
% Get data
tab=readtable(file);
% Group by id
G=findgroups(tab.Id);
% Extract id and structure
[id, st]=splitapply(@groupTable,tab,G);
% Build dynData
fieldNames=strsplit(sprintf('id%i\n',id));
for i=1:length(id)
dynData.(fieldNames{i})=st{i};
end
end
function [ID, struc]=groupTable(time,id,data)
% Function for finding id-name and splitting time and data into a struct.
ID=id(1);
struc={struct('Time',num2cell(time),'Data',data)}; % stored in cell
end
Although if I were you I'd rather store my data in a structure array with fieldnames
id (string)
time (array)
data (array/cellarray)
for easier access in for-loops
UPDATE 2
This is what I tried to suggest as a better structure:
function dynData=tableImport2(file)
% Imports data from .xls file with columns 'Time' 'Id' 'Data' into
% struct array dynData with one element per unique id and the fields
% Id, Time and Data.
% Get data
tab=readtable(file);
% Group by id
G=findgroups(tab.Id);
% Extract id and structure
dynData=splitapply(@(time,id,data)struct('Id',id(1),'Time',time,'Data',{data}),tab,G);
end
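If it helps, here is a minimal sketch of the same struct-array layout built with a plain loop over unique instead of splitapply. The small in-memory table is a made-up stand-in for the imported spreadsheet (its values are not from the original file), and the variable names rows and ids are only illustrative.

```matlab
% Hypothetical stand-in for readtable(file): columns Time, Id, Data.
tab = table([0.1; 0.2; 0.3; 0.4], [150; 150; 200; 200], {'a'; 'b'; 'c'; 'd'}, ...
    'VariableNames', {'Time', 'Id', 'Data'});

% G(r) is the index into ids of the id found in row r.
[ids, ~, G] = unique(tab.Id);

% Preallocate one struct element per unique id.
dynData = struct('Id', cell(numel(ids), 1), 'Time', [], 'Data', []);
for k = 1:numel(ids)
    rows = (G == k);                 % logical mask of the rows for this id
    dynData(k).Id   = ids(k);
    dynData(k).Time = tab.Time(rows);
    dynData(k).Data = tab.Data(rows);
end

% Example of the "easier access in for-loops" mentioned above.
for k = 1:numel(dynData)
    fprintf('Id %d has %d samples\n', dynData(k).Id, numel(dynData(k).Time));
end
```

Because the id is stored as a field value rather than as a field name, no 'id' prefix is needed with this layout.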

Related

Extract structures in a loop

I have a structure ('data'), with 26 fields (A, B, C, D, etc.). Each field contains 1x30 cells (one for each participant), and each cell contains a structure.
I would like to extract all the structures (i.e., one structure per field) corresponding to each participant. That is, I would like to obtain 30 new ‘data’, each with 26 fields, and each field containing 1x1 structure, with the structure corresponding to the participant. I have tried the following code:
data = load('D:\filepath\mydata.mat'); %load file with data. 1x1 struct.
all_fields = fieldnames(data); %store the fields of the structure. 26x1 cell.
forStr = length(all_fields); %26
n_ppts = 30; %total number of participants.
%for each participant, extract the corresponding structure in each field.
for nn = 1:n_ppts
for idx_field = 1:forStr
name_field = all_fields{idx_field};
data2 = data;
data2.(name_field) = data.(name_field){nn};
end
%save the 'data' for each participant. The 'data' should include 26 fields, and 1 structure for each field.
name = ppt_fname(nn); %Generate the new name for saving
savename =string(regexprep(name,'_oldname.set','_newname.mat'));
save(savename, '-struct', 'data');
end
The code doesn’t give any error. However, it doesn’t run as I expected.
‘data2’ still contains 26 fields, but only the last field contains 1 structure corresponding to the participant. The other fields contain 1x30 cell.
I guess it is because every time I run the loop it overwrites the previous fields, leaving only the last field correct. So, I think I might need a temporary variable where to store each iteration of the loop.
I thought of creating a temporary structure to hold each field:
structure = [];
namelist = {‘A’;’B’;’C’;’D’;’E’;’F’;’G’;’H’;’I’;’J’;’K’;’L’;’M’;’N’;’O’;’P’;’Q’;’R’;’S’;’T’;’U’;’V’;’W’;’X’;’Y’;’Z’};
for i = 1:length(namelist)
structure.(namelist{i})={};
end
But cannot figure out how to make it work.
You need to move the line data2 = data; out of the inner for loop, so that it no longer runs once per field.
Antoine T is right: you copy the original structure data to data2 again on every iteration, so the changes made for the previous fields are discarded (except in the very last iteration of the inner loop, where a single field is replaced). Note also that your save call writes data rather than data2.
Regarding your other problem:
% create empty struct:
S = struct();
% loop
for i = 0:25
% create field name ('A' for i = 0 up to 'Z' for i = 25)
nm = char( double('A') +i );
% create new field with empty cell.
S.(nm) = {};
end
It is convenient to generate the field names by converting numbers to chars. Your primary error was that you used curly (typographic) quotation marks instead of straight single quotes to create the chars.
A minor flaw, though, was that you initialized structure = [] as an empty matrix rather than as an empty struct.
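Putting the first point together, a minimal sketch of the per-participant extraction could look like this. The two-field, three-participant data struct is invented for illustration, and perPpt is a hypothetical variable used here in place of saving each result to disk.

```matlab
% Hypothetical stand-in for the loaded data: 2 fields, 3 participants each.
data.A = {struct('x', 1),  struct('x', 2),  struct('x', 3)};
data.B = {struct('x', 10), struct('x', 20), struct('x', 30)};

all_fields = fieldnames(data);
n_ppts = numel(data.(all_fields{1}));
perPpt = cell(1, n_ppts);            % one extracted struct per participant

for nn = 1:n_ppts
    data2 = struct();                % reset once per participant, not per field
    for idx_field = 1:numel(all_fields)
        name_field = all_fields{idx_field};
        data2.(name_field) = data.(name_field){nn};
    end
    perPpt{nn} = data2;              % or: save(savename, '-struct', 'data2');
end
```

Starting from an empty struct for each participant means every field of data2 is filled by the inner loop, so nothing from the 1x30 cells is left behind.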

Importing a txt file into Matlab and reading the data

I need to import a .txt datafile into Matlab. The file has been made into 3 columns. Each column has specific numbers for a given variable. The script code must be able to do the following,
Requirement
1) import the data from txt into Matlab
2) Matlab should remove the values from the columns if the values are out of a certain range
3) Matlab should tell which line and what type of error.
My Approach
I have tried using the following approach,
function data = insertData(filename)
filename = input('Insert the name of the file: ', 's');
data = load(filename);
Column1 = data(:,1);
Column2 = data(:,2);
Column3 = data(:,3);
%Ranges for each column
nclm1 = Column1(Column1>0);
nclm2 = Column2(Column2 >= 10 & Column2 <= 100);
nclm3 = Column3(Column3>0);
%Final new data columns within the ranges
final = [nclm1, nclm2, nclm3];
end
Problem
The above code has the following problems:
1) Matlab is not saving the imported data as 'data' after the user inserts the name of the file. Hence I don't know why my code is wrong.
filename =input('Insert the name of the file: ', 's');
data = load(filename);
2) The columns in the end do not have the same dimensions because I can see that Matlab removes values from the columns independently. Therefore is there a way in which I can make Matlab remove values/rows from a matrix rather than the three 'vectors', given a range.
1) I am not sure what you mean by this. I created a sample text file and MATLAB imports the data as data just fine. However, your function only returns the original, unfiltered data, so maybe that is what you mean. I modified it to return both the original data and the filtered data.
2) You need to OR the bad indices together so that whole rows are removed and the columns keep the same length. Note that I made some other edits; see the comments in the code below:
function [origData, filteredData]= insertData(filename)
% You pass in filename then overwrite it ...
% Modified to only prompt if not passed in.
if ~exist('filename','var') || isempty(filename)
filename = input('Insert the name of the file: ', 's');
end
origData = load(filename);
% Ranges check for each column
% Note: return these if you want to know which rows were filtered out
% and for which reason. Each index flags values OUTSIDE the allowed range.
badIdx1 = origData(:,1) <= 0;
badIdx2 = origData(:,2) < 10 | origData(:,2) > 100;
badIdx3 = origData(:,3) <= 0;
totalBad = badIdx1 | badIdx2 | badIdx3;
%Final new data columns within the ranges
filteredData = origData(~totalBad,:);
end
Note: you mentioned that you want to know which line had which type of error. That information is now contained in badIdx1, badIdx2 and badIdx3, so you can return them, print a message to the screen, or do whatever you need to display that information.
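As a rough sketch of that last step, the loop below prints one message per offending line. It assumes the ranges from the question (column 1 > 0, column 2 between 10 and 100, column 3 > 0) and uses a made-up 4x3 matrix in place of the loaded file; the message wording is only an example.

```matlab
% Hypothetical sample data standing in for load(filename).
origData = [ 1  50  2;
            -1  20  3;
             4   5  6;
             2 200 -7];

% Flag the values that fall outside the allowed range of each column.
badIdx1 = origData(:,1) <= 0;
badIdx2 = origData(:,2) < 10 | origData(:,2) > 100;
badIdx3 = origData(:,3) <= 0;
totalBad = badIdx1 | badIdx2 | badIdx3;

% Report every bad line together with the reason(s) it was rejected.
for r = find(totalBad)'
    if badIdx1(r), fprintf('Line %d: column 1 must be > 0\n', r); end
    if badIdx2(r), fprintf('Line %d: column 2 must be in [10, 100]\n', r); end
    if badIdx3(r), fprintf('Line %d: column 3 must be > 0\n', r); end
end

% Keep only the rows in which all three columns are valid.
filteredData = origData(~totalBad, :);
```

With this sample only the first row survives, and line 4 is reported twice because it violates two of the ranges.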

Adding size information of dataset to file name

I have several datasets, called '51.raw' '52.raw'... until '69.raw' and after I run these datasets in my code the size of these datasets changes from 375x91x223 to sizes with varying y-dimensions (i.e. '51.raw' output: 375x45x223; '52.raw' output: 375x50x223, ... different with each dataset).
I want to later save the '.raw' file name with this information (i.e. '51_375x45x223.raw') and also want to use the new dataset size to later reshape the dataset within my code. I have attempted to do this but need help:
for k=51:69
data=reshape(data,[375 91 223]); % from earlier in the code after importing data
% then executes code with dimensions of 'data' changing to 375x45x223, ...
length=size(data); dimensions.([num2str(k)]) = length; %save size in 'dimensions'.
path=['C:\Example\'];
name= sprintf('%d.raw',k);
write([path name], data);
% 'write' is a function to save the data in the specified path and name (value of k). I don't know how to add the size of the dataset to the name.
Also later I want to reshape the dataset 'data' for this iteration and do a reshape with the new y dimensions value.
i.e. data=reshape(data,[375 new y-dimension 223]);
Your help will be appreciated. Thanks.
You can easily convert your dimensions to a string and include it in the filename.
% Create a string of the form: dim1xdim2xdim3x...
dims = num2cell(size(data));
dimstr = sprintf('%dx', dims{:});
dimstr = dimstr(1:end-1);
% Append this to your "normal" filename
folder = 'C:\Example\';
filename = fullfile(folder, sprintf('%d_%s.raw', k, dimstr));
write(filename, data);
That being said, it is far better to include this dimension information within the file itself rather than relying on the filename.
As a side note, avoid using the names of built-in functions, such as length and path, as variable names. This can potentially result in strange and unexpected behavior in the future.
Update
If you need to parse the filename, you could use textscan to do that:
filename = '1_2x3x4.raw';
numDims = sum(filename == 'x') + 1; % not called ndims, to avoid shadowing the built-in
fspec = repmat('%dx', [1 numDims]);
parts = textscan(filename, ['%d_', fspec(1:end-1)]);
% Then load your data
% Now reshape it based on the filename
data = reshape(data, parts{2:end});
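For what it is worth, here is a small round-trip sketch that builds the name and then recovers the dimensions from it. It uses regexp and str2double rather than textscan, purely as an alternative. The 3x4x2 array and k = 51 are placeholder values, and the custom write function from the question is not called.

```matlab
% Placeholder data and index; the real data would come from your processing.
data = zeros(3, 4, 2);
k = 51;

% Build 'dim1xdim2xdim3' from the current size and embed it in the name.
dimstr = strjoin(arrayfun(@num2str, size(data), 'UniformOutput', false), 'x');
name = sprintf('%d_%s.raw', k, dimstr);

% Later: recover the dimensions from the name and reshape the flat data.
tok = regexp(name, '^\d+_([\dx]+)\.raw$', 'tokens', 'once');
dims = str2double(strsplit(tok{1}, 'x'));
flat = data(:);                      % stands in for the data read back from disk
restored = reshape(flat, dims);
```

Passing the recovered sizes to reshape as one numeric vector avoids having to know the number of dimensions in advance.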

Copy a data from a file into a structure

I need help designing a data structure to store the storm data from the file. I am going to use these field names: code, amount, duration, and intensity. The intensity is the rainfall amount divided by the duration. (What should I do in order to calculate the intensity?) I loaded the data into the variable "mydata", then tried to copy my data into a vector of structs called "vecdata".
My final vector of structures should have the same number of elements as the number of rows in the data file. Additionally, it should have the 4 fields with the field names I mentioned above.
% Creating an example data file
anum = randi([3,10]);
thedata = [randi([1,350],anum,1),rand(anum,1)*5,rand(anum,1)*15];
save mydata.dat thedata -ascii
clear
% loaded the data using the load function into "mydata":
mydata = load('mydata.dat')
% Tried to copy the data from "mydata" into a vector of structures called "vecdata":
vecdata = [struct('code','amount','duration','intensity')];
This is a very general question. How can I copy the data from the file above? Rows of mydata must match the # of elements in vecdata. How do I check this?
%first of all, calculate the missing intensity (amount divided by duration)
mydata(:,end+1)=mydata(:,2)./mydata(:,3);
%define the fieldnames for your struct
labels={'code','amount','duration','intensity'};
%there is no array2struct, so we convert to cell first and use cell2struct.
%the result has one struct element per row of mydata.
vecdata=cell2struct(num2cell(mydata),labels,2);
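To check that the number of elements matches the number of rows, something along these lines should do. The three rows of numbers are invented so that the sketch runs on its own, and the columns are assumed to be code, amount and duration, as in the question.

```matlab
% Hypothetical storm data: columns are code, amount, duration.
mydata = [101 2.5 5.0;
          202 1.0 4.0;
          303 3.0 1.5];

% Intensity is the rainfall amount divided by the duration.
mydata(:, 4) = mydata(:, 2) ./ mydata(:, 3);

labels = {'code', 'amount', 'duration', 'intensity'};
vecdata = cell2struct(num2cell(mydata), labels, 2);

% One struct element per data row, each with the four expected fields.
assert(numel(vecdata) == size(mydata, 1));
assert(isequal(fieldnames(vecdata), labels'));

fprintf('Storm %d has intensity %.2f\n', vecdata(2).code, vecdata(2).intensity);
```

The two assert calls raise an error if the sizes or field names ever disagree, which is usually enough of a check for this kind of conversion.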

Writing .mat files to .nc

I have a code that creates a bunch of .mat files, but I want to save them as netcdf files (csv or txt would be fine as well) so people who can't use MATLAB can access them. This is what I have so far
%% Use function to read in
data = read_mixed_csv(filename,'"'); % Creates cell array of data
data = regexprep(data, '^"|"$',''); % Gets rid of double quotes at the start and end of the string
data = data(:,2:2:41); % Keep only the even cells because the odd ones are just commas
%% Sort data based on date (Column 1)
[Y,I] = sort(data(:,1)); % Create 1st column sorted
site_sorted = data(I,:); % Sort the entire array
%% Find unique value in the site data (Column 2)
% Format of site code is state-county-site
u_id = unique(site_sorted(:,2)); % get unique id
for i = 1:length(u_id)
idx=ismember(site_sorted(:,2),u_id{i}); % extract index where the second column matches the current id value
site_data = site_sorted(idx,:);
save([u_id{i} '.mat'],'site_data');
cdfwrite([u_id{i} '.nc'], 'site_data');
end
Everything works until the second to last line. I want to write each 'site_data' as a netcdf file with the same name as save([u_id{i} '.mat'],'site_data');, which is a string from the second column.
Try
cdfwrite([u_id{i}],{'site_data',site_data})
The extension will be '.cdf'. I am not sure if this can be changed while using cdfwrite. Note that cdfwrite produces a CDF (Common Data Format) file, which is a different format from netCDF.
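Since you mention that csv would be fine as well, here is a minimal sketch that writes a cell array of strings to a comma-separated text file with fopen and fprintf. It assumes every cell of site_data holds a char vector without embedded commas; the two sample rows and the temporary output name are made up for illustration.

```matlab
% Hypothetical site_data: every cell is assumed to be a char vector.
site_data = {'2015-01-01', '01-001-0001', '12.5';
             '2015-01-02', '01-001-0001', '13.1'};

% In the loop from the question this could be [u_id{i} '.csv'] instead.
outName = [tempname '.csv'];

fid = fopen(outName, 'w');
for r = 1:size(site_data, 1)
    % Join the cells of one row with commas and end the line.
    fprintf(fid, '%s\n', strjoin(site_data(r, :), ','));
end
fclose(fid);
```

The resulting file opens in any spreadsheet program or text editor, so people without MATLAB can read it.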