How to import selected fields from a MATLAB struct into Simulink?

I have a .mat file containing a struct with a bunch of data (about 80 fields, each thousands of rows long). I also have a Simulink model to process some of the data. Is there a way to import only some of the fields of the struct into Simulink? (That is, the data from a specific field would go into an Inport or something else that lets me process it as a signal.) I only need about 8 of the fields, not all 80 of them.
I thought about using a timetable, but then I read that it requires a single column of data, so that wouldn't work: I would have multiple columns, each representing a different signal/variable.
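For reference, one common approach is to convert just the needed fields into timeseries objects and feed them to From Workspace blocks. A minimal sketch, assuming the struct is called S, that it has a common time vector S.time, and that the field names below are hypothetical:

load('mydata.mat');                       % loads the struct, here called S
wanted = {'speed', 'torque', 'current'};  % the ~8 fields actually needed
for i = 1:numel(wanted)
    % one timeseries per field; a From Workspace block accepts a timeseries
    assignin('base', wanted{i}, timeseries(S.(wanted{i}), S.time));
end

Each From Workspace block's "Data" parameter would then name one of these variables.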

Related

data processing of the .mat result from Dymola simulation

I am trying to do data processing with the .mat result from Dymola. My plan is to use MATLAB. I have a few questions about the .mat file:
If I load the .mat file into MATLAB directly, the data structure is very strange, and I have to use the MATLAB scripts shipped with Dymola to load it. Is there an explanation of how the data is stored in the .mat file?
When plotting the results, I want to change the units, but I am not sure how to make Dymola output the data in the units I want. Is there any setting that changes the units when Dymola writes data into the .mat file?
Regarding the file format, note that there is a utility to convert the MAT files to a simple HDF5-based format, if that makes post-processing easier. There are scripts for both MATLAB and Python to read such files (extension SDF).
You can get an explanation for the basic data-structure of the result file if you generate a textual result file (it might also be somewhere in the documentation), and the most relevant part is:
Matrix with 4 columns defining the data of the signals:
dataInfo(i,1) = j: data for name i is stored in matrix "data_j". (1,1) = 0 means that name(1) is used as abscissa for ALL data matrices!
dataInfo(i,2) = k: data for name i is stored in column abs(k) of matrix data_j, with sign(k) used as sign.
And to simplify things: there are at most two data-matrices, and the abscissa used for ALL data matrices is "Time".
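Read literally, that description gives a lookup along these lines (a sketch only; when you load the binary .mat directly, the data matrices may appear transposed, which is part of why the raw structure looks strange):

j = dataInfo(i,1);                 % signal i lives in matrix data_j
k = dataInfo(i,2);                 % signed column index within data_j
if j == 2
    trace = sign(k) * data_2(:, abs(k));
else
    trace = sign(k) * data_1(:, abs(k));
end
time = data_2(:, 1);               % the abscissa "Time" is the first column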
You cannot currently directly output mat-files in specific units.
However, you can output csv-files using specific units.
The structure of the .mat files created by Dymola is introduced here briefly. What you should know about this file format is that Dymola keeps the simulation variables in two different mat-file variables, in this way:
The names and descriptions of ALL of the variables are kept in two separate variables inside the .mat file, i.e. name and description.
If a variable has a constant value that does not change over time (like a parameter), it is kept inside the data_1 variable in the .mat file.
Otherwise, it is kept inside the data_2 variable in the .mat file.
Storing variable data like this is a technique used by Dymola to get the best performance while saving simulation data to large files.
For reading these mat files created by Dymola without using Dymola itself, you can use a .mat reader library like MATIO to read the data and then interpret the results on your own.

How can I import ground truth data into Matlab for the training of a (faster) R-CNN?

I have a large, labelled dataset which I have created, and I would like to provide it to MATLAB to train an R-CNN (using the faster R-CNN algorithm).
How can this be done?
The built-in labeller provided by MATLAB requires the user to manually load each data sample and label it with a graphical user interface.
This is not practical for me, as the set is already labelled and it contains 500,000 samples.
It should be noted that I can control the format in which the dataset is stored, so I can create .csv or Excel files if needed.
I have tried two directions:
1. Creating a mat file, similar to the one created by the labeller.
2. Looked for ways within Matlab to import the data from .csv or excel files.
I have had no success with either method.
For Direction 1:
Though there are many libraries that can open mat files, they are not able to open or create files similar to the MATLAB ground truth objects, because these are not simple matrices (the cells themselves contain matrices of varying dimensions that represent the bounding boxes of each classified object). Moreover, though the MATLAB Level 5 file format is openly documented, I have not been successful in using that documentation to write my own code (C# or C++) to parse and write such files.
For Direction 2:
There are generic methods in MATLAB to load .csv and Excel files, but I do not know how to organize these files in such a way as to produce the structure that the labeller creates and that is consumed by the fasterRCNN trainer.
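For direction 2, note that trainFasterRCNNObjectDetector does not require the labeller's .mat file at all: it accepts a plain table whose first variable holds image file names and whose remaining variables hold M-by-4 [x y width height] bounding-box matrices, one variable per class. A rough sketch of building such a table from a CSV (all file, class, and column names here are hypothetical):

% assumed CSV columns: imageFilename, x, y, width, height (one row per box)
T = readtable('labels.csv');
[files, ~, idx] = unique(T.imageFilename);
vehicle = cell(numel(files), 1);         % a single object class, called 'vehicle'
for i = 1:numel(files)
    rows = (idx == i);
    vehicle{i} = [T.x(rows) T.y(rows) T.width(rows) T.height(rows)];
end
gt = table(files, vehicle, 'VariableNames', {'imageFilename', 'vehicle'});
% detector = trainFasterRCNNObjectDetector(gt, layers, options);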

Load netcdf subset in Matlab

G'day,
I have ocean model output in the form of netCDF files. The netCDF files are approximately 21GB, and the variables that I want to load are also pretty big (~ 120 * 31 * 300 * 400 sized matrices).
I want to load some of these variables from a netCDF file into MATLAB. Usually, I would do this via:
ncload('filename.nc',var1)
which would load the variable var1 into a similarly named MATLAB variable. However, since I only need a single column of var1, I want to load only a subset of var1 - this should speed up the loading process. For example, say,
size(var1)
>> var1 120x31x260x381
I only want the 31st column, and loading the other 30 columns only to discard them seems like a waste of time. In other words, this is what I want to accomplish: ncload('filename.nc',var1(:,31,:,:)).
I know there are a few different netCDF toolboxes floating around, and I have heard that one can use a stride flag to only load every xth entry... but I'm not sure if it's possible to do what I want. Does anyone know of a way to do this?
Cheers
If you have a current version of MATLAB, look for NCREAD and the example therein.
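NCREAD takes explicit start/count vectors, so only the slice you ask for is read from disk. A sketch for the example above (using 'var1' as named in the question):

% start at index 31 along dimension 2; read everything along dims 1, 3, 4
subset = ncread('filename.nc', 'var1', [1 31 1 1], [Inf 1 Inf Inf]);
subset = squeeze(subset);   % drop the singleton dimension, giving 120x260x381

An optional fifth stride argument gives the every-xth-entry behaviour mentioned in the question.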

Need a method to store a lot of data in Matlab

I've asked this before, but I feel I wasn't clear enough so I'll try again.
I am running a network simulation, and I have several hundred output files. Each file holds the simulation's test results for different parameters.
There are 5 different parameters and 16 different tests for each simulation. I need a method to store all this information (and again, there's a lot of it) in MATLAB for the purpose of plotting graphs using a script. Suppose the script input is parameter_1 and test_2; then I get a graph where parameter_1 is the X axis and test_2 is the Y axis.
My problem is that I'm not very familiar with MATLAB, and I need to be pointed in the right direction so it doesn't take me forever (I'm short on time).
How do I store this information in Matlab? I was thinking of two options:
Each output file is imported separately into a different variable (matrix)
All output files are merged into one output file and imported together. In the resulting matrix, each line is a different output file and each column is a different test. Problem is, I don't know how to store the simulation parameters
Edit: maybe I can use a dataset?
So, I would appreciate any suggestion of how to store the information, and what functions might help me fetch only the data I need.
If you're still looking to give MATLAB a try with this problem, you can iterate through all the files and import them one by one. You can create a list of the contents of a folder with the function
ls(name)
and you can import data like this:
A = importdata(filename)
if your data is in txt files, you should consider this previous question
A good strategy to avoid cluttering your workspace is to import them all into a single cell array. So if you have a cell array called VAR, then VAR{1,1}.results could be where you put your test results and VAR{1,1}.params could be where you put the simulation parameters of the first file. I think that is simpler than making a separate data structure. Just make sure you place the information uniformly in the same fields and indexes. You could also organize VAR's rows and columns by parameter vs. test.
This is more along the lines of your first suggestion
Each output file is imported separately to a different variable
(matrix)
Your second suggestion seems unnecessary since you can just iterate through your files.
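A minimal sketch of that loop (assuming the output files are plain-text files in one folder; the folder name, extension, and field names here are hypothetical):

files = dir('results/*.txt');                 % list the output files
VAR = cell(numel(files), 1);
for i = 1:numel(files)
    A = importdata(fullfile('results', files(i).name));
    VAR{i}.results = A;                       % test results from file i
    VAR{i}.params  = files(i).name;           % e.g. parameters encoded in the name
end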
You can use the command save to store your data.
It is very convenient, and can store as much data as your hard disk can bear.
The documentation is here:
http://www.mathworks.fr/help/techdoc/ref/save.html
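For example, to write the variable VAR from the sketch above to disk and restore it later:

save('simdata.mat', 'VAR');    % writes VAR to simdata.mat
load('simdata.mat');           % restores VAR into the workspace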
What is the format of your text files? If they have a systematic format, you can use dlmread or similar commands in MATLAB to read a text file into a matrix. From there, you can plot easily. If you try to do it in Excel, it will be much slower than reading from a text file, so if speed is an issue for you, I suggest that you don't go for Excel.
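For instance, if each output file were a comma-delimited text file of numbers (a hypothetical layout):

M = dlmread('run_01.txt', ',');    % rows = simulation runs, columns = tests
plot(M(:,1), M(:,2));              % e.g. parameter_1 on X, test_2 on Y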

Learning decision trees on huge datasets

I'm trying to build a binary classification decision tree out of huge datasets (i.e. ones which cannot be stored in memory) using MATLAB. Essentially, what I'm doing is:
Collect all the data
Try out n decision functions on the data
Pick out the best decision function to separate the classes within the data
Split the original dataset into 2
Recurse on the splits
The data has k attributes and a classification, so it is stored as a matrix with a huge number of rows, and k+1 columns. The decision functions are boolean and act on the attributes assigning each row to the left or right subtree.
Right now I'm considering storing the data in files, in chunks which can be held in memory, and assigning an ID to each row, so that the decision to split is made by reading all the files sequentially and future splits are identified by the ID numbers.
Does anyone know how to do this in a better fashion?
EDIT: The number of rows m is around 5e8 and k is around 500
At each split, you are breaking the dataset into smaller and smaller subsets. Start with the single data file. Open it as a stream and just process one row at a time to figure out which attribute you want to split on. Once you have your first decision function, split the original data file into 2 smaller data files that each hold one branch of the split data. Recurse. The data files should become smaller and smaller until you can load them in memory. That way, you don't have to tag rows and keep jumping around in a huge data file.
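A minimal sketch of that streaming split (assuming comma-separated rows of numbers; the file names, split column, and threshold are hypothetical stand-ins for whatever decision function the first pass selects):

splitCol  = 3;                          % attribute chosen during the scan
threshold = 0.5;                        % cut point for the decision function
fin    = fopen('data.csv', 'r');
fLeft  = fopen('left.csv', 'w');
fRight = fopen('right.csv', 'w');
line = fgetl(fin);
while ischar(line)
    row = sscanf(line, '%f,');          % k attributes + class label
    if row(splitCol) < threshold        % send the row to one branch file
        fprintf(fLeft, '%s\n', line);
    else
        fprintf(fRight, '%s\n', line);
    end
    line = fgetl(fin);
end
fclose(fin); fclose(fLeft); fclose(fRight);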