Extract correct data from a file in MATLAB

During an integration process with variable step size (specifically, the ode113 integrator is used), the position of a body is determined from its acceleration, which is stored in a file along with time (i.e. two columns, one for time and one for the acceleration). However, because there are failed integration steps during the process, the acceleration file has more rows than the corresponding position file. How can I extract the correct rows from the acceleration data and create a new file with the same rows as the position data file?

Is the incorrect data always distributed in a logical way? You can create a search function that looks at the indices of the matrix, then cut out the correct data by copying it to a new variable of the now-correct length. Possibly expensive in terms of run time and data storage, but definitely surefire.
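If the rows that survived integration are exactly those whose time stamps also appear in the position file, a simpler route is to keep only the matching acceleration rows. A minimal sketch, assuming plain two-column ASCII files (the file names are hypothetical):

    % Keep only acceleration rows whose time stamps also appear in the
    % position file; failed steps leave extra rows with unmatched times.
    pos = load('position.dat');        % columns: [t, x, ...]
    acc = load('acceleration.dat');    % columns: [t, a]

    % ismembertol guards against floating-point round-off; if the time
    % stamps are written identically in both files, plain ismember works.
    keep = ismembertol(acc(:,1), pos(:,1));

    accClean = acc(keep, :);
    save('acceleration_clean.dat', 'accClean', '-ascii', '-double');

The result has one acceleration row per position row, in the same order as the original file.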

Related

How to read a double value from a .txt file every n milliseconds in Simulink for a real-time application?

What I'm doing is a real-time application for visual servoing that needs to read the X and Y values of a coordinate while it runs. As the simulation runs, these values are changed over time by a Python script, and I need to read the changing values in Simulink every n milliseconds (say 20 ms) to move some DC motors based on the said (X, Y) coordinates.
To sum up, what I need is a way to read values from an external file in Simulink at run time. It could be a .txt or any other kind of file, but it has to read the changing values in real time.
I'd totally appreciate your help!
I've tried to read a .txt file from a user-defined Simulink block, but the functions I used, like fscanf, are not supported in Simulink.
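One common workaround is to declare the file I/O functions extrinsic inside a MATLAB Function block, so they run in the MATLAB interpreter instead of being compiled; note this works in Normal simulation mode, not in generated real-time code. A minimal sketch, where the file name and format are assumptions; give the block a discrete sample time of 0.02 s so it reads every 20 ms:

    function y = readCoords()
    % Sketch of a MATLAB Function block body. File I/O is not supported
    % for code generation, so the calls are declared extrinsic and run
    % in the MATLAB interpreter (Normal simulation mode only).
    coder.extrinsic('fopen', 'fscanf', 'fclose');

    y = zeros(2, 1);                 % predeclare output size and type
    vals = zeros(2, 1);              % predeclare so the mxArray returned
                                     % by the extrinsic call converts

    fid = 0;
    fid = fopen('coords.txt', 'r');  % hypothetical file written by Python
    vals = fscanf(fid, '%f', 2);     % expects one "X Y" pair per read
    fclose(fid);

    y = vals;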

What's the most efficient way to use large data from Excel in my C# code?

I ran a computer simulation of my pendulum to measure the time taken to reach the lowest point, for every velocity and every angle.
As you can imagine there is a lot of data: thousands of lines covering all angles and velocities.
On every frame, I will be measuring the velocity and angle of the pendulum, and will look for the closest data in my Excel spreadsheet.
How can I go about this to make sure it's not too CPU-intensive?
Should I create a massive array where every element corresponds to a certain angle? For example, myArray[30] would hold all velocities and times for my data between 30.0 and 30.999 degrees. (That way lots of if statements are avoided.)
Or should I keep everything in my Excel spreadsheet?
Any suggestion?
The best approach in my opinion would be to divide your data into intervals based on its distribution, since you have to access that data every frame. Then, when you measure the velocity and angle, you can find the matching interval and access only that part of your data.
I would find the maximum and minimum of your data points while importing into Unity, and then size the intervals as (maximum - minimum) / numOfIntervals. Let's say your interval size is 5 for each angle. When you get an angle of 17, you can do (int)(17 / 5) = 3 (assuming indexes start from zero) and go to the item at index 3 in your structure. This can be a dictionary or an array of instances of an arbitrary class, depending on your data.
I can try to help further if you can share the structure of your data. But in my opinion, an even distribution of data across the intervals is important. A sketch of the lookup idea follows.
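To make the indexing concrete, here is the bucketing idea sketched in MATLAB notation, consistent with the rest of this page; the same integer arithmetic carries over to C#. The file name, column layout, and bin count are all assumptions:

    % Pre-sort the rows into angle buckets once, at load time.
    data = readmatrix('pendulum.xlsx');   % columns: [angle, velocity, time]

    angMin  = min(data(:,1));
    angMax  = max(data(:,1));
    nBins   = 100;
    binSize = (angMax - angMin) / nBins;

    binIdx  = min(floor((data(:,1) - angMin) / binSize) + 1, nBins);
    buckets = accumarray(binIdx, (1:size(data,1))', [nBins 1], ...
                         @(i){data(i,:)});

    % Per-frame lookup touches only one small bucket, not the whole table.
    angle  = 17.3;
    b      = min(floor((angle - angMin) / binSize) + 1, nBins);
    subset = buckets{b};        % search only here for the nearest velocity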

Simulink: Simulating using from file block. Choosing a variable start point

I have a control model in Simulink which consists of two blocks: one that takes some inputs and generates three signals, x, y, z, as arrays (a trajectory) and feeds them to the second block as a reference for the control.
I would like to be able to run this using a recorded trajectory. I have simulated the trajectory (by running the simulation once) and written the data to a .mat file (signals plus timestamps). I can remove the first block and feed the .mat file to the second control block, and it works fine.
The trajectory is a loop. My question is: I would like to be able to start the simulation at any point in the file, and I am not familiar with how Simulink manages time. If I want to start from a different point, what do I need to do? And can I make it continuous, so that if I start from point N-1 in the file it will proceed through N and wrap back to 1, 2, 3, etc.?
Thanks,
Bryan
If you want to start at a different time point, you won't be able to use the .mat file that you have created directly. You'll need to load the data into MATLAB and shift the time vector so that t = 0 corresponds to the data point you want to start with.
Since you want to repeat the sequence, you most likely want to use the Repeating Sequence block. This would require you to load the data into MATLAB (and do the time alignment) anyway; a sketch of that step follows.
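A minimal sketch of the load/shift/wrap step, assuming the .mat file holds a time vector t and an N-by-3 trajectory matrix xyz sampled at a fixed rate (all names are hypothetical):

    % Re-base time so the chosen sample becomes t = 0, and wrap the
    % earlier part of the loop onto the end so the sequence stays closed.
    s   = load('trajectory.mat');
    t   = s.t(:);
    xyz = s.xyz;

    k  = 100;                        % hypothetical start index in the loop
    dt = t(2) - t(1);                % assumes a uniform sample time

    xyzWrapped = [xyz(k:end, :); xyz(1:k-1, :)];
    tWrapped   = (0:numel(t)-1)' * dt;   % fresh time axis starting at 0

    simin = [tWrapped, xyzWrapped];  % [time, x, y, z], ready to feed in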

Simulink From Workspace: can't use timestamps from matrix

I'm using the simulink block From Workspace to read in some audio data provided by a script. I have formatted the data in a matrix with 2 columns, the first is the timestamp and the second is the data.
In the configuration parameters, I have specified Fixed-Step and the Discrete solver. The Start time and Stop time also need to be configured manually and don't seem to come from the data.
Also, in the From Workspace block configuration, I need to specify the sample time (1/44100); if I specify -1 to inherit it from the data, I get a warning and then strange sample times.
So, how can I get Simulink to use only the sample times in the matrix, and use the first and last timestamps as the start and stop times of the simulation?
You should be able to do what you want by doing the following:
Firstly, note that your problem is by definition not fixed-step, hence you cannot use a fixed-step solver, which by definition is ... fixed-step.
You must use a variable step solver.
Assuming your (2 column) input data is called simin then set the start and stop times to be simin(1,1) and simin(end,1) respectively.
In your From Workspace block set the sample time to be 0 (which should have been the default).
Also de-select the Interpolate data option, and set "Form output after final data value by:" to Zero (you won't be using anything past the end of your data set, so this should be OK).
Then you need to tell the solver to take additional steps to those that it would naturally want to take.
Do this on the Data Import/Export pane of the Model Configuration Parameters.
Near the bottom of the pane there is a selection box and an edit box for doing this.
Note however that this does not prevent the solver from taking steps at other time points, it just forces it to take additional steps at the times you specify.
But because you have set your From Workspace block not to interpolate, this shouldn't be a problem either. You should put simin(:,1) in here so that the solver is guaranteed to take steps at the time points in your input data. The settings are collected in the sketch below.
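For reference, the same settings can be applied programmatically with set_param; a sketch, where the model and block names ('myModel', 'From Workspace') are hypothetical:

    % simin is the 2-column [time, data] matrix in the base workspace.
    mdl = 'myModel';
    load_system(mdl);

    set_param(mdl, 'SolverType', 'Variable-step');
    set_param(mdl, 'StartTime', 'simin(1,1)', 'StopTime', 'simin(end,1)');

    % From Workspace block: sample time 0, no interpolation, zero after
    % the final data value.
    blk = [mdl '/From Workspace'];
    set_param(blk, 'SampleTime', '0', ...
                   'Interpolate', 'off', ...
                   'OutputAfterFinalValue', 'Setting to zero');

    % Force additional solver steps at the input time points (the
    % Data Import/Export settings described above).
    set_param(mdl, 'OutputOption', 'AdditionalOutputTimes', ...
                   'OutputTimes', 'simin(:,1)');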
Note that if you want an input block that only samples at the time points in the simin time vector then the only way to do this is to write an S-function that uses the mdlGetTimeOfNextVarHit method to tell the solver what the next sample time (for this block) should be.

Learning decision trees on huge datasets

I'm trying to build a binary classification decision tree out of huge (i.e. which cannot be stored in memory) datasets using MATLAB. Essentially, what I'm doing is:
Collect all the data
Try out n decision functions on the data
Pick out the best decision function to separate the classes within the data
Split the original dataset into 2
Recurse on the splits
The data has k attributes and a classification, so it is stored as a matrix with a huge number of rows, and k+1 columns. The decision functions are boolean and act on the attributes assigning each row to the left or right subtree.
Right now I'm considering storing the data in files, in chunks that can be held in memory, and assigning an ID to each row, so that the decision to split is made by reading all the files sequentially and future splits are identified by the ID numbers.
Does anyone know how to do this in a better fashion?
EDIT: The number of rows m is around 5e8 and k is around 500
At each split, you are breaking the dataset into smaller and smaller subsets. Start with the single data file. Open it as a stream and just process one row at a time to figure out which attribute you want to split on. Once you have your first decision function, split the original data file into 2 smaller data files that each hold one branch of the split data. Recurse. The data files should become smaller and smaller until you can load them in memory. That way, you don't have to tag rows and keep jumping around in a huge data file.
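A sketch of the streamed split described above; the chunk size, file names, and decisionFcn handle are assumptions, and the column count follows the EDIT (k = 500 attributes plus a label):

    function splitFile(parentFile, leftFile, rightFile, decisionFcn)
    % Read the parent file in chunks, apply the chosen decision function,
    % and append each row to the left or right child file.
    fidIn    = fopen(parentFile, 'r');
    fidLeft  = fopen(leftFile,  'w');
    fidRight = fopen(rightFile, 'w');
    chunkRows = 1e5;                       % rows held in memory at once
    fmt = repmat('%f', 1, 501);            % 500 attributes + 1 label
    while ~feof(fidIn)
        C = textscan(fidIn, fmt, chunkRows);
        M = [C{:}];
        if isempty(M), break; end
        mask = decisionFcn(M(:, 1:end-1)); % logical vector, one per row
        writeRows(fidLeft,  M(mask, :));
        writeRows(fidRight, M(~mask, :));
    end
    fclose(fidIn); fclose(fidLeft); fclose(fidRight);
    end

    function writeRows(fid, M)
    fprintf(fid, [repmat('%g ', 1, size(M,2)) '\n'], M.');
    end

Each child file then becomes the parentFile of the next recursion level, until a file is small enough to load into memory whole.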