What's the most efficient way to use large data from Excel in my C# code? - unity3d

I ran a computer simulation of my pendulum to measure the time taken to reach the lowest point, for every velocity and every angle.
As you can imagine there is a lot of data: thousands of lines covering all angles and velocities.
On every frame, I will be measuring the velocity and angle of the pendulum, and will look for the closest data in my Excel spreadsheet.
How can I go about this to make sure it's not too CPU-intensive?
Should I create a massive array where every element corresponds to a certain angle: for example, myArray[30] would hold all velocities and times for all my data between 30.0 and 30.999 degrees? (That way it will avoid lots of if statements.)
Or should I keep everything in my Excel spreadsheet?
Any suggestions?

The best approach in my opinion would be dividing your data into intervals based on its distribution, since you have to access that data every frame. Then, when you measure the velocity and angle, you can go straight to the matching interval and access only that part of your data.
I would find the maximum and minimum of your data points while importing to Unity and then divide that range into (maximum - minimum) / NumOfIntervals buckets. Let's say your interval size is 5 degrees per bucket. When you get an angle of 17 you can do (int)(17 / 5) = 3 (assuming indexes start from zero) and go to the item at index 3 in your structure. This can be a Dictionary or an array of instances of some class, depending on your data.
I can try to help further if you can share the structure of your data. But in my opinion an even distribution of data across the intervals is important.
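To make that concrete, here is a minimal C# sketch of the bucketing idea, assuming the spreadsheet has already been exported into a list of (angle, velocity, time) rows; the type names, field names and units below are placeholders, not something from the original question:

using System;
using System.Collections.Generic;

// Hypothetical representation of one exported spreadsheet row.
public struct PendulumSample
{
    public float Angle;     // degrees
    public float Velocity;  // whatever unit the simulation used
    public float Time;      // time to reach the lowest point
}

public class PendulumLookup
{
    private readonly List<PendulumSample>[] buckets;
    private readonly float minAngle;
    private readonly float intervalSize;

    // Build the buckets once (e.g. on scene load), not every frame.
    public PendulumLookup(IEnumerable<PendulumSample> samples,
                          float minAngle, float maxAngle, int numIntervals)
    {
        this.minAngle = minAngle;
        intervalSize = (maxAngle - minAngle) / numIntervals;

        buckets = new List<PendulumSample>[numIntervals];
        for (int i = 0; i < numIntervals; i++)
            buckets[i] = new List<PendulumSample>();

        foreach (var s in samples)
            buckets[IndexFor(s.Angle)].Add(s);
    }

    private int IndexFor(float angle)
    {
        int index = (int)((angle - minAngle) / intervalSize);
        if (index < 0) index = 0;
        if (index >= buckets.Length) index = buckets.Length - 1;
        return index;
    }

    // Per-frame query: only the bucket matching the measured angle is scanned.
    // Assumes every bucket is non-empty (i.e. the data is evenly distributed).
    public PendulumSample Closest(float angle, float velocity)
    {
        var bucket = buckets[IndexFor(angle)];
        PendulumSample best = bucket[0];
        float bestDiff = float.MaxValue;
        foreach (var s in bucket)
        {
            float diff = Math.Abs(s.Velocity - velocity);
            if (diff < bestDiff) { bestDiff = diff; best = s; }
        }
        return best;
    }
}

If memory allows, you could also sort each bucket by velocity once at load time so the inner search becomes a binary search instead of a linear scan.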

Related

More Efficient Way of Calculating Population from Data Grid and overlapping Polygon?

Hi folks! Apologies if this is a duplicate question; I've done some research on the topic but don't know if I'm heading in the right direction.
I have converted gridded population-density data to a MongoDB collection, using a geometry object that defines each population-density cell as a five-node polygon (the fifth node matching the first) plus a float value holding the population of that geographic region. Even though the database is huge, I can quickly retrieve the "records" of the population regions, as they are indexed with a 2D-sphere index, whenever they intersect a geo-polygon describing some kind of weather event or other geofence polygon.
The issue comes when I try to add all of the boxes up. It takes an exceedingly long time, especially if the polygon covers a significant geographic area. The population data I have are 1 km^2 cells. Adding up the data can take several seconds or, in the worst case, minutes!
I had the thought of creating a kind of quadtree structure in the database: a lower-resolution node set stored as a separate collection, and so on. Then, when calculating population, I could start with the lowest-resolution set and work my way down the node "tree" with several database calls until there are no more matches. While I'd increase my database calls significantly, I'd reduce the sheer number of elements that I need to add up at the end, which is what takes the most computational time.
I could try to create these data using bottom-up neighbor finding whilst adding up the four population values that would make up the next lower-resolution node set. This, of course, will explode the database size and will increase the number of queries to the database for a single population request.
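The bottom-up aggregation itself is just grid arithmetic. As a purely illustrative sketch (in C#, to match the main question's language, and with made-up names), assuming the 1 km cells have been keyed by integer (row, col) grid indices, one coarser level could be built like this:

using System.Collections.Generic;

// Illustrative only: build one coarser level by summing each 2x2 block of fine cells.
public static class GridAggregator
{
    public static Dictionary<(int Row, int Col), double> CoarsenOneLevel(
        Dictionary<(int Row, int Col), double> fineCells)
    {
        var coarse = new Dictionary<(int Row, int Col), double>();
        foreach (var kvp in fineCells)
        {
            // Each coarse cell covers a 2x2 block of fine cells.
            var key = (kvp.Key.Row / 2, kvp.Key.Col / 2);
            coarse.TryGetValue(key, out double sum);
            coarse[key] = sum + kvp.Value;
        }
        return coarse;
    }
}

Repeating this gives the pyramid of node sets described above; storing each level back into MongoDB (or PostGIS) with its own geo-index is left out here.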
I haven't seen much of this done with databases. I'd like to keep it in a database (it could also be PostgreSQL) since that gives me the ability to quickly geo-query by point or area. And I'm returning the result as an API call, so time efficiency is of the essence!
Any advice or places to research would be greatly appreciated!!!

Compare all data in the database at the same time (real time)

I have a problem with my Android app. I have a value x (whatever it is) and I have data in a database, and I want to compare the value of x with all the data in the database at the same time, in real time.
The app uses SQLite.
I used a loop, but when the database is large my app lags while comparing all the data. My code is:
public void Check_Distance(Location Current_Location, ArrayList<Location> LocationArrayList1)
{
    double Distance;
    for (int i = 0; i < LocationArrayList1.size(); i++)
    {
        Distance = distanceBetween(Current_Location, LocationArrayList1.get(i));
        if (Distance <= 0.1 * 1000) { // if distance is less than 100 m, play a sound
            Notification_Sound();
        }
    }
}
You can't literally look at every record in the database at the exact same time; that kind of massive parallelism isn't something a phone (or SQLite) will give you.
That being said, you can make your algorithm more efficient, though that takes some effort. Both of the approaches below are based on the same idea: very quickly eliminate the majority of the locations that are obviously too far away, and perform the more expensive distance check only on those that could be in range.
One method is to sort the locations in ascending order into two arrays, one by North/South and the other by East/West. Find all entries within a given distance of the current position in each list, then combine the results to get the points inside a box of X distance around the location. This box will contain a much smaller number of points, to which you can then apply an iterative, circular, distance-based check.
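As a rough sketch of that prefiltering idea (the question's code is Java, but C# is used here to stay consistent with the rest of the thread; all names and the radius handling are illustrative): keep the locations sorted by latitude, binary-search the slice that can possibly be in range, and run the exact distance check only on that slice. The matching East/West list from the answer is omitted for brevity.

using System;
using System.Collections.Generic;

public struct GeoPoint
{
    public double Lat;
    public double Lon;
}

public static class NearbyFilter
{
    // Roughly 111 km per degree of latitude.
    private const double MetersPerDegreeLat = 111000.0;

    // 'byLat' must be sorted by Lat. Returns only the candidates whose latitude
    // puts them within 'radiusMeters' of 'current'; the exact distance check
    // (e.g. haversine or a platform call) is then run on this much smaller set.
    public static IEnumerable<GeoPoint> Candidates(List<GeoPoint> byLat,
                                                   GeoPoint current,
                                                   double radiusMeters)
    {
        double latDelta = radiusMeters / MetersPerDegreeLat;
        int lo = LowerBound(byLat, current.Lat - latDelta);
        int hi = LowerBound(byLat, current.Lat + latDelta);
        for (int i = lo; i < hi; i++)
            yield return byLat[i];
    }

    // Index of the first element whose Lat is >= value (binary search).
    private static int LowerBound(List<GeoPoint> byLat, double value)
    {
        int lo = 0, hi = byLat.Count;
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (byLat[mid].Lat < value) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }
}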
Another is to create a quadtree. This subdivides the map area into a set of bounding volumes, where each volume holds either a set of points or further bounding volumes. You can then lay down your search area and find all the quadtree volumes that intersect your circular search region, greatly reducing the number of locations you need to run a true distance check on.
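And a minimal quadtree sketch of that second idea (again in C# purely for illustration; a production version would also guard against many identical points, support removal, and so on):

using System.Collections.Generic;

// Minimal point quadtree: each node covers an axis-aligned box and splits
// into four children once it holds more than 'Capacity' points.
public class QuadTree
{
    private const int Capacity = 16;

    private readonly double minX, minY, maxX, maxY;
    private readonly List<(double X, double Y)> points = new List<(double X, double Y)>();
    private QuadTree[] children; // null until this node splits

    public QuadTree(double minX, double minY, double maxX, double maxY)
    {
        this.minX = minX; this.minY = minY;
        this.maxX = maxX; this.maxY = maxY;
    }

    public void Insert(double x, double y)
    {
        if (children != null)
        {
            ChildFor(x, y).Insert(x, y);
            return;
        }
        points.Add((x, y));
        if (points.Count > Capacity)
            Split();
    }

    // Collect every stored point inside the query box. The caller still runs
    // the exact distance check on the (small) result set.
    public void Query(double qMinX, double qMinY, double qMaxX, double qMaxY,
                      List<(double X, double Y)> result)
    {
        if (qMaxX < minX || qMinX > maxX || qMaxY < minY || qMinY > maxY)
            return; // no overlap with this node

        foreach (var p in points)
            if (p.X >= qMinX && p.X <= qMaxX && p.Y >= qMinY && p.Y <= qMaxY)
                result.Add(p);

        if (children != null)
            foreach (var child in children)
                child.Query(qMinX, qMinY, qMaxX, qMaxY, result);
    }

    private void Split()
    {
        double midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
        children = new[]
        {
            new QuadTree(minX, minY, midX, midY),
            new QuadTree(midX, minY, maxX, midY),
            new QuadTree(minX, midY, midX, maxY),
            new QuadTree(midX, midY, maxX, maxY),
        };
        // Push the stored points down into the new children.
        foreach (var p in points)
            ChildFor(p.X, p.Y).Insert(p.X, p.Y);
        points.Clear();
    }

    private QuadTree ChildFor(double x, double y)
    {
        double midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
        int index = (x >= midX ? 1 : 0) + (y >= midY ? 2 : 0);
        return children[index];
    }
}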

Remove Spikes from Periodic Data with MATLAB

I have some data which is time-stamped by an NMEA GPS string that I decode in order to obtain the individual fields: Year, Month, Day, et cetera.
The problem is that on a few occasions the GPS (probably due to some signal loss) glitches and spits out very wrong values. This generates spikes in the time-stamp data, as you can see from the attached picture, which plots the vector of Days as output by the GPS.
As you can see, the GPS data are generally well behaved, and the days run from 1 to 30/31 each month before falling back to 1 at the next month. At certain moments, though, the GPS spits out a random day.
I tried all the standard MATLAB functions for despiking (such as medfilt1 and findpeaks), but either they are not suited to the task or I do not know how to set them up properly.
My other idea was to loop over differences between adjacent elements, but the vector is so big that the computer cannot really handle it.
Is there any vectorized way to go down such a road and detect those spikes?
Thanks so much!
You need to filter your data using a simple low-pass filter to get rid of the outliers:
windowSize = 5;                          % number of samples to average over
b = (1/windowSize)*ones(1,windowSize);   % moving-average filter coefficients
a = 1;
FILTERED_DATA = filter(b,a,YOUR_DATA);
Just play a bit with windowSize until you get the smoothness you want.

Tuning gain table to match two curves

I have two data sets; let us name them "actual speed" and "desired speed". My main objective is to match the actual speed with the desired speed.
But to do that in my case, I need to tune an FF table (1x10), an Integral table (10x8) and a Proportional gain table (10x8).
My approach so far has been as follows (a rough sketch of the per-cell search loop is shown after the list):
1. Start the iteration with 0.1 as the initial value in the first cell (FF[0]) of the FF table.
2. Find the R-squared or correlation between the two data sets (i.e. actual speed and desired speed).
3. Increment the value of the first cell (FF[0]) by 0.25 and compute the R-squared or correlation of the two data sets again.
4. Once the cell value (FF[0]) reaches 2 (the gain's maximum value, already defined by the lab), evaluate the R-squared values and write back into FF[0] the gain value that gives the minimum error between the two curves.
5. Then tune the Integral and Proportional tables in the same way for the same RPM range.
6. Once those are tuned, move to the next RPM range and repeat steps 2-5 (RPM ranges: 800-1000; 1000-1200; ...; 3000-3200).
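Purely to illustrate the structure of that per-cell sweep (in C#, to stay consistent with the other sketches in this thread; the evaluate delegate stands in for the real-time R-squared/error measurement against the controller and is entirely a placeholder):

using System;

public static class GainSweep
{
    // Sweep one gain cell from 'min' to 'max' in steps of 'step' and return
    // the value that scored best according to the supplied evaluator.
    // 'evaluate' is a placeholder for "write the gain to the controller,
    // collect actual vs. desired speed, and compute R-squared (or -error)".
    public static double TuneCell(double min, double max, double step,
                                  Func<double, double> evaluate)
    {
        double bestGain = min;
        double bestScore = double.NegativeInfinity;

        for (double g = min; g <= max + 1e-9; g += step)
        {
            double score = evaluate(g);   // higher is better here
            if (score > bestScore)
            {
                bestScore = score;
                bestGain = g;
            }
        }
        return bestGain;   // written back into the table, e.g. FF[0]
    }
}

With the values from the question this would be called as TuneCell(0.1, 2.0, 0.25, evaluate), once per cell, and the same loop reused for the Integral and Proportional tables.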
Now the problem is that this process takes way too long to complete. For example, it takes around one hour to tune a single cell of FF, which is very slow.
If possible, please suggest any other approach I could try for tuning the tables. I am using MATLAB R2010a and can't switch to any other version of MATLAB, because my controller can only communicate with this one. I also can't use any app for tuning, since my GUI is already communicating with the controller and the two data sets are generated in real time.
In the given figure, let us take the (X1,Y1) curve as the desired speed and the (X2,Y2) curve as the actual speed.
UPDATE

Finding where data levels off and rise times

My data looks like this:
I am trying to find the following info about my data:
Rate of rise on the "transient" portion
time to steady state and steady state average
I think that stepinfo is my best bet to do this, but it seems to want to take the final value as the steady-state value, which isn't giving me the best result. And I cannot find the average value over the steady-state period until I know when it starts... Is there a way to set some bounds on the steady-state search? In the picture I linked, steady state could for instance be data staying within +/- 0.25 for 50 data points.
What you can do is:
1. Decide what the slope of the curve should be at the transition between transient and steady state.
2. Smooth your signal.
3. Find the difference between adjacent points on the graph.
4. Find the first place where the difference between points is lower than the value you selected in step 1.
To do this, keep in mind:
The difference in the beginning is on average zero, thus you have to skip these values.
One way to do this is simply: x(x < 0.1 * max(x)) = []; That way you remove the entire start of the curve. You won't need it for this part anyway. Remember to keep a backup of x.
A simple way to smooth the signal is: smooth_x = arrayfun(@(t) mean(x(t:t+k)), 1:numel(x)-k); You need to find an appropriate value for k.
Even a smoothed curve will have "bumps", thus you might want to compare points that are not adjacent, for instance checking x(n+10) - x(n) rather than only consecutive samples. If the average incline between those two points is lower than the value you selected in step 1, you're happy. A combination of find and diff should do the trick here.
Once you have smoothed it, you can use diff to find the average inclination for both the transient and the stable part.
There is no way for me to tell where the curve goes from transient to stable. That's a decision you need to make. It could for instance be less than 0.2 l/min per 10 seconds.