Unity - relate wall clock time to physics time (in fixed update) - unity3d

I am involved in a project that is building software for a robot that uses ROS2 to support the robot's autonomy code. To streamline development, we are using a model of our robot built in Unity to simulate the physics. In Unity, we have analogues for the robot's sensors and actuators - the Unity sensors use Unity's physics state to generate readings that are published to ROS2 topics and the actuators subscribe to ROS2 topics and process messages that invoke the actuators and implement the physics outcomes of those actuators within Unity. Ultimately, we will deploy the (unmodified) autonomy software on a physical robot that has real sensors and actuators and uses the real world for the physics.
In ROS2, we are scripting with python and in Unity the scripting uses C#.
It is our understanding that, by design, the wall clock time that a Unity fixed update call executes has no direct correlation with the "physics" time associated with the fixed update. This makes sense to us - simulated physics can run out of synchronization with the real world and still give the right answer.
Some of our planning software (ROS2/python) wants to initiate an actuator at a particular time, expressed as floating point seconds since the (1970) epoch. For example, we might want to start decelerating at a particular time so that we end up stopped one meter from the target. Given the knowledge of the robot's speed and distance from the target (received from sensors), along with an understanding of the acceleration produced by the actuator, it is easy to plan the end of the maneuver and have the actuation instructions delivered to the actuator well in advance of when it needs to initiate. Note: we specifically don't want to hold back sending the actuation instructions until it is time to initiate, because of uncertainties in message latency, etc. - if we do that, we will never end up exactly where we intended.
And in a similar fashion, we expect sensor readings that are published (in a fixed update in Unity/C#) to likewise be timestamped in floating point seconds since the epoch (e.g., the range to the target object was 10m at a particular recent time). We don't want to timestamp the sensor reading with the time it was received because of unknown latency from the time the sensor value was current and the time it was received in our ROS2 node.
When our (Unity) simulated sensors publish a reading (based on the physics state during a fixed update call), we don't know what real-world/wall clock timestamp to associate with it - we don't know which 20ms of real time that particular fixed update correlates to.
Likewise, when our Unity script that is associated with an actuator is holding a message that says to initiate actuation at a particular real-world time, we don't know if that should happen in the current fixed update because we don't know the real-world time that the fixed update correlates to.
The Unity Time methods all seem to deal with time relative to the start of the game (basically, a dynamically determined epoch).
We have tried capturing the wall clock time and time since game start in a MonoBehaviour's Start, but this seems to put us off by a handful of seconds when the fixed updates are running (with the exact time shift being variable between runs).
How can we crosswalk between the Unity game-start-based epoch and a fixed-start epoch (e.g., 1970)?
An example: This code will publish the range to the target, along with the time of the measurement. This gets executed every 20ms by Unity.
void FixedUpdate()
{
    RangeMsg targetRange = new RangeMsg();
    targetRange.time_s = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds() / 1000.0;
    targetRange.range_m = Vector3.Distance(target.transform.position, chaser.transform.position);
    ros.Publish(topicName, targetRange);
}
On the receiving end, let's say that we are calculating the speed toward the target:
def handle_range(self, msg):
    if self.last_range is not None:
        diff_s = msg.time_s - self.last_range.time_s
        if diff_s != 0.0:
            diff_range_m = self.last_range.range_m - msg.range_m
            speed = Speed()
            speed.time_s = msg.time_s
            speed.speed_mps = diff_range_m / diff_s
            self.publisher.publish(speed)
    self.last_range = msg
If the messages are really published exactly every 20ms, then this all works. But if Unity gets behind and runs several fixed updates one after another to catch up, then the computed speed is much higher than it should be (because each cycle applies 20ms of movement, but the cycles may be executed within a millisecond of each other).
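To make the catch-up effect concrete, here is a small worked sketch of the same speed calculation with the two kinds of timestamp. The numbers are illustrative assumptions (closing at 1 m/s, two fixed updates executed 1 ms apart in wall-clock time), not measurements from our system.

```python
FIXED_DT_S = 0.02  # fixed timestep of 20 ms, as in the question


def speed_mps(prev, curr):
    """Closing speed from two (time_s, range_m) samples, as in handle_range."""
    (t0, r0), (t1, r1) = prev, curr
    return (r0 - r1) / (t1 - t0)


# Two consecutive fixed updates while closing at 1 m/s: range shrinks by 2 cm.
range_before_m, range_after_m = 10.00, 9.98

# Wall-clock stamps: Unity was catching up, so the two calls ran 1 ms apart.
wall_speed = speed_mps((100.000, range_before_m), (100.001, range_after_m))

# Simulation stamps: each fixed update advances time by exactly FIXED_DT_S.
sim_speed = speed_mps((5.00, range_before_m), (5.00 + FIXED_DT_S, range_after_m))

print(f"wall-clock stamps: {wall_speed:.2f} m/s")
print(f"simulation stamps: {sim_speed:.2f} m/s")
```

With wall-clock stamps the computed speed comes out roughly twenty times too high, while the simulation stamps recover the 1 m/s that the physics actually applied.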
If instead we use Unity's time for timestamping the messages with
targetRange.time_s = Time.fixedTimeAsDouble;
then the range and time stay in synch and the speed calculation works great, even in the face of some major hiccup in Unity processing. But, then the rest of our code that lives in the 1970 epoch has no idea what time targetRange.time_s really is.
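For reference, this is the kind of crosswalk we have in mind, written as a rough Python sketch. It assumes a single anchor pair (simulation time and epoch time sampled inside the same fixed update) and that the simulation is meant to track real time at 1x. The names `ClockAnchor` and `should_actuate` are hypothetical, and we have not verified that anchoring this way removes the offset we saw when anchoring in Start.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockAnchor:
    """One paired observation: simulation time and epoch time for the same instant."""

    sim_s: float    # e.g. Time.fixedTimeAsDouble at the anchoring fixed update
    epoch_s: float  # Unix-epoch seconds sampled in that same fixed update

    def sim_to_epoch(self, sim_s: float) -> float:
        """Timestamp for an outgoing sensor reading."""
        return self.epoch_s + (sim_s - self.sim_s)

    def epoch_to_sim(self, epoch_s: float) -> float:
        """Simulation time at which an epoch-stamped command should take effect."""
        return self.sim_s + (epoch_s - self.epoch_s)


def should_actuate(anchor: ClockAnchor, start_epoch_s: float, fixed_time_s: float) -> bool:
    """True once the current fixed update has reached the commanded start time."""
    return fixed_time_s >= anchor.epoch_to_sim(start_epoch_s)
```

A fixed offset like this keeps range and time consistent with each other, but it does nothing about the simulation drifting away from the wall clock if Unity falls behind for long.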

Related

Android app dev: Finding the best way to synchronize the timestamps of two sensors

There's already a good answer on the technical details and constraints of timing the gyro measurement:
Movesense, timestamp source of imu data, and timing issues in general
However, I would like to ask more practical question from the Android app developer perspective working with two sensors and requirement for high accuracy with Gyro measurement timing.
What would be the most accurate way to synchronize/consolidate the timestamps from two sensors and put the measurements on the same time axis?
The sensor SW version 1.7 introduced Time/Detailed API to check the internal time stamp and the UTC time set on the sensor device. This is how I imagined it would play out with two sensors:
1. Before subscribing to anything, set the UTC time (microseconds) on sensor1 and sensor2 based on the Android device time (PUT /Time).
2. Get the "Time since sensor turned on" (in milliseconds) and the "UTC time set on sensor" (in microseconds) on sensor1 and sensor2 (GET /Time/Detailed).
3. Calculate the difference between these two timestamps (in milliseconds) for both sensors.
4. Get the gyro values from the sensor with the internal timestamp. Add the calculated value from step 3 to the internal timestamp to get the correct/global UTC time value.
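Steps 3 and 4 boil down to a unit conversion and an addition. A minimal sketch of what I mean, in Python for brevity; the function and parameter names are made up for illustration and are not the actual field names returned by /Time/Detailed:

```python
def utc_offset_ms(utc_time_us: int, relative_time_ms: int) -> float:
    """Step 3: offset that maps the sensor's internal clock onto its UTC clock."""
    return utc_time_us / 1000.0 - relative_time_ms


def to_utc_ms(sample_timestamp_ms: int, offset_ms: float) -> float:
    """Step 4: shift an internal sample timestamp onto the UTC axis."""
    return sample_timestamp_ms + offset_ms


# Hypothetical values: sensor has been on for 12 s when /Time/Detailed is read.
offset = utc_offset_ms(utc_time_us=1_600_000_000_000_000, relative_time_ms=12_000)
gyro_sample_utc_ms = to_utc_ms(sample_timestamp_ms=12_500, offset_ms=offset)
```

One offset would be computed per sensor, so samples from both sensors end up on the same UTC axis.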
Is this procedure correct?
Is there a more efficient or accurate way to do this? E.g. the GATT service to set the time was mentioned in the linked post as the fastest way. Anything else?
How about the possible drift in the sensor time for gyro? Are there any tricks to limit the impact of the drift afterwards? Would it make sense to get the /Time/Detailed info during longer measurements and check if the internal clock has drifted/changed compared to the UTC time?
Thanks!
Very good question!
Looking at the accuracy of the crystals (±20 ppm), the typical drift between sensors should be no more than 40 ppm. That translates to about 0.14 seconds over an hour. For longer measurements and/or better accuracy, better synchronization is needed.
Luckily, the clock drift should stay relatively constant unless the temperature of the sensor is changing rapidly. Therefore it should be enough to compare the mobile phone clock and each sensor's UTC time at the beginning and end of the measurement. Any drift of each sensor should then be visible and the timestamps easily compensated.
If there is a need for even more accurate timestamps, taking regular samples of /Time/Detailed from each sensor and comparing them to the phone clock should provide a way to estimate possible sensor clock drift.
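As a rough illustration of the arithmetic and of the begin/end compensation, here is a Python sketch. It assumes the drift rate is constant over the measurement, as discussed above; `make_linear_corrector` is a hypothetical helper, not part of any Movesense API.

```python
PPM = 1e-6

# Worst case: one crystal at +20 ppm and the other at -20 ppm, over one hour.
relative_drift_s_per_hour = 40 * PPM * 3600  # about 0.144 s


def make_linear_corrector(sensor_start, phone_start, sensor_end, phone_end):
    """Map sensor time to phone time, assuming the drift rate stayed constant."""
    rate = (phone_end - phone_start) / (sensor_end - sensor_start)

    def correct(sensor_t):
        return phone_start + (sensor_t - sensor_start) * rate

    return correct


# Hypothetical comparison points: the sensor lost 72 ms against the phone in 1 h.
correct = make_linear_corrector(0.0, 1000.0, 3600.0, 4600.072)
midpoint_phone_time = correct(1800.0)  # half of the drift applied at half time
```

With regular /Time/Detailed samples, the same idea extends to a piecewise-linear correction between successive comparison points.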
Full Disclosure: I work for the Movesense team

How to make an reinforcement learning agent learn an endless runner?

I'm trying to train a reinforcement learning agent to play an endless runner game using Unity-ML.
The game is simple: an obstacle is approaching from the side and the agent has to jump at the right timing to overcome it.
As the observation, I have the distance to the next obstacle. Possible actions are 0 - idle; 1 - jump. Rewards are given for longer playtime.
Unfortunately, the agent fails to learn to overcome even the 1st obstacle reliably. I guess this is due to a high imbalance between the two actions, as the ideal policy would be doing nothing (0) most of the time and jumping (1) only at very specific points in time. Additionally, all actions during a jump are meaningless since the agent cannot jump while in the air.
How can I improve the learning so that it converges nevertheless? Any suggestions on what to look into?
Current trainer config:
EndlessRunnerBrain:
    gamma: 0.99
    beta: 1e-3
    epsilon: 0.2
    learning_rate: 1e-5
    buffer_size: 40960
    batch_size: 32
    time_horizon: 2048
    max_steps: 5.0e6
Thanks!
It's difficult to say without seeing the exact code that's being used for the reinforcement learning algorithm. Here are some steps worth exploring:
How long are you letting the agent train? Depending on the complexity of the game environment, it very well may take thousands of episodes for the agent to learn to avoid its first obstacle.
Experiment with the Frameskip property of the Academy object. This permits the agent to only take an action after a number of frames have passed. Increasing this value may increase the speed of learning in more simple games.
Adjust the learning rate. The learning rate determines how heavily the agent weights new information versus old information. You're using a very small learning rate; try increasing it by a couple decimal places.
Adjust epsilon. Epsilon determines how often a random action is taken. Given a state and an epsilon rate of 0.2, your agent will take a random action 20% of the time. The other 80% of the time, it will choose the (state, action) pair with the highest associated reward. You can try reducing or increasing this value to see if you get better results. Since you know you'll want more random actions in the beginning of training, you can even "decay" epsilon with each episode. If you start with an epsilon value of 0.5, after each game episode is completed, reduce epsilon by a small value, say 0.00001 or so.
Change the way the agent is rewarded. Instead of rewarding the agent for each frame it stays alive, perhaps you could reward the agent for each obstacle it successfully jumps over.
Are you sure that the given time_horizon and max_steps provide enough runway for the game to complete an episode?
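To illustrate the epsilon decay idea from the list above, here is a generic epsilon-greedy sketch in Python. It is not the Unity ML-Agents API, and the `floor` parameter is my own addition. Also worth double-checking: as far as I know, the `epsilon` entry in the PPO trainer config is a clipping threshold for policy updates rather than an exploration rate, so this sketch applies to a hand-rolled value-based agent.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, step=0.00001, floor=0.01):
    """Lower epsilon by a small fixed amount per episode, never below `floor`."""
    return max(floor, epsilon - step)


epsilon = 0.5
for _episode in range(1000):
    # ... play one episode, calling choose_action(q_values, epsilon) each step ...
    epsilon = decay_epsilon(epsilon)
```

Here action 0 would be idle and action 1 jump, matching the action space in the question.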
Hope this helps, and best of luck!

Why are computer/game physics engines often non-deterministic?

Developing with various game physics engines over the years, I've noticed that on the same machine I observe widely different results in physics simulations between runs. Most recently, the Unity engine does this, even though physics are calculated at set intervals of time (FixedUpdate) -- as far as I can determine it should be completely independent of frame-rate.
I've asked this question on game forums before, and was told it was due to chaotic motion: see double pendulum. But, even the double pendulum is deterministic if the starting conditions are exactly controlled, right? On the same machine, shouldn't floating point math behave the same way?
I understand that there are problems with floating point math accuracy, but I understand those problems (as outlined here) to not be problems on the same hardware -- isn't floating point inaccuracy still deterministic? What am I missing?
tl;dr: If running a simulation on the same machine, using the same floating point math(?), shouldn't the simulation be deterministic?
Thank you very much for your time.
Yes, you are correct. A program executed on the same machine will give identical results each time (at least ideally---there might be cosmic rays or other external things that affect memory and what not, but I would say these are not of our concern). All calculations on a computer are deterministic, and so all algorithms of a computer will necessarily be deterministic (which is the reason it's so hard to make random number generators)!
Most likely the randomness you see is implemented in the program with some random number generator, and the seed for the random numbers varies from run to run. Should you start the simulation with the same seed, you will see the same result.
Edit: I'm not familiar with Unity, but doing some more research seems to indicate that the FixedUpdate routine might be the problem.
Except for the audio thread, everything in unity is done in the main thread if you don't explicitly start a thread of your own. FixedUpdate runs on the very same thread, at the same interval as Update, except it makes up for lost time and simulates a fixed time step.
source
If this is the case, and the function itself looks somewhat like:
void physicsUpdate(double currentTime, double lastTime)
{
    double deltaT = currentTime - lastTime;
    // do physics using `deltaT`
}
Here we will naturally get different behaviour because deltaT is not the same between two different runs. It depends on what other processes are running in the background, as they could delay the main thread. This function would be called irregularly, and you would observe different results between runs. Note that these irregularities will mostly not be due to floating point imprecision, but due to inaccuracies in the integration. (E.g. velocity is often calculated by v = a*deltaT, which assumes a constant acceleration since the last update. This is in general not true.)
However, if the function would look like this:
void physicsUpdate(double deltaT)
{
    // do physics using `deltaT`
}
Every time you do simulations using this you will always get the exact same result.
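A small Python sketch of the difference, using a falling body and simple Euler integration. This is a toy model I made up for illustration, not how Unity's physics engine is implemented:

```python
GRAVITY = -9.81  # m/s^2


def simulate(time_steps):
    """Euler-integrate a falling body using the given sequence of deltaT values."""
    position, velocity = 0.0, 0.0
    for delta_t in time_steps:
        velocity += GRAVITY * delta_t  # assumes constant acceleration over the step
        position += velocity * delta_t
    return position


fixed_run_a = simulate([0.02] * 50)       # 1 s of simulated time in equal steps
fixed_run_b = simulate([0.02] * 50)       # the same steps again
uneven_run = simulate([0.01, 0.03] * 25)  # also 1 s, but with irregular steps

print(fixed_run_a == fixed_run_b)  # identical step sequences, identical result
print(fixed_run_a, uneven_run)     # same total time, different final position
```

The two fixed-step runs agree bit for bit, while the irregular run ends a couple of centimetres away even though the total simulated time is the same.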
I've not got much experience with Unity or its physics simulations, but I've found the following forum post which also links to an article which seems to indicate it's down to precision with the floating point calculations.
As you've mentioned, a lot of people seem to keep rehashing this question!
The forum also links to this blog post which may shed some light on the issue.

How to balance start time for a multiplayer game?

I'm making a multiplayer game with GameKit. My issue is that when two devices are connected, the game starts running with a slight time difference: one of the devices starts running the game a bit later. This is not what I want; I want it to start simultaneously on both devices. So the first thing I do is check the start time on both devices like this:
startTime = [NSDate timeIntervalSinceReferenceDate];
and this is how it looks:
361194394.193559
Then I send the startTime value to the other device, and I compare the received value with the startTime of this device.
- (void)balanceTime:(double)partnerTime
{
    double time_diff = startTime - partnerTime;
    if (time_diff < 0)
        startTimeOut = -time_diff;
}
So if the difference between the two start times is negative, it means that this device is starting earlier and therefore has to wait for exactly the difference assigned to the startTimeOut variable, which is a double and is usually something like 2.602417. I then pause my game in my update method:
- (void)update:(ccTime)dt
{
    if (startTimeOut > 0)
    {
        NSLog(@"START TIME OUT %f", startTimeOut);
        startTimeOut -= dt;
        return;
    }
}
Unfortunately, it doesn't help. Moreover, it even extends the difference between the start times of the devices. I just can't see why. Everything I'm doing seems reasonable. What am I doing wrong? How do I correct it? What would you do? Thanks a lot
As Almo commented, it is not possible to synchronize two devices to exactly the same time. At the lowest level you will gnaw your teeth out on the Heisenberg Uncertainty Principle. Even getting two devices to synchronize to within a tenth of a second is not a trivial task. In addition, time synchronization would have to happen more or less frequently, since the clocks in each device run ever so slightly out of step (i.e. a teeny bit faster or a weeny bit slower).
You also have to consider the lag introduced by sending data over Wi-Fi, Bluetooth or over the air. This lag is not constant, and can be 10ms in one frame and 1000ms in another. You can't cancel out lag, nor can you predict it. But you can predict player movements.
The solution for games, or at least one of them, is client-side prediction and dead reckoning. This SO question has a few links of interest.
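In its simplest form, dead reckoning just extrapolates the last known state of the remote player. A minimal sketch in Python, assuming constant velocity since the last update; the function name and tuple layout are my own choices:

```python
def dead_reckon(last_position, last_velocity, last_update_s, now_s):
    """Extrapolate a remote player's position from its last known state."""
    elapsed_s = now_s - last_update_s
    return tuple(p + v * elapsed_s for p, v in zip(last_position, last_velocity))


# Last packet: at t = 10.0 s the player was at (0, 0) moving at (2, -1) units/s.
predicted = dead_reckon((0.0, 0.0), (2.0, -1.0), last_update_s=10.0, now_s=10.5)
```

When the next real update arrives, the game would blend from the predicted position to the reported one rather than snapping.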

real-time in context of a game

I have a problem grokking the concept of real-time (IMO badly named, different meaning in different contexts). I understand real-time software as a software where time is a key variable. Events must occur at given time. Say, railway switch change at 15:02 and the next one must be at 15:05 no matter what.
But how about this example: in a game, when the player's FPS drops below 16, the game exits and tells the user to upgrade their hardware or kill other applications. So when one iteration of the game loop takes more than 1/16 of a second, the output of the program is completely different.
Is it real-time(ish)? Can it be considered Real Time Computing?
Your question is hard to understand, are you referring to Real Time Computing, or simulating real time, or something completely different?
Simulating real time: It is possible to simulate real-time in a game by polling for events. Store the time of an event, and then when it comes time to render a frame, the game should repeatedly 'fast forward' by moving the current time to the time of the next event and handle the event. This should repeat until there are no more events, or the time is 'current'.
This requires you to have anything that is a function of time (such as velocity, position, acceleration) be calculated according to the current time. This means you would not have these attributes periodically updated, and allows your game to be deterministic, as the 'game time' is no longer dependent upon real time. It also makes things like game speed and pausing very simple to implement.
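A rough sketch of that idea in Python, using a priority queue of timestamped events and a position that is computed from the current time instead of being updated every frame. The names and the example events are invented for illustration:

```python
import heapq


def position_at(start_pos, velocity, start_time_s, time_s):
    """Position computed directly from the current time, not updated per frame."""
    return start_pos + velocity * (time_s - start_time_s)


def advance_to(events, now_s, handle):
    """Pop and handle every queued event due at or before `now_s`, in time order."""
    while events and events[0][0] <= now_s:
        event_time_s, name = heapq.heappop(events)
        handle(event_time_s, name)
    return now_s


events = []
heapq.heappush(events, (2.0, "door opens"))
heapq.heappush(events, (0.5, "enemy spawns"))
heapq.heappush(events, (5.0, "timer expires"))

handled = []
game_time_s = advance_to(events, 3.0, lambda t, name: handled.append((t, name)))
```

After the call, the two events due before t = 3.0 have been handled in time order and the third is still queued; pausing is then just a matter of not advancing `now_s`.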
If you're referring to the concept of real-time systems, then I would say there's not enough information to determine whether that 'game loop' is 'real-time'. It depends on the operating environment of the game, and the logic in the 'game loop'. According to wikipedia, a real-time deadline must be met, regardless of system load.
In the rapidly approaching canonical article Fix your Timestep!, Glenn Fiedler addresses numerous ways to handle this issue. While the article focuses primarily on physics, the key points are applicable to any system that represents a function of time, to wit, things dealing with moving things.
The executive summary of that article (which is well worth reading) is this:
You can make your physics deterministic (well, as much as can be achieved with imperfect input) by using discrete physics timesteps. It looks like this:
Render as fast as possible
Pass in a time delta that represents how long the previous frame took
Run floor(delta / timestep) physics steps, each advancing the simulation by exactly one timestep
Store the remainder of delta that you weren't able to process in an accumulator
That accumulator gets added to the next frame's time buffer. This requires some fine tuning such that temporary lag spikes due to e.g. a rapidly spinning player (which necessitates a lot of visibility determination over time) don't end up putting you in an inescapable time debt. If you wanted to intelligently guard against such an occurrence, you could have a sentry look for dangerous levels of accumulated time, which you could respond to by perhaps dropping a video frame.
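A compact Python sketch of the accumulator pattern described above. The `max_backlog_s` cap is my own simplification of the "sentry" idea: it discards excess time debt instead of dropping a video frame, and the class name is hypothetical.

```python
class FixedStepper:
    """Accumulator-based fixed timestep, in the style described above."""

    def __init__(self, step_s=0.02, max_backlog_s=0.25):
        self.step_s = step_s
        self.max_backlog_s = max_backlog_s  # assumed cap against runaway time debt
        self.accumulator_s = 0.0
        self.steps_run = 0

    def advance(self, frame_delta_s, physics_step):
        """Run as many whole physics steps as the elapsed frame time allows."""
        self.accumulator_s = min(self.accumulator_s + frame_delta_s, self.max_backlog_s)
        while self.accumulator_s >= self.step_s:
            physics_step(self.step_s)  # always called with the same step size
            self.accumulator_s -= self.step_s
            self.steps_run += 1
        # Whatever is left in the accumulator carries over to the next frame.
```

A game loop would call `advance(frame_delta_s, physics_step)` once per rendered frame, so the physics callback only ever sees `step_s`, no matter how irregular the frame times are.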
Another advantage to using discrete timesteps is that they behave well in multiplayer games. If you have an authoritative server or node in a peer-to-peer configuration, the server can ensure that all clients' physics simulations are running on the same physics timeline. Discrete time blocks also simplify things in rollback-based multiplayer.
Edit:
Disclaimer: I've never written software for real-time myself, only worked in a company that had!
In response to really-real real-life Real Time software, it's unlikely that anyone has made a game that could be qualified as this, at least in software. (I'm not sure how one would qualify games on ROMs or games that don't run under a host OS?) While your example would be an attempt at real-time software, most real-time software goes through a period of certification in which the maximum amount of time spent per instruction or on a logical block of operation is determined. Games might come close to this in a sense when, for example, platform licensors have requirements (as I believe XBLA does) regarding minimum 30fps or similar. However, these certifications are usually established through a period of testing rather than through mathematical proof.