Usage of gradient in the updation of weight and bias in neural network - neural-network

Why do we use the gradient of the loss function to update the weight and bias of a neural network?
for example:
new_weight = old_weight - learning_rate * gradient
In other words how does gradient help us in updating the weight and bias correctly.

This is a gross simplification: the gradient is multidimensional derivate compacted into a "single structure". The derivate tells you, where the is a local change.
The gradient, tells you where there are local changes in all the dimensions you are considering.
Consider the 3-Dimensional case: our world. Image you are climbing a hill. Suppose your sight is limited at 3 meters for your position. Your goal is to reach the top.
You start from a point, and you look around. As you see 3 meters from your position in your direction, you choose to go where you see that the slope is steeper.
The action of looking around is computing the gradient and correcting your speed.
In your equation, remember this is a gross example, you are saying "Uhm, the first time I checked my direction was 124 degrees, now, looking at the gradient my direction should be 10 degrees. Which direction should I take now".
The learning rate or your equation is a coefficient that you can interpret as "friction", or "trust": you don't want to change your direction of 114 degres in one shot, instead you want to change with respect to the magnitude of the new measure.
You detect that the new direction should be 114 degrees (124-10) less the current one. So, if your learning rate is low, your new direction will impact less than in the case when the learning rate is higher.
This example gets generalized on multiple dimensions.

Related

How to find distance mid point of bezier curve?

I am making game in Unity engine, where car is moving along the bezier curve by percentage of bezier curve legth.
On this image you can see curve with 8 stop points (yellow spheres). Between each stop point is 20% gap of total distance.
On the image above everything is working correctly, but when I move handles, that the handles have different length problem occurs.
As you can see on image above, distances between stop points are not equal. It is because of my algorithm, because I am finding point of segment by multiplying segment length by interpolation (t). In short problem is that: if t=0.5 it is not in the 50% percent of the segment. As you can see on first image, stop points are in half of segment, but in the second image it is not in half of segment. This problem will be fixed, if there is some mathematical formula, how to find distance middle point.
As you can see on the image above, there are two mid points. T param mid point can be found by setting t to 0.5 (it is what i am doing now), but it is not half of the distance.
How can I find distance mid point (for cubic bezier curve, that have handles in different distance)?
You have correctly observed that the parameter value t=0.5 is generally not the point in the middle of the length. That is a good start but the difficulty lies in the mathematics beneath.
Denoting the components of your parametric curve by x(t) and y(t), the length of the curve
between t=0 (the beginning) and a chosen parameter value t = u is equal to
What you are trying to do is to find u such that l(u) is one half of l(1). This is sometimes possible but often difficult or impossible. So what can you do?
One possibility is to approximate the point you want. A straightforward way is to approximate your Bezier curve by a piecewise linear curve (simply by choosing many parameter values 0 = t_0 < t_1 < ... < t_n = 1 and connecting the values in these parameters by line segments). Now it is easy to compute the entire length (Pythagoras Theorem is your friend) as well as the middle point (walk along the piecewise linear curve the prescribed length). The more points you sample, the more precise you will be and the more time your computation will take, so there is a trade-off. Of course, you can use a more complicated numerical scheme for the approximation but that is beyond the scope of the answer.
The second possibility is to restrict yourself to a subclass of Bezier curves that will let you do what you want. These are called Pythagorean-Hodograph (shortly PH) curves. They have the extremely useful property that there exists a polynomial sigma(t) such that
This means that you can compute the integral above and search for the correct value of u. However, everything comes at a price and the price here is that you will have less freedom, where to put the control points (for me as a mathematician, a cubic Bézier curve has four control points; computer graphics people often speak of "handles" so you might have to translate into your terminology). For the cubic case you can find the conditions on slide 15 of this seminar talk by Vito Vitrih.
Denote:
the control points,
;
then the Bézier curve is a PH curve if and only if
.
It is up to you to figure out, if you can enforce this condition in your situation or if it is too restrictive for your application.

How to measure floor area using ARCore?

I'm a beginner, take it easy on me.
How would you measure the area of the floor using ARCore/Unity. I figure you somehow measure the area of the plane visualiser, or measure the area of each individual triangle, but I have no idea how to attack it.
The closest thing I can find is measuring distance...
You can get a (somewhat imprecise) estimate by multiplying
plane.getExtentX() * plane.getExtentZ();
This is most likely too high because it assumes the plane to be rectangular and
will suffer from measurement errors in both directions. But it might be good enough depending on your usecase.
A slightly more precise alternative would be to get the plane's polygon
plane.getPolygon()
and then computing the polygon's area

Getting a trajectory from accelerometer and gyroscope (IMU)

I am well aware of the existence of this question but mine will differ. I also know that there could be significant errors with this approach but I want to understand the configuration also theoretically.
I have some basic questions which I find hard to answer for myself clearly. There is a lot of information about accelerometers and gyroscopes but I still haven't found an explanation "from first principles" of some basic properties.
So I have a plate sensor that contains an accelerometer and gyroscope. There is also a magnetometer which I skip for now.
The accelerometer gives information in each time t about the temporary acceleration vector a = (ax, ay, az) in m/s^2 according to the fixed coordinate system to the sensor.
The gyroscope gives a 3D vector in deg/s which says the temporary speed of rotation of the three axes (Ox, Oy and Oz). From this information, one can get a rotation matrix that corresponds to an infinitesimal rotation of the coordinate system (according to the previous moment). Here is some explanation how to obtain a quaternion, that represents R.
So we know that the infinitesimal movement can be calculated considering that the acceleration is the second derivative of the position.
Imagine that your sensor is attached to your hand or leg. In the first moment we can consider its point in 3D space as (0,0,0) and the initial coordinate system also attached in this physical point. So for the very first time step we will have
r(1) = 0.5a(0)dt^2
where r is the infinitesimal movement vector, a(0) is the acceleration vector.
In each of the following steps we will use the calculations
r(t+1) = 0.5a(t)dt^2 + v(t)dt + r(t)
where v(t) is the speed vector which will be estimated in some way, for example as (r(t)-r(t-1)) / dt.
Also, after each infinitesimal movement we will have to take into account the data from the gyroscope. We will use the rotation matrix to rotate the vector r(t+1).
In this way, probably with tremendous error I will get some trajectory according to the initial coordinate system.
My queries are:
Am I principally correct with this algorithm? If not, where am I wrong?
I would very much appreciate some resources with a working example where the first principles are not skipped.
How should I proceed with using the Kalman's filter to obtain a better trajectory? In what way exactly do I pass all the IMU data (accelerometer, gyroscope and magnetometer) to the Kalman filter?
Your conceptual framework is correct, but the equations need some work. The acceleration is measured in the platform frame, which can rotate very quickly, so it is not advisable to integrate acceleration in the platform frame and rotate the position change. Rather, the accelerations are transformed into a relatively slowly rotating frame and the integration to velocity change and position change is done there. Typically a locally-level frame (e.g. North-East-Down or Wander Aziumuth) or an Earth-centered frame (ECEF or ECI). Gravity and Coriolis force must be included in the acceleration.
Derivations from first principles can be found in many references, one of my favorites is Strapdown Inertial Navigation Technology by Titterton and Weston. Derivations of the inertial navigation equations in locally-level and Earth-fixed frames are given in Chapter 3.
As you've recognized in your question - the initial velocity is an unknown constant of integration. Without some estimate of initial velocity the trajectory resulting from integrating the inertial data can be wildly wrong.

MATLAB - What are the units of Matlab Camera Calibration Toolbox

When showing the extrinsic parameters of calibration (the 3D model including the camera position and the position of the calibration checkerboards), the toolbox does not include units for the axes. It seemed logical to assume that they are in mm, but the z values displayed can not possibly be correct if they are indeed in mm. I'm assuming that there is some transformation going on, perhaps having to do with optical coordinates and units, but I can't figure it out from the documentation. Has anyone solved this problem?
If you marked the side length of your squares in mm, then the z-distance shown would be in mm.
I know next to nothing about matlabs (not entirely true but i avoid matlab wherever I can, and that would be almost always possible) tracking utilities but here's some general info.
Pixel dimension on the sensor has nothing to do with the size of the pixel on screen, or in model space. For all purposes a camera produces a picture that has no meaningful units. A tracking process is unaware of the scale of the scene. (the perspective projection takes care of that). You can re insert a scale by taking 2 tracked points and measuring the distance between those points. This is the solver spaces distance is pretty much arbitrary. Now if you know the real distance between these points you can get a conversion factor. By doing:
real distance / solver space distance.
There's really now way to knowing this distance form the cameras settings as the camera is unable to differentiate between different scales of scenes. So a perfect 1:100 replica is no different for the solver than the real deal. So you must allays relate to something you can measure separately for each measuring session. The camera always produces something that's relative in nature.

How can the friction drag be calculated for a moving and spinning disk on a 2D surface?

Let's consider a disk with mass m and radius R on a surface where friction u also involved. When we give this disk a starting velocity v in a direction, the disk will go towards that direction and slow down and stop.
In case the disk has a rotation (or spin with the rotational line perpendicular on the surface) w beside the speed then the disk won't move on a line, instead bend. Both the linear and angular velocity would be 0 at the end.
How can this banding/curving/dragging be calculated? Is it possible to give analytical solution for the X(v,w,t) function, where X would give the position of the disk according to it's initial v w at a given t?
Any simulation hint would be also fine.
I imagine, depending on w and m and u there would be an additional velocity which is perpendicular to the linear velocity and so the disk's path would bend from the linear path.
If you're going to simulate this, I'd probably recommend something like dividing up the contact surface between the disk and the table into a radial grid. Compute the relative velocity and the force at each point on the grid at each time step, then sum up the forces and torques (r cross F) to get the net force F and the net torque T on the disk as a whole. Then you can apply the equations F=(m)(dv/dt) and T=(I)(dw/dt) to determine the differential changes in v and w for the next time step.
For what it's worth, I don't think a flat disk would curve under the influence of either a frictional force (velocity-independent) or a drag force (linearly proportional to velocity).
A ball will move in a large arc with spin, but a [uniform] disk on a 2D surface will not.
For the disk it's center of spin is the same as it's center of gravity, so there is no torque applied. (As mentioned by duffymo, a nonuniform disk will have a torque applied.)
For a uniform ball, if the axis of the spin is not perpendicular to the table, this causes the ball to experience a rotational torque which causes it to move in a slight arc. The arc has a large radius, and the torque is slight, so usually friction makes the ball stop quickly.
If there was a sideways velocity, the ball would move along a parabola, like a falling object. The torque component (and the radius of the arc) can be computed in the same way you do for a precessing top. It's just that the ball sits at the tip of the top (err....) and the bottom is "imaginary".
Top equation: http://hyperphysics.phy-astr.gsu.edu/HBASE/top.html
omega_p = mgr/I/omega
where
omega_p = rotational velocity...dependent on how quickly you want friction to slow the ball
m = ball mass
g = 9.8 m/s^2 (constant)
r = distance from c.g. (center of ball) to center, depends on angle of spin axis (solve for this)
omega = spin rate of ball
I = rotational inertia of a sphere
My 2 cents.
Numerical integration of Newton's laws of motion would be what I'd recommend. Draw the free body diagram of the disk, give the initial conditions for the system, and numerically integrate the equations for acceleration and velocity forward in time. You have three degrees of freedom: x, y translation in the plane and the rotation perpendicular to the plane. So you'll have six simultaneous ODEs to solve: rates of change of linear and angular velocities, rates of change for two positions, and the rate of change of angular rotation.
Be warned: friction and contact make that boundary condition between the disk and the table non-linear. It's not a trivial problem.
There could be some simplifications by treating the disk as a point mass. I'd recommend looking at Kane's Dynamics for a good understanding of the physics and how to best formulate the problem.
I'm wondering if the bending of the path that you're imagining would occur with a perfectly balanced disk. I haven't worked it out, so I'm not certain. But if you took a perfectly balanced disk and spun it about its center there'd be no translation without an imbalance, because there's no net force to cause it to translate. Adding in an initial velocity in a given direction wouldn't change that.
But it's easy to see a force that would cause the disk to deviate from the straight path if there was an imbalance in the disk. If I'm correct, you'll have to add an imbalance to your disk to see bending from a straight line. Perhaps someone who's a better physicist than me could weigh in.
When you say friction u, I'm not sure what you mean. Usually there is a coefficient of friction C, such that the friction F of a sliding object = C * contact force.
The disk is modeled as a single object consisting of some number of points arranged in circles about the center.
For simplicity, you might model the disk as a hexagon evenly filled with points, to make sure every point represents equal area.
The weight w of each point is the weight of the portion of the disk that it represents.
It's velocity vector is easily computed from the velocity and rotation rate of the disk.
The drag force at that point is minus its weight times the coefficient of friction, times a unit vector in the direction of its velocity.
If the velocity of a point becomes zero, its drag vector is also zero.
You will probably need to use a tolerance about zero, else it might keep jiggling.
To get the total deceleration force on the disk, sum those drag vectors.
To get the angular deceleration moment, convert each drag vector to an angular moment about the center of the disk, and sum those.
Factor in the mass of the disk and angular inertia, then that should give linear and angular acclerations.
For integrating the equations of motion, make sure your solver can handle abrupt transitions, like when the disk stops.
A simple Euler solver with really fine stepsize might be good enough.