Amortized Analysis: Find the Rate of Travel - amortized-analysis

A biker can travel at 24 km per hour with the wind, but only 12 km per hour against it. Assume the biker starts and finishes at the same point.
What is the rider's amortized rate of travel?
I do not understand how the answer is arrived at. I have read my lecture notes, but they are a little confusing.
Thanks

I am assuming the biker goes from point A to point B and then back from B to A, traveling at 24 km/hr from A to B and at 12 km/hr from B to A.
Amortized is a fancy term that here essentially means average, so we want to find the average speed at which the biker travels.
The actual distance traveled is not relevant, because the distance from A to B equals the distance from B to A. To keep the math simple, let's say that the distance is 24 kilometers.
We can then say that at a rate of 24 km/hr it takes 1 hour to get from A to B.
We can also say that at a rate of 12 km/hr it takes 2 hours to get from B to A.
This means that in total it takes 3 hours to get from A to B and back to A. In this time he has gone twice the distance (which we decided was 24 km), so in total he traveled 48 km.
If it took 3 hours to go 48 km, the average (amortized) speed is (48/3) km/hr, or just 16 km/hr.
You can see that the distance we decide upon is indeed irrelevant if you plug in x for the distance and carry it through: the round trip of 2x km takes x/24 + x/12 = x/8 hours, so the average is 2x / (x/8) = 16 km/hr for any x.
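As a quick check of the reasoning above, here is a minimal Python sketch. The function name `average_speed` and its parameters are only illustrative; it divides the total distance by the total time for an out-and-back trip.

```python
def average_speed(distance_km, speed_out, speed_back):
    """Average speed over a round trip: total distance / total time."""
    time_out = distance_km / speed_out      # hours spent going A -> B
    time_back = distance_km / speed_back    # hours spent going B -> A
    return 2 * distance_km / (time_out + time_back)


# The one-way distance cancels out, so any positive value gives the same answer.
for d in (1, 24, 100):
    print(d, average_speed(d, 24, 12))
```

Note that the result is the harmonic mean of the two speeds, not the arithmetic mean (which would be 18 km/hr), because more time is spent on the slow leg.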

Related

Using Dijkstra to find the shortest path for a robot that pickup objects

So basically, I have a warehouse represented by a graph, and each node in it contains a certain amount of 3 objects (A, B, C). I have to use Dijkstra to find the shortest path the robot should take in order to collect an amount of each item provided as the input while minimizing the time.
Also, each time the robot picks up an object, the robot slows down, so the time it takes to traverse an edge is no longer equal to its distance. The given equation is Time = Distance * k, where k is a constant associated with the robot (k = 1 + mass carried), and type A objects have a mass of 1 kg, type B objects 3 kg, and type C objects 5 kg.
My question is: how can I modify or use Dijkstra's algorithm, given that I have to take into account the objects that I have to pick up and the decrease in speed?
Thanks in advance!
When calculating the cost to go to a node, keep another variable that accounts for the load carried, so that more items result in a larger, more costly number. The total cost to go to a node is then the distance between the nodes adjusted by that variable (with the given equation, Time = Distance * k). The rest of Dijkstra should still work.
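One way to make this concrete is to run Dijkstra over states of the form (node, items picked so far) instead of over plain nodes. The following Python sketch is only an illustration under assumptions that are not in the question: the function name `min_pickup_time` and the data layout are hypothetical, picking up an item takes no time, a node that holds an item type is assumed to hold enough of it (stock counts are not tracked), and the robot may stop wherever the last item is picked up.

```python
import heapq

MASS = (1, 3, 5)  # kg per unit of A, B, C, as given in the question


def min_pickup_time(graph, has_item, start, need):
    """Dijkstra over (node, picked) states.

    graph:    {node: [(neighbor, distance), ...]}
    has_item: {node: (bool, bool, bool)}  # ASSUMPTION: unlimited stock if True
    need:     (a, b, c) tuple of units of A, B, C to collect
    Returns the minimum total time, or None if the order cannot be filled.
    """
    start_picked = (0, 0, 0)
    best = {(start, start_picked): 0.0}
    heap = [(0.0, start, start_picked)]
    while heap:
        t, node, picked = heapq.heappop(heap)
        if t > best.get((node, picked), float("inf")):
            continue  # stale queue entry
        if picked == need:
            return t
        k = 1 + sum(m * p for m, p in zip(MASS, picked))  # k = 1 + mass carried
        moves = [(t + dist * k, nxt, picked) for nxt, dist in graph.get(node, [])]
        available = has_item.get(node, (False, False, False))
        for i in range(3):
            if available[i] and picked[i] < need[i]:
                new_picked = picked[:i] + (picked[i] + 1,) + picked[i + 1:]
                moves.append((t, node, new_picked))  # ASSUMPTION: picking is instant
        for new_t, new_node, new_picked in moves:
            if new_t < best.get((new_node, new_picked), float("inf")):
                best[(new_node, new_picked)] = new_t
                heapq.heappush(heap, (new_t, new_node, new_picked))
    return None


graph = {
    "S": [("X", 2), ("Y", 3)],
    "X": [("S", 2), ("Y", 2)],
    "Y": [("S", 3), ("X", 2)],
}
has_item = {"X": (False, False, True), "Y": (True, False, False)}
print(min_pickup_time(graph, has_item, "S", (1, 0, 1)))
```

In this small example the cheapest plan fetches the light item A first and the heavy item C last, which is the kind of ordering effect that plain node-based Dijkstra cannot express. The state space grows with the size of the order, so this is only practical for small inputs.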

How can a Neural Network learn from testing outputs against external conditions which it can not directly control

In order to simplify the question and hopefully the answer I will provide a somewhat simplified version of what I am trying to do.
Setting up fixed conditions:
Max Oxygen volume permitted in room = 100,000 units
Target Oxygen volume to maintain in room = 100,000 units
Maximum air processing cycles per second = 3.0 cycles per second (minimum is 0.3)
Energy (watts) used per second is given by this formula: (100 W * cycles_per_second)^2
Maximum oxygen added to air per "cycle" = 100 units (minimum 0 units)
1 person consumes 10 units of O2 per second
Max occupancy of room is 100 people (1 person is the minimum)
inputs are processed every cycle and outputs can be changed each cycle - however if an output is fed back in as an input it could only affect the next cycle.
Let's say I have these inputs:
A. current oxygen in room (range: 0 to 1000 units for simplicity - could be normalized)
B. current occupancy in room (0 to 100 people at max capacity) OR/AND could be changed to total O2 used by all people in room per second (0 to 1000 units per second)
C. current cycles per second of air processing (0.3 to 3.0 cycles per second)
D. Current energy used (which is the above current cycles per second * 100 and then squared)
E. Current Oxygen added to air per cycle (0 to 100 units)
(possible outputs fed back in as inputs?):
F. previous change to cycles per second (+ or - 0.0 to 0.1 cycles per second)
G. previous cycles O2 units added per cycle (from 0 to 100 units per cycle)
H. previous change to current occupancy maximum (0 to 100 persons)
Here are the actions (outputs) my program can take:
Change cycles per second by increment/decrement of (0.0 to 0.1 cycles per second)
Change O2 units added per cycle (from 0 to 100 units per cycle)
Change current occupancy maximum (0 to 100 persons) - (basically allowing for forced occupancy reduction and then allowing it to normalize back to maximum)
The GOALS of the program are to maintain a homeostasis of :
as close to 100,000 units of O2 in room
do not allow room to drop to 0 units of O2 ever.
allow a current occupancy of up to 100 people per room for as long as possible without forcibly removing people (as O2 in the room is depleted over time and nears 0 units, people should be removed from the room down to the minimum, and the maximum should then be allowed to recover back up to 100 as more and more O2 is added back to the room)
and ideally use the minimum energy (watts) needed to maintain the above conditions. For instance, suppose the room is down to 90,000 units of O2 and there are currently 10 people in the room (using 100 units of O2 per second). Running at 3.0 cycles per second (90 kW) with 100 units per cycle would replenish 300 units per second (a surplus of 200 units over the 100 being consumed), clearing the deficit of 10,000 units in 50 seconds for a total of 4,500 kW·s. It would be better to run at, say, 2.0 cycles per second (40 kW), which would produce 200 units per second (a surplus of 100 units over consumption) and clear the deficit of 10,000 units in 100 seconds for a total of 4,000 kW·s.
NOTE: occupancy may fluctuate from second to second based on external factors that cannot be controlled (let's say people are coming into and leaving the room at liberty). The only control the system has is to forcibly remove people from the room and/or prevent new people from coming into the room by changing the max capacity permitted at the next cycle (let's just say the system could do this). We don't want the system to impose a permanent reduction in capacity just because, running at full power, it can only output enough O2 per second for 30 people. We have a large volume of available O2, and it would take a while before that was depleted to dangerous levels and required the system to forcibly reduce capacity.
My question:
Can someone explain to me how I might configure this neural network so it can learn from each action (cycle) it takes by monitoring for the desired results? My challenge here is that most articles I find on the topic assume that you know the correct output (i.e., if inputs A, B, C, D, E all have specific values, then Output 1 should be to increase by 0.1 cycles per second).
But what I want is to meet the conditions I laid out in the GOALS above. So each time the program does a cycle, let's say it decides to try increasing the cycles per second, and the result is that available O2 is either declining by a smaller amount than in the previous cycle or is now increasing back towards 100,000. Then that output could be considered more correct than reducing or maintaining the current cycles per second. I am simplifying here, since there are multiple variables that would create the "ideal" outcome, but I think I made the point of what I am after.
Code:
For this test exercise I am using a Swift library called Swift-AI (specifically its NeuralNet module: https://github.com/Swift-AI/NeuralNet).
So if you want to tailor your response to that library, it would be helpful but is not required. I am mostly looking for the logic of how to set up the network and then configure it to do initial and iterative re-training of itself based on the conditions I listed above. I would assume that at some point, after enough cycles and different conditions, it would have the appropriate weights to handle any future condition, and re-training would become less and less impactful.
This is a control problem, not a prediction problem, so you cannot just use a supervised learning algorithm. (As you noticed, you have no target values for learning directly via backpropagation.) You can still use a neural network (if you really insist). Have a look at reinforcement learning. But if you already know what happens to the oxygen level when you take an action like forcing people out, why would you learn such simple facts through millions of trial-and-error evaluations, instead of encoding them into a model?
I suggest looking at model predictive control. If nothing else, you should study how the problem is framed there. Or maybe even just plain old PID control. It seems really easy to make a good dynamical model of this process with few state variables.
You may have a few unknown parameters in that model that you need to learn "online". But a simple PID controller can already tolerate and compensate for some amount of uncertainty. And it is much easier to fine-tune a few parameters than to learn the general cause-effect structure from scratch. That can be done, but it involves trying all possible actions. For all your algorithm knows, the best action might be to reduce the number of oxygen consumers to zero permanently by killing them, and then get a huge reward for maintaining the oxygen level with little energy. When the algorithm knows nothing about the problem, it will have to try everything out to discover the effect.
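To illustrate how little is needed once the dynamics are written down, here is a minimal Python sketch of a model-based controller. It is not a full PID or MPC implementation: it uses only a proportional term plus a feedforward term that matches current consumption. The function names, the gain `kp=0.001`, the one-second time step, and the fixed 100 units of O2 per cycle are assumptions made for illustration; the constants come from the question.

```python
TARGET_O2 = 100_000.0        # target (and maximum) oxygen in the room
O2_PER_CYCLE = 100.0         # ASSUMPTION: always add the maximum per cycle
O2_PER_PERSON = 10.0         # units consumed per person per second
MIN_CPS, MAX_CPS = 0.3, 3.0  # allowed cycles per second


def choose_cycles(oxygen, people, kp=0.001):
    """Cycles per second = consumption match (feedforward) + kp * error."""
    feedforward = people * O2_PER_PERSON / O2_PER_CYCLE
    correction = kp * (TARGET_O2 - oxygen)  # kp is an untuned, assumed gain
    return max(MIN_CPS, min(MAX_CPS, feedforward + correction))


def simulate(oxygen, people, seconds):
    """Step the simple room model once per second; return (oxygen, energy)."""
    energy = 0.0  # accumulated watt-seconds
    for _ in range(seconds):
        cps = choose_cycles(oxygen, people)
        energy += (100.0 * cps) ** 2  # power formula from the question
        oxygen += cps * O2_PER_CYCLE - people * O2_PER_PERSON
        oxygen = min(oxygen, TARGET_O2)  # room cannot exceed the maximum
    return oxygen, energy


final_oxygen, used = simulate(90_000.0, people=10, seconds=300)
print(round(final_oxygen), round(used))
```

With 10 people the controller runs flat out while the deficit is large, then eases toward the 1.0 cycles per second that exactly matches consumption. With 100 people the output saturates at 3.0 cycles per second and the level falls, which is where a rule for lowering the occupancy limit would have to take over.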

Could you explain why the expected running time of randomized quick sort is Theta of nlogn?

What is the difference between expected running time and running time? And could you explain why the expected running time of randomized quicksort is Theta of n log n?
Expected Running Time:
Often, expected running time just means the average running time over random inputs. But for a randomized algorithm, which is the case here, it means the running time averaged over the random choices made by the algorithm itself.
Proving the running time:
Proving the Θ(n log n) running time of quicksort involves more complicated mathematics; here is a reference from CMU that proves it (https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0906.pdf).
Rather than go through the intricate mathematics, I am only going to focus on how we can understand this running time intuitively.
Non-random inputs: If each pivot has rank somewhere in the middle 50 percent, that is, between the 25th percentile and the 75th percentile, then it splits the elements with at least 25% and at most 75% on each side. The recursion depth is then O(log n), and each level does O(n) work, which gives a total running time of Θ(n log n). Make sure you understand why this non-randomized quicksort has average running time Θ(n log n).
Now let's talk about the expected running time of randomized quicksort...
When the pivot is chosen at random, it is not guaranteed to be in the middle 50 percent. However, a random pivot lands in the middle 50 percent about half of the time. Imagine that you are flipping a coin until you get k heads. On average, you will only need 2k flips. By the same token, quicksort's recursion is expected to terminate within about twice the depth the non-random case would need, which is still a constant multiple of O(log n). Since each level of the call tree does O(n) work in total, the expected total work is still Θ(n log n).
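For reference, here is a minimal Python sketch of a randomized quicksort. It is a simple list-building version written for clarity rather than the in-place variant usually analyzed in lecture notes, and the function name is only illustrative. The random choice of pivot is the only source of randomness, which is what "expected" refers to above.

```python
import random


def randomized_quicksort(items):
    """Return a sorted copy of items, choosing each pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)
    # Partitioning does O(n) work across each level of the call tree.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller) + equal + randomized_quicksort(larger)


data = [5, 3, 8, 1, 9, 2, 7, 3]
print(randomized_quicksort(data))
```

About half of the random pivots fall in the middle 50 percent, so on average every second level shrinks the largest subproblem to at most 75% of its size, matching the coin-flip argument.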

Equation for determining average data transfer speed when day/night throttling limit is different

This may be better posted in Mathematics, but I figured someone on StackOverflow may have seen this before. I am trying to devise an equation for determining the average data transfer speed for backup appliances that send their data offsite to a data center.
On weekdays during the 8:00a-5:00p hours (1/3 of the day), the connection is throttled to 20% of the measured bandwidth. The remaining 2/3 of the weekday (5:00p-8:00a), the connection is throttled to 80% of the measured bandwidth. On the weekend from Friday 5:00p until Monday 8:00a, the connection is a constant 80% of the measured bandwidth.
The reason behind this is deciding whether to seed the data onto a hard drive or let the data transfer over the internet. Making this decision depends on getting a somewhat accurate bandwidth average so that I can calculate the transfer time.
I had issues coming up with an equation, so I reverse engineered a few real world occurrences using just the weekday 80%/20% average. I came up with 57.5% of the measured bandwidth, but could not extrapolate an equation from it. Now I want to write a program to determine this. I am thinking factoring in the weekend being 80% the whole time would use a similar equation.
This would be similar scenario to a car travelling at 20% of speed limit for 1/3 of the day and then 80% of speed limit for the rest of that day, and then determine average car speed for the day. I searched online and could not find any reference to an equation for this. Any ideas?
Using the idea you provided, the equation is direct:
Average = (1/3) * bandwidth_1 + (2/3) * bandwidth_2
If bandwidth_1 = 20 and bandwidth_2 = 80, the equation gives an average of 60%.
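The same time-weighted average can be written as a small Python sketch; the function name `time_weighted_average` and the (hours, percent) data layout are only illustrative. One detail worth checking: 8:00a to 5:00p is 9 hours, which is 9/24 of the day rather than exactly 1/3. Using 9 and 15 hours as the weights reproduces the 57.5% figure from the question, and adding the weekend hours gives a weekly average.

```python
def time_weighted_average(segments):
    """segments: list of (hours, percent_of_measured_bandwidth) pairs."""
    total_hours = sum(hours for hours, _ in segments)
    return sum(hours * percent for hours, percent in segments) / total_hours


# One weekday: 9 hours at 20%, the remaining 15 hours at 80%.
weekday = time_weighted_average([(9, 20), (15, 80)])

# Full week: 5 * 9 = 45 throttled hours at 20%, the other 123 hours at 80%.
week = time_weighted_average([(45, 20), (123, 80)])

print(weekday, week)
```

Weighting by time is appropriate here because the amount of data moved is rate multiplied by time. The estimated transfer time is then the data size divided by (average percent / 100 * measured bandwidth).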

significant differences between means

Considering the picture below
Each value X can be identified by the indices X_g_s_d_h
g = group g=[1:5]
s = subject number (variable for each g)
d = day number (variable for each s)
h = hour h=[1:24]
so X_1_3_4_12 means that the value X refers to the
12th hour
of 4th day
of 3rd subject
of group 1
First I calculate the mean (hour by hour) over all the days of each subject. Doing that, the index d disappears and each subject is represented by a vector containing 24 values.
X_g_s_h will be the mean over the days of a subject.
Then I calculate the mean (subject by subject) of all the subjects belonging to the same group, resulting in X_g_h. Each group is represented by 1 vector of 24 values.
Then I calculate the mean over the hours for each group, resulting in X_g. Each group is now represented by 1 single value.
I would like to see if the means X_g are significantly different between the groups.
Can you tell me what is the proper way?
ps
The number of subjects per group differs, and so does the number of days for each subject. I have more than 2 groups.
Thanks
Ok so I am posting an answer to summarize some of the problems you may have.
Same subjects in both groups
Not averaging:
1-First, if we assume that you have only one measure that is repeated every hour for a certain number of days, and that it is independent of which day and which hour you pick, then you can reshape your matrix into one column for each subject, per group, and perform a repeated-measures t-test.
2-If you cannot assume that your measure is independent of the hour, but it is independent of the day (let's say the concentration of a drug after administration that completely vanishes before your next day's measure), then you can run a repeated-measures t-test for each hour (N hours), for a total of N tests.
3-If you cannot assume that your measure is independent of the day, but it is independent of the hour (let's say a measure for the menstrual cycle, which we will assume is stable within each day but varies between days), then you can run a repeated-measures t-test for each day (M days), for a total of M tests.
4-If you cannot assume that your measure is independent of either the day or the hour, then you can run a repeated-measures t-test for each day and hour, for a total of N×M tests.
Averaging:
In the cases where you cannot assume independence, you can average over the dependent variables, thereby removing that variance but also lowering your statistical power and interpretability.
In case 2, you can average over the hours to get a mean concentration and perform a repeated-measures t-test, so you have only 1 test. Here you lose the information about how it changed from hour 1 to N, and just test whether the mean concentration within the tested hours differs between groups.
In case 3, you can average over both hour and day, and test whether, for example, the mean estrogen is higher in one group than in another, so you have only 1 test. Again you lose the information about how it changed between the different days.
In case 4, you can average over both hour and day, so you have only 1 test. Again you lose the information about how it changed between the different hours and days.
NOT same subjects in both groups
Paired tests are not possible. Follow the same ideology but perform an unpaired test.
You need to perform a statistical test of the null hypothesis H0 that the data in the different groups come from independent random samples from distributions with equal means. It's better to avoid the sequential 'mean' operations and just regroup the data by g. If you assume normality and independence of observations (as pointed out by @ASantosRibeiro below), then you can perform a t-test (http://www.mathworks.nl/help/stats/ttest2.html)
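clear all;
X = randn(6,5,4,3); % dummy data in g_s_d_h format (6 groups)
% MATLAB stores arrays column-major, so keep the group index as the row
% dimension when flattening, then transpose: column i holds group i.
Y = reshape(X,6,[])'; % 60-by-6, one column per group
h = zeros(6,6);
for i = 1 : 6
    for j = 1 : 6
        % h(i,j) = 1 if the means of groups i and j differ at the 5% level
        h(i,j) = ttest2(Y(:,i),Y(:,j));
    end
end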
If you want to take into account the different weights of the observations, you need to calculate t-value yourself (e.g., see here http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000126.htm)