I'm trying to review for my final and I'm going over example problems given to me by my professor. Can anyone explain the concept of how a leaky bucket works? Here's a review problem my professor gave me about leaky buckets.
A leaky bucket is at the host network interface. The data rate in the network is 2 Mbyte/s and the data rate from the application to the bucket is 2.5 Mbyte/s.
A.) Suppose the host has 250 Mbytes to send onto the network and it sends the data in one burst. What should the minimum capacity of the bucket (in bytes) be so that no data is lost?
B.) Suppose the capacity of the bucket is 100 Mbytes. What is the longest burst time from the host such that no data is lost?
The leaky bucket symbolizes a bucket with a small hole at the bottom that lets water (data) out. Since the top of the bucket has a greater aperture than the bottom, you can pour water in faster than it flows out (so the bucket fills up).
Basically, it represents a buffer on a network between 2 links with different rates.
Problem A
We can compute that sending the data will take 250 Mbyte / (2.5 Mbyte/s) = 100 s.
During those 100 s, the bucket will have retransmitted (leaked) 100 s * 2 Mbyte/s = 200 Mbytes.
So the bucket will need a minimum capacity of 250 MB - 200 MB = 50 MB in order not to lose any data.
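If it helps to sanity-check the arithmetic, here is the same calculation as a tiny Swift sketch (plain constants, nothing library-specific):

let burstMB = 250.0          // data sent in one burst (Mbyte)
let inRate = 2.5             // application -> bucket (Mbyte/s)
let outRate = 2.0            // bucket -> network (Mbyte/s)

let burstTime = burstMB / inRate        // 100 s to pour the burst into the bucket
let drained = outRate * burstTime       // 200 Mbyte leak out during that time
let minCapacity = burstMB - drained     // 50 Mbyte must be buffered
print("Minimum bucket capacity: \(minCapacity) Mbyte")   // prints 50.0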
Problem B
Since the difference between the 2 data rates is 2.5 MB/s - 2.0 MB/s = 0.5 MB/s, the bucket fills up at 0.5 MB/s (when both links transmit at full capacity).
You can then calculate that the 100 MB capacity will be filled after a burst of 100 MB / 0.5 MB/s = 200 s = 3 min 20 s.
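And a similar Swift sketch for Problem B, where the only derived value is the 0.5 MB/s net fill rate:

let capacityMB = 100.0
let netFillRate = 2.5 - 2.0                  // bucket fills at 0.5 Mbyte/s during a burst
let maxBurstTime = capacityMB / netFillRate  // 200 s = 3 min 20 s
print("Longest burst without loss: \(maxBurstTime) s")   // prints 200.0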
Interesting problem - here's my attempt at solving A (no guarantees it's right, though!)
So rate in = 2.5, rate out = 2.0, where the rate is in Mbyte/s.
So in 1 second, the bucket will gain 2.5 - 2.0 = 0.5 Mbyte.
1) If the host sends 250 Mbytes, this will take 100 seconds to transfer into the bucket at 2.5 Mbytes/s.
2) If the bucket drains at 2.0 Mbytes/s, then it will have drained 100 * 2 = 200 Mbytes.
So I think you need a bucket with 50 Mbytes of capacity.
When choosing the Enterprise Plan for IBM Event Streams, there is a huge cost associated with "Base Capacity Unit-Hour": more than $5K per month if I put in 720 hours (assuming 1 month is 720 hours).
This makes it way too expensive and makes me wonder if I understood correctly what "Base Capacity Unit-Hour" means.
Just got this from an IBM rep:
The price is $6.85 per Base Capacity Unit, and that's by the hour. So, broken down, we have 6.85 * 24 * 30 = $4,932/month, which makes your estimate correct. A Base Capacity Unit covers 150 MB/s (3 brokers) and 2 TB of storage. If you find you need to scale up, the rate will increase from there.
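If it helps, a quick sketch of that arithmetic in Swift (the only input is the $6.85/hour figure quoted by the rep):

let ratePerUnitHour = 6.85            // USD per Base Capacity Unit-Hour (rep's figure)
let hoursPerMonth = 24.0 * 30.0       // the question's 720-hour month
let monthlyCost = ratePerUnitHour * hoursPerMonth
print(monthlyCost)                    // 4932.0 USD for a single Base Capacity Unit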
In order to simplify the question and hopefully the answer I will provide a somewhat simplified version of what I am trying to do.
Setting up fixed conditions:
Max Oxygen volume permitted in room = 100,000 units
Target Oxygen volume to maintain in room = 100,000 units
Maximum air processing cycles per second = 3.0 (minimum is 0.3)
Energy (watts) used per second is this formula: (100 W * cycles_per_second)²
Maximum Oxygen Added to Air per "cycle" = 100 units (minimum 0 units)
1 person consumes 10 units of O2 per second
Max occupancy of room is 100 persons (1 person is the minimum)
inputs are processed every cycle and outputs can be changed each cycle - however if an output is fed back in as an input it could only affect the next cycle.
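To make the rules above concrete, here is a minimal Swift sketch of a single one-second tick of the room; RoomState and step are illustrative names of my own, not from any library:

import Foundation

struct RoomState {
    var o2InRoom: Double         // units, capped at 100_000
    var occupancy: Double        // people currently in the room
    var cyclesPerSecond: Double  // 0.3 ... 3.0
    var o2PerCycle: Double       // units added per cycle, 0 ... 100
}

// Advances the room by one second and returns the watts spent during it.
func step(_ room: inout RoomState) -> Double {
    let added = room.cyclesPerSecond * room.o2PerCycle   // O2 produced this second
    let consumed = room.occupancy * 10                   // 10 units per person per second
    room.o2InRoom = min(100_000, max(0, room.o2InRoom + added - consumed))
    return pow(100 * room.cyclesPerSecond, 2)            // energy rule: (100 W * cycles/s)^2
}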
Let's say I have these inputs:
A. current oxygen in room (range: 0 to 1000 units for simplicity - could be normalized)
B. current occupancy in room (0 to 100 people at max capacity) OR/AND could be changed to total O2 used by all people in room per second (0 to 1000 units per second)
C. current cycles per second of air processing (0.3 to 3.0 cycles per second)
D. Current energy used (which is the above current cycles per second * 100 and then squared)
E. Current Oxygen added to air per cycle (0 to 100 units)
(possible outputs fed back in as inputs?):
F. previous change to cycles per second (+ or - 0.0 to 0.1 cycles per second)
G. previous cycles O2 units added per cycle (from 0 to 100 units per cycle)
H. previous change to current occupancy maximum (0 to 100 persons)
Here are the actions (outputs) my program can take:
Change cycles per second by increment/decrement of (0.0 to 0.1 cycles per second)
Change O2 units added per cycle (from 0 to 100 units per cycle)
Change current occupancy maximum (0 to 100 persons) - (basically allowing for forced occupancy reduction and then allowing it to normalize back to maximum)
The GOALS of the program are to maintain a homeostasis of :
as close to 100,000 units of O2 in room
do not allow room to drop to 0 units of O2 ever.
allow for a current occupancy of up to 100 people per room for as long as possible without forcibly removing people (as the O2 in the room is depleted over time and nears 0 units, people should be removed from the room down to the minimum, and then the maximum should be allowed to recover back up to 100 as more and more O2 is added back to the room)
and ideally use the minimum energy (watts) needed to maintain the above two conditions. For instance, if the room were down to 90,000 units of O2 and there were currently 10 people in the room (using 100 units of O2 per second), then instead of running at 3.0 cycles per second (90 kW) with 100 units added per cycle to replenish 300 units per second total (a surplus of 200 units over the 100 being consumed) for 50 seconds to cover the deficit of 10,000 units, at a total of 4,500 kJ used, it would be more ideal to run at, say, 2.0 cycles per second (40 kW), which would produce 200 units per second (a surplus of 100 units over consumed units) for 100 seconds to cover the deficit of 10,000 units, using a total of only 4,000 kJ (see the sketch right after this list).
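Here is a small Swift sketch of that energy comparison, assuming the energy and O2 rules from the fixed conditions above; energyToRefill is just an illustrative helper of my own:

import Foundation

func energyToRefill(deficit: Double, consumptionPerSecond: Double,
                    cyclesPerSecond: Double, o2PerCycle: Double) -> Double {
    let surplus = cyclesPerSecond * o2PerCycle - consumptionPerSecond  // net units gained per second
    let seconds = deficit / surplus                                    // time to cover the deficit
    let watts = pow(100 * cyclesPerSecond, 2)                          // (100 W * cycles/s)^2
    return watts * seconds / 1000                                      // kilojoules (kW·s)
}

let fast = energyToRefill(deficit: 10_000, consumptionPerSecond: 100,
                          cyclesPerSecond: 3.0, o2PerCycle: 100)       // 4,500 kJ over 50 s
let slow = energyToRefill(deficit: 10_000, consumptionPerSecond: 100,
                          cyclesPerSecond: 2.0, o2PerCycle: 100)       // 4,000 kJ over 100 s
print(fast, slow)                                                      // 4500.0 4000.0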
NOTE: Occupancy may fluctuate from second to second based on external factors that cannot be controlled (let's say people are coming and going from the room at liberty). The only control the system has is to forcibly remove people from the room and/or prevent new people from entering by changing the max capacity permitted at the next cycle (let's just say the system can do this). We don't want the system to impose a permanent reduction in capacity just because, at full power, it can only output enough O2 per second for 30 people. We have a large volume of available O2, and it would take a while before that was depleted to dangerous levels and the system had to forcibly reduce capacity.
My question:
Can someone explain to me how I might configure this neural network so it can learn from each action (cycle) it takes by monitoring for the desired results? My challenge here is that most articles I find on the topic assume that you know the correct output answer (i.e., if I know inputs A, B, C, D, E are all at specific values, then Output 1 should be to increase cycles per second by 0.1).
But what I want is to meet the conditions I laid out in the GOALS above. So each time the program runs a cycle, let's say it decides to try increasing the cycles per second, and the result is that the available O2 is either declining by a smaller amount than it was the previous cycle or is now increasing back towards 100,000; then that output could be considered more correct than reducing or maintaining the current cycles per second. I am simplifying here, since there are multiple variables that would create the "ideal" outcome, but I think I've made the point of what I am after.
Code:
For this test exercise I am using a Swift library called Swift-AI (specifically its NeuralNet module: https://github.com/Swift-AI/NeuralNet).
So if you want to tailor your response to that library it would be helpful, but it's not required. I am more just looking for the logic of how to set up the network and then configure it to do initial and iterative re-training of itself based on the conditions I listed above. I would assume that at some point, after enough cycles and different conditions, it would have the appropriate weightings set up to handle any future condition, and re-training would become less and less impactful.
This is a control problem, not a prediction problem, so you cannot just use a supervised learning algorithm. (As you noticed, you have no target values for learning directly via backpropagation.) You can still use a neural network (if you really insist); have a look at reinforcement learning. But if you already know what happens to the oxygen level when you take an action like forcing people out, why would you learn such simple facts through millions of trial-and-error evaluations instead of encoding them into a model?
I suggest looking at model predictive control. If nothing else, you should study how the problem is framed there. Or maybe even just plain old PID control. It seems really easy to make a good dynamical model of this process with a few state variables.
You may have a few unknown parameters in that model that you need to learn "online". But a simple PID controller can already tolerate and compensate for some amount of uncertainty, and it is much easier to fine-tune a few parameters than to learn the general cause-effect structure from scratch. That can be done, but it involves trying all possible actions. For all your algorithm knows, the best action might be to reduce the number of oxygen consumers to zero permanently by killing them, and then collect a huge reward for maintaining the oxygen level with little energy. When the algorithm knows nothing about the problem, it has to try everything out to discover the effects.
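To make the PID suggestion concrete, here is a minimal Swift sketch (it does not use Swift-AI; the gains kp/ki/kd are placeholders you would have to tune, and the clamping mirrors the +/- 0.1 cycles-per-second step and the 0.3-3.0 range from the question):

struct PIDController {
    var kp: Double, ki: Double, kd: Double
    private var integral = 0.0
    private var lastError = 0.0

    init(kp: Double, ki: Double, kd: Double) {
        self.kp = kp; self.ki = ki; self.kd = kd
    }

    // Returns a correction from the O2 error; the caller decides how to apply it.
    mutating func update(target: Double, measured: Double, dt: Double) -> Double {
        let error = target - measured
        integral += error * dt
        let derivative = (error - lastError) / dt
        lastError = error
        return kp * error + ki * integral + kd * derivative
    }
}

// One control tick: clamp the correction to the allowed +/- 0.1 cycles/s step
// and keep the result inside the 0.3 ... 3.0 range from the question.
var pid = PIDController(kp: 0.001, ki: 0.0, kd: 0.0)   // placeholder gains
var cyclesPerSecond = 1.0
let correction = pid.update(target: 100_000, measured: 92_500, dt: 1.0)
let delta = max(-0.1, min(0.1, correction))
cyclesPerSecond = max(0.3, min(3.0, cyclesPerSecond + delta))
print(cyclesPerSecond)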
We have Aerospike running on SoftLayer on bare metal machines in a 2-node cluster. Our profile average size is 1.5 KB and, at peak, operations are around 6,000 ops/sec on each node. The latencies are all fine: at peak, around 5% of operations are > 1 ms.
Now we have planned to migrate to AWS, so we booted 2 i3.xlarge machines. We ran the benchmark with the 1.5 KB object size at 3x load; results were satisfactory, around 4-5% (> 1 ms). Then we started actual processing: the latencies at peak jumped to 25-30% > 1 ms, and the maximum it can accommodate is some 5K ops/sec. So we added one more node and ran the benchmark again (4.5 KB object size and 3x load). The results were 2-4% (> 1 ms). After adding it to the cluster, the peak came down to 16-22%. We added one more node and the peak is now at 10-15%.
The version in AWS is aerospike-server-community-3.15.0.2; the version in SoftLayer is Aerospike Enterprise Edition 3.6.3.
Our config is as follows:
#Aerospike database configuration file.
service {
user xxxxx
group xxxxx
run-as-daemon
paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
pidfile /var/run/aerospike/asd.pid
service-threads 8
transaction-queues 8
transaction-threads-per-queue 8
proto-fd-max 15000
}
logging {
#Log file must be an absolute path.
file /var/log/aerospike/aerospike.log {
context any info
}
}
network {
service {
port 13000
address h1 reuse-address
}
heartbeat {
mode mesh
port 13001
address h1
mesh-seed-address-port h1 13001
mesh-seed-address-port h2 13001
mesh-seed-address-port h3 13001
mesh-seed-address-port h4 13001
interval 150
timeout 10
}
fabric {
port 13002
address h1
}
info {
port 13003
address h1
}
}
namespace XXXX {
replication-factor 2
memory-size 27G
default-ttl 10d
high-water-memory-pct 70
high-water-disk-pct 60
stop-writes-pct 90
storage-engine device {
device /dev/nvme0n1
scheduler-mode noop
write-block-size 128K
}
}
What should be done to bring down the latencies in AWS?
This comes down to the difference in the performance characteristics of the SSDs of the i3 nodes compared to what you had on SoftLayer. If you ran Aerospike on a floppy disk you'd get 0.5 TPS.
Piyush's comment mentions ACT, the open source tool Aerospike has created to benchmark SSDs with real database workloads. The point of ACT is to find the sustained rate in which the SSD can be relied on to deliver the latency you want. Burst rates don't matter much for databases.
The performance engineering team at Aerospike has used ACT to find what the i3's 1900G SSD can do, and published the results in a post. Its ACT rating is 4x, meaning that the full 1900G SSD can do 8Ktps reads and 4Ktps writes with the standard 1.5K object size and 128K block size, and stay at 95% < 1ms, 99% < 8ms, 99.9% < 64ms. This is not particularly good for an SSD. By comparison, a Micron 9200 PRO rates at 94.5x, nearly 24 times higher TPS load. What's more, with the i3.xlarge you're sharing half that drive with a neighbor. There's no way to cap the IOPS so that you each get half; there's only a partition of the storage. This means that you can expect latency spikes originating from the neighbor. The i3.2xlarge is the smallest instance that gives you the entire SSD.
So, you take the ACT information and use it to do capacity planning. The main factors you need to know are the average object size (you can find that using the objsz histogram), the number of objects (again, available via asadm), and the peak read TPS and peak write TPS (how does the 60Ktps you mentioned split between reads and writes?).
Check your logs for your cache-read-pct values. If they're in the range of 10% or higher, you should raise your post-write-queue value to get better read latencies (and also reduce IOPS pressure on the drive).
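For reference, post-write-queue goes inside the namespace's storage-engine device block; something along these lines, where 1024 is only an example value to tune against your cache-read-pct and available RAM (the queue costs post-write-queue * write-block-size of memory per device):

storage-engine device {
device /dev/nvme0n1
scheduler-mode noop
write-block-size 128K
post-write-queue 1024
}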
I'm facing difficulty with the following question:
Consider a disk drive with the following specifications.
16 surfaces, 512 tracks/surface, 512 sectors/track, 1 KB/sector, rotation speed 3000 rpm. The disk is operated in cycle stealing mode whereby whenever 1 byte word is ready it is sent to memory; similarly for writing, the disk interface reads a 4 byte word from the memory in each DMA cycle. Memory Cycle time is 40 ns. The maximum percentage of time that the CPU gets blocked during DMA operation is?
The solution to this question, provided on the only site I could find, is:
Revolutions Per Min = 3000 RPM
or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
No. of tracks read per second = (2^19/2^2)*50
= 6553600 ............. (1)
Interrupt = 6553600 takes 0.2621 sec
Percentage Gain = (0.2621/1)*100
= 26 %
I have understood it up to (1).
Can anybody explain to me where the 0.2621 comes from? How is the interrupt time calculated? Please help.
Reversing from the numbers you've given, that's 6553600 * 40 ns, which gives 0.2621 sec.
One quite obvious problem is that the comments in the calculations are somewhat wrong. It's not
Revolutions Per Min = 3000 RPM ~ or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
No. of tracks read per second = (2^19/2^2)*50 <- WRONG
The numbers are 512K / 4 * 50, so it's in bytes. How could that be called a 'number of tracks'? Reading the full track takes 1 full rotation, so the number of tracks readable in 1 second is 50, as there are 50 RPS.
However, the total number of bytes readable in 1 s is then just 512K * 50, since 512K is the amount of data on the track.
But then it is further divided by 4...
So, I guess, the actual comments should be:
Revolutions Per Min = 3000 RPM ~ or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
Interrupts per second = (2^19/2^2) * 50 = 6553600 (*)
Interrupt triggers one memory op, so then:
total wasted: 6553600 * 40ns = 0.2621 sec.
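Written out in Swift, the quoted arithmetic is simply:

let bytesPerRevolution = 512.0 * 1024.0      // 2^19 bytes on one track
let revolutionsPerSecond = 3000.0 / 60.0     // 50 RPS
let transfersPerSecond = bytesPerRevolution / 4 * revolutionsPerSecond  // 6,553,600 4-byte words
let stolenSeconds = transfersPerSecond * 40e-9   // 0.2621 s of memory cycles stolen per second
print(stolenSeconds * 100)                       // ~26 % of the time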
However, I don't really like how the 'number of interrupts per second' is calculated. I currently don't see/feel/guess how or why it's just bytes/4.
The only VAGUE explanation of that "divide by 4" I can think of is:
At each byte written to the controller's memory, an event is triggered. However, the DMA controller can read only PACKETS of 4 bytes. So, the hardware DMA controller must WAIT until there are at least 4 bytes ready to be read. Only then does the DMA kick in and halt the bus (or part of it) for the duration of the one memory cycle needed to copy the data. As the bus is frozen, the processor MAY have to wait. It doesn't NEED to; it can be doing its own ops and working on cache, but if it tries touching the memory, it will need to wait until the DMA finishes.
However, I don't like a few things in this "explanation". I cannot guarantee you that it is valid. It really depends on what architecture you are analyzing and how the DMA/CPU/bus are organized.
The only mistake is that it's not
no. of tracks read
It's actually the no. of interrupts that occurred (the no. of times the DMA came up with its data; that's how many times the CPU will be blocked).
But again, I don't know why it has been multiplied by 50 - probably because of 1 second - but I wish to solve this without multiplying by 50.
My Solution:
Here, in 1 rotation the interface can read 512 KB of data. 1 rotation time = 0.02 sec. So, one byte's data preparation time = 39.1 ns; for 4 B it takes 156.4 ns. Memory cycle time = 40 ns. So, the % of time the CPU gets blocked = 40/(40+156.4) = 0.2036 ≈ 20%. But in the answer booklet the options given are A) 10 B) 25 C) 40 D) 50. Tell me if I'm doing something wrong?
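For comparison, here is that alternative calculation written out in Swift (it treats 512 KB as 512,000 bytes, which is where the 39.1 ns figure comes from):

let rotationTime = 60.0 / 3000.0            // 0.02 s per revolution
let bytePrepTime = rotationTime / 512_000   // ~39.1 ns to deliver one byte
let wordPrepTime = bytePrepTime * 4         // ~156.3 ns per 4-byte word
let memoryCycle = 40e-9                     // 40 ns
let blockedFraction = memoryCycle / (memoryCycle + wordPrepTime)
print(blockedFraction)                      // ~0.204, i.e. roughly 20 %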
We are going to use MongoDB for an automated alert notification system. This will also report different server statistics and business statistics. We would like to have a separate server for this and need to assess the hardware (RAM, hard disk and other configuration, if any).
Could someone shed some light on this, please?
What are the things to consider?
How do we proceed once we collect that information (is there any standard)?
Currently I have only the information below.
Writes per second: 400
Average record size per write: 5 KB
Data retention policy: 30 days
MongoDB buffers writes in memory and flushes them to disk once in a while (60 sec by default; configurable with --syncdelay), so writing 400 5 KB docs per second is not going to be a problem if Mongo can quickly update all the indices (it would be helpful if you could give some info on the type and number of indices you're going to have).
You're going to have 1,036,800,000 documents / ~5 TB of raw data each month. Mongo will need more than 5 TB to store that (for each doc it repeats all key names, plus indices). To estimate index size:
2 * [ n * ( 18 bytes overhead + avg size of indexed field + 5 or so bytes of conversion fudge factor ) ]
Where n is the number of documents you have.
And then you can estimate the amount of RAM (you need to fit your indices there if you care about query performance).
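As a rough sketch of that sizing in Swift, using the figures from the question and the rule of thumb above (the 8-byte average indexed field size is just an assumed example):

let writesPerSecond = 400.0
let docSizeKB = 5.0
let retentionDays = 30.0

let docs = writesPerSecond * 86_400 * retentionDays   // ~1.04 billion documents
let rawDataTB = docs * docSizeKB / 1_000_000_000      // ~5.2 TB of raw document data

// Index size rule of thumb from above, for a single indexed field:
let avgIndexedFieldSize = 8.0                                     // assumed example value
let indexBytes = 2 * (docs * (18 + avgIndexedFieldSize + 5))      // ~64 GB for that one index
print(docs, rawDataTB, indexBytes / 1_000_000_000)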