Creating an optimal selection of overlapping time intervals - matlab

A car dealer rents out the rare 1956 Aston Martin DBR1 (of which Aston Martin only ever made 5).
Since there are so many rental requests, the dealer decides to place bookings for an entire year in advance.
He collects the requests and now needs to figure out which requests to take.
Make a script that selects the rental requests such that the greatest number of individual customers
can drive in the rare Aston Martin.
The input of the script is a matrix of days of the year, each row representing the starting and ending
days of the request. The output should be the indices of the customers and their day ranges.
It is encouraged to plan your code first and write your own functions.
At the top of the script, add a comment block with a description of how your code works.
Example of a list with these time intervals:
list = [10 20; 9 15; 16 17; 21 100;];
(It should also work for a list with 100 time intervals)
We could select customers 1 and 4, but then 2 and 3 are impossible, resulting in two happy customers.
Alternatively we could select requests 2, 3 and 4. Hence three happy customers is the optimum here.
The output would be:
customers = [2, 3, 4],
days = [9, 15; 16, 17; 21, 100]
All I can think of is checking if intervals intersect, but I have no clue how to make an overall optimal selection.

My idea:
1) Sort them by start date
2) For each request, build an array of the requests it intersects
3) Reject the request with the largest intersection array, and remove the rejected request from the arrays of the requests it intersected
4) Repeat step 3 until only requests with empty arrays remain
In your example we get this data:
10 20 [9 15, 16 17]
9 15 [10 20]
16 17 [10 20]
21 100 []
so we reject 10 20 because it has 2 intersections, which leaves only items with empty arrays
9 15 []
16 17 []
21 100 []
so the search is finished.
Code in JavaScript:
const inputData = ' 50 74; 6 34; 147 162; 120 127; 98 127; 120 136; 53 68; 145 166; 95 106; 242 243; 222 250; 204 207; 69 79; 183 187; 198 201; 184 199; 223 245; 264 291; 100 121; 61 61; 232 247'
// convert string to array of objects
const orders = inputData.split(';')
  .map((v, index) => (
    {
      id: index,
      start: Number(v.split(' ')[1]),
      end: Number(v.split(' ')[2]),
      intersections: []
    }
  ))
// sort them by start value
orders.sort((a, b) => a.start - b.start)
// find intersection for each one and add them to intersection array
orders.forEach((item, index) => {
  for (let i = index + 1; i < orders.length; i++) {
    if (orders[i].start <= item.end) {
      item.intersections.push(orders[i])
      orders[i].intersections.push(item)
    } else {
      break
    }
  }
})
// sort by intersections count
orders.sort((a, b) => a.intersections.length - b.intersections.length)
// loop while at least one item still has intersections
while (orders[orders.length - 1].intersections.length > 0) {
  const rejected = orders.pop()
  // remove rejected item from other's intersections
  rejected.intersections.forEach(item => {
    item.intersections = item.intersections.filter(
      item => item.id !== rejected.id
    )
  })
  // sort by intersections count
  orders.sort((a, b) => a.intersections.length - b.intersections.length)
}
// sort by start value
orders.sort((a, b) => a.start - b.start)
// show result
orders.forEach(item => { console.log(item.start + ' - ' + item.end) })

Wanted to expand/correct a little bit on the accepted answer.
You should start by sorting by the start date.
Then accept the last customer in the sorted list (the one with the latest start date).
Go through the list in descending order from there and accept every request that does not overlap with the ones already accepted.
That's the optimal solution.
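A minimal Python sketch of the descending pass described above, assuming day ranges are inclusive (so a request ending on day 20 conflicts with one starting on day 20). The function name select_requests and the 1-based customer numbering are illustrative choices, not part of the original answers.

```python
def select_requests(requests):
    """Pick a largest set of non-overlapping (start, end) requests.

    Walks the requests from latest start to earliest start and accepts
    each one that ends before the earliest start accepted so far.
    Returns (customers, days) with 1-based customer indices.
    """
    # Indices ordered by start day, latest first.
    order = sorted(range(len(requests)),
                   key=lambda i: requests[i][0], reverse=True)
    chosen = []
    earliest_start = None  # start day of the most recently accepted request
    for i in order:
        start, end = requests[i]
        # Inclusive ranges: the request must end strictly before earliest_start.
        if earliest_start is None or end < earliest_start:
            chosen.append(i)
            earliest_start = start
    chosen.sort(key=lambda i: requests[i][0])  # report in calendar order
    customers = [i + 1 for i in chosen]
    days = [requests[i] for i in chosen]
    return customers, days


customers, days = select_requests([(10, 20), (9, 15), (16, 17), (21, 100)])
print(customers)  # [2, 3, 4]
print(days)       # [(9, 15), (16, 17), (21, 100)]
```

Sorting dominates the cost, so a list of 100 requests is handled without any special care.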


How to round integer number using precision in flutter

I am trying to make the Y-axis intervals of a line chart dynamic in Flutter. Here maxVal holds the maximum value of the Y axis.
int interval = (maxVal/6).toInt();
int length = interval.toString().length.toInt();
So here I divide maxVal by 6 to get the interval, and then find its length. Next I need to round the interval according to that length, but I couldn't find any option for rounding to a given precision in Flutter.
The expected output:
If maxVal = 10000, then
interval will be 1666,
so length will be 4.
I then expect the rounded value to be 2000.
I'm assuming that you're asking to round a (non-negative) integer to its most significant base-10 digit. A general way to round non-negative integers is to add half of the unit you want to round to and to then truncate. For example, if you want to round a non-negative integer to the nearest 1000, you can add 1000/2 = 500, and then discard the hundreds, tens, and ones digits. An easy way to discard those digits is to perform an integer division by 1000 and then to multiply by 1000 again.
The trickiest part is determining what unit you want to round to since that's variable. If you want the most significant base-10 digit, you will need to determine the number of digits. In theory you can compute that with logarithms, but it's usually risky to depend on exact results with floating-point arithmetic, you'd have to deal with 0 as a special case, and it's harder to be confident of correctness. It's simpler and less error-prone to just find the length of the number's string representation. Once we find the number of digits, we can determine which unit to round to by computing a corresponding power of 10.
import 'dart:math';

/// Rounds [n] to the nearest multiple of [multiple].
int roundToMultiple(int n, int multiple) {
  assert(n >= 0);
  assert(multiple > 0);
  return (n + (multiple ~/ 2)) ~/ multiple * multiple;
}

/// Rounds [n] to its most significant digit.
int roundToMostSignificantDigit(int n) {
  assert(n >= 0);
  var numDigits = n.toString().length;
  var magnitude = pow(10, numDigits - 1) as int;
  return roundToMultiple(n, magnitude);
}

void main() {
  var inputs = [
    0,
    1,
    5,
    9,
    10,
    11,
    16,
    19,
    20,
    21,
    49,
    50,
    51,
    99,
    100,
    469,
    833,
    1666,
  ];
  for (var n in inputs) {
    var rounded = roundToMostSignificantDigit(n);
    print('$n => $rounded');
  }
}
which prints:
0 => 0
1 => 1
5 => 5
9 => 9
10 => 10
11 => 10
16 => 20
19 => 20
20 => 20
21 => 20
49 => 50
50 => 50
51 => 50
99 => 100
100 => 100
469 => 500
833 => 800
1666 => 2000
(The above code should be tweakable to handle negative numbers too if desired, but first you would need to define whether negative numbers should be rounded toward 0 or toward negative infinity.)

Find the number at the n position in the infinite sequence

Given the infinite sequence s = 1234567891011...
find the digit at position n (n <= 10^18).
Example: n = 12 => 1; n = 15 => 2
import Foundation

func findNumber(n: Int) -> Character {
    var i = 1
    var z = ""
    while i < n + 1 {
        z.append(String(i))
        i += 1
    }
    print(z)
    return z[z.index(z.startIndex, offsetBy: n-1)]
}

print(findNumber(n: 12))
That's my code, but when I try to find the digit at the 100,000th position, it returns an error. I think I am appending too many numbers to the string z.
Can anyone help me, in Swift?
The problem we have here looks fairly straightforward. Take a list of all the numbers from 1 to infinity and concatenate them into a string. Then find the nth digit. It is a straightforward problem to understand. The issue that you are seeing, though, is that we have neither infinite memory nor infinite time to do this in a computer program. So we must find an alternative that does not just append the numbers to a string and then look up the nth digit.
The first thing we can say is that we know what the entire list is. It will always be the same. So can we use any properties of this list to help us?
Let's call the input number n. This is the position of the digit that we want to find. Let's call the output digit d.
Well, first off, let's look at some examples.
We know all the single digit numbers are just in the same position as the number itself.
So, for n<10 ... d = n
What about for two digit numbers?
Well, we know that 10 starts at position 10. (Because there are 9 single digit numbers before it). 9 + 1 = 10
11 starts at position 12. Again, 9 single digits + one 2 digit number before it. 9 + 2 + 1 = 12
So how about, say... 25? Well that has 9 single digit numbers and 15 two digit numbers before it. So 25 starts at 9*1 + 15*2 + 1 = 40 (+ 1 as the sum gets us to the end of 24 not the start of 25).
So... 99 starts at? 9*1 + 89*2 + 1 = 188.
Then we do the same for the three digit numbers...
100... 9*1 + 90*2 + 1 = 190
300... 9*1 + 90*2 + 200*3 + 1 = 790 (there are 200 three digit numbers, 100 to 299, before it)
1000...? 9*1 + 90*2 + 900*3 + 1 = 2890
OK... so now I'm seeing a pattern here that seems to need to know the number of digits in each number. Well... I can get the number of digits in a number by rounding down the log(base 10) of that number and adding 1. (Rounding up alone gives the wrong answer for exact powers of 10 such as 10 or 100.)
rounding down log base 10 of 5, plus 1 = 1
rounding down log base 10 of 23, plus 1 = 2
rounding down log base 10 of 99, plus 1 = 2
rounding down log base 10 of 627, plus 1 = 3
OK... so I think I need something like...
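// in pseudo code
// n here is the number whose starting position we want
let lengthOfNumber = getLengthOfNumber(n)
var result = 0
for each i from 0 to lengthOfNumber - 2 {
    result += 9 * 10^i * (i + 1) // this gives 9*1 + 90*2 + 900*3 + ... for all shorter lengths
}
let remainder = n - 10^(lengthOfNumber - 1) // same-length numbers before n; not added in the loop above
result += remainder * lengthOfNumber
result += 1 // the sum gets us to the end of n - 1, not the start of n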
So, in the above pseudo code you can give it any number and it will return the position in the list that that number starts on.
This isn't the exact same as the problem you are trying to solve. And I don't want to solve it for you.
This is just a leg up on how I would go about solving it. Hopefully, this will give you some guidance on how you can take this further and solve the problem that you are trying to solve.
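As a rough illustration, the start-position calculation described above can be sketched in Python. The function name start_position is made up for this sketch, and it uses the string length to count digits rather than logarithms. Like the pseudo code, it only maps a number to the position where it starts; it does not solve the original "digit at position n" problem.

```python
def start_position(number):
    """Return the 1-based position in 123456789101112... where `number` starts."""
    length = len(str(number))  # number of digits in `number`
    position = 0
    # Digits used by all shorter numbers: 9*1 + 90*2 + 900*3 + ...
    for digits in range(1, length):
        position += 9 * 10 ** (digits - 1) * digits
    # Digits used by numbers of the same length that come before `number`.
    position += (number - 10 ** (length - 1)) * length
    # The sum reaches the end of the previous number, so step forward by one.
    return position + 1


print(start_position(10))    # 10
print(start_position(25))    # 40
print(start_position(1000))  # 2890
```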

How to efficiently perform nested-loop in Spark/Scala?

So I have this main dataframe, called main_DF, which contains all measurement values:
main_DF
group index width height
--------------------------------
1 1 21.3 15.2
1 2 11.3 45.1
2 3 23.2 25.2
2 4 26.1 85.3
...
23 986453 26.1 85.3
And another table called selected_DF, derived from main_DF, which contains the start & end index of important rows in main_DF, along with the length (end_index - start_index). The fields start_index and end_index correspond to the field index in main_DF.
selected_DF
group start_index end_index length
--------------------------------
1 1 154 153
2 236 312 76
3 487 624 137
...
238 17487 18624 1137
Now, for each row in selected_DF, I need to perform filtering for all measurement values between the start_index and end_index. For example, let's say row 1 is for index = 1 until 154. After some filtering, dataframe derived from this row is:
peak_DF
peak_start peak_end
--------------------------------
1 12
15 21
27 54
86 91
...
143 150
peak_start and peak_end indicate the area where width exceeds the threshold. It was obtained by selecting all rows with width > threshold and then checking the positions of their indexes (sorry, but it's kind of hard to explain, even with the code).
Then I need to take the measurement value (width) based on peak_DF and calculate the average, making it something like:
peak_DF_summary
peak_start peak_end avg_width
--------------------------------
1 12 25.6
15 21 35.7
27 54 24.2
86 91 76.6
...
143 150 13.1
And, lastly, calculate the average of avg_width, and save the result.
After that, the cursor moves to the next row in selected_DF, and so on.
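To make the summary step concrete, here is a small plain-Python sketch (not Spark) of the "average of the per-peak averages" calculation for one row of selected_DF. The names width_by_index and average_of_peak_averages are hypothetical, the data is made up, and the peak ranges are treated as inclusive, matching SQL BETWEEN.

```python
from statistics import mean


def average_of_peak_averages(width_by_index, peaks):
    """Average the widths inside each (peak_start, peak_end) range,
    then average those per-peak averages.

    width_by_index: dict mapping index -> width (illustrative stand-in for main_DF)
    peaks: list of inclusive (peak_start, peak_end) pairs (stand-in for peak_DF)
    Raises statistics.StatisticsError if a peak covers no known index.
    """
    peak_means = [
        mean(width_by_index[i]
             for i in range(start, end + 1)
             if i in width_by_index)
        for start, end in peaks
    ]
    return mean(peak_means)


# Made-up sample data, only to show the shape of the computation.
widths = {1: 10.0, 2: 20.0, 3: 30.0, 4: 50.0}
print(average_of_peak_averages(widths, [(1, 2), (3, 4)]))  # 27.5
```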
So far I somehow managed to obtain what I want with this code:
val main_DF = spark.read.parquet("hdfs_path_here")
main_DF.createOrReplaceTempView("main_DF")
val selected_DF = spark.read.parquet("hdfs_path_here").collect.par //parallelized array
val final_result_array = scala.collection.mutable.ArrayBuffer.empty[Array[Double]] //for final result
selected_DF.foreach{x =>
  val temp = x.split(',')
  val start_index = temp(1)
  val end_index = temp(2)
  //obtain peak_start and peak_end (START)
  //note: window_spec is not defined in this excerpt
  val temp_df_1 = spark.sql( " (SELECT index, width, height FROM main_DF WHERE width > 25 AND index BETWEEN " + start_index + " AND " + end_index + ")")
  val temp_df_2 = temp_df_1.withColumn("next_index", lead(temp_df_1("index"), 1).over(window_spec) ).withColumn("previous_index", lag(temp_df_1("index"), 1).over(window_spec) )
  val temp_df_3 = temp_df_2.withColumn("rear_gap", temp_df_2.col("index") - temp_df_2.col("previous_index") ).withColumn("front_gap", temp_df_2.col("next_index") - temp_df_2.col("index") )
  val temp_df_4 = temp_df_3.filter("front_gap > 9 or rear_gap > 9")
  val temp_df_5 = temp_df_4.withColumn("next_front_gap", lead(temp_df_4("front_gap"), 1).over(window_spec) ).withColumn("next_front_gap_index", lead(temp_df_4("index"), 1).over(window_spec) )
  val temp_df_6 = temp_df_5.filter("rear_gap > 9 and next_front_gap > 9").sort("index")
  //obtain peak_start and peak_end (END)
  val peak_DF = temp_df_6.select("index" , "next_front_gap_index").toDF("peak_start", "peak_end").collect
  val peak_DF_temp = peak_DF.map { y =>
    spark.sql( " (SELECT avg(width) as avg_width FROM main_DF WHERE index BETWEEN " + y(0) + " AND " + y(1) + ")")
  }
  val peak_DF_summary = peak_DF_temp.reduceLeft( (dfa, dfb) => dfa.unionAll(dfb) )
  val avg_width = peak_DF_summary.agg(mean("avg_width")).as[(Double)].first
  final_result_array += avg_width._1
}
spark.catalog.dropTempView("main_DF")
(reference)
The problem is, the code only runs until around halfway (after 20-30 iterations) before it crashes with java.lang.OutOfMemoryError: Java heap space. It runs okay when I run the iterations one by one, though.
So my questions are:
1) How can there be insufficient memory? I thought the reason would be accumulated memory usage, so I added .unpersist() for every dataframe inside the foreach loop (even though I never call .persist()), to no avail. But then, shouldn't all memory consumption be reset along with the re-initialization of variables when we enter a new iteration of the foreach loop?
2) Is there any efficient way to do this kind of calculation? I am doing a nested loop in Spark and I feel like this is a very inefficient way to do this, but so far it's the only way I can get a result.
I'm using CDH 5.7 with Spark 2.1.0. My cluster has 6 nodes with 32GB memory (each) & 40 cores (total). main_DF is based on 30GB parquet file.

Scala, iterating a collection, working out 10% points

While iterating an arbitrarily-sized List, I'd like to print some output at ~10% intervals to show that the iteration is progressing. For any list of 10 or more elements, I want 10 outputs printed.
I've played around with % and Math functions, but am not always getting 10 outputs printed unless the list sizes are multiples of 10. Would appreciate your help.
One possibility is to calculate 10% of the size based on your input, and then use IterableLike.grouped to group based on that percent:
import scala.util.Random

object Test {
  def main(args: Array[String]): Unit = {
    val range = 0 to Math.abs(Random.nextInt(100))
    val length = range.length
    // Group size: 10% of the length, rounded up
    val percent = Math.ceil((10.0 * length) / 100.0).toInt
    println(s"Printing by $percent percent")
    range.grouped(percent).foreach {
      listByPercent =>
        println(s"Printing $percent elements: ")
        listByPercent.foreach(println)
    }
  }
}
Unless the length of your list is divisible by 10, you are not going to get exactly 10 print statements. Here I am rounding the interval up (ceil), so you will have fewer print statements. You could use Math.floor instead, which would round down and give you more print statements.
// Some list
val list = List.range(0, 27)
// Find the interval that is roughly 10 percent
val interval = Math.ceil(list.length / 10.0).toInt
// Zip the list with the index, so that we can look at the indexes
list.zipWithIndex.foreach {
  case (value, index) =>
    // If an index is divisible by our interval, do your logging
    if (index % interval == 0) {
      println(s"$index / ${list.length}")
    }
    // Do something with the value here
}
Output:
0 / 27
3 / 27
6 / 27
9 / 27
12 / 27
15 / 27
18 / 27
21 / 27
24 / 27
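As an alternative that is not from the answers above: if exactly 10 outputs are wanted for any list of 10 or more elements, one option is to compute the checkpoint indices up front with integer arithmetic instead of using a fixed interval. The sketch below is in Python for brevity, and the function name progress_checkpoints is made up. It picks, for each k from 1 to 10, the index of the last element inside the first k tenths of the list.

```python
def progress_checkpoints(n, steps=10):
    """Return the sorted 0-based indices at which to report progress.

    For n >= steps this yields exactly `steps` distinct indices, the last
    one being n - 1. For shorter lists, duplicates collapse, so fewer
    indices are returned.
    """
    if n <= 0:
        return []
    # ceil(k * n / steps) - 1, computed with integers only.
    indices = {(k * n + steps - 1) // steps - 1 for k in range(1, steps + 1)}
    return sorted(indices)


items = list(range(27))
checkpoints = set(progress_checkpoints(len(items)))
for index, value in enumerate(items):
    # Do something with the value here
    if index in checkpoints:
        print(f"{index + 1} / {len(items)}")
```

For a list of 27 elements this prints 10 lines, ending with "27 / 27".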

Facebook interview: find out the order that gives max sum by selecting boxes with number in a ring, when the two next to it is destroyed

Didn't find any similar question about this.
This is a final round Facebook question:
You are given a ring of boxes. Each box has a non-negative number on it; numbers can be duplicated.
Write a function/algorithm that tells you the order in which to select the boxes to get the max sum.
The catch is, if you select a box, it is taken off the ring, and so are the two boxes next to it (to the right and the left of the one you selected).
so if I have a ring of
{10 3 8 12}
If I select 12, 8 and 10 will be destroyed and you are left with 3.
The max will be selecting 8 first then 10, or 10 first then 8.
I tried re-assigning each box a value: its own value minus the values of the two boxes next to it, as the cost.
So the old ring is {10 3 8 12}
the new ring is {-5, -15, -7, -6}, and I will pick the highest.
However, this definitely doesn't work if you have { 10, 19, 10, 0}, you should take the two 10s, but the algorithm will take the 19 and 0.
Help please?
It is most likely dynamic programming, but I don't know how.
The ring can be any size.
Here's some python that solves the problem:
def sublist(i,l):
    if i == 0:
        return l[2:-1]
    elif i == len(l)-1:
        return l[1:-2]
    else:
        return l[0:i-1] + l[i+2:]

def val(l):
    if len(l) <= 3:
        return max(l)
    else:
        return max([v+val(m) for v,m in [(l[u],sublist(u,l)) for u in range(len(l))]])

def print_indices(l):
    print("Total:",val(l))
    while l:
        vals = [v+val(m) for v,m in [(l[u],sublist(u,l)) for u in range(len(l)) if sublist(u,l)]]
        if vals:
            i = vals.index(max(vals))
        else:
            i = l.index(max(l))
        print('choice:',l[i],'index:',i,'new list:',sublist(i,l))
        l = sublist(i,l)

print_indices([10,3,8,12])
print_indices([10,19,10,0])
Output:
Total: 18
choice: 10 index: 0 new list: [8]
choice: 8 index: 0 new list: []
Total: 20
choice: 10 index: 0 new list: [10]
choice: 10 index: 0 new list: []
No doubt it could be optimized a bit. The key bit is val(), which calculates the total value of a given ring. The rest is just bookkeeping.
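One possible optimization, sketched here under the same rules as val() above: cache the value of each remaining ring so that a ring reached through different selection orders is evaluated only once. The names remove_with_neighbours and best_total are made up for this sketch, and rings are passed as tuples so they can be used as cache keys. The number of distinct rings can still grow quickly, so this is a speed-up rather than a polynomial-time algorithm.

```python
from functools import lru_cache


def remove_with_neighbours(i, ring):
    """Return the ring (as a tuple) left after taking box i and its two neighbours."""
    n = len(ring)
    dropped = {(i - 1) % n, i, (i + 1) % n}  # indices wrap around the ring
    return tuple(v for j, v in enumerate(ring) if j not in dropped)


@lru_cache(maxsize=None)
def best_total(ring):
    """Maximum sum obtainable from `ring`, a tuple of non-negative numbers."""
    if len(ring) <= 3:
        # Any pick removes every remaining box, so take the largest one.
        return max(ring, default=0)
    return max(ring[i] + best_total(remove_with_neighbours(i, ring))
               for i in range(len(ring)))


print(best_total((10, 3, 8, 12)))   # 18
print(best_total((10, 19, 10, 0)))  # 20
```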