RxScala recursive stream with timeout - scala

I'm trying to recursively define an observable that either emits items from a subject or, if a certain amount of time passes, a default value, in this case I'm using the timer's default value of zero. I'm using RxScala and have begun with the following code:
val s = PublishSubject[Int]()
def o: Observable[Unit] = {
val timeout = Observable.timer(1 second)
Observable.amb(s, timeout)
.first
.concatMap((v) => {
println(v)
o
})
}
ComputationScheduler().createWorker.schedule {
var value = 0
def loop(): Unit = {
Thread.sleep(5000)
s.onNext(value + 1)
value += 1
loop()
}
loop()
}
o.toBlocking.last
This seems like it should work, but the output is confusing. Every other sequence of zeros contains two instead of the expected four. Two zeros are emitted and the remaining three seconds elapses, but without output.
0
0
0
0
1
0
0
2
0
0
0
0
3
0
0
4
0
0
0
0
5
0
0
6
0
0
0
0
7
0
0
8

This one is puzzling indeed! So here's a theory:
In truth, your code is producing 4 ticks every 5 seconds rather than 5.
There is a race condition on the 4th, one, won first by the timeout, then the worker, then the timeout, etc.
So, instead of the sequence being 00001 002 00003... look at it as 0000 1002 0000...
So you might have 2 separate problems here and I can't do much without fiddling with it, but things you can try:
Add also a serial number to o(), so you can see which timouts are not winning the race.
Change the values from 1 and 5 seconds to some that are not multiple of each other, like 1.5 and 5. This might help you take one problem out and focus on the other.
Put an external, unrelated worker to print "----" every second. Start it after 0.3 seconds or so. Might give you a better idea of where the divide is.

Refactoring your code to the following generates the expected result (on my machine):
object Test {
def main(args: Array[String]) {
val s = PublishSubject[Int]()
val timeout = Observable.timer(1 second)
def o: Observable[Unit] = {
Observable.amb(s, timeout).first
.concatMap((v) => {
println(v)
o
})
}
var value = 0
NewThreadScheduler().createWorker.scheduleRec {
Thread.sleep(5000)
value += 1
s.onNext(value)
}
o.toBlocking.last
}
}
Notice the switch to the NewThreadScheduler and the use of the scheduleRec method as opposed to manual recursive scheduling.

Related

How to generate increasing sequence in ScalaCheck?

I'm trying to generate sequence of increasing numbers using ScalaCheck.
I would like to achieve something like this:
0 2 4 6
Which was achieved by increasing range 0..3 by step of 2:
0 * 2 = 0
1 * 2 = 2
2 * 2 = 4
3 * 2 = 6
Thanks for help.
Sorry if question has been questioned before.
Well it appears not so difficult to generate random sequence. Sorry I needed to be more specific about predictability.
object GenerateSequence {
def apply(maxSize: Int, maxStep: Int): Gen[Seq[Int]] = {
for {
size <- Gen.chooseNum(1, maxSize)
step <- Gen.chooseNum(1, maxStep)
} yield {
(0 to size).map(_ * step)
}
}
}
It's not really using ScalaCheck, but you can use the stream of even numbers:
>val even = Stream.from(0,2)
>even.take(4)
// Display
// res2: Stream[Int] = Stream(0, 2, 4, 6)

ForAll in scala check skips some input and do not respect containers size

I am new to scala check and I want to test the following piece of my application. I want to generate 30 and 20 random events and check if my application code correctly computes a result
// generate 30 random events
val eventGenerator: Gen[Event] = for {
d <- Gen.oneOf[String](Seq("es1", "es2", "es3"))
t <- Gen.choose[Long](minEvent.getTime, maxEvent.getTime)
s <- Gen.oneOf[String](Seq("s1", "s2", "s3", "s4", "s5", "s6", "s7"))} yield Event(d, t, s)
val eventsGenerator: Gen[List[VpSearchLog]] = Gen.containerOfN[List, VpSearchLog](30, eventGenerator)
// generate 20 random instances
val instanceGenerator: Gen[Instance] = for {
d <- Gen.oneOf[String](Seq("es1", "es2", "es3"))
t <- Gen.choose[Long](minInstance.getTime, maxInstance.getTime)} yield Instance(d, new Timestamp(t))
val instancesGenerator: Gen[List[Instance]] = Gen.containerOfN[List, Instance](20, instanceGenerator)
val p: Prop = forAll(instancesGenerator, eventsGenerator) { (i, e) =>
println(i.size)
println(e.size)
println()
val instancesWithFeature = computeExpected(instance)
isEqual(transform(instance), instanceWithFeature)
}
For some reason I see this in the stdout
20
15
20
7
20
3
20
1
20
0
starting to compute expected:
Basically it looks like the forAll generates a couple of inputs with a certain size and then skips them. For some reaon, it starts to compute things when one of the input has size 0 and then it starts the proper check. My questions are:
why if I use containerofN or listOfN I don't get exactly input of that specific size? How can I then generate input like this?
is it normal that forAll starts to explore the space of the possible input and skips some of them? Am I missing something here? This behaviour is quite counter intuitive for me
You may need to use forAllNoShrink to avoid the known defect in ScalaCheck where shrinking fails to respect generators
val thirtyInts: Gen[List[Int]] =
Gen.listOfN[Int](30, Gen.const(99))
val twentyLongs: Gen[List[Long]] =
Gen.listOfN[Long](20, Gen.const(44L))
property("listOfN") = {
Prop.forAllNoShrink(thirtyInts, twentyLongs) { (ii, ll) =>
ii.size == 30 && ll.size == 20
}
}

Number of Cycles from list of values, which are mix of positives and negatives in Spark and Scala

Have an RDD with List of values, which are mix of positives and negatives.
Need to compute number of cycles from this data.
For example,
val range = List(sampleRange(2020,2030,2040,2050,-1000,-1010,-1020,Starting point,-1030,2040,-1020,2050,2040,2020,end point,-1060,-1030,-1010)
the interval between each value in above list is 1 second. ie., 2020 and 2030 are recorded in 1 second interval and so on.
how many times it turns from negative to positive and stays positive for >= 2 seconds.
If >= 2 seconds it is a cycle.
Number of cycles: Logic
Example 1: List(1,2,3,4,5,6,-15,-66)
No. of cycles is 1.
Reason: As we move from 1st element of list to 6th element, we had 5 intervals which means 5 seconds. So one cycle.
As we move to 6th element of list, it is a negative value. So we start counting from 6th element and move to 7th element. The negative values are only 2 and interval is only 1. So not counted as cycle.
Example 2:
List(11,22,33,-25,-36,-43,20,25,28)
No. of cycles is 3.
Reason: As we move from 1st element of list to 3rd element, we had 2 intervals which means 2 seconds. So one cycle As we move to 4th element of list, it is a negative value. So we start counting from 4th element and move to 5th, 6th element. we had 2 intervals which means 2 seconds. So one cycle As we move to 7th element of list, it is a positive value. So we start counting from 7th element and move to 8th, 9th element. we had 2 intervals which means 2 seconds. So one cycle.
range is a RDD in the use case. It looks like
scala> range
range: Seq[com.Range] = List(XtreamRange(858,890,899,920,StartEngage,-758,-790,-890,-720,920,940,950))
You can encode this "how many times it turns from negative to positive and stays positive for >= 2 seconds. If >= 2 seconds it is a cycle." pretty much directly into a pattern match with a guard. The expression if(h < 0 && ht > 0 && hht > 0) checks for a cycle and adds one to the result then continues with the rest of the list.
def countCycles(xs: List[Int]): Int = xs match {
case Nil => 0
case h::ht::hht::t if(h < 0 && ht > 0 && hht > 0) => 1 + countCycles(t)
case h::t => countCycles(t)
}
scala> countCycles(range)
res7: Int = 1
A one liner
range.sliding(3).count{case f::s::t::Nil => f < 0 && s > 0 && t > 0}
This generates all sub-sequences of length 3 and counts how many are -ve, +ve, +ve
Generalising cycle length
def countCycles(n:Int, xs:List[Int]) = xs.sliding(n+1)
.count(ys => ys.head < 0 && ys.tail.forall(_ > 0))
The below code would help you resolve you query.
object CycleCheck {
def main(args: Array[String]) {
var data3 = List(1, 4, 82, -2, -12, "startingpoint", -9, 32, 76,45, -98, 76, "Endpoint", -24)
var data2 = data3.map(x => getInteger(x)).filter(_ != "unknown").map(_.toString.toInt)
println(data2)
var nCycle = findNCycle(data2)
println(nCycle)
}
def getInteger(obj: Any) = obj match {
case n: Int => obj
case _ => "unknown"
}
def findNCycle(obj: List[Int]) : Int = {
var cycleCount =0
var sign = ""
var signCheck="+"
var size = obj.size - 1
var numberOfCycles=0
var i=0
for( x <- obj){
if (x < 0){
sign="-"
}
else if (x > 0){
sign="+"
}
if(signCheck.equals(sign))
cycleCount=cycleCount+1
if(!signCheck.equals(sign) && cycleCount>1){
cycleCount = 1
numberOfCycles=numberOfCycles+1
}
if(size==i && cycleCount>1)
numberOfCycles= numberOfCycles+1
if(cycleCount==1)
signCheck = sign;
i=i+1
}
return numberOfCycles
}
}

Executors run longer than timeout value

Here is a code segment of scala. I set timeout as 100 mills. Out of 10000 loops, 106 of them run more than 100 mills without throwing exceptions. The largest one is even 135 mills. Any reason why this is happening?
for (j <- 0 to 10000) {
total += 1
val executor = Executors.newSingleThreadExecutor
val result = executor.submit[Int](new Callable[Int] {
def call = try {
Thread.sleep(95)
for (i <- 0 to 1000000) {}
4
} catch {
case e: Exception => exception1 += 1
5
}
})
try {
val t1 = Calendar.getInstance.getTimeInMillis
result.get(100, TimeUnit.MILLISECONDS)
val t2 = Calendar.getInstance.getTimeInMillis
println("timediff = " + (t2 - t1).toString)
} catch {
case e: Exception => exception2 += 1
}
}
Firstly, if you're running on Windows you should be aware that the timer resolution is around 15.6 milliseconds.
Secondly, your empty loop of 1M iterations is quite likely to be removed by a compiler, and more importantly, can't be interrupted by any timeout.
The way a thread sleep works is that a thread asks the o/s to interrupt it after the given time. That's how the timeout works in the result.get call. Now you're relying on the OS thread that does this to be running at the exact time when your timeout has expired, which of course it may not be. Then there is the fact you have 10000 threads for it to interrupt which it can't do all at the exact same time.

Why are XOR often used in java hashCode() but another bitwise operators are used rarely?

I often see code like
int hashCode(){
return a^b;
}
Why XOR?
Of all bit-operations XOR has the best bit shuffling properties.
This truth-table explains why:
A B AND
0 0 0
0 1 0
1 0 0
1 1 1
A B OR
0 0 0
0 1 1
1 0 1
1 1 1
A B XOR
0 0 0
0 1 1
1 0 1
1 1 0
As you can see for AND and OR do a poor job at mixing bits.
OR will on average produce 3/4 one-bits. AND on the other hand will produce on average 3/4 null-bits. Only XOR has an even one-bit vs. null-bit distribution. That makes it so valuable for hash-code generation.
Remember that for a hash-code you want to use as much information of the key as possible and get a good distribution of hash-values. If you use AND or OR you'll get numbers that are biased towards either numbers with lots of zeros or numbers with lots of ones.
XOR has the following advantages:
It does not depend on order of computation i.e. a^b = b^a
It does not "waste" bits. If you change even one bit in one of the components, the final value will change.
It is quick, a single cycle on even the most primitive computer.
It preserves uniform distribution. If the two pieces you combine are uniformly distributed so will the combination be. In other words, it does not tend to collapse the range of the digest into a narrower band.
More info here.
XOR operator is reversible, i.e. suppose I have a bit string as 0 0 1 and I XOR it with another bit string 1 1 1, the the output is
0 xor 1 = 1
0 1 = 1
1 1 = 0
Now I can again xor the 1st string with the result to get the 2nd string. i.e.
0 1 = 1
0 1 = 1
1 0 = 1
So, that makes the 2nd string a key. This behavior is not found with other bit operator
Please see this for more info --> Why is XOR used on Cryptography?
There is another use case: objects in which (some) fields must be compared without regarding their order. For example, if you want a pair (a, b) be always equal to the pair (b, a).
XOR has the property that a ^ b = b ^ a, so it can be used in hash function in such cases.
Examples: (full code here)
definition:
final class Connection {
public final int A;
public final int B;
// some code omitted
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Connection that = (Connection) o;
return (A == that.A && B == that.B || A == that.B && B == that.A);
}
#Override
public int hashCode() {
return A ^ B;
}
// some code omitted
}
usage:
HashSet<Connection> s = new HashSet<>();
s.add(new Connection(1, 3));
s.add(new Connection(2, 3));
s.add(new Connection(3, 2));
s.add(new Connection(1, 3));
s.add(new Connection(2, 1));
s.remove(new Connection(1, 2));
for (Connection x : s) {
System.out.println(x);
}
// output:
// Connection{A=2, B=3}
// Connection{A=1, B=3}