Cannot run concurrent subscribers consistently using cyclops-react - reactive-programming

Is it possible to have concurrent subscribers using cyclops-react library?
For example, if I run the following code:
ReactiveSeq<Integer> initialStream = ReactiveSeq.of(1, 2, 3, 4, 5, 6);
ReactiveSubscriber<Integer> sub1 = Spouts.reactiveSubscriber();
ReactiveSubscriber<Integer> sub2 = Spouts.reactiveSubscriber();
FutureStream<Integer> futureStream = FutureStream.builder().fromStream(initialStream)
.map(v -> v -1);
futureStream.subscribe(sub1);
futureStream.subscribe(sub2);
CompletableFuture future1 = CompletableFuture.runAsync(() -> sub1.reactiveStream().forEach(v -> System.out.println("1 -> " + v)));
CompletableFuture future2 = CompletableFuture.runAsync(() -> sub2.reactiveStream().forEach(v -> System.out.println("2 -> " + v)));
try {
future1.get();
future2.get();
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
I get the following result:
1 -> 0
2 -> 0
2 -> 1
1 -> 0
1 -> 1
1 -> 1
2 -> 2
2 -> 3
2 -> 4
2 -> 5
1 -> 2
1 -> 2
1 -> 3
1 -> 4
1 -> 5
1 -> 3
1 -> 4
1 -> 5
I'm getting repeated values on the subscribers' streams. Thanks in advance for any help.

cyclops-react only supports single subscribers. I think the behaviour here should be changed to ignore the second subscription attempt rather than allow it to mess up both (I will log a bug - thank you!).
You may be able to use Topics to the same effect, however. We can rewrite your example using Topics:
ReactiveSeq<Integer> initialStream = ReactiveSeq.of(1,2,3,4,5,6);
FutureStream<Integer> futureStream = FutureStream.builder()
.fromStream(initialStream)
.map(v -> v -1);
Queue<Integer> queue= QueueFactories.<Integer>boundedNonBlockingQueue(1000).build();
Topic<Integer> topic = new Topic<Integer>(queue,QueueFactories.<Integer>boundedNonBlockingQueue(1000));
ReactiveSeq<Integer> s2 = topic.stream();
ReactiveSeq<Integer> s1 = topic.stream();
Thread t = new Thread(()->{
topic.fromStream(futureStream);
topic.close();
});
t.start();
CompletableFuture future1 = CompletableFuture.runAsync(() -> s1.forEach(v -> System.out.println("1 -> " + v)));
CompletableFuture future2 = CompletableFuture.runAsync(() -> s2.forEach(v -> System.out.println("2 -> " + v)));
try {
future1.get();
future2.get();
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
And the output is more in line with what we might expect:
2 -> 0
1 -> 0
2 -> 1
1 -> 1
2 -> 2
1 -> 2
2 -> 3
1 -> 3
2 -> 4
1 -> 4
2 -> 5
1 -> 5
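The reason this works is that each subscriber reads from its own queue, and every element is copied into all of them. Below is a minimal plain-Java sketch of that fan-out idea. It does not use cyclops-react at all: the class FanOut, the method broadcast and the END sentinel are hypothetical names made up for illustration, and it assumes Integer.MIN_VALUE never occurs as a real value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper (not a cyclops-react class): one queue per subscriber.
final class FanOut {
    // Sentinel marking the end of the stream (assumption: never a real value).
    private static final int END = Integer.MIN_VALUE;

    static List<List<Integer>> broadcast(List<Integer> source, int subscribers) {
        List<BlockingQueue<Integer>> queues = new ArrayList<>();
        List<CompletableFuture<List<Integer>>> consumers = new ArrayList<>();
        for (int i = 0; i < subscribers; i++) {
            BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
            queues.add(queue);
            // Each subscriber drains its own queue on another thread.
            consumers.add(CompletableFuture.supplyAsync(() -> drain(queue)));
        }
        // Producer: apply the same map step (v - 1) and copy the result to every queue.
        for (Integer value : source) {
            for (BlockingQueue<Integer> queue : queues) {
                queue.add(value - 1);
            }
        }
        // Tell every subscriber that the stream is finished.
        for (BlockingQueue<Integer> queue : queues) {
            queue.add(END);
        }
        List<List<Integer>> results = new ArrayList<>();
        for (CompletableFuture<List<Integer>> consumer : consumers) {
            results.add(consumer.join());
        }
        return results;
    }

    private static List<Integer> drain(BlockingQueue<Integer> queue) {
        List<Integer> seen = new ArrayList<>();
        try {
            for (int v = queue.take(); v != END; v = queue.take()) {
                seen.add(v);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Calling FanOut.broadcast(Arrays.asList(1, 2, 3, 4, 5, 6), 2) gives every subscriber the full sequence 0 to 5 exactly once, in order, however the threads interleave.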

Related

Converting arabic numbers into chinese financial numbers

I am trying to write a function, in a functional style, which receives a normal Int value, translates it to Chinese financial numerals and returns a String, for example: 301 = 三百零一. To begin, I have two maps, one with every digit from 0 to 9, and the other one with the powers of ten, from 10 to 1000000.
val digits: Map[Int, String] = Map(0 -> "〇", 1 -> "壹", 2 -> "貳", 3 -> "參", 4 -> "肆", 5 -> "伍", 6 -> "陸", 7 -> "柒", 8 -> "捌", 9 -> "玖");
val exponent: Map[Int, String] = Map(1 -> "", 10 -> "拾", 100 -> "佰", 1000 -> "仟", 10000 -> "萬", 100000 -> "億", 1000000 -> "兆");
For those who don't know, here is a short explanation of how Chinese numbers work. If you already know, feel free to skip this paragraph. In Chinese numbers, when you want to write a large number, for example 5000, you write the 5 and the 1000 symbols (伍仟) to indicate that you are multiplying 5 * 1000. If you have 539, it's 5*100 + 3*10 + 9. This would be 伍佰參拾玖. Lastly, if the number has 0's between multiplications, no matter how many there are, you write only one 0 between the other characters. For example: 501 = 5*100 + 1. This is 伍佰〇壹. One last example for clarification: 50103 = 5*10000 + 1*100 + 3. This is 伍萬〇壹佰〇參.
So what I could do, is the following:
def format(unit: Int): String = {
val l = unit.toString.map(_.asDigit).toList
if(l.isEmpty) ""
else if(l.tail.isEmpty) digits(l.head)
else digits(l.head) + format(l.tail.mkString.toInt)
}
This translates the characters one by one. For example:
format(135) // "壹參伍"
And I don't know how to continue.
If I understood your problem correctly, you can do something like this (digitsMap and exponentsMap are presumably the two maps from your question):
def toChineseFinancial(number: Int): String = {
val digits = number.toString.iterator.map(_.asDigit).toList
val length = digits.length
val exponents = List.tabulate(length)(n => math.pow(10, n).toInt)
val (sb, _) =
digits
.iterator
.zip(exponents.reverseIterator)
.foldLeft(new collection.mutable.StringBuilder(length * 2) -> false) {
case ((sb, flag), (digit, exp)) =>
if (digit == 0) sb -> true
else if (flag) sb.append("〇").append(digitsMap(digit)).append(exponentsMap(exp)) -> false
else sb.append(digitsMap(digit)).append(exponentsMap(exp)) -> false
}
sb.result()
}
You can see it running here.
Note: I used mutable.StringBuilder because building Strings is somewhat expensive, but if you want to avoid any kind of mutability you can easily replace it with a normal String.
I would expand the exponents Map using a simple case class for its values to cover:
numbers of magnitude 1, 10, 10^2, ..., 10^12
10's, 100's and 1000's of "萬" (10^4), "億" (10^8) and "兆" (10^12)
as shown below:
case class CNU(unit: String, factor: Int)
val exponents: Map[Long, CNU] = Map(
1L -> CNU("", 1),
10L -> CNU("拾", 1),
100L -> CNU("佰", 1),
1000L -> CNU("仟", 1),
10000L ->CNU("萬", 1),
100000L -> CNU("萬", 10),
1000000L -> CNU("萬", 100),
10000000L -> CNU("萬", 1000),
100000000L -> CNU("億", 1),
1000000000L -> CNU("億", 10),
10000000000L -> CNU("億", 100),
100000000000L -> CNU("億", 1000),
1000000000000L -> CNU("兆", 1),
10000000000000L -> CNU("兆", 10),
100000000000000L -> CNU("兆", 100),
1000000000000000L -> CNU("兆", 1000)
)
Creating the method:
val digits: Map[Int, String] = Map(
0 -> "〇", 1 -> "壹", 2 -> "貳", 3 -> "參", 4 -> "肆",
5 -> "伍", 6 -> "陸", 7 -> "柒", 8 -> "捌", 9 -> "玖"
)
def toChineseNumber(num: Long): String = {
val s = num.toString
val ds = s.map(_.asDigit).zip(s.length-1 to 0 by -1)
ds.foldRight(List.empty[String], 0){ case ((d, i), (accList, dPrev)) =>
val cnu = exponents(math.pow(10, i).toLong)
val digit =
if (d == 0) {
if (dPrev != 0 || num == 0) digits(d) else ""
}
else
digits(d)
val unit =
if (d == 0)
""
else {
if (cnu.factor == 1) cnu.unit else exponents(cnu.factor).unit
}
((digit + unit) :: accList, d)
}.
_1.mkString
}
Note that method foldRight is used to traverse and process the input number from right to left and dPrev in the tuple-accumulator is for carrying digits across iterations for handling repetitive 0's.
Testing it:
toChineseNumber(50)
// res1: String = 伍拾
toChineseNumber(30001)
// res2: String = 參萬〇壹
toChineseNumber(1023405)
// res3: String = 壹佰〇貳萬參仟肆佰〇伍
toChineseNumber(2233007788L)
// res4: String = 貳拾貳億參仟參佰〇柒仟柒佰捌拾捌

Trying to figure out Project Euler problem 1 (get the sum of all numbers under 1000 divisible by 3 and 5)

I am able to print out on the screen the numbers divisible by 3 and 5 under 1000, but am unsure how I am supposed to add up all the numbers! Please help me get going in the right direction, thanks! :) I've only been doing Swift for two days now and am really enjoying it. That being said, my code may not be the prettiest ;)
import UIKit
func sumFinder (untill n : Int) {
print (3)
print (5)
var num1 = 3
var num2 = 5
for iteration in 0...n {
var num3 = num1 + 3
var num4 = num2 + 5
print(num3)
print(num4)
num1 = num3
num2 = num4
let sum = (num1 + num2 + num3 + num4)
}
}
sumFinder(untill:1000)
You can do it in one line: create a range, filter the items with isMultiple(of:) and sum up the result with reduce.
func sumFinder (until n : Int) -> Int {
return (0...n).lazy.filter{ $0.isMultiple(of: 3) && $0.isMultiple(of: 5) }.reduce(0, +)
}
sumFinder(until: 1000) // 33165
However the actual challenge in Project Euler – Problem 1 is
Find the sum of all the multiples of 3 or 5 below 1000
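In this case, replace && with ||. Because the problem says "below 1000", the range should also exclude n itself (0..<n). The result is then 233168; with the closed range 0...n, 1000 would be counted too, giving 234168.
func sumFinder (until n : Int) -> Int {
return (0..<n).lazy.filter{ $0.isMultiple(of: 3) || $0.isMultiple(of: 5) }.reduce(0, +)
}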
You can try
Your attempt
func getSum(_ toValue:Int) -> Int {
var sum = 0
for i in (0...toValue) {
if i.isMultiple(of: 15) {
sum += i
}
}
return sum
}
Short, Swifty way (recommended)
func getSum(_ toValue:Int) -> Int {
return stride(from: 0, to:toValue, by: 1).filter{ $0.isMultiple(of:15)}.reduce(0,+)
}
Test
print(getSum(1000)) // 33165
Side note
An Int that is a multiple of both 3 and 5 is a multiple of their product (15), so this holds:
i.isMultiple(of: 3) && i.isMultiple(of: 5) = i.isMultiple(of: 15)
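The same observation gives a closed form for the "3 or 5" version of the problem, with no loop at all: add the multiples of 3 and the multiples of 5, then subtract the multiples of 15, which were counted twice. The following is a small Java sketch of that arithmetic; the class and method names are made up for illustration, and it assumes limit is at least 1.

```java
// Hypothetical helper illustrating inclusion-exclusion for Project Euler problem 1.
final class MultiplesSum {
    // Sum of the positive multiples of k strictly below limit:
    // k * (1 + 2 + ... + m) with m = (limit - 1) / k.
    static long sumOfMultiplesBelow(long k, long limit) {
        long m = (limit - 1) / k;
        return k * m * (m + 1) / 2;
    }

    // Multiples of 3 or 5 below limit; multiples of 15 are subtracted once
    // because they appear in both of the other sums.
    static long sumOf3Or5Below(long limit) {
        return sumOfMultiplesBelow(3, limit)
             + sumOfMultiplesBelow(5, limit)
             - sumOfMultiplesBelow(15, limit);
    }
}
```

For limit = 1000 the three partial sums are 166833, 99500 and 33165, so the total is 233168.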

Calculate occurrences per group of events - Spark

Having the following RDD:
BBBBBBAAAAAAABAABBBBBBBB
AAAAABBAAAABBBAAABAAAAAB
I need to calculate the number of occurrences per group of events, so for this example the expected output should be:
BBBBBBAAAAAAABAABBBBBBBB A -> 2 B -> 3
AAAAABBAAAABBBAAABBCCCCC A -> 3 B -> 4 C-> 1
Final Output -> A -> 5 B -> 7 C-> 1
I have implemented the splitting and then a sliding for each character to try to obtain the values, but I cannot obtain the expected result.
Thanks,
val baseRDD = sc.parallelize(Seq("BBBBBBAAAAAAABAABBBBBBBB", "AAAAABBAAAABBBAAABBCCCC"))
baseRDD.flatMap(x => "(\\w)\\1*".r.findAllMatchIn(x).map(x => (x.matched.charAt(0), 1)).toList).reduceByKey((accum, current) => accum + current).foreach(println(_))
Result
(C,1)
(B,6)
(A,5)
Hope this is what you wanted.
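The regex (\w)\1* matches one run of a repeated character at a time, so counting matches per first character counts the groups. To check the logic without a Spark cluster, here is a plain-Java sketch of the same idea on an ordinary list of strings. RunCounter and countRuns are hypothetical names, and the reduceByKey step is replaced by a local map merge.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical local stand-in for the flatMap + reduceByKey pipeline.
final class RunCounter {
    // One word character followed by any number of repeats of that same character.
    private static final Pattern RUN = Pattern.compile("(\\w)\\1*");

    static Map<Character, Integer> countRuns(List<String> lines) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher matcher = RUN.matcher(line);
            while (matcher.find()) {
                // Each match is one group of events; key it by its character.
                counts.merge(matcher.group().charAt(0), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

On the two strings passed to sc.parallelize above, this yields A = 5, B = 6 and C = 1, the same totals as the Spark job.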

Strange results when using Scala collections

I have some tests with results that I can't quite explain.
The first test does a filter, map and reduce on a list containing 4 elements:
{
val counter = new AtomicInteger(0)
val l = List(1, 2, 3, 4)
val filtered = l.filter{ i =>
counter.incrementAndGet()
true
}
val mapped = filtered.map{ i =>
counter.incrementAndGet()
i*2
}
val reduced = mapped.reduce{ (a, b) =>
counter.incrementAndGet()
a+b
}
println("counted " + counter.get + " and result is " + reduced)
assert(20 == reduced)
assert(11 == counter.get)
}
The counter is incremented 11 times as I expected: once for each element during filtering, once for each element during mapping and three times to add up the 4 elements.
Using wildcards the result changes:
{
val counter = new AtomicInteger(0)
val l = List(1, 2, 3, 4)
val filtered = l.filter{
counter.incrementAndGet()
_ > 0
}
val mapped = filtered.map{
counter.incrementAndGet()
_*2
}
val reduced = mapped.reduce{ (a, b) =>
counter.incrementAndGet()
a+b
}
println("counted " + counter.get + " and result is " + reduced)
assert(20 == reduced)
assert(5 == counter.get)
}
I can't work out how to use wildcards in the reduce (the code doesn't compile), but now the counter is only incremented 5 times!
So, question #1: Why do wildcards change the number of times the counter is called and how does that even work?
Then my second, related question. My understanding of views was that they would lazily execute the functions passed to the monadic methods, but the following code doesn't show that.
{
val counter = new AtomicInteger(0)
val l = Seq(1, 2, 3, 4).view
val filtered = l.filter{
counter.incrementAndGet()
_ > 0
}
println("after filter: " + counter.get)
val mapped = filtered.map{
counter.incrementAndGet()
_*2
}
println("after map: " + counter.get)
val reduced = mapped.reduce{ (a, b) =>
counter.incrementAndGet()
a+b
}
println("after reduce: " + counter.get)
println("counted " + counter.get + " and result is " + reduced)
assert(20 == reduced)
assert(5 == counter.get)
}
The output is:
after filter: 1
after map: 2
after reduce: 5
counted 5 and result is 20
Question #2: How come the functions are being executed immediately?
I'm using Scala 2.10
You're probably thinking that
filter {
println
_ > 0
}
means
filter{ i =>
println
i > 0
}
but Scala has other ideas. The reason is that
{ println; _ > 0 }
is a statement that first prints something, and then returns the > 0 function. So it interprets what you're doing as a funny way to specify the function, equivalent to:
val p = { println; (i: Int) => i > 0 }
filter(p)
which in turn is equivalent to
println
val temp = (i: Int) => i > 0 // Temporary name, forget we did this!
val p = temp
filter(p)
which as you can imagine doesn't quite work out the way you want--you only print (or in your case do the increment) once at the beginning. Both your problems stem from this.
Make sure if you're using underscores to mean "fill in the parameter" that you only have a single expression! If you're using multiple statements, it's best to stick to explicitly named parameters.
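Java has no underscore placeholder, but the same distinction can be reproduced by hand, which may make it easier to see. The sketch below is only an analogue, with hypothetical names: builtOnce plays the role of { counter.incrementAndGet(); _ > 0 }, where the increment runs once while the function is being created, and perElement plays the role of { i => counter.incrementAndGet(); i > 0 }, where the increment is part of the function body.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical Java analogue of the two Scala forms discussed above.
final class SideEffectOnce {
    // The side effect runs once, here, before the predicate is returned.
    static Predicate<Integer> builtOnce(AtomicInteger counter) {
        counter.incrementAndGet();
        return i -> i > 0;
    }

    // The side effect is inside the predicate, so it runs for every element tested.
    static Predicate<Integer> perElement(AtomicInteger counter) {
        return i -> {
            counter.incrementAndGet();
            return i > 0;
        };
    }

    // Filters a 4-element list and reports how often the counter was incremented.
    static int countCalls(boolean incrementPerElement) {
        AtomicInteger counter = new AtomicInteger(0);
        Predicate<Integer> predicate =
            incrementPerElement ? perElement(counter) : builtOnce(counter);
        List<Integer> kept = Arrays.asList(1, 2, 3, 4).stream()
            .filter(predicate)
            .collect(Collectors.toList());
        return counter.get();
    }
}
```

countCalls(false) returns 1 and countCalls(true) returns 4, matching the one increment per filter or map call in the wildcard version and the one increment per element in the version with a named parameter.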

Slow IO with large data

I'm trying to find a better way to do this, as it could take years to compute! I need to compute a map which is too large to fit in memory, so I am trying to make use of IO as follows.
I have a file that contains a list of Ints, about 1 million of them. I have another file that contains data about my (500,000) document collection. I need to calculate a function of the count, for every Int in the first file, of how many documents (lines in the second) it appears in. Let me give an example:
File1:
-1
1
2
etc...
file2:
E01JY3-615, CR93E-177 , [-1 -> 2,1 -> 1,2 -> 2,3 -> 2,4 -> 2,8 -> 2,... // truncated for brevity]
E01JY3-615, CR93E-177 , [1 -> 2,2 -> 2,4 -> 2,5 -> 2,8 -> 2,... // truncated for brevity]
etc...
Here is what I have tried so far
def printToFile(f: java.io.File)(op: java.io.PrintWriter => Unit) {
val p = new java.io.PrintWriter(new BufferedWriter((new FileWriter(f))))
try {
op(p)
} finally {
p.close()
}
}
def binarySearch(array: Array[String], word: Int):Boolean = array match {
case Array() => false
case xs => if (array(array.size/2).split("->")(0).trim().toInt == word) {
return true
} else if (array(array.size/2).split("->")(0).trim().toInt > word){
return binarySearch(array.take(array.size/2), word)
} else {
return binarySearch(array.drop(array.size/2 + 1), word)
}
}
var v = Source.fromFile("vocabulary.csv").getLines()
printToFile(new File("idf.csv"))(out => {
v.foreach(word =>{
var docCount: Int = 0
val s = Source.fromFile("documents.csv").getLines()
s.foreach(line => {
val split = line.split("\\[")
val fpStr = split(1).init
docCount = if (binarySearch(fpStr.split(","), word.trim().toInt)) docCount + 1 else docCount
})
val output = word + ", " + math.log10(500448 / (docCount + 1))
out.println(output)
println(output)
})
})
There must be a faster way to do this, can anyone think of a way?
From what I understand of your code, you are trying to find every word in the dictionary in the document list.
Hence, you are making N*M comparisons, where N is the number of words (in the dictionary with integers) and M is the number of documents in the document list. Instantiating to your values, you are trying to calculate 10^6 * 5*10^5 comparisons which is 5*10^11. Unfeasible.
Why not create a mutable map with all the integers in the dictionary as keys (1000000 ints in memory is roughly 3.8M from my measurements) and pass through the document list only once? For each document you extract the integers and increment the respective count values in the map (for which the integer is the key).
Something like this:
import collection.mutable.Map
import scala.util.Random._
val maxValue = 1000000
val documents = collection.mutable.Map[String,List[(Int,Int)]]()
// util function just to insert fake input; disregard
def provideRandom(key:String) ={ (1 to nextInt(4)).foreach(_ => documents.put(key,(nextInt(maxValue),nextInt(maxValue)) :: documents.getOrElse(key,Nil)))}
// inserting fake documents into our fake Document map
(1 to 500000).foreach(_ => {val key = nextString(5); provideRandom(key)})
// word count map
val wCount = collection.mutable.Map[Int,Int]()
// Counting the numbers and incrementing them in the map
documents.foreach(doc => doc._2.foreach(k => wCount.put(k._1, (wCount.getOrElse(k._1,0)+1))))
scala> wCount
res5: scala.collection.mutable.Map[Int,Int] = Map(188858 -> 1, 178569 -> 2, 437576 -> 2, 660074 -> 2, 271888 -> 2, 721076 -> 1, 577416 -> 1, 77760 -> 2, 67471 -> 1, 804106 -> 2, 185283 -> 1, 41623 -> 1, 943946 -> 1, 778258 -> 2...
The result is a map whose keys are the numbers in the dictionary and whose values are the number of times each appears in the document list.
This is oversimplified since
I don't verify whether the number exists in the dictionary, although you only need to initialise the map with the dictionary values and then increment a value in the final map only if it has that key;
I don't do IO, which speeds up the whole thing.
This way you only pass through the documents once, which makes the task feasible again.
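To make the single pass concrete with the dictionary check included, here is a Java sketch under stated assumptions. The line format ("id, id , [key -> value,key -> value,...]") is inferred from the sample in the question, each key is assumed to appear at most once per line, and DocFrequency, documentFrequencies and idf are hypothetical names. Reading the lines from disk is left out; any line-by-line reader can feed this.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical single-pass counter: one hash lookup per entry instead of
// one scan of the whole document file per vocabulary word.
final class DocFrequency {
    static Map<Integer, Integer> documentFrequencies(Set<Integer> vocabulary,
                                                     List<String> documentLines) {
        // Initialise the map with the dictionary so unknown keys are ignored later.
        Map<Integer, Integer> counts = new HashMap<>();
        for (Integer word : vocabulary) {
            counts.put(word, 0);
        }
        for (String line : documentLines) {
            int open = line.indexOf('[');
            int close = line.lastIndexOf(']');
            if (open < 0 || close <= open) {
                continue; // line does not match the assumed format
            }
            for (String entry : line.substring(open + 1, close).split(",")) {
                String[] keyValue = entry.split("->");
                if (keyValue.length != 2) {
                    continue; // empty or malformed entry
                }
                int key = Integer.parseInt(keyValue[0].trim());
                // Increment only keys that are in the dictionary.
                counts.computeIfPresent(key, (k, count) -> count + 1);
            }
        }
        return counts;
    }

    // Same formula as in the question, but with floating-point division;
    // dividing two ints would truncate before the logarithm is taken.
    static double idf(int totalDocuments, int documentCount) {
        return Math.log10((double) totalDocuments / (documentCount + 1));
    }
}
```

With M documents and an average of K entries per line this does about M*K map lookups in total, independent of the vocabulary size.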