Why do I get the following different results when converting a vector using either tuple or Tuple?
julia> a = [1, 2, 3]
3-element Vector{Int64}:
1
2
3
julia> tuple(a)
([1, 2, 3],)
julia> Tuple(a)
(1, 2, 3)
Broadcasting gives the same result though:
julia> tuple.(a)
3-element Vector{Tuple{Int64}}:
(1,)
(2,)
(3,)
julia> Tuple.(a)
3-element Vector{Tuple{Int64}}:
(1,)
(2,)
(3,)
(The latter is not so surprising as it just converts single numbers to tuples.)
(This is Julia 1.6.1.)
Tuple is a type and as with all collections in Julia base, if you pass another collection to it, it creates an instance of that type from the contents of the other collection. So Tuple([1, 2, 3]) constructs a tuple of the values 1, 2 and 3 just like Set([1, 2, 3]) constructs a set of those same values. Similarly, if you write Dict([:a => 1, :b => 2, :c => 3]) you get a dict that contains the pairs :a => 1, :b => 2 and :c => 3. This also works nicely when the argument to the constructor is an iterator; some examples:
julia> Tuple(k^2 for k=1:3)
(1, 4, 9)
julia> Set(k^2 for k=1:3)
Set{Int64} with 3 elements:
4
9
1
julia> Dict(string(k, base=2, pad=2) => k^2 for k=1:3)
Dict{String, Int64} with 3 entries:
"10" => 4
"11" => 9
"01" => 1
So that's why Tuple works the way it does. The tuple function, on the other hand, is a function that makes a tuple from its arguments like this:
julia> tuple()
()
julia> tuple(1)
(1,)
julia> tuple(1, "two")
(1, "two")
julia> tuple(1, "two", 3.0)
(1, "two", 3.0)
Why have tuple at all instead of just having Tuple? You could express this last example as Tuple([1, "two", 3.0]). However, that requires constructing a temporary untyped array only to iterate it and make a tuple from its contents, which is really inefficient. If only there was a more efficient container type that the compiler can usually eliminate the construction of... like a tuple. For that we'd write Tuple((1, "two", 3.0)). Which works, but is completely redundant since (1, "two", 3.0) is already the tuple you wanted. So why would you use tuple? Most of the time you don't, you just use the (1, "two", 3.0) syntax for constructing a tuple. But sometimes you want an actual function that you can apply to some values to get a tuple of them—and tuple is that function. You can actually make an anonymous function that does this pretty easily: (args...) -> (args...,). You can just think of tuple as a handy abbreviation for that function.
The following code produces a type mismatch error:
def f(arr:List[Int]): List[Int] =
for(num <- 0 to arr.length-1; if num % 2 == 1) yield arr(num)
It says that it found an IndexedSeq instead of a List. The following works:
def f(arr:List[Int]): List[Int] =
for(num <- (0 to arr.length-1).toList; if num % 2 == 1) yield arr(num)
I have used i <- a to b in a for loop before but haven't seen this error. Can someone please explain why the form i <- a to b cannot be used here?
Because 0 to arr.length-1 is an IndexedSeq[Int], the for ... yield expression built on it also produces a result of type IndexedSeq[Int].
The function can be defined correctly as:
def f(arr:List[Int]):IndexedSeq[Int] = for( num <- 0 to arr.length-1 if num%2==1) yield arr(num)
And
for( num <- 0 to arr.length-1 if num%2==1) yield arr(num)
will translate to:
scala> def f(arr:List[Int]) = (0 to arr.length-1).filter(i => i%2==1).map(i => arr(i))
f: (arr: List[Int])scala.collection.immutable.IndexedSeq[Int]
So we can see that the return type is determined by the type of 0 to arr.length-1.
(0 to arr.length-1).toList changes the type being iterated over from IndexedSeq[Int] to List[Int], so for ... yield produces a result of type List[Int].
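If the goal is simply "the elements at odd positions, as a List", one option is to skip index-based access altogether. The following is only a sketch: the names OddIndexed and oddIndexed are made up for illustration and are not from the question.

```scala
object OddIndexed {
  // Keeps the elements at odd positions (1, 3, 5, ...) and stays a List throughout.
  // zipWithIndex pairs each element with its position, so no arr(num) lookup
  // (which walks the list from the start each time) is needed.
  def oddIndexed(arr: List[Int]): List[Int] =
    arr.zipWithIndex.collect { case (value, index) if index % 2 == 1 => value }

  def main(args: Array[String]): Unit =
    println(oddIndexed(List(10, 11, 12, 13, 14))) // List(11, 13)
}
```

Because the generator here is a List, the result is a List and matches the declared return type without any conversion.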
In Scala, each iteration of a for loop with yield generates a value that is remembered. The type of the collection that is returned generally follows the type you were iterating over, so a List yields a List, an IndexedSeq yields an IndexedSeq, and so on.
The type of (0 to arr.length-1) is scala.collection.immutable.Range, which inherits from scala.collection.immutable.IndexedSeq[Int]. So in the first case the result is an IndexedSeq[Int], but the declared return type of f is List[Int], so it does not compile. In the second case, a List yields a List, which matches the return type of f.
You can also write function f as follows:
def f(arr: List[Int]): IndexedSeq[Int] = for( a <- 1 to arr.length-1; if a % 2 == 1) yield arr(a)
Another example:
scala> for (i <- 1 to 5) yield i
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3, 4, 5)
scala> for (e <- Array(1, 2, 3, 4, 5)) yield e
res1: Array[Int] = Array(1, 2, 3, 4, 5)
In Scala, for is syntactic sugar, where:
for (i <- a to b) yield func(i)
translates to:
RichInt(a).to(b).map({ i => func(i) })
RichInt.to returns a Range.
Range.map returns an IndexedSeq.
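To make the translation concrete for the version with a guard, here is a small sketch. The object and method names (Desugar, viaFor, viaMethods, asList) are invented for this example; the guard is written with withFilter, which is what an if inside a for comprehension is rewritten to.

```scala
object Desugar {
  // The for comprehension from the question, with the return type it really has.
  def viaFor(arr: List[Int]): IndexedSeq[Int] =
    for (num <- 0 to arr.length - 1 if num % 2 == 1) yield arr(num)

  // The same thing written out as method calls on the Range.
  def viaMethods(arr: List[Int]): IndexedSeq[Int] =
    (0 to arr.length - 1).withFilter(num => num % 2 == 1).map(num => arr(num))

  // Converting once at the end is another way to get a List back.
  def asList(arr: List[Int]): List[Int] = viaFor(arr).toList

  def main(args: Array[String]): Unit = {
    val arr = List(10, 11, 12, 13, 14)
    println(viaFor(arr) == viaMethods(arr)) // true
    println(asList(arr))                    // List(11, 13)
  }
}
```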
I am trying to generate all combinations of the lines of a text file, without repetition.
Example:
1
2
2
1
1
Result:
Line 1 with line 2 = (1,2)
Line 1 with line 3 = (1,2)
Line 1 with line 4 = (1,1)
Line 1 with line 5 = (1,1)
Line 2 with line 3 = (2,2)
Line 2 with line 4 = (2,1)
Line 2 with line 5 = (2,1)
Line 3 with line 4 = (2,1)
Line 3 with line 5 = (2,1)
Line 4 with line 5 = (1,1)
or
Considering (x,y), if (x != y) 0 else 1:
0
0
1
1
1
0
0
0
0
1
I have the following code:
def processCombinations(rdd: RDD[String]) = {
rdd.mapPartitions({ partition => {
var previous: String = null;
if (partition.hasNext)
previous = partition.next
for (element <- partition) yield {
if (previous == element)
"1"
else
"0"
}
}
})
}
The piece of code above is doing the combinations of the first element of my RDD, in other words: (1,2) (1,2) (1,1) (1,1).
The problem is: This code ONLY works with ONE PARTITION. I'd like to make this work with many partitions, how could I do that?
It's not very clear exactly what you want as output, but this reproduces your first example, and translates directly to Spark. It generates combinations, but only where the index of the first element in the original list is less than the index of the second, which is I think what you're asking for.
val r = List(1,2,2,1,1)
val z = r.zipWithIndex
z.flatMap(x=>z.map(y=>(x,y))).collect{case(x,y) if x._2 < y._2 => (x._1, y._1)}
//List((1,2), (1,2), (1,1), (1,1), (2,2), (2,1), (2,1), (2,1), (2,1), (1,1))
or, as a for-comprehension
for (x<-z; y<-z; if x._2 < y._2) yield (x._1, y._1)
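For a local collection, the same pairs can also be produced without building all n*n pairs and then filtering by index. This is a plain-collections sketch, not Spark code, and the names PairsByTails, orderedPairs and equalityFlags are hypothetical.

```scala
object PairsByTails {
  // All pairs (x, y) where x appears strictly before y in the input.
  // tails yields the list, then the list without its first element, and so on,
  // so each element is paired only with the elements that follow it.
  def orderedPairs[A](xs: List[A]): List[(A, A)] =
    xs.tails.flatMap {
      case x :: rest => rest.map(y => (x, y))
      case Nil       => Nil
    }.toList

  // The 0/1 form from the question: 1 when both halves are equal, else 0.
  def equalityFlags[A](xs: List[A]): List[Int] =
    orderedPairs(xs).map { case (x, y) => if (x == y) 1 else 0 }

  def main(args: Array[String]): Unit = {
    println(orderedPairs(List(1, 2, 2, 1, 1)))
    println(equalityFlags(List(1, 2, 2, 1, 1)))
  }
}
```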
This code calculates the combinations without repetition by using recursion. It takes three arguments: the number of elements per combination, the list of elements, and an accumulator that starts out empty.
It works as follows for the list 1, 2, 3, 4, 5: it takes the first 4 elements for the first combination, then generates another combination ending with 5, the last element of the list. When no elements are left in the list, it moves back one position (to the third position) and takes the next element to generate more combinations from there: 1, 2, "4", 5. This is done recursively for all the elements of the list.
def combinator[A](n: Int, list: List[A], acc: List[A]): List[List[A]] = {
if (n == 0)
List(acc.reverse)
else if (list == Nil)
List()
else
combinator(n - 1, list.tail, list.head :: acc) ::: combinator(n, list.tail, acc)
}
combinator(4, List(1, 2, 3, 4, 5), List()).foreach(println)
// List(1, 2, 3, 4)
// List(1, 2, 3, 5)
// List(1, 2, 4, 5)
// List(1, 3, 4, 5)
// List(2, 3, 4, 5)
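For comparison, the standard library has a combinations method on sequences. The sketch below wraps it in a hypothetical helper named viaLibrary. One caveat matters for the original question: combinations compares elements by value, so an input with repeated values yields fewer results than a position-based approach like combinator.

```scala
object BuiltInCombinations {
  // Thin wrapper around the standard library's combinations method.
  def viaLibrary[A](n: Int, xs: List[A]): List[List[A]] =
    xs.combinations(n).toList

  def main(args: Array[String]): Unit = {
    // With distinct elements this produces the same five combinations as combinator.
    viaLibrary(4, List(1, 2, 3, 4, 5)).foreach(println)
    // With repeated values only the distinct value combinations are produced,
    // so this prints 3 rather than the 10 positional pairs.
    println(viaLibrary(2, List(1, 2, 2, 1, 1)).size)
  }
}
```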
Is there a Round Robin Queue available in Scala Collections?
I need to repeatedly iterate a list that circles through itself
val x = new CircularList(1,2,3,4)
x.next (returns 1)
x.next (returns 2)
x.next (returns 3)
x.next (returns 4)
x.next (returns 1)
x.next (returns 2)
x.next (returns 3)
... and so on
It's pretty easy to roll your own with continually and flatten:
scala> val circular = Iterator.continually(List(1, 2, 3, 4)).flatten
circular: Iterator[Int] = non-empty iterator
scala> circular.take(17).mkString(" ")
res0: String = 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1
There's also a continually method on Stream—just be careful not to hold onto a reference to the head of the stream if you're going to be generating lots of elements.
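To get something shaped like the CircularList from the question, the iterator can be wrapped in a small class. This is a minimal sketch: the class is not part of the Scala collections, and the non-empty check is an assumption added here because an empty cycle has no next element.

```scala
// Hypothetical wrapper matching the API sketched in the question.
final class CircularList[A](items: A*) {
  require(items.nonEmpty, "a circular list needs at least one element")

  // Endlessly repeats the items, exactly as Iterator.continually(...).flatten does above.
  private val underlying: Iterator[A] = Iterator.continually(items).flatten

  // Returns the next element, wrapping around after the last one.
  def next(): A = underlying.next()
}

object CircularListDemo {
  def main(args: Array[String]): Unit = {
    val x = new CircularList(1, 2, 3, 4)
    println(List.fill(6)(x.next())) // List(1, 2, 3, 4, 1, 2)
  }
}
```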
You can very easily create a circular list using a Stream.
scala> val l = List(1, 2, 3, 4).toStream
l: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> def b: Stream[Int] = l #::: b
b: Stream[Int]
scala> b.take(20).toList
res2: List[Int] = List(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
Edit: you want to make sure to define the repeated part beforehand, once and only once, to avoid blowing the heap (structural sharing in Stream). As in:
def circular[A](a: Seq[A]): Stream[A] = {
val repeat = a.toStream
def b: Stream[A] = repeat #::: b
b
}
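Stream is deprecated as of Scala 2.13 in favour of LazyList, so on newer versions the same idea might look like the sketch below. It assumes Scala 2.13 or later; the object name CircularLazyList and the non-empty check are additions made for this example.

```scala
object CircularLazyList {
  // LazyList counterpart of the Stream-based circular above (Scala 2.13+).
  def circular[A](a: Seq[A]): LazyList[A] = {
    // An empty prefix would never produce a head, so reject it up front.
    require(a.nonEmpty, "cannot cycle through an empty sequence")
    val repeat = LazyList.from(a)
    // #::: takes its right-hand side lazily, so b can refer to itself here.
    lazy val b: LazyList[A] = repeat #::: b
    b
  }

  def main(args: Array[String]): Unit =
    println(circular(Seq(1, 2, 3, 4)).take(10).toList)
}
```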
A version focused on getting a new element on every call:
val getNext: () => Int = {
  def b: Stream[Int] = List(1, 2, 3, 4).toStream #::: b
  var cyclicIterator: Stream[Int] = b
  () => {
    // read the current head before advancing, so the first call returns 1
    val result = cyclicIterator.head
    cyclicIterator = cyclicIterator.tail
    result
  }
} // could probably be written more elegantly
In your problem you can use it like:
for(i <- 1 to 10) yield getNext()
This is ugly in having an external mutable index, but it does do what's requested:
scala> var i = 0
i: Int = 0
scala> val ic4 = Iterator.continually { val next = IndexedSeq(1, 2, 3, 4)(i % 4); i += 1; next }
ic4: Iterator[Int] = non-empty iterator
scala> ic4 take 10 foreach { i => printf("ic4.next=%d%n", i) }
ic4.next=1
ic4.next=2
ic4.next=3
ic4.next=4
ic4.next=1
ic4.next=2
ic4.next=3
ic4.next=4
ic4.next=1
ic4.next=2
At least it illustrates Iterator.continually. There is also Stream.continually, which has the same signature.
I have a Scala List that contains some repeated numbers. I want to count the number of times a specific number will repeat itself. For example:
val list = List(1,2,3,3,4,2,8,4,3,3,5)
val repeats = list.takeWhile(_ == List(3,3)).size
And the val repeats would equal 2.
Obviously the above is pseudo-code, and takeWhile will not find two repeated 3s since _ represents an integer. I tried mixing both takeWhile and take(2) but with little success. I also looked at the code from How to find count of repeatable elements in scala list, but it appears the author is looking to achieve something different.
Thanks for your help.
This will work in this case:
val repeats = list.sliding(2).count(_.forall(_ == 3))
The sliding(2) method gives you an iterator over lists containing each element and its successor, and then we just count the windows where both are equal to 3.
The question is whether this gives the result you want for List(3, 3, 3): should that count as 2 repeats or just 1?
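The sketch below pins down two edge cases of the sliding approach. The helper names (SlidingEdgeCases, adjacentPairs, viaForall) are made up for illustration. The first case is the overlap just mentioned. The second is that sliding(2) on a one-element list still yields a single window of size 1, which forall happily accepts.

```scala
object SlidingEdgeCases {
  // Counts windows of exactly two adjacent elements that are both equal to x.
  def adjacentPairs(list: List[Int], x: Int): Int =
    list.sliding(2).count(_ == List(x, x))

  // The forall-based variant: also accepts a short window of size 1.
  def viaForall(list: List[Int], x: Int): Int =
    list.sliding(2).count(_.forall(_ == x))

  def main(args: Array[String]): Unit = {
    println(adjacentPairs(List(3, 3, 3), 3)) // 2: the two windows overlap
    println(viaForall(List(3), 3))           // 1: the lone window List(3) passes forall
    println(adjacentPairs(List(3), 3))       // 0: List(3) is not equal to List(3, 3)
  }
}
```

Comparing the whole window against List(x, x) avoids the short-window case, at the cost of hard-coding the window size.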
val repeats = list.sliding(2).toList.count(_==List(3,3))
More generally, the following code returns tuples of (element, number of repeats) for all elements:
scala> list.distinct.map(x=>(x,list.sliding(2).toList.count(_.forall(_==x))))
res27: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
This means that the element 3 repeats 2 times consecutively at 2 places, and all the others at 0 places.
If we want elements that repeat 3 times consecutively, we just need to modify the code as follows:
list.distinct.map(x=>(x,list.sliding(3).toList.count(_.forall(_==x))))
In the Scala REPL:
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 5)
scala> list.distinct.map(x=>(x,list.sliding(3).toList.count(_==List(x,x,x))))
res29: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
The window size can also be made a parameter by defining a function:
def repeatsByTimes(list:List[Int],n:Int) =
list.distinct.map(x=>(x,list.sliding(n).toList.count(_.forall(_==x))))
Now in REPL:
scala> val list = List(1,2,3,3,4,2,8,4,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5)
scala> repeatsByTimes(list,2)
res33: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,2,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 2, 4, 3, 3, 3, 5)
scala> repeatsByTimes(list,3)
res34: List[(Int, Int)] = List((1,0), (2,0), (3,3), (4,0), (8,0), (5,0))
We can go further still: given a list of integers and a maximum number of consecutive repetitions, we may want a list of 3-tuples of the form (element, number of consecutive repetitions, number of places where that repetition occurs). This is more exhaustive information than the above, and can be obtained with a function like this:
def repeats(list: List[Int], maxRep: Int) = {
  var v: List[(Int, Int, Int)] = List()
  for (i <- 1 to maxRep)
    v = v ++ list.distinct.map(x =>
      (x, i, list.sliding(i).toList.count(_.forall(_ == x))))
  v.sortBy(_._1)
}
In the Scala REPL:
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,2,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 2, 4, 3, 3, 3, 5)
scala> repeats(list,3)
res38: List[(Int, Int, Int)] = List((1,1,1), (1,2,0), (1,3,0), (2,1,3),
(2,2,0), (2,3,0), (3,1,9), (3,2,6), (3,3,3), (4,1,3), (4,2,0), (4,3,0),
(5,1,1), (5,2,0), (5,3,0), (8,1,1), (8,2,0), (8,3,0))
These results can be read as (element, run length, number of places):
(1,1,1): the element 1 occurs 1 time in a row at 1 place.
(1,2,0): the element 1 occurs 2 times in a row at 0 places.
...
(3,2,6): the element 3 occurs 2 times in a row at 6 places.
...
(3,3,3): the element 3 occurs 3 times in a row at 3 places.
... and so on.
Thanks to Luigi Plinge I was able to use methods in run-length encoding to group together items in a list that repeat. I used some snippets from this page here: http://aperiodic.net/phil/scala/s-99/
var n = 0
runLengthEncode(totalFrequencies).foreach{ o =>
if(o._1 > 1 && o._2==subjectNumber) n+=1
}
n
The method runLengthEncode is as follows:
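private def pack[A](ls: List[A]): List[List[A]] = {
  if (ls.isEmpty) Nil // no runs in an empty list (List(List()) would make runLengthEncode call head on an empty run)
  else {
    // split off the leading run of elements equal to the head
    val (packed, next) = ls span { _ == ls.head }
    if (next == Nil) List(packed)
    else packed :: pack(next)
  }
}
// (run length, element) for each run of equal consecutive elements
private def runLengthEncode[A](ls: List[A]): List[(Int, A)] =
  pack(ls) map { e => (e.length, e.head) }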
I'm not entirely satisfied that I needed to use the mutable var n to count the number of occurrences but it did the trick. This will count the number of times a number repeats itself no matter how many times it is repeated.
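The mutable counter can be avoided with count. The sketch below is self-contained, so it carries its own compact run-length encoder rather than the pack-based one above. The names RunCounts and countRepeatedRuns are hypothetical, and subject plays the role of subjectNumber.

```scala
object RunCounts {
  // (run length, element) for each run of equal consecutive elements.
  def runLengthEncode[A](ls: List[A]): List[(Int, A)] = ls match {
    case Nil => Nil
    case head :: _ =>
      val (run, rest) = ls.span(_ == head)
      (run.length, head) :: runLengthEncode(rest)
  }

  // Number of runs of `subject` that are longer than one element, without a var.
  def countRepeatedRuns[A](ls: List[A], subject: A): Int =
    runLengthEncode(ls).count { case (len, x) => len > 1 && x == subject }

  def main(args: Array[String]): Unit = {
    println(countRepeatedRuns(List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5), 3)) // 2
    println(countRepeatedRuns(List(3, 3, 3), 3))                         // 1
  }
}
```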
If you knew your list was not very long you could do it with Strings.
val list = List(1,2,3,3,4,2,8,4,3,3,5)
val matchList = List(3,3)
(matchList.mkString(",")).r.findAllMatchIn(list.mkString(",")).length
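One thing to watch with the string approach is that the pattern "3,3" also matches inside the text for List(13, 3). A hedged variation is to anchor the pattern with word boundaries, as sketched below. It assumes non-negative integers (a leading minus sign would defeat the \b anchor), and the names StringMatchCount and countSlice are invented for this example. Matches are non-overlapping, so List(3, 3, 3) counts once.

```scala
import scala.util.matching.Regex

object StringMatchCount {
  // Counts non-overlapping occurrences of `pattern` as a run of whole numbers in `list`.
  // Assumes non-negative integers; \b does not anchor correctly before a minus sign.
  def countSlice(list: List[Int], pattern: List[Int]): Int = {
    val regex = ("\\b" + Regex.quote(pattern.mkString(",")) + "\\b").r
    regex.findAllMatchIn(list.mkString(",")).size
  }

  def main(args: Array[String]): Unit = {
    println(countSlice(List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5), List(3, 3))) // 2
    println(countSlice(List(13, 3), List(3, 3)))                           // 0
  }
}
```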
From your pseudocode I got this working:
val pairs = list.sliding(2).toList // create pairs of consecutive elements
val result = pairs.groupBy(x => x).map { case (x, y) => (x, y.size) } // group equal pairs and keep the group size, which is the number of occurrences
result will be a Map[List[Int], Int], so you can look up a count like this:
result(List(3,3)) // will return 2
I wasn't sure whether you also want to check runs of other sizes; if so, change the parameter of sliding to the desired size.
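def pack[A](ls: List[A]): List[List[A]] = {
  if (ls.isEmpty) Nil // no runs in an empty list (List(List()) would make encode call head on an empty run)
  else {
    // split off the leading run of elements equal to the head
    val (packed, next) = ls span { _ == ls.head }
    if (next == Nil) List(packed)
    else packed :: pack(next)
  }
}
// (run length, element) for each run of equal consecutive elements
def encode[A](ls: List[A]): List[(Int, A)] = pack(ls) map { e => (e.length, e.head) }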
val numberOfNs = list.distinct.map{ n =>
(n -> list.count(_ == n))
}.toMap
val runLengthPerN = encode(list).groupBy(_._2).map { case (n, runs) => n -> runs.map(_._1).max } // longest run of each number
val nRepeatedMostInSuccession = runLengthPerN.maxBy(_._2)._1
Here pack and encode, defined above, come from problem 9 and problem 10 of Scala's 99 problems (S-99).
Since numberOfNs and runLengthPerN are Maps, you can get the number of occurrences of any number in the list with numberOfNs(number), and the length of its longest repetition in succession with runLengthPerN(number). The raw run lengths, in list order, are just encode(list).