I am writing in Scala and within this if statement I have a for loop and I initialized i=0 and used i in the for loop. It is telling me that declaration is not used, but I am using it in the for loop. top is always equal to 5 also.
else {
var i = 0
for (i <- 0 until (top))
For loops work a little different in Scala than other languages. In Scala, a for comprehension is syntax sugar over foreach, filter, map/flatMap higher order functions.
When you write
for (i <- 0 until top)
The compiler re-writes this to
(0 until top).foreach {
i => ...
}
Here, the foreach body is an anonymous function where i is the function parameter that has the same type as the iterable. So you can just remove the declaration for i in your code snippet.
Like posted before, the for in Scala is a very different animal than in e.g. Java. Using it as an iterator is only one of its many potential usages.
For your code: The first and the second i are not the same. In fact, the variable in the for expression is ephemeral and doesn't leave the for's scope, but actually shadows the outer one.
var i = 0 // you don't need this
for (i <- 0 until (top)) // loops from 0 to whatever top is regardless what's in the outer i
This will print 0 1 2 3 4:
val top = 5
for (i <- 0 until (top)) {
println(i)
}
One really cool aspect of for comes from the yield keyword (example from the first link):
val names = List("adam", "david", "frank")
val ucNames = for (name <- names) yield name.capitalize
Also it is used to kind of map over multiple collections/monads etc. at once like in Haskell, a task rather tedious otherwise:
val names = List("adam", "david")
val numbers = List(1, 2)
val lst = for {
name <- names
number <- numbers
} yield s"$name: $number"
lst will now hold the cartesian product of the two lists as a List[String]: List("adam: 1", "adam: 2", "david: 1", "david: 2")
Related
This is mix Chisel / Scala question.
Background, I need to sum up a lot of numbers (the number of input signals in configurable). Due to timing constrains I had to split it to groups of 4 and pipe(register it), then it is fed into next stage (which will be 4 times smaller, until I reach on)
this is my code:
// log4 Aux function //
def log4(n : Int): Int = math.ceil(math.log10(n.toDouble) / math.log10(4.0)).toInt
// stage //
def Adder4PipeStage(len: Int,in: Vec[SInt]) : Vec[SInt] = {
require(in.length % 4 == 0) // will not work if not a muliplication of 4
val pipe = RegInit(VecInit(Seq.fill(len/4)(0.S(in(0).getWidth.W))))
pipe.zipWithIndex.foreach {case(p,j) => p := in.slice(j*4,(j+1)*4).reduce(_ +& _)}
pipe
}
// the pipeline
val adderPiped = for(j <- 1 to log4(len)) yield Adder4PipeStage(len/j,if(j==1) io.in else <what here ?>)
how to I access the previous stage, I am also open to hear about other ways to implement the above
There are several things you could do here:
You could just use a var for the "previous" value:
var prev: Vec[SInt] = io.in
val adderPiped = for(j <- 1 to log4(len)) yield {
prev = Adder4PipeStage(len/j, prev)
prev
}
It is a little weird using a var with a for yield (since the former is fundamentally mutable while the latter tends to be used with immutable-style code).
You could alternatively use a fold building up a List
// Build up backwards and reverse (typical in functional programming)
val adderPiped = (1 to log4(len)).foldLeft(io.in :: Nil) {
case (pipes, j) => Adder4PipeStage(len/j, pipes.head) :: pipes
}.reverse
.tail // Tail drops "io.in" which was 1st element in the result List
If you don't like the backwards construction of the previous fold,
You could use a fold with a Vector (better for appending than a List):
val adderPiped = (1 to log4(len)).foldLeft(Vector(io.in)) {
case (pipes, j) => pipes :+ Adder4PipeStage(len/j, pipes.last)
}.tail // Tail drops "io.in" which was 1st element in the result Vector
Finally, if you don't like these immutable ways of doing it, you could always just embrace mutability and write something similar to what one would in Java or Python:
For loop and mutable collection
val pipes = new mutable.ArrayBuffer[Vec[SInt]]
for (j <- 1 to log4(len)) {
pipes += Adder4PipeStage(len/j, if (j == 1) io.in else pipes.last)
}
I am using a for comprehension on a stream and I would like to know how many iterations took to get o the final results.
In code:
var count = 0
for {
xs <- xs_generator
x <- xs
count = count + 1 //doesn't work!!
if (x prop)
yield x
}
Is there a way to achieve this?
Edit: If you don't want to return only the first item, but the entire stream of solutions, take a look at the second part.
Edit-2: Shorter version with zipWithIndex appended.
It's not entirely clear what you are attempting to do. To me it seems as if you are trying to find something in a stream of lists, and additionaly save the number of checked elements.
If this is what you want, consider doing something like this:
/** Returns `x` that satisfies predicate `prop`
* as well the the total number of tested `x`s
*/
def findTheX(): (Int, Int) = {
val xs_generator = Stream.from(1).map(a => (1 to a).toList).take(1000)
var count = 0
def prop(x: Int): Boolean = x % 317 == 0
for (xs <- xs_generator; x <- xs) {
count += 1
if (prop(x)) {
return (x, count)
}
}
throw new Exception("No solution exists")
}
println(findTheX())
// prints:
// (317,50403)
Several important points:
Scala's for-comprehension have nothing to do with Python's "yield". Just in case you thought they did: re-read the documentation on for-comprehensions.
There is no built-in syntax for breaking out of for-comprehensions. It's better to wrap it into a function, and then call return. There is also breakable though, but it works with Exceptions.
The function returns the found item and the total count of checked items, therefore the return type is (Int, Int).
The error in the end after the for-comprehension is to ensure that the return type is Nothing <: (Int, Int) instead of Unit, which is not a subtype of (Int, Int).
Think twice when you want to use Stream for such purposes in this way: after generating the first few elements, the Stream holds them in memory. This might lead to "GC-overhead limit exceeded"-errors if the Stream isn't used properly.
Just to emphasize it again: the yield in Scala for-comprehensions is unrelated to Python's yield. Scala has no built-in support for coroutines and generators. You don't need them as often as you might think, but it requires some readjustment.
EDIT
I've re-read your question again. In case that you want an entire stream of solutions together with a counter of how many different xs have been checked, you might use something like that instead:
val xs_generator = Stream.from(1).map(a => (1 to a).toList)
var count = 0
def prop(x: Int): Boolean = x % 317 == 0
val xsWithCounter = for {
xs <- xs_generator;
x <- xs
_ = { count = count + 1 }
if (prop(x))
} yield (x, count)
println(xsWithCounter.take(10).toList)
// prints:
// List(
// (317,50403), (317,50721), (317,51040), (317,51360), (317,51681),
// (317,52003), (317,52326), (317,52650), (317,52975), (317,53301)
// )
Note the _ = { ... } part. There is a limited number of things that can occur in a for-comprehension:
generators (the x <- things)
filters/guards (if-s)
value definitions
Here, we sort-of abuse the value-definition syntax to update the counter. We use the block { counter += 1 } as the right hand side of the assignment. It returns Unit. Since we don't need the result of the block, we use _ as the left hand side of the assignment. In this way, this block is executed once for every x.
EDIT-2
If mutating the counter is not your main goal, you can of course use the zipWithIndex directly:
val xsWithCounter =
xs_generator.flatten.zipWithIndex.filter{x => prop(x._1)}
It gives almost the same result as the previous version, but the indices are shifted by -1 (it's the indices, not the number of tried x-s).
I am using scala to write up a spark application that reads data from csv files using dataframes (none of these details matter really, my question can be answered by anyone who is good at functional programming)
I'm used to sequential programming and its taking a while to think of things in the functional way.
I basically want to read to columns (a,b) from a csv file and keep track of those rows where b < 0.
I implemented this but its pretty much how I would do it Java and I would like to utilize Scala's features instead:
val ValueDF = fileDataFrame.select("colA", "colB")
val ValueArr = ValueDF.collect()
for ( index <- 0 until (ValueArr.length)){
var row = ValueArr(index)
var A = row(0).toString()
var B = row(1).toString().toDouble
if (B < 0){
//write A and B somewhere
}
}
Converting the dataframe to an array defeats the purpose of distributed computation.
So how could I possibly get the same results but instead of forming an array and traversing through it, I would rather want to perform some transformations of the data frame itself (such as map/filter/flatmap etc).
I should get going soon hopefully, just need some examples to wrap my head around it.
You are doing basically a filtering operation (ignore if not (B < 0)) and mapping (from each row, get A and B / do something with A and B).
You could write it like this:
val valueDF = fileDataFrame.select("colA", "colB")
val valueArr = valueDF.collect()
val result = valueArr.filter(_(1).toString().toDouble < 0).map{row => (row(0).toString(), row(1).toString().toDouble)}
// do something with result
You also can do first the mapping and then the filtering:
val result = valueArr.map{row => (row(0).toString(), row(1).toString().toDouble)}.filter(_._2 < 0)
Scala also offers more convenient versions for this kind of operations (thanks Sascha Kolberg), called withFilter and collect. withFilter has the advantage over filter that it doesn't create a new collection, saving you one pass, see this answer for more details. With collect you also map and filter in one pass, passing a partial function which allows to do pattern matching, see e.g. this answer.
In your case collect would look like this:
val valueDF = fileDataFrame.select("colA", "colB")
val valueArr = valueDF.collect()
val result = valueArr.collect{
case row if row(1).toString().toDouble < 0) => (row(0).toString(), row(1).toString().toDouble)
}
// do something with result
(I think there's a more elegant way to express this but that's left as an exercise ;))
Also, there's a lightweight notation called "sequence comprehensions". With this you could write:
val result = for (row <- valueArr if row(1).toString().toDouble < 0) yield (row(0).toString(), row(1).toString().toDouble)
Or a more flexible variant:
val result = for (row <- valueArr) yield {
val double = row(1).toString().toDouble
if (double < 0) {
(row(0).toString(), double)
}
}
Alternatively, you can use foldLeft:
val valueDF = fileDataFrame.select("colA", "colB")
val valueArr = valueDF.collect()
val result = valueArr.foldLeft(Seq[(String, Double)]()) {(s, row) =>
val a = row(0).toString()
val b = row(1).toString().toDouble
if (b < 0){
s :+ (a, b) // append tuple with A and B to results sequence
} else {
s // let results sequence unmodified
}
}
// do something with result
All of these are considered functional... which one you prefer is for the most part a matter of taste. The first 2 examples (filter/map, map/filter) do have a performance disadvantage compared to the rest because they iterate through the sequence twice.
Note that in FP it's very important to minimize side effects / isolate them from the main logic. I/O ("write A and B somewhere") is a side effect. So you normally will write your functions such that they don't have side effects - just input -> output logic without affecting or retrieving data from the surroundings. Once you have a final result, you can do side effects. In this concrete case, once you have result (which is a sequence of A and B tuples), you can loop through it and print it. This way you can for example change easily the way to print (you may want to print to the console, send to a remote place, etc.) without touching the main logic.
Also you should prefer immutable values (val) wherever possible, which is safer. Even in your loop, row, A and B are not modified so there's no reason to use var.
(Btw, I corrected the values names to start with lower case, see conventions).
In Scala I tend to favour writing large chained expressions over many smaller expressions with val assignments. At my company we've sort of evolved a style for this type of code. Here's a totally contrived example (idea is to show an expression with lots of chained calls):
import scala.util.Random
val table = (1 to 10) map { (Random.nextInt(100), _) } toMap
def foo: List[Int] =
(1 to 100)
.view
.map { _ + 3 }
.filter { _ > 10 }
.flatMap { table.get }
.take(3)
.toList
Daniel Spiewak's Scala Style Guide (pdf), which I generally like, suggests the leading dot notation in the chained method calls may be bad (see doc: Method Invocation / Higher-Order Functions), though it doesn't cover multi-line expressions like this directly.
Is there another, more accepted/idiomatic way to write the function foo above?
UPDATE: 28-Jun-2011
Lots of great answers and discussion below. There doesn't appear to be a 100% "you must do it this way" answer, so I'm going to accept the most popular answer by votes, which is currently the for comprehension approach. Personally, I think I'm going to stick with the leading-dot notation for now and accept the risks that come with it.
The example is slightly unrealistic, but for complex expressions, it's often far cleaner to use a comprehension:
def foo = {
val results = for {
x <- (1 to 100).view
y = x + 3 if y > 10
z <- table get y
} yield z
(results take 3).toList
}
The other advantage here is that you can name intermediate stages of the computation, and make it more self-documenting.
If brevity is your goal though, this can easily be made into a one-liner (the point-free style helps here):
def foo = (1 to 100).view.map{3+}.filter{10<}.flatMap{table.get}.take(3).toList
//or
def foo = ((1 to 100).view map {3+} filter {10<} flatMap {table.get} take 3).toList
and, as always, optimise your algorithm where possible:
def foo = ((1 to 100).view map {3+} filter {10<} flatMap {table.get} take 3).toList
def foo = ((4 to 103).view filter {10<} flatMap {table.get} take 3).toList
def foo = ((11 to 103).view flatMap {table.get} take 3).toList
I wrap the entire expression into a set of parenthesis to group things and avoid dots if possible,
def foo: List[Int] =
( (1 to 100).view
map { _ + 3 }
filter { _ > 10 }
flatMap { table.get }
take(3)
toList )
Here's how extempore does it. You can't go wrong.
(specMember
setInfo subst(env, specMember.info.asSeenFrom(owner.thisType, sym.owner))
setFlag (SPECIALIZED)
resetFlag (DEFERRED | CASEACCESSOR | ACCESSOR | LAZY)
)
Authentic compiler source!
I prefer lots of vals:
def foo = {
val range = (1 to 100).view
val mappedRange = range map { _+3 }
val importantValues = mappedRange filter { _ > 10 } flatMap { table.get }
(importantValues take 3).toList
}
Because I don't know what you want to purpose with your code, I chose random names for the vals.
There is a big advantage to choose vals instead of the other mentioned solutions:
It is obvious what your code does. In your example and in the solutions mentioned in most other answers anyone does not know at first sight what it does. There is too much information in one expression. Only in a for-expression, mentioned by #Kevin, it is possible to choose telling names but I don't like them because:
They need more lines of code
They are slower due to pattern match the declared values (I mentioned this here).
Just my opinion, but I think they look ugly
My rule: if the expression fits on a single (80-120 character) line, keep it on one line and omit the dots wherever possible:
def foo: List[Int] =
(1 to 100).view map { _ + 3 } filter { _ > 10 } flatMap table.get take 3 toList
As Kevin pointed out, the point-free style may improve brevity (but could harm readability for developers not familiar with it):
def foo: List[Int] =
(1 to 100).view map{3+} filter{10<} flatMap table.get take 3 toList
The leading dot notation is perfectly acceptable if you need to separate the expression over multiple lines due to length. Another reason to use this notation is when the operations need individual comments. If you need to spread an expression over multiple lines, due to its length or the need to comment individual operations, it's best to wrap the entire expression in parens (as Alex Boisvert suggests. In these situations, each (logical) operation should go on its own line (i.e. each operation goes on a single line, except where multiple consecutive operations can be described succinctly by a single comment):
def foo: List[Int] =
( (1 to 100).view
map { _ + 3 }
filter { _ > 10 }
flatMap table.get
take 3
toList )
This technique avoids potential semicolon inference issues that can arise when using leading dot notation or calling a 0-arg method at the end of the expression.
I usually try to avoid using dot for things like map and filter. So I would probably write it like the following:
def foo: List[Int] =
(1 to 100).view map { x =>
x + 3 } filter { x =>
x > 10 } flatMap { table.get } take(3) toList
The leading dot notation is very readable. I might start using that.
Is it possible to use yield as an iterator without evaluation of every value?
It is a common task when it is easy to implement complex list generation, and then you need to convert it into Iterator, because you don't need some results...
Sure. Actually, there are three options for non-strictness, which I list below. For the examples, assume:
val list = List.range(1, 10)
def compute(n: Int) = {
println("Computing "+n)
n * 2
}
Stream. A Stream is a lazily evaluated list. It will compute values on demand, but it will not recompute values once they have been computed. It is most useful if you'll reuse parts of the stream many times. For example, running the code below will print "Computing 1", "Computing 2" and "Computing 3", one time each.
val stream = for (n <- list.toStream) yield compute(n)
val third = stream(2)
println("%d %d" format (third, stream(2)))
A view. A view is a composition of operations over a base collection. When examining a view, each element examined is computed on-demand. It is most useful if you'll randomly access the view, but will never look but at a small part of it. For example, running the code below will print "Computing 3" two times, and nothing else (well, besides the result).
val view = for (n <- list.view) yield compute(n)
val third = view(2)
println("%d %d" format (third, view(2)))
Iterator. An Iterator is something that is used to lazily walk through a collection. One can think of it as a "one-shot" collection, so to speak. It will neither recompute nor store any elements -- once an element has been "computed", it cannot be used again. It is a bit more tricky to use because of that, but it is the most efficient one given these constraints. For example, the following example needs to be different, because Iterator does not support indexed access (and view would perform badly if written this way), and the code below prints "Computing 1", "Computing 2", "Computing 3", "Computing 4", "Computing 5" and "Computing 6". Also, it prints two different numbers at the end.
val iterator = for (n <- list.iterator) yield compute(n)
val third = iterator.drop(2).next
println("%d %d" format (third, iterator.drop(2).next))
Use views if you want lazy evaluation, see Views.
The Scala 2.8 Collections API is a fantastic read if you're going to use the Scala collections a lot.
I have a List...
scala> List(1, 2, 3)
res0: List[Int] = List(1, 2, 3)
And a function...
scala> def foo(i : Int) : String = { println("Eval: " + i); i.toString + "Foo" }
foo: (i: Int)String
And now I'll use a for-comprehension with an Iterator...
scala> for { i <- res0.iterator } yield foo(i)
res2: Iterator[java.lang.String] = non-empty iterator
You can use a for comprehension on any type with flatMap, map and filter methods. You could also use the views:
scala> for { i <- res0.view } yield foo(i)
res3: scala.collection.SeqView[String,Seq[_]] = SeqViewM(...)
Evaluation is non-strict in either case...
scala> res3.head
Eval: 1
res4: String = 1Foo