Flatten syntax with yield - improving code readability - scala

I'm trying to improve the readability of my code and I'm having a hard time with this little chunk.
Foo is a method that accepts a List[Ping]
Thing.generate returns a List[Ping]
ListOfPings is a List[Ping]
hasQuality returns a boolean value from evaluating a Ping
Here's the code:
foo((for {
pinger <- listOfPings
} yield pinger.generate.filter(_.hasQuality)).flatten)
Each Ping in listOfPingss is creating a List[Thing] with the generate method, meaning the result of the yield at the end of the loop is a List[List[Ping]].
I'm flattening that List[List[Ping]] (not the individual lists), and putting the whole result into foo
I'm having trouble making this look nicer, potentially with a flatmap? I sincerely appreciate the help.

Something like:
foo {
for (p <- listOfPings ; q <- p.generate if q.hasQuality) yield q
}

Related

Execution order in Scala for/yield block

I make three database calls (that all return Future values) using this syntax:
for {
a <- databaseCallA
b <- databaseCallB(a)
c <- databaseCallC(a)
} yield (a,b,c)
The second and third call depend on the result of the first, but the two of them could be run in parallel.
How can I get databaseCallC to be issued immediately after databaseCallB (without waiting for the result b)?
Or is this already happening?
This is not happening currently - you have told the Futures to start one after the other. To parallelise the second and third call, you could use this:
for {
a <- databaseCallA
(eventualB, eventualC) = (databaseCallB(a), databaseCallC(a))
b <- eventualB
c <- eventualC
} yield(a,b,c)
This will start both the computation of b and c as soon as a is available, and complete once all three are available with the triple

What does a pure expression mean in Scala?

I have two questions. The first one: is code a pure expression?
lazy val code: Unit = {
// block of code
var s = "abc"
for (i <- 0 until 10) println(i)
s += s concat "def"
println(s)
}
And the second one: What does a pure expression mean? Is this a some code which does not return anything?
A pure expression is a computation that serves only to produce a resulting value - it has no side effects. In the case of your field code above, you make calls to print stuff to the console (println), which is considered a side effect, so it is not a pure expression. An example of a pure expression would be something like:
lazy val foo = 2 + 3
It does nothing apart from generate the final value for foo, and could safely be replaced by the result of the computation (5) without changing the outcome of the program in any way. If you made such a replacement in your code above:
lazy val code: Unit = ()
you would change the program - it no longer prints anything to the console.
Have a look here, for example, for more information about pure functions and pure expressions, and their significance in functional programming.

Scala lazy val caching

In the following example:
def maybeTwice2(b: Boolean, i: => Int) = {
lazy val j = i
if (b) j+j else 0
}
Why is hi not printed twice when I call it like:
maybeTwice2(true, { println("hi"); 1+41 })
This example is actually from the book "Functional Programming in Scala" and the reason given as why "hi" not getting printed twice is not convincing enough for me. So just thought of asking this here!
So i is a function that gives an integer right? When you call the method you pass b as true and the if statement's first branch is executed.
What happens is that j is set to i and the first time it is later used in a computation it executes the function, printing "hi" and caching the resulting value 1 + 41 = 42. The second time it is used the resulting value is already computed and hence the function returns 84, without needing to compute the function twice because of the lazy val j.
This SO answer explores how a lazy val is internally implemented. In j + j, j is a lazy val, which amounts to a function which executes the code you provide for the definition of the lazy val, returns an integer and caches it for further calls. So it prints hi and returns 1+41 = 42. Then the second j gets evaluated, and calls the same function. Except this time, instead of running your code, it fetches the value (42) from the cache. The two integers are then added (returning 84).

Scala while loop returns Unit all the time

I have the following code, but I can't get it to work. As soon as I place a while loop inside the case, it's returning a unit, no matter what I change within the brackets.
case While(c, body) =>
while (true) {
eval(Num(1))
}
}
How can I make this while loop return a non-Unit type?
I tried adding brackets around my while condition, but still it doesn't do what it's supposed to.
Any pointers?
Update
A little more background information since I didn't really explain what the code should do, which seems to be handy if I want to receive some help;
I have defined a eval(exp : Exp). This will evaluate a function.
Exp is an abstract class. Extended by several classes like Plus, Minus (few more basic operations) and a IfThenElse(cond : Exp, then : Exp, else : Exp). Last but not least, there's the While(cond: Exp, body: Exp).
Example of how it should be used;
eval(Plus(Num(1),Num(4)) would result in NumValue(5). (Evaluation of Num(v : Value) results in NumValue(v). NumValue extends Value, which is another abstract class).
eval(While(Lt(Num(1),Var("n")), Plus(Num(1), Var("n"))))
Lt(a : Exp, b : Exp) returns NumValue(1) if a < b.
It's probably clear from the other answer that Scala while loops always return Unit. What's nice about Scala is that if it doesn't do what you want, you can always extend it.
Here is the definition of a while-like construct that returns the result of the last iteration (it will throw an exception if the loop is never entered):
def whiley[T](cond : =>Boolean)(body : =>T) : T = {
#scala.annotation.tailrec
def loop(previous : T) : T = if(cond) loop(body) else previous
if(cond) loop(body) else throw new Exception("Loop must be entered at least once.")
}
...and you can then use it as a while. (In fact, the #tailrec annotation will make it compile into the exact same thing as a while loop.)
var x = 10
val atExit = whiley(x > 0) {
val squared = x * x
println(x)
x -= 1
squared
}
println("The last time x was printed, its square was : " + atExit)
(Note that I'm not claiming the construct is useful.)
Which iteration would you expect this loop to return? If you want a Seq of the results of all iterations, use a for expression (also called for comprehension). If you want just the last one, create a var outside the loop, set its value on each iteration, and return that var after the loop. (Also look into other looping constructs that are implemented as functions on different types of collections, like foldLeft and foldRight, which have their own interesting behaviors as far as return value goes.) The Scala while loop returns Unit because there's no sensible one size fits all answer to this question.
(By the way, there's no way for the compiler to know this, but the loop you wrote will never return. If the compiler could theoretically be smart enough to figure out that while(true) never terminates, then the expected return type would be Nothing.)
The only purpose of a while loop is to execute a side-effect. Or put another way, it will always evaluate to Unit.
If you want something meaningful back, why don't you consider using an if-else-expression or a for-expression?
As everyone else and their mothers said, while loops do not return values in Scala. What no one seems to have mentioned is that there's a reason for that: performance.
Returning a value has an impact on performance, so the compiler would have to be smart about when you do need that return value, and when you don't. There are cases where that can be trivially done, but there are complex cases as well. The compiler would have to be smarter, which means it would be slower and more complex. The cost was deemed not worth the benefit.
Now, there are two looping constructs in Scala (all the others are based on these two): while loops and recursion. Scala can optimize tail recursion, and the result is often faster than while loops. Or, otherwise, you can use while loops and get the result back through side effects.

Best way to create generic/method consistency for sort.data.frame?

I've finally decided to put the sort.data.frame method that's floating around the internet into an R package. It just gets requested too much to be left to an ad hoc method of distribution.
However, it's written with arguments that make it incompatible with the generic sort function:
sort(x,decreasing,...)
sort.data.frame(form,dat)
If I change sort.data.frame to take decreasing as an argument as in sort.data.frame(form,decreasing,dat) and discard decreasing, then it loses its simplicity because you'll always have to specify dat= and can't really use positional arguments. If I add it to the end as in sort.data.frame(form,dat,decreasing), then the order doesn't match with the generic function. If I hope that decreasing gets caught up in the dots `sort.data.frame(form,dat,...), then when using position-based matching I believe the generic function will assign the second position to decreasing and it will get discarded. What's the best way to harmonize these two functions?
The full function is:
# Sort a data frame
sort.data.frame <- function(form,dat){
# Author: Kevin Wright
# http://tolstoy.newcastle.edu.au/R/help/04/09/4300.html
# Some ideas from Andy Liaw
# http://tolstoy.newcastle.edu.au/R/help/04/07/1076.html
# Use + for ascending, - for decending.
# Sorting is left to right in the formula
# Useage is either of the following:
# sort.data.frame(~Block-Variety,Oats)
# sort.data.frame(Oats,~-Variety+Block)
# If dat is the formula, then switch form and dat
if(inherits(dat,"formula")){
f=dat
dat=form
form=f
}
if(form[[1]] != "~") {
stop("Formula must be one-sided.")
}
# Make the formula into character and remove spaces
formc <- as.character(form[2])
formc <- gsub(" ","",formc)
# If the first character is not + or -, add +
if(!is.element(substring(formc,1,1),c("+","-"))) {
formc <- paste("+",formc,sep="")
}
# Extract the variables from the formula
vars <- unlist(strsplit(formc, "[\\+\\-]"))
vars <- vars[vars!=""] # Remove spurious "" terms
# Build a list of arguments to pass to "order" function
calllist <- list()
pos=1 # Position of + or -
for(i in 1:length(vars)){
varsign <- substring(formc,pos,pos)
pos <- pos+1+nchar(vars[i])
if(is.factor(dat[,vars[i]])){
if(varsign=="-")
calllist[[i]] <- -rank(dat[,vars[i]])
else
calllist[[i]] <- rank(dat[,vars[i]])
}
else {
if(varsign=="-")
calllist[[i]] <- -dat[,vars[i]]
else
calllist[[i]] <- dat[,vars[i]]
}
}
dat[do.call("order",calllist),]
}
Example:
library(datasets)
sort.data.frame(~len+dose,ToothGrowth)
Use the arrange function in plyr. It allows you to individually pick which variables should be in ascending and descending order:
arrange(ToothGrowth, len, dose)
arrange(ToothGrowth, desc(len), dose)
arrange(ToothGrowth, len, desc(dose))
arrange(ToothGrowth, desc(len), desc(dose))
It also has an elegant implementation:
arrange <- function (df, ...) {
ord <- eval(substitute(order(...)), df, parent.frame())
unrowname(df[ord, ])
}
And desc is just an ordinary function:
desc <- function (x) -xtfrm(x)
Reading the help for xtfrm is highly recommended if you're writing this sort of function.
There are a few problems there. sort.data.frame needs to have the same arguments as the generic, so at a minimum it needs to be
sort.data.frame(x, decreasing = FALSE, ...) {
....
}
To have dispatch work, the first argument needs to be the object dispatched on. So I would start with:
sort.data.frame(x, decreasing = FALSE, formula = ~ ., ...) {
....
}
where x is your dat, formula is your form, and we provide a default for formula to include everything. (I haven't studied your code in detail to see exactly what form represents.)
Of course, you don't need to specify decreasing in the call, so:
sort(ToothGrowth, formula = ~ len + dose)
would be how to call the function using the above specifications.
Otherwise, if you don't want sort.data.frame to be an S3 generic, call it something else and then you are free to have whatever arguments you want.
I agree with #Gavin that x must come first. I'd put the decreasing parameter after the formula though - since it probably isn't used that much, and hardly ever as a positional argument.
The formula argument would be used much more and therefore should be the second argument. I also strongly agree with #Gavin that it should be called formula, and not form.
sort.data.frame(x, formula = ~ ., decreasing = FALSE, ...) {
...
}
You might want to extend the decreasing argument to allow a logical vector where each TRUE/FALSE value corresponds to one column in the formula:
d <- data.frame(A=1:10, B=10:1)
sort(d, ~ A+B, decreasing=c(A=TRUE, B=FALSE)) # sort by decreasing A, increasing B