How can I split a list into multiple other lists? - scala

I only recently started working with Scala and ran into a problem I can't find a solution to. I'm given an input text file named "in.txt", which contains lines of coordinates that I have to work with, as shown below.
2 1
6 6
4 2
2 5
2 6
2 7
3 4
6 1
6 2
2 3
6 3
6 4
6 5
6 7
I decided to use a List to store all the values so I could use built in functions to do calculations with the values afterwards.
val lines = io.Source.fromFile("in.txt").getLines
val coordinates =
lines
.drop(0)
.toList
.sortWith(_<_)
.mkString
.replaceAll("\\s", "")
.grouped(2)
.toList
Everything works as it should, as the output of println(coordinates) is
List(21, 23, 25, 26, 27, 34, 42, 61, 62, 63, 64, 65, 66, 67)
What I want to do next is create multiple lists out of this one. For example, a new list should be created for the values that start with "2", so that all of them end up together like this:
List(21, 23, 25, 26, 27)
Then the same would be done with "3", then "4" and so on.
Functions such as .partition and .groupBy work, but the values in the coordinates can reach 4 digits and can change whenever the input file is edited, so it would be a pain to write all those conditions manually. So my question is: can this be achieved with Scala's built-in functionality, using some form of iteration?
Thanks in advance!

I am assuming your file can take a mixture of 2, 3, 4, ... digit strings.
scala> val l = List("12", "13", "123", "1234")
l: List[String] = List(12, 13, 123, 1234)
scala> val grouped = l.groupBy(s => s.take(s.length - 1)).values
grouped: Iterable[List[String]] = MapLike(List(123), List(12, 13), List(1234))
If you want this sorted:
val grouped = l.groupBy(s => s.take(s.length - 1)).toSeq.sortBy(_._1).map{ case (_, l) => l.sorted}
grouped: Seq[List[String]] = ArrayBuffer(List(12, 13), List(123), List(1234))
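Once coordinates can have more than one digit, concatenating the two numbers into one string loses the boundary between them ("112" could be 1 and 12, or 11 and 2). A minimal sketch of an alternative, assuming each line holds exactly two whitespace-separated integers: parse each line into a pair and group on the first number. The sample lines and the names sampleLines, pairs and byFirst are illustrative, not taken from the question.

```scala
// Hypothetical sample standing in for io.Source.fromFile("in.txt").getLines
val sampleLines = List("2 1", "6 6", "4 2", "2 5", "10 3", "2 3", "10 12")

// Parse each line into an (x, y) pair; assumes exactly two integers per line.
val pairs: List[(Int, Int)] = sampleLines.map { line =>
  line.trim.split("\\s+") match {
    case Array(x, y) => (x.toInt, y.toInt)
  }
}

// Group by the first coordinate, then sort the groups and their members.
val byFirst: List[(Int, List[(Int, Int)])] =
  pairs.groupBy(_._1).toList.sortBy(_._1).map { case (x, ps) => (x, ps.sorted) }

byFirst.foreach { case (x, ps) => println(s"$x -> $ps") }
```

Because groupBy derives the keys from the data, no conditions have to be written by hand, and the grouping adapts when the input file changes.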

You can generate all your input conditions with a range:
val conditions = 1 to 9999
And then foldLeft them filtering your original list by each of its elements:
conditions.foldLeft(List():List[List[Int]])((acc, elem) => l.filter(_.toString.startsWith(elem.toString))::acc).filterNot(_.isEmpty)
Output
res28: List[List[Int]] = List(List(67), List(66), List(65), List(64), List(63), List(62), List(61), List(42), List(34), List(27), List(26), List(25), List(23), List(21), List(61, 62, 63, 64, 65, 66, 67), List(42), List(34), List(21, 23, 25, 26, 27))

Related

Trying to understand the Range collection in Scala and why errors arise upon assigning the data type in Scala v2.13 as opposed to v2.11

I am taking a basic Scala training offered here: https://www.lynda.com/Scala-tutorials/Scala-Essential-Training-Data-Science/559182-2.html and on an introductory section where the instructor is introducing collections, he issues these commands in the REPL (using Scala v2.11):
scala> val myRange = 1 to 10
myRange: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> val myRange2 : Range = new Range(1, 101, 2)
myRange2: Range = Range(1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99)
The inferred type is scala.collection.immutable.Range.Inclusive, while the explicitly typed one is just Range.
Furthermore, in Scala 2.13 (the version I had installed before I rolled back to the instructor's version), the same commands result in:
scala> val myRange = 1 to 10
myRange: scala.collection.immutable.Range.Inclusive = Range 1 to 10
scala> val myRange2 : Range = new Range(1, 101, 2)
^
error: class Range is abstract; cannot be instantiated
The instructor gives no explanation of why these types differ, and I'm struggling to understand why an error occurs in the newer version of Scala. Was the Range class previously not abstract? And if so, why was it changed?
Taking it in reverse: yes, Scala 2.13 made Range abstract when it was previously concrete. However, there was never any need to use new to create one, because val myRange2 = Range(1, 101, 2) works just fine, so this is an error in the tutorial.
The to method returns the type Range.Inclusive which is why this is printed by REPL. This is a subtype of Range so it has all the methods of Range and can be used wherever a Range can be used.
new Range returns Range because it is explicitly calling the constructor so it must return Range.
Note that if you do use Range(1, 101, 2) this will return Range in 2.12 and Range.Exclusive in 2.13.
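As a small sketch of the point above, the ranges from the tutorial can be built without new, using only the companion-object factories and the to/until/by methods. The value names here are illustrative.

```scala
// Factory method on the companion object; no `new` needed.
val odds: Range = Range(1, 101, 2)

// The same range built with until/by.
val oddsAlt: Range = 1 until 101 by 2

// Inclusive ranges: `to` and Range.inclusive both include the upper bound.
val oneToTen: Range = 1 to 10
val oneToTenAlt: Range = Range.inclusive(1, 10)

println(odds.last)      // last odd number below 101
println(oneToTen.last)  // upper bound is included
```

Annotating the values as Range works for every form, since Range.Inclusive and Range.Exclusive are subtypes of Range.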
If you compare an older Range Scaladocs page (2.12.7 in this case) to the current Range Scaladocs page (2.13.1), you'll see that, yes, the Range class was changed to abstract. Not sure why. Collections went through a lot of changes with the 2.13 release.
As for the different Range type refinements, it's because to means Inclusive, which is not the default Range type.
Welcome to Scala 2.12.7 (OpenJDK 64-Bit Server VM, Java 11.0.6).
Type in expressions for evaluation. Or try :help.
scala> 2 to 8
res0: scala.collection.immutable.Range.Inclusive = Range 2 to 8
scala> 2 until 9
res1: scala.collection.immutable.Range = Range 2 until 9
scala> Range(2, 9)
res2: scala.collection.immutable.Range = Range 2 until 9
And there's been a further refinement in 2.13.
2 to 8 //res0: scala.collection.immutable.Range.Inclusive = Range 2 to 8
2 until 9 //res1: scala.collection.immutable.Range = Range 2 until 9
Range(2, 9) //res2: scala.collection.immutable.Range.Exclusive = Range 2 until 9

printing elements in list using stream

Why does the following code print only 1 and not the rest of the list elements?
scala> val l: List [Int] = List(1,2,3)
l: List[Int] = List(1, 2, 3)
scala> l.toStream.map(x => print(x))
1res3: scala.collection.immutable.Stream[Unit] = Stream((), ?)
What is the correct way to write this code?
I'll divide my answer into two parts:
1. The map method in Scala:
You're using map, which expects a function with no side effects (printing is a side effect). What you're looking for is:
l.toStream.foreach(x => print(x))
The general idea is that map takes something and converts it to something else (for example, by increasing its value), while foreach performs some action on each value and isn't supposed to return anything.
scala> l.toStream.foreach(x => print(x))
123
2. Stream in Scala:
Streams are lazy, so Scala only computes the values it needs. Try this:
scala> l.toStream.map(x => x+1)
res2: scala.collection.immutable.Stream[Int] = Stream(2, ?)
You can see it computed the first value, and the question mark indicates that it doesn't know what comes after it, because it hasn't computed it yet. In your example the first value is (), as print returns no meaningful value.
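To make the laziness visible, here is a minimal sketch that counts how often the function passed to map actually runs. The counter and the value names are assumptions added for illustration. Note that Stream is deprecated since Scala 2.13 in favour of LazyList, which does not even compute the head eagerly.

```scala
// Counter used only to observe how many times the mapping function runs.
var evaluated = 0

val source = List(1, 2, 3).toStream

// map on a Stream computes the head right away and defers the tail.
val mapped = source.map { x =>
  evaluated += 1
  x + 1
}
val afterMap = evaluated // only the head has been computed so far

// Forcing the stream (here with toList) runs the function on the remaining elements.
val result = mapped.toList
val afterForce = evaluated

println(s"after map: $afterMap, after toList: $afterForce, result: $result")
```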
A Stream is an on-demand data structure, which means its values are not evaluated until you need them.
For example:
scala> val stream = (1 to 10000).toStream
stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)
Now if you access head and tail, the stream will be evaluated up to the second element.
scala> stream.head
res13: Int = 1
scala> stream.tail
res14: scala.collection.immutable.Stream[Int] = Stream(2, ?)
scala> stream
res15: scala.collection.immutable.Stream[Int] = Stream(1, 2, ?)
If you access index 99,
scala> stream(99)
res16: Int = 100
Now if you print the stream, you can see it has been evaluated up to index 99:
scala> stream
res17: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, ?)
It's always good to process only the elements of a stream that you need. You can use take() for that.
scala> stream.take(50).foreach(x => print(x + " "))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
So, the answer to your question can be:
scala> List(1,2,3).toStream.take(3).foreach(x => print(x + " "))
1 2 3
Reference
https://www.coursera.org/learn/progfun2/home/week/2
To print the complete stream, use
l.toStream.print
Output: 1, 2, 3, empty
To print the first n values, you may use take(n)
l.toStream.take(2).print
prints output: 1, 2, empty
You can print it with
l.toStream.foreach(println)
But generally speaking, it is not a good idea to print or otherwise process a whole Stream without care, since it may be infinite and cause an error.
More info about Streams here
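A short sketch of why care is needed: Stream.from(1) never ends, so it has to be bounded with take or takeWhile before a consumer such as toList or foreach is applied. The value names are illustrative.

```scala
// An infinite stream of 1, 2, 3, ...; only the head is computed at this point.
val naturals = Stream.from(1)

// Bound the stream before consuming it.
val firstFive = naturals.take(5).toList

// Transformations stay lazy, so they are safe as long as a bound follows.
val smallEvens = naturals.map(_ * 2).takeWhile(_ < 10).toList

println(firstFive)
println(smallEvens)
```

Calling naturals.toList or naturals.foreach(println) directly would never finish.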
Streams in Scala are lazy data structures, which means they tend to perform only the work that is needed.
scala> val stream1 = Stream.range(1, 10)
// stream1: scala.collection.immutable.Stream[Int] = Stream(1, ?)
In this case, only the first element is computed. The stream knows how to compute the rest of the elements and will compute them only when it actually needs them, for example when faced with "consumers":
scala> val list1 = stream1.toList
// list1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> stream1.foreach(print)
// 123456789
But when faced with "transformers", Streams just keep the new transformation with them and do not apply it to the whole Stream. The map method is supposed to provide transformations:
scala> val stream2 = stream1.map(i => i + 5)
// stream2: scala.collection.immutable.Stream[Int] = Stream(6, ?)
So it just knows that it has to apply the function i => i + 5 to the respective elements of stream1 to get the elements of stream2, and it will do that when required (when facing a consumer).
Let's consider something similar to your example:
scala> val stream3 = stream1.map(i => println("element :: " + i))
// element :: 1
// stream3: scala.collection.immutable.Stream[Unit] = Stream((), ?)
Here your "transform" function takes an Int element, prints it on a line, and returns nothing, which is called Unit or () in Scala. Our lazy stream computes this transform for the first element and not for the rest, and that computation results in element :: 1 being printed.
Now, let's see what happens when we apply a consumer to it:
scala> val list3 = stream3.toList
// element :: 2
// element :: 3
// element :: 4
// element :: 5
// element :: 6
// element :: 7
// element :: 8
// element :: 9
// list3: List[Unit] = List((), (), (), (), (), (), (), (), ())
This will look wrong to most people: all I wanted was to convert my stream to a list, so why are all these lines getting printed?
This is why, when you are using map, you should provide a pure function.
What is a pure function? The simple answer is that a pure function only does the thing it is supposed to do and nothing else. It does not cause any change outside its scope. It just takes something and gives something else back.
All of the following are pure functions,
scala> val pf1 = (i: Int) => i + 1
// pf1: Int => Int = $$Lambda$1485/1538411140@6fdc53db
scala> val pf2 = (i: Int) => {
| val x = 100
| val xy = 200
| xy + i
| }
// pf2: Int => Int = $$Lambda$1487/14070792@7bf770ba
scala> val pf3 = (i: Int) => ()
// pf3: Int => Unit = $$Lambda$1486/1145379385@336cd7d5
Whereas the following is not a pure function:
val npf1 = (i: Int) => println(i)
// npf1: Int => Unit = $$Lambda$1488/1736134005@7ac97ba6
Because it causes a "magical" change in the environment.
Why "magical"? Because it claims to be a function of type Int => Unit, which means it should just transform an Int into a Unit. But it also printed something on our console, which is outside of its environment.
A real-world example of this magic would be: whenever you put bread in your toaster, it causes a rain storm at the Hulk's current location. And nobody wants the Hulk to come looking for their toaster.
In general, the bottom line is that you should not use side effects in .map. When you do foo.map(bar), that just returns another collection containing the elements generated by applying bar to the elements of the original collection. It may or may not be lazy. The point is, you should treat the elements of any collection as undefined until something looks at them.
If you want side effects, use foreach: Seq(1,2,3).toStream.foreach(println)

Set operation to divide lists by one another in Scala

Right now I have 2 lists in Scala:
val one = List(50, 10, 17, 8, 16)
val two = List(582, 180, 174, 159, 158)
These lists are going to be of the same length, and right now I'm looking to divide each element of the first list by a corresponding element in the second. In other words, I want a list that consists of:
List(50/582, 10/180, etc...)
Is there a set operation that accomplishes this that can be done without looping?
Thank you!
You can use the zip function.
val one = List(50, 10, 17, 8, 16)
val two = List(582, 180, 174, 159, 158)
one.zip(two).map {
case (a, b) => a.toDouble/b.toDouble
}
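The conversion to Double in the answer matters: with two Int operands, / performs integer division and truncates toward zero. A minimal sketch contrasting the two, using the lists from the question (the names truncated and ratios are illustrative):

```scala
val one = List(50, 10, 17, 8, 16)
val two = List(582, 180, 174, 159, 158)

// Int / Int truncates toward zero, so every ratio here collapses to 0.
val truncated: List[Int] = one.zip(two).map { case (a, b) => a / b }

// Converting one operand to Double is enough to get floating-point division.
val ratios: List[Double] = one.zip(two).map { case (a, b) => a.toDouble / b }

println(truncated)
println(ratios)
```

Note that zip stops at the end of the shorter list, so lists of unequal length are silently truncated rather than rejected.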

ScalaCheck: choose an integer with custom probability distribution

I want to create a generator in ScalaCheck that generates numbers between say 1 and 100, but with a bell-like bias towards numbers closer to 1.
Gen.choose() distributes numbers randomly between the min and max value:
scala> (1 to 10).flatMap(_ => Gen.choose(1,100).sample).toList.sorted
res14: List[Int] = List(7, 21, 30, 46, 52, 64, 66, 68, 86, 86)
And Gen.chooseNum() has an added bias for the upper and lower bounds:
scala> (1 to 10).flatMap(_ => Gen.chooseNum(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 61, 85, 86, 91, 92, 100, 100)
I'd like a choose() function that would give me a result that looks something like this:
scala> (1 to 10).flatMap(_ => choose(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 2, 5, 11, 18, 35, 49, 100)
I see that choose() and chooseNum() take an implicit Choose trait as an argument. Should I use that?
You could use Gen.frequency():
val frequencies = List(
(50000, Gen.choose(0, 9)),
(38209, Gen.choose(10, 19)),
(27425, Gen.choose(20, 29)),
(18406, Gen.choose(30, 39)),
(11507, Gen.choose(40, 49)),
( 6681, Gen.choose(50, 59)),
( 3593, Gen.choose(60, 69)),
( 1786, Gen.choose(70, 79)),
( 820, Gen.choose(80, 89)),
( 347, Gen.choose(90, 100))
)
(1 to 10).flatMap(_ => Gen.frequency(frequencies:_*).sample).toList
res209: List[Int] = List(27, 21, 31, 1, 21, 18, 9, 29, 69, 29)
I got the frequencies from https://en.wikipedia.org/wiki/Standard_normal_table#Complementary_cumulative. The code is just a sample of the table (% 3 or mod 3), but I think you can get the idea.
I can't take much credit for this, and will point you to this excellent page:
http://www.javamex.com/tutorials/random_numbers/gaussian_distribution_2.shtml
A lot of this depends what you mean by "bell-like". Your example doesn't show any negative numbers but the number "1" can't be in the middle of the bell and not produce any negative numbers unless it was a very, very tiny bell!
Forgive the mutable loop but I use them sometimes when I have to reject values in a collection build:
object Test_Stack extends App {
  val r = new java.util.Random()
  val maxBellAttempt = 102
  // this number * 3 will happen about 99% of the time
  val stdv = maxBellAttempt / 3
  val collectSize = 100000
  val l = scala.collection.mutable.Buffer[Int]()
  // ref article above: "What are the minimum and maximum values with nextGaussian()?"
  while (l.size < collectSize) {
    // the +1 is the mean (avg) offset; it can be whatever you like
    // the abs clips the curve in half; you could remove it but you'd need to move the +1 over more
    val temp = (r.nextGaussian() * stdv + 1).abs.round.toInt
    // reject values beyond the cutoff
    if (temp <= maxBellAttempt) l += temp
  }
  // immutable copy of the collected samples (toList works on both 2.12 and 2.13)
  val res = l.toList
  //println(res.mkString("\n"))
}
To see the distribution, I pasted the output into Excel and did a "countif" to show the frequency of each value.
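The same rejection idea can be sketched as a small reusable function that depends only on the standard library. This is an illustration under assumptions, not a ScalaCheck generator: the helper name biasedTowardsMin, the choice of (max - min) / 3 as the standard deviation, and the fixed seed are all hypothetical.

```scala
import scala.util.Random

// Hypothetical helper: draws an Int in [min, max] biased toward min by folding a
// Gaussian around min (via abs) and rejecting candidates that land above max.
def biasedTowardsMin(rng: Random, min: Int, max: Int): Int = {
  val stdDev = (max - min) / 3.0
  var candidate = max + 1
  while (candidate > max) {
    candidate = min + math.abs(rng.nextGaussian() * stdDev).round.toInt
  }
  candidate
}

// Fixed seed so repeated runs produce the same samples (an assumption for the demo).
val rng = new Random(42L)
val samples = List.fill(10000)(biasedTowardsMin(rng, 1, 100))

// Most of the mass should sit in the lower half of the interval.
val lowerHalf = samples.count(_ <= 50)
println(s"values <= 50: $lowerHalf of ${samples.size}")
```

Such a function could presumably be wrapped in a ScalaCheck generator, but that wiring is left out here because it depends on the ScalaCheck version in use.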

how to randomly select a certain number of elements from a list

I would like to randomly select a certain number of elements from a list and make another list out of it. For example out of a list containing 100 elements I would like to randomly select 20 of the elements and store it in another list.
The easiest way to do this is a one-liner:
scala> util.Random.shuffle((1 to 100).toList).take(10)
res0: List[Int] = List(63, 21, 49, 70, 73, 14, 23, 88, 28, 97)
You could try to get clever and avoid shuffling the entire list, but it's almost definitely not necessary, and it'll be very easy to get it wrong.
Use util.Random to shuffle the list and then take the first 20 elements:
scala> import scala.util.Random
import scala.util.Random
scala> val l = List.range(1,100)
l: List[Int] = List(1, 2, 3, ...., 98, 99)
scala> Random.shuffle(l).take(20)
res2: List[Int] = List(11, 32, 95, 56, 90, ..., 45, 20)
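The shuffle-then-take approach can be wrapped in a small generic helper. This is a sketch: the name sample and the explicit Random parameter are assumptions, and the fixed seed is only there to make runs repeatable.

```scala
import scala.util.Random

// Hypothetical helper: picks n distinct positions from xs by shuffling and truncating.
// If n exceeds xs.size, take simply returns the whole shuffled list.
def sample[A](xs: List[A], n: Int, rng: Random): List[A] =
  rng.shuffle(xs).take(n)

val rng = new Random(7L) // fixed seed, for reproducibility only
val source = (1 to 100).toList
val picked = sample(source, 20, rng)

println(picked)
```

Since the elements come from a shuffled copy, no position is picked twice, and the original list is left unchanged.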