I wasn't sure if groupBy, takeWhile, or grouped would achieve what I wanted to do. I need to develop a function that automatically groups a list of numbers according to the interval I want to specify. The use case is taking a list of ages and sorting them into dynamic age categories (like 1-5, 5-10, etc.). It would need to be dynamic since the user may want to change the intervals.
For example, I have the list of numbers: List(103, 206, 101, 111, 211, 234, 242, 99)
I might group by intervals of 10 or of 100. With an interval of 100, the result would be: List(List(99),List(101,103,111),List(206,211,234,242)).
I searched Google and SO for the last hour but couldn't find anything. Thanks for the help!
You will want groupBy:
val xs = List(103, 206, 101, 111, 211, 234, 242, 99)
xs.groupBy(_ / 100)
// Map(0 -> List(99), 1 -> List(103, 101, 111), ...)
grouped just splits the collection into consecutive chunks of a given size without looking at the actual elements. takeWhile just takes the leading elements for as long as a predicate holds.
You can use the withDefaultValue method on the resulting map to make it appear as an indexed sequence, where some entries are empty:
val ys = xs.groupBy(_ / 100) withDefaultValue Nil
ys(0) // List(99)
ys(4) // List() !
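To get exactly the nested list shown in the question, the grouped map can be ordered by its bucket key. This is a minimal sketch, assuming non-negative inputs; the names IntervalGrouping and groupByInterval are hypothetical and introduced only for illustration:

```scala
object IntervalGrouping {
  // Buckets each number by integer division, then orders the buckets and their contents.
  // Assumes non-negative values: Int division truncates toward zero, so
  // -50 / 100 and 50 / 100 would both land in bucket 0.
  def groupByInterval(xs: List[Int], interval: Int): List[List[Int]] = {
    require(interval > 0, "interval must be positive")
    xs.groupBy(_ / interval) // Map(bucket index -> members)
      .toList
      .sortBy(_._1)          // ascending bucket index
      .map(_._2.sorted)      // ascending order within each bucket
  }
}
```

Calling IntervalGrouping.groupByInterval(xs, 100) on the list above should return List(List(99), List(101, 103, 111), List(206, 211, 234, 242)). Buckets with no members are simply absent from the result.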
Here's an approach that generates the ranges and filters for values within them. I think l.groupBy(_ / 100).values is preferable, though.
val interval = 100
// Consecutive pairs of bounds: (0, 100), (100, 200), (200, 300).
// sliding returns an Iterator, so toList is needed to end up with a List.
val bounds = (0 until l.max + interval by interval).sliding(2).toList
// Each range is (lo, hi]: 100 lands with 99, and 0 matches no range.
for (b <- bounds)
  yield l.filter(x => x > b(0) && x <= b(1))
With val l = List(103, 206, 101, 111, 211, 234, 242, 99) this gives:
List[List[Int]] = List(List(99), List(103, 101, 111), List(206, 211, 234, 242))
Right now I have 2 lists in Scala:
val one = List(50, 10, 17, 8, 16)
val two = List(582, 180, 174, 159, 158)
These lists are going to be of the same length, and right now I'm looking to divide each element of the first list by a corresponding element in the second. In other words, I want a list that consists of:
List(50/582, 10/180, etc...)
Is there a set operation that accomplishes this that can be done without looping?
Thank you!
You can use the zip function.
val one = List(50, 10, 17, 8, 16)
val two = List(582, 180, 174, 159, 158)
one.zip(two).map {
  case (a, b) => a.toDouble / b.toDouble
}
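The conversion to Double matters here because Int / Int truncates, so 50 / 582 would be 0. As a small sketch of the same idea wrapped in a reusable function (ElementwiseDivision and divide are hypothetical names, not part of any library):

```scala
object ElementwiseDivision {
  // Pairs elements by position and divides them as Doubles.
  // zip stops at the shorter list, so surplus elements are silently dropped.
  def divide(xs: List[Int], ys: List[Int]): List[Double] =
    xs.zip(ys).map { case (a, b) => a.toDouble / b }
}
```

If the two lists could ever differ in length, it may be worth checking the sizes first, since zip will not complain.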
I want to create a generator in ScalaCheck that generates numbers between say 1 and 100, but with a bell-like bias towards numbers closer to 1.
Gen.choose() distributes numbers uniformly between the min and max values:
scala> (1 to 10).flatMap(_ => Gen.choose(1,100).sample).toList.sorted
res14: List[Int] = List(7, 21, 30, 46, 52, 64, 66, 68, 86, 86)
And Gen.chooseNum() has an added bias for the upper and lower bounds:
scala> (1 to 10).flatMap(_ => Gen.chooseNum(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 61, 85, 86, 91, 92, 100, 100)
I'd like a choose() function that would give me a result that looks something like this:
scala> (1 to 10).flatMap(_ => choose(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 2, 5, 11, 18, 35, 49, 100)
I see that choose() and chooseNum() take an implicit Choose trait as an argument. Should I use that?
You could use Gen.frequency():
val frequencies = List(
  (50000, Gen.choose(0, 9)),
  (38209, Gen.choose(10, 19)),
  (27425, Gen.choose(20, 29)),
  (18406, Gen.choose(30, 39)),
  (11507, Gen.choose(40, 49)),
  ( 6681, Gen.choose(50, 59)),
  ( 3593, Gen.choose(60, 69)),
  ( 1786, Gen.choose(70, 79)),
  (  820, Gen.choose(80, 89)),
  (  347, Gen.choose(90, 100))
)
(1 to 10).flatMap(_ => Gen.frequency(frequencies:_*).sample).toList
res209: List[Int] = List(27, 21, 31, 1, 21, 18, 9, 29, 69, 29)
I got the frequencies from https://en.wikipedia.org/wiki/Standard_normal_table#Complementary_cumulative. The weights are only a sample of the table (every third row of the first column, z = 0.0, 0.3, 0.6, ..., scaled by 100,000), but I think you can get the idea.
I can't take much credit for this, and will point you to this excellent page:
http://www.javamex.com/tutorials/random_numbers/gaussian_distribution_2.shtml
A lot of this depends on what you mean by "bell-like". Your example doesn't show any negative numbers, but the number 1 can't be in the middle of the bell without producing any negative numbers unless it is a very, very tiny bell!
Forgive the mutable loop but I use them sometimes when I have to reject values in a collection build:
object Test_Stack extends App {
  val r = new java.util.Random()
  val maxBellAttempt = 102
  val stdv = maxBellAttempt / 3 // about 99.7% of Gaussian samples fall within 3 standard deviations
  val collectSize = 100000
  val l = scala.collection.mutable.Buffer[Int]()
  // see "What are the minimum and maximum values with nextGaussian()?" in the article above
  while (l.size < collectSize) {
    // the +1 is the mean offset and can be any value
    // abs folds the curve in half; without it you would need a larger offset to avoid negatives
    val temp = (r.nextGaussian() * stdv + 1).abs.round.toInt
    if (temp <= maxBellAttempt) l += temp // reject values beyond the cap
  }
  val res = l.toList
  //println(res.mkString("\n"))
}
Here's the distribution: I pasted the output into Excel and used COUNTIF to show the frequency of each value.
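The same rejection idea can also be written without a mutable buffer. The following is only a sketch of a variation, not the code above: it draws from the right half of a bell curve anchored at min, so results stay inside [min, max]. HalfNormalSampler, sample, and the choice of one third of the range as standard deviation are assumptions made for this example:

```scala
import scala.util.Random

object HalfNormalSampler {
  // Draws n integers in [min, max] whose density falls off like the right half of a bell curve.
  // sd is chosen so that max sits about three standard deviations away from min.
  def sample(n: Int, min: Int, max: Int, rng: Random = new Random()): List[Int] = {
    require(min < max, "min must be below max")
    val sd = (max - min) / 3.0
    Iterator
      .continually(min + math.round(math.abs(rng.nextGaussian()) * sd).toInt)
      .filter(_ <= max) // rejection step: discard the rare draws beyond max
      .take(n)
      .toList
  }
}
```

For example, HalfNormalSampler.sample(10, 1, 100).sorted should give a list that is crowded near 1 and thins out toward 100. Passing a seeded Random makes the draws repeatable.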
I would like to randomly select a certain number of elements from a list and make another list out of them. For example, out of a list containing 100 elements I would like to randomly select 20 of the elements and store them in another list.
The easiest way to do this is a one-liner:
scala> util.Random.shuffle((1 to 100).toList).take(10)
res0: List[Int] = List(63, 21, 49, 70, 73, 14, 23, 88, 28, 97)
You could try to get clever and avoid shuffling the entire list, but it's almost definitely not necessary, and it'll be very easy to get it wrong.
Use util.Random to shuffle the list and then take the first 20 elements:
scala> import scala.util.Random
import scala.util.Random
scala> val l = List.range(1, 101)
l: List[Int] = List(1, 2, 3, ...., 99, 100)
scala> Random.shuffle(l).take(20)
res2: List[Int] = List(11, 32, 95, 56, 90, ..., 45, 20)
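If the selection needs to be repeatable (in tests, for instance), the shuffle can be driven by a seeded Random instance. A minimal sketch; RandomSubset and pick are hypothetical names used only here:

```scala
import scala.util.Random

object RandomSubset {
  // Shuffles a copy of the list and keeps the first n elements.
  // A seeded Random makes the selection reproducible.
  def pick[A](xs: List[A], n: Int, rng: Random = new Random()): List[A] =
    rng.shuffle(xs).take(n)
}
```

Because take never fails, asking for more elements than the list holds just returns the whole list in shuffled order.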
Below is the sample data:
val combineList = List(("A",12),("B",11),("C",12),("D",14),("E",23),("F",12),("D",53),("C",23),("B",12),("A",22),("E",21),("F",12),("C",21),("B",34),("A",34),("G",67),("D",23),("E",21),("F",12),("D",31),("B",41),("E",14),("F",15),("G",18),("A",11),("C",10),("D",9),("A",13),("E",1),("F",14))
and
val X = 98
Now I want the final output as below.
First, group all the values by key:
val groupKey = List(Map("A"->List(12,22,34,11,13)),Map("B"->List(11,12,34,41)),Map("C"->List(12,23,21,10)),Map("D"->List(14,53,23,31,9)),
Map("E"->List(23,21,21,14,1)),Map("F"->List(12,12,12,15,14)),Map("G"->List(67,18)))
Second, subtract each value in the groupKey lists from X (X is always greater than the list values), so the second output will be:
val substrackValues = List(Map("A"->List(86,76,64,87,85)),Map("B"->List(87,86,64,57)),Map("C"->List(86,75,77,88)),Map("D"->List(84,45,75,67,89)),
Map("E"->List(75,77,77,84,97)),Map("F"->List(86,86,86,83,84)),Map("G"->List(31,80)))
Consider
combineList.groupBy(_._1).mapValues(xs => xs.map(v => X-v._2))
which delivers
Map(E -> List(75, 77, 77, 84, 97), F -> List(86, 86, 86, 83, 84), A -> List(86, 76, 64, 87, 85), G -> List(31, 80), B -> List(87, 86, 64, 57), C -> List(86, 75, 77, 88), D -> List(84, 45, 75, 67, 89))
Note that the embedded maps in groupKey above are singleton maps, which could equally be represented as tuples of type (String, List[Int]) or, better, combined into a single map.
In the solution proposed here, after grouping by the first tuple element, we transform each element in each list by subtracting it from X.
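One caveat on mapValues: in Scala 2.12 it returns a lazy view that reapplies the function on every access, and in 2.13 it is deprecated on Map. A sketch of the same transformation with a plain map, which yields a strict Map on either version (SubtractFromLimit and subtractGrouped are hypothetical names):

```scala
object SubtractFromLimit {
  // Groups pairs by key, then replaces every value v with limit - v.
  // Uses map rather than mapValues so the result is a strict Map.
  def subtractGrouped(pairs: List[(String, Int)], limit: Int): Map[String, List[Int]] =
    pairs.groupBy(_._1).map { case (key, group) =>
      key -> group.map { case (_, v) => limit - v }
    }
}
```

Calling SubtractFromLimit.subtractGrouped(combineList, X) should produce the same map as the one-liner above. groupBy keeps the elements of each group in their original order, which is why the values line up with groupKey.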
I have a 3-tuple list like the following [I added line breaks for readability]:
(2, 127, 3)
(12156, 127, 3)
(4409, 127, 2) <-- 4409 occurs 2x
(1312, 127, 12) <-- 1312 occurs 3x
(4409, 128, 1) <--
(12864, 128, 1)
(1312, 128, 1) <--
(2664, 128, 2)
(12865, 129, 1)
(183, 129, 1)
(12866, 129, 2)
(1312, 129, 10) <--
I want to sum up based on the first entry. The first entry should be unique.
The result should look like this:
(2, 127, 3)
(12156, 127, 3)
(4409, 127, 3) <- new sum = 3
(1312, 127, 23) <- new sum = 23
(12864, 128, 1)
(2664, 128, 2)
(12865, 129, 1)
(183, 129, 1)
(12866, 129, 2)
How can I achieve this in Scala?
Try this:
list.groupBy(_._1).mapValues(v => (v.head._1, v.head._2, v.map(_._3).sum))
The middle entry is preserved and it always takes the first one that appeared in the input list.
If you can just ignore the middle entry, then:
val l = List(('a,'e,1), ('b,'f,2), ('a,'g,3), ('b,'h,4))
l.groupBy(_._1).mapValues(_.map(_._3).sum)
// Map('b -> 6, 'a -> 4)
If you have to keep the middle entry around:
l.groupBy(_._1).map {
  case (_, values) =>
    val (a, b, _) = values.head
    (a, b, values.map(_._3).sum)
}
// List(('b,'f,6), ('a,'e,4))
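Note that groupBy returns an unordered Map, so the rows above do not come back in input order, whereas the expected result in the question lists each key where it first appeared. If that order matters, one option is to accumulate into a LinkedHashMap, which remembers insertion order. A sketch under that assumption (OrderedSum and sumByFirst are hypothetical names):

```scala
import scala.collection.mutable

object OrderedSum {
  // Sums the third field per first field, keeping the middle field of the first
  // occurrence and the order in which keys first appear.
  def sumByFirst(rows: List[(Int, Int, Int)]): List[(Int, Int, Int)] = {
    val acc = mutable.LinkedHashMap.empty[Int, (Int, Int, Int)]
    for ((id, mid, n) <- rows) {
      acc.get(id) match {
        case Some((_, firstMid, total)) => acc(id) = (id, firstMid, total + n) // existing key keeps its position
        case None                       => acc(id) = (id, mid, n)
      }
    }
    acc.values.toList
  }
}
```

Applied to the twelve rows from the question, this should return the nine rows of the expected result, with (4409, 127, 3) and (1312, 127, 23) in third and fourth place.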
You could use the concept of a monoid. If the first two values of each entry form the key and the remaining value is the associated value, you could use a Map.
Once you have a Map, you can proceed as described in this question:
Best way to merge two maps and sum the values of same key?
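For Int values, the monoid in question is addition with 0 as the identity element. A minimal standard-library sketch of merging two maps that way follows; it is not necessarily the approach recommended in the linked question, and MapMerge and mergeSum are hypothetical names:

```scala
object MapMerge {
  // Merges two maps; values stored under a shared key are added together.
  // A missing key counts as 0, the identity for addition.
  def mergeSum[K](a: Map[K, Int], b: Map[K, Int]): Map[K, Int] =
    b.foldLeft(a) { case (acc, (k, v)) =>
      acc.updated(k, acc.getOrElse(k, 0) + v)
    }
}
```

For instance, merging Map("a" -> 1, "b" -> 2) with Map("b" -> 3, "c" -> 4) gives Map("a" -> 1, "b" -> 5, "c" -> 4). With (first, second) tuples as keys, the same function would sum the third field per pair.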