My Scala list is as below:
List((192.168.11.3,A,1413876302036,-,-,UP,,0.0,0.0,12,0,0,Null0,UP,0,0,4294967295,other), (192.168.11.3,A,1413876302036,-,-,UP,,0.0,0.0,8,0,0,C,DOWN,0,0,100000000,P), (192.168.1.1,A,1413876001775,-,-,UP,,0.0,0.0,12,0,0,E,UP,0,0,4294967295,other), (192.168.1.1,A,1413876001775,-,-,UP,,0.0,0.0,8,0,0,F,DOWN,0,0,100000000,E))
Now I want the following operation: the third field differs between the rows above (1413876302036 and 1413876001775), and I want to subtract these values as below:
val sub = ((192.168.11.3,A,(1413876302036-1413876001775),-,-,UP,,0.0,0.0,12,0,0,Null0,UP,0,0,4294967295,other),(192.168.1.1,A,(1413876001775-1413876001775),-,-,UP,,0.0,0.0,12,0,0,E,UP,0,0,4294967295,other))
How should this be calculated in Scala?
After 15 minutes of reading your question, I think I still don't understand it, but if I do, here is an answer:
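// Simplified rows: (ip, flag, timestamp, c, d, e)
val list = List(("192.168.11.3", 'A', 1413876302036L, 0, 0, 0), ("192.168.11.3", 'A', 1413876302036L, 0, 0, 0),
  ("192.168.1.1", 'A', 1413876001775L, 0, 0, 0), ("192.168.1.1", 'A', 1413876001775L, 0, 0, 0))
// Subtract the baseline timestamp from the third field of every row
val newList = list.map {
  case (a, b, value, c, d, e) => (a, b, value - 1413876001775L, c, d, e)
}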
I allowed myself to rewrite your example a little. Next time, try to keep it simple and follow the SSCCE rules.
I'm new to Scala and FP in general and am trying to practice on a dummy example.
val counts = ransomNote.map(e=>(e,1)).reduceByKey{case (x,y) => x+y}
The following error is raised:
Line 5: error: value reduceByKey is not a member of IndexedSeq[(Char, Int)] (in solution.scala)
The above example looks similar to the introductory FP primer on word count. I'd appreciate it if you could point out my mistake.
It looks like you are trying to use a Spark method on a Scala collection. The two APIs have a few similarities, but reduceByKey is not one of them.
In pure Scala you can do it like this:
val counts =
ransomNote.foldLeft(Map.empty[Char, Int].withDefaultValue(0)) {
(counts, c) => counts.updated(c, counts(c) + 1)
}
foldLeft iterates over the collection from the left, using the empty map of counts as the initial accumulated state (the map returns 0 if no value is found). The function passed as the second argument updates that state for each character by storing its current count, incremented by one.
Note that accessing a map directly (counts(c)) is likely to be unsafe in most situations (since it will throw an exception if no item is found). In this situation it's fine because in this scope I know I'm using a map with a default value. When accessing a map you will more often than not want to use get, which returns an Option. More on that on the official Scala documentation (here for version 2.13.2).
You can play around with this code here on Scastie.
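To illustrate the point about get, here is a minimal sketch that builds the counts without a default value and reads them back through Option. The input string "hello" is a hypothetical stand-in for ransomNote.

```scala
// Hypothetical input standing in for the ransomNote from the question.
val ransomNote = "hello"

// Same fold as before, but without withDefaultValue: missing keys are
// handled explicitly with getOrElse instead of a map-wide default.
val counts: Map[Char, Int] =
  ransomNote.foldLeft(Map.empty[Char, Int]) { (acc, c) =>
    acc.updated(c, acc.getOrElse(c, 0) + 1)
  }

// get returns an Option, so a missing key is a value rather than an exception.
val present: Option[Int] = counts.get('l')   // Some(2)
val missing: Option[Int] = counts.get('z')   // None

// getOrElse collapses the Option with a fallback.
val missingOrZero: Int = counts.getOrElse('z', 0)
```

Calling counts('z') on this map would throw a NoSuchElementException, which is the failure that get and getOrElse avoid.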
On Scala 2.13 you can use the new groupMapReduce
ransomNote.groupMapReduce(identity)(_ => 1)(_ + _)
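As a rough sketch of what the three argument lists do: the first chooses the key, the second maps each element to a value, and the third combines values that share a key. The input "aabc" is a hypothetical example, and the code assumes Scala 2.13 or later.

```scala
// Hypothetical input; requires Scala 2.13+ for groupMapReduce.
val ransomNote = "aabc"

val counts: Map[Char, Int] =
  ransomNote.groupMapReduce(identity)(_ => 1)(_ + _)
  //                        key = char  value = 1  combine = sum

// The same result spelled out in two steps, for comparison.
val sameCounts: Map[Char, Int] =
  ransomNote.groupBy(identity).map { case (c, occurrences) => (c, occurrences.length) }
```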
val str = "hello"
val countsMap: Map[Char, Int] = str
.groupBy(identity)
.map { case (c, occurrences) => (c, occurrences.length) } // on Scala 2.13, mapValues returns a MapView, not a Map
println(countsMap)
I am still learning Scala and I am facing the following issue. Currently I have the following list in input
val listA=List("banana,africa,1,0",
"apple,europe,1,2",
"peas,africa,1,4")
The wanted output is :
val listB=List("banana,africa,1,0,1",
"apple,europe,1,2,3",
"peas,africa,1,4,5")
My aim is to add an element corresponding to the sum of the two last elements for each line in the list. I wrote the following basic function
def addSum(listin:List[String]):List[String]= {
listin.map(_.split(",")).map(d => d + "," + d(2)+d(3))
}
This is not working. Any suggestions for a better way to do it, please?
Thanks a lot
Simple solution is to do something like below
listA.map(str => str.split(",")).map(arr => (arr ++ Array(arr(2).toInt+arr(3).toInt)).mkString(","))
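For context on why the original attempt misbehaves: after split, d is an Array[String], so d + "," stringifies the array object itself, and d(2)+d(3) concatenates two strings instead of adding numbers. A minimal sketch of a fixed addSum, assuming the last two fields are always valid integers (toInt throws a NumberFormatException otherwise):

```scala
def addSum(lines: List[String]): List[String] =
  lines.map { line =>
    val fields = line.split(",")
    // Convert to Int before adding; on Strings, + would concatenate.
    val sum = fields(fields.length - 2).toInt + fields(fields.length - 1).toInt
    // Append the sum to the untouched original line.
    s"$line,$sum"
  }

val listA = List("banana,africa,1,0", "apple,europe,1,2", "peas,africa,1,4")
val listB = addSum(listA)
// List("banana,africa,1,0,1", "apple,europe,1,2,3", "peas,africa,1,4,5")
```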
I have written the code below to find the even numbers, and the number just before each of them, in an RDD. I first convert each partition to a List and then use my own function to find the even numbers and the numbers just before them. The following is my code. In it, I create an empty list and try to append the numbers to it one by one.
object EvenandOdd
{
def mydef(nums:Iterator[Int]):Iterator[Int]=
{
val mylist=nums.toList
val len= mylist.size
var elist=List()
var i:Int=0
var flag=0
while(flag!=1)
{
if(mylist(i)%2==0)
{
elist.++=List(mylist(i))
elist.++=List(mylist(i-1))
}
if(i==len-1)
{
flag=1
}
i=i+1
}
}
def main(args:Array[String])
{
val myrdd=sc.parallelize(List(1,2,3,4,5,6,7,8,9,10),2)
val myx=myrdd.mapPartitions(mydef)
myx.collect
}
}
I am not able to execute this command in Scala shell as well as in Eclipse and not able to figure out the error as I am just a beginner to Scala.
The following are the errors I got in Scala Shell.
<console>:35: error: value ++= is not a member of List[Nothing]
elist.++=List(mylist(i))
^
<console>:36: error: value ++= is not a member of List[Nothing]
elist.++=List(mylist(i-1))
^
<console>:31: error: type mismatch;
found : Unit
required: Iterator[Int]
while(flag!=1)
^
Your code looks too complicated and is not functional in style. It also risks memory problems: the method takes an Iterator and returns an Iterator, and an Iterator can be lazy and backed by a huge amount of data, so materializing it into a List inside the method could cause an OutOfMemoryError. Your task is therefore to pull only as much data from the input iterator as is needed to implement the two methods of the new Iterator: hasNext and next.
For example (based on your implementation, which outputs duplicates in case of sequence of even numbers) it could be:
def mydef(nums:Iterator[Int]): Iterator[Int] = {
var before: Option[Int] = None
val helperIterator = new Iterator[(Option[Int], Int)] {
override def hasNext: Boolean = nums.hasNext
override def next(): (Option[Int], Int) = {
val result = (before, nums.next())
before = Some(result._2)
result
}
}
helperIterator.withFilter(_._2 % 2 == 0).flatMap{
case (None, next) => Iterator(next)
case (Some(prev), next) => Iterator(prev, next)
}
}
Here you have two iterators. The first is a helper that only prepares the data, pairing each element with its predecessor. The second is the result, built on the helper: it keeps only the pairs whose second element is even and emits both values (or just one, if the first element of the sequence is even).
For the initial code
In addition to @pedrorijo91's answer: the initial code also did not return anything (presumably you wanted to convert elist to an Iterator).
It will be easier if you use a functional coding style rather than an iterative coding style. In functional style the basic operation is straightforward.
Given a list of numbers, the following code will find all the even numbers and the values that precede them:
nums.sliding(2,1).filter(_(1) % 2 == 0)
The sliding operation produces an iterator over all windows of two adjacent values in the original list.
The filter operation takes only those pairs where the second value is even.
The result is an Iterator[List[Int]] where each List[Int] has two elements. You should be able to use this in your RDD framework.
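A small sketch of the sliding approach on a plain List, using the numbers 1 to 10 from the question. The length check is an addition of mine: for a one-element input, sliding(2, 1) yields a single window of size 1, and indexing its second element would throw.

```scala
val nums = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

val pairs: List[List[Int]] =
  nums.sliding(2, 1)
    // Keep windows whose second value is even; the length guard protects
    // against the short window produced by a one-element input.
    .filter(w => w.length == 2 && w(1) % 2 == 0)
    .toList
// List(List(1, 2), List(3, 4), List(5, 6), List(7, 8), List(9, 10))

// Flatten if a single sequence of numbers is wanted instead of pairs.
val flat: List[Int] = pairs.flatten
```

Note that an even number in the very first position has no predecessor, so this approach does not report it at all.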
It's marked part of the developer API, so there's no guarantee it'll stick around, but the RDDFunctions object actually defines sliding for RDDs. You will have to make sure it sees elements in the order you want.
But this becomes something like
rdd.sliding(2).filter(x => x(1) % 2 == 0) // pairs of (preceding number, even number)
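For the first two errors:
there is no ++= method on immutable Lists. You will have to write elist = elist ++ List(x). Note also that var elist = List() is inferred as List[Nothing], so declare the element type, for example var elist = List.empty[Int].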
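Putting the reported errors together, a minimal corrected sketch of the original imperative function could look like the following. It keeps the original output order (the even number first, then its predecessor). The i > 0 guard is my assumption about the intent, since index 0 has no predecessor.

```scala
def mydef(nums: Iterator[Int]): Iterator[Int] = {
  val mylist = nums.toList
  // Give the list an element type; a bare List() is inferred as List[Nothing].
  var elist = List.empty[Int]
  for (i <- mylist.indices if mylist(i) % 2 == 0) {
    elist = elist ++ List(mylist(i))
    // Assumed guard: skip the predecessor when the even number is first.
    if (i > 0) elist = elist ++ List(mylist(i - 1))
  }
  // Return the collected values; a method body ending in a while loop has
  // type Unit, which caused the "type mismatch" error.
  elist.iterator
}

val result = mydef(Iterator(1, 2, 3, 4, 5)).toList
// List(2, 1, 4, 3)
```

Indexing a List with mylist(i) is linear in i, so this is only meant to show the type fixes, not an efficient solution.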
I've just started using Scala/Spark, having come from a Java background, and I'm still trying to wrap my head around the concept of immutability and other Scala best practices.
This is a very small segment of code from a larger program:
intersections is RDD(Key, (String, String))
obs is (Key, (String, String))
Data is just a case class I've defined above.
val intersections = map1 join map2
var listOfDatas = List[Data]()
intersections take NumOutputs foreach (obs => {
listOfDatas ::= ParseInformation(obs._1.key, obs._2._1, obs._2._2)
})
listOfDatas foreach println
This code works and does what I need it to do, but I was wondering if there was a better way of making this happen. I'm using a variable list and rewriting it with a new list every single time I iterate, and I'm sure there has to be a better way to create an immutable list that's populated with the results of the ParseInformation method call. Also, I remember reading somewhere that instead of accessing the tuple values directly, the way I have done, you should use case classes within functions (as partial functions I think?) to improve readability.
Thanks in advance for any input!
This might work locally, but only because take returns a local collection on the driver. Without take, it would not work once distributed, because listOfDatas is passed to each worker as a copy. The better way of doing this, IMO, is:
val processedData = intersections map { case (key, (item1, item2)) =>
  // Same call as in the question, with the tuple destructured by pattern matching
  ParseInformation(key.key, item1, item2)
}
processedData foreach println
A note for developers new to functional programming: if all you are trying to do is transform the data in an iterable (such as a List), forget foreach. Use map instead, which runs your transformation on each item and returns a new collection of the results.
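A small side-by-side sketch of the two styles on a plain List. The Data case class and the sample tuples are hypothetical stand-ins for the types in the question.

```scala
// Hypothetical stand-ins for the question's Data class and joined records.
final case class Data(key: String, left: String, right: String)
val observations = List(("k1", ("a", "b")), ("k2", ("c", "d")))

// foreach with a mutable variable, as in the question.
var viaForeach = List.empty[Data]
observations.foreach { case (key, (l, r)) => viaForeach ::= Data(key, l, r) }

// map: no mutable state, and the pattern match names the tuple parts.
val viaMap: List[Data] = observations.map { case (key, (l, r)) => Data(key, l, r) }
```

Because ::= prepends, the foreach version ends up in reverse order, while map preserves the input order.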
What's the type of intersections? It looks like you can replace foreach with map:
val listOfDatas: List[Data] =
  intersections.take(NumOutputs).map { obs =>
    ParseInformation(obs._1.key, obs._2._1, obs._2._2)
  }.toList // take on an RDD returns an Array, so convert it to a List
I have a server API that returns a list of things, and does so in chunks of, let's say, 25 items at a time. With every response, we get a list of items, and a "token" that we can use for the following server call to return the next 25, and so on.
Please note that we're using a client library that has been written in stodgy old mutable Java, and doesn't lend itself nicely to all of Scala's functional compositional patterns.
I'm looking for a way to return a lazily evaluated sequence of all server items, by doing a server call with the latest token whenever the local list of items has been exhausted. What I have so far is:
def fetchFromServer(uglyStateObject: StateObject): Seq[Thing] = {
val results = server.call(uglyStateObject)
uglyStateObject.update(results.token())
results.asScala.toList ++ (if results.moreAvailable() then
fetchFromServer(uglyStateObject)
else
List())
}
However, this function does eager evaluation. What I'm looking for is to have ++ concatenate a "strict sequence" and a "lazy sequence", where a thunk will be used to retrieve the next set of items from the server. In effect, I want something like this:
results.asScala.toList ++ Seq.lazy(() => fetchFromServer(uglyStateObject))
Except I don't know what to use in place of Seq.lazy.
Things I've seen so far:
SeqView, but I've seen comments that it shouldn't be used because it re-evaluates all the time?
Streams, but they seem like the abstraction is supposed to generate elements at a time, whereas I want to generate a bunch of elements at a time.
What should I use?
I also suggest you take a look at scalaz-stream. Here is a small example of how it may look:
import scalaz.stream._
import scalaz.concurrent.Task
// Returns updated state + fetched data
def fetchFromServer(uglyStateObject: StateObject): (StateObject, Seq[Thing]) = ???
// Initial state
val init: StateObject = new StateObject
val p: Process[Task, Thing] = Process.repeatEval[Task, Seq[Thing]] {
var state = init
Task(fetchFromServer(state)) map {
case (s, seq) =>
state = s
seq
}
} flatMap Process.emitAll
As a matter of fact, in the meantime I already found a slightly different answer that I find more readable (indeed using Streams):
def fetchFromServer(uglyStateObject: StateObject): Stream[Thing] = {
val results = server.call(uglyStateObject)
uglyStateObject.update(results.token())
results.asScala.toStream #::: (if results.moreAvailable() then
fetchFromServer(uglyStateObject)
else
Stream.empty)
}
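To check that this shape is actually lazy, here is a self-contained sketch with a fake paged server that counts its calls. Page, callServer, and the page size of 3 are hypothetical stand-ins for the Java client. Stream is deprecated in favor of LazyList on Scala 2.13, but the structure stays the same.

```scala
// Hypothetical stand-in for the Java client's response object.
final case class Page(items: List[Int], nextToken: Int, moreAvailable: Boolean)

val allItems = (0 until 7).toList
val pageSize = 3
var serverCalls = 0

// Fake server: returns one page per call and counts how often it is hit.
def callServer(token: Int): Page = {
  serverCalls += 1
  val items = allItems.slice(token, token + pageSize)
  val next = token + items.length
  Page(items, next, next < allItems.length)
}

// #::: takes its right-hand side lazily, so the recursive call only runs
// once the current page has been consumed.
def fetchFrom(token: Int): Stream[Int] = {
  val page = callServer(token)
  page.items.toStream #::: (if (page.moreAvailable) fetchFrom(page.nextToken) else Stream.empty)
}

val things = fetchFrom(0)
val firstTwo = things.take(2).toList // served entirely from the first page
val callsAfterTwo = serverCalls
val everything = things.toList       // forces the remaining pages
val callsAfterAll = serverCalls
```

One caveat: if the server returns an empty page while reporting that more is available, the next call happens immediately, because there is nothing in the prefix to delay it.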
Thanks everyone for