I am trying to write a function that takes a List read from a file as input and outputs the most frequently used string, as well as an integer that shows the number of times it was used.
example output:
("Cat",5)
function signature:
def mostFreq(info: List[List[String]]): (String, Int) =
First, I thought about creating a Map variable and a counter variable, iterating over my list to fill the map, and then iterating over the map.
However, there must be a simpler way to do this in Scala, but I'm not used to Scala's library just yet.
I have seen this as one way to do it that uses only integers.
Finding the most frequent/common element in a collection?
But I was wondering how it could be done with strings and integers.
The solution from the linked post has just about everything you need for this.
def mostFreq(info: List[List[String]]): (String, Int) =
info.flatten.groupBy(identity).mapValues(_.size).maxBy(_._2)
It doesn't handle ties terribly well, but you haven't stated how ties should be handled.
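As a rough sketch of one way to make ties explicit (an assumption about what might be wanted, since the question doesn't say), the variant below returns every string that reaches the highest count. The names wordCounts and mostFreqAll are hypothetical, and the counts are built with foldLeft rather than mapValues, which is deprecated as of Scala 2.13.

```scala
// Count occurrences of every string across the nested lists.
def wordCounts(info: List[List[String]]): Map[String, Int] =
  info.flatten.foldLeft(Map.empty[String, Int]) { (acc, s) =>
    acc.updated(s, acc.getOrElse(s, 0) + 1)
  }

// Hypothetical tie-aware variant: returns all strings sharing the highest count,
// sorted by name. An empty input yields Nil instead of throwing, unlike maxBy.
def mostFreqAll(info: List[List[String]]): List[(String, Int)] = {
  val counts = wordCounts(info)
  if (counts.isEmpty) Nil
  else {
    val best = counts.values.max
    counts.filter { case (_, n) => n == best }.toList.sortBy(_._1)
  }
}

val sample = List(List("Cat", "Dog"), List("Cat", "Dog", "Bird"))
println(mostFreqAll(sample)) // List((Cat,2), (Dog,2))
```

If only one winner is needed, taking headOption of that result gives a deterministic choice (the alphabetically first string).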
I'm new to Scala and I'm trying to understand the real differences between these three pieces of code:
//first code
def add(x:Int, y:Int) = {x+y}
//second code
val add2 = (x:Int,y:Int) => x+y
//third code
def add3 = (x:Int,y:Int) => x+y
I can roughly see the differences, but I don't know which one I should use in which context.
Does anyone have concrete examples?
Thanks a lot!
This is a FAQ.
But since you also asked when to use each one, I guess it is worth adding a little about that.
First, let's explain what each one means:
The first is a method of two arguments, both of which are Ints, and it returns another Int; as such, it is not a value.
The second is a function of two arguments, both of which are Ints, and it returns another Int; as such, it is a value whose type is Function2[Int, Int, Int] (commonly written as (Int, Int) => Int).
The third is a method of zero arguments, and it returns a function (Int, Int) => Int.
In general, always use the first one: it is more powerful (Scala 3 will reduce the differences between methods and functions, but still), its syntax is clearer for most people, and it should be more efficient. Even if you are going to use it as a function, for example as the argument to a map, eta-expansion will take care of that.
Use the second one when you are absolutely sure you need it as a function, for example because you know you will be using things like andThen with other functions; this is not very common.
Never use the third one, since it would create a new object every time you call it, only to discard it after use (which would be very inefficient); otherwise it should behave the same as using val instead of def. The only "valid" reason for a zero-argument method that returns a function would be that the function is different each time, but that would imply a side effect, which is discouraged.
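To make the three forms concrete, here is a small sketch (the names total and addThenDouble are just for illustration) showing eta-expansion of the method and composition on the function value:

```scala
// 1. A method: not a value by itself.
def add(x: Int, y: Int): Int = x + y

// 2. A function value of type (Int, Int) => Int.
val add2: (Int, Int) => Int = (x, y) => x + y

// 3. A zero-argument method that returns a function value.
def add3: (Int, Int) => Int = (x, y) => x + y

// Eta-expansion: the method is converted to a function where one is expected.
val total: Int = List(1, 2, 3).foldLeft(0)(add)

// Function values can be composed; tupled turns (Int, Int) => Int
// into ((Int, Int)) => Int, which provides andThen.
val addThenDouble: ((Int, Int)) => Int = add2.tupled.andThen(_ * 2)

println(total)                 // 6
println(addThenDouble((1, 2))) // 6
println(add3(1, 2))            // 3
```

Note that add3(1, 2) first evaluates add3 to obtain the function and then applies it, which is why the answer treats it as the least useful of the three.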
I am new to Scala programming and the flatMap function is giving me a headache.
Before you ask me to search: I did some research on Stack Overflow (https://stackoverflow.com/search?q=scala+map+to+list), but I didn't find anything that easily solves my problem.
Here is my problem:
For a Scala programming project with JaCoP (http://jacopguide.osolpro.com/guideJaCoP.html),
I need to convert this kind of map
Map[Int,List[List[IntVar]]]
to :
List[T]
// In my case , only a List[IntVar]
I know I should use flatMap several times, but I wonder how to use it correctly in my case.
Thanks for your help
If you want every IntVar from the values of the map, you could do this:
map.values.flatten.flatten.toList
The call to values returns an Iterable containing all the values of the map. In this case, it returns an object of type Iterable[List[List[IntVar]]]. The first flatten-call on this object flattens it to an Iterable[List[IntVar]]. The second flatten-call flattens this object further to an Iterable[IntVar]. Finally, the toList method converts it to a List[IntVar].
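Here is a small self-contained sketch of those steps. It uses Int as a stand-in for JaCoP's IntVar (an assumption, so that it runs without the library), and the names table, viaFlatten and viaFlatMap are made up for the example:

```scala
// Int stands in for JaCoP's IntVar so the sketch has no dependencies.
val table: Map[Int, List[List[Int]]] = Map(
  1 -> List(List(10, 11), List(12)),
  2 -> List(List(20), Nil)
)

// values: Iterable[List[List[Int]]], then two flattens, then toList.
val viaFlatten: List[Int] = table.values.flatten.flatten.toList

// The same result with a single flatMap over the values.
val viaFlatMap: List[Int] = table.values.flatMap(_.flatten).toList

// Sorted before printing, since a Map does not promise any iteration order.
println(viaFlatten.sorted) // List(10, 11, 12, 20)
```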
The title pretty much sums it up. Option as a singleton collection can sometimes be confusing, but sometimes it allows for an interesting application. I have one example off the top of my head, and would like to learn more such examples.
My only example is running a for comprehension over an Option[List[T]]. We can do the following:
val v = Some(List(1, 2, 3))
for {
  list <- v.toList
  elem <- list
} yield elem + 1
Without having Option.toList, it wouldn't be possible to stay in the same for comprehension, and I'd be forced to write something like this:
for {
  list <- v
} yield for {
  elem <- list
} yield elem + 1
The first example is cleaner, and it's an advantage of Option being a collection. Of course, the result type will be different in these 2 examples, but let's assume it doesn't matter for the sake of discussion.
Any other examples? I'd especially like to concentrate on collection-like usage, and not usage of Option's monadic properties - those are pretty much obvious. In other words, map and flatMap functions are out of scope of this question. They're definitely very useful, just coming from elsewhere.
I find that the main benefit of working with Option[T] as a collection is that you get to use operations defined on collections, such as map, flatMap, filter and foreach. This makes it easier to operate on a given option, instead of using pattern matching or checking Option[T].isDefined to see if a value exists.
For example, let's take the user repository example from Daniel Westheide's blog post about Option[T]:
Say you have a UserRepository object which returns users based on their ID. The user may or may not exist, hence it returns an Option[Person]. Now let's say we want to look up a person by ID and then extract their age. We can do:
val age: Option[Int] = UserRepository.findById(1).map(_.age)
Now let's say that a Person also has a gender property of type Option[String]. If you wanted to extract that out, you could use map:
val gender: Option[Option[String]] = UserRepository.findById(1).map(_.gender)
But working with nested options isn't too convenient. For that, you have flatMap:
val gender: Option[String] = UserRepository.findById(1).flatMap(_.gender)
And if we want to print out the gender if it exists, we can use foreach:
gender.foreach(println)
You'll often find yourself working with Scala types that have nested Option[T] fields, and it's really handy to have collection-like methods that remove the boilerplate and noise of extracting the actual value.
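To try the snippets above without the blog post's code at hand, here is a minimal stand-in. The Person fields and the in-memory UserRepository are assumptions made up for this sketch, not the blog post's actual definitions:

```scala
// Hypothetical model and repository, loosely following the description above.
case class Person(id: Int, age: Int, gender: Option[String])

object UserRepository {
  private val users = Map(
    1 -> Person(1, 32, Some("female")),
    2 -> Person(2, 45, None)
  )
  def findById(id: Int): Option[Person] = users.get(id)
}

val age: Option[Int] = UserRepository.findById(1).map(_.age)
// map keeps the nesting: the user exists, but the gender is not set.
val nested: Option[Option[String]] = UserRepository.findById(2).map(_.gender)
// flatMap collapses "no user" and "no gender" into a single None.
val gender: Option[String] = UserRepository.findById(1).flatMap(_.gender)
val missing: Option[String] = UserRepository.findById(3).flatMap(_.gender)

gender.foreach(println)  // prints: female
missing.foreach(println) // prints nothing
```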
A more real-life use case that I encountered the other day was working with the awscala SDK, where I wanted to retrieve an object from S3 storage:
val bucket: Option[Bucket] = s3.bucket(amazonConfig.bucketName)
val result: Option[S3Object] = bucket.flatMap(_.get(amazonConfig.offsetKey))
result.flatMap(s3Object =>
Source.fromInputStream(s3Object.content).mkString.decodeOption[Array[KafkaOffset]])
So what happens here is that you query the S3 service for a bucket, which may or may not exist. Then you want to extract from it the S3Object that actually contains the data, but the API itself returns an Option[S3Object], so it's handy to use flatMap to get an Option[S3Object] instead of an Option[Option[S3Object]]. Finally, I want to deserialize the S3Object, which actually contains JSON. The Argonaut library returns an Option[MyObject] here, so flatMap again comes to the rescue by extracting the inner option type.
Edit:
As you pointed out, map and flatMap belong to the monadic property of Option[T]. I've written a blog post describing the reduction of two options where the final solution was:
def reduce[T](a: Option[T], b: Option[T], f: (T, T) => T): Option[T] = {
(a ++ b).reduceLeftOption(f)
}
This takes advantage of the ++ operator, which is defined on collections and is also available on Option[T] because an option can be treated as a collection.
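A quick sketch of how that reduce behaves for the three possible combinations. The type argument is spelled out at the call sites as a precaution, since Scala 2 may not infer T for the function literal within a single parameter list:

```scala
def reduce[T](a: Option[T], b: Option[T], f: (T, T) => T): Option[T] =
  (a ++ b).reduceLeftOption(f)

// Both present: the function combines the two values.
val both: Option[Int] = reduce[Int](Some(1), Some(2), _ + _)    // Some(3)
// Only one present: that value is returned unchanged.
val onlyLeft: Option[Int] = reduce[Int](Some(1), None, _ + _)   // Some(1)
// Neither present: the result is None.
val neither: Option[Int] = reduce[Int](None, None, _ + _)       // None
```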
I'd suggest taking a look at the corresponding chapter of The Neophyte's Guide to Scala.
In my experience, the most useful use cases of Option-as-collection are filtering an option and using flatMap so that None values are implicitly filtered out.
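Both use cases can be sketched in a few lines. The parseInt helper is hypothetical and only exists to produce some None values:

```scala
import scala.util.Try

// filter keeps the value only if the predicate holds.
val adult: Option[Int] = Some(21).filter(_ >= 18) // Some(21)
val minor: Option[Int] = Some(12).filter(_ >= 18) // None

// Hypothetical parser that returns None for non-numeric input.
def parseInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

// flatMap over a List with an Option-returning function drops the None results.
val parsed: List[Int] = List("1", "x", "3").flatMap(s => parseInt(s)) // List(1, 3)
```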
I'm creating a method to retrieve a list of users from a database by ID.
I'm trying to decide on the pros and cons of declaring the ids parameter as Option[Seq[String]] vs Seq[Option[String]].
In what cases should I favour one over the other?
A list of users is well represented neither as an Option[Seq[String]] nor as a Seq[Option[String]]. I would expect something like a List[User] as a list of users, or maybe a Vector or Seq.
If your string represents your user, and the None case does nothing, you could consider filtering those out. You can do this with
val dbresult: Seq[Option[String]] = ???
val strings = dbresult collect { case Some(str) => str }
or
val strings = dbresult.flatten
But it's difficult to give good advice without knowing what the Option[String] or Option[Seq] represents.
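For what it's worth, the two variants give the same result. A small sketch with made-up sample data in place of the ??? placeholder:

```scala
// Sample values are invented for illustration only.
val dbresult: Seq[Option[String]] = Seq(Some("alice"), None, Some("bob"))

// Keep only the defined entries, via pattern matching...
val viaCollect: Seq[String] = dbresult.collect { case Some(str) => str }
// ...or via flatten, which treats each Option as a zero-or-one element collection.
val viaFlatten: Seq[String] = dbresult.flatten

println(viaCollect) // List(alice, bob)
```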
As usual this strongly depends on the use case.
A Seq[Option[String]] will be useful if the size of the sequence is relevant (e.g., because you want to zip it with another sequence).
If this is not the case I would opt for flattening the sequence in order to just have a Seq[String]. This will likely be a better choice than Option[Seq[String]], as the sequence can also be of zero length.
In fact, an Option can usually be treated as if it were an array that can have either length zero or one. Therefore, wrapping an Iterable in an Option often only adds unnecessary complexity.
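A rough sketch of both points, with invented sample data and a hypothetical countIds helper: keeping the None entries preserves positions for zip, while an Option around a Seq just creates a second way to say "nothing".

```scala
val ids: Seq[String] = Seq("u1", "u2", "u3")
val lookups: Seq[Option[String]] = Seq(Some("alice"), None, Some("carol"))

// Positions line up, so each id can be paired with its (possibly missing) result.
val paired: Seq[(String, Option[String])] = ids.zip(lookups)
val missingIds: Seq[String] = paired.collect { case (id, None) => id }

// An Option behaves like a collection of length zero or one.
val lengths: (Int, Int) = (Option("x").toList.length, Option.empty[String].toList.length)

// With Option[Seq[String]], None and Some(Nil) both mean "no ids".
def countIds(maybeIds: Option[Seq[String]]): Int = maybeIds.map(_.size).getOrElse(0)

println(missingIds) // List(u2)
println(lengths)    // (1,0)
```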
Does Scala have a native way to count all occurrences of a character in a string?
If so, how do I do it?
If not, do I need to use Java? If so, how do I do that?
Thanks!
"hello".count(_ == 'l') // returns 2
I don't use Scala or even Java, but a Google search for "Scala string" brought me to a documentation page which contains:
def count(p: (Char) ⇒ Boolean): Int
Counts the number of elements in the string which satisfy a predicate.
p: the predicate used to test elements.
returns: the number of elements satisfying the predicate p.
Definition Classes: TraversableOnce → GenTraversableOnce
It seems pretty straightforward, but I don't use Scala, so I don't know the syntax for calling a member function. It may be more overhead than needed this way, because it looks like it can search for a sequence of characters. I also read on a different result page that a string can be converted into a sequence of characters, so you could probably loop through them and increment a counter.
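For reference, calling it looks like this in Scala. The sketch also includes the "loop and increment a counter" idea written as a fold; the value names are just for illustration:

```scala
// count takes a predicate on Char and returns how many characters satisfy it.
val viaCount: Int = "hello".count(_ == 'l')

// Any Char => Boolean works as the predicate; a Set[Char] is one such function.
val vowelSet: Set[Char] = Set('a', 'e', 'i', 'o', 'u')
val vowels: Int = "hello".count(vowelSet)

// The manual version: walk the characters and increment a counter.
val viaFold: Int = "hello".foldLeft(0)((n, c) => if (c == 'l') n + 1 else n)

println(viaCount) // 2
println(vowels)   // 2
println(viaFold)  // 2
```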
You can also take a higher-level approach and count substring occurrences within another string by using sliding:
def countSubstring(str: String, sub: String): Int =
str.sliding(sub.length).count(_ == sub)
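Two things worth knowing about the sliding approach: it counts overlapping matches, and sliding(0) throws, so an empty sub needs a guard. A hedged sketch of a guarded version plus a non-overlapping alternative (countNonOverlapping is a hypothetical name, and treating an empty pattern as zero matches is an assumption):

```scala
// Overlapping count, with a guard because sliding(0) throws IllegalArgumentException.
def countSubstring(str: String, sub: String): Int =
  if (sub.isEmpty) 0 // assumption: an empty pattern counts as zero matches
  else str.sliding(sub.length).count(_ == sub)

// Non-overlapping count: after each match, resume the search past its end.
def countNonOverlapping(str: String, sub: String): Int =
  if (sub.isEmpty) 0
  else Iterator
    .iterate(str.indexOf(sub))(i => str.indexOf(sub, i + sub.length))
    .takeWhile(_ >= 0)
    .size

println(countSubstring("aaaa", "aa"))      // 3
println(countNonOverlapping("aaaa", "aa")) // 2
```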