How to implement a chain of string.replaceAll calls in Scala

I have a string that I need to transform into a "canonical" form, and to do that I need to call replaceAll() many times on it. I got it working this way:
val text = "Java Scala Fother Python JS C# Child"
val replacePatterns = List("Java", "Scala", "Python", "JS", "C#")
var replaced = text
for (pattern <- replacePatterns) {
replaced = replaced.replaceAll(pattern, "")
}
This code results in replaced = "Fother Child" (apart from leftover spaces), which is what I want, but it looks very imperative and I want to eliminate the accumulator "replaced".
Is there a way in Scala to do this in one line, without vars?
Thanks.

Fold over the list of patterns, using the text to be processed as the starting value:
replacePatterns.foldLeft(text){case (res, pattern) => res.replaceAll(pattern, "")}
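A minimal, self-contained sketch of that fold. The object name ReplaceChain and the final whitespace cleanup are assumptions added for illustration, not part of the original answer. Note that replaceAll treats each pattern as a regular expression; for literal tokens like these, String.replace avoids surprises with characters such as + or . that have a special meaning in regexes:

```scala
object ReplaceChain {
  val text = "Java Scala Fother Python JS C# Child"
  val replacePatterns = List("Java", "Scala", "Python", "JS", "C#")

  // foldLeft threads the intermediate string through every replacement,
  // so no var is needed. replace (unlike replaceAll) matches literally.
  val replaced: String =
    replacePatterns.foldLeft(text)((acc, token) => acc.replace(token, ""))

  // Assumed extra step: collapse the spaces left behind by the removed tokens.
  val canonical: String = replaced.trim.replaceAll("\\s+", " ")
}
```

If the patterns really are meant to be regular expressions, replaceAll can be kept in the fold unchanged.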

Is it possible to insert a variable into a Scala string, like a Python format string?

I'm trying to write Scala code similar to what I would do in Python, where I read a text file such as:
test.sql
select * from {name}
and insert the name variable from the main program itself.
I'm new to Scala, but I was able to read the file like this:
val filename = "test.sql"
val src_file = scala.io.Source.fromFile(filename)
val sql_str = try src_file.getLines mkString "\n" finally src_file.close()
Now I'm trying to do something like I would do in Python but in Scala:
sql_str.format(name = "table1")
Is there an equivalent of this in Scala?
I strongly advise against using a string interpolation / replacement to build SQL queries, as it's an easy way to leave your program vulnerable to SQL Injection (regardless of what programming language you're using). If interacting with SQL is your goal, I'd recommend looking into a database helper library like Doobie or Slick.
That disclaimer out of the way, there are a few approaches to string interpolation in Scala.
Normally, string interpolation is done with a string literal in your code, with $... or ${...} used to interpolate expressions into your string (the {} are needed if your expression is more than just a name reference). For example
val name: String = /* ... */
val message = s"Hello, $name"
// bad idea if `table` might come from user input
def makeQuery(table: String) = s"select * from $table"
But this doesn't work for string templates that you load from a file; it only works for string literals that are defined in your code. If you can change things up so that your templates are defined in your code instead of a file, that'll be the easiest way.
If that doesn't work, you could resort to Java's String.format method, in which the template String uses %s as a placeholder for an expression (see the docs for full info on the syntax for that). This related question has an example for using that. This is probably closest to what you actually asked for.
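As a rough sketch of the %s approach: the object name FormatTemplate and the hard-coded template are assumptions for illustration; in the question the template text would come from test.sql, with {name} written as %s. Scala strings expose format directly, which delegates to Java's String.format:

```scala
object FormatTemplate {
  // Hypothetical template content; in practice this would be read from the file.
  val template = "select * from %s"

  // format substitutes the arguments positionally for each %s placeholder.
  def render(table: String): String = template.format(table)
}
```

A literal percent sign in such a template has to be written as %%, and the SQL injection caveat above applies here just as much.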
You could also do something custom with string replacement, e.g.
val template: String = /* load from file */
template.replace("{name}", "my_table")
// or something more general-purpose
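def customInterpolate(template: String, vars: Map[String, String]): String = {
  // replace each {key} placeholder with its value, one entry at a time
  vars.foldLeft(template) { case (s, (k, v)) =>
    s.replace(s"{$k}", v)
  }
}
// a plain string literal: no s-interpolator, so {name} stays a literal placeholder
val exampleTmp = "update {name} set message = {message}"
customInterpolate(exampleTmp, Map(
  "name" -> "my_table",
  "message" -> "hello",
))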

Is there a Scala equivalent to python's dir? [duplicate]

In languages like Python and Ruby, to ask the language which index-related methods its string class supports (that is, which method names contain the word "index"), you can do:
"".methods.sort.grep /index/i
And in Java:
List<String> results = new ArrayList<String>();
Method[] methods = String.class.getMethods();
for (int i = 0; i < methods.length; i++) {
    Method m = methods[i];
    if (m.getName().toLowerCase().indexOf("index") != -1) {
        results.add(m.getName());
    }
}
// toArray() without an argument returns Object[], so the cast would fail at runtime
String[] names = results.toArray(new String[0]);
Arrays.sort(names);
return names;
How would you do the same thing in Scala?
Curious that no one tried a more direct translation:
""
.getClass.getMethods.map(_.getName) // methods
.sorted // sort
.filter(_ matches "(?i).*index.*") // grep /index/i
So, some random thoughts.
The difference between "methods" and the hoops above is striking, but no one ever said reflection was Java's strength.
I'm hiding something about sorted above: it actually takes an implicit parameter of type Ordering. If I wanted to sort the methods themselves instead of their names, I'd have to provide it.
A grep is actually a combination of filter and matches. It's made a bit more complex because of Java's decision to match whole strings even when ^ and $ are not specified. I think it would make some sense to have a grep method on Regex, which took Traversable as parameters, but...
So, here's what we could do about it:
implicit def toMethods(obj: AnyRef) = new {
  def methods = obj.getClass.getMethods.map(_.getName)
}
implicit def toGrep[T <% Traversable[String]](coll: T) = new {
  def grep(pattern: String) = coll filter (pattern.r.findFirstIn(_) != None)
  def grep(pattern: String, flags: String) = {
    val regex = ("(?"+flags+")"+pattern).r
    coll filter (regex.findFirstIn(_) != None)
  }
}
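The conversions above use a view bound (<%) and anonymous structural types, which is older Scala style. On Scala 2.10 or later, roughly the same idea can be sketched with implicit classes. The names GrepSyntax, MethodsOps and GrepOps are hypothetical, and this is one possible arrangement rather than the answer's own code:

```scala
object GrepSyntax {
  // Adds a .methods member to any reference type: the names of its public methods.
  implicit class MethodsOps(obj: AnyRef) {
    def methods: Seq[String] = obj.getClass.getMethods.map(_.getName).toSeq
  }

  // Adds .grep to a sequence of strings; flags are inline regex flags such as "i".
  implicit class GrepOps(coll: Seq[String]) {
    def grep(pattern: String, flags: String = ""): Seq[String] = {
      val regex = (if (flags.isEmpty) pattern else s"(?$flags)$pattern").r
      // findFirstIn matches anywhere in the string, like grep does
      coll.filter(s => regex.findFirstIn(s).isDefined)
    }
  }
}
```

After import GrepSyntax._ the same call syntax is available, for example "".methods.sorted.distinct.grep("index", "i").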
And now this is possible:
"".methods.sorted grep ("index", "i")
You can use the Scala REPL prompt. To list the member methods of a string object, for instance, type "". and then press the TAB key (that's an empty string, or a non-empty one if you like, followed by a dot and then TAB). The REPL will list all member methods for you.
This applies to other variable types as well.
More or less the same way:
val names = classOf[String].getMethods.toSeq.
  filter(_.getName.toLowerCase().indexOf("index") != -1).
  map(_.getName).
  sortWith((e1, e2) => (e1 compareTo e2) < 0) // sort was replaced by sortWith in later Scala versions
But all on one line.
To make it more readable,
val names = for (method <- classOf[String].getMethods.toSeq
  if (method.getName.toLowerCase().indexOf("index") != -1))
  yield { method.getName }
val sorted = names.sortWith((e1, e2) => (e1 compareTo e2) < 0)
This is as far as I got:
"".getClass.getMethods.map(_.getName).filter( _.indexOf("in")>=0)
It's strange that Scala's Array doesn't have a sort method.
edit
It would end up like this:
"".getClass.getMethods.map(_.getName).toList.sortWith(_ < _).filter(_.indexOf("index") >= 0)
Now, wait a minute.
I concede Java is verbose compared to Ruby, for instance.
But that piece of code shouldn't have been so verbose in the first place.
Here's the equivalent:
Collection<String> mds = new TreeSet<String>();
for( Method m : "".getClass().getMethods()) {
if( m.getName().matches(".*index.*")){ mds.add( m.getName() ); }
}
Which has almost the same number of characters as the Scala version marked as correct.
Just using the Java code directly will get you most of the way there, as Scala classes are still JVM ones. You could port the code to Scala pretty easily as well, though, for fun/practice/ease of use in the REPL.

Scala: value split is not a member of char

I am trying to write a word count program in Scala. I'm using a string "file":
file.map( _.split(" ")).flatMap(word => (word, 1)).reduceByKey( _ + _ )
It keeps saying:
value split is not a member of Char
Can't figure out how to solve it!
When you call map on a String, it is implicitly treated as a sequence of Char (StringOps handles method calls such as map; WrappedString, which extends AbstractSeq[Char], is the Seq view of a String). Therefore map iterates over the individual Chars of the string, not over words or lines, and Char has no split method.
See the WrappedString source: https://github.com/scala/scala/blob/v2.10.2/src/library/scala/collection/immutable/WrappedString.scala
The code below splits on whitespace and returns the number of pieces, which gives a simple word counter.
val file = "Some test data"
file.split("\\s+").size
To count the number of times each word appears in the string:
val file = "Some test data test"
println(file.split("\\s+").toList.groupBy(w => w).mapValues(_.length))
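The same per-word count can be wrapped in a small function. This is only a sketch: the names WordCount and countWords are made up for illustration, and it uses map instead of mapValues so that it returns a strict Map on both Scala 2.12 and 2.13:

```scala
object WordCount {
  // Split on runs of whitespace, group equal words, and count each group.
  def countWords(text: String): Map[String, Int] =
    text.trim
      .split("\\s+")
      .filter(_.nonEmpty) // an empty input would otherwise yield one empty "word"
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.length) }
}
```

For example, WordCount.countWords("Some test data test") gives a map in which "test" has count 2 and the other words have count 1.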
I found out that the code was fine. Because I was running it on Spark, the result was held in a lazy RDD, and I needed an action to actually materialize it. So I saved it to a text file and the problem was solved. Here is the code:
file.flatMap(line => line.split(" ")).map(w => (w, 1)).reduceByKey(_ + _).saveAsTextFile("OUT.txt")
Thanks.

Runtime exception in syntax like defining two vals with identical names

In a book I found code similar to this:
object ValVarsSamples extends App {
val pattern = "([ 0-9] +) ([ A-Za-z] +)". r // RegEx
val pattern( count, fruit) = "100 Bananas"
}
This is supposed to be a trick: it looks like defining two vals with the same name, but it is not.
However, it fails with an exception.
The question: what is this about (what is it supposed to do), and why does it not work?
--
As I understand it, the first val pattern refers to a Regex constructor function, and in the second val we are trying to pass parameters using this syntax, just by supplying a string?
This is an extractor:
val pattern( count, fruit) = "100 Bananas"
This code is equivalent to:
val res = pattern.unapplySeq("100 Bananas")
val count = res.get(0)
val fruit = res.get(1)
The problem is your regex doesn't match, you should change it to:
val pattern = "([ 0-9]+) ([ A-Za-z]+)".r
The space before + in [ A-Za-z] + means you are matching a single character in the class [ A-Za-z] and then at least one space character. You have the same issue with [ 0-9] +.
Scala regexes define an extractor, which returns a sequence of matching groups in the regular expression. Your regex defines two groups so if the match succeeds the sequence will contain two elements.
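A val pattern(...) = ... definition throws a scala.MatchError when the regex does not match the whole string, which is the exception seen in the question. A hedged sketch of a safer variant using a match expression; the names ExtractorDemo and parse are hypothetical, and the character classes here deliberately omit the leading space:

```scala
object ExtractorDemo {
  val pattern = "([0-9]+) ([A-Za-z]+)".r

  // The regex extractor binds one variable per capture group.
  // A non-matching input falls through to None instead of throwing MatchError.
  def parse(s: String): Option[(Int, String)] = s match {
    case pattern(count, fruit) => Some((count.toInt, fruit))
    case _                     => None
  }
}
```

With this, ExtractorDemo.parse("100 Bananas") yields Some((100, "Bananas")), while an input that does not fit the pattern yields None.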

How do I change this code to be functional programming in scala?

I'm new to functional programming and I'm reading this book. It basically says that if your code contains "var", you're still doing things in an imperative way. I'm not sure how to change my code to be functional. Please suggest.
Basically, this code calls processText on some text, uses a regular expression to extract particular pieces from "taggedText", adds them to a list, and converts the list to JSON.
val text = params("text")
val pattern = """(\w+)/ORGANIZATION""".r
var list = List[String]()
val taggedText = processText(text)
pattern.findAllIn(taggedText).matchData foreach {
m => list ::= m.group(1)
}
pretty(render(list)) // render to json
Try replacing the middle section with:
val list = pattern.findAllIn(taggedText).matchData.map(m => m.group(1)).toList
You can write m => m.group(1) as _.group(1) if you want.
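A self-contained sketch of the var-free version. The object and method names are hypothetical, and the tagged input format is assumed from the regex in the question. It uses findAllMatchIn, which returns the Match objects directly. One behavioral difference worth knowing: the original list ::= ... prepends each match, so it produced the matches in reverse order, whereas map followed by toList keeps them in order of appearance:

```scala
object OrgExtractor {
  private val pattern = """(\w+)/ORGANIZATION""".r

  // Collect capture group 1 of every match, in order of appearance, without a var.
  def organizations(taggedText: String): List[String] =
    pattern.findAllMatchIn(taggedText).map(_.group(1)).toList
}
```

If the reversed order of the original loop matters, appending .reverse to the result restores it.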