I want to understand what the second argument of indexOf means for Strings in Scala.
object Playground extends App {
val g: String = "Check out the big brains on Brad!"
println(g.indexOf("o",7));
}
The above program prints 25, and I am not able to understand why.
It is actually the index of the last o, but how is it related to 7? Does the second argument n return the index of the nth occurrence of the character, and if n exceeds the number of occurrences, return the index of the last occurrence?
But if that's the case then this doesn't make sense:
object Playground extends App {
val g: String = "Check out the big brains on Brad!"
(1 to 7).foreach(i => println(s"$i th Occurence = ${g.indexOf("o",i)} "))
}
which outputs:
1 th Occurence = 6
2 th Occurence = 6
3 th Occurence = 6
4 th Occurence = 6
5 th Occurence = 6
6 th Occurence = 6
7 th Occurence = 25
Source: https://www.scala-exercises.org/std_lib/infix_prefix_and_postfix_operators
According to Scala String documentation, the second parameter is the index to start searching from:
def indexOf(elem: Char, from: Int): Int
Finds index of first occurrence of some value in this string after or at some start index.
elem : the element value to search for.
from : the start index
returns : the index >= from of the first element of this string that is equal (as determined by ==) to elem, or -1, if none exists.
Thus, in your case, when you specify 7, you look for the index of the first "o" located at index 7 or after. And indeed, your String has two "o" characters, one at index 6 and one at index 25. Every start index from 1 to 6 still finds the "o" at index 6, while a start index of 7 skips it and finds the one at index 25.
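To make the "start index" semantics concrete, here is a minimal sketch using the string from the question. The nthIndexOf helper is hypothetical (it is not part of the standard library); it only illustrates how the "n-th occurrence" behavior the question expected could be built on top of indexOf.

```scala
object IndexOfDemo {
  val g: String = "Check out the big brains on Brad!"

  // Hypothetical helper, not a standard-library method: index of the n-th
  // occurrence (1-based) of `sub` in `s`, or -1 if there are fewer than n.
  def nthIndexOf(s: String, sub: String, n: Int): Int = {
    require(n >= 1, "n must be at least 1")
    // Jump from one match to the next n times, searching from just after
    // the previous match; once a search fails, keep returning -1.
    (1 to n).foldLeft(-1) { (prev, i) =>
      if (i > 1 && prev < 0) -1 else s.indexOf(sub, prev + 1)
    }
  }

  def main(args: Array[String]): Unit = {
    println(g.indexOf("o"))         // 6: first "o" anywhere
    println(g.indexOf("o", 6))      // 6: the start index itself is included
    println(g.indexOf("o", 7))      // 25: first "o" at index 7 or later
    println(g.indexOf("o", 26))     // -1: no "o" from index 26 onwards
    println(nthIndexOf(g, "o", 2))  // 25: second occurrence
  }
}
```

The second argument therefore never counts occurrences; it only moves the position where the search begins.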
indexOf(int ch, int fromIndex) looks for the character in the string, starting from the specified index (fromIndex). With 7, the search starts at index 7 (the eighth character, since indices are zero-based).
Going forward, it helps to get used to reading the official docs for indexOf:
def indexOf(elem: A, from: Int): Int
[use case] Finds index of first occurrence of some value in this general sequence after or at some start index.
Note: may not terminate for infinite-sized collections.
elem
the element value to search for.
from
the start index
returns
the index >= from of the first element of this general sequence that is equal (as determined by ==) to elem, or -1, if none exists.
I personally like to use IntelliJ to jump into the source code with CMD + B.
However you go about it, in your development flow you'll frequently consult the manual or docs of the library or language you're using.
Related
I am new to Scala programming. I want to generate a random number with 15 digits, so could you please share an example? I have tried the code below to get an alphanumeric string of 10 characters.
var ranstr = s"${(Random.alphanumeric take 10).mkString}"
print("ranstr", ranstr)
You need to pay attention to the return type. You cannot have a 15-digit Int because that type is a 32-bit signed integer, meaning that its maximum value is a little over 2 billion (2,147,483,647). Even a 10-digit number is at best a number between 1 billion and the maximum value of Int.
Other answers go into detail on how to get a 15-digit number using Long. In your comment you mentioned between, but because of the limitation I mentioned before, using Ints will not allow you to go beyond the 9 digits in your example. You can, however, explicitly annotate your numeric literals with a trailing L to make them Long and achieve what you want as follows:
Random.between(100000000000000L, 1000000000000000L)
Notice that the documentation for between says that the last number is exclusive.
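As a quick sanity check of those bounds, here is a minimal sketch (the object and method names are hypothetical, and it assumes Scala 2.13 or later, where between is available) that wraps the call and confirms every result has exactly 15 digits:

```scala
import scala.util.Random

object FifteenDigits {
  // Smallest 15-digit number (inclusive lower bound).
  val Min: Long = 100000000000000L
  // Smallest 16-digit number (exclusive upper bound).
  val MaxExclusive: Long = 1000000000000000L

  // Hypothetical wrapper around Random.between for 15-digit values.
  def random15Digits(rnd: Random = new Random()): Long =
    rnd.between(Min, MaxExclusive)

  def main(args: Array[String]): Unit = {
    val samples = List.fill(5)(random15Digits())
    samples.foreach(println)
    // Every sample lies in [Min, MaxExclusive), so it has 15 digits.
    println(samples.forall(_.toString.length == 15))
  }
}
```

Because the upper bound is exclusive, the largest possible value is 999999999999999, which still has 15 digits.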
If you're interested in generating arbitrarily large numbers, a String might get the job done, as in the following example:
import scala.util.Random
import scala.collection.View
def nonZeroDigit: Char = Random.between(49, 58).toChar
def digit: Char = Random.between(48, 58).toChar
def randomNumber(length: Int): String = {
require(length > 0, "length must be strictly positive")
val digits = View(nonZeroDigit) ++ View.fill(length - 1)(digit)
digits.mkString
}
randomNumber(length = 1)
randomNumber(length = 10)
randomNumber(length = 15)
randomNumber(length = 40)
Notice that when converting an Int to a Char, what you get is the character encoded by that number, which isn't necessarily the same as the digit represented by the Int itself. The numbers you see in the functions come from the ASCII table, where 48 to 57 encode the characters '0' to '9' (odds are it's good enough for what you want to do).
If you really need a numeric type, for arbitrarily large integers you will need to use BigInt. One of its constructors allows you to parse a number from a string, so you can re-use the code above as follows:
import scala.math.BigInt
BigInt(randomNumber(length = 15))
BigInt(randomNumber(length = 40))
You can play around with this code here on Scastie.
Notice that in my example, in order to keep it simple, I'm forcing the first digit of the random number to not be zero. This means that the number 0 itself will never be a possible output. If you want 0 to be possible when one asks for a 1-digit number, you're advised to tailor the example to your needs.
An approach similar to Alin's foldLeft, here based on scanLeft: the intermediate random digits are first collected into a Vector and then concatenated into a BigInt, while ensuring the first random digit (see the initial value in scanLeft) is greater than zero:
import scala.util.Random
import scala.math.BigInt
def randGen(n: Int): BigInt = {
val xs = (1 to n-1).scanLeft(Random.nextInt(9)+1) {
case (_,_) => Random.nextInt(10)
}
BigInt(xs.mkString)
}
Note that Random.nextInt(9) delivers a random value between 0 and 8, so we add 1 to shift the possible values to 1 through 9. Thus,
scala> (1 to 15).map(randGen(_)).foreach(println)
8
34
623
1597
28474
932674
5620336
66758916
186155185
2537294343
55233611616
338190692165
3290592067643
93234908948070
871337364826813
There are a lot of ways to do this.
The most common way is to use Random.nextInt(10) to generate a digit between 0 and 9.
When building a number with a fixed number of digits, you have to make sure the first digit is never 0.
For that I'll use Random.nextInt(9) + 1, which guarantees a digit between 1 and 9, a sequence of the other 14 generated digits, and a foldLeft operation with the first digit as the initial accumulator to build the number:
val number =
Range(1, 15).map(_ => Random.nextInt(10)).foldLeft[Long](Random.nextInt(9) + 1) {
(acc, cur_digit) => acc * 10 + cur_digit
}
Normally for such big numbers it's better to represent them as sequence of characters instead of numbers because numbers can easily overflow. But since a 15 digit number fits in a Long and you asked for a number, I used one instead.
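To see what the accumulator does at each step, here is a minimal sketch of the same fold with fixed digits instead of random ones, so the result is predictable. The digitsToLong name is hypothetical and only used for illustration.

```scala
object DigitsToLong {
  // Hypothetical helper: combine a leading digit and the remaining digits
  // into one Long, shifting the accumulator one decimal place per step.
  def digitsToLong(first: Int, rest: Seq[Int]): Long =
    rest.foldLeft(first.toLong) { (acc, digit) => acc * 10 + digit }

  def main(args: Array[String]): Unit = {
    // Steps: 1 -> 12 -> 123
    println(digitsToLong(1, Seq(2, 3)))
    // A leading digit plus 14 more digits gives a 15-digit number.
    println(digitsToLong(9, Seq.fill(14)(0)))
  }
}
```

Long.MaxValue has 19 digits, so this accumulation stays safe for up to 18 digits; beyond that a String or BigInt is the safer representation.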
In Scala we have scala.util.Random to get random values (not only numeric ones). For numeric values, Random has nextInt(n: Int), which returns a random number r with 0 <= r < n.
First example:
val random = new Random()
val digits = "0123456789".split("")
var result = ""
for (_ <- 0 until 15) {
val randomIndex = random.nextInt(digits.length)
result += digits(randomIndex)
}
println(result)
Here I create an instance of Random and pick digits from 0 to 9 to build a random number string of length 15. Note that the first digit may be 0.
Second example:
val result2 = for (_ <- 0 until 15) yield random.nextInt(10)
println(result2.mkString)
Here I use the yield keyword to get a sequence of random integers from 0 to 9 and use mkString to combine it into a string.
Not sure if this is the right place to ask but I couldn't find any related or similar questions.
Anyway: imagine you have a certain string like
val exampleString = "Hello StackOverflow this is my question, cool right?"
Given a position in this string, for example 23, return the index of the word that 'occupies' this position. If we look at the example string, we can see that the character at position 23 (counting from zero) is the letter 's' (the last character of 'this'), so we should return index = 4 (because 'this' is the 5th word, and word indices also start at zero). In my question spaces are counted as words. If, for example, we were given position 5, we land on the first space and thus we should return index = 1.
I'm implementing this in Scala (but this should be quite language-agnostic and I would love to see implementations in other languages).
Currently I have the following approach (assume exampleString is the given string and charPosition the given position):
exampleString.split("((?<= )|(?= ))").scanLeft(0)((a, b) => a + b.length()).drop(1).zipWithIndex.takeWhile(_._1 <= charPosition).last._2 + 1
This works, but it is way too complex to be honest. Is there a better (more efficient?) way to achieve this? I'm fairly new to functions like fold, scan, map, filter ... but I would love to learn more.
Thanks in advance.
def wordIndex(exampleString: String, index: Int): Int = {
exampleString.take(index + 1).foldLeft((0, exampleString.head.isWhitespace)) {
case ((n, isWhitespace), c) =>
if (isWhitespace == c.isWhitespace) (n, isWhitespace)
else (n + 1, !isWhitespace)
}._1
}
This will fold over the string, keeping track of whether the previous character was a whitespace or not, and if it detects a change, it will flip the boolean and add 1 to the count (n).
This also handles groups of spaces (e.g. in a string like hello world with several spaces between the words, world would be at index 2). Spaces at the start of the string count as index 0, and the first word is then index 1.
Note that this can't handle an empty input string (exampleString.head throws on it); I'll let you decide what you want to do in that case.
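One possible way to deal with the empty string, and with positions outside the string, is to return an Option. This is only a sketch of one design choice, not the original answer's code; the name wordIndexOpt is hypothetical.

```scala
object WordIndex {
  // Hypothetical variant of wordIndex: None for an empty string or an
  // out-of-range position, otherwise the zero-based word index.
  def wordIndexOpt(text: String, position: Int): Option[Int] =
    if (position < 0 || position >= text.length) None
    else {
      // Count how often we switch between whitespace and non-whitespace
      // while walking from the start of the string up to `position`.
      val (count, _) =
        text.take(position + 1).foldLeft((0, text.head.isWhitespace)) {
          case ((n, inWhitespace), c) =>
            if (inWhitespace == c.isWhitespace) (n, inWhitespace)
            else (n + 1, !inWhitespace)
        }
      Some(count)
    }

  def main(args: Array[String]): Unit = {
    val example = "Hello StackOverflow this is my question, cool right?"
    println(wordIndexOpt(example, 23)) // Some(4): the 's' of "this"
    println(wordIndexOpt(example, 5))  // Some(1): the first space
    println(wordIndexOpt("", 0))       // None: nothing to index into
  }
}
```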
I have a Scala for loop that goes like this:
val a = sc.textFile("path to file containing 8 elements")
for(i <- 0 to a.count.toInt)
{
println((a.take(i).last))
}
But it throws a java.lang.NoSuchElementException.
I am not able to understand what's wrong or how to resolve it.
There are two problems:
1) The "to" operator (in 0 to a.count.toInt) defines an inclusive range, here 0 to 8, so the loop runs nine times for 8 elements. Note that a.take(i) returns the first i elements rather than the element at index i, so the values of i that make sense are 1 to 8.
You can use 1 to a.count.toInt.
2) The second problem is the way "last" is called. When i = 0, the expression a.take(i) is an empty collection, and calling "last" on it results in a NoSuchElementException.
Why would you iteratively take 1, 2, 3, ..., 8 elements from a collection just to take the last element every time?
This is OK for a collection of 8 elements, but if you wanted to do something like this on a larger RDD, you should consider caching it.
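The take/last behavior can be reproduced without Spark. The following is a minimal sketch that assumes a local List standing in for the RDD (RDD.take returns a local Array, on which take and last behave the same way); the object and value names are hypothetical.

```scala
import scala.util.Try

object TakeLastDemo {
  // Stand-in for the 8 lines of the text file.
  val lines: List[String] = List("l1", "l2", "l3", "l4", "l5", "l6", "l7", "l8")

  def main(args: Array[String]): Unit = {
    // take(0) is empty, so .last throws NoSuchElementException.
    println(Try(lines.take(0).last).isFailure)  // true

    // take(i) returns the first i elements, so take(8).last is the 8th line.
    println(lines.take(8).last)                 // l8

    // lastOption avoids the exception on the empty case.
    println(lines.take(0).lastOption)           // None

    // Starting the range at 1 visits every line exactly once.
    for (i <- 1 to lines.size) println(lines.take(i).last)
  }
}
```

If the goal is simply to print every element of a small collection, iterating over it once is simpler than calling take repeatedly.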
I have a table field where the data contains our memberID numbers followed by character or character + number strings
For example:
My Data
1234567Z1
2345T10
222222T10Z1
111
111A
Should Become
123456
12345
222222
111
111
I want to get just the member number (as shown in Should Become above), i.e. all the digits to the LEFT of the first letter.
As the length of the member number can be different for each person (the first 1 to 7 digit) and the letters used can be different (a to z, 0 to 8 characters long), I don't think I can SPLIT the field.
Right now, in Power Query, I do 27 search and replace commands to clean this data (e.g. find T10 replace with nothing, find T20 replace with nothing, etc)
Can anyone suggest a better way to achieve this?
I did successfully create a formula for this in Excel...but I am now trying to do this in Power Query and I don't know how to convert the formula - nor am I sure this is the most efficient solution.
=iferror(value(left([MEMBERID],7)),
iferror(value(left([MEMBERID],6)),
iferror(value(left([MEMBERID],5)),
iferror(value(left([MEMBERID],4)),
iferror(value(left([MEMBERID],3)),0)
)
)
)
)
Thanks
There are likely several ways to do this. Here's one way:
Create a query Letters:
let
Source = { "a" .. "z" } & { "A" .. "Z" }
in
Source
Create a query GetFirstLetterIndex:
let
Source = (text) => let
// For each letter find out where it shows up in the text. If it doesn't show up, we will have a -1 in the list. Make that positive so that we return the index of the first letter which shows up.
firstLetterIndex = List.Transform(Letters, each let pos = Text.PositionOf(text, _), correctedPos = if pos < 0 then Text.Length(text) else pos in correctedPos),
minimumIndex = List.Min(firstLetterIndex)
in minimumIndex
in
Source
In the table containing your data, add a custom column with this formula:
Text.Range([ColumnWithData], 0, GetFirstLetterIndex([ColumnWithData]))
That formula will take everything from your data text until the first letter.
I'm in the process of getting comfortable passing unnamed functions as arguments and I am using this to practice with, based off of the examples in the Swift Programming Guide.
So we have an array of Ints:
var numbers: Int[] = [1, 2, 3, 4, 5, 6, 7]
And I apply a transform like so: (7)
func transformNumber(number: Int) -> Int {
let result = number * 3
return result
}
numbers = numbers.map(transformNumber)
Which is equal to: (7)
numbers = numbers.map({(number: Int) -> Int in
let result = number * 3
return result;
})
Which is equal to: (8)
numbers = numbers.map({number in number * 3})
Which is equal to: (8)
numbers = numbers.map({$0 * 3})
Which is equal to: (8)
numbers = numbers.map() {$0 * 3}
The iteration count in the playground sidebar (shown in parentheses above) is 8 for the most abbreviated forms of the function declaration.
Question
Why is it showing as 8 iterations for the last two examples?
It's not showing 8 iterations, really. It's showing that 8 things executed on that line. There were 7 executions as part of the map function, and an 8th to do the assignment back into the numbers variable.
It looks like this could probably provide more helpful diagnostics. I would highly encourage you to provide feedback via https://bugreport.apple.com.
Slightly rewriting your experiment to use only closures, the call counts still differ by one:
Case 1: Explicitly specifying argument types (visit count is 7)
var f1 = {(number: Int) -> Int in
let result = number * 3
return result
}
numbers.map(f1)
Case 2: Implicit argument types (visit count is 8)
var f2 = {$0 * 3}
numbers.map(f2)
If the (x times) count reported by the REPL does indeed represent a count of visits to that code location, and noting that the count is greater by one in cases where the closure type arguments are not explicitly specified (e.g. f2), my guess is that at least in the playground REPL, the extra visit is to establish actual parameter types and fill that gap in the underlying AST.