Guava splitter with limit from backwards - guava

I would like to split a string starting from the end and omit the last two parts.
Example
String:
"foo:bar:baz:boo:ban"
I would like to omit the last two :-separated parts
and get
foo bar baz

List<String> all = Splitter.on(':').splitToList("foo:bar:baz:boo:ban");
List<String> allButLastTwo = all.subList(0, all.size() - 2);
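Note that subList(0, all.size() - 2) throws if the input has fewer than two parts. A minimal sketch of a guarded variant follows. It uses only the standard library in place of Guava's Splitter, and the class and method names (DropLastTwo, allButLast) are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class DropLastTwo {
    // Returns all fields except the last `drop` ones,
    // or an empty list when the input has `drop` fields or fewer.
    static List<String> allButLast(String input, char separator, int drop) {
        // Pattern.quote treats the separator literally; limit -1 keeps trailing empty fields.
        String regex = Pattern.quote(String.valueOf(separator));
        List<String> all = Arrays.asList(input.split(regex, -1));
        int end = Math.max(0, all.size() - drop);
        return all.subList(0, end);
    }

    public static void main(String[] args) {
        System.out.println(allButLast("foo:bar:baz:boo:ban", ':', 2)); // [foo, bar, baz]
    }
}
```

Clamping the end index to zero means short inputs yield an empty list instead of an exception; whether that is the right behaviour depends on the caller.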

Related

Is it possible to build a list from a carriage return separated string?

Background
I have the following string:
var MyString = 'Test1⏎Test2⏎Test3⏎Test4'
⏎ = line feed = \n
What I'm trying to do
I want to create a List which is a list of lines. Basically every item that is followed by a \n would become an entry in the list.
I want the base string MyString to become shortened to reflect what pieces of the string have been moved to the List
The reason I want to leave a residual MyString is that new data might come in later that belongs to the same line, so I do not want to commit the data to the List until a line break has been seen.
What the result of all this would be
So in my above example, only Test1 Test2 Test3 are followed by \n but not Test4
Output List would be: [Test1, Test2, Test3]
MyString would become: Test4
What I've tried and failed with
I tried using LineSplitter but it seems to want to take Test4 as a separate entry as well
final lines = const LineSplitter().convert(MyString);
for (final daLine in lines) {
  MyList.add(daLine);
}
And it creates [Test1, Test2, Test3, Test4]
One solution is to call removeLast() on the list produced by split. It returns the removed element, which becomes the new residual string.
String text = 'Test1\nTest2\nTest3\nTest4';
List<String> list = text.split('\n');
text = list.removeLast();
print(list); // [Test1, Test2, Test3]
print(text); // Test4
To me you are combining two questions. Every language I know has built-in ways to split a string on a char, including newline chars. The distinct thing you want is a split function that doesn't include the last entry.
You may be combining your answers as well :) Is there some resource constraint or streamed input that prevents you from just building the list, then popping off the final entry?
If yes:
I think you have to build your own splitter. Look at the implementation of LineSplitter and write something similar that leaves out the final, unterminated entry.
If no:
simply call
MyString = MyList.removeLast();
after your for-loop.
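For the streamed case, the "build your own" idea can be sketched as a small buffer that only releases lines once their terminating \n has arrived. The sketch is in Java rather than Dart, and the LineBuffer class with its feed and residual methods is hypothetical, not part of any library:

```java
import java.util.ArrayList;
import java.util.List;

final class LineBuffer {
    // Text received so far that is not yet terminated by '\n'.
    private final StringBuilder pending = new StringBuilder();

    // Appends a chunk and returns every line completed by a '\n'.
    // The unterminated tail stays buffered for the next call.
    List<String> feed(String chunk) {
        pending.append(chunk);
        List<String> lines = new ArrayList<>();
        int newline;
        while ((newline = pending.indexOf("\n")) >= 0) {
            lines.add(pending.substring(0, newline));
            pending.delete(0, newline + 1); // drop the line and its '\n'
        }
        return lines;
    }

    // The text still waiting for its line break.
    String residual() {
        return pending.toString();
    }

    public static void main(String[] args) {
        LineBuffer buffer = new LineBuffer();
        System.out.println(buffer.feed("Test1\nTest2\nTest3\nTest4")); // [Test1, Test2, Test3]
        System.out.println(buffer.residual());                        // Test4
    }
}
```

Data that arrives later is appended to the residual, so a line split across two chunks is still emitted as one entry.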

Swift 5 split string at integer index

It used to be that you could use substring to get a portion of a string. That has been deprecated in favor of String.Index, but I can't seem to make a string index out of integers.
var str = "hellooo"
let newindex = str.index(after: 3)
str = str[newindex...str.endIndex]
No matter what the string is, I want the second three characters, so here str would contain "loo". How can I do this?
Drop the first three characters, then take the first three characters of what remains:
let str = "helloo"
let secondThreeCharacters = String(str.dropFirst(3).prefix(3))
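dropFirst and prefix do not crash on short input; they simply return fewer characters. You might add a check for strings with fewer than six characters if you need exactly three characters back.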

How to use split method with string containing brackets?

I have a string that contains some data. Data is separated like this:
var stringData = (SomeWordsWithSpacesInBetween) 0 (SomeWordsWithSpaceInBetween) 1 ...
I want to be able to extract data between the brackets and numbers between the words in brackets as such:
stringData.split( some way to split them)[0] = SomeWordsWithSpacesInBetween;
stringData.split(some way to split them)[1] = 0;
How to split them this way?
var s = '(Some Words With Spaces InBetween) 0 (SomeWordsWithSpaceInBetween) 1';
var r = RegExp(r'\(((\w+ ?)*)\) (\d+) ?').allMatches(s).expand((e) => [e[1], e[3]]);
You can do it using regular expression. Here is an example.
List<String> getStringList() {
  String abc = '(SomeWordsWithSpacesInBetween) 0 (SomeWordsWithSpaceInBetween) 1 (SomeWordsWithSpaceInBetween)';
  // Split on ") <number> (". Note that the numbers themselves are dropped,
  // and the first and last entries keep their outer "(" and ")".
  RegExp exp = RegExp(r"\) (\d+) \(");
  List<String> myList = abc.split(exp);
  print('$myList');
  return myList;
}
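Splitting on the separator discards the numbers, so matching the fields directly may be a better fit for the question. Below is a minimal sketch of that idea in Java rather than Dart. It assumes the bracketed text never contains nested parentheses and that numbers only appear between bracketed groups; the BracketFields class and its extract method are illustrative names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketFields {
    // Matches either "(text without parentheses)" or a standalone run of digits.
    private static final Pattern FIELD = Pattern.compile("\\(([^()]*)\\)|(\\d+)");

    // Returns the bracket contents and the numbers in the order they appear.
    static List<String> extract(String data) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = FIELD.matcher(data);
        while (matcher.find()) {
            // Group 1 is set for bracketed text, group 2 for a number.
            fields.add(matcher.group(1) != null ? matcher.group(1) : matcher.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(extract("(Some Words With Spaces) 0 (More Words) 1"));
        // [Some Words With Spaces, 0, More Words, 1]
    }
}
```

Because a bracketed group is consumed as a whole, digits inside the brackets are not reported as separate fields.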

Remove white spaces in scala-spark

I have sample file record like this
2018-01-1509.05.540000000000001000000751111EMAIL#AAA.BB.CL
and the above record is from a fixed length file and I wanted to split based on the lengths
and when I split I am getting a list as shown below.
ListBuffer(2018-01-15, 09.05.54, 00000000000010000007, 5, 1111, EMAIL#AAA.BB.CL)
Everything looks fine so far, but I am not sure why an extra space is added before each field in the list (except the first one).
Example: my data is "09.05.54", but I am getting " 09.05.54" in the list.
My Logic for splitting is shown below
// Logic to Split the Line based on the lengths
def splitLineBasedOnLengths(line: String, lengths: List[String]): ListBuffer[Any] = {
  var splittedLine = line
  var split = new ListBuffer[Any]()
  for (i <- lengths) yield {
    var c = i.toInt
    var fi = splittedLine.take(c)
    split += fi
    splittedLine = splittedLine.drop(c)
  }
  split
}
The above code takes the line and a List[String] of field lengths as input, and returns a ListBuffer[Any] containing the line split according to those lengths.
Can anyone explain why I am getting an extra space before each field after splitting?
There are no extra spaces in the data. ListBuffer's toString simply separates the elements with ", " when printing, to make them easier to read.
To prove this try the following code:
split.foreach(s => println(s"\"$s\""))
You will see the following printed:
"2018-01-15"
"09.05.54"
"00000000000010000007"
"5"
"1111"
"EMAIL#AAA.BB.CL"

Getting IndexOutOfBoundsException while searching for a substring

I have a string like
var word = "banana"
and a sentence like var sent = "the monkey is holding a banana which is yellow"
and another sentence like var sent1 = "banana!!"
I want to search banana in sent and then write to a file in the following way:
the monkey is holding a
banana
which is yellow
I'm doing it in the following way:
var before = sent.substring(0, sent.indexOf(word))
var after = sent.substring(sent.indexOf(word) + word.length)
println(before)
println(after)
This works fine but when I do the same for sent1, then it gives me IndexOutOfBoundsException. I think it is because there is nothing before banana in sent1. How to deal with this?
You can split based on the word and you will get an array with everything before and after the word.
val search = sent.split(word)
search: Array[String] = Array("the monkey is holding a ", " which is yellow")
This works in the "banana!!" case:
"banana!!".split(word)
res5: Array[String] = Array("", "!!")
Now you can write the three lines to a file in your favorite way:
println(search(0))
println(word)
println(search(1))
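Two edge cases are worth guarding against. indexOf returns -1 when the word is absent, and substring(0, -1) is what throws; a word at the very start gives index 0, which is valid. Also, split drops trailing empty strings, so a sentence that ends with the word produces a one-element array and search(1) fails. A minimal indexOf-based sketch in Java that handles both cases follows. The SplitAroundWord class and its around method are hypothetical names:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SplitAroundWord {
    // Returns [before, word, after] for the first occurrence of `word`,
    // or an empty list when `word` does not occur in `sentence`.
    static List<String> around(String sentence, String word) {
        int at = sentence.indexOf(word);
        if (at < 0) {
            return Collections.emptyList(); // word not found: nothing to split
        }
        String before = sentence.substring(0, at);
        String after = sentence.substring(at + word.length());
        return Arrays.asList(before, word, after);
    }

    public static void main(String[] args) {
        // Prints an empty first line, then "banana", then "!!".
        for (String part : around("banana!!", "banana")) {
            System.out.println(part);
        }
    }
}
```

Unlike split, this treats the word as plain text rather than as a regular expression, so words containing characters such as . or ( need no escaping.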
What if you had more than one occurrence of the word? .split understands regular expressions, so you could improve the previous solution with something like this:
sent
  .split("\\s+(?=banana)|(?<=banana)\\s+")
  .foreach(println)
\\s+ means one or more whitespace characters
(?=<word>) means "followed by <word>"
(?<=<word>) means "preceded by <word>"
So this splits your string into pieces at any whitespace that is either followed or preceded by "banana", and not at the word itself. The word ends up in the result just like the other parts of the string, so you don't need to print it explicitly.
This regex trick is called "positive look-around" ( ?= is look-ahead, ?<= is look-behind) in case you are wondering.