Getting a "string" from a "string" in java - substring

I have a string, "Shirt base price: $15"
(The string has the spaces in between "Shirt" and "base."
How can I pull out just the "Shirt" phrase of this string?

You can split the string at spaces to get the first word in it.
String firstWord = str.split("\\s+")[0];
The split method takes a regular expression to split around, the \\s+ argument tells it to split the string around sequences of one or more whitespace characters, and the [0] gets the first index of the array that was returned.

Related

How to convert String to UTF-8 to Integer in Swift

I'm trying to take each character (individual number, letter, or symbol) from a string file name without the extension and put each one into an array index as an integer of the utf-8 code (i.e. if the file name is "A1" without the extension, I would want "A" as an int "41" in first index, and "1" as int "31" in second index)
Here is the code I have but I'm getting this error "No exact matches in call to instance method 'append'", my guess is because .utf8 still keeps it as a string type:
for i in allNoteFiles {
var CharacterArray : [Int] = []
for character in i {
var utf8Character = String(character).utf8
CharacterArray.append(utf8Character) //error is here
}
....`//more code down here within the for in loop using CharacterArray indexes`
I'm sure the answer is probably simple, but I'm very new to Swift.
I've tried appending var number instead with:
var number = Int(utf8Character)
and
var number = (utf8Character).IntegerValue
but I get errors "No exact matches in call to initializer" and "Value of type 'String.UTF8View' has no member 'IntegerValue'"
Any help at all would be greatly appreciated. Thanks!
The reason
var utf8Character = String(character).utf8
CharacterArray.append(utf8Character)
doesn't work for you is because utf8Character is not a single integer, but a UTF8View: a lightweight way to iterate over the UTF-8 codepoints in a string. Every Character in a String can be made up of any number of UTF-8 bytes (individual integers) — while ASCII characters like "A" and "1" map to a single UTF-8 byte, the vast majority of characters do not: every UTF-8 code point maps to between 1 and 4 individual bytes. The Encoding section of UTF-8 on Wikipedia has a few very illustrative examples of how this works.
Now, assuming that you do want to split a string into individual UTF-8 bytes (either because you can guarantee your original string is ASCII-only, so the assumption that "character = byte" holds, or because you actually care about the bytes [though this is rarely the case]), there's a short and idiomatic solution to what you're looking for.
String.UTF8View is a Sequence of UInt8 values (individual bytes), and as such, you can use the Array initializer which takes a Sequence:
let characterArray: [UInt8] = Array(i.utf8)
If you need an array of Int values instead of UInt8, you can map the individual bytes ahead of time:
let characterArray: [Int] = Array(i.utf8.lazy.map { Int($0) })
(The .lazy avoids creating and storing an array of values in the middle of the operation.)
However, do note that if you aren't careful (e.g., your original string is not ASCII), you're bound to get very unexpected results from this operation, so keep that in mind.

How do i replace whitespace with underscore and encode values in scala array / list

I have a spark scala dataframe which has column "Name"
I have extracted the values of that column in to scala array[string]
org_name: Array[String] = Array(SARATOGA SENIOR HIGH SCHOOL)
I want to replace whitespaces with _ and encode that value in to utf-8 (any encoding is fine as long as it replaces special chars with something else)
so if there are any special chars those will be removed. later i want to use those in file path .
var org_name = orgsFlatDF.rdd.collect
.map( _.getString(2))
This is how i am extracting those vals ^^. I haven't found any method which I can use to do that. Replace or replaceall doesn't work on array
I tried this :
org_name.replace("\\s", "")
That didn't work .
Expected output : SARATOGA_SENIOR_HIGH_SCHOOL
if name is : new $ high school it should gets converted to new_$_high_school then encoded to new_%24_high_school
There are a couple of issues with what you are asking.
Java/Scala Arrays don't have a replace method. Even if they did have a replace method, would they replace the values they hold or the characters in a String they hold?
Let's assume this line org_name.replace("\\s", "") didn't compiled and org_name is indeed a an Array[String] holding one element.
scala> val org_name=Array("SARATOGA SENIOR HIGH SCHOOL")
val org_name: Array[String] = Array(SARATOGA SENIOR HIGH SCHOOL)
scala> org_name(0).replace(" ","_")
val res15: String = SARATOGA_SENIOR_HIGH_SCHOOL
replace("\\s","_") wouldn't work because it represents a \s string. "\" represents \. That's only way you'd be able to define strings containing other escape codes like \n or \t.
PS: to transform all the string in the array use org_name.map(_.replace(" ","_")), this gives you back another another array.

How do I parse out a number from this returned XML string in python?

I have the following string:
{\"Id\":\"135\",\"Type\":0}
The number in the Id field will vary, but will always be an integer with no comma separator. I'm not sure how to get just that value from that string given that it's string data type and not real "XML". I was toying with the replace() function, but the special characters are making it more complex than it seems it needs to be.
is there a way to convert that to XML or something that I can reference the Id value directly?
Maybe use a regular expression, e.g.
import re
txt = "{\"Id\":\"135\",\"Type\":0}"
x = re.search('"Id":"([0-9]+)"', txt)
if x:
print(x.group(1))
gives
135
It is assumed here that the ids are numeric and consist of at least one digit.
Non-regex answer as you asked
\" is an escape sequence in python.
So if {\"Id\":\"135\",\"Type\":0} is a raw string and if you put it into a python variable like
a = '{\"Id\":\"135\",\"Type\":0}'
gives
>>> a
'{"Id":"135","Type":0}'
OR
If the above string is python string which has \" which is already escaped, then do a.replace("\\","") which will give you the string without \.
Now just load this string into a dict and access element Id like below.
import json
d = json.loads(a)
d['Id']
Output :
135

How do I format a string from a string with %# in Swift

I am using Swift 4.2. I am getting extraneous characters when formatting one string (s1) from another string(s0) using the %# format code.
I have searched extensively for details of string formatting but have come up with only partial answers including the code in the second line below. I need to be able to format s1 so that I can customize output from a Swift process. I ask this because I have not found an answer while searching for ways to format a string from a string.
I tried the following three statements:
let s0:[String] = ["abcdef"]
let s1:[String] = [String(format:"%#",s0)]
print(s1)
...
The output is shown below. It may not be clear, here, but there are four leading spaces to the left of the abcdef string.
["(\n abcdef\n)"]
How can I format s1 so it does not include the brackets, the \n escape characters, and the leading spaces?
The issue here is you are using an array but a string in s0.
so the following index will help you.
let s0:[String] = ["abcdef"]
let s1:[String] = [String(format:" %#",s0[0])]
I am getting extraneous characters when formatting one string (s1) from another string (s0) ...
The s0 is not a string. It is an array of strings (i.e. the square brackets of [String] indicate an array and is the same as saying Array<String>). And your s1 is also array, but one that that has one element, whose value is the string representation of the entire s0 array of strings. That’s obviously not what you intended.
How can I format s1 so it does not include the brackets, the \n escape characters, and the leading spaces?
You’re getting those brackets because s1 is an array. You’re getting the string with the \n and spaces because its first value is the string representation of yet another array, s0.
So, if you’re just trying to format a string, s0, you can do:
let s0: String = "abcdef"
let s1: String = String(format: "It is ‘%#’", s0)
Or, if you really want an array of strings, you can call String(format:) for each using the map function:
let s0: [String] = ["abcdef", "ghijkl"]
let s1: [String] = s0.map { String(format: "It is ‘%#’", $0) }
By the way, in the examples above, I didn’t use a string format of just %#, because that doesn’t accomplish anything at all, so I assumed you were formatting the string for a reason.
FWIW, we generally don’t use String(format:) very often. Usually we do “string interpolation”, with \( and ):
let s0: String = "abcdef"
let s1: String = "It is ‘\(s0)’"
Get rid of all the unneccessary arrays and let the compiler figure out the types:
let s0 = "abcdef" // a string
let s1 = String(format:"- %# -",s0) // another string
print(s1) // prints "- abcdef -"

How to split string with trailing empty strings in result?

I am a bit confused about Scala string split behaviour as it does not work consistently and some list elements are missing. For example, if I have a CSV string with 4 columns and 1 missing element.
"elem1, elem2,,elem 4".split(",") = List("elem1", "elem2", "", "elem4")
Great! That's what I would expect.
On the other hand, if both element 3 and 4 are missing then:
"elem1, elem2,,".split(",") = List("elem1", "elem2")
Whereas I would expect it to return
"elem1, elem2,,".split(",") = List("elem1", "elem2", "", "")
Am I missing something?
As Peter mentioned in his answer, "string".split(), in both Java and Scala, does not return trailing empty strings by default.
You can, however, specify for it to return trailing empty strings by passing in a second parameter, like this:
String s = "elem1,elem2,,";
String[] tokens = s.split(",", -1);
And that will get you the expected result.
You can find the related Java doc here.
I believe that trailing empty spaces are not included in a return value.
JavaDoc for split(String regex) says: "This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array."
So in your case split(String regex, int limit) should be used in order to get trailing empty string in a return value.