Looking for an (elegant) solution for splitting a string and keeping the separator as item(s) in the array
example 1:
"hello world"
["hello", " ", "world"]
example 2:
" hello world"
[" ", "hello", " ", "world"]
thx.
Suppose you are splitting the string by a separator called separator, you can do the following:
let result = yourString.components(separatedBy: separator) // first split
.flatMap { [$0, separator] } // add the separator after each split
.dropLast() // remove the last separator added
.filter { $0 != "" } // remove empty strings
For example:
let result = " Hello World ".components(separatedBy: " ").flatMap { [$0, " "] }.dropLast().filter { $0 != "" }
print(result) // [" ", "Hello", " ", "World", " "]
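The steps above can be wrapped in a reusable helper. This is a minimal sketch: the name splitKeepingSeparator is hypothetical (it is not a Foundation API), and it assumes a non-empty separator.

```swift
import Foundation

extension String {
    /// Hypothetical helper: splits on `separator` and keeps each
    /// separator occurrence as its own element of the result.
    func splitKeepingSeparator(_ separator: String) -> [String] {
        let pieces = components(separatedBy: separator)   // first split
            .flatMap { [$0, separator] }                  // separator after each piece
            .dropLast()                                   // drop the extra trailing separator
        return pieces.filter { !$0.isEmpty }              // drop empty pieces
    }
}

print("hello world".splitKeepingSeparator(" "))   // ["hello", " ", "world"]
print(" hello world".splitKeepingSeparator(" "))  // [" ", "hello", " ", "world"]
```

Both examples from the question come out as requested, because the empty piece produced by the leading space is filtered out while its separator is kept.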
If you need to split on a condition instead, for example splitting a camelCase string at each uppercase letter:
extension Sequence {
func splitIncludeDelimiter(whereSeparator shouldDelimit: (Element) throws -> Bool) rethrows -> [[Element]] {
try self.reduce([[]]) { group, next in
var group = group
if try shouldDelimit(next) {
group.append([next])
} else {
group[group.count - 1].append(next) // lastIdx is not a standard property; index the last group directly
}
return group
}
}
}
For example (mapping each group of characters back to a String):
"iAmCamelCase".splitIncludeDelimiter(whereSeparator: \.isUppercase).map { String($0) }
=>
["i", "Am", "Camel", "Case"]
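(If you want an implementation of isUppercase for Unicode scalars; note that Character already has a built-in isUppercase in Swift 5 and later.)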
extension CharacterSet {
static let uppercaseLetters = CharacterSet(charactersIn: "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
}
extension Unicode.Scalar {
var isUppercase: Bool {
CharacterSet.uppercaseLetters.contains(self)
}
}
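For Strings specifically, here is a hedged variant that returns [String] directly and avoids an empty leading group when the first character is itself a delimiter. The method name splitBefore(where:) is made up for this sketch; it relies on Character.isUppercase from the standard library (Swift 5 and later).

```swift
extension String {
    /// Hypothetical helper: starts a new chunk at every character
    /// for which `startsChunk` returns true, keeping that character.
    func splitBefore(where startsChunk: (Character) -> Bool) -> [String] {
        var chunks: [String] = []
        for character in self {
            if chunks.isEmpty || startsChunk(character) {
                chunks.append(String(character))            // open a new chunk
            } else {
                chunks[chunks.count - 1].append(character)  // extend the current chunk
            }
        }
        return chunks
    }
}

print("iAmCamelCase".splitBefore(where: { $0.isUppercase })) // ["i", "Am", "Camel", "Case"]
print("CamelCase".splitBefore(where: { $0.isUppercase }))    // ["Camel", "Case"]
```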
Just for fun, the Swift Algorithms package contains an algorithm called Intersperse
After adding the package and
import Algorithms
you can write
let string = "hello world"
let separator = " "
let result = Array(string
.components(separatedBy: separator)
.interspersed(with: separator))
print(result) // ["hello", " ", "world"]
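If you would rather not add a dependency, the effect of interspersing can be sketched by hand. The function name interspersing(_:with:) below is hypothetical and is not part of Swift Algorithms; it simply builds an array with the separator placed between neighbouring elements.

```swift
import Foundation

/// Hypothetical stand-in: places `separator` between every two
/// neighbouring elements of `elements`.
func interspersing<T>(_ elements: [T], with separator: T) -> [T] {
    var result: [T] = []
    for (offset, element) in elements.enumerated() {
        if offset > 0 { result.append(separator) }  // no separator before the first element
        result.append(element)
    }
    return result
}

let parts = "hello world".components(separatedBy: " ")
print(interspersing(parts, with: " ")) // ["hello", " ", "world"]
```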
Note that your second example is not quite consistent with how splitting works: the result of splitting " hello world" by a space is
["", "hello", "world"]
let sample = "a\nb\n\nc\n\n\nd\n\nddddd\n \n \n \n\n"
let sep = "\n"
let result = sample.components(separatedBy: sep).flatMap {
$0 == "" ? [sep] : [$0, sep]
}.dropLast()
debugPrint(result) // ArraySlice(["a", "\n", "b", "\n", "\n", "c", "\n", "\n", "\n", "d", "\n", "\n", "ddddd", "\n", " ", "\n", " ", "\n", " ", "\n", "\n"])
Related
I have str1 = "this is the first day in my work" and str2 = "this is a great day". I want to get the words that appear in both strings and store them, as a string, in a new variable.
The new variable str3: String should have this text: "this is day"
I found this while searching, but I need to return a string with the matches:
func isAnagram() -> Bool {
let str1 = "this is the first day in my work"
let str2 = "this is a great day"
func countedSet(string: String) -> NSCountedSet {
let array = string.map { (character) -> String in
return String(character)
}
return NSCountedSet(array: array)
}
return countedSet(string: str1).isEqual(countedSet(string: str2))
}
If order in the final string doesn't matter, this would be an easy solution:
let str1 = "this is the first day in my work"
let str2 = "this is a great day"
let words1 = Set(str1.split(separator: " "))
let words2 = Set(str2.split(separator: " "))
let str3 = words1.intersection(words2).reduce("") { $0 + $1 + " "}
If order matters:
...
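let orderedWords1 = str1.split(separator: " ") // a Set has no stable order, so take positions from the original word array
let str3 = words1.intersection(words2).sorted {
    orderedWords1.firstIndex(of: $0)! < orderedWords1.firstIndex(of: $1)!
}.reduce("") { $0 + $1 + " " }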
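Another way to keep the word order of str1, sketched under the assumption that words are separated by single spaces: filter the words of str1 against a Set built from str2, then join them, which also avoids the trailing space.

```swift
let str1 = "this is the first day in my work"
let str2 = "this is a great day"

// Set lookup is fast; filtering the array keeps str1's order.
let words2 = Set(str2.split(separator: " "))
let str3 = str1.split(separator: " ")
    .filter { words2.contains($0) }
    .joined(separator: " ")

print(str3) // this is day
```

A word that occurs twice in str1 would appear twice in the result; deduplicate first if that matters.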
You can use the String method enumerateSubstrings(in:options:) with the .byWords option to get the words in your sentences, and use filter to remove the words not contained in the second string:
extension StringProtocol where Index == String.Index {
var words: [String] {
var result: [String] = []
enumerateSubstrings(in: startIndex..., options: .byWords) { (substring, _, _, _) in
result.append(substring!)
}
return result
}
func matchingWords(in string: String) -> [String] {
return string.words.filter(words.contains)
}
}
Note that this preserves the order of occurrences and doesn't fail if there is punctuation in the string:
let str1 = "this is the first day in my work"
let str2 = "this is a great day"
let matchingWords = str1.matchingWords(in: str2) // ["this", "is", "day"]
let str3 = matchingWords.joined(separator: " ") // "this is day"
If I have a string like "Hello world (this is Sam)", I need to get the following array: ["Hello world", "this is Sam"], and also the following: ["Hello","World","this is Sam"]. What would be the best way to achieve this in Swift?
Not sure if you still need this but you can try this.
let originalString = "Hello world (this is Sam) Hello world (this is Sam) (this is Sam) Hello world Hello world (this is Sam)"
let newArr = originalString.components(separatedBy: ["(", ")"])
var finalArr = [String]()
for (index, value) in newArr.enumerated() {
if (index + 1) % 2 == 1 {
finalArr.append(contentsOf: value.components(separatedBy: " ").filter { $0 != "" })
}
else {
finalArr.append(value)
}
}
print(finalArr) //["Hello", "world", "this is Sam", "Hello", "world", "this is Sam", "this is Sam", "Hello", "world", "Hello", "world", "this is Sam"]
Here is the way you can achieve that:
let originalString = "Hello world (this is Sam)"
let newArr = originalString.components(separatedBy: ["(", ")"]).filter { $0 != "" }
print(newArr) //["Hello world ", "this is Sam"]
I have mixed this and this post to achieve it.
Concerning the first case, you could just split the string using the parentheses as separators. Then omittingEmptySubsequences removes any empty string produced by the split. Finally, trim any whitespace or newlines left on the resulting elements.
let splittedText = text.split(omittingEmptySubsequences: true) { separator in
return separator == "(" || separator == ")"
}.map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
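Put together as a self-contained sketch, with text assumed to be the string from the question. The final filter is an addition here: it drops pieces that become empty after trimming, for example the space between two adjacent parenthesised groups.

```swift
import Foundation

let text = "Hello world (this is Sam)"

let pieces = text
    .split(omittingEmptySubsequences: true) { $0 == "(" || $0 == ")" } // cut at parentheses
    .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }        // trim each piece
    .filter { !$0.isEmpty }                                            // drop blank pieces

print(pieces) // ["Hello world", "this is Sam"]
```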
I have a string:
first line
second line

first line

first line
second line

first line
How can I remove the second lines from this string? The second lines are always different, and so are the first lines. The only separator between the blocks is \n\n.
import Foundation
let string = "first line\n"
+ "second line\n"
+ "\n"
+ "first line\n"
+ "\n"
+ "first line\n"
+ "second line\n"
+ "\n"
+ "first line"
func removeSecondLines1(string: String) -> String {
let tokens = string.components(separatedBy: "\n")
var deletedString = tokens[0]
for i in tokens.indices.dropFirst() { // 1...tokens.count - 1 would crash when there is only one token
if tokens[i] == "" || tokens[i - 1] == "" {
deletedString = deletedString + "\n" + tokens[i]
}
}
return deletedString
}
func removeSecondLines2(string: String) -> String {
let tokens = string.components(separatedBy: "\n\n")
var deletedTokens = [String]()
for token in tokens {
deletedTokens.append(token.components(separatedBy: "\n")[0])
}
return deletedTokens.joined(separator: "\n\n")
}
print(removeSecondLines1(string: string))
print(removeSecondLines2(string: string))
Both will output
first line

first line

first line

first line
Just for fun, a solution with a regular expression:
let string = "first line\nsecond line\n\nfirst line\n\nfirst line\nsecond line\n\nfirst line"
let pattern = "\\n[^\\n]+\\n\n"
let result = string.replacingOccurrences(of: pattern, with: "\n\n", options: .regularExpression)
print(result)
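One caveat with the pattern above: it also consumes the blank line that follows, so a one-line block that sits between two blank lines can be matched and removed. A hedged alternative, assuming the look-behind support of the ICU engine behind .regularExpression, deletes only a line that directly follows a non-empty line (any further lines of the same block are removed as well):

```swift
import Foundation

let input = "first line\nsecond line\n\nfirst line\n\nfirst line\nsecond line\n\nfirst line"

// (?<=[^\n])  the newline must be preceded by a non-newline character,
// \n[^\n]+    so only a line that directly follows another line matches.
let secondLinePattern = "(?<=[^\\n])\\n[^\\n]+"
let cleaned = input.replacingOccurrences(of: secondLinePattern, with: "", options: .regularExpression)

print(cleaned == "first line\n\nfirst line\n\nfirst line\n\nfirst line") // true
```

Because nothing but the unwanted line is matched, the \n\n separators are left untouched and no replacement text is needed.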
I am writing a Swift application that parses a log file.
A line of the log file looks like this:
substr1 substr2 "substr 3" substr4
I need to get the array: [substr1, substr2, substr 3, substr4]
But if I use something like:
print(stringLine.components(separatedBy: " "))
I get: [substr1, substr2, "substr, 3", substr4].
How can I get the array [substr1, substr2, substr 3, substr4]?
One possible solution is to use map:
let testString = "substr1 substr2 \"substr3\" substr4"
let mappedString = testString.components(separatedBy: " ").map { $0.replacingOccurrences(of: "\"", with: "") }
print(mappedString) // ["substr1", "substr2", "substr3", "substr4"]
Strictly speaking, this problem calls for a regular expression, and the example above only works when the quoted part contains no space. For your case, where the quoted part is "substr 3", you could go this way:
let stringLine = "substr1 substr2 \"substr 3\" substr4"
var testStringArray = stringLine.replacingOccurrences(of: "\"", with: "").components(separatedBy: " ")
var arr = [String]()
var step = 0
while step < testStringArray.count {
    var current = testStringArray[step]
    let next = step + 1
    // Glue a following one-character piece (the "3") back onto the current piece.
    if next < testStringArray.count, testStringArray[next].count == 1 {
        current += " " + testStringArray[next]
        testStringArray.remove(at: next)
    }
    arr.append(current)
    step += 1
}
print(arr) // ["substr1", "substr2", "substr 3", "substr4"]
You would be better off using a regular expression:
let pattern = "([^\\s\"]+|\"[^\"]+\")"
let regex = try! NSRegularExpression(pattern: pattern, options: [])
let line = "substr1 substr2 \"substr 3\" substr4"
let arr = regex.matches(in: line, options: [], range: NSRange(0..<line.utf16.count))
    .map { (line as NSString).substring(with: $0.range(at: 1)).trimmingCharacters(in: CharacterSet(charactersIn: "\"")) } // rangeAt(_:) was renamed to range(at:)
print(arr) //->["substr1", "substr2", "substr 3", "substr4"]
Alternatively you could split the string based on a CharacterSet and then filter out the empty occurrences:
let stringLine = "substr1 substr2 \"substr3\" substr4"
let array = stringLine.components(separatedBy: CharacterSet(charactersIn: "\" ")).filter { !$0.isEmpty }
print (array)
Output: ["substr1", "substr2", "substr3", "substr4"]
Note that this will not work correctly if a " appears inside one of the substrings, because that substring will then be split as well.
Or, simply iterate over the characters and maintain state about the quoted parts:
import Foundation // UIKit is not needed for this code
extension String {
func parse() -> [String] {
let delimiter = Character(" ")
let quote = Character("\"")
var tokens = [String]()
var pending = ""
var isQuoted = false
for character in self { // String is itself a collection of Character; the characters view is gone in current Swift
if character == quote {
isQuoted = !isQuoted
}
else if character == delimiter && !isQuoted {
tokens.append(pending)
pending = ""
}
else {
pending.append(character)
}
}
// Add final token
if !pending.isEmpty {
tokens.append(pending)
}
return tokens
}
}
print ("substr1 substr2 \"substr 3\" substr4".parse()) // ["substr1", "substr2", "substr 3", "substr4"]
print ("\"substr 1\" substr2 \"substr 3\" substr4".parse()) // ["substr 1", "substr2", "substr 3", "substr4"]
print ("a b c d".parse()) // ["a", "b", "c", "d"]
Note: this code doesn't take into account that double quotes "" might be used to escape a single quote. But I don't know if that's a possibility in your case.
https://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/
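If doubled quotes do need to be treated as an escaped quote, the state machine above can be extended along these lines. This is only a sketch under CSV-like assumptions: the function name tokenize is hypothetical, a doubled quote counts as a literal quote only inside a quoted part, and runs of spaces produce no empty tokens.

```swift
/// Hypothetical tokenizer: splits on spaces outside quotes and treats
/// "" inside a quoted part as one literal quote character.
func tokenize(_ line: String) -> [String] {
    let characters = Array(line)
    var tokens: [String] = []
    var pending = ""
    var hasPending = false   // distinguishes "no token yet" from an empty quoted token
    var isQuoted = false
    var index = 0

    while index < characters.count {
        let character = characters[index]
        if character == "\"" {
            let nextIsQuote = index + 1 < characters.count && characters[index + 1] == "\""
            if isQuoted && nextIsQuote {
                pending.append("\"")   // escaped quote: keep one, skip the other
                index += 1
            } else {
                isQuoted.toggle()      // opening or closing quote
                hasPending = true
            }
        } else if character == " " && !isQuoted {
            if hasPending { tokens.append(pending) }
            pending = ""
            hasPending = false
        } else {
            pending.append(character)
            hasPending = true
        }
        index += 1
    }
    if hasPending { tokens.append(pending) }   // add final token
    return tokens
}

print(tokenize("substr1 substr2 \"substr 3\" substr4"))
// ["substr1", "substr2", "substr 3", "substr4"]
```

Working on an Array of characters makes the one-character lookahead a plain index check, at the cost of copying the line once.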
How would I split a string to include the separators?
Let's say I had a string such as:
let myString = "apple banana orange grapes"
If I used
let separatedString = myString.components(separatedBy: " ") // componentsSeparatedByString in Swift 2
my resulting array would be
["apple","banana","orange","grapes"]
How would I achieve a result of
["apple ","banana ","orange ","grapes"]
array.map lets you process the resulting array and add the space back in.
let separatedString = myString
    .components(separatedBy: " ")
    .map { "\($0) " }
The last line calls the closure once for each string in the split array, passing it as $0, and the closure returns a new string with the space added back; map collects these strings into a new array.
Alternative using regular expression:
let myString = "apple banana orange grapes"
let pattern = "\\w+\\s?"
let regex = try! NSRegularExpression(pattern: pattern, options: [])
let matches = regex.matches(in: myString, options: [], range: NSRange(myString.startIndex..., in: myString)) // NSRange counts UTF-16 units, not characters
    .map { (myString as NSString).substring(with: $0.range) }
print(matches) // -> ["apple ", "banana ", "orange ", "grapes"]
Solution
Since you updated your question, it looks like you no longer want a trailing space after the last word.
So here is my updated code:
let text = "apple banana orange grapes"
let chunks: [String] = text
    .components(separatedBy: " ")
    .reversed()
    .enumerated()
    .map { $0.element + ($0.offset == 0 ? "" : " ") } // offset 0 is the last word after reversing
    .reversed()
print(chunks) // ["apple ", "banana ", "orange ", "grapes"]
Multiple separators
Thanks to @vadian for the suggestion.
let text = "apple banana\norange grapes"
let chunks: [String] = text
    .components(separatedBy: .whitespacesAndNewlines)
    .reversed()
    .enumerated()
    .map { $0.element + ($0.offset == 0 ? "" : " ") } // offset 0 is the last word after reversing
    .reversed()
print(chunks) // ["apple ", "banana ", "orange ", "grapes"]
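The reverse/enumerate chain above exists only to skip the last word. A shorter sketch of the same idea in current Swift (the variable names are illustrative): append a space to every word except the last.

```swift
import Foundation

let text = "apple banana\norange grapes"
let words = text.components(separatedBy: .whitespacesAndNewlines)

// Every word but the last gets a trailing space; the last is kept as is.
let chunks: [String] = words.dropLast().map { $0 + " " } + words.suffix(1)

print(chunks) // ["apple ", "banana ", "orange ", "grapes"]
```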