Reading a string char by char is very slow in my Swift implementation

I have to read a file character by character in Swift. My approach is to read a chunk from a FileHandle, convert it to a string, and return (and remove) the first character of that string on each call.
This is my code so far:
/// Return next character, or nil on EOF.
func nextChar() -> Character? {
precondition(fileHandle != nil, "Attempt to read from closed file")
if atEof {
return nil
}
if self.stored.characters.count > 0 {
let c: Character = self.stored.characters.first!
stored.remove(at: self.stored.startIndex)
return c
}
let tmpData = fileHandle.readData(ofLength: (4096))
print("\n---- file read ---\n" , terminator: "")
if tmpData.count == 0 {
return nil
}
self.stored = NSString(data: tmpData, encoding: encoding.rawValue) as String!
let c: Character = self.stored.characters.first!
self.stored.remove(at: stored.startIndex)
return c
}
My problem with this is that the returning of a character is very slow.
This is my test implementation:
if let aStreamReader = StreamReader(path: file) {
defer {
aStreamReader.close()
}
while let char = aStreamReader.nextChar() {
print("\(char)", terminator: "")
continue
}
}
Even without the print, it took ages to read the file to the end.
For a sample file of 1.4 MB, it took more than six minutes to finish the task.
time ./.build/debug/read a.txt
real 6m22.218s
user 6m13.181s
sys 0m2.998s
Do you have any suggestions for speeding up this part?
let c: Character = self.stored.characters.first!
stored.remove(at: self.stored.startIndex)
return c
Thanks a lot.
P.S. Updated function:
func nextChar() -> Character? {
//precondition(fileHandle != nil, "Attempt to read from closed file")
if atEof {
return nil
}
if stored_cnt > (stored_idx + 1) {
stored_idx += 1
return stored[stored_idx]
}
let tmpData = fileHandle.readData(ofLength: (chunkSize))
if tmpData.count == 0 {
atEof = true
return nil
}
if let s = NSString(data: tmpData, encoding: encoding.rawValue) as String! {
stored = s.characters.map { $0 }
stored_idx = 0
stored_cnt = stored.count
}
return stored[0];
}

Your implementation of nextChar is terribly inefficient.
You create a String, then call characters on it for every single character, and you mutate the string every time as well. Both self.stored.characters.count and removing the first element of a String are O(n) operations, so draining one chunk of n characters costs on the order of n² steps.
Why not create the String once and keep only a reference to its characters, together with an index into them? Instead of modifying the string on every call, simply increment the index and return the next character.
Once you reach the last character, read the next piece of the file, create a new string, and reset the characters and the index.
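A minimal sketch of that idea, not the asker's actual StreamReader: the chunk source is a hypothetical closure, readChunk, standing in for fileHandle.readData(ofLength:), and the names CharacterReader, buffer and index are invented for illustration. It assumes UTF-8 input and, like the code in the question, ignores the case where a chunk boundary splits a multi-byte character.

```swift
import Foundation

/// Hands out characters one at a time from chunked input.
final class CharacterReader {
    // Hypothetical stand-in for fileHandle.readData(ofLength: chunkSize);
    // it must return empty Data at end of input.
    private let readChunk: () -> Data
    private var buffer: [Character] = []  // characters of the current chunk
    private var index = 0                 // position of the next character to return
    private var atEof = false

    init(readChunk: @escaping () -> Data) {
        self.readChunk = readChunk
    }

    /// Return next character, or nil on EOF.
    func nextChar() -> Character? {
        // Refill only when the current chunk is exhausted.
        while index == buffer.count {
            if atEof { return nil }
            let data = readChunk()
            if data.isEmpty {
                atEof = true
                return nil
            }
            // Decode once per chunk; invalid UTF-8 becomes U+FFFD.
            buffer = Array(String(decoding: data, as: UTF8.self))
            index = 0
        }
        let c = buffer[index]
        index += 1
        return c
    }
}
```

Each call is now an array lookup plus an integer increment; the O(n) decoding work happens once per chunk rather than once per character.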

Related

How to read a specific file's line in swift4?

Testing in Playground I read a whole file in an array of String, one string per line.
But what I need is a specific line only:
let dir = try? FileManager.default.url(for: .documentDirectory,
in: .userDomainMask, appropriateFor: nil, create: true)
let fileURL = dir!.appendingPathComponent("test").appendingPathExtension("txt")
let text: [String] = try String(contentsOf: fileURL).components(separatedBy: NSCharacterSet.newlines)
let i = 2 // computed before, here to simplify
print(text[i])
Is there a way to avoid reading the complete big file?
I'm guessing you mean that you want to retrieve the index without manually searching the array with, say, a for-in loop.
In Swift 4 you can use Array.index(where:) (renamed firstIndex(where:) in Swift 4.2) in combination with StringProtocol's generic contains(_:) function to find what you're looking for.
Let's imagine you're looking for the first line containing the text "important stuff" in your text: [String] array.
You could use:
let lineIndex = text.index(where: { $0.contains("important stuff") })
Behind the scenes, Swift still loops over the array to find the text, so this is a linear search just like a manual for-in loop; it is simply more concise.
Note that the result of this search could be nil if no matching lines are present. Therefore, you'll need to ensure it's not nil before using the result:
Force unwrap the result (risking the dreaded fatal error: unexpectedly found nil while unwrapping an Optional value):
print(text[lineIndex!])
Or, use an if let statement:
if let lineIndex = text.index(where: { $0.contains("important stuff") }) {
print(text[lineIndex])
}
else {
print("Sorry; didn't find any 'important stuff' in the array.")
}
Or, use a guard statement:
guard let lineIndex = text.index(where: {$0.contains("important stuff")}) else {
print("Sorry; didn't find any 'important stuff' in the array.")
return
}
print(text[lineIndex])
To find a specific line without reading the entire file in, you could use this StreamReader answer. It contains code that worked in Swift 3. I tested it in Swift 4, as well: see my GitHub repo, TEST-StreamReader, for my test code.
You would still have to loop to get to the right line, but then break the loop once you've retrieved that line.
Here's the StreamReader class from that SO answer:
class StreamReader {
let encoding : String.Encoding
let chunkSize : Int
var fileHandle : FileHandle!
let delimData : Data
var buffer : Data
var atEof : Bool
init?(path: String, delimiter: String = "\n", encoding: String.Encoding = .utf8,
chunkSize: Int = 4096) {
guard let fileHandle = FileHandle(forReadingAtPath: path),
let delimData = delimiter.data(using: encoding) else {
return nil
}
self.encoding = encoding
self.chunkSize = chunkSize
self.fileHandle = fileHandle
self.delimData = delimData
self.buffer = Data(capacity: chunkSize)
self.atEof = false
}
deinit {
self.close()
}
/// Return next line, or nil on EOF.
func nextLine() -> String? {
precondition(fileHandle != nil, "Attempt to read from closed file")
// Read data chunks from file until a line delimiter is found:
while !atEof {
if let range = buffer.range(of: delimData) {
// Convert complete line (excluding the delimiter) to a string:
let line = String(data: buffer.subdata(in: 0..<range.lowerBound), encoding: encoding)
// Remove line (and the delimiter) from the buffer:
buffer.removeSubrange(0..<range.upperBound)
return line
}
let tmpData = fileHandle.readData(ofLength: chunkSize)
if tmpData.count > 0 {
buffer.append(tmpData)
} else {
// EOF or read error.
atEof = true
if buffer.count > 0 {
// Buffer contains last line in file (not terminated by delimiter).
let line = String(data: buffer as Data, encoding: encoding)
buffer.count = 0
return line
}
}
}
return nil
}
/// Start reading from the beginning of file.
func rewind() -> Void {
fileHandle.seek(toFileOffset: 0)
buffer.count = 0
atEof = false
}
/// Close the underlying file. No reading must be done after calling this method.
func close() -> Void {
fileHandle?.closeFile()
fileHandle = nil
}
}
extension StreamReader : Sequence {
func makeIterator() -> AnyIterator<String> {
return AnyIterator {
return self.nextLine()
}
}
}
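The "loop, then break once you have the line" step can be sketched generically. The function name line(at:in:) is invented for this example; it accepts any sequence of strings, so a StreamReader like the one above could be passed in as well as a plain array.

```swift
/// Returns the line at the given zero-based position, or nil if the
/// sequence ends first. Iteration stops as soon as the line is found,
/// so a streaming source is never read past that line.
func line<S: Sequence>(at index: Int, in lines: S) -> String? where S.Element == String {
    for (i, line) in lines.enumerated() where i == index {
        return line
    }
    return nil
}

// Usage with an in-memory array standing in for a file reader.
let sampleLines = ["first", "second", "third"]
if let wanted = line(at: 2, in: sampleLines) {
    print(wanted)
}
```

With a streaming reader, only the chunks up to the requested line are read from disk; the lines before it still have to be scanned, because line lengths are not known in advance.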

Preparing for Swift 4 - UnsafeMutablePointer migration to UnsafeMutableBufferPointer

I've got a char-by-char function that returns a character from a file. It is pretty performant, as far as I know.
Xcode says that characterPointer.initialize(from: s.characters) "will be removed in Swift 4.0" and that I have to "use 'UnsafeMutableBufferPointer.initialize(from:)' instead".
But I don't understand how. Could you please explain it to me? How do I iterate quickly over characters with an UnsafeMutableBufferPointer? Do you have an example?
This is my function:
/// Return next character, or nil on EOF.
func nextChar() -> Character? {
//precondition(fileHandle != nil, "Attempt to read from closed file")
if atEof {
return nil
}
if stored_cnt > (stored_idx + 1) {
stored_idx += 1
let char = characterPointer.pointee
characterPointer = characterPointer.successor()
return char
}
let tmpData = fileHandle.readData(ofLength: (chunkSize))
if tmpData.count == 0 {
atEof = true
return nil
}
if var s = NSString(data: tmpData, encoding: encoding.rawValue) as String! {
characterPointer.initialize(from: s.characters)
stored_idx = 0
stored_cnt = s.characters.count
}
let char = characterPointer.pointee
characterPointer = characterPointer.successor()
return char
}
By the way, if there is a faster solution, please do not hesitate to tell me.
Thanks a lot.
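For reference, a hedged sketch of what the suggested replacement call can look like. It assumes Swift 4.1 or later, where UnsafeMutableBufferPointer has allocate(capacity:) and deallocate(); the function name copyCharacters(of:) is invented for the example, and the chunk is assumed to be non-empty.

```swift
/// Copies the characters of a non-empty string into manually managed
/// memory, reads them back by index, and releases the memory again.
func copyCharacters(of chunk: String) -> [Character] {
    let buffer = UnsafeMutableBufferPointer<Character>.allocate(capacity: chunk.count)
    // initialize(from:) fills the buffer from the sequence and returns the
    // leftover iterator plus the index just past the last initialized element.
    let (_, end) = buffer.initialize(from: chunk)

    var result: [Character] = []
    result.reserveCapacity(end)
    for i in buffer.startIndex..<end {
        result.append(buffer[i])
    }

    // Character is not a trivial type, so the elements must be
    // deinitialized before the memory is freed.
    buffer.baseAddress?.deinitialize(count: end)
    buffer.deallocate()
    return result
}
```

Unless profiling shows otherwise, a plain [Character] array with an integer index is usually the simpler choice, because it needs no manual deinitialization or deallocation.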

How to increment String in Swift

I need to save files in an alphabetical order.
Now my code is saving files in numeric order
1.png
2.png
3.png ...
The problem is that when I read these files again, I read them as described here
So I was thinking of changing the code and to save the files not in a numeric order but in an alphabetical order as:
a.png b.png c.png ... z.png aa.png ab.png ...
But in Swift it's difficult to increment even a Character.
How can I start from:
var s: String = "a"
and increment s in that way?
You can keep it numeric, just use the right option when sorting:
let arr = ["1.png", "19.png", "2.png", "10.png"]
let result = arr.sorted {
$0.compare($1, options: .numeric) == .orderedAscending
}
// result: ["1.png", "2.png", "10.png", "19.png"]
If you'd really like to make them alphabetical, try this code to increment the names:
/// Increments a single `UInt32` scalar value
func incrementScalarValue(_ scalarValue: UInt32) -> String {
return String(Character(UnicodeScalar(scalarValue + 1)))
}
/// Recursive function that increments a name
func incrementName(_ name: String) -> String {
var previousName = name
if let lastScalar = previousName.unicodeScalars.last {
let lastChar = previousName.remove(at: previousName.index(before: previousName.endIndex))
if lastChar == "z" {
let newName = incrementName(previousName) + "a"
return newName
} else {
let incrementedChar = incrementScalarValue(lastScalar.value)
return previousName + incrementedChar
}
} else {
return "a"
}
}
var fileNames = ["a.png"]
for _ in 1...77 {
// Strip off ".png" from the file name
let previousFileName = fileNames.last!.components(separatedBy: ".png")[0]
// Increment the name
let incremented = incrementName(previousFileName)
// Append it to the array with ".png" added again
fileNames.append(incremented + ".png")
}
print(fileNames)
// Prints `["a.png", "b.png", "c.png", "d.png", "e.png", "f.png", "g.png", "h.png", "i.png", "j.png", "k.png", "l.png", "m.png", "n.png", "o.png", "p.png", "q.png", "r.png", "s.png", "t.png", "u.png", "v.png", "w.png", "x.png", "y.png", "z.png", "aa.png", "ab.png", "ac.png", "ad.png", "ae.png", "af.png", "ag.png", "ah.png", "ai.png", "aj.png", "ak.png", "al.png", "am.png", "an.png", "ao.png", "ap.png", "aq.png", "ar.png", "as.png", "at.png", "au.png", "av.png", "aw.png", "ax.png", "ay.png", "az.png", "ba.png", "bb.png", "bc.png", "bd.png", "be.png", "bf.png", "bg.png", "bh.png", "bi.png", "bj.png", "bk.png", "bl.png", "bm.png", "bn.png", "bo.png", "bp.png", "bq.png", "br.png", "bs.png", "bt.png", "bu.png", "bv.png", "bw.png", "bx.png", "by.png", "bz.png"]`
You will eventually end up with
a.png
b.png
c.png
...
z.png
aa.png
ab.png
...
zz.png
aaa.png
aab.png
...
Paste this code into a playground and check the result. It supports arbitrarily large counter values, such as 99999999999999.
The for loop prints every generated name so you can check that the code works,
but give numFiles a smaller value before running it, otherwise Xcode will freeze.
var fileName:String = ""
var counter = 0.0
var alphabets = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
let totalAlphaBets = Double(alphabets.count)
let numFiles = 9999
func getCharacter(counter c:Double) -> String {
var chars:String
var divisionResult = Int(c / totalAlphaBets)
let modResult = Int(c.truncatingRemainder(dividingBy: totalAlphaBets))
chars = getCharFromArr(index: modResult)
if(divisionResult != 0){
divisionResult -= 1
if(divisionResult > alphabets.count-1){
chars = getCharacter(counter: Double(divisionResult)) + chars
}else{
chars = getCharFromArr(index: divisionResult) + chars
}
}
return chars
}
func getCharFromArr(index i:Int) -> String {
if(i < alphabets.count){
return alphabets[i]
}else{
print("wrong index")
return ""
}
}
for _ in 0...numFiles {
fileName = getCharacter(counter: counter)+".png"
print(fileName)
counter += 1
}
fileName = getCharacter(counter: Double(numFiles))+".png"
print(fileName)
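The same counter-to-name mapping can also be written with integer arithmetic only, which avoids the Double conversions. This is a sketch under the assumption that names use the 26 lowercase ASCII letters; alphabeticName(for:) is a name invented here, not part of either answer.

```swift
/// Maps 0 -> "a", 25 -> "z", 26 -> "aa", 27 -> "ab", and so on.
func alphabeticName(for index: Int) -> String {
    precondition(index >= 0, "index must not be negative")
    let letters = Array("abcdefghijklmnopqrstuvwxyz")
    var n = index
    var name = ""
    while true {
        // The remainder selects the rightmost letter not yet emitted.
        name.insert(letters[n % letters.count], at: name.startIndex)
        // Subtracting 1 after dividing is what makes "aa" follow "z":
        // unlike ordinary base 26, there is no zero digit.
        n = n / letters.count - 1
        if n < 0 { break }
    }
    return name
}

// Usage: generate the first few file names.
let names = (0..<28).map { alphabeticName(for: $0) + ".png" }
print(names.suffix(3))
```

Because the name is computed directly from the counter, there is no need to parse the previous file name in order to produce the next one.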

streamReader for server URLs

I've been working with the code from this answer, provided by Martin R. The code is awesome and very useful. However, it doesn't work with URLs, although it works fine with files. After adding some NSLogs and breakpoints, I found that the problem is in this code block:
init?(path: String, delimiter: String = "\n", encoding: UInt = NSUTF8StringEncoding, chunkSize : Int = 4096) {
self.chunkSize = chunkSize
self.encoding = encoding
self.fileHandle = NSFileHandle(forReadingFromURL: NSURL(string: path)!, error: nil)
println("PATH IS \(path)")
println("FILE HANDLE IS \(fileHandle)")
if self.fileHandle == nil {
println("FILE HANDLE IS NIL!")
return nil
}
The code above actually contains some minor changes, compared with Martin's answer. Also Apple says that it is possible to use fileHandle with forReadingFromURL and it shouldn't return nil.
But here is the console output:
PATH IS http://smth.com
FILE HANDLE IS nil
FILE HANDLE IS NIL!!!!!
The question is what's wrong?
UPDATE
As Martin R has kindly explained to me, this code won't work with URLs, and this answer states the same. So I have rewritten the code, guided by the previous answers:
import Foundation
import Cocoa
class StreamReader {
let encoding : UInt
let chunkSize : Int
var atEof : Bool = false
var streamData : NSData!
var fileLength : Int
var urlRequest : NSMutableURLRequest
var currentOffset : Int
var streamResponse : NSString
var fileHandle : NSFileHandle!
let buffer : NSMutableData!
let delimData : NSData!
var reponseError: NSError?
var response: NSURLResponse?
init?(path: NSURL, delimiter: String = "\n", encoding: UInt = NSUTF8StringEncoding, chunkSize : Int = 10001000) {
println("YOUR PATH IS \(path)")
self.chunkSize = chunkSize
self.encoding = encoding
self.currentOffset = 0
urlRequest = NSMutableURLRequest(URL: path)
streamData = NSURLConnection.sendSynchronousRequest(urlRequest, returningResponse:&response, error:&reponseError)
streamResponse = NSString(data:streamData!, encoding:NSUTF8StringEncoding)!
self.fileLength = streamData.length
//println("WHAT IS STREAMDATA \(streamData)")
//println("WHAT IS URLREQUEST \(urlRequest)")
if streamData == nil {
println("LINK HAS NO CONTENT!!!!!")
}
self.fileLength = streamResponse.length
println("FILE LENGTH IS \(fileLength)")
self.buffer = NSMutableData(capacity: chunkSize)!
// Create NSData object containing the line delimiter:
delimData = delimiter.dataUsingEncoding(NSUTF8StringEncoding)!
println("WHAT DOES THE DELIMITER \(delimiter)LOOK LIKE?")
println("WHAT IS DELIMDATA \(delimData)")
}
deinit {
self.close()
}
/// Return next line, or nil on EOF.
func nextLine() -> String? {
if atEof {
println("AT THE END OF YOUR FILE!!!")
return nil
}
// Read data chunks from file until a line delimiter is found:
if currentOffset >= fileLength {
return nil
}
var blockLength : Int = buffer.length
var range = buffer.rangeOfData(delimData, options: NSDataSearchOptions(0), range: NSMakeRange(currentOffset, blockLength))
//println("STREAM DATA \(streamData)")
println("RANGE IS \(range)")
while range.location == NSNotFound {
var nRange = NSMakeRange(currentOffset, chunkSize)
println("nRange is \(nRange)")
var tmpData = streamData.subdataWithRange(nRange)
//println("TMP data length \(tmpData.length)")
currentOffset += blockLength
//println("TMPDATA is \(tmpData)")
if tmpData.length == 0 {
// EOF or read error.
println("ERROR ????")
atEof = true
if buffer.length > 0 {
// Buffer contains last line in file (not terminated by delimiter).
let line = NSString(data: buffer, encoding: encoding);
buffer.length = 0
println("THE LINE IS \(line)")
return line
}
// No more lines.
return nil
}
buffer.appendData(tmpData)
range = buffer.rangeOfData(delimData, options: NSDataSearchOptions(0), range: NSMakeRange(0, buffer.length))
}
// Convert complete line (excluding the delimiter) to a string:
let line = NSString(data: buffer.subdataWithRange(NSMakeRange(0, range.location)),
encoding: encoding)
// Remove line (and the delimiter) from the buffer:
buffer.replaceBytesInRange(NSMakeRange(0, range.location + range.length), withBytes: nil, length: 0)
return line
}
/// Start reading from the beginning of file.
func rewind() -> Void {
//streamData.seekToFileOffset(0)
buffer.length = 0
atEof = false
}
/// Close the underlying file. No reading must be done after calling this method.
func close() -> Void {
if streamData != nil {
streamData = nil
}
}
}
extension StreamReader : SequenceType {
func generate() -> GeneratorOf<String> {
return GeneratorOf<String> {
return self.nextLine()
}
}
}
This code is still far from perfect, and I would welcome any recommendations for improving it. Please be kind: I am an amateur and very inexperienced (but sooner or later I will learn).
It finally works. Probably the last remaining problem is that the code doesn't stop: it keeps reading the file from the beginning.
So the question is now closer to "What's wrong with my code?" than to the previous "What's wrong?"
UPDATE
I have rewritten last parts of code like this:
let line = NSString(data: buffer.subdataWithRange(NSMakeRange(0, range.location + 1)),
encoding: encoding)
buffer.replaceBytesInRange(NSMakeRange(0, range.location + range.length), withBytes: nil, length: 0)
println("COMPLETE LINE IS \(line)")
if line!.containsString("\n"){
println("CONTAINS NEW LINE")
//println("BUFFER IS \(buffer)")
//
println("COMPLETE LINE IS \(line)")
return line
}
else {
println("NO LINE!")
atEof = true
return nil
}
The idea is to go through all lines that contain \n and to exclude the one line that is the last and shouldn't have a \n. But even though I checked the file for non-printing characters and found no \n at the end, here is the surprising console output: Optional("lastline_blablabla\n")
So now the question is probably: how do I stop at the last line even if it contains \n?
If you need to retrieve data from a URL, my code above (the one under the first update) will work.
My own problem with \n can be solved in several ways, one of which I used in my code. As it is very specific to my case, I won't post my solution to the \n-at-end-of-file issue. I am also not sure that my solutions for the URL StreamReader and for \n are the best, so if you can suggest anything better, it will be much appreciated by me and probably by other people.
Many thanks to Martin R, who explained a lot, wrote great code and was very kind.
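For comparison, here is a small sketch of line iteration over data that is already fully in memory, as it is after a synchronous download. It is written in current Swift rather than the Swift 1 syntax above, assumes UTF-8 text, and the type name DataLineReader is invented for the example. Two things it does differently from the code in the question: it advances its offset by the number of bytes actually consumed, and it treats a trailing delimiter as the end of the last line instead of the start of an extra empty one.

```swift
import Foundation

/// Iterates over delimiter-separated lines of in-memory data.
struct DataLineReader: Sequence, IteratorProtocol {
    private let data: Data
    private let delimiter: Data
    private var offset: Int   // index of the first unread byte

    init(data: Data, delimiter: String = "\n") {
        self.data = data
        self.delimiter = Data(delimiter.utf8)
        self.offset = data.startIndex
    }

    /// Return next line without its delimiter, or nil when all bytes are consumed.
    mutating func next() -> String? {
        guard offset < data.endIndex else { return nil }
        let lineEnd: Int
        let nextOffset: Int
        if let found = data.range(of: delimiter, in: offset..<data.endIndex) {
            lineEnd = found.lowerBound
            nextOffset = found.upperBound   // skip over the delimiter
        } else {
            // Last line, not terminated by a delimiter.
            lineEnd = data.endIndex
            nextOffset = data.endIndex
        }
        let line = String(decoding: data[offset..<lineEnd], as: UTF8.self)
        offset = nextOffset
        return line
    }
}
```

Iteration ends when the offset reaches the end of the data, so no separate atEof flag or check for a final "\n" is needed.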

Read lines from big text file in Swift until new line is empty: the Swift way

I have the following text file structure (the text file is pretty big, around 100,000 lines):
A|a1|111|111|111
B|111|111|111|111
A|a2|222|222|222
B|222|222|222|222
B|222|222|222|222
A|a3|333|333|333
B|333|333|333|333
...
I need to extract a piece of text related to a given key. For example, if my key is A|a2, I need to save the following as a string:
A|a2|222|222|222
B|222|222|222|222
B|222|222|222|222
For my C++ and Objective C projects, I used the C++ getline function as follows:
std::ifstream ifs(dataPathStr.c_str());
NSString* searchKey = @"A|a2";
std::string search_string ([searchKey cStringUsingEncoding:NSUTF8StringEncoding]);
// read and discard lines from the stream till we get to a line starting with the search_string
std::string line;
while( getline( ifs, line ) && line.find(search_string) != 0 );
// check if we have found such a line, if not report an error
if( line.find(search_string) != 0 )
{
data = DATA_DEFAULT ;
}
else{
// we need to form a string that would include the whole set of data based on the selection
dataStr = line + '\n' ; // result initially contains the first line
// now keep reading line by line till we get an empty line or eof
while(getline( ifs, line ) && !line.empty() )
{
dataStr += line + '\n'; // append this line to the result
}
data = [NSString stringWithUTF8String:dataStr.c_str()];
}
As I am doing a project in Swift, I am trying to get rid of getline and replace it with something "Cocoaish". But I cannot find a good Swift solution to address the above problem. If you have an idea, I would really appreciate it. Thanks!
Using the StreamReader class from Read a file/URL line-by-line in Swift, you could do that in Swift like this:
let searchKey = "A|a2"
let bundle = NSBundle.mainBundle()
let pathNav = bundle.pathForResource("data_apt", ofType: "txt")
if let aStreamReader = StreamReader(path: pathNav!) {
var dataStr = ""
while let line = aStreamReader.nextLine() {
if line.rangeOfString(searchKey, options: nil, range: nil, locale: nil) != nil {
dataStr = line + "\n"
break
}
}
if dataStr == "" {
dataStr = "DATA_DEFAULT"
} else {
while let line = aStreamReader.nextLine() {
if countElements(line) == 0 {
break
}
dataStr += line + "\n"
}
}
aStreamReader.close()
println(dataStr)
} else {
println("cannot open file")
}
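The same two-phase scan can be sketched in current Swift over any sequence of lines. The function name extractBlock(forKey:from:) is invented for this example. Note one deliberate difference: it matches the key with hasPrefix, which mirrors the C++ test line.find(search_string) != 0, whereas the rangeOfString call above matches the key anywhere in the line.

```swift
/// Returns the block that starts at the first line beginning with `key`
/// and runs until an empty line or the end of input; nil if no line matches.
func extractBlock<S: Sequence>(forKey key: String, from lines: S) -> String?
    where S.Element == String {
    var iterator = lines.makeIterator()

    // Phase 1: discard lines until one starts with the key.
    var block: String? = nil
    while let line = iterator.next() {
        if line.hasPrefix(key) {
            block = line + "\n"
            break
        }
    }
    guard var result = block else { return nil }

    // Phase 2: keep appending lines until an empty line or EOF.
    while let line = iterator.next(), !line.isEmpty {
        result += line + "\n"
    }
    return result
}

// Usage with an in-memory array standing in for a line reader.
let sample = ["A|a1|111", "B|111", "", "A|a2|222", "B|222", "B|222", "", "A|a3|333"]
print(extractBlock(forKey: "A|a2", from: sample) ?? "DATA_DEFAULT")
```

Both phases share one iterator, so the input is traversed only once and reading stops right after the block ends.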