I'm writing a fairly large text file (it's actually more like ascii-encoded data), and it's... very slow. And uses a lot of memory.
Here's a minimalistic version of the code I'm using to test how to write files more quickly. writeFileIncrementally writes one line at a time in a for loop, while writeFileFromBigData creates a large string and then dumps it to disk. I was fully expecting writeFileFromBigData to be faster, but it's 20 times faster! That's a bit more than I expected. For size=10_000_000, it takes 20-25 seconds to write it incrementally and 1-1.5 seconds to write it in one go. Plus, the incremental version actually allocates more and more memory as it goes. By the end of it, it's well into the GiB range. I don't understand what's going on here.
func writeFileIncrementally(toUrl url: URL, size: Int) {
    // ensure file exists and is empty
    try? "".write(to: url, atomically: true, encoding: .ascii)
    guard let handle = try? FileHandle(forWritingTo: url) else { return }
    defer {
        handle.closeFile()
    }
    for i in 0..<size {
        let s = "\(i)\n"
        handle.write(s.data(using: .ascii)!)
    }
}

func writeFileFromBigData(toUrl url: URL, size: Int) {
    let s = (0..<size).map { String($0) }.joined(separator: "\n")
    try? s.write(to: url, atomically: true, encoding: .ascii)
}
Compare that to the same thing in Python. The create-string-then-write-it is faster in Python as well. That's reasonable, but the difference in Python is it takes about 2.7 seconds to write it incrementally (about 98% user time) and about 1 second to write it in one go (including creating the string). Additionally, the incremental version has constant memory usage. It does not go up as the file is being written.
def writeFileIncrementally(path, size):
    with open(path, "w+") as f:
        for i in range(size):
            f.write(f"{i}\n")

def writeFileFromBigData(path, size):
    with open(path, "w+") as f:
        f.write("\n".join(str(i) for i in range(size)))
So my question is twofold:
Why is my writeFileIncrementally function so slow and why does it use so much memory? I was hoping to be able to write incrementally to reduce memory usage.
Is there some better approach for incrementally writing a large text file in Swift?
For memory, see Duncan C's answer. You need an autoreleasepool. But for speed, you have a small problem and a large problem.
The small problem is this line:
handle.write(s.data(using: .ascii)!)
Rewriting that will save about 40% of your time (from 27s to 17s in my tests):
handle.write(Data(s.utf8))
Strings are generally stored internally in UTF8. While ASCII is a perfect subset of that, your code requires checking for anything that isn't ASCII. Using .utf8 can often just grab the internal buffer directly. It also avoids creating and unwrapping an Optional.
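As a small illustration of the difference between the two conversions (a sketch, not a benchmark; the variable names are made up for this example): for pure-ASCII text both produce the same bytes, but the .ascii path returns an Optional because it can fail.

```swift
import Foundation

let line = "12345\n"

// For pure-ASCII text, both conversions yield identical bytes.
let viaASCII = line.data(using: .ascii)   // Data? (can fail)
let viaUTF8 = Data(line.utf8)             // Data (cannot fail)

// A non-ASCII character cannot be encoded as ASCII without loss,
// so the strict conversion returns nil.
let nonASCII = "é\n".data(using: .ascii)
```

For data that is known to be ASCII, the UTF-8 view therefore gives the same file contents without the validation step or the forced unwrap.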
But 17s is still a lot more than 1-2s. That's due to your big problem.
Every call to write has to get the data all the way to the OS's file buffers. Not all the way to the disk, but still, it's an expensive operation. Unless the data is precious, you generally want to chunk it into larger blocks (4k is very common). If you do this, the write time goes down to 1.5s:
let bufferSize = 4*1024
var buffer = Data(capacity: bufferSize)
for i in 0..<size {
    autoreleasepool {
        let s = "\(i)\n"
        buffer.append(contentsOf: s.utf8)
        if buffer.count >= bufferSize {
            handle.write(buffer)
            buffer.removeAll(keepingCapacity: true)
        }
    }
}
// Write the final buffer
handle.write(buffer)
This is "pretty close" to the "big data" function's 1.1s on my system. There's still a lot of memory allocation and cleanup going on. And in my experience, at least recently, [UInt8] is much faster than Data. I'm not sure that was always true, but all my most recent tests on Mac go that way. So, writing with the newer write(contentsOf:) interface is:
let bufferSize = 4*1024
var buffer: [UInt8] = []
buffer.reserveCapacity(bufferSize)
for i in 0..<size {
    autoreleasepool {
        let s = "\(i)\n"
        buffer.append(contentsOf: s.utf8)
        if buffer.count >= bufferSize {
            try? handle.write(contentsOf: buffer)
            buffer.removeAll(keepingCapacity: true)
        }
    }
}
// Write the final buffer
try? handle.write(contentsOf: buffer)
And that's faster than the big data function, because it doesn't have to make a Data. (830ms on my machine)
But wait, it gets better. This code doesn't need the autorelease pool, and if you remove that, I can write this file in 730ms.
let bufferSize = 4*1024
var buffer: [UInt8] = []
buffer.reserveCapacity(bufferSize)
for i in 0..<size {
    let s = "\(i)\n"
    buffer.append(contentsOf: s.utf8)
    if buffer.count >= bufferSize {
        try? handle.write(contentsOf: buffer)
        buffer.removeAll(keepingCapacity: true)
    }
}
// Write the final buffer
try? handle.write(contentsOf: buffer)
But what about Python? Why doesn't it need buffers to be fast? Because it gives you buffers by default. Your open call returns a text wrapper around a buffered binary stream (typically with an 8k buffer) that works more or less like the above code. You'd need to write in binary mode and also pass buffering=0 to turn it off. See the docs on open for the details.
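If you want something closer to Python's behavior, the buffering loop can be wrapped in a small reusable type. This is only a sketch under a few assumptions: BufferedFileWriter is a hypothetical name (not a Foundation type), the 4 KiB default is just the chunk size used above, and error handling is kept minimal.

```swift
import Foundation

/// Hypothetical helper: collects bytes in memory and writes them in large chunks.
struct BufferedFileWriter {
    private let handle: FileHandle
    private let capacity: Int
    private var buffer: [UInt8] = []

    init(url: URL, capacity: Int = 4 * 1024) throws {
        // Create (or truncate) the file, then open it for writing.
        guard FileManager.default.createFile(atPath: url.path, contents: nil) else {
            throw CocoaError(.fileWriteUnknown)
        }
        handle = try FileHandle(forWritingTo: url)
        self.capacity = capacity
        buffer.reserveCapacity(capacity)
    }

    /// Appends the string's UTF-8 bytes and flushes once the buffer is full.
    mutating func write(_ string: String) throws {
        buffer.append(contentsOf: string.utf8)
        if buffer.count >= capacity {
            try flush()
        }
    }

    /// Writes any pending bytes to the file.
    mutating func flush() throws {
        guard !buffer.isEmpty else { return }
        try handle.write(contentsOf: buffer)
        buffer.removeAll(keepingCapacity: true)
    }

    /// Flushes the remaining bytes and closes the file.
    mutating func close() throws {
        try flush()
        try handle.close()
    }
}

// Usage: write 1,000 lines to a temporary file.
let url = FileManager.default.temporaryDirectory.appendingPathComponent("numbers.txt")
var writer = try BufferedFileWriter(url: url)
for i in 0..<1_000 {
    try writer.write("\(i)\n")
}
try writer.close()
```

The caller has to remember to call close(), otherwise the last partial buffer is never written; that is the same job the "Write the final buffer" line does in the loops above.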
I'm not sure why the incremental writing version is so slow.
If you're worried about memory use, though, you could make your memory footprint much smaller by wrapping your inner loop with a call to autoreleasepool():
for i in 0..<size {
    autoreleasepool {
        let s = "\(i)\n"
        handle.write(s.data(using: .ascii)!)
        if i.isMultiple(of: 100000) {
            print(i)
        }
    }
}
(Internally, Swift's ARC memory management sometimes allocates temporary storage on the heap as "autoreleased", which means it sticks around in memory until the current call chain returns and the app revisits the event loop. If you have a processing loop that allocates a whole bunch of local variables they can accumulate on the heap until you finish and return. It's really only a problem if you push up against the memory limits of the device however.)
Edit:
I think this might be a case of premature optimization, however. It looks to me like the max memory consumption for the "write it all at once" version with 10,000,000 items is about 150 MB, which is not a problem for a device that's able to run current iOS versions. Just use the "write all at once" version and be done with it. If you need to write billions of lines at once, then write hybrid code that breaks it into chunks of 10 million at a time and appends each chunk to the file (with the inner loop wrapped in a call to autoreleasepool(), as shown above).
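A rough sketch of that hybrid approach, assuming a hypothetical function name (writeNumbers) and a caller-chosen chunk size: each chunk is built as one string and appended to the file, so peak memory is bounded by the chunk rather than by the whole file.

```swift
import Foundation

/// Hypothetical helper: writes the numbers 0..<count, one per line, in chunks.
func writeNumbers(to url: URL, count: Int, chunkSize: Int) throws {
    precondition(chunkSize > 0, "chunkSize must be positive")
    // Ensure the file exists and is empty.
    FileManager.default.createFile(atPath: url.path, contents: nil)
    let handle = try FileHandle(forWritingTo: url)
    defer { try? handle.close() }

    for start in stride(from: 0, to: count, by: chunkSize) {
        // On Apple platforms, this loop body is what would be wrapped in autoreleasepool { }.
        let end = min(start + chunkSize, count)
        let chunk = (start..<end).map { "\($0)\n" }.joined()
        // The handle's offset advances with each write, so chunks are appended in order.
        try handle.write(contentsOf: Data(chunk.utf8))
    }
}

// Usage: 25 lines written in chunks of 10.
let chunkedURL = FileManager.default.temporaryDirectory.appendingPathComponent("chunked.txt")
try writeNumbers(to: chunkedURL, count: 25, chunkSize: 10)
```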
I am trying to do some calculations on a large number of objects. The objects are saved in an array and the results of the operation should be saved in a new array. To speed up the processing, I'm trying to break up the task into multiple subtasks which can run concurrently on different threads. The simplified example code below replaces the actual operation with two seconds of wait.
I have tried multiple ways of solving this issue, using both DispatchQueues and Tasks.
Using DispatchQueue
The basic setup I used is the following:
import Foundation
class Main {
let originalData = ["a", "b", "c"]
var calculatedData = Set<String>()
func doCalculation() {
//calculate length of array slices.
let totalLength = originalData.count
let sliceLength = Int(totalLength / 3)
var start = 0
var end = 0
let myQueue = DispatchQueue(label: "Calculator", attributes: .concurrent)
var allPartialResults = [Set<String>]()
for i in 0..<3 {
if i != 2 {
start = sliceLength * i
end = start + sliceLength - 1
} else {
start = totalLength - sliceLength * (i - 1)
end = totalLength - 1
}
allPartialResults.append(Set<String>())
myQueue.async {
allPartialResults[i] = self.doPartialCalculation(data: Array(self.originalData[start...end]))
}
}
myQueue.sync(flags: .barrier) {
for result in allPartialResults {
self.calculatedData.formUnion(result)
}
}
//do further calculations with the data
}
func doPartialCalculation(data: [String]) -> Set<String> {
print("began")
sleep(2)
let someResultSet: Set<String> = ["some result"]
print("ended")
return someResultSet
}
}
As expected, the Console Log is the following (with all three "ended" appearing at once, two seconds after all three "began" appeared at once):
began
began
began
ended
ended
ended
When measuring performance using os_signpost (and using real data and calculations), this approach reduces the time needed for the entire doCalculation() function to run from 40ms to around 14ms.
Note that to avoid data races when appending the results to the final calculatedData Set, I created an array of partial Data sets of which every DispatchQueue only accesses one index (which is not a solution I like and the main reason why I am not satisfied with this approach). What I would have liked to do is to call DispatchQueue.main from within myQueue and add the new data to the calculatedData Set on the main thread, however calling DispatchQueue.main.sync causes a deadlock and using the async version leads to the barrier flag not working as intended.
Using Tasks
In a second attempt, I tried using Tasks to run code concurrently. As I understand it, there are two options for running code concurrently with Tasks: async let and withTaskGroup. For the purpose of retrieving a variable number of partial results from a variable number of concurrent tasks, I figured using withTaskGroup was the best option for me.
I modified the code to look like this:
class Main {
let originalData = ["a", "b", "c"]
var calculatedData = Set<String>()
func doCalculation() async {
//calculate length of array slices.
let totalLength = originalData.count
let sliceLength = Int(totalLength / 3)
var start = 0
var end = 0
await withTaskGroup(of: Set<String>.self) { group in
for i in 0..<3 {
if i != 2 {
start = sliceLength * i
end = start + sliceLength - 1
} else {
start = totalLength - sliceLength * (i - 1)
end = totalLength - 1
}
group.addTask {
return await self.doPartialCalculation(data: Array(self.originalData[start...end]))
}
}
for await newSet in group {
calculatedData.formUnion(newSet)
}
}
//do further calculations with the data
}
func doPartialCalculation(data: [String]) async -> Set<String> {
print("began")
try? await Task.sleep(nanoseconds: UInt64(1e9))
let someResultSet: Set<String> = ["some result"]
print("ended")
return someResultSet
}
}
However, the Console Log prints the following (with every "ended" coming 2 seconds after the preceding "began"):
began
ended
began
ended
began
ended
Measuring performance using os_signpost revealed that the operation takes 40ms to complete. Therefore it is not running concurrently.
With that being said, what is the best course of action for this problem?
Using DispatchQueue, how do you call the Main Queue to avoid data races from within a queue, while at the same time preserving a barrier flag later on in the code?
Using Task, how can you actually make them run concurrently?
EDIT
Running the code on a real device instead of the simulator and changing the sleep function inside the Task from sleep() to Task.sleep(), I was able to achieve concurrent behavior in that the Console prints the expected log. However, the operation time for the task remains upwards of 40-50ms and is highly variable, sometimes reaching 200ms or more. This problem remains after adding the .userInitiated property to the Task.
Why does it take so much longer to run the same operation concurrently using Task compared to using DispatchQueue? Am I missing something?
A few observations:
One possible performance difference is that the simulator artificially constrains the “cooperative thread pool” used by async-await. See Maximum number of threads with async-await task groups. This is one cause of a lack of full concurrency (on the simulator).
In the async-await test, another factor that can affect concurrency is an actor. If an actor is enforcing serial execution, then consider declaring doPartialCalculation as nonisolated, so that it allows concurrent execution. Failure to do so can prevent any concurrent execution (with your sleep scenario, for example).
The fact that you saw a significant performance difference when you went from sleep to Task.sleep makes me wonder if you might have done this within an actor. Actors are “reentrant” and Task.sleep suspends execution and lets the actor switch to another task. So it allows concurrency for a series of async methods.
But Task.sleep is not analogous to some computationally intensive task that will tie up the thread. But by declaring the function as nonisolated, that will achieve concurrent execution for computationally intensive processes. That can achieve performance results that are nearly equivalent to what you achieved with a GCD implementation.
That having been said, you might still find that async-await is a tiny bit slower than pure GCD implementations. Then again, Swift concurrency offers more native protections and compile-time warnings to ensure thread-safety.
E.g., here are 100 compute-heavy tasks in both GCD and async-await, performed twice for each:
So, you simply have to ask yourself whether the benefits of async-await warrant the modest performance impact or not.
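To make the nonisolated suggestion above concrete, here is a minimal sketch. The actor and method names (Calculator, partialSum, sum) are hypothetical, and summing integers stands in for the real calculation.

```swift
actor Calculator {
    // `nonisolated` opts this method out of the actor's serial execution,
    // so several calls can run at the same time. It must not touch actor state.
    nonisolated func partialSum(_ slice: ArraySlice<Int>) async -> Int {
        slice.reduce(0, +)
    }

    func sum(of data: [Int], chunks: Int) async -> Int {
        guard !data.isEmpty, chunks > 0 else { return 0 }
        let size = (data.count + chunks - 1) / chunks   // ceiling division
        return await withTaskGroup(of: Int.self, returning: Int.self) { group in
            for start in stride(from: 0, to: data.count, by: size) {
                let slice = data[start..<min(start + size, data.count)]
                // Each child task runs the nonisolated method off the actor.
                group.addTask { await self.partialSum(slice) }
            }
            // Collect partial results as the child tasks finish.
            var total = 0
            for await partial in group {
                total += partial
            }
            return total
        }
    }
}

// Usage: sum 1...100 in four chunks.
let total = await Calculator().sum(of: Array(1...100), chunks: 4)
print(total)  // prints 5050
```

If partialSum were left actor-isolated, the child tasks would have to take turns on the actor, which is the serialization described above.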
A few unrelated asides on the GCD implementation:
It should be noted that your GCD example is not thread-safe and so the comparison of your two code snippets is not entirely fair. You should make the GCD implementation thread-safe. (Perhaps consider temporarily testing with TSAN. See “Detect Data Races Among Your App’s Threads” section of Diagnosing Memory, Thread, and Crash Issues Early.) You should perform doPartialCalculation in parallel, but you must synchronize the update of allPartialResults (or any shared resource). You can use a GCD serial queue for this. Or since you seem to be so concerned about performance, perhaps an NSLock or os_unfair_lock (though care must be taken with the latter). See the GCD example at the end of this answer.
If your dispatched blocks are taking ~50 msec, that simply might not be enough work to justify the overhead of concurrency. You may even find that a simple, serial, rendition is faster!
Often, to maximize the amount of work done per thread, we would “stride” through our index (which is what you appear to be doing with your “slice” logic). But if, even after striding, the time per concurrent loop is still measured in milliseconds, then it may turn out that concurrency is unwarranted altogether. Some tasks are so trivial that they simply will not benefit from concurrent execution.
In your GCD example, you are dispatching to a concurrent queue, which if you have too many iterations, can lead to “thread explosion”, exhausting a very limited worker thread pool. You are only doing three iterations, so that’s not a problem now, but if the number of iterations grows, you would want to abandon that pattern, and adopt concurrentPerform (as seen here). It’s a great way to make full use of the hardware capabilities while avoiding the exhausting of the worker thread pool.
As an aside, I would be wary of using any of the sleep methods as a proxy for a time consuming task. You actually want to keep the CPU busy. I personally use an inefficient π calculation as my general proxy for “do something slow”. That is what I used above.
func performHeavyTask(iteration: Int) {
let id = OSSignpostID(log: poi)
os_signpost(.begin, log: poi, name: #function, signpostID: id, "%d", iteration)
let pi = calculatePi(iterations: 100_000_000)
os_signpost(.end, log: poi, name: #function, signpostID: id, "%f", pi)
}
// calculate pi using Gregory-Leibniz series
func calculatePi(iterations: Int) -> Double {
var result = 0.0
var sign = 1.0
for i in 0 ..< iterations {
result += sign / Double(i * 2 + 1)
sign *= -1
}
return result * 4
}
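The lock-based option mentioned in the asides above might look something like this. It is only a sketch: uppercasedSet is a made-up function, and uppercasing strings stands in for the real per-slice work.

```swift
import Foundation

/// Hypothetical example: processes slices in parallel and merges results under a lock.
func uppercasedSet(of data: [String], iterations: Int) -> Set<String> {
    guard iterations > 0, !data.isEmpty else { return [] }
    let lock = NSLock()
    var results = Set<String>()
    let sliceLength = (data.count + iterations - 1) / iterations  // ceiling division

    DispatchQueue.concurrentPerform(iterations: iterations) { i in
        let start = min(i * sliceLength, data.count)
        let end = min(start + sliceLength, data.count)
        // Stand-in for the real work; runs in parallel and touches no shared state.
        let partial = Set(data[start..<end].map { $0.uppercased() })
        // Only the merge into the shared set is serialized.
        lock.lock()
        defer { lock.unlock() }
        results.formUnion(partial)
    }
    return results
}

// Usage: five items split across three iterations.
let merged = uppercasedSet(of: ["a", "b", "c", "d", "e"], iterations: 3)
```

The point of the design is that the lock is held only for the cheap formUnion call, never for the expensive calculation.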
E.g. here is a GCD example which
uses concurrentPerform;
performs calculation in parallel but synchronizes array updates;
performs update of model on main thread;
uses Sequence<String> rather than [String] to eliminate expensive array creation:
func doCalculation() {
DispatchQueue.global().async { [originalData] in // gives me the willies to see asynchronous routine accessing property, so I might capture it here in case it ever changes to mutable property; or, better, it should be parameter of `doCalculation`
let totalLength = originalData.count
let iterations = 3 // avoid brittle pattern of repeating this number (of values based upon it) repeatedly
let sliceLength = totalLength / iterations
let queue = DispatchQueue(label: "Calculator") // serial queue for synchronization
var allResults = Set<String>()
DispatchQueue.concurrentPerform(iterations: iterations) { i in
let start = i * sliceLength
let end = (i == iterations - 1) ? totalLength : start + sliceLength // last slice absorbs any remainder
let result = self.doPartialCalculation(with: originalData[start..<end]) // do calculation in parallel
queue.sync { allResults.formUnion(result) } // synchronize update
}
// personally, I would not update a property from this method,
// but rather would use local var and supply the results in a completion
// handler parameter, and let caller update model as it sees fit.
//
// But if you are going to do this, synchronize the update somehow,
// e.g., do it on the main thread.
DispatchQueue.main.async { // update on main thread
self.calculatedData = allResults // or `self.calculatedData.formUnion(allResults)`, if that's what you really mean
}
}
}
// note, rather than taking `[String]`, which requires us to create a new
// `Array` instance, let's change this to take `Sequence<String>` as
// input ... that way we can supply array slices directly
func doPartialCalculation<S>(with data: S) -> Set<String> where S: Sequence, S.Element == String {
print("began")
sleep(2)
let someResultSet: Set<String> = ["some result"]
print("ended")
return someResultSet
}
Or, alternatively, you could do the updates of the local var asynchronously and keep track of them with a DispatchGroup, performing the final update (or call to the completion handler) on the .main queue:
func doCalculation() {
DispatchQueue.global().async { [originalData] in // gives me the willies to see asynchronous routine accessing property, so I might capture it here in case it ever changes to mutable property; or, better, it should be parameter of `doCalculation`
let totalLength = originalData.count
let iterations = 3 // avoid brittle pattern of repeating this number (of values based upon it) repeatedly
let sliceLength = totalLength / iterations
let queue = DispatchQueue(label: "Calculator") // serial queue for synchronization
let group = DispatchGroup()
var allResults = Set<String>()
DispatchQueue.concurrentPerform(iterations: iterations) { i in
let start = i * sliceLength
let end = (i == iterations - 1) ? totalLength : start + sliceLength // last slice absorbs any remainder
let result = self.doPartialCalculation(with: originalData[start..<end]) // do calculation in parallel
queue.async(group: group) { allResults.formUnion(result) } // synchronize update
}
// personally, I would not update a property from this method,
// but rather would use local var and supply the results in a completion
// handler parameter, and let caller update model as it sees fit.
//
// But if you are going to do this, synchronize the update somehow,
// e.g., do it on the main thread.
group.notify(queue: .main) {
self.calculatedData = allResults // or `self.calculatedData.formUnion(allResults)`, if that's what you really mean
}
}
}
You can benchmark this and see whether the asynchronous update has any material impact. It probably will not in this case, but the proof is in the pudding.
Your Task-based example looks like it should execute concurrently. I ran it and am able to get concurrent execution.
Probably the issue you're having is that Swift concurrency tries to limit Task concurrency to the number of available cores. And (I don't think this is well documented!) Swift playgrounds and the iOS simulators seem to execute in a single-core environment.
So if you run your code in a Swift playground, you'll get serial task execution. If you make a Mac app and run it in that, or on an iOS device, you should get parallel execution.
This WWDC talk from last year has a discussion of why it works that way: https://developer.apple.com/videos/play/wwdc2021/10254/?time=652
That's worth paying attention to. You'll of course be fine scheduling 3 blocks on a concurrent queue, but if your example is standing in for a real workload that might have hundreds or thousands, it's easy to cause thread explosion and create new, harder to understand performance issues.
I was trying to fix a 300 MB memory leak. After finding the cause
(calls to NSString's stringWithUTF8String: from a C++ thread, without an @autoreleasepool block wrapper),
I edited the code to enforce reference counting (instead of autorelease), something like below:
public func withNSString(
_ chars: UnsafePointer<Int8>,
_ callback: (NSString) -> Void
) {
let result: NSString = NSString(utf8String: chars)!;
callback(result);
}
As a personal policy, I covered it with a unit test, like:
import Foundation
import XCTest
@testable import MyApp

class AppTest: XCTestCase {
    func testWithNSString_hasNoMemoryLeak() {
        weak var weakRef: NSString? = nil
        autoreleasepool {
            let chars = ("some data" as NSString).utf8String!;
            withNSString(chars, { strongRef in
                weakRef = strongRef;
                XCTAssertNotNil(weakRef);
            })
            // Checks if reference-counting is used.
            XCTAssertNil(weakRef); // Fails, so no reference-counting.
        }
        // Checks if autoreleased.
        XCTAssertNil(weakRef); // Fails, OMG! what is this?
    }
}
But now, not even auto-release seems to work anymore (-_- )
Why does last XCTAssertNil call fail?
(In other words, how can I fix memory-leaks?)
The problem is that you're using a very short string. It's getting inlined onto the stack, so it's not released until the entire stack frame goes out of scope. If you made the string a little bit longer (2 characters longer), this would behave the way you expect. This is an implementation detail, of course, and could change due to different versions of the compiler, different versions of the OS, different optimization settings, or different architectures.
Keep in mind that testing this kind of thing with static strings of any kind can be tricky, since static strings are placed into the binary. So if the compiler notices that you've indirectly made a pointer to a static string, then it might optimize out the indirection and not release it.
In none of these cases is there a memory leak, though. Your memory leak is more likely in the calling code of withNSString. I would mostly suspect that you're not properly dealing with the bytes passed as chars. We would need to see more about why you think there's a leak to evaluate that. (Foundation also has some small leaks, and Instruments has false positives on leaks, so if you're chasing an allocation that is smaller than 50 bytes and doesn't recur on every operation, you probably are chasing ghosts.)
Note that this is a bit dangerous:
let chars = ("some data" as NSString).utf8String!
withNSString(chars, { strongRef in
The utf8String inner pointer is not promised to live longer than the NSString, and Swift is free to destroy objects after their last reference (which may be before they go out of scope). As the docs note:
This C string is a pointer to a structure inside the string object, which may have a lifetime shorter than the string object and will certainly not have a longer lifetime. Therefore, you should copy the C string if it needs to be stored outside of the memory context in which you use this property.
In this case the object is a constant string, which is in the binary and cannot be destroyed. But in more general cases this is a classic cause of crashes. I would highly recommend moving away from the NSString interfaces and using String. It offers utf8CString, which returns a proper ContiguousArray, which is much safer.
let chars = "some data".utf8CString
chars.withUnsafeBufferPointer { buffer in
withNSString(buffer.baseAddress!, { strongRef in
weakRef = strongRef;
XCTAssertNotNil(weakRef);
})
}
withUnsafeBufferPointer ensures that chars cannot be destroyed before the block completes.
You can also ensure the lifetime of the string if needed (this is mostly useful for fixing older code you don't want to rewrite in safer ways):
let string = "some data"
withExtendedLifetime(string) {
let chars = string.utf8CString
chars.withUnsafeBufferPointer { buffer in
withNSString(buffer.baseAddress!, { strongRef in
weakRef = strongRef;
XCTAssertNotNil(weakRef);
})
}
}
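If the string has to outlive the C buffer, the safest route is usually to copy the bytes, as the documentation quoted above suggests. A minimal sketch, assuming a hypothetical Swift-only variant of the helper (withCopiedString) that hands the callback a String instead of an NSString:

```swift
/// Hypothetical variant of withNSString: copies the C string into a Swift String.
func withCopiedString(_ chars: UnsafePointer<CChar>, _ callback: (String) -> Void) {
    // String(cString:) copies the bytes, so the result does not depend on
    // the lifetime of the memory `chars` points to.
    let copy = String(cString: chars)
    callback(copy)
}

// Usage: the pointer is only valid inside withCString, but the copy survives it.
var received = ""
"some data".withCString { chars in
    withCopiedString(chars) { received = $0 }
}
```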
I have been looking into potential use cases of UnsafePointer and the related UnsafeX types in Swift, and am wondering what the use case is in Swift. It sounds like the main use case is performance, but at the same time types are supposed to enable compiler optimizations and therefore performance, so I'm not sure when they are actually useful. I would like to know whether everything can be refactored to not use them with the same or better performance, or, if not, to see a specific example, with a description and perhaps some code or pseudocode, that demonstrates how they offer a performance advantage. I would basically like to have a reference example demoing a performance advantage of unsafe pointers and unsafe stuff.
Some things I've found related to Swift:
However, UnsafePointer is an important API for interoperability and building high performance data structures. - http://atrick.github.io/proposal/voidpointer.html
But typing allows for compiler optimizations. I'm wondering what advantages using the Unsafe features gives you.
True Unsafe Code Performance
https://nbsoftsolutions.com/blog/high-performance-unsafe-c-code-is-a-lie
https://www.reddit.com/r/csharp/comments/67oi9p/can_anyone_enlighten_me_on_why_unsafe_code_is/
https://medium.com/@vCabbage/go-are-pointers-a-performance-optimization-a95840d3ef85
Some places you see the use of this is in Metal code, such as here:
// Create buffers used in the shader
guard let uniformBuffer = device.makeBuffer(length: MemoryLayout<Uniforms>.stride) else { throw Error.failedToCreateMetalBuffer(device: device) }
uniformBuffer.label = "me.dehesa.metal.buffers.uniform"
uniformBuffer.contents().bindMemory(to: Uniforms.self, capacity: 1)
// or here
let ptr = uniformsBuffer.contents().assumingMemoryBound(to: Uniforms.self)
ptr.pointee = Uniforms(modelViewProjectionMatrix: modelViewProjectionMatrix, modelViewMatrix: modelViewMatrix, normalMatrix: normalMatrix)
I don't really understand what's going on with the pointers too well yet, but I wanted to ask to see if these use cases offer performance enhancements or if they could be refactored to use a safe version that had similar or even better performance.
Saw it here too:
func setBit(_ index: Int, value: Bool, pointer: UnsafeMutablePointer<UInt8>) {
let bit: UInt8 = value ? 0xFF : 0
pointer.pointee ^= (bit ^ pointer.pointee) & (1 << UInt8(index))
}
More metal:
uniforms = UnsafeMutableRawPointer(uniformBuffer.contents()).bindMemory(to:GUniforms.self, capacity:1)
vertexBuffer = device?.makeBuffer(length: 3 * MemoryLayout<GVertex>.stride * 6, options: .cpuCacheModeWriteCombined)
vertices = UnsafeMutableRawPointer(vertexBuffer!.contents()).bindMemory(to:GVertex.self, capacity:3)
vertexBuffer1 = device?.makeBuffer(length: maxCount * maxCount * MemoryLayout<GVertex>.stride * 4, options: .cpuCacheModeWriteCombined)
vertices1 = UnsafeMutableRawPointer(vertexBuffer1!.contents()).bindMemory(to:GVertex.self, capacity: maxCount * maxCount * 4)
Stuff regarding images:
func mapIndicesRgba(_ imageIndices: Data, size: Size2<Int>) -> Data {
let palette = self
var pixelData = Data(count: size.area * 4)
pixelData.withUnsafeMutableBytes() { (pixels: UnsafeMutablePointer<UInt8>) in
imageIndices.withUnsafeBytes { (indices: UnsafePointer<UInt8>) in
var pixel = pixels
var raw = indices
for _ in 0..<(size.width * size.height) {
let colorIndex = raw.pointee
pixel[0] = palette[colorIndex].red
pixel[1] = palette[colorIndex].green
pixel[2] = palette[colorIndex].blue
pixel[3] = palette[colorIndex].alpha
pixel += 4
raw += 1
}
}
}
return pixelData
}
Stuff regarding input streams:
fileprivate extension InputStream {
fileprivate func loadData(sizeHint: UInt) throws -> Data {
let hint = sizeHint == 0 ? BUFFER_SIZE : Int(sizeHint)
var buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: hint)
var totalBytesRead = read(buffer, maxLength: hint)
while hasBytesAvailable {
let newSize = totalBytesRead * 3 / 2
// Ehhhh, Swift Foundation's Data doesnt have `increaseLength(by:)` method anymore
// That is why we have to go the `realloc` way... :(
buffer = unsafeBitCast(realloc(buffer, MemoryLayout<UInt8>.size * newSize), to: UnsafeMutablePointer<UInt8>.self)
totalBytesRead += read(buffer.advanced(by: totalBytesRead), maxLength: newSize - totalBytesRead)
}
if streamStatus == .error {
throw streamError!
}
// FIXME: Probably should use Data(bytesNoCopy: .. ) instead, but will it deallocate the tail of not used buffer?
// leak check must be done
let retVal = Data(bytes: buffer, count: totalBytesRead)
free(buffer)
return retVal
}
}
http://metalkit.org/2017/05/26/working-with-memory-in-metal-part-2.html
Swift's semantics allow it to make copies of certain data types for safety when reading and potentially writing non-atomic-sized chunks of memory (copy-on-write allocations, etc.). This data copy operation possibly requires a memory allocation, which potentially can cause a lock with unpredictable latency.
An unsafe pointer can be used to pass a reference to a (possibly) mutable array (or block of bytes), or slice thereof, that should not be copied, no matter how (unsafely) accessed or passed around between functions or threads. This potentially reduces the need for the Swift runtime to do as many memory allocations.
I had one prototype iOS application where Swift was spending significant percentages of CPU (and likely the user’s battery life) allocating and copying multi-megabyte-sized slices of regular Swift arrays passed to functions at a very high rate, some mutating, some not mutating them (for near-real-time RF DSP analysis). A large GPU texture, sub-texture-slice accessed each frame refresh, possibly could have similar issues. Switching to unsafe pointers referencing C allocations of memory stopped this performance/battery waste in my vanilla Swift prototype (the extraneous allocate and copy operations disappeared from the performance profiling).
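As a small illustration of the idea (a sketch, not the code from that prototype; applyGain is a made-up function), an array's storage can be mutated in place through a buffer pointer, so no slices are allocated or copied while the samples are processed:

```swift
/// Hypothetical DSP-style helper: scales every sample in place.
func applyGain(_ gain: Float, to samples: inout [Float]) {
    // The closure receives a pointer to the array's own storage, so the
    // loop mutates the existing buffer instead of building a new array.
    samples.withUnsafeMutableBufferPointer { buffer in
        for i in buffer.indices {
            buffer[i] *= gain
        }
    }
}

// Usage
var samples: [Float] = [1, 2, 3, 4]
applyGain(0.5, to: &samples)
print(samples)  // prints [0.5, 1.0, 1.5, 2.0]
```

Whether this actually beats plain subscripting depends on the optimizer and the access pattern, so it is worth profiling both versions before committing to the unsafe one.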
Below is my source code. Every time I execute the function, the memory usage increases dramatically. Please help point out what the problem is.
func loadfontsFromDatabase(code:String)->[String] {
let documentsPath : AnyObject = NSSearchPathForDirectoriesInDomains(.documentDirectory,.userDomainMask,true)[0] as AnyObject
let databasePath = documentsPath.appending("/bsmcoding.sqlite")
let contactDB = FMDatabase(path: databasePath as String)
var c:[String]=[]
let querySQL = "SELECT FONT FROM BSMCODE WHERE BSMCODE.CODE = '\(code)' ORDER BY NO DESC"
NSLog("query:\(querySQL)")
let results:FMResultSet? = Constants.contactDB?.executeQuery(querySQL, withArgumentsIn: nil)
while (results?.next())! {
c.append((results?.string(forColumn: "FONT"))!)
}
results?.close()
return c
}
There's nothing here that would account for any substantial memory loss. I would suggest using the "Debug Memory Graph" feature in Xcode 8 to identify what objects are being created and not being released, but I suspect the problem rests elsewhere in your code. Or use Instruments to track down what's leaking and debug from there. See https://stackoverflow.com/a/30993476/1271826.
There are unrelated issues here, though:
You are creating local contactDB, but you never open it and you never use it. It will be released when the routine exits, but it's completely unnecessary if you're going to use Constants.contactDB, anyway.
I'd advise against using string interpolation when building your SQL. Use ? placeholder and pass the code in as a parameter. This is much safer, in case the code ever contained something that couldn't be represented in SQL statement. (This is especially true if the code was supplied by the user, in which case you'd be susceptible to SQL injection attacks or innocent input errors that could lead to crashes.)
For example, you could do something like:
func loadfontsFromDatabase(code: String) -> [String] {
var c = [String]()
let querySQL = "SELECT FONT FROM BSMCODE WHERE BSMCODE.CODE = ? ORDER BY NO DESC"
let results = try! Constants.contactDB!.executeQuery(querySQL, values: [code])
while results.next() {
c.append((results.string(forColumn: "FONT"))!)
}
return c
}
If you don't like the forced unwrapping, you can do optional unwrapping if you want, but personally I'd rather know immediately when debugging during the development phase if there's some logic mistake (e.g. the contactDB wasn't open, the SQL is incorrect, etc.). But you can do optional binding and add the necessary guard statements if you want. But don't just do optional binding and silently return a value suggesting that everything is copacetic, leaving you with a debugging challenge of tracking down the problem if you don't get what you expected.
But the key point is to avoid inserting values into your SQL directly. Use ? placeholders.
OK... so I have no idea why this happens but:
Compare the following two lines:
let pointCurve: [AnyObject] = self.curve.map{NSValue(point:$0)}
and
let pointCurve: [NSPoint] = self.curve.map{$0}
In either case, the variable is local and not used at all after assignment. The line resides in a method that is called repeatedly and very quickly. The first case results in terrible and ever faster growing of memory usage. But when I change it to the second, the memory stats are flat as a disc.
You may say, "oh, you're not doing anything in the second line". So I tried the following:
var pointCurve: [AnyObject] = []
for c in self.curve {
    pointCurve.append(NSValue(point: NSPoint(x: 1, y: 1)))
}
vs
var pointCurve: [NSPoint] = []
for c in self.curve {
    pointCurve.append(NSPoint(x: 1, y: 1))
}
Now I see the exact same results. The culprit seems to be NSValue. I checked with Instruments that a whole bunch of NSConcreteValues are allocated, and I read online these are related to NSValue. But I didn't find anything about them causing memory leaks.
The question is what can I do about this. I'm supposed to send an array of points to some ObjC code, and until I figure out how to fix this, I can't do it without huge performance issues.
Try:
func pointCurvy() {
autoreleasepool {
let pointCurve: [AnyObject] = self.curve.map{NSValue(point:$0)}
// Do something with pointCurve.
}
}