Questions about the Dart memory policy - flutter

Suppose you have the following code.
List<File> files = [file1, file2, file3, file4];
Each file is 10mb in size.
So how much memory will it use? It sounds simple, but it's confusing.
It will use 40mb. Because it had 4 10mb files.
I don't know exactly, but I'll use much less than 40mb.This is because memory is not used as much as the actual size of the file, but as much as the size of a pointer pointing to the memory address of the file.
1 or 2?

The currently accepted answer is incorrect. Dart is always pass-by-value and whether it is or not is irrelevant to this question.
Now for your question. The amount of memory that you're using when working with files depends on what you're doing with them. The example you show is a list of file handles, not the files themselves. It would be inefficient to load the contents of a file into memory just by instantiating a reference to the file with a File object.
So your list is a list of file handles. This makes both of your proposed explanations incorrect since the data is not loaded into memory, though you're on the right track with the second explanation. The contents of the file is not in memory, so it will not use 40 MB of memory for 4 10 MB files. It's also not a pointer to the memory either. A pointer would still indicate that the file data is in memory and you File object just an address for that block. It's more accurate to says that it's a reference to the file given by the OS or a reference to the file's place in storage. The amount of memory that the file handle will take up is going to be relatively small, though I don't know any exact values.
Data will be loaded into memory if you explicitly request that data with a function like readAsBytes. Then the whole file will be loaded into memory and it will likely use the full 10 MB per file. There might be some optimization I don't know about, but it will be closer to the full size of the file. This will of course be in a separate variable and the File objects themselves will take up very little memory.

Dart uses pass-by-value for primitive types like int or String and uses pass-by-reference for more complex data types like List. Here the files are probably an object so it will take on the memory of the pointers.
This is a simple example for pass-by-value.
void main() {
void addOne(int x) {
x += 1;
print("x here is $x");
}
int x = 1;
addOne(x); //Output: "x here is 2"
print(x); //1
}
Here the value "x" is passed by value so, even if the function incremented x, the value in the main function will still remain 1.
example for pass-by-reference:
void main() {
void addList(List<int> x) {
x.add(12);
print("x here is $x");
}
List<int> x = [1, 2, 3, 4, 5];
addList(x); //x here is [1, 2, 3, 4, 5, 12]
print(x); //[1, 2, 3, 4, 5, 12]
}
Here the value of x is passed by reference so it would get updated in the main function as well
Passing the reference means to pass the memory location and not the actual value which is very efficient in case of big list or arrays or even complex objects. It can save much more memory than pass-by-value.
In your case the files variable is a List of File(List<File>) objects. That means that file1, file2, etc are File objects and shall thus be passed-by-reference in dart

Related

Thread safe operations on XDP

I was able to confirm from the documentation that bpf_map_update_elem is an atomic operation if done on HASH_MAPs. Source (https://man7.org/linux/man-pages/man2/bpf.2.html). [Cite: map_update_elem() replaces existing elements atomically]
My question is 2 folds.
What if the element does not exist, is the map_update_elem still atomic?
Is the XDP operation bpf_map_delete_elem thread safe from User space program?
The map is a HASH_MAP.
Atomic ops, race conditions and thread safety are sort of complex in eBPF, so I will make a broad answer since it is hard to judge from your question what your goals are.
Yes, both the bpf_map_update_elem command via the syscall and the helper function update the maps 'atmomically', which in this case means that if we go from value 'A' to value 'B' that the program always sees either 'A' or 'B' not some combination of the two(first bytes of 'B' and last bytes of 'A' for example). This is true for all map types. This holds true for all map modifying syscall commands(including bpf_map_delete_elem).
This however doesn't make race conditions impossible since the value of the map may have changed between a map_lookup_elem and the moment you update it.
What is also good to keep in mind is that the map_lookup_elem syscall command(userspace) works differently from the helper function(kernelspace). The syscall will always return a copy of the data which isn't mutable. But the helper function will return a pointer to the location in kernel memory where the map value is stored, and you can directly update the map value this way without using the map_update_elem helper. That is why you often see hash maps used like:
value = bpf_map_lookup_elem(&hash_map, &key);
if (value) {
__sync_fetch_and_add(&value->packets, 1);
__sync_fetch_and_add(&value->bytes, skb->len);
} else {
struct pair val = {1, skb->len};
bpf_map_update_elem(&hash_map, &key, &val, BPF_ANY);
}
Note that in this example, __sync_fetch_and_add is used to update parts of the map value. We need to do this since updating it like value->packets++; or value->packets += 1 would result in a race condition. The __sync_fetch_and_add emits a atomic CPU instruction which in this case fetches, adds and writes back all in one instruction.
Also, in this example, the two struct fields are atomically updated, but not together, it is still possible for the packets to have incremented but bytes not yet. If you want to avoid this you need to use a spinlock(using the bpf_spin_lock and bpf_spin_unlock helpers).
Another way to sidestep the issue entirely is to use the _PER_CPU variants of maps, where you trade-off congestion/speed and memory use.

Is there a way to accurately measure heap allocations in Unity for unit testing?

Allocating temporary objects on the heap every frame in Unity is costly, and we all do our best to avoid this by caching heap objects and avoiding garbage generating functions. It's not always obvious when something will generate garbage though. For example:
enum MyEnum {
Zero,
One,
Two
}
List<MyEnum> MyList = new List<MyEnum>();
MyList.Contains(MyEnum.Zero); // Generates garbage
MyList.Contains() generates garbage because the default equality comparer for List uses objects which causes boxing of the enum value types.
In order to prevent inadvertent heap allocations like these, I would like to be able to detect them in my unit tests.
I think there are 2 requirements for this:
A function to return the amount of heap allocated memory
A way to prevent garbage collection occurring during the test
I haven't found a clean way to ensure #2. The closest thing I've found for #1 is GC.GetTotalMemory()
[UnityTest]
IEnumerator MyTest()
{
long before = GC.GetTotalMemory(false);
const int numObjects = 1;
for (int i = 0 ; i < numObjects; ++i)
{
System.Version v = new System.Version();
}
long after = GC.GetTotalMemory(false);
Assert.That(before == after);
}
The problem is that GC.GetTotalMemory() returns the same value before and after in this test. I suspect that Unity/Mono only allocates memory from the system heap in chunks, say 4kb, so you need to allocate <= 4kb before Unity/Mono will actually request more memory from the system heap, at which point GC.GetTotalMemory() will return a different value. I confirmed that if I change numObjects to 1000, GC.GetTotalMemory() returns different values for before and after.
So in summary, 1. how can i accurately get amount of heap allocated memory, accurate to the byte and 2. can the garbage collector run during the body of my test, and if so, is there any non-hacky way of disabling GC for the duration of my test
TL;DR
Thanks for your help!
I posted the same question over on Unity answers and got a reply:
https://answers.unity.com/questions/1535588/is-there-a-way-to-accurately-measure-heap-allocati.html
No, basically it's not possible. You could run your unit test a bunch of times in a loop and hope that it generates enough garbage to cause a change in the value returned by GC.GetTotalMemory(), but that's about it.

Swift pointer from ArraySlice

I'm trying to determine the "Swift-y" way of creating my own contiguous memory containers (in my particular case, I'm building n-dimensional arrays). I want my containers to be as close to Swift's builtin Array as possible - in terms of functionality and usability.
I need to access the pointer to memory of my containers for stuff like Accelerate and BLAS operations.
I want to know whether an ArraySlice's pointer would point to the first element of the slice, or the first element of its base.
When I tried to test UnsafePointer<Int>(array) == UnsafePointer<Int>(array[1...2]) it looks like Swift doesn't allow pointer construction from ArraySlices (or I just did it incorrectly).
I'm looking for advice on which way would be the most "Swift-y"?
I understand that when slicing an array the follow is true:
let array = [1, 2, 3]
array[1] == array[1...2][1]
and
array[1...2][0] != 2 # index out of bounds error
In other words, indexing is always performed relative to the base Array.
Which suggests: that we should return a pointer to the base's first element. Because slices are relative to their base.
However, iteration through a slice (obviously) only considers elements of that slice:
for i in array[1..2] # i takes on 2 followed by 3
Which suggests: that we should return a pointer to the slice's first element. Because slices have their own starting point.
If my user wanted to operate on a slice in a BLAS operation it would be intuitive to expect:
mmul(matrix1[1...2, 0...1].pointer, matrix2[4...5, 0...1].pointer)
to point to the first elements of slice, but I don't know if this is the way a Swift ArraySlice would do things.
My Question: Should a container slice object's pointer point to the first element of the slice, or, the first element of the base container.
This operation is unsafe:
UnsafePointer<Int>(array)
What you mean is:
array.withUnsafeBufferPointer { ... }
This applies to your types as well, and is the pattern you should employ to interoperate with BLAS and Accelerate. You should not try to use a pointer method IMO.
There is no promise that array will continue to exist by the time you actually access the pointer, even if that happens in the same line of code. ARC is free to destroy that memory shockingly quickly.
UnsafeBufferPointer is actually a very nice type in that it is already promised to be contiguous and it behaves as a Collection.
My suggestion here would be to manage your own memory internally, probably with a ManagedBuffer, but maybe just with a UnsafeMutablePointer that you alloc and destroy yourself. It's very important that you manage the layout of the data so that it's compatible with Accelerate. You don't want Array<Array<UInt8>>. That's going to add too much structure. You want a blob of bytes that you index into in the good-ol' C ways (row*width+column, etc). You probably don't want your slices to return pointers at all directly. Your mmul function is likely going to need special logic to understand how to pull the pieces it needs out of slices with minimal copying so that it works with vDSP_mmul. "Generic" and "Accelerate" seldom go together.
For example, considering this:
mmul(matrix1[1...2, 0...1].pointer, matrix2[4...5, 0...1].pointer)
(Obviously I assume your real matrices are dramatically larger; this kind of matrix doesn't make much sense to send to vDSP.)
You're going to have to write your own mmul here obviously since this memory isn't laid out correctly. So you might as well pass the slices. Then you'd do something like (totally untested, uncompiled, and I'm sure the syntax is wildly wrong):
mmul(m1: MatrixSlice, m2: MatrixSlice) -> Matrix {
var s1 = UnsafeMutablePointer<Float>.alloc(m1.rows * m1.columns)
// use vDSP_vgathr to copy each sliced row out of m1 into s1
var s2 = UnsafeMutablePointer<Float>.alloc(m2.rows * m2.columns)
// use vDSP_vgathr to copy each sliced row out of m2 into s2
var result = UnsafeMutablePointer<Float>.alloc(m1.rows * m2.columns)
vDSP_mmul(s1, 1, s2, 1, result, 1, m1.rows, m2.columns, m1.columns)
s1.destroy()
s2.destroy()
// This will need to call result.move() or moveInitializeFrom or something
return Matrix(result)
}
I'm just throwing out stuff here, but this is probably the kind of structure you'd want.
To your underlying question about whether the pointer to the container or to the data is usually passed by Swift, the answer is unfortunately "magic" for Array and no one else. Passing an Array to something that wants a pointer will magically (by the compiler, not the stdlib) pass a pointer to the storage of the Array. No other type gets this magic. Not even ContiguousArray gets this magic. If you pass a ContiguousArray to something that wants a pointer, you'll pass the pointer to the container (and if it's mutable, corrupt the container; true story, hated that one…)
Thanks in part to #RobNapier the answer to my question is: ArraySlice's pointer should point to the slice's first element.
The way I verified this was simply:
var array = [5,4,3,325,67,7,3]
array.withUnsafeBufferPointer{ $0 } != array[3...6].withUnsafeBufferPointer{ $0 } # true
^--- points to 5's address ^--- points to 325's address

How to cast [Int8] to [UInt8] in Swift

I have a buffer that contains just characters
let buffer: [Int8] = ....
Then I need to pass this to a function process that takes [UInt8] as an argument.
func process(buffer: [UInt8]) {
// some code
}
What would be the best way to pass the [Int8] buffer to cast to [Int8]? I know following code would work, but in this case the buffer contains just bunch of characters, and it is unnecessary to use functions like map.
process(buffer.map{ x in UInt8(x) }) // OK
process([UInt8](buffer)) // error
process(buffer as! [UInt8]) // error
I am using Xcode7 b3 Swift2.
I broadly agree with the other answers that you should just stick with map, however, if your array were truly huge, and it really was painful to create a whole second buffer just for converting to the same bit pattern, you could do it like this:
// first, change your process logic to be generic on any kind of container
func process<C: CollectionType where C.Generator.Element == UInt8>(chars: C) {
// just to prove it's working...
print(String(chars.map { UnicodeScalar($0) }))
}
// sample input
let a: [Int8] = [104, 101, 108, 108, 111] // ascii "Hello"
// access the underlying raw buffer as a pointer
a.withUnsafeBufferPointer { buf -> Void in
process(
UnsafeBufferPointer(
// cast the underlying pointer to the type you want
start: UnsafePointer(buf.baseAddress),
count: buf.count))
}
// this prints [h, e, l, l, o]
Note withUnsafeBufferPointer means what it says. It’s unsafe and you can corrupt memory if you get this wrong (be especially careful with the count). It works based on your external knowledge that, for example, if any of the integers are negative then your code doesn't mind them becoming corrupt unsigned integers. You might know that, but the Swift type system can't, so it won't allow it without resort to the unsafe types.
That said, the above code is correct and within the rules and these techniques are justifiable if you need the performance edge. You almost certainly won’t unless you’re dealing with gigantic amounts of data or writing a library that you will call a gazillion times.
It’s also worth noting that there are circumstances where an array is not actually backed by a contiguous buffer (for example if it were cast from an NSArray) in which case calling .withUnsafeBufferPointer will first copy all the elements into a contiguous array. Also, Swift arrays are growable so this copy of underlying elements happens often as the array grows. If performance is absolutely critical, you could consider allocating your own memory using UnsafeMutablePointer and using it fixed-size style using UnsafeBufferPointer.
For a humorous but definitely not within the rules example that you shouldn’t actually use, this will also work:
process(unsafeBitCast(a, [UInt8].self))
It's also worth noting that these solutions are not the same as a.map { UInt8($0) } since the latter will trap at runtime if you pass it a negative integer. If this is a possibility you may need to filter them first.
IMO, the best way to do this would be to stick to the same base type throughout the whole application to avoid the whole need to do casts/coercions. That is, either use Int8 everywhere, or UInt8, but not both.
If you have no choice, e.g. if you use two separate frameworks over which you have no control, and one of them uses Int8 while another uses UInt8, then you should use map if you really want to use Swift. The latter 2 lines from your examples (process([UInt8](buffer)) and
process(buffer as! [UInt8])) look more like C approach to the problem, that is, we don't care that this area in memory is an array on singed integers we will now treat it as if it is unsigneds. Which basically throws whole Swift idea of strong types to the window.
What I would probably try to do is to use lazy sequences. E.g. check if it possible to feed process() method with something like:
let convertedBuffer = lazy(buffer).map() {
UInt8($0)
}
process(convertedBuffer)
This would at least save you from extra memory overhead (as otherwise you would have to keep 2 arrays), and possibly save you some performance (thanks to laziness).
You cannot cast arrays in Swift. It looks like you can, but what's really happening is that you are casting all the elements, one by one. Therefore, you can use cast notation with an array only if the elements can be cast.
Well, you cannot cast between numeric types in Swift. You have to coerce, which is a completely different thing - that is, you must make a new object of a different numeric type, based on the original object. The only way to use an Int8 where a UInt8 is expected is to coerce it: UInt8(x).
So what is true for one Int8 is true for an entire array of Int8. You cannot cast from an array of Int8 to an array of UInt8, any more than you could cast one of them. The only way to end up with an array of UInt8 is to coerce all the elements. That is exactly what your map call does. That is the way to do it; saying it is "unnecessary" is meaningless.

async_work_group_copy from __constant

Code like this:
__constant char a[1] = "x";
...
__local char b[1];
async_work_group_copy(b, a, 1, 0);
throws a compile error:
no instance of overloaded function "async_work_group_copy" matches the argument list
So it seems that this function cannot be used to copy from __constant address space. Am I right? If yes, what's the preferred method to make a copy of __constant data to __local memory for faster access? Now I use a simple for loop, where each workitem copies several elements.
async_work_group_copy() is defined to copy between local and global memory only (see here: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/).
As far as I know, there is no method to perform bulk copy from constant to local memory. Maybe the reason is that constant memory is actually cached on all GPUs that I know of, which essentially means that it works at the same speed as local memory.
The vloadn() family of functions can load whole vectors for all types of memory, including constant, so that may partially match what you need. However, it is not bulk copy.