Thread safe operations on XDP - ebpf

I was able to confirm from the documentation that bpf_map_update_elem is an atomic operation if done on HASH_MAPs. Source (https://man7.org/linux/man-pages/man2/bpf.2.html). [Cite: map_update_elem() replaces existing elements atomically]
My question is 2 folds.
What if the element does not exist, is the map_update_elem still atomic?
Is the XDP operation bpf_map_delete_elem thread safe from User space program?
The map is a HASH_MAP.

Atomic ops, race conditions and thread safety are sort of complex in eBPF, so I will make a broad answer since it is hard to judge from your question what your goals are.
Yes, both the bpf_map_update_elem command via the syscall and the helper function update the maps 'atmomically', which in this case means that if we go from value 'A' to value 'B' that the program always sees either 'A' or 'B' not some combination of the two(first bytes of 'B' and last bytes of 'A' for example). This is true for all map types. This holds true for all map modifying syscall commands(including bpf_map_delete_elem).
This however doesn't make race conditions impossible since the value of the map may have changed between a map_lookup_elem and the moment you update it.
What is also good to keep in mind is that the map_lookup_elem syscall command(userspace) works differently from the helper function(kernelspace). The syscall will always return a copy of the data which isn't mutable. But the helper function will return a pointer to the location in kernel memory where the map value is stored, and you can directly update the map value this way without using the map_update_elem helper. That is why you often see hash maps used like:
value = bpf_map_lookup_elem(&hash_map, &key);
if (value) {
__sync_fetch_and_add(&value->packets, 1);
__sync_fetch_and_add(&value->bytes, skb->len);
} else {
struct pair val = {1, skb->len};
bpf_map_update_elem(&hash_map, &key, &val, BPF_ANY);
}
Note that in this example, __sync_fetch_and_add is used to update parts of the map value. We need to do this since updating it like value->packets++; or value->packets += 1 would result in a race condition. The __sync_fetch_and_add emits a atomic CPU instruction which in this case fetches, adds and writes back all in one instruction.
Also, in this example, the two struct fields are atomically updated, but not together, it is still possible for the packets to have incremented but bytes not yet. If you want to avoid this you need to use a spinlock(using the bpf_spin_lock and bpf_spin_unlock helpers).
Another way to sidestep the issue entirely is to use the _PER_CPU variants of maps, where you trade-off congestion/speed and memory use.

Related

Is BPF_LOOP for Linkedlist iteration, viable?, if not what other way I can do it

I am trying to iterate a linked list in BPF, can I use bpf_loop for this?, if so how?
Tried using bpf_loops, but it required a fixed number to bound the loops, what way I can do that?
These are the comments for bpf_loop.
bpf_loop
For nr_loops, call callback_fn function
with callback_ctx as the context parameter.
The callback_fn should be a static function and
the callback_ctx should be a pointer to the stack.
The flags is used to control certain aspects of the helper.
Currently, the flags must be 0. Currently, nr_loops is
limited to 1 << 23 (~8 million) loops.
long (*callback_fn)(u32 index, void *ctx);
where index is the current index in the loop. The index
is zero-indexed.
If callback_fn returns 0, the helper will continue to the next
loop. If return value is 1, the helper will skip the rest of
the loops and return. Other return values are not used now,
and will be rejected by the verifier.
Returns
The number of loops performed, -EINVAL for invalid flags,
-E2BIG if nr_loops exceeds the maximum number of loops.
You can pass the linked list via the callback_ctx which is shared between iterations.
You can hardcode the number of loops to the max of 1 << 23 (~8 million) (perhaps less, it depends on how the verifier imposes penalties).
If you can advance the linked list you do so and perform whatever logic you want.
If you are done or at the end of the linked list you return 1 to
break out of the loop.
If you expect to need more than 8M iterations, you can at least detect when you are not done by looking at the return value (perhaps running a second loop, I don't know if that is allowed or not)

Swift Mutating Data at Range

I am building a Data object that looks something like this:
struct StructuredData {
var crc: UInt16
var someData: UInt32
var someMoreData: UInt64
// etc.
}
I'm running a CRC algorithm that will start at byte 2 and process length 12.
When the CRC is returned, it must exist at the beginning of the Data object. As I see it, my options are:
Generate a Data object that does not include the CRC, process it, and then build another Data object that does (so that the CRC value that I now have will be at the start of the Data object.
Generate the data object to include a zero'd out CRC to start with and then mutate the data at range [0..<2].
Obviously, 2 would be preferable, as it uses less memory and less processing, but this type of an optimization I'm not sure is necessary anymore. I'd still rather go with 2, except I do not know how to mutate the data at a given index range. Any help is greatly appreciated.
I do not recommend the way of mutating Data using this:
data.replaceSubrange(0..<2, with: UnsafeBufferPointer(start: &self.crc, count: 1))
Please try this:
data.replaceSubrange(0..<2, with: &self.crc, count: 2)
It is hard to explain why, but I'll try...
In Swift, inout parameters work in copy-in-copy-out semantics. When you write something like this:
aMethod(&param)
Swift allocates some region of enough size to hold the content of param,
copies param into the region, (copy-in)
calls the method with passing the address of the region,
and when returned from the call, copies the content of the region back to param (copy-out).
In many cases, Swift optimizes (which may happen even in -Onone setting) the steps by just passing the actual address of param, but it is not clearly documented.
So, when an inout parameter is passed to the initializer of UnsafeBufferPointer, the address received by UnsafeBufferPointer may be pointing to a temporal region, which will be released immediately after the initializer is finished.
Thus, replaceSubrange(_:with:) may copy the bytes in the already released region into the Data.
I believe the first code would work in this case as crc is a property of a struct, but if there is a simple and safe alternative, you should better avoid the unsafe way.
ADDITION for the comment of Brandon Mantzey's own answer.
data.append(UnsafeBufferPointer(start: &self.crcOfRecordData, count: 1))
Using safe in the meaning above. This is not safe, for the same reason described above.
I would write it as:
data.append(Data(bytes: &self.crcOfRecordData, count: MemoryLayout<UInt16>.size))
(Assuming the type of crcOfRecordData as UInt16.)
If you do not prefer creating an extra Data instance, you can write it as:
withUnsafeBytes(of: &self.crcOfRecordData) {urbp in
data.append(urbp.baseAddress!.assumingMemoryBound(to: UInt8.self), count: MemoryLayout<UInt16>.size)
}
This is not referred in the comment, but in the meaning of above safe, the following line is not safe.
let uint32Data = Data(buffer: UnsafeBufferPointer(start: &self.someData, count: 1))
All the same reason.
I would write it as:
let uint32Data = Data(bytes: &self.someData, count: MemoryLayout<UInt32>.size)
Though, observable unexpected behavior may happen in some very limited condition and with very little probability.
Such behavior would happen only when the following two conditions are met:
Swift compiler generates non-Optimized copy-in-copy-out code
Between the very narrow period, since the temporal region is released till the append method (or Data.init) finishes copying the whole content, the region is modified for another use.
The condition #1 becomes true only limited cases in the current implementation of Swift.
The condition #2 happens very rarely only in the multi-threaded environment. (Though, Apple's framework uses many hidden threads as you can find in the debugger of Xcode.)
In fact, I have not seen any questions regarding the unsafe cases above, my safe may be sort of overkill.
But alternative safe codes are not so complex, are they?
In my opinion, you should better be accustomed to use all-cases-safe code.
I figured it out. I actually had a syntax error that was boggling me because I hadn't seen that before.
Here's the answer:
data.replaceSubrange(0..<2, with: UnsafeBufferPointer(start: &self.crc, count: 1))

Swift pointer from ArraySlice

I'm trying to determine the "Swift-y" way of creating my own contiguous memory containers (in my particular case, I'm building n-dimensional arrays). I want my containers to be as close to Swift's builtin Array as possible - in terms of functionality and usability.
I need to access the pointer to memory of my containers for stuff like Accelerate and BLAS operations.
I want to know whether an ArraySlice's pointer would point to the first element of the slice, or the first element of its base.
When I tried to test UnsafePointer<Int>(array) == UnsafePointer<Int>(array[1...2]) it looks like Swift doesn't allow pointer construction from ArraySlices (or I just did it incorrectly).
I'm looking for advice on which way would be the most "Swift-y"?
I understand that when slicing an array the follow is true:
let array = [1, 2, 3]
array[1] == array[1...2][1]
and
array[1...2][0] != 2 # index out of bounds error
In other words, indexing is always performed relative to the base Array.
Which suggests: that we should return a pointer to the base's first element. Because slices are relative to their base.
However, iteration through a slice (obviously) only considers elements of that slice:
for i in array[1..2] # i takes on 2 followed by 3
Which suggests: that we should return a pointer to the slice's first element. Because slices have their own starting point.
If my user wanted to operate on a slice in a BLAS operation it would be intuitive to expect:
mmul(matrix1[1...2, 0...1].pointer, matrix2[4...5, 0...1].pointer)
to point to the first elements of slice, but I don't know if this is the way a Swift ArraySlice would do things.
My Question: Should a container slice object's pointer point to the first element of the slice, or, the first element of the base container.
This operation is unsafe:
UnsafePointer<Int>(array)
What you mean is:
array.withUnsafeBufferPointer { ... }
This applies to your types as well, and is the pattern you should employ to interoperate with BLAS and Accelerate. You should not try to use a pointer method IMO.
There is no promise that array will continue to exist by the time you actually access the pointer, even if that happens in the same line of code. ARC is free to destroy that memory shockingly quickly.
UnsafeBufferPointer is actually a very nice type in that it is already promised to be contiguous and it behaves as a Collection.
My suggestion here would be to manage your own memory internally, probably with a ManagedBuffer, but maybe just with a UnsafeMutablePointer that you alloc and destroy yourself. It's very important that you manage the layout of the data so that it's compatible with Accelerate. You don't want Array<Array<UInt8>>. That's going to add too much structure. You want a blob of bytes that you index into in the good-ol' C ways (row*width+column, etc). You probably don't want your slices to return pointers at all directly. Your mmul function is likely going to need special logic to understand how to pull the pieces it needs out of slices with minimal copying so that it works with vDSP_mmul. "Generic" and "Accelerate" seldom go together.
For example, considering this:
mmul(matrix1[1...2, 0...1].pointer, matrix2[4...5, 0...1].pointer)
(Obviously I assume your real matrices are dramatically larger; this kind of matrix doesn't make much sense to send to vDSP.)
You're going to have to write your own mmul here obviously since this memory isn't laid out correctly. So you might as well pass the slices. Then you'd do something like (totally untested, uncompiled, and I'm sure the syntax is wildly wrong):
mmul(m1: MatrixSlice, m2: MatrixSlice) -> Matrix {
var s1 = UnsafeMutablePointer<Float>.alloc(m1.rows * m1.columns)
// use vDSP_vgathr to copy each sliced row out of m1 into s1
var s2 = UnsafeMutablePointer<Float>.alloc(m2.rows * m2.columns)
// use vDSP_vgathr to copy each sliced row out of m2 into s2
var result = UnsafeMutablePointer<Float>.alloc(m1.rows * m2.columns)
vDSP_mmul(s1, 1, s2, 1, result, 1, m1.rows, m2.columns, m1.columns)
s1.destroy()
s2.destroy()
// This will need to call result.move() or moveInitializeFrom or something
return Matrix(result)
}
I'm just throwing out stuff here, but this is probably the kind of structure you'd want.
To your underlying question about whether the pointer to the container or to the data is usually passed by Swift, the answer is unfortunately "magic" for Array and no one else. Passing an Array to something that wants a pointer will magically (by the compiler, not the stdlib) pass a pointer to the storage of the Array. No other type gets this magic. Not even ContiguousArray gets this magic. If you pass a ContiguousArray to something that wants a pointer, you'll pass the pointer to the container (and if it's mutable, corrupt the container; true story, hated that one…)
Thanks in part to #RobNapier the answer to my question is: ArraySlice's pointer should point to the slice's first element.
The way I verified this was simply:
var array = [5,4,3,325,67,7,3]
array.withUnsafeBufferPointer{ $0 } != array[3...6].withUnsafeBufferPointer{ $0 } # true
^--- points to 5's address ^--- points to 325's address

How safe are swift collections when used with invalidated iterators / indices?

I'm not seeing a lot of info in the swift stdlib reference. For example, Dictionary says certain methods (like remove) will invalidate indices, but that's it.
For a language to call itself "safe", it needs a solution to the classic C++ footguns:
get pointer to element in a vector, then add more elements (pointer is now invalidated), now use pointer, crash
start iterating through a collection. while iterating, remove some elements (either before or after the current iterator position). continue iterating, crash.
(edit: in c++, you're lucky to crash - worse case is memory corruption)
I believe 1 is solved by swift because if a collection stores classes, taking a reference (e.g. strong pointer) to an element will increase the refcount. However, I don't know the answer for 2.
It would be super useful if there was a comparison of footguns in c++ that are/are not solved by swift.
EDIT, due to Robs answer:
It does appear that there's some undocumented snapshot-like behavior going on
with Dictionary and/or for loops. The iteration creates a snapshot / hidden
copy of it when it starts.
Which gives me both a big "WAT" and "cool, that's sort of safe, I guess", and "how expensive is this copy?".
I don't see this documented either in Generator or in for-loop.
The below code prints two logical snapshots of the dictionary. The first
snapshot is userInfo as it was at the start of the iteration loop, and does
not reflect any modifications made during the loop.
var userInfo: [String: String] = [
"first_name" : "Andrei",
"last_name" : "Puni",
"job_title" : "Mad scientist"
]
userInfo["added_one"] = "1" // can modify because it's var
print("first snapshot:")
var hijacked = false
for (key, value) in userInfo {
if !hijacked {
userInfo["added_two"] = "2" // doesn't error
userInfo.removeValueForKey("first_name") // doesn't error
hijacked = true
}
print("- \(key): \(value)")
}
userInfo["added_three"] = "3" // modify again
print("final snapshot:")
for (key, value) in userInfo {
print("- \(key): \(value)")
}
As you say, #1 is not an issue. You do not have a pointer to the object in Swift. You either have its value or a reference to it. If you have its value, then it's a copy. If you have a reference, then it's protected. So there's no issue here.
But let's consider the second and experiment, be surprised, and then stop being surprised.
var xs = [1,2,3,4]
for x in xs { // (1)
if x == 2 {
xs.removeAll() // (2)
}
print(x) // Prints "1\n2\n3\n\4\n"
}
xs // [] (3)
Wait, how does it print all the values when we blow away the values at (2). We are very surprised now.
But we shouldn't be. Swift arrays are values. The xs at (1) is a value. Nothing can ever change it. It's not "a pointer to memory that includes an array structure that contains 4 elements." It's the value [1,2,3,4]. At (2), we don't "remove all elements from the thing xs pointed to." We take the thing xs is, create an array that results if you remove all the elements (that would be [] in all cases), and then assign that new array to xs. Nothing bad happens.
So what does the documentation mean by "invalidates all indices?" It means exactly that. If we generated indices, they're no good anymore. Let's see:
var xs = [1,2,3,4]
for i in xs.indices {
if i == 2 {
xs.removeAll()
}
print(xs[i]) // Prints "1\n2\n" and then CRASH!!!
}
Once xs.removeAll() is called, there's no promise that the old result of xs.indices means anything anymore. You are not permitted to use those indices safely against the collection they came from.
"Invalidates indices" in Swift is not the same as C++'s "invalidates iterators." I'd call that pretty safe, except the fact that using collection indices is always a bit dangerous and so you should avoid indexing collections when you can help it; iterate them instead. Even if you need the indexes for some reason, use enumerate to get them without creating any of the danger of indexing.
(Side note, dict["key"] is not indexing into dict. Dictionaries are a little confusing because their key is not their index. Accessing dictionaries by their DictionaryIndex index is just as dangerous as accessing arrays by their Int index.)
Note also that the above doesn't apply to NSArray. If you modify NSArray while iterating it, you'll get a "mutated collection while iterating" error. I'm only discussing Swift data types.
EDIT: for-in is very explicit in how it works:
The generate() method is called on the collection expression to obtain a value of a generator type—that is, a type that conforms to the GeneratorType protocol. The program begins executing a loop by calling the next() method on the stream. If the value returned is not None, it is assigned to the item pattern, the program executes the statements, and then continues execution at the beginning of the loop. Otherwise, the program does not perform assignment or execute the statements, and it is finished executing the for-in statement.
The returned Generator is a struct and contains a collection value. You would not expect any changes to some other value to modify its behavior. Remember: [1,2,3] is no different than 4. They're both values. When you assign them, they make copies. So when you create a Generator over a collection value, you're going to snapshot that value, just like if I created a Generator over the number 4. (This raises an interesting problem, because Generators aren't really values, and so really shouldn't be structs. They should be classes. Swift stdlib has been fixing that. See the new AnyGenerator for instance. But they still contain an array value, and you would never expect changes to some other array value to impact them.)
See also "Structures and Enumerations Are Value Types" which goes into more detail on the importance of value types in Swift. Arrays are just structs.
Yes, that means there's logically copying. Swift has many optimizations to minimize actual copying when it's not needed. In your case, when you mutate the dictionary while it's being iterated, that will force a copy to happen. Mutation is cheap if you're the only consumer of a particular value's backing storage. But it's O(n) if you're not. (This is determined by the Swift builtin isUniquelyReferenced().) Long story short: Swift Collections are Copy-on-Write, and simply passing an array does not cause real memory to be allocated or copied.
You don't get COW for free. Your own structs are not COW. It's something that Swift does in stdlib. (See Mike Ash's great discussion of how you would recreate it.) Passing your own custom structs causes real copies to happen. That said, the majority of the memory in most structs is stored in collections, and those collections are COW, so the cost of copying structs is usually pretty small.
The book doesn't spend a lot of time drilling into value types in Swift (it explains it all; it just doesn't keep saying "hey, and this is what that implies"). On the other hand, it was the constant topic at WWDC. You may be interested particularly in Building Better Apps with Value Types in Swift which is all about this topic. I believe Swift in Practice also discussed it.
EDIT2:
#KarlP raises an interesting point in the comments below, and it's worth addressing. None of the value-safety promises we're discussing are related to for-in. They're based on Array. for-in makes no promises at all about what would happen if you mutated a collection while it is being iterated. That wouldn't even be meaningful. for-in doesn't "iterate over collections," it calls next() on Generators. So if your Generator becomes undefined if the collection is changed, then for-in will blow up because the Generator blew up.
That means that the following might be unsafe, depending on how strictly you read the spec:
func nukeFromOrbit<C: RangeReplaceableCollectionType>(var xs: C) {
var hijack = true
for x in xs {
if hijack {
xs.removeAll()
hijack = false
}
print(x)
}
}
And the compiler won't help you here. It'll work fine for all of the Swift collections. But if calling next() after mutation for your collection is undefined behavior, then this is undefined behavior.
My opinion is that it would be poor Swift to make a collection that allows its Generator to become undefined in this case. You could even argue that you've broken the Generator spec if you do (it offers no UB "out" unless the generator has been copied or has returned nil). So you could argue that the above code is totally within spec and your generator is broken. Those arguments tend to be a bit messy with a "spec" like Swift's which doesn't dive into all the corner cases.
Does this mean you can write unsafe code in Swift without getting a clear warning? Absolutely. But in the many cases that commonly cause real-world bugs, Swift's built-in behavior does the right thing. And in that, it is safer than some other options.

In javascript, what are the trade-offs for defining a function inline versus passing it as a reference?

So, let's say I have a large set of elements to which I want to attach event listeners. E.g. a table where I want each row to turn red when clicked.
So my question is which of these is the fastest, and which uses the least memory. I understand that it's (usually) a tradeoff, so I would like to know my best options for each.
Using the table example, let's say there's a list of all the row elements, "rowList":
Option 1:
for(var r in rowList){
rowList[r].onclick = function(){ this.style.backgroundColor = "red" };
}
My gut feeling is that this is the fastest, since there is one less pointer call, but the most memory intensive, since each rowlist will have its own copy of the function, which might get serious if the onclick function is large.
Option 2:
function turnRed(){
this.style.backgroundColor = "red";
}
for(var r in rowList){
rowList[r].onclick = turnRed;
}
I'm guessing this is going to be only a teensy bit slower than the one above (oh no, one more pointer dereference!) but a lot less memory intensive, since the browser only needs to keep track of one copy of the function.
Option 3:
var turnRed = function(){
this.style.backgroundColor = "red";
}
for(var r in rowList){
rowList[r].onclick = turnRed;
}
I assume this is the same as option 2, but I just wanted to throw it out there. For those wondering what the difference between this and option 2 is: JavaScript differences defining a function
Bonus Section: Jquery
Same question with:
$('tr').click(function(){this.style.backgroundColor = "red"});
Versus:
function turnRed(){this.style.backgroundColor = "red"};
$('tr').click(turnRed);
And:
var turnRed = function(){this.style.backgroundColor = "red"};
$('tr').click(turnRed);
Here's your answer:
http://jsperf.com/function-assignment
Option 2 is way faster and uses less memory. The reason is that Option 1 creates a new function object for every iteration of the loop.
In terms of memory usage, your Option 1 is creating a distinct function closure for each row in your array. This approach will therefore use more memory than Option 2 and Option 3, which only create a single function and then pass around a reference to it.
For this same reason I would also expect Option 1 to be the slowest of the three. Of course, the difference in terms of real-world performance and memory usage will probably be quite small, but if you want the most efficient one then pick either Option 2 or Option 3 (they are both pretty much the same, the only real difference between the two is the scope at which turnRed is visible).
As for jQuery, all three options will have the same memory usage and performance characteristics. In every case you are creating and passing a single function reference to jQuery, whether you define it inline or not.
And one important note that is not brought up in your question is that using lots of inline functions can quickly turn your code into an unreadable mess and make it more difficult to maintain. It's not a big deal here since you only have a single line of code in your function, but as a general rule if your function contains more than 2-3 lines it is a good idea to avoid defining it inline. Instead define it as in Option 2 or Option 3 and then pass around a reference to it.