Forcing an Encoder's UnkeyedEncodingContainer to only contain one type of value

As part of a custom Encoder, I am coding an UnkeyedEncodingContainer. However, the specific format I am targeting requires that all elements of an array be of the same type. Specifically, arrays can contain:
Integers, all of the same size
Floats or Doubles
Other arrays (not necessarily all containing the same kinds of elements)
Objects
Here is the type of answer I need: the basis of an UnkeyedEncodingContainer implementation that conforms to the protocol and enforces that all elements be of one and the same type among those specified above.
As requested, here are examples of things that should or should not be encodable:
var valid1 = []
var valid2 = [3, 3, 5, 9]
var valid3 = ["string", "array"]
var invalid1 = [3, "test"]
var invalid2 = [5, []]
var invalid3 = [[3, 5], {"hello" : 3}]
// These may not all even be valid Swift arrays, they are only
// intended as examples
As an example, here is the best I have come up with, which does not work:
The UnkeyedEncodingContainer contains a function, checkCanEncode, and a computed instance variable, elementType:
var elementType: ElementType {
    if self.count == 0 {
        return .None
    } else {
        return self.storage[0].containedType
    }
}
func checkCanEncode(_ value: Any?, compatibleElementTypes: [ElementType]) throws {
    guard compatibleElementTypes.contains(self.elementType) || self.elementType == .None else {
        let context = EncodingError.Context(
            codingPath: self.nestedCodingPath,
            debugDescription: "Cannot encode value to an array of \(self.elementType)s"
        )
        throw EncodingError.invalidValue(value as Any, context)
    }
}
// I know the .None is weird and could be replaced by an optional,
// but it is useful as its rawValue is 0. The Encoder has to encode
// the rawValue of the ElementType at some point, so using an optional
// would actually be more complicated
Everything is then encoded as a contained singleValueContainer:
func encode<T>(_ value: T) throws where T: Encodable {
    let container = self.nestedSingleValueContainer()
    try container.encode(value)
    try checkCanEncode(value, compatibleElementTypes: [container.containedType])
}
// containedType is an instance variable of SingleValueContainer that is set
// when a value is encoded into it
But this causes an issue when it comes to nestedContainer and nestedUnkeyedContainer (used for stored dictionaries and arrays, respectively):
// This violates the protocol: this function should not be able to throw
func nestedContainer<NestedKey>(keyedBy keyType: NestedKey.Type) throws -> KeyedEncodingContainer<NestedKey> where NestedKey: CodingKey {
    let container = KeyedContainer<NestedKey>(
        codingPath: self.nestedCodingPath,
        userInfo: self.userInfo
    )
    try checkCanEncode(container, compatibleElementTypes: [.Dictionary])
    self.storage.append(container)
    return KeyedEncodingContainer(container)
}
As you can see, since I need checkCanEncode to know whether it is even possible to create a nested container in the first place (if the array already contains elements that aren't dictionaries, then adding dictionaries to it is invalid), I have to make the function throw. But this breaks the UnkeyedEncodingContainer protocol, which demands non-throwing versions.
But I can't just handle the error inside the function! If something tries to put an array inside an array of integers, it must fail. Therefore this is an invalid solution.
Additional remarks:
Checking after having encoded the values already feels sketchy, but checking only when producing the final encoded payload is definitely a violation of the "No Zombies" principle (fail as soon as the program enters an invalid state) which I would rather avoid. However if no better solution is possible I may accept it as a last resort.
One other solution I have thought about is encoding the array as a dictionary with numbered keys, since dictionaries in this format may contain mixed types. However this is likely to pose decoding issues, so once again, it is a last resort.

Unless anyone has a better idea, here is the best I could come up with:
Do not enforce that all elements be of the same type inside the UnkeyedEncodingContainer
If all elements are the same type, encode it as an array
If elements have varying types, encode it as a dictionary with integers as keys
This is completely fine as far as the encoding format goes, has minimal cost, only slightly complicates decoding (check whether the keys are consecutive integers), and greatly widens how many different Swift objects are compatible with the format.
Note: remember that the "real" encoding step, where the data is generated, is not actually part of the protocol. That is where I am proposing the shenanigans should take place 😈
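To make that concrete, here is a minimal sketch of what that final step could look like (the ElementType and StoredValue names are hypothetical stand-ins for whatever the real encoder stores):
enum ElementType { case none, integer, float, array, dictionary }

struct StoredValue {
    let containedType: ElementType
    let payload: Any
}

// Emit a real array when all stored elements share one type,
// otherwise fall back to a dictionary keyed by element index.
func serialize(unkeyedStorage storage: [StoredValue]) -> Any {
    let firstType = storage.first?.containedType
    if storage.allSatisfy({ $0.containedType == firstType }) {
        // Homogeneous (or empty): encode as a plain array.
        return storage.map { $0.payload }
    } else {
        // Heterogeneous: encode as a dictionary with integer keys,
        // which this format allows to contain mixed types.
        var keyed = [String: Any]()
        for (index, element) in storage.enumerated() {
            keyed[String(index)] = element.payload
        }
        return keyed
    }
}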

I don't know whether this helps you, but in Swift you can overload functions. This means that you can declare functions with the same name but different parameter types or constraints, and the compiler will always pick the right one. It's much more efficient than a type check at runtime.
The first method is called if the array's elements conform to Encodable:
func encode<T: Encodable>(object: [T]) throws -> Data {
    return try JSONEncoder().encode(object)
}
Otherwise the second method is called. If the array cannot even be encoded with JSONSerialization you can add custom encoding logic.
func encode<T>(object: [T]) throws -> Data {
    do {
        return try JSONSerialization.data(withJSONObject: object)
    } catch {
        // do custom encoding and return Data
        // (myCustomEncoding is a placeholder for your own fallback logic)
        return try myCustomEncoding(object)
    }
}
This example
let array = [["1":1, "2":2]]
try encode(object: array)
calls the first method.
On the other hand this – even if the actual type is not heterogeneous – calls the second method:
let array : [[String:Any]] = [["1":1, "2":2]]
try encode(object: array)

Related

Swift method needs to sort a polymorphic collection of UInt32

Confused (and a little frustrated) with Swift 5 at the moment.
I have a method:
func oidsMany(_ oids: Array<UInt32>) -> MCCommandBuilder {
    let sorted_oids: [UInt32] = oids.sorted()
    ...
}
Discovered I have a case where I want to pass a Set to this method just as well. Either way, I'm going to sort an Array or a Set to an Array right away.
Waded through the many many many protocols that both Set and Array conform to, noticed that they both conform to Sequence and that Sequence responds to sorted. Perfect.
But when I change the above to:
func oidsMany(_ oids: Sequence<UInt32>) -> MCCommandBuilder {
    let sorted_oids: [UInt32] = oids.sorted()
    ...
}
I get the following error hints:
Cannot specialize non-generic type 'Sequence'
Member 'sorted' cannot be used on value of protocol type 'Sequence'; use a generic constraint instead
What's the right way to approach this? I could just add a second oidsMany(_ oids: Set<UInt32>) overload that converts its argument to an array and calls the first version. But I feel like I'm missing something fundamental here. My experience from other languages is not mapping over well here.
You can, as the error message suggests, use it as a generic constraint instead:
func oidsMany2<Sortable: Sequence>(_ oids: Sortable) -> MCCommandBuilder where Sortable.Element: Comparable {
    let sorted_oids: [Sortable.Element] = oids.sorted()
    //...
}
If you only want to accept collections where the element is UInt32, you can change the where condition to
where Sortable.Element == UInt32
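For example, with MCCommandBuilder stubbed out (a hypothetical stand-in for the real type), both an Array and a Set now satisfy the constraint:
struct MCCommandBuilder {} // hypothetical stub

func oidsMany2<Sortable: Sequence>(_ oids: Sortable) -> MCCommandBuilder where Sortable.Element == UInt32 {
    let sortedOids: [UInt32] = oids.sorted()
    print(sortedOids)
    return MCCommandBuilder()
}

let fromArray = oidsMany2([3, 1, 2] as [UInt32]) // prints [1, 2, 3]
let fromSet = oidsMany2(Set<UInt32>([3, 1, 2]))  // prints [1, 2, 3]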

Does Swift know not to initialize an object if my set already contains it?

I have a global var notes: Set<Note> that contains notes initialized with downloaded data.
In the code below, does Swift know to skip the initialization of my Note object if notes already contains it?
for dictionary in downloadedNoteDictionaries {
    let note = Note(dictionary: dictionary)
    notes.insert(note)
}
I'm wondering because my app downloads dozens of notes per request and initializing a Note object seems rather computationally expensive.
If the answer to my question is no, then how could I improve my code's performance?
My Note class—which I just realized should probably be a struct instead—has the property let id: Int64 as its sole essential component, but apparently, you can't access an element of a set by its hash value? I don't want to use Set's instance method first(where:) because it has a complexity of O(n), and notes could contain millions of Note objects.
You cannot rely on Swift to eliminate the construction of a new Note in your code. Your Set needs to ask the Note for its hashValue, and may need to call == with your Note as an argument. Those computations require the Note object. Possibly if Swift can inline everything, it can notice that your hashValue and == depend only on the id property, but it is certainly not guaranteed to notice or to act on that information.
It sounds like you should be using an [Int64: Note] instead of a Set<Note>.
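A minimal sketch of that approach, assuming each downloaded dictionary exposes the note's id under an "id" key (the stub Note below stands in for the real class):
struct Note {
    let id: Int64
    init(dictionary: [String: Any]) {
        self.id = dictionary["id"] as? Int64 ?? 0
        // ... expensive initialization elided ...
    }
}

let downloadedNoteDictionaries: [[String: Any]] = [
    ["id": Int64(1)], ["id": Int64(2)], ["id": Int64(1)] // duplicate id 1
]

var notesById: [Int64: Note] = [:]

for dictionary in downloadedNoteDictionaries {
    // Read the id straight from the raw dictionary so we can skip
    // constructing a Note that already exists.
    guard let id = dictionary["id"] as? Int64 else { continue }
    if notesById[id] == nil {
        notesById[id] = Note(dictionary: dictionary)
    }
}

print(notesById.count) // 2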
No, Swift will not avoid creating the new Note object. The problem here is determining whether an object already exists in a set. To check that, you must have some way to identify the object consistently, and that identification must persist across future reads and writes to the set. Assuming we adopt Swift's hashing enhancements, which deprecate the old approach of manually providing a hashValue for every type conforming to Hashable, we should not use a Set as a solution to this problem: Swift's new recommended hashing methods use a random seed to generate hashes for added security, so hash values alone cannot be relied upon to identify an element in a set.
It seems that your method of identifying these objects is by an id. I would do as Rob suggests and use a dictionary where the keys are the ids. This would help you answer existence questions and avoid instantiating a Note object.
If you need the resulting dictionary as a Set, you can still create a new Set from this resulting dictionary sequence.
It is possible to pull out a particular element from a set as long as you know how to identify it, and the operations would be O(1). You would use these two methods:
https://developer.apple.com/documentation/swift/set/2996833-firstindex
https://developer.apple.com/documentation/swift/set/2830801-subscript
Here is an example that I was able to run in a playground. However, this assumes that you have a way to identify in your data / model the id of the Note object beforehand in order to avoid creating the Note object. In this case, I am assuming that a Note shares an id with a Dictionary it holds:
import UIKit

struct DictionaryThatNoteUses {
    var id: String
    init(id: String = UUID().uuidString) {
        self.id = id
    }
}

struct Note: Hashable {
    let dictionary: DictionaryThatNoteUses
    var id: String {
        return dictionary.id
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
    static func == (lhs: Note, rhs: Note) -> Bool {
        return lhs.id == rhs.id
    }
}

var downloadedNoteDictionaries: [DictionaryThatNoteUses] = [
    DictionaryThatNoteUses(),
    DictionaryThatNoteUses(),
    DictionaryThatNoteUses()
]
var notesDictionary: [String: Note] = [:]
var notesSet: Set<Note> = []

// Add a dictionary with the same id as the first dictionary,
// so now there are four dictionary objects in this array
downloadedNoteDictionaries.append(downloadedNoteDictionaries.first!)

func createDictionaryOfNotes() {
    func avoidCreatingDuplicateNotesObject(with id: String) {
        // Prints once, because the duplicated dictionary at the end matches
        // the first, so a new Note object is not instantiated
        print("avoided creating duplicate notes object with id \(id)")
    }
    downloadedNoteDictionaries.forEach {
        guard notesDictionary[$0.id] == nil else { return avoidCreatingDuplicateNotesObject(with: $0.id) }
        let note = Note(dictionary: $0)
        notesDictionary[note.id] = note
    }
}

createDictionaryOfNotes()

// Obtain a set for set operations on those unique Note objects
notesSet = Set<Note>(notesDictionary.values)
print("number of items in dictionary = \(notesDictionary.count), number of items in set = \(notesSet.count)") // prints 3 and 3 because the 4th object was a duplicate

// Grabbing a specific element from the set:
// let's test with the second Note object from the notesDictionary
let secondNotesObjectFromDictionary = notesDictionary.values[notesDictionary.values.index(notesDictionary.values.startIndex, offsetBy: 1)]
let thirdNotesObjectFromDictionary = notesDictionary.values[notesDictionary.values.index(notesDictionary.values.startIndex, offsetBy: 2)]
if let secondNotesObjectIndexInSet = notesSet.firstIndex(of: secondNotesObjectFromDictionary) {
    print("do the two objects match: \(notesSet[secondNotesObjectIndexInSet] == secondNotesObjectFromDictionary)") // prints true
    print("does the third object from dictionary match the second object from the set: \(thirdNotesObjectFromDictionary == notesSet[secondNotesObjectIndexInSet])") // prints false
}

Array values optional, or not?

Could you explain why:
when I access an array value using array.first, it's optional
when I access it via an index, it is not?
Example:
var players = ["Alice", "Bob", "Cindy", "Dan"]
let firstPlayer = players.first
print(firstPlayer) // Optional("Alice")
let firstIndex = players[0]
print(firstIndex) // Alice
(The short answers to this question are great, and exactly what you need. I just wanted to go a bit deeper into the why and how this interacts with Swift Collections more generally and the underlying types. If you just want "how should I use this stuff?" read the accepted answer and ignore all this.)
Arrays follow the rules of all Collections. A Collection must implement the following subscript:
subscript(position: Self.Index) -> Self.Element { get }
So to be a Collection, Array's subscript must accept its Index and unconditionally return an Element. For many kinds of Collections, it is impossible to create an Index that does not exist, but Array uses Int as its Index, so it has to deal with the possibility that you pass an Index that is out of range. In that case, it is impossible to return an Element, and its only option is to fail to return at all. This generally takes the form of crashing the program since it's generally more useful than hanging the program, which is the other option.
(This hides a slight bit of type theory, which is that every function in Swift technically can return "crash," but we don't track that in the type system. It's possible to do that to distinguish between functions that can crash and ones that cannot, but Swift doesn't.)
This should naturally raise the question of why Dictionary doesn't crash when you subscript with a non-existent key. The reason is that Dictionary's Index is not its Key. It has a little-used subscript that provides conformance to Collection (little-used in top-level code, but very commonly used inside the stdlib):
subscript(position: Dictionary<Key, Value>.Index) -> Dictionary.Element { get }
Array could have done this as well, having an Array.Index type that was independent of Int, and making the Int subscript return an Optional. In Swift 1.0, I opened a radar to request exactly that. The team argued that this would make common uses of Array too difficult and that programmers coming to Swift were used to the idea that out-of-range was a programming error (crash). Dictionary, on the other hand, is commonly accessed with non-existent keys, so the Key subscript should be Optional. Several years using Swift has convinced me they were right.
In general you shouldn't subscript arrays unless you got the index from the array (i.e. using index(where:)). But many Cocoa patterns make it very natural to subscript (cellForRow(at:) being the most famous). Still, in more pure Swift code, subscripting with arbitrary Ints often suggests a design problem.
Instead you should often use Collection methods like first and first(where:), which return Optionals and are generally safer and clearer, and iterate using for-in loops rather than subscripts.
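For example, a quick sketch of those Optional-returning alternatives:
let players = ["Alice", "Bob", "Cindy", "Dan"]

// Optional-returning queries instead of raw subscripts:
let first = players.first                                  // Optional("Alice")
let firstD = players.first(where: { $0.hasPrefix("D") })   // Optional("Dan")
let missing = players.first(where: { $0.hasPrefix("Z") })  // nil

// Iteration without index arithmetic:
for player in players {
    print(player)
}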
If you want to use a subscript and you don't want to risk a crash, you can add this extension to your code:
extension Collection {
    subscript(safe index: Index) -> Iterator.Element? {
        return indices.contains(index) ? self[index] : nil
    }
}
and then use it:
let array = [0, 1, 2]
let second = array[safe: 1] // Optional(1)
let fourth = array[safe: 3] // nil instead of a crash
The behavior of first and index subscripting is different:
first is declared safely: if the array is empty it returns nil, otherwise the (optional) first element.
Index subscripting is unsafe for legacy reasons: if the index is out of range it raises a fatal error, otherwise it returns the (non-optional) element.
This is because with first, if the Array is empty, the value will be nil; that is why it is an optional. If it is not empty, the first element will be returned.
With a subscript (or index value), however, your program will crash with the error
fatal error: Index out of range
if the index is out of range (or the array is empty), rather than returning an optional. Otherwise, it will return the element requested.
This is the default behavior of the first property. Array is generic over its Element type, and first is declared to return an optional:
public var first: Element? { get }
This declaration is available on Array (via Collection).

Swift semantics regarding dictionary access

I'm currently reading the excellent Advanced Swift book from objc.io, and I'm running into something that I don't understand.
If you run the following code in a playground, you will notice that when modifying a struct contained in a dictionary, a copy is made by the subscript access, but then it appears that the original value in the dictionary is replaced by the copy. I don't understand why. What exactly is happening?
Also, is there a way to avoid the copy? According to the author of the book, there isn't, but I just want to be sure.
import Foundation

class Buffer {
    let id = UUID()
    var value = 0
    func copy() -> Buffer {
        let new = Buffer()
        new.value = self.value
        return new
    }
}

struct COWStruct {
    var buffer = Buffer()

    init() { print("Creating \(buffer.id)") }

    mutating func change() -> String {
        if isKnownUniquelyReferenced(&buffer) {
            buffer.value += 1
            return "No copy \(buffer.id)"
        } else {
            let newBuffer = buffer.copy()
            newBuffer.value += 1
            buffer = newBuffer
            return "Copy \(buffer.id)"
        }
    }
}
var array = [COWStruct()]
array[0].buffer.value
array[0].buffer.id
array[0].change()
array[0].buffer.value
array[0].buffer.id
var dict = ["key": COWStruct()]
dict["key"]?.buffer.value
dict["key"]?.buffer.id
dict["key"]?.change()
dict["key"]?.buffer.value
dict["key"]?.buffer.id
// If the above `change()` was made on a copy, why has the original value changed ?
// Did the copied & modified struct replace the original struct in the dictionary ?
dict["key"]?.change() // Copy
is semantically equivalent to:
if var value = dict["key"] {
    value.change() // Copy
    dict["key"] = value
}
The value is pulled out of the dictionary, unwrapped into a temporary, mutated, and then placed back into the dictionary.
Because there are now two references to the underlying buffer (one from our local temporary value, and one from the COWStruct instance in the dictionary itself), we're forcing a copy of the underlying Buffer instance, as it's no longer uniquely referenced.
So, why doesn't
array[0].change() // No Copy
do the same thing? Surely the element should be pulled out of the array, mutated and then stuck back in, replacing the previous value?
The difference is that unlike Dictionary's subscript, which comprises a getter and a setter, Array's subscript comprises a getter and a special accessor called mutableAddressWithPinnedNativeOwner.
What this special accessor does is return a pointer to the element in the array's underlying buffer, along with an owner object to ensure that the buffer isn't deallocated from under the caller. Such an accessor is called an addressor, as it deals with addresses.
Therefore when you say:
array[0].change()
you're actually mutating the actual element in the array directly, rather than a temporary.
Such an addressor cannot be directly applied to Dictionary's subscript because it returns an Optional, and the underlying value isn't stored as an optional. So it currently has to be unwrapped with a temporary, as we cannot return a pointer to the value in storage.
In Swift 3, you can avoid copying your COWStruct's underlying Buffer by removing the value from the dictionary before mutating the temporary:
if var value = dict["key"] {
    dict["key"] = nil
    value.change() // No Copy
    dict["key"] = value
}
As now only the temporary has a view onto the underlying Buffer instance.
And, as @dfri points out in the comments, this can be reduced down to:
if var value = dict.removeValue(forKey: "key") {
    value.change() // No Copy
    dict["key"] = value
}
saving on a hashing operation.
Additionally, for convenience, you may want to consider making this into an extension method:
extension Dictionary {
    mutating func withValue<R>(
        forKey key: Key, mutations: (inout Value) throws -> R
    ) rethrows -> R? {
        guard var value = removeValue(forKey: key) else { return nil }
        defer {
            updateValue(value, forKey: key)
        }
        return try mutations(&value)
    }
}

// ...

dict.withValue(forKey: "key") {
    $0.change() // No copy
}
In Swift 4, you should be able to use the values property of Dictionary in order to perform a direct mutation of the value:
if let index = dict.index(forKey: "key") {
    dict.values[index].change()
}
As the values property now returns a special Dictionary.Values mutable collection that has a subscript with an addressor (see SE-0154 for more info on this change).
However, currently (with the version of Swift 4 that ships with Xcode 9 beta 5), this still makes a copy. This is due to the fact that both the Dictionary and Dictionary.Values instances have a view onto the underlying buffer – as the values computed property is just implemented with a getter and setter that passes around a reference to the dictionary's buffer.
So when calling the addressor, a copy of the dictionary's buffer is triggered, therefore leading to two views onto COWStruct's Buffer instance, therefore triggering a copy of it upon change() being called.
I have filed a bug over this here. (Edit: This has now been fixed on master with the unofficial introduction of generalised accessors using coroutines, so will be fixed in Swift 5 – see below for more info).
In Swift 4.1, Dictionary's subscript(_:default:) now uses an addressor, so we can efficiently mutate values so long as we supply a default value to use in the mutation.
For example:
dict["key", default: COWStruct()].change() // No copy
The default: parameter uses #autoclosure such that the default value isn't evaluated if it isn't needed (such as in this case where we know there's a value for the key).
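The same addressor is what makes the classic counting pattern efficient, for example:
var counts: [String: Int] = [:]
for word in ["a", "b", "a"] {
    // Mutates the stored value in place; the default is only
    // evaluated when the key is missing.
    counts[word, default: 0] += 1
}
print(counts) // ["a": 2, "b": 1]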
Swift 5 and beyond
With the unofficial introduction of generalised accessors in Swift 5, two new underscored accessors have been introduced, _read and _modify which use coroutines in order to yield a value back to the caller. For _modify, this can be an arbitrary mutable expression.
The use of coroutines is exciting because it means that a _modify accessor can now perform logic both before and after the mutation. This allows them to be much more efficient when it comes to copy-on-write types, as they can for example deinitialise the value in storage while yielding a temporary mutable copy of the value that's uniquely referenced to the caller (and then reinitialising the value in storage upon control returning to the callee).
The standard library has already updated many previously inefficient APIs to make use of the new _modify accessor – this includes Dictionary's subscript(_:) which can now yield a uniquely referenced value to the caller (using the deinitialisation trick I mentioned above).
The upshot of these changes means that:
dict["key"]?.change() // No copy
will be able to perform a mutation of the value without having to make a copy in Swift 5 (you can even try this out for yourself with a master snapshot).
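For the curious, here is a minimal sketch of what these underscored accessors look like on a custom type (they are unofficial and subject to change, so this is illustrative only):
struct IntBox {
    private var storage = [1, 2, 3]

    subscript(index: Int) -> Int {
        // Yield a read-only view of the stored value.
        _read { yield storage[index] }
        // Yield the stored value mutably; code before and after the
        // yield runs around the caller's mutation, which is what lets
        // copy-on-write types avoid spurious copies.
        _modify { yield &storage[index] }
    }
}

var box = IntBox()
box[0] += 10 // mutates in place through the _modify coroutine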

Can a condition be used to determine the type of a generic?

I will first explain what I'm trying to do and how I got to where I got stuck before getting to the question.
As a learning exercise for myself, I took some problems that I had already solved in Objective-C to see how I can solve them differently with Swift. The specific case that I got stuck on is a small piece that captures a value before and after it changes and interpolates between the two to create keyframes for an animation.
For this I had an object Capture with properties for the object, the key path and two id properties for the values before and after. Later, when interpolating the captured values I made sure that they could be interpolated by wrapping each of them in a Value class that used a class cluster to return an appropriate class depending on the type of value it wrapped, or nil for types that wasn't supported.
This works, and I am able to make it work in Swift as well following the same pattern, but it doesn't feel Swift like.
What worked
Instead of wrapping the captured values as a way of enabling interpolation, I created a Mixable protocol that the types could conform to and used a protocol extension for when the type supported the necessary basic arithmetic:
protocol SimpleArithmeticType {
    static func +(lhs: Self, rhs: Self) -> Self
    static func *(lhs: Self, amount: Double) -> Self
}

protocol Mixable {
    func mix(with other: Self, by amount: Double) -> Self
}

extension Mixable where Self: SimpleArithmeticType {
    func mix(with other: Self, by amount: Double) -> Self {
        return self * (1.0 - amount) + other * amount
    }
}
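For instance, a concrete type only needs to declare conformance to pick up the default implementation (Double already provides the required operators):
extension Double: SimpleArithmeticType, Mixable {}

let halfway = 2.0.mix(with: 4.0, by: 0.5) // 3.0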
This part worked really well and enforced homogeneous mixing (that a type could only be mixed with its own type), which wasn't enforced in the Objective-C implementation.
Where I got stuck
The next logical step, and this is where I got stuck, seemed to be to make each Capture instance (now a struct) hold two variables of the same mixable type instead of two AnyObject values. I also changed the initializer argument from an object and a key path to a closure that returns an object, () -> T:
struct Capture<T: Mixable> {
    typealias Evaluation = () -> T
    let eval: Evaluation
    let before: T
    var after: T {
        return eval()
    }
    init(eval: Evaluation) {
        self.eval = eval
        self.before = eval()
    }
}
This works when the type can be inferred, for example:
let captureInt = Capture {
    return 3.0
}
// > Capture<Double>
but not with key-value coding, which returns AnyObject:
let captureAnyObject = Capture {
    return myObject.valueForKeyPath("opacity")!
}
error: cannot invoke initializer for type 'Capture' with an argument list of type '(() -> _)'
AnyObject does not conform to the Mixable protocol, so I can understand why this doesn't work. But I can check what type the object really is, and since I'm only covering a handful of mixable types, I thought I could cover all the cases and return the correct type of Capture. To see if this could even work I made an even simpler example.
A simpler example
struct Foo<T> {
    let x: T
    init(eval: () -> T) {
        x = eval()
    }
}
which works when type inference is guaranteed:
let fooInt = Foo {
    return 3
}
// > Foo<Int>

let fooDouble = Foo {
    return 3.0
}
// > Foo<Double>
But not when the closure can return different types
let condition = true
let foo = Foo {
    if condition {
        return 3
    } else {
        return 3.0
    }
}
error: cannot invoke initializer for type 'Foo' with an argument list of type '(() -> _)'
I'm not even able to define such a closure on its own.
let condition = true // as simple as it could be
let evaluation = {
    if condition {
        return 3
    } else {
        return 3.0
    }
}
error: unable to infer closure type in the current context
My Question
Is this something that can be done at all? Can a condition be used to determine the type of a generic? Or is there another way to hold two variables of the same type, where the type was decided based on a condition?
Edit
What I really want is to:
capture the values before and after a change and save the pair (old + new) for later (a heterogeneous collection of homogeneous pairs).
go through all the collected values and get rid of the ones that can't be interpolated (unless this step could be integrated with the collection step)
interpolate each homogeneous pair individually (mixing old + new).
But it seems like this direction is a dead end when it comes to solving that problem. I'll have to take a couple of steps back and try a different approach (and probably ask a different question if I get stuck again).
As discussed on Twitter, the type must be known at compile time. Nevertheless, for the simple example at the end of the question you could just explicitly type
let evaluation: () -> Double = { ... }
and it would work.
So in the case of Capture and valueForKeyPath: IMHO you should cast (either safely or with a forced cast) the value to the Mixable type you expect the value to be, and it should work fine. After all, I'm not sure valueForKeyPath: is supposed to return different types depending on a condition.
What is the exact case where you would like to return 2 totally different types (that can't be implicitly casted as in the simple case of Int and Double above) in the same evaluation closure?
in my full example I also have cases for CGPoint, CGSize, CGRect, CATransform3D
The limitations are just as you have stated, because of Swift's strict typing. All types must be definitely known at compile time, and each thing can be of only one type – even a generic (it is resolved by the way it is called at compile time). Thus, the only thing you can do is turn your type into an umbrella type that is much more like Objective-C itself:
let condition = true
let evaluation = {
    () -> NSObject in // *
    if condition {
        return 3
    } else {
        return NSValue(CGPoint: CGPointMake(0, 1))
    }
}
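Downstream, you would then recover the concrete types with runtime checks before mixing. A rough sketch of that branching (modern Swift spelling, names hypothetical; only NSNumber and CGPoint shown):
import UIKit

func mixIfPossible(_ old: NSObject, _ new: NSObject, by amount: Double) -> NSObject? {
    switch (old, new) {
    case let (a as NSNumber, b as NSNumber):
        // Numeric values: mix as Doubles.
        let mixed = a.doubleValue * (1.0 - amount) + b.doubleValue * amount
        return NSNumber(value: mixed)
    case let (a as NSValue, b as NSValue):
        // Struct values (CGPoint, CGSize, ...) would need one case per
        // wrapped type; CGPoint is shown here as an example.
        let p1 = a.cgPointValue, p2 = b.cgPointValue
        let t = CGFloat(amount)
        let mixed = CGPoint(x: p1.x * (1 - t) + p2.x * t,
                            y: p1.y * (1 - t) + p2.y * t)
        return NSValue(cgPoint: mixed)
    default:
        return nil // unsupported pairing
    }
}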