CBOR diagnostic notation - expressing a sequence of items - encoding

I am wondering if it's possible to express just a plain sequence of concrete CBOR items in a given order, using CBOR diagnostic notation. By plain, I mean that I want to avoid using arrays for this; what I want is stream semantics. For example:
1, "foo", true, simple(53), { a: "bar", x: 30 }, [1, 2, 3]
not this:
[1, "foo", true, simple(53), { a: "bar", x: 30 }, [1, 2, 3]]
Moreover, is it also possible to do this with CDDL (the schema definition language for CBOR)?

I understand what you are asking about.
No, it appears the diagnostic notation (as implemented at http://cbor.me) does not currently support a naked sequence of CBOR objects (not an array), but yes, the CBOR specification itself does permit it.
I guess this is a bug, but diagnostic notation isn't exactly a supported feature of CBOR. You might get some responses by posting to the CBOR mailing list, see https://datatracker.ietf.org/group/cbor/about/ for the e-mail address, how to subscribe, and a searchable archive.
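For what it's worth, at the encoding level such a sequence is just the concatenation of the items' individual encodings, with no array header (this is what was later standardized as "CBOR Sequences" in RFC 8742). A minimal sketch in Python, hand-encoding three simple items without any library:

```python
# Hand-encoded CBOR items (initial bytes per the CBOR spec):
int_1 = bytes([0x01])              # major type 0, unsigned int 1
str_foo = bytes([0x63]) + b"foo"   # major type 3, text string of length 3
true_val = bytes([0xF5])           # simple value 21 (true)

# A CBOR sequence is just the items' encodings concatenated:
sequence = int_1 + str_foo + true_val
print(sequence.hex())  # 0163666f6ff5

# Compare: wrapping the same items in a 3-element array adds an 0x83 head:
array = bytes([0x83]) + int_1 + str_foo + true_val
print(array.hex())     # 830163666f6ff5
```

A decoder with stream semantics simply reads one item after another from the byte stream until it is exhausted.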


Principles of immutability and copy-on-write in polars python api

Hi, I'm working on a fan-fiction project: a full feature + syntax translation of pypolars to R, called "minipolars".
I understand that the pypolars API, e.g. DataFrame, in general exhibits immutable behaviour, roughly the same as 'copy-on-write' behaviour. Most methods altering the DataFrame object will return a cheap copy.
Exceptions known to me are DataFrame.extend and the columns setter.
In R, most APIs strive for strictly immutable behaviour. I imagine supporting both a strictly immutable behaviour and an optional pypolars-like behaviour.
The Rust-polars API has many mutable operations + lifetimes and what not, but it is understandably all about performance and expressiveness.
Are there many more central mutable behaviours in the pypolars API?
Would a pypolars API with only immutable behaviour suffer in performance and expressiveness?
The R library data.table does stray away from immutable behaviour at times. However, all such mutable operations are prefixed set_ or use the set-operator :=.
Is there an obvious way in pypolars to recognize whether an operation is mutable or not?
By mutable behaviour I mean, e.g., that executing the method .extend() after defining the variable df_mutable_copy still affects the value of df_mutable_copy:
```python
import polars as pl

df1 = pl.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
df2 = pl.DataFrame({"foo": [10, 20, 30], "bar": [40, 50, 60]})

df_copy = df1  # no copy is made; both names refer to the same DataFrame
df_copy_old_shape = df_copy.shape
df1.extend(df2)
df_copy_new_shape = df_copy.shape

# extend was an operation with mutable behaviour; df_copy was affected.
df_copy_old_shape != df_copy_new_shape  # True
```
Most of the python polars API is actually a wrapper around polars lazy.
```python
(df.with_columns(..)
   .join(..)
   .groupby()
   .select(..))
```
translates to:
```python
(df.lazy().with_columns(..).collect(no_optimization=True)
   .lazy().join(..).collect(no_optimization=True)
   .lazy().groupby().collect(no_optimization=True)
   .lazy().select(..).collect(no_optimization=True))
```
That means that almost all expressions run on the polars query engine. The query engine itself determines if an operation can be done in place, or if it should clone the data (copy-on-write).
Polars actually has copy-on-write on steroids, as it only copies if the data is shared. If we are the only owner, we mutate in place. We can do this because Rust has a borrow checker, so if we own the data and the ref count is 1, we can mutate the data.
I would recommend implementing your R-polars API similarly to what we do in Python. Then all operations can be made pure (e.g. return a new Series/Expr/DataFrame) and polars will decide when to mutate in place.
Don't worry about copying data. All data buffers are wrapped in an Arc, so we only increment a reference count.
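To illustrate the "mutate in place only when the ref count is 1" idea, here is a toy Python sketch of the principle (CowBuffer is an invented name, and this is CPython-specific via sys.getrefcount; it is not how polars is implemented internally):

```python
import sys

class CowBuffer:
    """Toy copy-on-write buffer: mutate in place only when solely owned."""

    def __init__(self, data):
        self._data = data

    def append(self, value):
        # getrefcount counts its own argument too, so a sole owner sees 2.
        if sys.getrefcount(self._data) <= 2:
            self._data.append(value)           # sole owner: mutate in place
        else:
            self._data = self._data + [value]  # shared: copy first, then write
        return self

buf = CowBuffer([1, 2])
alias = buf._data   # someone else now shares the underlying list
buf.append(3)       # data is shared -> a copy is made; alias is untouched
print(alias)        # [1, 2]
print(buf._data)    # [1, 2, 3]
```

The pure-API-plus-internal-CoW design described above gives callers immutable semantics while still avoiding copies in the common single-owner case.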

Dart, what is the "where" keyword

I've seen the occasional list.where() code on stackoverflow.
So I'm trying to figure this out.
How is it different from .map()?
Where can I read the documentation to learn in detail about this?
(firstWhere, startWith ... etc very complex queries with => keywords)
Here is how these methods work, by example:
final list = [1, 2, 3, 4];
// Iterable.where filters elements based on a condition
list.where((element) => element.isOdd); // (1, 3)
// Iterable.map transforms each value into a new one (not necessarily of the same type)
list.map((element) => 2 * element); // (2, 4, 6, 8)
Where to read this information
One of the best things about Dart (and Flutter) is its source-code documentation. In most code editors, you can ctrl+click on a method to see its documentation. From there you can learn what it does, and there are sometimes examples.
Another great place is StackOverflow of course, since those questions are asked over and over, especially basic ones like this which are often the same across different languages.
you can read it from here
The difference between where and map is basically this: where returns a new lazy Iterable containing only the elements that satisfy the passed test, while map also returns a new Iterable (of the same length), in which each element is mapped by the passed function to a different value (whether of the same type or a completely different one).
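Since the question also mentions firstWhere, here is a small sketch of these Iterable methods side by side (plain Dart; note that where and map return lazy Iterables, which print with parentheses):

```dart
void main() {
  final list = [1, 2, 3, 4];

  // where: lazily keeps only the elements passing the test
  final odds = list.where((e) => e.isOdd); // (1, 3)

  // map: lazily transforms every element (length preserved)
  final doubled = list.map((e) => e * 2); // (2, 4, 6, 8)

  // firstWhere: the first element passing the test; orElse handles "no match"
  final firstEven = list.firstWhere((e) => e.isEven, orElse: () => -1); // 2

  print('$odds $doubled $firstEven');
}
```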

How exactly does a variadic parameter work?

I'm a Swift newbie and am having a bit of trouble understanding what a variadic parameter is exactly, and why it's useful. I'm currently following along with the online Swift 5.3 guide, and this is the example that was given for this type of parameter.
func arithmeticMean(_ numbers: Double...) -> Double {
    var total: Double = 0
    for number in numbers {
        total += number
    }
    return total / Double(numbers.count)
}
arithmeticMean(1, 2, 3, 4, 5)
// returns 3.0, which is the arithmetic mean of these five numbers
arithmeticMean(3, 8.25, 18.75)
// returns 10.0, which is the arithmetic mean of these three numbers
Apparently, the variadic parameter called numbers has a type of Double..., which allows it to be used in the body of the function as a constant array. Why does the function return Double(numbers.count) instead of just numbers.count? And instead of creating a variadic parameter, why not just create a parameter that takes in an array that's outside of the function like this?
func addition(numbers : [Int]) -> Int
{
var total : Int = 0
for number in numbers
{
total += number
}
return total
}
let totalBruhs : [Int] = [4, 5, 6, 7, 8, 69]
addition(numbers: totalBruhs)
Also, why can there only be one variadic parameter per function?
Variadic parameters need (well, not need, but it's nice for them) to exist in Swift because they exist in C, and many things in Swift bridge to C. In C, creating a quick array of arbitrary length is not so simple as it is in Swift.
If you were building Swift from scratch with no backwards compatibility to C, then maybe they'd have been added, and maybe not. (Though I'm betting yes, just because so many Swift developers are used to languages where they exist. But then again, languages like Zig have intentionally gotten rid of variadic parameters, so I don't know. Zig also demonstrates that you don't need variadic parameters to bridge to C, but still, it's kind of nice. And @Rob's comments below are worth reading. He's probably not wrong. Also, his answer is insightful.)
But they're also convenient because you don't need to add the [...], which makes it much nicer when there's just one value. In particular, consider something like print:
func print(_ items: Any..., separator: String = " ", terminator: String = "\n")
Without variadic parameters, you'd need to put [...] in every print call, or you'd need overloads. Variadic doesn't change the world here, but it's kind of nice. It's particularly nice when you think about the ambiguities an overload would create. Say you didn't have variadics and instead had two overloads:
func print(_ items: [Any]) { ... }
func print(_ item: Any) { print([item]) }
That's actually a bit ambiguous, since Array is also a kind of Any. So print([1,2,3]) would print [[1,2,3]]. I'm sure there's some possible work-arounds, but variadics fix that up very nicely.
There can be only one because otherwise there are ambiguous cases.
func f(_ xs: Int..., _ ys: Int...)
What should f(1,2,3) do in this case? What is xs and what is ys?
The function you've shown here doesn't return Double(numbers.count). It converts numbers.count to a Double so it can be divided into another Double. The function returns total / Double(numbers.count).
And instead of creating a variadic parameter, why not just create a parameter that takes in an array that's outside of the function ... ?
I agree with you that it feels intuitive to use arrays for arithmetic functions like “mean”, “sum”, etc.
That having been said, there are situations where the variadic pattern feels quite natural:
There are scenarios where using an array might not be logical or intuitive at the calling point.
Consider a max function that is supposed to be returning the larger of two values. It doesn’t feel quite right to impose a constraint that the caller must create an array of these values in order to return the larger of two values. You really want to allow a nice, simple syntax:
let result = max(a, b)
But at the same time, as an API developer, there’s also no reason to restrict the max implementation to only allow two parameters. Maybe the caller might want to use three. Or more. As API developers, we design APIs for natural calling points for the primary use cases, but provide as much flexibility as we can. So a variadic function parameter is both very natural and very flexible.
There are lots of possible examples of this pattern, namely any function that naturally feels like it should take two parameters but might take more. Consider a union function for two rectangles where you want the bounding rectangle. Again, you don’t want the caller to have to create an array for what might be a simple union of two rectangles.
Another common example would be where you might have a variable number of parameters but might not be dealing with arrays. The classic example would be the printf pattern. Another is where you are interacting with some SQL database and might be binding values to ? placeholders in the SQL or the like (to protect against SQL injection attacks):
let sql = "SELECT book_id, isbn FROM books WHERE title = ? AND author = ?"
let resultSet = db.query(sql, title, author)
Again, in these cases, suggesting that the caller must create an array for this heterogeneous collection of values might not feel natural at the calling point.
So, the question isn’t “why would I use variadic parameter where arrays are logical and intuitive?” but rather “why would I force the use of array parameters where it might not be?”
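As a concrete sketch of the max discussion (maxOf is a hypothetical name here; the standard library ships its own max): making the first parameter non-variadic guarantees at least one argument at compile time, while the variadic tail keeps the call site flexible:

```swift
// Hypothetical variadic "max of at least one value".
// Calling with zero arguments is a compile-time error thanks to `first`.
func maxOf(_ first: Double, _ rest: Double...) -> Double {
    var result = first
    for value in rest where value > result {
        result = value
    }
    return result
}

maxOf(3, 8.25, 18.75)  // 18.75
maxOf(42)              // 42 — no [...] needed for the single-value case
```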

Ambiguous use of 'lazy'

I have no idea why this example is ambiguous. (My apologies for not adding the code here, it's simply too long.)
I have added prefix(_ maxLength) as an overload to LazyDropWhileBidirectionalCollection. subscript(position) is defined on LazyPrefixCollection. The following code from the above example shouldn't be ambiguous, yet it is:
print([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2)[0]) // Ambiguous use of 'lazy'
It is my understanding that an overload that's higher up in the protocol hierarchy will get used.
According to the compiler it can't choose between two types; namely LazyRandomAccessCollection and LazySequence. (Which doesn't make sense since subscript(position) is not a method of LazySequence.) LazyRandomAccessCollection would be the logical choice here.
If I remove the subscript, it works:
print(Array([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2))) // [0, 1]
What could be the issue?
The trail here is just too complicated and ambiguous. You can see this by dropping elements. In particular, drop the last subscript:
let z = [0, 1, 2].lazy.drop(while: {_ in false}).prefix(2)
In this configuration, the compiler wants to type z as LazyPrefixCollection<LazyDropWhileBidirectionalCollection<[Int]>>. But that isn't indexable by integers. I know it feels like it should be, but it isn't provable by the current compiler. (see below) So your [0] fails. And backtracking isn't powerful enough to get back out of this crazy maze. There are just too many overloads with different return types, and the compiler doesn't know which one you want.
But this particular case is trivially fixed:
print([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2).first!)
That said, I would absolutely avoid pushing the compiler this hard. This is all too clever for Swift today. In particular overloads that return different types are very often a bad idea in Swift. When they're simple, yes, you can get away with it. But when you start layering them on, the compiler doesn't have a strong enough proof engine to resolve it. (That said, if we studied this long enough, I'm betting it actually is ambiguous somehow, but the diagnostic is misleading. That's a very common situation when you get into overly-clever Swift.)
Now that you describe it (in the comments), the reasoning is straightforward.
LazyDropWhileCollection can't have an integer index. Index subscripting is required to be O(1). That's the meaning of the Index subscript versus other subscripts. (The Index subscript must also return the Element type or crash; it can't return an Element?. That's why there's a DictionaryIndex that's separate from Key.)
Since the collection is lazy and has an arbitrary number of missing elements, looking up any particular integer "count" (first, second, etc.) is O(n). It's not possible to know what the 100th element is without walking through at least 100 elements. To be a collection, its O(1) index has to be in a form that can only be created by having previously walked the sequence. It can't be Int.
This is important because when you write code like:
for i in 1...1000 { print(xs[i]) }
you expect that to be on the order of 1000 "steps," but if this collection had an integer index, it would be on the order of 1 million steps. By wrapping the index, they prevent you from writing that code in the first place.
This is especially important in highly generic languages like Swift where layers of general-purpose algorithms can easily cascade an unexpected O(n) operation into completely unworkable performance (by "unworkable" I mean things that you expected to take milliseconds taking minutes or more).
Change the last line to this:
let x = [0, 1, 2]
let lazyX: LazySequence = x.lazy
let lazyX2: LazyRandomAccessCollection = x.lazy
let lazyX3: LazyBidirectionalCollection = x.lazy
let lazyX4: LazyCollection = x.lazy
print(lazyX.drop(while: {_ in false}).prefix(2)[0])
You can see that the array's lazy view can be typed in 4 different ways; you will have to be explicit.

How do I remove an element from a Vector?

This is what I'm doing now:
private var accounts = Vector.empty[Account]
def removeAccount(account: Account)
{
accounts = accounts.filterNot(_ == account)
}
Is there a more readable solution? Ideally, I'd like to write accounts = accounts.remove(account).
I'd use this:
accounts filterNot account.==
Which reads pretty well to me, but ymmv. I'd also like a count that doesn't take a predicate, but the collection library is really lacking in specialized methods where one with a predicate can generalize the operation.
Until 2.8.x, there was a - method, which got deprecated, iirc, because of semantic issues. It could actually have come back in 2.10 if my memory serves me right, but it didn't. Edit: I checked, and saw that - is now reserved for a mutable method that modifies the collection it is applied on. I'd be all in favor of -: / :- on sequences, though, where it makes sense to drop the first or last element equal to something. Anyone willing to front a ticket for that? I'd upvote it. :-)
There unfortunately is not, and worse still (perhaps), if the same account is present twice, filterNot will remove both of them. The only thing I can offer for readability is to use
accounts.filter(_ != account)
Another possibility is to use a collection type that does have a remove operation, such as a TreeSet (where it is called -). If you don't have duplicate entries anyway, a Set is perfectly okay. (It is slower for some operations, of course, but it probably is a better fit for the application; it's more efficient at removing individual entries, whereas with filterNot you basically have to build the entire Vector again.)
You could do something like this:
def removeFirst[T](xs: Vector[T], x: T) = {
val i = xs.indexOf(x)
if (i == -1) xs else xs.patch(i, Nil, 1)
}
then
accounts = removeFirst(accounts, account)
I think the nub of the problem, though, is that a Vector probably isn't the right collection type for a set of items where you want to pull things out (hint: try Set). If you want to index on an ID or an insertion index then Map could be what you're after (which does have a - method). If you want to index on multiple things efficiently, you need a database!
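If you do want exactly the accounts.remove(account) spelling from the question, a small enrichment can wrap the same patch-based logic (VectorOps is an invented name; this removes the first match only, unlike filterNot):

```scala
// Hypothetical enrichment giving Vector a remove method.
implicit class VectorOps[T](xs: Vector[T]) {
  def remove(x: T): Vector[T] = {
    val i = xs.indexOf(x)
    if (i == -1) xs else xs.patch(i, Nil, 1)
  }
}

// usage, with the implicit class in scope:
// accounts = accounts.remove(account)
```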
You can use the diff method which is defined for all sequences. It computes the multiset difference between two sequences - meaning it will remove as many occurrences of an element as you need.
Vector(1, 2, 1, 3, 2).diff(Seq(1))       // Vector(2, 1, 3, 2)
Vector(1, 2, 1, 3, 2).diff(Seq(1, 1))    // Vector(2, 3, 2)
Vector(1, 2, 1, 3, 2).diff(Seq(1, 1, 2)) // Vector(3, 2)
If you prefer not to use the filterNot closure, you could use the more verbose yet more explicit for-comprehension style instead.
private var accounts = Vector.empty[Account]
def removeAccount(account: Account)
{
accounts = for {
  a <- accounts
  if a != account
} yield a
}
It's a matter of personal preference whether this is felt to be better in this case.
Certainly for more complex expressions involving nested flatMaps etc, I agree with Martin Odersky's advice that for-comprehensions can be quite a bit easier to read, especially for novices.