Subsets and shapeless extensible records - scala

How do I define a polymorphic function that:
1. accepts any record containing a specified set of fields (that is, any superset of those fields)
2. returns any subset of a specified set of fields
with shapeless-2.3?
I've found a solution for a single field for 1., but I need to work with a set of fields. I've found a suggestion to define a class containing implicits for each of the fields, but I think there should be a less boilerplate-heavy way to define it in a language as advanced as Scala. I've also found a suggestion that the SelectAll trait can be used for this, but no concrete example of how exactly to use it.

Looks like your question is a duplicate of this one:
Checking for subtype relationship between extensible records in shapeless
The functionality you are looking for is implemented as the Extractor type class and will be available in shapeless 2.3.3 (github.com/milessabin/shapeless/pull/714)

Related

What are the disadvantages of using records instead of classes?

C# 9 introduces record reference types. A record provides some synthesized methods, such as a copy constructor, a clone operation, hash code calculation, and comparison/equality operations. It seems convenient to me to use records instead of classes in general. Are there reasons not to do so?
It seems to me that currently Visual Studio as an editor does not support records as well as classes but this will probably change in the future.
Firstly, be aware that if it's possible for a class to contain circular references (which is true for most mutable classes), then many of the auto-generated record members can cause a stack overflow. So that's a pretty good reason not to use records for everything.
So when should you use a record?
Use a record when an instance of a class is entirely defined by the public data it contains, and has no unique identity of its own.
This means that the record is basically just an immutable bag of data. I don't really care about that particular instance of the record at all, other than that it provides a convenient way of grouping related bits of data together.
Why?
Consider the members a record generates:
Value Equality
Two instances of a record are considered equal if they have the same data (by default: if all fields are the same).
This is appropriate for classes with no behavior, which are just used as immutable bags of data. However this is rarely the case for classes which are mutable, or have behavior.
For example if a class is mutable, then two instances which happen to contain the same data shouldn't be considered equal, as that would imply that updating one would update the other, which is obviously false. Instead you should use reference equality for such objects.
Meanwhile if a class is an abstraction providing a service you have to think more carefully about what equality means, or if it's even relevant to your class. For example imagine a Crawler class which can crawl websites and return a list of pages. What would equality mean for such a class? You'd rarely have two instances of a Crawler, and if you did, why would you compare them?
with blocks
with blocks provide a convenient way to copy an object and update specific fields. This is always safe if the object has no identity, as copying it doesn't lose any information. Copying a mutable class, however, loses the identity of the original object, as updating the copy won't update the original. As such you have to consider whether this really makes sense for your class.
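The value-equality and copy behaviour described above can be sketched in Scala, whose case classes are a rough analogue of C# records. This is only an analogy, not C# semantics, and the names Point and Counter are hypothetical, chosen for illustration.

```scala
object RecordLike {
  // Case class: equality is derived from the fields, much like a record.
  final case class Point(x: Int, y: Int)

  // Plain mutable class: equality stays reference-based unless overridden.
  final class Counter(var count: Int)

  val p1: Point = Point(1, 2)
  val p2: Point = Point(1, 2)
  // Same data, so the two instances compare equal.
  val valueEqual: Boolean = p1 == p2

  val c1 = new Counter(0)
  val c2 = new Counter(0)
  // Same data, but distinct objects, so they do not compare equal.
  val referenceEqual: Boolean = c1 == c2

  // copy plays the role of a `with` expression: p1 itself is left unchanged.
  val moved: Point = p1.copy(x = 10)
}
```

Counter keeps reference equality on purpose: incrementing c1 does not change c2, so treating the two as equal would be misleading.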
ToString
The generated ToString prints out the values of all public properties. If your class is entirely defined by the properties it contains, then this makes a lot of sense. However if your class is not, then that's not necessarily the information you are interested in. A Crawler for example may have no public fields at all, but the private fields are likely to be highly relevant to its behavior. You'll probably want to define ToString yourself for such classes.
All properties of a record are public by default
All properties of a record are immutable by default
By "by default", I mean when using the simple record definition syntax.
Also, records can only derive from records and you cannot derive a regular class from a record.

How do I choose the collection type to return in Scala?

In case of Set or List, the choice seems to be easier, but what do I do for Java's Collection, Iterable equivalent? Do I go for Seq? Traversable? GenTraversableOnce?
You need to decide based on your needs. For example, according to the Scala documentation, the definition of Seq is:
Sequences are special cases of iterable collections of class Iterable. Unlike iterables, sequences always have a defined order of elements. Sequences provide a method apply for indexing.
So if you want to benefit from ordering, or you want to retrieve elements by index, you can use Seq.
Again according to the Scala documentation, if you are mainly interested in iterating over your collection, Traversable is sufficient.
Note that it is generally good practice to use the most general (abstract) data type in a function signature, such as the return type, to avoid imposing an unnecessary performance penalty on the function's callers.
As often, it will depend on the needs of your caller.
Traversable is pretty high level (you only get foreach) but it might be sufficient. Seq would be used if you need a defined order of elements. GenTraversableOnce would be a bit abstract for me and possibly for your fellow coders.
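As a minimal sketch of this advice, the two hypothetical functions below promise only what their callers need. Iterable is used here instead of Traversable as an assumption about the Scala version: Traversable is deprecated from Scala 2.13 onward, where it is an alias for Iterable.

```scala
object ReturnTypes {
  // Callers only iterate over the result, so promise no more than Iterable.
  def nonEmptyNames(names: List[String]): Iterable[String] =
    names.filter(_.nonEmpty)

  // Callers rely on a defined order and on indexing, so promise Seq.
  def rankedByScore(scores: Map[String, Int]): Seq[String] =
    scores.toSeq.sortBy { case (_, score) => -score }.map(_._1)
}
```

A caller can index into the result of rankedByScore, for example with rankedByScore(scores)(0) for the top entry, while the result of nonEmptyNames only guarantees iteration.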

casbah mongodb more typesafe way to access object parameters

In casbah, MongoDBObject has two methods, .getAs and .getAsOrElse, which return the value of the relevant field as the type given as the type parameter.
val dbo:MongoDBObject = ...
dbo.getAs[String](param)
This must be using a type cast, because we can get a Long as a String by giving String as the type parameter, which might cause a class cast exception at runtime. Is there a more type-safe way to retrieve the original type of the result?
This must be possible, because the type information of the element should be present in the output of getAs.
Check out this excellent presentation on Salat by its author. What you're looking for is Salat's grater, which can convert to and from DBObject.
Disclaimer: I am biased, as I'm the author of Subset
I built this small library "Subset" exactly for the reason to be able to work effectively with DBObject's fields (both scalar and sub-documents) in a type-safe manner. Look through Examples and see if it fits your needs.
The problem is that mongodb can store multiple types for a single field, so, I'm not sure what you mean by making this typesafe. There's no way to enforce it on the database side, so were you hoping that there is a way to enforce it on the casbah side? You could just do get("fieldName"), and get an Object, to be safest--but that's hardly an improvement, in my opinion.
I've been happy using Salat + Casbah, and when my database record doesn't match my Salat case class, I get a runtime exception. I just know that I have to run migration scripts when I change the types in my model, or create a new model for the new types (multiple models can be stored in the same collection). At least the Salat grater/DAO methods make it less of a hassle (you don't have to specify types every time you access a variable).
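The cast problem raised in the question can be sketched without Casbah. The code below uses a plain Map[String, Any] as a stand-in for a document; it is not Casbah's MongoDBObject, and getUnchecked and getChecked are hypothetical helpers, not part of any library mentioned above. It shows why an erased cast appears to succeed, and how a ClassTag allows a runtime check that yields None on a mismatch.

```scala
import scala.reflect.ClassTag

object TypedGet {
  // Stand-in for a stored document (an assumption, not a Casbah type).
  type Doc = Map[String, Any]

  // Unchecked: T is erased at runtime, so this cast cannot fail here.
  // A wrong T only surfaces later, when the value is used as a T.
  def getUnchecked[T](doc: Doc, key: String): Option[T] =
    doc.get(key).map(_.asInstanceOf[T])

  // Checked: the ClassTag carries T's runtime class, so a value of the
  // wrong type is rejected immediately and reported as None.
  def getChecked[T](doc: Doc, key: String)(implicit ct: ClassTag[T]): Option[T] =
    doc.get(key).flatMap(value => ct.unapply(value))
}
```

With a document such as Map("name" -> "a", "n" -> 42L), getUnchecked[String](doc, "n") returns a defined Option that actually holds a Long, while getChecked[String](doc, "n") returns None. This is still a runtime check; as noted above, nothing on the database side enforces the field's type.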

Theoretical difference between classes and types

This question has been asked on here a few times, but none of the replies really answered it in the more abstract, theoretical sense that I am looking for.
Most answers are something along the lines of "A class has implementations for methods that its objects can respond to, while a type just specifies which methods can be responded to".
Well, this seems kind of like an odd definition to me. Take ints, floats, and chars in a language like C. It may never be explicitly located in the code, but there are definitely methods built in to the language for responding to the messages ("plus", "minus", etc.) that these types receive.
And as all interfaces must have methods defined somewhere, it seems to me that types are the same thing as classes, except the word "class" carries a mental image of a more substantial programming structure than a "type".
Which leads me to believe that the drawbacks that apply to any class-based language (the "expression problem" for example) would similarly apply to any language with types (Haskell, etc.)
There is no widely applicable, generally accepted definition of the term "class" that I'm aware of, not even wrt type systems. So your question pretty much depends on the context.
If you are talking about classes in object-oriented languages then the description you quote is relatively accurate. Types are specifications, descriptions (of objects or other values). Classes are implementations, definitions (of object factories).
However, in many OO languages, class definitions also introduce distinct type names, and these type names are often the only means to type objects. That's an unfortunate limitation and conflation of concepts, which also leads to the well-known confusion of subtyping and inheritance. At least some languages separate these concepts properly, e.g. OCaml.
In any case, the reason why the distinction is seemingly at odds with ints and floats in C is simple: those are not objects. Despite what OO ideology tries to preach, not everything is an object, and certainly not in every language.
Simply put, a class will often have methods that manipulate the data contained within an instance. A type will not; it only is meant to hold and return data.
Although it is true that there may be methods specified somewhere for the type, there will only be one way to change the data contained within an instance of a type - storing a new value in it. The methods are generally along the lines of presenting the data in different ways, instead of actually manipulating the data.
This rule can, of course, be broken; C is full of examples, due to how it is structured (or, rather, not structured). Generally speaking, though, you don't want to have a type with a function that does fancy logic internally.
"Class" and "type" mean different things in different languages and environments; I will try to show here a synthesis that helps me think about the issue.
Classes have objects, and types have values. I think it is easier to understand the difference between objects and values than between classes and types. An object has 2 independent properties: its identity, and its state/behaviour. So, you can have two different objects with the same class and state. This is not true for values: you cannot have 2 different values of a type that have the exact same state (or form, shape) and behaviour: you cannot have 2 "twos". A value of a type does not have an identity independent of its state and behaviour.
Mixing both concepts together, you might say that a value of a given type does not necessarily have a class, but an object of a given class necessarily has a type (e.g. object), and its value is given both by its state/behaviour and by its identity.
Haskell has types, and definable ones if I am correct. It is from Haskell that I am taking the "type" concept I am using. Python has classes and types mixed into the same "type" system, with some primitive types and rich definable classes. The concept of object that I am using is that of the type system of Python, minus its primitive types: int, str, etc.
Another key difference between types and classes would be in their definition. Types are typically defined by a set of predicates or constraints that "give" all at once all of the values of the type. Therefore, you can use a literal value without first having to "create" it: 23438573. The definition of a class involves a procedure to create objects, and all objects of that class must be created before they are used.

How do I implement a collection in Scala 2.8?

In trying to write an API I'm struggling with Scala's collections in 2.8(.0-beta1).
Basically what I need is to write something that:
adds functionality to immutable sets of a certain type
where all methods like filter and map return a collection of the same type without having to override everything (which is why I went for 2.8 in the first place)
where all collections you gain through those methods are constructed with the same parameters the original collection had (similar to how SortedSet hands through an ordering via implicits)
which is still a trait in itself, independent of any set implementations.
Additionally I want to define a default implementation, for example based on a HashSet. The companion object of the trait might use this default implementation. I'm not sure yet if I need the full power of builder factories to map my collection type to other collection types.
I read the paper on the redesign of the collections API, but it seems like things have changed a bit since then and I'm missing some details in there. I've also dug through the collections source code, but I'm not sure it's very consistent yet.
Ideally what I'd like to see is either a hands-on tutorial that tells me step-by-step just the bits that I need or an extensive description of all the details so I can judge myself which bits I need. I liked the chapter on object equality in "Programming in Scala". :-)
But I appreciate any pointers to documentation or examples that help me understand the new collections design better.
I'd have a look at the implementation of collection.immutable.BitSet. It's a bit spread out, reusing things from collection.BitSetLike and collection.generic.BitSetFactory. But it does exactly what you specified: implement an immutable set of a certain element type that adds new functionality.