I have the unfortunate task of converting a networking library that I wrote a couple of years back from Scala to Java, due to a lack of offshore Scala resources.
One of the trickier areas: converting the package object with its type aliases and case classes. Here is an excerpt:
package object xfer {
type RawData = Array[Byte]
type DataPtr = String
type PackedData = RawData
// type PackedData = (DataPtr, RawData, RawData)
// type UnpackedData = (DataPtr, Any, RawData)
type UnpackedData = Any
case class TaggedEntry(tag: String, data: Array[Byte])
case class TypedEntry[T](tag: String, t: T)
case class XferWriteParams(tag: String, config: XferConfig, data: RawData, md5: RawData) {
override def toString: DataPtr = s"XferWriteParams: config=$config datalen=${data.length} md5len=${md5.length}"
}
As an example, RawData alone has 32 usages. I suppose that one approach could be a simple find/replace of all 32 instances with byte[], but is there a more elegant way?
For the case classes, I'm leery of creating another few top-level files in this package for each of them - and likewise another few top-level files in each of a dozen other packages - but is there any alternative?
ADT-esque trait-case-class clusters like
trait T
case class C1() extends T
case class C2() extends T
could be converted to an abstract base class T, with nested static classes C1, C2:
abstract class T {
static class C1 extends T { ... }
static class C2 extends T { ... }
}
This would at least eliminate the need to explode each such enumeration into a thousand top-level classes in separate files.
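As a concrete sketch of that conversion (the Int field on C1 is hypothetical, added only so the example has something to compare), the cluster could collapse into a single Java file along these lines. Note that the conveniences case classes give for free (equals/hashCode/toString) must be written by hand:

```java
// Sketch: Scala `trait T; case class C1(x: Int) extends T; case class C2() extends T`
// collapsed into one Java file. The field `x` is illustrative, not from the original.
abstract class T {
    static final class C1 extends T {
        final int x;
        C1(int x) { this.x = x; }
        // Hand-written equivalents of what a Scala case class generates.
        @Override public boolean equals(Object o) {
            return o instanceof C1 && ((C1) o).x == this.x;
        }
        @Override public int hashCode() { return Integer.hashCode(x); }
        @Override public String toString() { return "C1(" + x + ")"; }
    }
    static final class C2 extends T {
        @Override public String toString() { return "C2()"; }
    }
}
```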
For type aliases, you might consider promoting them to full-fledged wrapper classes, but you would have to be very careful whenever instantiating classes like RawData, using ==/equals, hashCode, or putting them in HashSets or HashMaps. Something like this might work:
import java.util.Arrays;
class RawData {
private final byte[] value;
public RawData(byte[] v) { this.value = v; }
public byte[] getValue() { return value; }
public static RawData of(byte[] value) {
return new RawData(value);
}
@Override public int hashCode() {
return Arrays.hashCode(value);
}
@Override public boolean equals(Object other) {
if (other instanceof RawData) {
return Arrays.equals(value, ((RawData) other).value);
} else {
return false;
}
}
@Override public String toString() {
return Arrays.toString(value);
}
}
This would keep the signatures similar, and might even enhance type-safety to some extent. In cases where it is really performance-critical, I'd propose just find/replacing all occurrences with byte[].
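The warning above is worth making concrete: Java arrays use identity-based equals and hashCode, which is exactly why the wrapper must delegate to java.util.Arrays rather than to the array itself. A minimal demonstration:

```java
import java.util.Arrays;

class ArrayEqualityDemo {
    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        System.out.println(a.equals(b));         // false: identity comparison only
        System.out.println(Arrays.equals(a, b)); // true: element-wise comparison
    }
}
```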
I'm trying to map a column which is a map with a frozen type.
My column family has a field
batsmen_data map<text, frozen<bat_card>>
bat_card has two fields
bat_id int,
bat_name text,
Mapping the column field:
object batsmenData extends MapColumn[ScoreCardData, ScoreCard, String ,Batting](this) {
override lazy val name="batsmen_data"
}
This is not an ideal way to do it, because MapColumn supports only primitive types. Can anyone help me out with how to create UDT columns?
I've figured out a way to map the UDT using phantom, but I'm not sure it's the right way.
I have a column
batsmen_data map<text, frozen<bat_card>>
To map this, we write the code below:
object batsmenData extends MapColumn[ScoreCardData, ScoreCard, String, Batting](this) with CustomPrimitives {
override lazy val name="batsmen_data"
}
When you compile, an error will be shown: No Primitive found for class Batting.
This is because Phantom defines Primitives only for native types like String, Int, etc. To avoid this, you have to define a primitive for the class Batting as shown below (you have to extend the CustomPrimitives trait in the table class, else you will get the same error).
trait CustomPrimitives extends Config {
implicit object BattingPrimitive extends Primitive[Batting]{
override type PrimitiveType = Batting
override def clz: Class[Batting] = classOf[Batting]
override def cassandraType: String = {
Connector.session.getCluster.getMetadata.getKeyspace(cassandraKeyspace).getUserType("bat_card").toString()
}
override def fromRow(column: String, row: dsl.Row): Try[Batting] = ???
override def fromString(value: String): Batting = ???
override def asCql(value: Batting): String = ???
}
}
After this, one more error will be shown: Codec not found for requested operation: [frozen <-> Batting]. This is because Cassandra expects a codec for the custom UDT type to serialize and deserialize the data. To avoid this you have to write a codec class which will help to deserialize (as I need only deserialization) the UDT value into a custom object.
public class BattingCodec extends TypeCodec<Batting>{
protected BattingCodec(TypeCodec<UDTValue> innerCodec, Class<Batting> javaType) {
super(innerCodec.getCqlType(), javaType);
}
@Override
public ByteBuffer serialize(Batting value, ProtocolVersion protocolVersion) throws InvalidTypeException {
return null;
}
@Override
public Batting deserialize(ByteBuffer bytes, ProtocolVersion protocolVersion) throws InvalidTypeException {
return null;
}
@Override
public Batting parse(String value) throws InvalidTypeException {
return null;
}
@Override
public String format(Batting value) throws InvalidTypeException {
return null;
}
}
Once the codec is defined, the last step is to register it in the codec registry.
val codecRegistry = CodecRegistry.DEFAULT_INSTANCE
val bat_card = Connector.session.getCluster.getMetadata.getKeyspace(cassandraKeyspace).getUserType("bat_card")
val batCodec = new BattingCodec(TypeCodec.userType(bat_card), classOf[Batting])
codecRegistry.register(batCodec)
Now, using the deserialize function in BattingCodec, we can map the bytes to the required object.
This method is working fine, but I'm not sure it is the ideal way to achieve UDT functionality with Phantom.
I'm looking for some insight into Scala internals. We've just come out the other side of a painful debug session, and found out our problem was caused by an unexpected null value that we had thought would be pre-initialised. We can't fathom why that would be the case.
Here is an extremely cut-down example of the code which illustrates the problem (if it looks convoluted, it's because it's much more complicated in the real code, but I've left the basic structure alone in case it's significant).
trait A {
println("in A")
def usefulMethod
def overrideThisMethod = {
//defaultImplementation
}
val stubbableFunction = {
//do some stuff
val stubbableMethod = overrideThisMethod
//do some other stuff with stubbableMethod
}
}
class B extends A {
println("in B")
def usefulMethod = {
//do something with stubbableFunction
}
}
class StubB extends B {
println("in StubB")
var usefulVar = "super useful" //<<---this is the val that ends up being null
override def overrideThisMethod {
println("usefulVar = " + usefulVar)
}
}
If we kick off the chain of initialisation, this is what is printed to the console:
scala> val stub = new StubB
in A
usefulVar = null
in B
in StubB
My assumptions
I assume that in order to instantiate StubB, first we instantiate trait A, then B, and finally StubB: hence the printing order of "in A", "in B", "in StubB". I assume stubbableFunction in trait A is evaluated on initialisation because it's a val, same for stubbableMethod.
From here on is where I get confused.
My question
When the val stubbableFunction is evaluated in trait A, I would expect the call to overrideThisMethod to follow the chain downwards to StubB (which it does - you can tell because "usefulVar = null" is printed), but why is the value null here? How can overrideThisMethod in StubB be evaluated without first initialising the StubB class and therefore setting usefulVar? I didn't know you could have "orphaned" methods being evaluated this way - surely methods have to belong to a class which has to be initialised before you can call the method?
We actually solved the problem by changing val stubbableFunction = to def stubbableFunction = in trait A, but we'd still really like to understand what was going on here. I'm looking forward to learning something interesting about how Scala (or maybe Java) works under the hood :)
edit: I changed the null value to be a var and the same thing happens - question updated for clarity in response to m-z's answer
I stripped down the original code even more leaving the original behavior intact. I also renamed some methods and vals to express the semantics better (mostly function vs value):
trait A {
println("in A")
def overridableComputation = {
println("A::overridableComputation")
1
}
val stubbableValue = overridableComputation
def stubbableMethod = overridableComputation
}
class StubB extends A {
println("in StubB")
val usefulVal = "super useful" //<<---this is the val that ends up being null
override def overridableComputation = {
println("StubB::overridableComputation")
println("usefulVal = " + usefulVal)
2
}
}
When run it yields the following output:
in A
StubB::overridableComputation
usefulVal = null
in StubB
super useful
Here are some Scala implementation details to help us understand what is happening:
the main constructor is intertwined with the class definition, i.e. most of the code (except method definitions) between curly braces is put into the constructor;
each val of the class is implemented as a private field and a getter method; both the field and the method are named after the val (the JavaBean convention is not adhered to);
the value for the val is computed within the constructor and is used to initialize the field.
As m-z already noted, initialization runs top-down, i.e. the parent class's or trait's constructor is called first, and the child's constructor is called last. So here's what happens when you call new StubB():
A StubB object is allocated in heap, all its fields are set to default values depending on their types (0, 0.0, null, etc);
A::A is invoked first as the top-most constructor;
"in A" is printed;
in order to compute the value for stubbableValue, overridableComputation is called; the catch is that the overridden method is the one actually invoked, i.e. StubB::overridableComputation - see "What's wrong with overridable method calls in constructors?" for more details;
"StubB::overridableComputation" is printed;
since usefulVal is not yet initialized by StubB::StubB, its default value is used, so "usefulVal = null" is printed;
2 is returned;
stubbableValue is initialized with the computed value of 2;
StubB::StubB is invoked as the next constructor in chain;
"in StubB" is printed;
the value for usefulVal is computed; in this case just the literal "super useful" is used;
usefulVal is initialized with the value "super useful".
Since the value for stubbableValue is computed during the constructor run, any overridden method it calls can observe subclass fields that have not been initialized yet.
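The same pitfall exists in plain Java, independent of Scala's trait encoding. A minimal sketch (class and field names are made up for illustration), where the superclass constructor calls an overridable method and observes the subclass field before its initializer has run:

```java
class Base {
    final String observedDuringConstruction;
    Base() {
        // Virtual dispatch: this calls Sub.describe() before Sub's
        // field initializers have executed.
        observedDuringConstruction = describe();
    }
    String describe() { return "base"; }
}

class Sub extends Base {
    String label = "super useful"; // assigned only after Base() completes
    @Override String describe() { return "label = " + label; }
}
// new Sub().observedDuringConstruction is "label = null"
```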
To verify these assumptions, the Fernflower Java decompiler can be used. Here's how the above Scala code looks when decompiled to Java (I removed the irrelevant @ScalaSignature annotations):
import scala.collection.mutable.StringBuilder;
public class A {
private final int stubbableValue;
public int overridableComputation() {
.MODULE$.println("A::overridableComputation");
return 1;
}
public int stubbableValue() {
return this.stubbableValue;
}
public int stubbableMethod() {
return this.overridableComputation();
}
public A() {
.MODULE$.println("in A");
// Note, that overridden method is called below!
this.stubbableValue = this.overridableComputation();
}
}
public class StubB extends A {
private final String usefulVal;
public String usefulVal() {
return this.usefulVal;
}
public int overridableComputation() {
.MODULE$.println("StubB::overridableComputation");
.MODULE$.println(
(new StringBuilder()).append("usefulVal = ")
.append(this.usefulVal())
.toString()
);
return 2;
}
public StubB() {
.MODULE$.println("in StubB");
this.usefulVal = "super useful";
}
}
In case A is a trait instead of a class, the code is a bit more verbose, but the behavior is consistent with the class A variant. Since the JVM doesn't support multiple inheritance, the Scala compiler splits a trait into an abstract helper class, which contains only static members, and an interface:
import scala.collection.mutable.StringBuilder;
public abstract class A$class {
public static int overridableComputation(A $this) {
.MODULE$.println("A::overridableComputation");
return 1;
}
public static int stubbableMethod(A $this) {
return $this.overridableComputation();
}
public static void $init$(A $this) {
.MODULE$.println("in A");
$this.so32501595$A$_setter_$stubbableValue_$eq($this.overridableComputation());
}
}
public interface A {
void so32501595$A$_setter_$stubbableValue_$eq(int var1);
int overridableComputation();
int stubbableValue();
int stubbableMethod();
}
public class StubB implements A {
private final String usefulVal;
private final int stubbableValue;
public int stubbableValue() {
return this.stubbableValue;
}
public void so32501595$A$_setter_$stubbableValue_$eq(int x$1) {
this.stubbableValue = x$1;
}
public String usefulVal() {
return this.usefulVal;
}
public int overridableComputation() {
.MODULE$.println("StubB::overridableComputation");
.MODULE$.println(
(new StringBuilder()).append("usefulVal = ")
.append(this.usefulVal())
.toString()
);
return 2;
}
public StubB() {
A$class.$init$(this);
.MODULE$.println("in StubB");
this.usefulVal = "super useful";
}
}
Remember that a val is rendered into a field and a method? Since several traits can be mixed into a single class, a trait cannot be implemented as a plain class. Therefore, the method part of a val is put into the interface, while the field part is put into the class that the trait gets mixed into.
The abstract helper class contains the code of all the trait's methods; access to the member fields is provided by passing $this explicitly.
Based on the Play (java) documentation, let's say I have the following example:
public class UserForm {
public String name;
public List<MyClass> items;
}
and
@helper.inputText(userForm("name"))
@helper.repeat(userForm("items"), min = 1) { itemField =>
@helper.inputText(itemField)
}
However, in MyClass I have an overridden implementation of compareTo(). I also have a getter getSortedItems() that will return the list in the proper sorted order.
Currently, using the repeat() helper does not get my list of items in the ordering that I want. Is there a way to specify the ordering for the repeat() helper? Or can I give it a List as a parameter? It seems like this would be possible to do in Scala.
Any help would be appreciated, thanks!
You could replace List<MyClass> with a sorted set:
case class MyClass(id: Int, name: String)
val sorted = new mutable.TreeSet[MyClass]()(new Ordering[MyClass] {
def compare(a: MyClass, b: MyClass): Int = {
Ordering.Int.compare(a.id,b.id)
}
})
sorted.add(MyClass(2,"bob"))
sorted.add(MyClass(1,"bill"))
sorted.add(MyClass(3,"jane"))
I assume that the list will contain only unique instances of MyClass, so a set should work fine; every time you add an item, the set will make sure it stays sorted.
The Java version should be pretty close:
import java.util.Comparator;
import java.util.TreeSet;
public class MyClass {
public int id;
public String name;
}
public class MyClassComparator implements Comparator<MyClass> {
@Override
public int compare(MyClass a, MyClass b) {
return Integer.compare(a.id,b.id);
}
}
TreeSet<MyClass> sorted = new TreeSet<>(new MyClassComparator());
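A quick usage sketch of the Java version, mirroring the sample data from the Scala snippet (ids and names are purely illustrative); the comparator-backed TreeSet keeps items ordered by id on every insert:

```java
import java.util.Comparator;
import java.util.TreeSet;

class MyClass {
    int id;
    String name;
    MyClass(int id, String name) { this.id = id; this.name = name; }
}

class SortedSetDemo {
    public static void main(String[] args) {
        // Equivalent to the explicit MyClassComparator above.
        Comparator<MyClass> byId = Comparator.comparingInt(c -> c.id);
        TreeSet<MyClass> sorted = new TreeSet<>(byId);
        sorted.add(new MyClass(2, "bob"));
        sorted.add(new MyClass(1, "bill"));
        sorted.add(new MyClass(3, "jane"));
        for (MyClass c : sorted) {
            System.out.println(c.id + " " + c.name); // iterates in id order
        }
    }
}
```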
If a function accepts a structural type, it can be defined as:
def doTheThings(duck: { def walk; def quack }) { duck.quack }
or
type DuckType = { def walk; def quack }
def doTheThings(duck: DuckType) { duck.quack }
Then, you can use that function in following way:
class Dog {
def walk { println("Dog walk") }
def quack { println("Dog quacks") }
}
def main(args: Array[String]) {
doTheThings(new Dog);
}
If you decompile (to Java) the classes generated by scalac for my example, you can see that the argument of doTheThings is of type Object and that the implementation uses reflection to call methods on the argument (i.e. duck.quack).
My question is: why reflection? Isn't it possible to just use an anonymous class and invokevirtual instead of reflection?
Here is a way to translate (implement) the structural type calls for my example (Java syntax, but the point is the bytecode):
class DuckyDogTest {
interface DuckType {
void walk();
void quack();
}
static void doTheThing(DuckType d) {
d.quack();
}
static class Dog {
public void walk() { System.out.println("Dog walk"); }
public void quack() { System.out.println("Dog quack"); }
}
public static void main(String[] args) {
final Dog d = new Dog();
doTheThing(new DuckType() {
public final void walk() { d.walk(); }
public final void quack() { d.quack();}
});
}
}
Consider a simple proposition:
type T = { def quack(): Unit; def walk(): Unit }
def f(a: T, b: T) =
if (a eq b) println("They are the same duck!")
else println("Different ducks")
f(x, x) // x is a duck
It would print "Different ducks" under your proposal. You could refine it further, but you just cannot keep referential equality intact using a proxy.
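The referential-equality point can be demonstrated in plain Java (interface and class names are illustrative): two structurally identical proxies around the same underlying object are still distinct objects, so an identity check on the proxies gives the wrong answer about the wrapped instance.

```java
interface Duck { void quack(); }

class Dog {
    void quack() { System.out.println("Dog quack"); }
}

class ProxyIdentityDemo {
    public static void main(String[] args) {
        final Dog d = new Dog();
        // Two proxies wrapping the exact same dog...
        Duck p1 = new Duck() { public void quack() { d.quack(); } };
        Duck p2 = new Duck() { public void quack() { d.quack(); } };
        // ...are different objects, so an identity check reports "Different ducks".
        System.out.println(p1 == p2); // false
    }
}
```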
A possible solution would be to use the type class pattern, but that would require passing another parameter (even if implicit). Still, it's faster. But that's mostly because of the lameness of Java's reflection speed. Hopefully, method handles will get around the speed problem. Unfortunately, Scala is not scheduled to give up on Java 5, 6 and 7 (which do not have method handles) for some time...
In addition to your proxy object implementing the methods of the structural type, it would also need appropriate pass-through implementations of all the methods on Any (equals, hashCode, toString, isInstanceOf, asInstanceOf) and AnyRef (getClass, wait, notify, notifyAll, and synchronized). While some of these would be straightforward, some would be almost impossible to get right. In particular, all of the listed AnyRef methods are final (for Java compatibility and security) and so couldn't be properly implemented by your proxy object.
Is there any difference between case object and object in scala?
Here's one difference - case objects extend the Serializable trait, so they can be serialized. Regular objects cannot by default:
scala> object A
defined module A
scala> case object B
defined module B
scala> import java.io._
import java.io._
scala> val bos = new ByteArrayOutputStream
bos: java.io.ByteArrayOutputStream =
scala> val oos = new ObjectOutputStream(bos)
oos: java.io.ObjectOutputStream = java.io.ObjectOutputStream#e7da60
scala> oos.writeObject(B)
scala> oos.writeObject(A)
java.io.NotSerializableException: A$
Case classes differ from regular classes in that they get:
pattern matching support
default implementations of equals and hashCode
default implementations of serialization
a prettier default implementation of toString, and
the small amount of functionality that they get from automatically inheriting from scala.Product.
Pattern matching, equals and hashCode don't matter much for singletons (unless you do something really degenerate), so you're pretty much just getting serialization, a nice toString, and some methods you probably won't ever use.
scala> object foo
defined object foo
scala> case object foocase
defined object foocase
Serialization difference:
scala> foo.asInstanceOf[Serializable]
java.lang.ClassCastException: foo$ cannot be cast to scala.Serializable
... 43 elided
scala> foocase.asInstanceOf[Serializable]
res1: Serializable = foocase
toString difference:
scala> foo
res2: foo.type = foo$#7bf0bac8
scala> foocase
res3: foocase.type = foocase
A huge necro, but this is the highest result for this question on Google outside the official tutorial, which, as always, is pretty vague about the details. Here are some bare-bones objects:
object StandardObject
object SerializableObject extends Serializable
case object CaseObject
Now, let's use the very useful IntelliJ feature 'decompile Scala to Java' on the compiled .class files:
//decompiled from StandardObject$.class
public final class StandardObject$ {
public static final StandardObject$ MODULE$ = new StandardObject$();
private StandardObject$() {
}
}
//decompiled from StandardObject.class
import scala.reflect.ScalaSignature;
@ScalaSignature(<byte array string elided>)
public final class StandardObject {
}
As you can see, a pretty straightforward singleton pattern, except that, for reasons outside the scope of this question, two classes are generated: the static StandardObject (which would contain static forwarder methods should the object define any) and the actual singleton instance StandardObject$, where all methods defined in the code end up as instance methods. Things get more interesting when you implement Serializable:
//decompiled from SerializableObject.class
import scala.reflect.ScalaSignature;
@ScalaSignature(<byte array string elided>)
public final class SerializableObject {
}
//decompiled from SerializableObject$.class
import java.io.Serializable;
import scala.runtime.ModuleSerializationProxy;
public final class SerializableObject$ implements Serializable {
public static final SerializableObject$ MODULE$ = new SerializableObject$();
private Object writeReplace() {
return new ModuleSerializationProxy(SerializableObject$.class);
}
private SerializableObject$() {
}
}
The compiler doesn't limit itself to simply making the 'instance' (non-static) class Serializable; it also adds a writeReplace method. writeReplace is an alternative to writeObject/readObject: whenever a Serializable class having this method is serialized, a different object is serialized in its place. On deserialization, that proxy object's readResolve method is then invoked. Here, a ModuleSerializationProxy instance is serialized with a field carrying SerializableObject$.class, so it knows which object needs to be resolved. The readResolve method of that class simply returns the singleton instance; since a Scala object is a singleton with a parameterless constructor, it is structurally identical across different VM instances and different runs, and in this way the property that only a single instance of the class exists per VM is preserved. A thing of note is that there is a security hole here: no readObject method is added to SerializableObject$, meaning an attacker can maliciously prepare a binary file which matches the standard Java serialization format for SerializableObject$, and a separate instance of the 'singleton' will be created.
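The writeReplace/readResolve dance that ModuleSerializationProxy performs can be sketched in plain Java (class names here are made up, not Scala's actual runtime classes): the singleton serializes a lightweight proxy, and deserializing the proxy hands back the canonical instance.

```java
import java.io.*;

// Minimal serialization-proxy sketch, analogous to what the Scala
// compiler generates via ModuleSerializationProxy.
class Singleton implements Serializable {
    static final Singleton INSTANCE = new Singleton();
    private Singleton() {}
    // Serialize a proxy instead of this object.
    private Object writeReplace() { return new Proxy(); }
    private static class Proxy implements Serializable {
        // On deserialization, resolve back to the canonical instance.
        private Object readResolve() { return Singleton.INSTANCE; }
    }
}

class ProxyRoundTripDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(Singleton.INSTANCE);
        Object back = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();
        System.out.println(back == Singleton.INSTANCE); // true: same instance survives
    }
}
```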
Now, let's move on to the case object:
//decompiled from CaseObject.class
import scala.collection.Iterator;
import scala.reflect.ScalaSignature;
@ScalaSignature(<byte array string elided>)
public final class CaseObject {
public static String toString() {
return CaseObject$.MODULE$.toString();
}
public static int hashCode() {
return CaseObject$.MODULE$.hashCode();
}
public static boolean canEqual(final Object x$1) {
return CaseObject$.MODULE$.canEqual(var0);
}
public static Iterator productIterator() {
return CaseObject$.MODULE$.productIterator();
}
public static Object productElement(final int x$1) {
return CaseObject$.MODULE$.productElement(var0);
}
public static int productArity() {
return CaseObject$.MODULE$.productArity();
}
public static String productPrefix() {
return CaseObject$.MODULE$.productPrefix();
}
public static Iterator productElementNames() {
return CaseObject$.MODULE$.productElementNames();
}
public static String productElementName(final int n) {
return CaseObject$.MODULE$.productElementName(var0);
}
}
//decompiled from CaseObject$.class
import java.io.Serializable;
import scala.Product;
import scala.collection.Iterator;
import scala.runtime.ModuleSerializationProxy;
import scala.runtime.Statics;
import scala.runtime.ScalaRunTime.;
public final class CaseObject$ implements Product, Serializable {
public static final CaseObject$ MODULE$ = new CaseObject$();
static {
Product.$init$(MODULE$);
}
public String productElementName(final int n) {
return Product.productElementName$(this, n);
}
public Iterator productElementNames() {
return Product.productElementNames$(this);
}
public String productPrefix() {
return "CaseObject";
}
public int productArity() {
return 0;
}
public Object productElement(final int x$1) {
Object var2 = Statics.ioobe(x$1);
return var2;
}
public Iterator productIterator() {
return .MODULE$.typedProductIterator(this);
}
public boolean canEqual(final Object x$1) {
return x$1 instanceof CaseObject$;
}
public int hashCode() {
return 847823535;
}
public String toString() {
return "CaseObject";
}
private Object writeReplace() {
return new ModuleSerializationProxy(CaseObject$.class);
}
private CaseObject$() {
}
}
A lot more is going on, as CaseObject$ now also implements Product, with its iterator and accessor methods. I am unaware of a use case for this feature; it is probably done for consistency with case class, which is always a product of its fields. The main practical difference here is that we get canEqual, hashCode and toString methods for free. canEqual is relevant only if you decide to compare the object with a Product instance which is not a singleton; toString saves us from implementing a single simple method, which is useful when case objects are used as enumeration constants without any behaviour implemented. Finally, as one might suspect, hashCode returns a constant, so it is the same across all VM instances. This would matter if one serialized some flawed hash map implementation, but both standard Java and Scala hash maps wisely rehash all contents on deserialization, so it shouldn't. Note that equals is not overridden, so it is still reference equality, and the security hole described above is still there. A huge caveat: if a case object inherits equals/toString from some supertype other than Object, the corresponding methods are not generated, and the inherited definitions are used instead.
TL;DR: the only difference that matters in practice is the toString returning the unqualified name of the object.
I must make a disclaimer here, though: I cannot guarantee that the compiler doesn't treat case objects specially in addition to what is actually in the bytecode. It certainly does so when pattern matching case classes, aside from them implementing unapply.
It's similar to the case class vs. class distinction: we just use case object instead of case class when there aren't any fields representing additional state.
case objects implicitly come with implementations of toString, equals, and hashCode, but plain objects don't.
case objects can be serialized while plain objects cannot, which makes case objects very useful as messages with Akka remoting.
Adding the case keyword before object keyword makes the object serializable.
We already know object and case class. case object is a mix of both: it is a singleton, like an object, with the boilerplate generated as in a case class. The only difference is that the boilerplate is generated for an object instead of a class.
case objects won't come with the following:
apply or unapply methods;
copy methods, since this is a singleton;
methods for structural equality comparison;
constructor parameters.