How can I predict which implementation will be chosen when mixing in multiple traits with conflicting abstract overrides? - scala

Consider this example:
abstract class Writer {
  def write(message: String): Unit
}

trait UpperCaseFilter extends Writer {
  abstract override def write(message: String) =
    super.write(message.toUpperCase)
}

trait LowerCaseFilter extends Writer {
  abstract override def write(message: String) =
    super.write(message.toLowerCase)
}

class StringWriter extends Writer {
  val sb = new StringBuilder
  override def write(message: String) =
    sb.append(message)
  override def toString = sb.toString
}

object Main extends App {
  val writer = new StringWriter with UpperCaseFilter with LowerCaseFilter
  writer.write("Hello, world!")
  println(writer)
}
I was surprised by the output “HELLO, WORLD!” Why is the output not “hello, world!” or a compilation error?

The logic that decides it is called linearization. You can find more information about it here:
http://www.artima.com/pins1ed/traits.html#12.6
In your case the whole class hierarchy is linearized like this:
LowerCaseFilter > UpperCaseFilter > StringWriter > Writer > AnyRef > Any
A write call therefore starts in LowerCaseFilter, the last trait mixed in. Its super.write refers to UpperCaseFilter, whose super.write in turn refers to StringWriter. So the message is lowercased first, then uppercased, and only then appended, which is why you see "HELLO, WORLD!".
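You can see linearization at work by reversing the mixin order, which flips the result (a quick sketch):
// UpperCaseFilter now handles the call first, and LowerCaseFilter is the
// last transformation before StringWriter, so the output is lowercase:
val writer2 = new StringWriter with LowerCaseFilter with UpperCaseFilter
writer2.write("Hello, world!")
println(writer2) // prints "hello, world!"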

Related

Flink KafkaSource & KafkaSink with GenericRecords and Confluent Schema Registry

I was faced with the problem of reading from and writing to Kafka using KafkaSource and KafkaSink with Flink v1.16 (Scala 2.12) and the Confluent Schema Registry. The events should be read and written as GenericRecords. Below is an overview of the approach I came up with. It is far from perfect, but I hope it helps someone get a general idea of how things can be tied together.
KafkaSource
First I created a POJO class to deserialize the event into (Why POJO). Be careful to adhere to the POJO conventions.
class InputEvent(
  var timestamp: Long,
  var someValue: String
) extends Serializable {
  // no-arg constructor with default values, required by the POJO conventions
  def this() = this(0L, "")
  def canEqual(other: Any): Boolean = ...
  override def equals(other: Any): Boolean = ...
  override def hashCode(): Int = ...
  override def toString = ...
}
I then created a companion object that extends SchemaProjectable[InputEvent]. It is important that the schema is kept as a string and only parsed later in the getSchema method (to avoid serialization issues).
object InputEvent extends SchemaProjectable[InputEvent] with Serializable {
  val SCHEMA = "{\"type\":\"record\",\"name\":\"InputEvent\",\"namespace\":\"com.blog.post\",\"doc\":\"Input Events\",\"fields\":[{\"name\":\"timestamp\",\"type\":\"long\",\"doc\":\"Event timestamp in seconds\"},{\"name\":\"someValue\",\"type\":\"string\",\"doc\":\"Some Value Field\"}]}"

  override def getSchema: Schema = new Schema.Parser().parse(SCHEMA)

  override def projectFromGeneric(in: GenericRecord): InputEvent =
    new InputEvent(
      in.getLong("timestamp"),
      in.getString("someValue")
    )

  // the source side never serializes, so an empty record suffices here
  override def projectToGeneric(in: InputEvent): GenericRecord = new GenericData.Record(getSchema)
}
SchemaProjectable[T] is used later so that the de/serializers share a common interface and an event can easily be converted from and to a GenericRecord.
abstract class SchemaProjectable[T] extends Serializable {
  def getSchema: Schema
  def projectFromGeneric(in: GenericRecord): T
  def projectToGeneric(in: T): GenericRecord
}
Subsequently, the builder of the KafkaSource is put into a separate object. The getKafkaSource[T] method requires the configuration properties as well as an implementation of SchemaProjectable[T].
object GenericKafkaSource {
  def getKafkaSource[T](properties: Properties, schemaProjectable: SchemaProjectable[T])
                       (implicit tInfo: TypeInformation[T]): KafkaSource[T] = {
    KafkaSource.builder[T]
      .setProperties(properties)
      .setTopics(properties.getProperty("topicName"))
      .setStartingOffsets(configToOffset(properties.getProperty("offset")))
      .setDeserializer(
        new GenericDeserializationSchema(schemaProjectable, properties.getProperty("schema.registry.url"))
      )
      .build
  }
}
As a last step, the generic deserializer is implemented using the ConfluentRegistryAvroDeserializationSchema:
class GenericDeserializationSchema[T](schemaProjectable: SchemaProjectable[T], url: String)
                                     (implicit tInfo: TypeInformation[T]) extends KafkaRecordDeserializationSchema[T] {

  private val deserializationSchema =
    ConfluentRegistryAvroDeserializationSchema.forGeneric(schemaProjectable.getSchema, url)

  override def deserialize(record: ConsumerRecord[Array[Byte], Array[Byte]], out: Collector[T]): Unit = {
    out.collect(
      schemaProjectable.projectFromGeneric(deserializationSchema.deserialize(record.value()))
    )
  }

  override def getProducedType: TypeInformation[T] = Types.of[T]
}
Finally, the KafkaSource can be used as follows:
val source = GenericKafkaSource.getKafkaSource(consumerProperties, InputEvent)
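For context, a hedged sketch of wiring the source into a job (the environment setup, watermark choice, and source name are assumptions for illustration, not part of the original setup):
// minimal wiring sketch; consumerProperties is assumed to be defined elsewhere
val env = StreamExecutionEnvironment.getExecutionEnvironment
implicit val inInfo: TypeInformation[InputEvent] = TypeInformation.of(classOf[InputEvent])

val events = env.fromSource(
  GenericKafkaSource.getKafkaSource(consumerProperties, InputEvent),
  WatermarkStrategy.noWatermarks[InputEvent](),
  "generic-avro-source"
)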
KafkaSink
For the KafkaSink, we take a similar approach to the KafkaSource. Again, we create a class and companion object for the output event.
class OutputEvent(
  var timestamp: Long,
  var resultValue: String
) extends Serializable {
  // no-arg constructor with default values, required by the POJO conventions
  def this() = this(0L, "")
  def canEqual(other: Any): Boolean = ...
  override def equals(other: Any): Boolean = ...
  override def hashCode(): Int = ...
  override def toString = ...
}
object OutputEvent extends SchemaProjectable[OutputEvent] with Serializable {
  val SCHEMA = "{\"type\":\"record\",\"name\":\"OutputEvent\",\"namespace\":\"com.blog.post\",\"fields\":[{\"name\":\"timestamp\",\"type\":[\"null\",\"long\"],\"doc\":\"Timestamp of the event\"},{\"name\":\"resultValue\",\"type\":[\"null\",\"string\"],\"doc\":\"Result value\"}]}"

  override def getSchema: Schema = new Schema.Parser().parse(SCHEMA)

  // the sink side never deserializes, so a stub suffices here
  override def projectFromGeneric(in: GenericRecord): OutputEvent = new OutputEvent()

  override def projectToGeneric(in: OutputEvent): GenericRecord = {
    val record = new GenericData.Record(getSchema)
    record.put("timestamp", in.timestamp)
    record.put("resultValue", in.resultValue)
    record
  }
}
The KafkaSink builder is factored out to a separate object.
object GenericKafkaSink {
  def getGenericKafkaSink[T](properties: Properties, schemaProjectable: SchemaProjectable[T])
                            (implicit tInfo: TypeInformation[T]): KafkaSink[T] = {
    val topicName = properties.getProperty("topicName")
    val url = properties.getProperty("schema.registry.url")

    KafkaSink.builder[T]
      .setKafkaProducerConfig(properties)
      .setBootstrapServers(properties.getProperty("bootstrap.servers"))
      .setRecordSerializer(
        new GenericSerializationSchema[T](topicName, schemaProjectable, url)
      )
      .setTransactionalIdPrefix(properties.getProperty("transactionId"))
      .build
  }
}
Subsequently, the SerializationSchema is implemented as follows using the ConfluentRegistryAvroSerializationSchema.
class GenericSerializationSchema[T](topicName: String, schemaProjectable: SchemaProjectable[T], url: String)
                                   (implicit tInfo: TypeInformation[T]) extends KafkaRecordSerializationSchema[T] with Serializable {

  private lazy val serializationSchema: ConfluentRegistryAvroSerializationSchema[GenericRecord] =
    ConfluentRegistryAvroSerializationSchema.forGeneric(topicName + "-value", schemaProjectable.getSchema, url)

  override def serialize(element: T, context: KafkaRecordSerializationSchema.KafkaSinkContext, timestamp: lang.Long): ProducerRecord[Array[Byte], Array[Byte]] = {
    new ProducerRecord[Array[Byte], Array[Byte]](
      topicName,
      serializationSchema.serialize(schemaProjectable.projectToGeneric(element))
    )
  }
}
Finally, the KafkaSink can be used as follows:
val sink = GenericKafkaSink.getGenericKafkaSink(producerProperties, OutputEvent)
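Continuing the sketch from the source section, the sink attaches to a stream roughly like this (the map step and job name are assumptions for illustration):
implicit val outInfo: TypeInformation[OutputEvent] = TypeInformation.of(classOf[OutputEvent])

events
  .map(e => new OutputEvent(e.timestamp, e.someValue))
  .sinkTo(GenericKafkaSink.getGenericKafkaSink(producerProperties, OutputEvent))

env.execute("generic-avro-job")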
Any opinions on the approach? I'd appreciate any feedback!
Kind Regards
Dominik

Syntax extension using type class methods in Scala

I want to bind a check method to Test in such a way that the call site does not pass an argument (look at the last line). It is necessary to use type classes here, but I'm new to Scala, so I'm having problems.
The Checker object is my attempt to solve the problem. Perhaps it is enough to make changes to it...
trait Test[+T] extends Iterable[T]

class TestIterable[+T](iterable: Iterable[T]) extends Test[T] {
  override def iterator: Iterator[T] = iterable.iterator
}

object Test {
  def apply[T](iterable: Iterable[T]): Test[T] = new TestIterable[T](iterable)
}

trait Check[M] {
  def check(m: M): M
}

object Checker {
  def apply[M](implicit instance: Check[M]): Check[M] = instance

  implicit def checkSyntax[M: Check](m: M): CheckOps[M] = new CheckOps[M](m)

  private implicit def test[T](m: Test[T]): Check[Test[T]] = {
    new Check[Test[T]] {
      override def check(m: Test[T]) = m
    }
  }

  final class CheckOps[M: Check](m: M) {
    def x2: M = Checker[M].check(m)
  }
}
import Checker._
val test123 = Test(Seq(1, 2, 3))
Test(test123).check
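For reference, a minimal sketch of one way to make the argument-less check compile (assuming that is the goal): the Check instance must be a non-private implicit value rather than a conversion, and the extension method must be named check rather than x2.
trait Check[M] {
  def check(m: M): M
}

object Checker {
  def apply[M](implicit instance: Check[M]): Check[M] = instance

  // an implicit instance for any Test[T]; a private implicit would not be found
  implicit def testCheck[T]: Check[Test[T]] =
    new Check[Test[T]] {
      override def check(m: Test[T]): Test[T] = m
    }

  // extension syntax: adds an argument-less `check` to any M that has a Check[M]
  implicit class CheckOps[M](m: M)(implicit instance: Check[M]) {
    def check: M = instance.check(m)
  }
}

import Checker._
val test123 = Test(Seq(1, 2, 3))
test123.check // resolves Check[Test[Int]] and applies it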

Configuring implicits in Scala

I have a type class:
trait ProcessorTo[T] {
  def process(s: String): T
}
and its implementation
class DefaultProcessor extends ProcessorTo[String] {
  def process(s: String): String = s
}

trait DefaultProcessorSupport {
  implicit val p: ProcessorTo[String] = new DefaultProcessor
}
To make it available for use I created:
object ApplicationContext
  extends DefaultProcessorSupport
  with //Some other typeclasses
But now I have to add a processor which performs some database reads. The DB URL etc. are placed in a configuration file that is only available at runtime. For now I did the following:
class DbProcessor extends ProcessorTo[Int] {
  private var config: Config = _
  def start(config: Config) = //set the configuration, open connections etc
  //Other implementation
}

object ApplicationContext {
  implicit val p: ProcessorTo[Int] = new DbProcessor
  def configure(config: Config) = p.asInstanceOf[DbProcessor].start(config)
}
It works for me, but I'm not sure about this technique. It looks a little strange to me. Is it bad practice? If so, what would be a good solution?
I am a bit confused by the requirements, as DbProcessor is missing the process implementation (???) and trait ProcessorTo[T] is missing the start method which is defined in DbProcessor. So I will assume the following while answering: the type class has both process and start methods.
Define a type class:
trait ProcessorTo[T] {
  def start(config: Config): Unit
  def process(s: String): T
}
Provide implementations for the type class in the companion objects:
object ProcessorTo {
  implicit object DbProcessor extends ProcessorTo[Int] {
    override def start(config: Config): Unit = ???
    override def process(s: String): Int = ???
  }

  implicit object DefaultProcessor extends ProcessorTo[String] {
    override def start(config: Config): Unit = ???
    override def process(s: String): String = s
  }
}
and use it in your ApplicationContext as follows:
object ApplicationContext {
  def configure[T](config: Config)(implicit ev: ProcessorTo[T]) = ev.start(config)
}
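A hypothetical usage sketch (assuming a Config value loaded at runtime):
// DbProcessor is resolved implicitly from the ProcessorTo companion object
ApplicationContext.configure[Int](config)
val n: Int = implicitly[ProcessorTo[Int]].process("42")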
This is a nice blog post about Type Classes: http://danielwestheide.com/blog/2013/02/06/the-neophytes-guide-to-scala-part-12-type-classes.html
I don't really see why you need start. If your implicit DbProcessor has a dependency, why not make it an explicit dependency via the constructor? I mean something like this:
class DbConfig(val settings: Map[String, Object]) {}

class DbProcessor(config: DbConfig) extends ProcessorTo[Int] {
  // here goes actual configuration of the processor using config
  private val mappings: Map[String, Int] = config.settings("DbProcessor").asInstanceOf[Map[String, Int]]
  override def process(s: String): Int = mappings.getOrElse(s, -1)
}

object ApplicationContext {
  // first create config then pass it explicitly
  val config = new DbConfig(Map[String, Object]("DbProcessor" -> Map("1" -> 123)))
  implicit val p: ProcessorTo[Int] = new DbProcessor(config)
}
Or if you like Cake pattern, you can do something like this:
trait DbConfig {
  def getMappings(): Map[String, Int]
}

class DbProcessor(config: DbConfig) extends ProcessorTo[Int] {
  // here goes actual configuration of the processor using config
  private val mappings: Map[String, Int] = config.getMappings()
  override def process(s: String): Int = mappings.getOrElse(s, -1)
}

trait DbProcessorSupport {
  self: DbConfig =>
  implicit val dbProcessor: ProcessorTo[Int] = new DbProcessor(self)
}

object ApplicationContext extends DbConfig with DbProcessorSupport {
  override def getMappings(): Map[String, Int] = Map("1" -> 123)
}
So the only thing you do in your ApplicationContext is provide the actual implementation of the DbConfig trait.
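In both variants the call site just summons the instance; a small usage sketch:
import ApplicationContext._
val result: Int = implicitly[ProcessorTo[Int]].process("1") // 123, per the mapping above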

Referencing the parent class inside a Trait

I have a class that extends a trait, and I want to write traits that can be mixed in with the class and override some of its methods.
The trait that my class is extending looks like this:
trait CharSource {
  def source: Iterator[Char]

  def words: Iterator[String] = {
    while (source.hasNext) {
      //Logic to get the next word from a Char Iterator
    }
  }

  final def chars: Iterator[Char] = words.toString().toIterator
}
The class that extends CharSource:
class IteratorDocumentSource(iterator: Iterator[Char]) extends CharSource {
  def source = iterator
}
Now I want to write a trait that overrides the source def in IteratorDocumentSource for some special behavior:
trait PunctuationRemover extends Transformer { self: CharSource =>
  /** Character source. Overriding should be in terms of `super.source` for
    * stackability. */
  abstract override val source: Iterator[Char] = {
    super.source
    //additional logic ...
  }
}
The Transformer trait that PunctuationRemover extends:
trait Transformer { self: CharSource =>
  protected def source: Iterator[Char]
  def words: Iterator[String]
}
I get an error when making this call:
new IteratorDocumentSource("Hello World!".iterator) with PunctuationRemover
Error:
An exception or error caused a run to abort:
SampleSuite$$anonfun$2$$anon$3.document$PunctuationRemover$$super$source()Lscala/collection/Iterator;
java.lang.AbstractMethodError:
SampleSuite$$anonfun$2$$anon$3.document$PunctuationRemover$$super$source()Lscala/collection/Iterator;
I referenced this post but I think my situation is a little different
Can I override a scala class method with a method from a trait?
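For comparison, a sketch of the conventional stackable-trait pattern, using an abstract override def (rather than a val) and extending CharSource directly; the punctuation-filtering logic is an assumption for illustration:
trait PunctuationRemover extends CharSource {
  // `def`, not `val`: stackable modifications need a method so that
  // super.source can refer to the next implementation in the linearization
  abstract override def source: Iterator[Char] =
    super.source.filterNot(c => ",.;:!?".contains(c))
}

// the trait must be mixed in after a concrete implementation of source:
val doc = new IteratorDocumentSource("Hello, world!".iterator) with PunctuationRemover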

Using trait method in the class constructor

I have a trait and a class that extends the trait. I can use the methods from the trait as follows:
trait A {
  def a = ""
}

class B(s: String) extends A {
  def b = a
}
However, when I use the trait's method in the constructor like this:
trait A {
  def a = ""
}

class B(s: String) extends A {
  def this() = this(a)
}
then the following error appears:
error: not found: value a
Is there some way to define default parameters for the construction of classes in the trait?
EDIT: To clarify the purpose: There is the akka-testkit:
class TestKit(_system: ActorSystem) extends { implicit val system = _system }
And each test looks like this:
class B(_system: ActorSystem) extends TestKit(_system) with A with ... {
  def this() = this(actorSystem)
  ...
}
because I want the common creation of the ActorSystem to live in A:
trait A {
  val conf = ...
  def actorSystem = ActorSystem("MySpec", conf)
  ...
}
It's a little bit tricky because of Scala's initialization order. The simplest solution I found is to define a companion object for your class B with apply as a factory method:
trait A {
  def a = "aaaa"
}

class B(s: String) {
  println(s)
}

object B extends A {
  def apply() = new B(a)
  def apply(s: String) = new B(s)
}