Importing data as dynamically as possible - scala

I'm looking for a small pattern:
The program should be able to support various formats as input, then apply a transformation, and in the last step load the result into a database.
Its main purpose is to provide test data.
My initial idea was to "glue" different components together like this:
We have an extractor that extracts from a generic datasource [A] to an iterator of [B]
and then a transformator that maps [B] to [C] and finally a step that loads [C] into a database. I'm sure there must be a better way of approaching this. Is there a better, possibly more generic way of achieving this?
trait Importer[A, B, C] {
val extractor: Extractor[A, B]
val transformer: Transformator[B, C]
val loader: Loader[C]
/**
* this is the method call for chaining all events together
*/
def importAndTransformData(dataSource: A): Unit =
{
/**
* extraction step
*/
val output = extractor.extract(dataSource: A)
/**
* conversion method
*/
val transformed = output map (transformer.transform(_))
/**
* loading step
*/
transformed.foreach(loader.load(_))
}
}
with best regards,
Stefan

One common approach used in Scala is self-typing (especially as used in the Cake Pattern). In your case that would look something like:
trait Importer[A, B, C] {
self: Extractor[A, B] with Transformator[B, C] with Loader[C] =>
/**
* this is the method call for chaining all events together
*/
def importAndTransformData(dataSource: A): Unit =
{
/**
* extraction step
*/
val output = extract(dataSource: A)
/**
* conversion method
*/
val transformed = output map (transform(_))
/**
* loading step
*/
transformed.foreach(load(_))
}
}
You can then build your Importer with code such as:
val importer = new Importer with FooExtractor with FooBarTransformer with BarLoader {}
or
val testImporter = new Importer with MockExtractor with TestTransformer with MockLoader {}
or similar for your test cases.
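To make the self-type approach concrete, here is a minimal, self-contained sketch. The Extractor, Transformator and Loader definitions are assumptions (the question does not show them), and CsvLineExtractor, LengthTransformer and BufferLoader are made-up names for illustration only. Note that the type parameters of Importer are spelled out explicitly when mixing everything together.

```scala
// Hypothetical component traits; the question does not show their definitions.
trait Extractor[A, B]     { def extract(source: A): Iterator[B] }
trait Transformator[B, C] { def transform(b: B): C }
trait Loader[C]           { def load(c: C): Unit }

trait Importer[A, B, C] {
  // The self-type demands that all three components are mixed in.
  self: Extractor[A, B] with Transformator[B, C] with Loader[C] =>

  def importAndTransformData(dataSource: A): Unit =
    extract(dataSource).map(transform).foreach(load)
}

// Illustrative implementations (names are made up for this example).
trait CsvLineExtractor extends Extractor[String, String] {
  def extract(source: String): Iterator[String] = source.split("\n").iterator
}
trait LengthTransformer extends Transformator[String, Int] {
  def transform(b: String): Int = b.length
}
trait BufferLoader extends Loader[Int] {
  // Stands in for a database: collects everything that was "loaded".
  val loaded = scala.collection.mutable.ListBuffer.empty[Int]
  def load(c: Int): Unit = { loaded += c; () }
}

object ImporterDemo {
  val importer = new Importer[String, String, Int]
    with CsvLineExtractor with LengthTransformer with BufferLoader {}

  // Runs the pipeline once and returns what ended up in the buffer.
  def run(): List[Int] = {
    importer.importAndTransformData("ab\ncde")
    importer.loaded.toList
  }
}
```

Swapping BufferLoader for a real database loader (or a mock in tests) only changes the `new Importer[...] with ...` line; the pipeline itself stays untouched.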

Related

How can I view the code that Scala uses to automatically generate the apply function for case classes?

When defining a Scala case class, an apply function is automatically generated which behaves similarly to the way the default constructor in java behaves. How can I see the code which automatically generates the apply function? I presume the code is a macro in the Scala compiler somewhere but I'm not sure.
To clarify I am not interested in viewing the resultant apply method of a given case class but interested in the macro/code which generates the apply method.
It's not a macro. Methods are synthesized by compiler "manually".
apply, unapply, copy are generated in scala.tools.nsc.typechecker.Namers
https://github.com/scala/scala/blob/2.13.x/src/compiler/scala/tools/nsc/typechecker/Namers.scala#L1839-L1862
/** Given a case class
* case class C[Ts] (ps: Us)
* Add the following methods to toScope:
* 1. if case class is not abstract, add
* <synthetic> <case> def apply[Ts](ps: Us): C[Ts] = new C[Ts](ps)
* 2. add a method
* <synthetic> <case> def unapply[Ts](x: C[Ts]) = <ret-val>
* where <ret-val> is the caseClassUnapplyReturnValue of class C (see UnApplies.scala)
*
* @param cdef is the class definition of the case class
* @param namer is the namer of the module class (the comp. obj)
*/
def addApplyUnapply(cdef: ClassDef, namer: Namer): Unit = {
if (!cdef.symbol.hasAbstractFlag)
namer.enterSyntheticSym(caseModuleApplyMeth(cdef))
val primaryConstructorArity = treeInfo.firstConstructorArgs(cdef.impl.body).size
if (primaryConstructorArity <= MaxTupleArity)
namer.enterSyntheticSym(caseModuleUnapplyMeth(cdef))
}
def addCopyMethod(cdef: ClassDef, namer: Namer): Unit = {
caseClassCopyMeth(cdef) foreach namer.enterSyntheticSym
}
https://github.com/scala/scala/blob/2.13.x/src/compiler/scala/tools/nsc/typechecker/Namers.scala#L1195-L1219
private def templateSig(templ: Template): Type = {
//...
// add apply and unapply methods to companion objects of case classes,
// unless they exist already; here, "clazz" is the module class
if (clazz.isModuleClass) {
clazz.attachments.get[ClassForCaseCompanionAttachment] foreach { cma =>
val cdef = cma.caseClass
assert(cdef.mods.isCase, "expected case class: "+ cdef)
addApplyUnapply(cdef, templateNamer)
}
}
// add the copy method to case classes; this needs to be done here, not in SyntheticMethods, because
// the namer phase must traverse this copy method to create default getters for its parameters.
// here, clazz is the ClassSymbol of the case class (not the module). (!clazz.hasModuleFlag) excludes
// the moduleClass symbol of the companion object when the companion is a "case object".
if (clazz.isCaseClass && !clazz.hasModuleFlag) {
val modClass = companionSymbolOf(clazz, context).moduleClass
modClass.attachments.get[ClassForCaseCompanionAttachment] foreach { cma =>
val cdef = cma.caseClass
def hasCopy = (decls containsName nme.copy) || parents.exists(_.member(nme.copy).exists)
// scala/bug#5956 needs (cdef.symbol == clazz): there can be multiple class symbols with the same name
if (cdef.symbol == clazz && !hasCopy)
addCopyMethod(cdef, templateNamer)
}
}
equals, hashCode, toString are generated in scala.tools.nsc.typechecker.SyntheticMethods
https://github.com/scala/scala/blob/2.13.x/src/compiler/scala/tools/nsc/typechecker/SyntheticMethods.scala
/** Synthetic method implementations for case classes and case objects.
*
* Added to all case classes/objects:
* def productArity: Int
* def productElement(n: Int): Any
* def productPrefix: String
* def productIterator: Iterator[Any]
*
* Selectively added to case classes/objects, unless a non-default
* implementation already exists:
* def equals(other: Any): Boolean
* def hashCode(): Int
* def canEqual(other: Any): Boolean
* def toString(): String
*
* Special handling:
* protected def writeReplace(): AnyRef
*/
trait SyntheticMethods extends ast.TreeDSL {
//...
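As a quick way to see these synthesized members from the user side, here is a small sketch. The case class User and its fields are made up for illustration; the methods called are the ones listed in the comment above, plus copy, which comes from Namers as described earlier.

```scala
// Illustrative case class; the name and fields are made up for this example.
case class User(name: String, age: Int)

object SyntheticDemo {
  val u = User("ann", 30)                        // synthesized companion apply

  val arity: Int = u.productArity                // number of constructor fields
  val first: Any = u.productElement(0)           // field by position
  val prefix: String = u.productPrefix           // the class name
  val fields: List[Any] = u.productIterator.toList
  val same: Boolean = u == User("ann", 30)       // structural equals
  val text: String = u.toString
  val older: User = u.copy(age = 31)             // synthesized copy
}
```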
Symbols for accessors are created in scala.reflect.internal.Symbols
https://github.com/scala/scala/blob/2.13.x/src/reflect/scala/reflect/internal/Symbols.scala#L2103-L2128
/** For a case class, the symbols of the accessor methods, one for each
* argument in the first parameter list of the primary constructor.
* The empty list for all other classes.
*
* This list will be sorted to correspond to the declaration order
* in the constructor parameter
*/
final def caseFieldAccessors: List[Symbol] = {
// We can't rely on the ordering of the case field accessors within decls --
// handling of non-public parameters seems to change the order (see scala/bug#7035.)
//
// Luckily, the constrParamAccessors are still sorted properly, so sort the field-accessors using them
// (need to undo name-mangling, including the sneaky trailing whitespace)
//
// The slightly more principled approach of using the paramss of the
// primary constructor leads to cycles in, for example, pos/t5084.scala.
val primaryNames = constrParamAccessors map (_.name.dropLocal)
def nameStartsWithOrigDollar(name: Name, prefix: Name) =
name.startsWith(prefix) && name.length > prefix.length + 1 && name.charAt(prefix.length) == '$'
caseFieldAccessorsUnsorted.sortBy { acc =>
primaryNames indexWhere { orig =>
(acc.name == orig) || nameStartsWithOrigDollar(acc.name, orig)
}
}
}
private final def caseFieldAccessorsUnsorted: List[Symbol] = info.decls.toList.filter(_.isCaseAccessorMethod)
Perhaps I could point out a few places in the codebase that might be relevant.
First, there is a way to correlate the Scala Language Specification grammar directly to source code. For example, the case class rule
TmplDef ::= ‘case’ ‘class’ ClassDef
relates to Parser.tmplDef
/** {{{
* TmplDef ::= [case] class ClassDef
* | [case] object ObjectDef
* | [override] trait TraitDef
* }}}
*/
def tmplDef(pos: Offset, mods: Modifiers): Tree = {
...
in.token match {
...
case CASECLASS =>
classDef(pos, (mods | Flags.CASE) withPosition (Flags.CASE, tokenRange(in.prev /*scanner skips on 'case' to 'class', thus take prev*/)))
...
}
}
Specification continues
A case class definition of 𝑐[tps](ps1)…(ps𝑛) with type parameters
tps and value parameters ps implies the definition of a companion
object, which serves as an extractor object.
object 𝑐 {
def apply[tps](ps1)…(ps𝑛): 𝑐[tps] = new 𝑐[Ts](xs1)…(xs𝑛)
def unapply[tps](𝑥: 𝑐[tps]) =
if (x eq null) scala.None
else scala.Some(𝑥.xs11,…,𝑥.xs1𝑘)
}
so let us try to hunt for the implied definition of
def apply[tps](ps1)…(ps𝑛): 𝑐[tps] = new 𝑐[Ts](xs1)…(xs𝑛)
which is another way of saying synthesised definition. Promisingly, there exists MethodSynthesis.scala
/** Logic related to method synthesis which involves cooperation between
* Namer and Typer.
*/
trait MethodSynthesis {
Thus we find two more potential clues: Namer and Typer. I wonder what is in there? But first, MethodSynthesis.scala has only approx 300 LOC, so let us just skim through a bit. We stumble across a promising line
val methDef = factoryMeth(classDef.mods & (AccessFlags | FINAL) | METHOD | IMPLICIT | SYNTHETIC, classDef.name.toTermName, classDef)
"factoryMeth"... there is a ring to it. Find usages! We are quickly led to
/** The apply method corresponding to a case class
*/
def caseModuleApplyMeth(cdef: ClassDef): DefDef = {
val inheritedMods = constrMods(cdef)
val mods =
if (applyShouldInheritAccess(inheritedMods))
(caseMods | (inheritedMods.flags & PRIVATE)).copy(privateWithin = inheritedMods.privateWithin)
else
caseMods
factoryMeth(mods, nme.apply, cdef)
}
It seems we are on the right track. We also note the name
nme.apply
which is
val apply: NameType = nameType("apply")
Eagerly, we find usages of caseModuleApplyMeth and we are wormholed to Namer.addApplyUnapply
/** Given a case class
* case class C[Ts] (ps: Us)
* Add the following methods to toScope:
* 1. if case class is not abstract, add
* <synthetic> <case> def apply[Ts](ps: Us): C[Ts] = new C[Ts](ps)
* 2. add a method
* <synthetic> <case> def unapply[Ts](x: C[Ts]) = <ret-val>
* where <ret-val> is the caseClassUnapplyReturnValue of class C (see UnApplies.scala)
*
* @param cdef is the class definition of the case class
* @param namer is the namer of the module class (the comp. obj)
*/
def addApplyUnapply(cdef: ClassDef, namer: Namer): Unit = {
if (!cdef.symbol.hasAbstractFlag)
namer.enterSyntheticSym(caseModuleApplyMeth(cdef))
val primaryConstructorArity = treeInfo.firstConstructorArgs(cdef.impl.body).size
if (primaryConstructorArity <= MaxTupleArity)
namer.enterSyntheticSym(caseModuleUnapplyMeth(cdef))
}
Woohoo! The documentation states
<synthetic> <case> def apply[Ts](ps: Us): C[Ts] = new C[Ts](ps)
which seems eerily similar to the SLS version
def apply[tps](ps1)…(ps𝑛): 𝑐[tps] = new 𝑐[Ts](xs1)…(xs𝑛)
Our stumbling-in-the-dark seems to have led us to a discovery.
I noticed that, while others have posted the pieces of code that generate the name of the method, the signature, the type, the corresponding symbols in the symbol table, and pretty much everything else, so far nobody has posted the piece of code that generates the actual body of the case class companion object apply method.
That code is in scala.tools.nsc.typechecker.Unapplies.factoryMeth(mods: Global.Modifiers, name: Global.TermName, cdef: Global.ClassDef): Global.DefDef which is defined in src/compiler/scala/tools/nsc/typechecker/Unapplies.scala, and the relevant part is this:
atPos(cdef.pos.focus)(
DefDef(mods, name, tparams, cparamss, classtpe,
New(classtpe, mmap(cparamss)(gen.paramToArg)))
)
which uses the TreeDSL internal Domain Specific Language for generating Syntax Nodes in the Abstract Syntax Tree, and (roughly) means this:
At the current position in the tree (atPos(cdef.pos.focus))
Splice in a method definition node (DefDef)
Whose body is just a New node, i.e. a constructor invocation.
The description of the TreeDSL trait states:
The goal is that the code generating code should look a lot like the code it generates.
And I think that is true, and makes the code easy to read even if you are not familiar with the compiler internals.
Compare the generating code once again with the generated code:
DefDef(mods, name, tparams, cparamss, classtpe,
New(classtpe, mmap(cparamss)(gen.paramToArg)))
def apply[Tparams](constructorParams): CaseClassType =
new CaseClassType(constructorParams)
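To tie the generated shape back to ordinary code, here is a hand-written stand-in for what a case class companion provides. This is only a sketch: Point is a made-up class, and the real compiler output carries extra flags (synthetic, case) and more members than shown here.

```scala
// Hand-written stand-in for what `case class Point(x: Int, y: Int)` would provide.
final class Point(val x: Int, val y: Int)

object Point {
  // Mirrors the synthesized factory: the body is just a constructor call.
  def apply(x: Int, y: Int): Point = new Point(x, y)

  // Mirrors the extractor shape from the specification excerpt.
  def unapply(p: Point): Option[(Int, Int)] =
    if (p eq null) None else Some((p.x, p.y))
}

object PointDemo {
  // Uses both members: apply at the call site, unapply in the pattern.
  def sum(p: Point): Int = p match { case Point(a, b) => a + b }
}
```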

What is this Scala construct doing?

I've been using Play! Framework with Java and would like to try it out with Scala.
I've started on a Scala book but the most basic Play! sample has me completely puzzled:
def index(): Action[AnyContent] = Action { implicit request =>
Ok(views.html.index())
}
What Scala construct is Play! using here? I understand that we are defining a function that returns an Action with a generic parameter AnyContent. But the next part has me puzzled. What does the assignment mean in this context?
If I go to definition of Action[AnyContent] it's defined as trait Action[A] extends EssentialAction { ... }
If I go to the definition of Action after equals it's defined as:
trait BaseController extends BaseControllerHelpers {
/**
* The default ActionBuilder. Used to construct an action, for example:
*
* {{{
* def foo(query: String) = Action {
* Ok
* }
* }}}
*
* This is meant to be a replacement for the now-deprecated Action object, and can be used in the same way.
*/
def Action: ActionBuilder[Request, AnyContent] = controllerComponents.actionBuilder
}
Note: I'm interested in the Scala construct that's used; I don't care what Play! is actually doing here, which I kind of understand.
You are essentially calling Action.apply(), which is defined here in ActionBuilder. The first and only parameter of the apply() function being the function request => Ok(...).
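Here is a stripped-down sketch of the same construct without Play. Request, Result, Action and ActionBuilder below are simplified stand-ins written for this example, not Play's real definitions. The point is that `Action { ... }` is a call to apply on whatever the term Action evaluates to, with a function literal as the only argument.

```scala
// Simplified stand-ins for illustration only; these are NOT Play's real types.
final case class Request(path: String)
final case class Result(status: Int, body: String)

trait Action { def handle(request: Request): Result }

class ActionBuilder {
  // Takes a function and wraps it in an Action.
  def apply(block: Request => Result): Action = new Action {
    def handle(request: Request): Result = block(request)
  }
}

object Controller {
  // A term named Action; it lives in a different namespace than the type Action.
  def Action: ActionBuilder = new ActionBuilder

  // Brace syntax: a function literal passed to Action.apply.
  def index(): Action = Action { implicit request =>
    Result(200, "index for " + request.path)
  }

  // The same call written out explicitly.
  def indexDesugared(): Action =
    Action.apply((request: Request) => Result(200, "index for " + request.path))
}
```

The `implicit` keyword in front of `request` only makes the parameter available as an implicit value inside the function body; it does not change how the function is passed.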

How to add methods to Slick tables?

I'd like to perform something like this using Slick (I have updated to 3.0.0-M1):
class MyTable(tag: Tag) extends Table[(Int, Int)](tag, "MyTable") {
def a = column[Int]("a")
def b = column[Int]("b")
def * = (a, b)
def total: Int = a + b // That's THE thing
}
So that I can later perform:
val values = TableQuery[MyTable]
values.map(_.total)
Of course, I am stuck on the total method. The total method can be fairly complex (I have an application where it should compute the median of three counts), so I think it should be actual Scala code to be executed in the end.
How could something like this be done in Slick?
As long as you can express it using Slick, e.g.
def total/*: Column[Int]*/ = a + b
It will be run on the server side. Instead of placing it in the Table subclass, you can alternatively use an implicit class to patch on a method from the outside:
implicit class ExtendMyTable(t: MyTable){
def total/*: Column[Int]*/ = t.a + t.b
}
It just needs to be in scope where you try to call .total. Or if you really need client-side Scala coding, extend the result type instead, e.g.
implicit class ExtendMyTableResult(t: (Int,Int)){
def total/*: Int*/ = t._1 + t._2
}
And then do
TableQuery[MyTable].run.map(_.total)
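For the client-side variant, the median-of-three case from the question could look roughly like this. It is a sketch with no Slick dependency: the rows are assumed to have already been fetched as plain tuples, and CountSyntax and CountsOps are made-up names.

```scala
object CountSyntax {
  // Hypothetical row shape: three counts already fetched from the database.
  implicit class CountsOps(private val counts: (Int, Int, Int)) {
    // Plain Scala, executed on the client after the query has run.
    def median: Int = List(counts._1, counts._2, counts._3).sorted.apply(1)
  }
}

object CountDemo {
  import CountSyntax._

  // Stand-in for the result of running a query.
  val rows: Seq[(Int, Int, Int)] = Seq((3, 1, 2), (10, 30, 20))
  val medians: Seq[Int] = rows.map(_.median)
}
```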

Using Scala 2.10 implicit classes to convert to "built-in" standard library classes

I am trying to use the new Scala 2.10 implicit class mechanism to convert a java.sql.ResultSet to a scala.collection.immutable.Stream. In Scala 2.9 I use the following code, which works:
/**
* Implicitly convert a ResultSet to a Stream[ResultSet]. The Stream can then be
* traversed using the usual methods map, filter, etc.
*
* @param resultSet the Result to convert
* @return a Stream wrapped around the ResultSet
*/
implicit def resultSet2Stream(resultSet: ResultSet): Stream[ResultSet] = {
if (resultSet.next) Stream.cons(resultSet, resultSet2Stream(resultSet))
else {
resultSet.close()
Stream.empty
}
}
I can then use it like this:
val resultSet = statement.executeQuery("SELECT * FROM foo")
resultSet.map {
row => /* ... */
}
The implicit class that I came up with looks like this:
/**
* Implicitly convert a ResultSet to a Stream[ResultSet]. The Stream can then be
* traversed using the usual map, filter, etc.
*/
implicit class ResultSetStream(val row: ResultSet)
extends AnyVal {
def toStream: Stream[ResultSet] = {
if (row.next) Stream.cons(row, row.toStream)
else {
row.close()
Stream.empty
}
}
}
However, now I must call toStream on the ResultSet, which sort of defeats the "implicit" part:
val resultSet = statement.executeQuery("SELECT * FROM foo")
resultSet.toStream.map {
row => /* ... */
}
What am I doing wrong?
Should I still be using the implicit def and import scala.language.implicitConversions to avoid the "features" warning?
UPDATE
Here is an alternative solution that converts the ResultSet into a scala.collection.Iterator (only Scala 2.10+):
/*
* Treat a java.sql.ResultSet as an Iterator, allowing operations like filter,
* map, etc.
*
* Sample usage:
* val resultSet = statement.executeQuery("...")
* resultSet.map {
* resultSet =>
* // ...
* }
*/
implicit class ResultSetIterator(resultSet: ResultSet)
extends Iterator[ResultSet] {
def hasNext: Boolean = resultSet.next()
def next() = resultSet
}
I don't see a reason here to use implicit classes. Stick to your first version. Implicit classes are mainly useful (as in "concise") to add methods to existing types (the so-called "enrich my library" pattern).
It is just syntactic sugar for a wrapper class and an implicit conversion to this class.
But here you are just converting (implicitly) from one preexisting type to another preexisting type. There is no need to define a new class at all (let alone an implicit class).
In your case, you could make it work using implicit classes by making ResultSetStream extend Stream and implementing it as a proxy to toStream. But that would really be a lot of trouble for nothing.
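To illustrate the "syntactic sugar" remark, here is a small sketch showing an implicit class next to its hand-written equivalent. Shout is a made-up example; the two objects are meant to behave the same from the caller's side.

```scala
import scala.language.implicitConversions

object Sugared {
  // The implicit class form: wrapper and conversion in one declaration.
  implicit class Shout(val s: String) {
    def shout: String = s.toUpperCase + "!"
  }
}

object Desugared {
  // Roughly what the compiler makes of it: a plain wrapper class ...
  class Shout(val s: String) {
    def shout: String = s.toUpperCase + "!"
  }
  // ... plus an implicit conversion with the same name.
  implicit def Shout(s: String): Shout = new Shout(s)
}

object SugarDemo {
  def viaSugar: String = { import Sugared._; "hi".shout }
  def viaDesugared: String = { import Desugared._; "hi".shout }
}
```

In both cases a new wrapper type is introduced, which is exactly what is not needed when converting a ResultSet to an existing type such as Stream.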

Is it possible to capture the type parameter of a trait using Manifests in Scala 2.7.7?

I'm writing a ServletUnitTest trait in Scala to provide a convenience API for ServletUnit. I have something like the following in mind:
/**
* Utility trait for HttpUnit/ServletUnit tests
*
* @param [T] Type parameter for the class under test
*/
trait ServletUnitTest[T <: HttpServlet] {
/**
* Resource name of the servlet, used to construct the servlet URL.
*/
val servletName: String
/**
* Servlet class under test
*/
implicit val servletClass: Manifest[T]
/**
* ServletUnit {@link ServletRunner}
*/
sealed lazy val servletRunner: ServletRunner = {
val sr = new ServletRunner();
sr.registerServlet(servletName, servletClass.erasure.getName);
sr
}
/**
* A {@link com.meterware.servletunit.ServletUnitClient}
*/
sealed lazy val servletClient = servletRunner.newClient
/**
* The servlet URL, useful for constructing WebRequests
*/
sealed lazy val servletUrl = "http://localhost/" + servletName
def servlet(ic: InvocationContext) = ic.getServlet.asInstanceOf[T]
}
class MyServletTest extends ServletUnitTest[MyServlet] {
val servletName = "download"
// ... test code ...
}
This code doesn't compile as written, but hopefully my intent is clear. Is there a way to do this (with or without Manifests)?
While researching this topic, I found a solution in this scala-list post by Jorge Ortiz, which did the trick for me and is simpler than Aaron's.
In essence, his solution is (paraphrasing):
trait A[T] {
implicit val t: Manifest[T]
}
class B[T: Manifest] extends A[T] {
override val t = manifest[T]
}
(I'm ignoring the OP request to be 2.7.7 compatible as I'm writing this in 2011...)
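On later Scala versions the same shape can be written with ClassTag instead of Manifest. This is a swapped-in technique, not what the linked post shows, and Typed, TypedImpl and StringTyped are made-up names for this sketch.

```scala
import scala.reflect.ClassTag

trait Typed[T] {
  // Abstract in the trait, because a trait cannot take the implicit itself.
  implicit def tag: ClassTag[T]
  def runtimeName: String = tag.runtimeClass.getName
}

// The class captures the ClassTag through an implicit constructor parameter.
class TypedImpl[T](implicit val tag: ClassTag[T]) extends Typed[T]

// The concrete type argument is fixed here; the compiler supplies the ClassTag.
class StringTyped extends TypedImpl[String]

object TypedDemo {
  val name: String = new StringTyped().runtimeName
}
```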
For now, Scala represents traits as interfaces so this technique will work. There are some problems with this approach to implementing traits, however, in that when methods are added to a trait, the implementing class will not necessarily recompile because the interface representation only has a forwarding method pointing to another class that actually implements the method concretely. In response to this there was talk earlier this year of using interface injection into the JVM at runtime to get around this problem. If the powers that be use this approach then the trait's type information will be lost before you can capture it.
The type information is accessible with the Java reflection API. It's not pretty but it works:
trait A[T]{
def typeParameter = {
val genericType = getClass.getGenericInterfaces()(0).asInstanceOf[ParameterizedType]
genericType.getActualTypeArguments()(0)
}
}
class B extends A[Int]
new B().typeParameter -> java.lang.Integer
Some invariant checks should be added; I've only implemented the happy path.
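A sketch of what those checks might look like: instead of casting the first interface blindly, search for the right parameterized interface and return an Option. Param and StringParam are made-up names, and this still relies on the compiler emitting Java generic signatures for the class, which is an assumption about the Scala version in use.

```scala
import java.lang.reflect.{ParameterizedType, Type}

trait Param[T] {
  // Returns None instead of throwing when no matching generic interface is found.
  def typeParameter: Option[Type] =
    getClass.getGenericInterfaces.collectFirst {
      case p: ParameterizedType if p.getRawType == classOf[Param[_]] =>
        p.getActualTypeArguments()(0)
    }
}

// A reference type is used here on purpose; primitive type arguments such as Int
// may be represented differently in the generic signature.
class StringParam extends Param[String]

object ParamDemo {
  val found: Option[Type] = new StringParam().typeParameter
}
```

This only inspects directly implemented interfaces; a class that inherits Param through a superclass would need a walk up the hierarchy as well.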
I found a solution that works, but it's pretty awkward since it requires the test class to call a method (clazz) on the trait before any of the trait's lazy vals are evaluated.
/**
* Utility trait for HttpUnit/ServletUnit tests
*
* @param [T] Type parameter for the class under test
*/
trait ServletUnitTest[T <: HttpServlet] {
/**
* Resource name of the servlet, used to construct the servlet URL.
*/
val servletName: String
/**
* Servlet class under test
*/
val servletClass: Class[_] // = clazz
protected def clazz(implicit m: Manifest[T]) = m.erasure
/**
* ServletUnit {@link ServletRunner}
*/
sealed lazy val servletRunner: ServletRunner = {
val sr = new ServletRunner();
sr.registerServlet(servletName, servletClass.getName);
sr
}
/**
* A {@link com.meterware.servletunit.ServletUnitClient}
*/
sealed lazy val servletClient = servletRunner.newClient
/**
* The servlet URL, useful for constructing WebRequests
*/
sealed lazy val servletUrl = "http://localhost/" + servletName
def servlet(ic: InvocationContext) = ic.getServlet.asInstanceOf[T]
}
class MyServletTest extends ServletUnitTest[MyServlet] {
val servletName = "download"
val servletClass = clazz
// ... test code ...
}