How does one implement a Hadoop Mapper in Scala 2.9.0?

When I migrated from Scala 2.8.1 to 2.9.0, all of the code was functional except for the Hadoop mappers. Because I had some wrapper objects in the way, I distilled the problem down to the following example:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.{Mapper, Job}

object MyJob {
  def main(args: Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}

class MyMapper extends Mapper[LongWritable, Text, Text, Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable, Text, Text, Text]#Context) {
  }
}
When I run this in 2.8.1, it runs quite well (and I have plenty of production code in 2.8.1). In 2.9.0 I get the following compilation error:
error: type mismatch;
 found   : java.lang.Class[MyMapper](classOf[MyMapper])
 required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])
The failure occurs when I call setMapperClass on the Job object. Here's the definition of that method:
public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }
The definition of the Mapper class itself is this:
public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
Does anyone have a sense of what I'm doing wrong? It looks to me like the type is fundamentally correct: MyMapper does extend Mapper, and the method wants something that extends Mapper. And it works great in 2.8.1...

Silly as it seems, you can work around the problem by defining the Mapper before the Job. The following compiles:
import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

class MyMapper extends Mapper[LongWritable, Text, Text, Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable, Text, Text, Text]#Context) {
  }
}

object MyJob {
  def main(args: Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}
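If reordering the declarations isn't practical, a cast also appears to satisfy the compiler. Treat this as a hedged sketch rather than a confirmed fix: it leans on scalac 2.9 reading Hadoop's raw Class<? extends Mapper> parameter as an existential type.

// Hedged workaround: coerce the Class value to the existential type the raw
// Java signature is read as. Unchecked at compile time, but safe for this call.
job.setMapperClass(classOf[MyMapper].asInstanceOf[Class[_ <: Mapper[_, _, _, _]]])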


Phantom dsl 2.24 tables are not created

I'm trying to migrate to the latest phantom version, 2.24.8. I created a dummy project, but I'm running into a few issues that I can't figure out. Here's my code:
import com.outworkers.phantom.connectors.{CassandraConnection, ContactPoints}
import com.outworkers.phantom.database.Database
import scala.concurrent.Future
import com.outworkers.phantom.dsl._

case class Test(id: String, timestamp: String)

abstract class Tests extends Table[Tests, Test] {
  object id extends StringColumn with PartitionKey
  object timestamp extends StringColumn with ClusteringOrder
}

abstract class ConcreteTests extends Tests with RootConnector {
  def addTest(l: Test): Future[ResultSet] = {
    // store(l).consistencyLevel_=(ConsistencyLevel.LOCAL_ONE).future
    insert.value(_.id, l.id)
      .value(_.timestamp, l.timestamp)
      .consistencyLevel_=(ConsistencyLevel.QUORUM).future
  }
}

class MyDB(override val connector: CassandraConnection) extends Database[MyDB](connector) {
  object tests extends ConcreteTests with connector.Connector
  def init(): Unit = {
    tests.create
  }
}

object Test {
  def main(args: Array[String]): Unit = {
    val db = new MyDB(ContactPoints(Seq("127.0.0.1")).keySpace("tests"))
    db.init
    db.tests.addTest(Test("1", "1323234234"))
    println("Done")
  }
}
It compiles and runs in IntelliJ and prints 'Done'. However, no table is ever created, and there are no exceptions or warnings; it does nothing. If I stop the local Cassandra database, the code throws NoHostAvailableException, so it does try to connect to the local database. What is the problem?
Another weird thing is that "com.typesafe.play" %% "play-json" % "2.6.9" is in my build.sbt. If I remove the library, the same code throws the following exception:
Exception in thread "main" java.lang.NoClassDefFoundError: scala/reflect/runtime/package$
at com.outworkers.phantom.column.AbstractColumn.com$outworkers$phantom$column$AbstractColumn$$_name(AbstractColumn.scala:55)
at com.outworkers.phantom.column.AbstractColumn.com$outworkers$phantom$column$AbstractColumn$$_name$(AbstractColumn.scala:54)
at com.outworkers.phantom.column.Column.com$outworkers$phantom$column$AbstractColumn$$_name$lzycompute(Column.scala:22)
at com.outworkers.phantom.column.Column.com$outworkers$phantom$column$AbstractColumn$$_name(Column.scala:22)
at com.outworkers.phantom.column.AbstractColumn.name(AbstractColumn.scala:58)
at com.outworkers.phantom.column.AbstractColumn.name$(AbstractColumn.scala:58)
at com.outworkers.phantom.column.Column.name(Column.scala:22)
at com.outworkers.phantom.builder.query.InsertQuery.value(InsertQuery.scala:107)
I really can't figure out what's going on. Any help?
BTW, I'm using Scala 2.12.6 and JVM 1.8.181.
You're not using the correct DSL method for table creation; have a look at the official guide. All that tests.create does is build an empty CreateQuery, and you're coercing the return type to Unit manually.
The automated blocking create method is on Database, not on table, so what you want is:
class MyDB(override val connector: CassandraConnection) extends Database[MyDB](connector) {
  object tests extends ConcreteTests with connector.Connector

  def init(): Unit = {
    this.create
  }
}
If you want to achieve the same thing using table, you need:
class MyDB(override val connector: CassandraConnection) extends Database[MyDB](connector) {
  object tests extends ConcreteTests with connector.Connector

  def init(): Unit = {
    import scala.concurrent.duration._
    import scala.concurrent.Await
    Await.result(tests.create.future(), 10.seconds)
  }
}
It's only the call to the future() method that triggers any action against the database; otherwise you're just building a query without executing it. The method name can be confusing, and we will improve the docs in future releases to make this more obvious.
The conflict with play 2.6.9 looks very weird; it's entirely possible there's an incompatible dependency behind the scenes to do with macro compilation. Raise it as a separate issue and we can definitely have a look.

`exception during macro expansion: [error] scala.reflect.macros.TypecheckException` when using quill

I'm pretty new to Scala, Play, and Quill and I'm not sure what I'm doing wrong. I have my project split up into models, repositories, and services (and controllers, but that is not relevant for this question). Right now, I'm getting this error for the lines in my services that are making changes to the database:
exception during macro expansion: scala.reflect.macros.TypecheckException: Can't find implicit `Decoder[models.AgentId]`. Please, do one of the following things:
1. ensure that implicit `Decoder[models.AgentId]` is provided and there are no other conflicting implicits;
2. make `models.AgentId` `Embedded` case class or `AnyVal`.
And I'm getting this error for all the other lines in my services:
exception during macro expansion: [error] scala.reflect.macros.TypecheckException: not found: value quote
I found a similar ticket, but the same fix does not work for me (I am already requiring ctx as an implicit variable, so I can't import it as well). I'm totally at a loss, and if anyone has any suggestions, I would be happy to try anything. I'm using the following versions:
Scala 2.12.4
Quill 2.3.2
Play 2.6.6
The code:
db/package.scala
package db

import io.getquill.{PostgresJdbcContext, SnakeCase}

package object db {
  class DBContext(config: String) extends PostgresJdbcContext(SnakeCase, config)

  trait Repository {
    val ctx: DBContext
  }
}
repositories/AgentsRepository.scala
package repositories

import db.db.Repository
import models.{Agent, AgentId}

trait AgentsRepository extends Repository {
  import ctx._

  val agents = quote {
    query[Agent]
  }

  def agentById(id: AgentId) = quote {
    agents.filter(_.id == lift(id))
  }

  def insertAgent(agent: Agent) = quote {
    query[Agent].insert(_.identifier -> lift(agent.identifier)).returning(_.id)
  }
}
services/AgentsService.scala
package services

import db.db.DBContext
import models.{Agent, AgentId}
import repositories.AgentsRepository

import scala.concurrent.ExecutionContext

class AgentService(implicit val ex: ExecutionContext, val ctx: DBContext)
  extends AgentsRepository {

  def list: List[Agent] =
    ctx.run(agents)

  def find(id: AgentId): List[Agent] =
    ctx.run(agentById(id))

  def create(agent: Agent): AgentId = {
    ctx.run(insertAgent(agent))
  }
}
models/Agent.scala
package models

import java.time.LocalDateTime

case class AgentId(value: Long) extends AnyVal

case class Agent(
  id: AgentId,
  identifier: String
)
"I am already requiring ctx as an implicit variable, so I can't import it as well"
You don't need to import the context itself, but everything inside it, in order to make it work:
import ctx._
Make sure to place it before ctx.run is called, as in https://github.com/getquill/quill/issues/998#issuecomment-352189214
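Putting that advice into the question's service, here is a hedged sketch of the fix; it also adds explicit MappedEncodings as a fallback in case the AnyVal-derived Decoder[AgentId] is still not found (the encoder/decoder val names are hypothetical):

import io.getquill.MappedEncoding

class AgentService(implicit val ex: ExecutionContext, val ctx: DBContext)
    extends AgentsRepository {
  import ctx._ // brings run, quote, lift and the implicit encoders into scope

  // Fallback if the AnyVal wrapper alone does not yield a Decoder[AgentId]:
  implicit val agentIdEncoder = MappedEncoding[AgentId, Long](_.value)
  implicit val agentIdDecoder = MappedEncoding[Long, AgentId](AgentId.apply)

  def list: List[Agent] = run(agents)
  def find(id: AgentId): List[Agent] = run(agentById(id))
  def create(agent: Agent): AgentId = run(insertAgent(agent))
}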

Check what the method of mocked object receives

In my project, whenever a class produces some output, instead of doing println it calls OutputStore.write, which is a class and method I defined.
I am trying to test the output of another class, so I mocked OutputStore. I want to see what parameters OutputStore.write receives.
val mockOutputStore = mock[OutputStore]
I would like to do something like this:
val argument = ArgumentCaptor.forClass(classOf[OutputStore])
verify(mockOutputStore).write(argument.capture())
assertEquals("some parameter", argument.getValue())
However, this doesn't compile as verify is not even recognized.
The signature of my test class is this:
class SomeUnitTestSet extends org.scalatest.FunSuite with MockitoSugar with PropertyChecks
Any idea how to check what parameters a mocked object's method receives?
Here is a translation of what @JBNizet suggested into Scala code.
Assuming you have your OutputStore class
class OutputStore {
  def write(msg: String) = {
    println(msg)
  }
}
and some OutputStoreApiUser class
class OutputStoreApiUser(val outputStore: OutputStore) {
  def foo(): Unit = {
    outputStore.write("some parameter")
    outputStore.write("some parameter2")
  }
}
Then your test might be something like this (in real life you would probably @Inject outputStore, but that is not relevant here):
import org.mockito.Mockito.verify // static import!
import org.scalatest.mockito.MockitoSugar
import org.scalatest.prop.PropertyChecks

class SomeUnitTestSet extends org.scalatest.FunSuite with MockitoSugar with PropertyChecks {
  test("Capture calls") {
    val mockOutputStore = mock[OutputStore]
    val apiUser = new OutputStoreApiUser(mockOutputStore)

    apiUser.foo()

    verify(mockOutputStore).write("some parameter")
    verify(mockOutputStore).write("some parameter2")
  }
}
This one compiles and works for me as I would expect.
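If you specifically want the ArgumentCaptor style from the question, here is a hedged sketch of the same verification inside that test body. Note the captor is parameterized by the type of the captured argument (String), not by the mocked class:

import org.mockito.ArgumentCaptor
import org.mockito.Mockito.{times, verify}

val captor: ArgumentCaptor[String] = ArgumentCaptor.forClass(classOf[String])
verify(mockOutputStore, times(2)).write(captor.capture())
// getAllValues lists every captured argument in call order
assert(captor.getAllValues.get(0) == "some parameter")
assert(captor.getAllValues.get(1) == "some parameter2")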

Scala: How to test methods that call System.exit()?

I have been developing a command-line tool which calls System.exit() on certain inputs (I don't want to use exceptions instead).
I am familiar with Java: How to test methods that call System.exit()? and it's the most elegant approach.
Unfortunately, it is not pure enough, because I had to add dependencies on system-rules and junit-interface.
Is there any common pattern for dealing with System.exit in specs2 that is more pure than my current approach, which doesn't use specs2?
import org.junit.Rule;
import org.junit.Test;
import org.junit.contrib.java.lang.system.ExpectedSystemExit;

public class ConverterTest {

    @Rule
    public final ExpectedSystemExit exit = ExpectedSystemExit.none();

    @Test
    public void emptyArgs() {
        exit.expectSystemExit();
        Converter.main(new String[]{});
    }

    @Test
    public void missingOutArgument() {
        exit.expectSystemExitWithStatus(1);
        Converter.main(new String[]{"--in", "src/test/resources/078.xml.gz"});
    }
}
If you really wish to go with a method using System.exit(), the simplest way to test that it was actually called is to replace your SecurityManager with one that throws an ExitException (subclassing SecurityException) when System.exit() is called:
class SystemExitSpec
import java.security.Permission

import org.specs2.mutable.Specification
import org.specs2.specification.BeforeAfterAll

sealed case class ExitException(status: Int) extends SecurityException("System.exit() is not allowed")

sealed class NoExitSecurityManager extends SecurityManager {
  override def checkPermission(perm: Permission): Unit = {}
  override def checkPermission(perm: Permission, context: Object): Unit = {}
  override def checkExit(status: Int): Unit = {
    super.checkExit(status)
    throw ExitException(status)
  }
}

abstract class SystemExitSpec extends Specification with BeforeAfterAll {
  sequential

  override def beforeAll(): Unit = System.setSecurityManager(new NoExitSecurityManager())
  override def afterAll(): Unit = System.setSecurityManager(null)
}
test ConverterSpec
import org.specs2.execute.Failure

import scala.io.Source

class ConverterSpec extends SystemExitSpec {

  "ConverterSpec" should {
    "empty args" >> {
      try {
        Converter.main(Array[String]())
        Failure("shouldn't read this code")
      } catch {
        case e: ExitException =>
          e.status must_== 1
      }
      1 must_== 1
    }
  }
}
First option: use some exception instead of System.exit.
Second option: call the application in a separate thread and check return codes.
Third option: mock System.exit. There are many ways to do that; the one mentioned is quite good.
However, there is no specs2-specific pattern for working with System.exit. Personally, I'd suggest the first or second option.
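For the first option, a minimal hedged sketch (Converter's internals here are hypothetical): route the exit through an injectable function, so production code still terminates the JVM while tests substitute an exception-throwing variant.

// Hypothetical sketch: the exit strategy is injected instead of hard-coded.
class Converter(exit: Int => Nothing = sys.exit) {
  def run(args: Array[String]): Unit = {
    if (args.isEmpty) exit(1) // terminates the JVM in production
    // ... conversion logic ...
  }
}

// In a test, substitute an exit that throws instead of terminating:
case class TestExit(status: Int) extends RuntimeException
// new Converter(status => throw TestExit(status)).run(Array())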

Unable to override java.lang.String field. What is wrong?

I tried to compile code that contains
class FixedIndexedRepository(override val name: java.lang.String, location: URI) extends FixedIndexedRepo
which extends FixedIndexedRepo, which in turn extends the Java class AbstractIndexedRepo:
public abstract class AbstractIndexedRepo implements RegistryPlugin, Plugin, RemoteRepositoryPlugin, IndexProvider, Repository {
    ...
    protected String name = this.getClass().getName();
    ...
}
Unfortunately, the Scala 2.9.2 compiler stops with an error:
.../FixedIndexedRepository.scala:29: overriding variable name in class AbstractIndexedRepo of type java.lang.String;
[error] value name has incompatible type
[error] class FixedIndexedRepository(override val name: java.lang.String, location: URI) extends FixedIndexedRepo
How do I fix this? What is wrong?
Rex says it is ugly:
Making a public accessor from an inherited protected Java field
Given:
package j;

public class HasName {
    protected String name = "name";
}
then the fake-out is:
package user

private[user] class HasNameAdapter extends j.HasName {
  protected def named: String = name
  protected def named_=(s: String) { name = s }
}

class User(n: String = "nom") extends HasNameAdapter {
  def name(): String = named
  def name_=(s: String) { this named_= s }
  this named_= n
}

object Test extends App {
  val u = new User("bob")
  Console println s"user ${u.name()}"
  Console println s"user ${u.name}"
}
You were forewarned about the ugly.
I haven't quite worked out the details either, but the weekend is coming up.
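If the goal is only to set the inherited field rather than override it with a val, a simpler hedged alternative (reusing j.HasName from above; names are mine) is to give the constructor parameter a non-clashing name and assign the protected field in the constructor body:

// Hedged sketch: no override, just assign the inherited protected Java field.
class SimpleUser(n: String = "nom") extends j.HasName {
  name = n                     // direct assignment; `name` is a mutable Java field
  def userName: String = name  // expose it under a non-clashing name
}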
"Unfortunately Scala 2.9.2 compiler stops with an error"
You mean, fortunately it stops with an error.