Mocking DynamoDB in Spark with Mockito - scala

I want to mock the Utilities function dynamoDBStatusWrite so that when my Spark program runs, it does not hit DynamoDB.
Below is my mocking and test case code:
class FileConversion1Test extends FlatSpec with MockitoSugar with Matchers with ArgumentMatchersSugar with SparkSessionTestWrapper {

  "File Conversion" should "convert the file to" in {
    val utility = mock[Utilities1]
    val client1 = mock[AmazonDynamoDB]
    val dynamoDB1 = mock[DynamoDB]
    val dynamoDBFunc = mock[Utilities1].dynamoDBStatusWrite("test", "test", "test", "test")
    val objUtilities1 = new Utilities1

    FieldSetter.setField(objUtilities1, objUtilities1.getClass.getDeclaredField("client"), client1)
    FieldSetter.setField(objUtilities1, objUtilities1.getClass.getDeclaredField("dynamoDB"), dynamoDB1)
    FieldSetter.setField(objUtilities1, objUtilities1.getClass.getField("dynamoDBStatusWrite"), dynamoDBFunc)

    when(utility.dynamoDBStatusWrite("test", "test", "test", "test")).thenReturn("pass")

    assert(FileConversion1.fileConversionFunc(spark, "src/test/inputfiles/userdata1.csv", "parquet",
      "src/test/output", "exec1234567", "service123") === "passed")
  }
}
My Spark program should not try to connect to DynamoDB, but it is still trying to connect.

You have two problems there. For starters, the fact that you mock something doesn't automatically replace it in your system; you need to build your software so that components are injected, and then in the test you provide a mock version of them. I.e., fileConversionFunc should receive another parameter with the connector to Dynamo.
That said, it's considered bad practice to mock library/third-party classes. What you should do instead is create your own component that encapsulates the interaction with Dynamo, and then mock that component, since it's an API you control.
You can find a detailed explanation of why here
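As a concrete illustration, here is a minimal sketch of that refactor. DynamoStatusWriter, writeStatus, and the extra statusWriter parameter are assumptions about how you might restructure the code, not your existing API:

// Hypothetical trait: encapsulate the DynamoDB interaction behind an API you own.
trait DynamoStatusWriter {
  def writeStatus(table: String, key: String, status: String, execId: String): String
}

// Production implementation: the only code that touches the AWS SDK.
class DynamoStatusWriterImpl(dynamoDB: DynamoDB) extends DynamoStatusWriter {
  override def writeStatus(table: String, key: String, status: String, execId: String): String = {
    // ... real PutItem call via dynamoDB ...
    "pass"
  }
}

// fileConversionFunc would then take the trait as a parameter instead of
// constructing a Utilities1 internally, and the test injects a Mockito mock:
//
//   val statusWriter = mock[DynamoStatusWriter]
//   when(statusWriter.writeStatus(any, any, any, any)).thenReturn("pass")
//   FileConversion1.fileConversionFunc(spark, "src/test/inputfiles/userdata1.csv",
//     "parquet", "src/test/output", "exec1234567", "service123", statusWriter)

With that shape, the Spark job never opens a Dynamo connection in tests, because the only Dynamo-aware class is the one you chose not to instantiate.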

Related

Cannot mock the RedisClient class - method pipeline overrides nothing

I have a CacheService class that uses an instance of the scala-redis library
class CacheService(redisClient: RedisClient) extends HealthCheck {
  private val client = redisClient

  override def health: Future[ServiceHealth] = {
    client.info
    ...
  }
}
In my unit test, I'm mocking the client instance and testing the service
class CacheServiceSpec extends AsyncFlatSpec with AsyncMockFactory {
  val clientMock = mock[RedisClient]
  val service = new CacheService(clientMock)

  "A cache service" must "return a successful future when healthy" in {
    (clientMock.info _).expects().returns(Option("blah"))
    service.health map {
      health => assert(health.status == ServiceStatus.Running)
    }
  }
}
yet I'm getting this compilation error
Error:(10, 24) method pipeline overrides nothing.
Note: the super classes of <$anon: com.redis.RedisClient> contain the following, non final members named pipeline:
def pipeline(f: PipelineClient => Any): Option[List[Any]]
val clientMock = mock[RedisClient]
My research so far indicates ScalaMock 4 is NOT capable of mocking companion objects. The author suggests refactoring the code with Dependency Injection.
Am I doing DI correctly (I chose constructor args injection since our codebase is still relatively small and straightforward)? Seems like the author is suggesting putting a wrapper over the client instance. If so, I'm looking for an idiomatic approach.
Should I bother with swapping out for another redis library? The libraries being actively maintained, per redis.io's suggestion, use companion objects as well. I personally think this is not a problem with these libraries.
I'd appreciate any further recommendations. My goal here is to create a health check for our external services (redis, postgres database, emailing and more) that is at least testable. Criticism is welcomed since I'm still new to the Scala ecosystem.
Am I doing DI correctly (I chose constructor args injection since our codebase is still relatively small and straightforward)? Seems like the author is suggesting putting a wrapper over the client instance. If so, I'm looking for an idiomatic approach.
Yes, you are right, and this seems to be a known issue (link1). Ideally, there needs to be a wrapper around the client instance. One approach is to create a trait that has a method, say connect, extend it in RedisCacheDao, and implement connect to give you the client instance whenever you need it. Then all you have to do is mock this connection interface and you will be able to test.
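One way to read that advice, as a minimal sketch: wrap the operations the service actually uses behind a trait you own and mock that trait. RedisHealth and RedisHealthImpl are illustrative names, not part of scala-redis:

import com.redis.RedisClient

// Wrap just the operations CacheService needs behind an interface you own.
trait RedisHealth {
  def info: Option[String]
}

// The production implementation is the only code touching the library class.
class RedisHealthImpl(client: RedisClient) extends RedisHealth {
  override def info: Option[String] = client.info
}

// CacheService then takes a RedisHealth instead of a RedisClient, so the test
// mocks a plain Scala trait and never hits the "overrides nothing" error:
//   val redisMock = mock[RedisHealth]
//   (redisMock.info _).expects().returns(Some("blah"))
//   val service = new CacheService(redisMock)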
Another approach could be to use embedded Redis for unit testing, though usually it is used for integration testing. You can start a simple Redis server from code where the tests are running and close it once the testing is done.
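A hedged sketch of that second approach, assuming the com.github.kstyrc:embedded-redis dependency (RedisServer below comes from that Java library):

import com.redis.RedisClient
import org.scalatest.{BeforeAndAfterAll, FlatSpec}
import redis.embedded.RedisServer

class CacheServiceEmbeddedSpec extends FlatSpec with BeforeAndAfterAll {
  // In-process Redis server: no external service needed while the tests run.
  private val redisServer = new RedisServer(6379)

  override def beforeAll(): Unit = redisServer.start()
  override def afterAll(): Unit = redisServer.stop()

  "A cache service" should "talk to the embedded server" in {
    val client = new RedisClient("localhost", 6379)
    assert(client.info.isDefined)
  }
}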
Should I bother with swapping out for another redis library? The libraries being actively maintained, per redis.io's suggestion, use companion objects as well. I personally think this is not a problem with these libraries.
You can certainly do that. I would prefer Jedis, as it is easy to use and its performance is better than scala-redis's (e.g., when performing mget).
Let me know if it helps!!

How do I create a "Locator" for Akka Actors

I am an Akka newbie and I'm trying to build an application that is composed of Spray and Akka. As part of the application I would like to give my fellow developers (who are also new to Akka) some prepackaged actors that do specific things, which they can then "attach" to their actor systems.
Specifically :
Is there a recommended way to provide an "actor locator"/"actor system locator" -- think of a service-locator-like API to look up and send messages to actors? In other words, how can I implement a function like:
ActorLocator.GoogleLocationAPIActor
so that I can then use it like :
ActorLocator.GoogleLocationAPIActor ! "StreetAddress"
Assume that GoogleLocationAPIActor is an ActorRef that accepts strings that are street addresses and makes an HTTP call to Google to resolve each to a lat/lon.
I could internally use actorSelection, but:
1. I would like to provide the GoogleLocationAPIActor as part of a library that my fellow developers can use.
2. #1 means that when a fellow developer builds an actor-based application, he needs a way to tell the library code where the actor system is, so that the library can go and attach the actor to it (in keeping with the one-actor-system-per-application practice). Of course, in a distributed environment this could be a guardian for a cluster of actors that are running remotely.
Currently I define the ActorSystem in an object and access it everywhere, like:
object MyStage {
  val system: ActorSystem = ActorSystem("my-stage")
}
then
object ActorLocator {
  val GoogleLocationAPIActor = MyStage.system.actorOf(Props[GoogleLocationAPI])
}
This approach seems to be similar to this, but I'm not very sure it is a good thing. My concern is that the system seems too open for anyone to add children to, without any supervision hierarchy; it seems a bit ugly.
Is my ask a reasonable one, or am I thinking about this wrong?
How can we "build up" a library of actors that we can reuse across apps?
Since this is about designing an API, you're dangerously close to opinion territory, but anyway, here is how I would be tempted to structure this. Personally I'm quite allergic to global singletons, so:
Since ActorLocator is a service, I would organize it as a trait:
trait ActorLocator {
  def GoogleLocationAPIActor: ActorRef
  def SomeOtherAPIActor: ActorRef
}
Then, you can have an implementation of the service such as:
class ActorLocatorLocalImpl(system: ActorSystem) extends ActorLocator {
  override lazy val GoogleLocationAPIActor: ActorRef =
    system.actorOf(Props[GoogleLocationAPI])
  // etc.
}
And a factory object:
object ActorLocator {
  def local(system: ActorSystem): ActorLocator =
    new ActorLocatorLocalImpl(system)
}
If you need to create more complex implementations of the service and more complex factory methods, users, having constructed a service, still just deal with the interface of the trait.
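Putting it together, call sites only ever see the trait, so a remote or test implementation can be swapped in without touching them:

import akka.actor.ActorSystem

// The application owns the single ActorSystem and builds the locator once.
val system = ActorSystem("my-stage")
val locator: ActorLocator = ActorLocator.local(system)

// Call sites depend only on the ActorLocator trait.
locator.GoogleLocationAPIActor ! "StreetAddress"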

Play framework - testing data access layer without started application

Is there a way to write tests for the data access objects (DAOs) in Play Framework 2.x without starting an app?
Tests with a fake app are relatively slow, even if the database is an in-memory H2 as the docs suggest.
After experiencing similar issues with the execution time of tests using FakeApplication, I switched to a different approach. Instead of creating one fake app per test, I start a real instance of the application and run all my tests against it. With a large test suite there is a big win in total execution time.
http://yefremov.net/blog/fast-functional-tests-play/
For unit testing, a good solution is mocking. If you are using Play 2.4 or above, Mockito is already built in, and you do not have to add it as a separate dependency.
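For instance, here is a minimal sketch of that style using the specs2 Mockito support that ships with Play's test dependencies; UserDao, User, and UserService are hypothetical names used only for illustration:

import org.specs2.mock.Mockito
import org.specs2.mutable.Specification

// Hypothetical domain types, defined here only to keep the sketch self-contained.
case class User(id: Int, name: String)
trait UserDao { def find(id: Int): Option[User] }
class UserService(dao: UserDao) {
  def name(id: Int): Option[String] = dao.find(id).map(_.name)
}

class UserServiceSpec extends Specification with Mockito {
  "UserService" should {
    "return the user name from its DAO" in {
      val dao = mock[UserDao]
      dao.find(123) returns Some(User(123, "alice"))

      new UserService(dao).name(123) must beSome("alice")
    }
  }
}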
For integration testing, you cannot run tests without a fake application, since your DAOs may require application context information, for example values defined in application.conf. In this case, you must set up a FakeApplication with fake configuration so that the DAOs have that information.
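A hedged sketch of that setup (the configuration keys and the H2 URL are illustrative; adjust them to whatever your DAOs actually read):

import org.specs2.mutable.Specification
import play.api.test.{FakeApplication, WithApplication}

class UserDaoIntegrationSpec extends Specification {
  // Override the DB settings the DAO would otherwise read from application.conf.
  def fakeApp = FakeApplication(additionalConfiguration = Map(
    "db.default.driver" -> "org.h2.Driver",
    "db.default.url"    -> "jdbc:h2:mem:test;MODE=PostgreSQL"
  ))

  "UserDao" should {
    "round-trip a row against the in-memory database" in new WithApplication(fakeApp) {
      // ... exercise the DAO here; it sees the fake configuration ...
      success
    }
  }
}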
This sample repo, https://github.com/luongbalinh/play-mongo/tree/master/test, contains tests at the service and controller layers, including both unit tests with Mockito and integration tests. Integration tests for DAOs should be very similar to the service tests. Hopefully it gives you a hint of how to use Mockito to write DAO tests.
It turns out the Database object can be constructed directly via the Databases factory, so I ended up with a trait like this one:
import org.scalatest.{BeforeAndAfterAll, Suite, SuiteMixin}
import play.api.db.Databases

trait DbTests extends BeforeAndAfterAll with SuiteMixin { this: Suite =>
  val dbUrl = sys.env.getOrElse("DATABASE_URL",
    "jdbc:postgresql://localhost:5432/test?user=user&password=pass")

  val database = Databases("org.postgresql.Driver", dbUrl, "tests")

  override def afterAll() = {
    database.shutdown()
  }
}
then use it the following way:
import org.scalatestplus.play.PlaySpec

// PlaySpec (WordSpec + MustMatchers) provides the "in" and "mustEqual" syntax;
// any WordSpec-style suite would work here.
class SampleDaoTest extends PlaySpec with DbTests {
  val myDao = new MyDao(database) // construct the DAO; the database is injected, so it can be passed

  "read from db" in {
    myDao.read(id = 123) mustEqual MyClass(123)
  }
}

Configure play-slick and samples

I'm currently trying to use Play! Framework 2.2 and play-slick (master branch).
In the play-slick code I would like to override the driver definition in order to add the Oracle driver (I'm using slick-extensions). In Config.scala of play-slick I just saw /** Extend this to add driver or change driver mapping */ ...
I'm coming from far, far away (currently reading Programming in Scala), so there's a lot to learn. My questions are:
Can someone explain to me how to extend this Config object? This object is used in other classes... Is the cake pattern useful here?
Talking about the cake pattern, I read the computer-database example provided by play-slick. This sample uses the cake pattern and imports play.api.db.slick.Config.driver.simple._. If I'm using the Oracle driver, I cannot use this import, am I wrong? How can I use the cake pattern to define an implicit session?
Thanks a lot. Waiting for your advice; I'm still studying the play-slick code at home :)
To extend the Config trait I do not think the cake pattern is required. You should be able to create your Config object like this:
import scala.slick.driver.ExtendedDriver
// OracleDriver below comes from the slick-extensions package; import it from there.

object MyExtendedConfig extends play.api.db.slick.Config {
  override def driverByName: String => Option[ExtendedDriver] = { name: String =>
    super.driverByName(name) orElse Map("oracledriverstring" -> OracleDriver).get(name)
  }

  lazy val app = play.api.Play.current
  lazy val driver: ExtendedDriver = driver()(app)
}
To be able to use it you only need to do import MyExtendedConfig.driver._ instead of import play.api.db.slick.Config.driver._. BTW, I see that the type of driverByName could have been a Map instead of a function, which would have made it easier to extend; changing that wouldn't break anything, though.
I think Jonas Bonér's old blog is a great place to read about what the cake pattern is (http://jonasboner.com/2008/10/06/real-world-scala-dependency-injection-di/). My naive understanding of it is that you have a cake pattern when you have layers that use self types:
trait FooComponent { driver: ExtendedDriver =>
  import driver.simple._

  class Foo extends Table[Int]("") {
    //...
  }
}
There are two use cases for the cake pattern in slick/play-slick: 1) when you have tables that reference other tables (as in the computer-database sample); 2) when you want control over exactly which database is used at which time, or you use many different driver types. By using the Config you do not really need the cake pattern as long as you only have two different DBs (one for prod and one for test), which is the point of the Config.
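To make use case 1 concrete, here is a hedged sketch of a second component sharing the same self-typed driver. BarComponent is illustrative, and how the concrete Oracle driver gets mixed in depends on how slick-extensions exposes it:

trait BarComponent { driver: ExtendedDriver =>
  import driver.simple._

  class Bar extends Table[Int]("BAR") {
    // ... columns here can reference Foo's table, because both components
    // are guaranteed to be built against the same driver profile ...
  }
}

// Wiring sketch: one concrete object provides the driver and mixes in every
// component (assuming the Oracle driver can be extended as an ExtendedDriver):
// object DAL extends OracleDriver with FooComponent with BarComponent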
Hope this answers your questions, and good luck with reading Programming in Scala (loved that book :)

dynamically create class in scala, should I use interpreter?

I want to create a class at run time in Scala. For now, just consider a simple case where I want to make the equivalent of a Java bean with some attributes; I only know these attributes at run time.
How can I create the Scala class? I am willing to create it from a Scala source file if there is a way to compile and load it at run time, and I may want to, since I sometimes have a complex function I want to add to the class. How can I do it?
I worry that the Scala interpreter, which I read about, sandboxes the interpreted code it loads so that it won't be available to the general application hosting the interpreter. If that is the case, I wouldn't be able to use the dynamically loaded Scala class.
Anyway, the question is: how can I dynamically create a Scala class at run time and use it in my application? The best case is to load it from a Scala source file at run time, something like interpreterSource("file.scala") that loads into my current runtime; second best is creating it by calling methods, i.e. createClass(...) at run time.
Thanks, Phil
There's not enough information to know the best answer, but do remember that you're running on the JVM, so any techniques or bytecode engineering libraries valid for Java should also be valid here.
There are hundreds of techniques you might use, but the best choice depends totally on your exact use case, as many aren't general purpose. Here are a couple of ideas though:
- For a simple bean, you may as well just use a map, or look into the DynaBean class from Apache Commons.
- For more advanced behaviour you could invoke the compiler explicitly and then grab the resulting .class file via a classloader (this is largely how JSPs do it).
- A parser and custom DSL fit well in some cases, as does BeanShell scripting.
Check out the ScalaDays video here: http://days2010.scala-lang.org/node/138/146, which demonstrates the use of Scala as a JSR-223-compliant scripting language.
This should cover most scenarios where you'd want to evaluate Scala at runtime.
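As a rough illustration of that route on modern Scala versions, where the engine is registered under the name "scala" and needs scala-compiler on the classpath (a sketch, not the exact API shown in the talk):

import javax.script.ScriptEngineManager

object ScalaScripting extends App {
  // Look up the Scala script engine via the standard JSR-223 mechanism;
  // this returns null if scala-compiler is not on the classpath.
  val engine = new ScriptEngineManager().getEngineByName("scala")

  // Code evaluated here runs in the hosting JVM; the result is a plain
  // object the application can keep using (no sandboxing).
  val result = engine.eval("""List(1, 2, 3).map(_ * 2).sum""")
  println(result) // 12
}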
You'll also want to look at the email thread here: http://scala-programming-language.1934581.n4.nabble.com/Compiler-API-td1992165.html#a1992165
This contains the following sample code:
// We currently call the compiler directly.
// To reduce coupling, we could instead use ant and the scalac ant task.
import java.lang.reflect.{Method, Modifier}
import java.net.{URL, URLClassLoader}

import scala.tools.nsc.{Global, Settings}
import scala.tools.nsc.reporters.ConsoleReporter

{
  // called in the event of a compilation error
  def error(message: String): Nothing = ...

  val settings = new Settings(error)
  settings.outdir.value = classesDir.getPath
  settings.deprecation.value = true // enable detailed deprecation warnings
  settings.unchecked.value = true   // enable detailed unchecked warnings

  val reporter = new ConsoleReporter(settings)
  val compiler = new Global(settings, reporter)
  (new compiler.Run).compile(filenames)

  reporter.printSummary
  if (reporter.hasErrors || reporter.WARNING.count > 0) {
    ...
  }
}

val mainMethod: Method = {
  val urls = Array[URL](classesDir.toURL)
  val loader = new URLClassLoader(urls)
  try {
    val clazz: Class[_] = loader.loadClass(...)
    val method: Method = clazz.getMethod("main", classOf[Array[String]])
    if (Modifier.isStatic(method.getModifiers)) {
      method
    } else {
      ...
    }
  } catch {
    case cnf: ClassNotFoundException => ...
    case nsm: NoSuchMethodException => ...
  }
}

mainMethod.invoke(null, Array[Object](args): _*)