Apache Beam unable to serialize due to interface that I want to mock - apache-beam

java.lang.IllegalArgumentException: unable to serialize DoFnWithExecutionInformation{doFn=com.orderly.dataflow.RosterFileReader@60ec7684, mainOutputTag=Tag, sideInputMapping={}, schemaInformation=DoFnSchemaInformation{elementConverters=[]}}
I am not sure this is possible. I understand that in Apache Beam these functions must be serializable so they can be scaled out, but at test time I also want to mock the place we read from.
Is there some kind of context I can create to inject a mockable interface for reading?
Here is my code:
public class RosterFileReader extends DoFn<String, PractitionerStandardOutputDto> {
    private static final Logger log = LoggerFactory.getLogger(RosterFileReader.class);
    private final String projectId;
    private GCPBucketStorage storage;
    public RosterFileReader(String projectId, GCPBucketStorage storage) {
I am wondering if there is a context I can pass in. GCPBucketStorage.java is OUR API, since Google's was not that mockable. This way we have full control over throwing exceptions and testing recovery as well as other scenarios.
EDIT: I would be willing to settle for some code like this
if (isRunningLocally) {
    storage = new MockStorage();
} else {
    storage = new GCPBucketStorageImpl();
}
It kind of sucks having test code like that in production code, but this end-to-end test has already caught bugs that were missed by the unit testing people are doing. We generally do not do single-class or unit testing and only do what Twitter calls Feature Testing, since it allows huge refactors without touching tests -> https://blog.twitter.com/engineering/en_us/topics/insights/2017/the-testing-renaissance.html
I can reach into the mock using static fields, I guess. (Again, I hate doing that, but having this test truly end to end is so valuable.)
EDIT 2: Is serialization a lot like Hadoop, where you have to define classes to deploy along with the main jar? Perhaps I just need a document on how to make GCPStorage and GCPBucketStorageImpl serializable (and MockStorage as well, most likely, since it is in production code :( ). The if..else is totally worth it for the integration bugs we are finding pre-CI, so people never break the code on master.
EDIT 3:
This looks very promising -> https://gist.github.com/jlewi/f1cd323dc88bd58601ef
Will update post after trying.
thanks,
Dean

Actually, just making the interface serializable seems to work in the test (not tested in production yet), along with this last line in the production code, which is ugly (and I wonder if I can inject this instead) ->
Yup, pretty ugly to have test code in production :(. Of BIG note: instead of injecting the interface, I inject GCPBucketStorageImpl.java (our prod implementation) and pass in the mock, which subclasses the prod class. This is done so that the test catches someone modifying the production class in a way that makes it non-serializable; otherwise we would not catch that issue.
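To illustrate why making the storage type serializable removes the error, here is a minimal sketch that uses only plain Java serialization and no Beam classes. BucketStorage, BucketStorageImpl, MockStorage and FileReaderFn are hypothetical stand-ins for GCPBucketStorage, GCPBucketStorageImpl, the mock and RosterFileReader; the assumption is that the runner serializes the function with standard Java serialization, so every non-transient field must be serializable as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the storage interface; extending Serializable
// is the change described above.
interface BucketStorage extends Serializable {
    String read(String path);
}

// Hypothetical production implementation.
class BucketStorageImpl implements BucketStorage {
    private static final long serialVersionUID = 1L;

    @Override
    public String read(String path) {
        return "prod:" + path;
    }
}

// The mock subclasses the production class, so a non-serializable field
// added to the production class would also break the test.
class MockStorage extends BucketStorageImpl {
    private static final long serialVersionUID = 1L;

    @Override
    public String read(String path) {
        return "mock:" + path;
    }
}

// Hypothetical stand-in for the DoFn: it keeps the storage in a field,
// so that field's type has to be serializable too.
class FileReaderFn implements Serializable {
    private static final long serialVersionUID = 1L;
    private final BucketStorageImpl storage;

    FileReaderFn(BucketStorageImpl storage) {
        this.storage = storage;
    }

    String process(String path) {
        return storage.read(path);
    }
}

class SerializationCheck {
    // Serializes and deserializes a value; failures are rethrown as
    // IllegalArgumentException, similar in spirit to the error in the question.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalArgumentException("unable to serialize " + value, e);
        }
    }
}
```

A test could call SerializationCheck.roundTrip(new FileReaderFn(new MockStorage())) and then use the returned copy, which checks both the mock and the production class it extends.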

Related

Modifying Autofac Scope to Support XUnit Testing

I use Autofac extensively. Recently I've gotten interested in tweaking the lifetime scopes when registering items for XUnit testing. Basically I want to register a number of standard components I use as "instance per test" rather than what I normally do for runtime (I've found a useful library on github that defines an instance-per-test lifetime).
One way to do this is to define two separate container builds, one for runtime and one for xunit testing. That would work but it gets increasingly expensive to maintain.
What I'd like to do (I think) is modify the registration pipeline dynamically depending upon the context -- runtime or xunit test -- in which it is being built. In pseudocode:
builder.RegisterType<SomeType>().AsImplementedInterfaces().SingleInstance();
...
void TweakPipeline(...)
{
    if( Testing )
    {
        TypeBeingRegistered.InstancePerTest();
    }
    else
    {
        TypeBeingRegistered.SingleInstance();
    }
}
Is this something Autofac middleware can do? If not is there another capability in the Autofac API which could address it? As always, links to examples would be appreciated.
This is an interesting question. I like that you started thinking about some of the new features in Autofac, very few do. So, kudos for the good question.
If you think about the middleware, yes, you can probably use it to muck with lifetime scope, but we didn't really make "change the lifetime scope on the fly" something easy to do and... I'll be honest, I'm not sure how you'd do it.
However, I think there are a couple of different options you have to make life easier. In the order in which I'd do them if it was me...
Option 1: Container Per Test
This is actually what I do for my tests. I don't share a container across multiple tests, I actually make building the container part of the test setup. For Xunit, that means I put it in the constructor of the test class.
Why? A couple reasons:
State is a problem. I don't want test ordering or state on singletons in the container to make my tests fragile.
I want to test what I deploy. I don't want something to test out OK only to find that it worked because of something I set up in the container special for testing. Obvious exceptions for mocks and things to make the tests actually unit tests.
If the problem is that the container takes too long to set up and is slowing the tests down, I'd probably troubleshoot that. I usually find the cause of this to be either that I'm assembly scanning and registering way, way too much (oops, forgot the Where statement to filter things down) or I've started trying to "multi-purpose" the container to start orchestrating my app startup logic by registering code to auto-execute on container build (which is easy to do... but don't forget the container isn't your app startup logic, so maybe separate that out).
Container per test really is the easiest, most isolated way to go and requires no special effort.
Option 2: Modules
Modules are a nice way to encapsulate sets of registrations and can be a good way to take parameters like this. In this case, I might do something like this for the module:
public class MyModule : Module
{
    public bool Testing { get; set; }

    protected override void Load(ContainerBuilder builder)
    {
        var toUpdate = new List<IRegistrationBuilder<object, ConcreteReflectionActivatorData, SingleRegistrationStyle>>();
        toUpdate.Add(builder.RegisterType<SomeType>());
        toUpdate.Add(builder.RegisterType<OtherType>());
        foreach(var reg in toUpdate)
        {
            if(this.Testing)
            {
                reg.InstancePerTest();
            }
            else
            {
                reg.SingleInstance();
            }
        }
    }
}
Then you could register it:
var module = new MyModule { Testing = true };
builder.RegisterModule(module);
That makes the list of registrations easier to tweak (foreach loop) and also keeps the "things that need changing based on testing" isolated to a module.
Granted, it could get a little complex in there if you have lambdas and all sorts of other registrations in there, but that's the gist.
Option 3: Builder Properties
The ContainerBuilder has a set of properties you can use while building stuff to help avoid having to deal with environment variables but also cart around arbitrary info you can use while setting up the container. You could write an extension method like this:
public static IRegistrationBuilder<TLimit, TActivatorData, TRegistrationStyle>
    EnableTesting<TLimit, TActivatorData, TRegistrationStyle>(
        this IRegistrationBuilder<TLimit, TActivatorData, TRegistrationStyle> registration,
        ContainerBuilder builder)
{
    if(builder.Properties.TryGetValue("testing", out var testing) && Convert.ToBoolean(testing))
    {
        registration.InstancePerTest();
    }
    return registration;
}
Then when you register things that need to be tweaked, you could do it like this:
var builder = new ContainerBuilder();
// Set this in your tests, not in production
// builder.Properties["testing"] = true;
builder.RegisterType<Handler>()
    .SingleInstance()
    .EnableTesting(builder);
var container = builder.Build();
You might be able to clean that up a bit, but again, that's the general idea.
You might ask why use the builder as the mechanism to transport properties if you have to pass it in anyway.
Fluent syntax: Due to the way registrations work, they're all extension methods on the registration, not on the builder. The registration is a self-contained thing that doesn't have a reference to the builder (you can create a registration object entirely without a builder).
Internal callbacks: The internals on how registration works basically boil down to having a list of Action executed where the registrations have all the variables set up in a closure. It's not a function where we can pass stuff in during build; it's self-contained. (That might be interesting to change, now I think of it, but that's another discussion!)
You can isolate it: You could put this into a module or anywhere else and you won't be adding any new dependencies or logic. The thing carting around the variable will be the builder itself, which is always present.
Like I said, you could potentially make this better based on your own needs.
Recommendation: Container Per Test
I'll wrap up by just again recommending container per test. It's so simple, it requires no extra work, there are no surprises, and it "just works."

IOC vs New guidelines

Recently I was looking at some source code provided by community leaders in their open source implementations. One of these projects made use of IoC. Here is some hypothetical sample code:
public class Class1
{
    private ISomeInterface _someObject;

    public Class1(ISomeInterface someObject)
    {
        _someObject = someObject;
    }

    // some more code and then
    var someOtherObject = new SomeOtherObject();
}
My question is not about what IoC containers are for or how to use them in technical terms, but rather about the guidelines for object creation. All that effort, and then this line uses the "new" operator. I don't quite understand. Which objects should be created by the IoC container, and which ones is it permissible to create via the new operator?
As a general rule of thumb, if something is providing a service which may want to be replaced either for testing or to use a different implementation (e.g. different authentication services) then inject the dependency. If it's something like a collection, or a simple data object which isn't providing behaviour which you'd ever want to vary, then it's fine to instantiate it within the class.
Usually you use IoC:
For a dependency that can change in the future
To code against interfaces, not concrete types
To enable mocking these dependencies in unit testing scenarios
You could avoid using IoC in the case where you don't control the dependency; for example, a StringBuilder is always going to be a StringBuilder with a defined behavior, and you usually don't need to mock that. You might, however, want to mock an HttpRequestBase, because it is an external dependency that relies on, for example, an internet connection, which is a problem during unit tests (longer execution times, and it's something out of your control).
The same happens for database access repositories and so on.
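The rule of thumb from both answers can be sketched as follows. The sketch is in Java rather than C#, and Authenticator and LoginAudit are hypothetical names chosen for illustration: the service whose behaviour may need replacing is injected through the constructor, while the plain collection is simply created with new.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface: behaviour we may want to swap or fake.
interface Authenticator {
    boolean isAllowed(String user);
}

class LoginAudit {
    // Injected: a replaceable service (real implementation, fake, or mock).
    private final Authenticator authenticator;
    // Created with "new": a plain collection whose behaviour never varies.
    private final List<String> rejected = new ArrayList<>();

    LoginAudit(Authenticator authenticator) {
        this.authenticator = authenticator;
    }

    // Delegates the decision to the injected service and records rejections.
    boolean login(String user) {
        boolean ok = authenticator.isAllowed(user);
        if (!ok) {
            rejected.add(user);
        }
        return ok;
    }

    List<String> rejectedUsers() {
        return rejected;
    }
}
```

In a test, the service can be replaced by a lambda such as new LoginAudit(user -> user.equals("alice")), with no container involved.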

Autofac and Quartz.Net Integration

Does anyone have any experience integrating autofac and Quartz.Net? If so, where is it best to control lifetime management -- the IJobFactory, within the Execute of the IJob, or through event listeners?
Right now, I'm using a custom autofac IJobFactory to create the IJob instances, but I don't have an easy way to plug in to a ILifetimeScope in the IJobFactory to ensure any expensive resources that are injected in the IJob are cleaned up. The job factory just creates an instance of a job and returns it. Here are my current ideas (hopefully there are better ones...)
It looks like most Autofac integrations somehow wrap an ILifetimeScope around the unit of work they create. The obvious brute force way seems to be to pass an ILifetimeScope into the IJob and have the Execute method create a child ILifetimeScope and instantiate any dependencies there. This seems a little too close to a service locator pattern, which in turn seems to go against the spirit of Autofac, but it might be the most obvious way to ensure proper handling of a scope.
I could plug into some of the Quartz events to handle the different phases of the Job execution stack, and handle lifetime management there. That would probably be a lot more work, but possibly worth it if it gets cleaner separation of concerns.
Ensure that an IJob is a simple wrapper around an IServiceComponent type, which would do all the work, and request it as Owned<T>, or Func<Owned<T>>. I like how this seems to fit better with Autofac, but I don't like that it's not strictly enforceable for all implementors of IJob.
Without knowing too much about Quartz.Net and IJobs, I'll venture a suggestion still.
Consider the following Job wrapper:
public class JobWrapper<T>: IJob where T:IJob
{
    private Func<Owned<T>> _jobFactory;

    public JobWrapper(Func<Owned<T>> jobFactory)
    {
        _jobFactory = jobFactory;
    }

    void IJob.Execute()
    {
        using (var ownedJob = _jobFactory())
        {
            var theJob = ownedJob.Value;
            theJob.Execute();
        }
    }
}
Given the following registrations:
builder.RegisterGeneric(typeof(JobWrapper<>));
builder.RegisterType<SomeJob>();
A job factory could now resolve this wrapper:
var job = _container.Resolve<JobWrapper<SomeJob>>();
Note: a lifetime scope will be created as part of the ownedJob instance, which in this case is of type Owned<SomeJob>. Any dependencies required by SomeJob that are InstancePerLifetimeScope or InstancePerDependency will be created and destroyed along with the Owned instance.
Take a look at https://github.com/alphacloud/Autofac.Extras.Quartz. It is also available as a NuGet package: https://www.nuget.org/packages/Autofac.Extras.Quartz/
I know it's a bit late :)

Cannot mock something like TableDomainService where the EntityContext is set in the class definition

I am trying to learn and implement TDD specifically using Moq and I have come up against a design that I can't figure out how to mock:
namespace RIACompletelyRelativeWebService.Web.Services
{
    [EnableClientAccess]
    public class AncestorDomainService : TableDomainService<AncestorEntityContext>
    {
        public AncestorDomainService()
        {
            //this.EntityContext = new AncestorEntityContext();
        }

        public IQueryable<AncestorEntity> GetAncestorEntities()
        {
            return this.EntityContext.AncestorEntities;
        }

        public void AddAncestorEntity(AncestorEntity entity)
        {
            this.EntityContext.AncestorEntities.Add(entity);
        }
    }
}
I think I need to mock the TableDomainService so that I can test my AncestorDomainService logic without firing up Azure. I tried something like this:
public class AncestorDomainService<TEntityContext> : TableDomainService<TEntityContext> where TEntityContext : TableEntityContext
But, the TableDomainService did not like having a generic being used. I also tried setting the EntityContext but it is read only. I have seen other people use the generic DomainService and the Repository design pattern, but since TableDomainService is what lets me use Azure tables behind the scenes, I think I have to stick with TableDomainService<>. Do I just have to fake the TableDomainService, the TableEntityContext and the TableEntitySet that is returned?
I can't tell from the code above what the logic you want to test looks like, but you might try separating your code (the code that you want to test) from the service itself.
You could try to abstract AncestorDomainService (introducing an IAncestorDomainService) and then use Moq to mock IAncestorDomainService. Your logic would move to another class that has a dependency on IAncestorDomainService. I've done this with Linq2Sql (which seems to have a similar design and also returns IQueryable). I wouldn't try to mock the 'internals' of TableDomainService, because this stuff is usually not designed for easy testing.
The best solution, if you can afford the time, is to make your code fully testable. That means actually having the scripts necessary to set up an instance of Azure (real or local) with a known good state.
Since the whole point of your AncestorDomainService is to deal with Azure, mocking out its base class doesn't make much sense from a test effectiveness perspective. (Some people choose to optimize for test speed over effectiveness, but I think that's a waste of time.)

How to use OSGi getServiceReference() right

I am new to OSGi and came across several examples about OSGi services.
For example:
import org.osgi.framework.*;
import org.osgi.service.log.*;

public class MyActivator implements BundleActivator {
    public void start(BundleContext context) throws Exception {
        ServiceReference logRef =
            context.getServiceReference(LogService.class.getName());
    }
}
My question is, why do you use
getServiceReference(LogService.class.getName())
instead of
getServiceReference("LogService")
If you use LogService.class.getName() you have to import the interface. This also means that you have to import the package org.osgi.service.log in your MANIFEST.MF.
Isn't that completely counterproductive if you want to reduce dependencies to promote loose coupling? As far as I know, one advantage of services is that the service consumer doesn't have to know the service publisher. But if you have to import one specific interface, you clearly have to know who's providing it. By only using a string like "LogService" you would not have to know that the interface is provided by org.osgi.service.log.LogService.
What am I missing here?
Looks like you've confused implementation and interface.
Using the actual interface for the name (and importing the interface, which you'll end up doing anyway) reinforces the interface contract that services are designed around. You don't care about the implementation of a LogService, but you do care about the interface. Every LogService will need to implement the same interface, hence your use of the interface to get the service. For all you know, the LogService is really a wrapper around SLF4J provided by some other bundle. All you see is the interface. That's the loose coupling you're looking for. You don't have to ship the interface with every implementation. Leave the interface in its own bundle and have multiple implementations of that interface.
Side note: ServiceTracker is usually easier to use, give it a try!
Added benefits: Using the interface to get the class name avoids spelling mistakes and excessive string literals, and makes refactoring much easier.
After you've gotten the ServiceReference, your next couple lines will likely involve this:
Object logSvc = context.getService(logRef);
// What can you do with logSvc now?!? It's an Object, mostly useless
// Cast to the interface ... YES! Now you need to import it!
LogService logger = (LogService) logSvc;
logger.log(LogService.LOG_INFO, "Interfaces are a contract between implementation and consumer/user");
If you use the LogService, you're coupled to it anyway. If you write middleware, you likely get the name parameterized through some XML file or via an API. And yes, "LogService" will fail terribly; you need to use the fully qualified name: "org.osgi.service.log.LogService". The main reason to use the LogService.class.getName() pattern is to get correct renaming when you refactor your code and to minimize spelling errors. The next OSGi API will very likely have:
ServiceReference<S> getServiceReference(Class<S> type)
calls to increase type safety.
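To make the difference between the two lookup styles concrete, here is a small sketch. ToyRegistry is a hypothetical class, not the OSGi API; it only assumes that services are stored under the fully qualified name of their interface. java.lang.Runnable stands in for a service interface.

```java
import java.util.HashMap;
import java.util.Map;

// A toy registry for illustration only (not the OSGi API): services are
// stored under the fully qualified name of their interface.
class ToyRegistry {
    private final Map<String, Object> services = new HashMap<>();

    <S> void register(Class<S> type, S service) {
        services.put(type.getName(), service);
    }

    // String-based lookup: the name must be fully qualified, and the
    // caller gets back an Object that still has to be cast.
    Object lookup(String className) {
        return services.get(className);
    }

    // Class-based lookup: the name is derived from the class, so it cannot
    // be misspelled, and the result is already of the requested type.
    <S> S lookup(Class<S> type) {
        return type.cast(services.get(type.getName()));
    }
}
```

With a Runnable registered under Runnable.class, lookup("Runnable") finds nothing because the key is "java.lang.Runnable", while lookup(Runnable.class) returns a value that needs no cast.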
Anyway, I would never use these low-level APIs unless you develop middleware. If you actually depend on a concrete class, DS is infinitely simpler, and even more so when you use it with the bnd annotations (http://enroute.osgi.org/doc/217-ds.html).
@Component
class Xyz implements SomeService {
    LogService log;
    @Reference
    void setLog(LogService log) { this.log = log; }
    public void foo() { ... someservice ... }
}
If you develop middleware you get the service classes usually without knowing the actual class, via a string or class object. The OSGi API based on strings is used in those cases because it allows us to be more lazy by not creating a class loader until the last moment in time. I think the biggest mistake we made in OSGi 12 years ago is not to include the DS concepts in the core ... :-(
You cannot use the value "LogService" as a class name to get a ServiceReference, because you have to use the fully qualified class name "org.osgi.service.log.LogService".
If you import the package this way:
org.osgi.service.log;resolution:=optional
and you use a ServiceTracker to track services in the BundleActivator.start() method, I suggest using "org.osgi.service.log.LogService" instead of LogService.class.getName() when initializing the ServiceTracker. In this case you will not get a NoClassDefFoundError/ClassNotFoundException on bundle start.
As basszero mentioned, you should consider using ServiceTracker. It is fairly easy to use and also supports a much better programming pattern. You must never assume that a ServiceReference you got sometime in the past is still valid. The service the ServiceReference points to might have gone away. The ServiceTracker will automatically notify you when a service is registered or unregistered.