The docs recommend registering frequently-used components via lambdas, as ...
This can make Resolve() calls up to 10x faster.
Now there are obviously a few questions:
Why? (EDIT: to clarify: I would understand if the registration time went up, because you now have to use reflection to find the right constructor and such, but why the heck the resolve time?)
In which scenarios does this apply? What aspects of the registered class make this number go up, and which make it go down?
What resolve times are we generally talking about anyhow? Like "yeah, now it takes 100 instead of 10 CPU cycles", or actually measurable numbers in "normal" use cases (a web service with per-request lifetimes)?
As noted in the comments, the short version is that the concrete implementation is going to be faster than the reflection manner of resolving.
Diving deeper, think about the steps involved in each.
Lambda:
Execute the method.
There is no step two.
Reflection:
Enumerate all the constructors for the type to be instantiated. This list could be cached, but is already fairly well cached by the .NET framework.
Of all of the constructors that are available, figure out which one to execute based on the number of constructor parameters available and the types registered in the container. Note the types registered in the container may change based on registration sources, lifetime scope registrations, etc.
Resolve the constructor parameters. If there are reflection-based registrations that make up the constructor parameters, run them through this process recursively.
Invoke the selected constructor using the resolved parameters.
As you can see, there's actually a lot more work than just Activator.CreateInstance in the reflection manner of resolving, which is why it takes longer.
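To make that concrete, here is a minimal sketch (in Scala rather than C#, with entirely hypothetical names; Autofac's real pipeline is far more elaborate) of a toy container that resolves one registration via a stored lambda and another via reflection:

    import scala.collection.mutable

    // Toy container, for illustration only; not Autofac.
    class TinyContainer {
      private val factories = mutable.Map.empty[Class[_], () => Any]

      // Lambda registration: resolving later is just invoking the stored function.
      def register[A](clazz: Class[A])(factory: () => A): Unit =
        factories(clazz) = factory

      // Reflection registration: every resolve repeats constructor selection,
      // recursive parameter resolution, and reflective invocation.
      def registerType[A](clazz: Class[A]): Unit =
        factories(clazz) = () => {
          val ctor = clazz.getConstructors.maxBy(_.getParameterCount) // pick the "greediest" ctor
          val args = ctor.getParameterTypes.map(p => resolve(p).asInstanceOf[AnyRef])
          ctor.newInstance(args: _*)
        }

      def resolve[A](clazz: Class[A]): A = factories(clazz)().asInstanceOf[A]
    }

    class Engine()
    class Car(val engine: Engine)

    object Demo extends App {
      val c = new TinyContainer
      c.register(classOf[Engine])(() => new Engine()) // fast path
      c.registerType(classOf[Car])                    // slow path
      println(c.resolve(classOf[Car]).engine != null) // true
    }

Even in this toy version, the lambda path is a single function invocation, while the reflective path pays for getConstructors, parameter lookup, and newInstance on every single resolve.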
But, as also noted in the comments, don't worry about premature optimization. This all happens pretty darn quickly so wait to optimize until you can actually locate a bottleneck using a profiler or some similar tool.
I am working with a legacy Scala codebase and, as is always the case, modifying the code is quite difficult without touching many different parts.
One of my new requirements is to make several decisions based on some input parameters. The problem is that these decisions have to be made at various points during execution. One option is to encapsulate all those parameters in a case class instance and pass it along, but that means I would have to modify multiple method signatures, and I want to avoid this approach as much as possible.
Another approach would be to create a global object containing all those input parameters, accessible from different points in the execution. Is that a good approach in Scala?
No, using global mutable variables to pass “hidden” parameters is not a good idea, not in Scala and not in any other programming language. It makes the code hard to understand and modify, because a function's behaviour will now depend on which functions were invoked earlier. And it's extremely fragile, because you might forget setting one of those global parameters before invoking the function, which means that it will use whatever value was stored there before. This is the kind of thing that can appear to work for years, and then break when you modify a completely unrelated part of the program.
I can't stress this enough: do not use global mutable variables, period. The solution is to man up and change those method signatures. Depending on the details, dependency injection may or may not help in your particular case.
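For what it's worth, here is a minimal sketch of the case-class approach from the question (all names hypothetical). Scala's implicit parameters can soften the pain of changing signatures, since intermediate methods only declare the parameter once instead of forwarding it by hand:

    // Hypothetical parameter object, for illustration only.
    final case class RequestContext(maxLength: Int, verbose: Boolean)

    object Pipeline {
      // Marking the parameter implicit means callers with an implicit
      // RequestContext in scope do not have to pass it explicitly.
      def run(input: String)(implicit ctx: RequestContext): String =
        decide(transform(input))

      def transform(s: String)(implicit ctx: RequestContext): String =
        if (ctx.verbose) s"> $s" else s

      def decide(s: String)(implicit ctx: RequestContext): String =
        s.take(ctx.maxLength)
    }

    object Main extends App {
      implicit val ctx: RequestContext = RequestContext(maxLength = 8, verbose = true)
      println(Pipeline.run("hello world")) // "> hello " (truncated to maxLength)
    }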
I have heard people describe dynamic patching as a bit of a hack or at risk of breaking in future releases of Pd. This is reasonable enough, but it seems to imply that there are alternatives when building abstractions.
Dynamic patching seems to be useful both for instantiating a variable number of objects and for connecting up a variable number of inlets and outlets within an abstraction (a number defined at creation time; I personally don't need it to change after the fact, at this stage).
Now I understand that the [clone] object can solve the problem of creating objects. I can see too that looping through send and receive objects would solve much of the connection issues with careful planning. But what I do not understand is how objects like [trigger], [route] and [select] can be adjusted or replaced in some way. I fail to see how you would avoid using dynamic patching to, for example, create a [trigger f f] when the creation argument to your abstraction is 2, and a [trigger f f f] when it is 3. The same goes for [route], [select] and similar objects.
EDIT: The original question was perceived as too vague. I later posed a follow-up question in the comments which should really be here instead. As it happens, the answer to the follow-up provided a good answer to the original question, in my opinion. So to summarise and hopefully clarify, I was after a few "tools" to use when building abstractions so that I could limit my use of dynamic patching, if possible. These tools turned out to be:
using send and receive instead of inlets and outlets (although [initbang] can be used for creating inlets and outlets at instantiation).
using [clone]
chaining trigger, route and select objects using send and receive - for example, using [t b b] - [t b b] instead of [t b b b]. This means that the number of arguments in these objects can be defined at creation time with the help of [clone] for example. This is discussed in the Pd mailing list.
using [initbang] as indicated in the answer below.
After attempting to build a drum machine with presets and an arbitrary number of tracks, armed only with my limited knowledge of dynamic patching techniques, I realised that there must be many ways of avoiding the problems I ran into (and there were several!). Of course, some things have to be done with dynamic patching, and that's fine. It's just about creating manageable code.
This is really an answer to the "follow-up question" in the comments¹, rather than to the original question (which I consider too broad to be answered):
Is there a way to define an abstraction that has an argument that defines how many outlets the abstraction exposes?
Sure, just use $1 for that.
E.g. [gates 10] could create 10 outlets...
Presumably it could dynamically patch itself, but that doesn't seem like a good idea.
well, if you want an abstraction to have a dynamic API (that is: a variable number of inlets/outlets), then there is no way around dynamic patching.
Is this a good case for building your own external?
depends on what you actually want the external to do.
the iemguts library (disclaimer: of which I am the author) has everything in place to allow you to dynamically patch what you need.
Most important, there is [initbang], to create iolets before Pd tries to connect them (if you use [loadbang], the iolets will be created after Pd failed to connect to them).
It also includes a [canvasargs] object which lets you get all the arguments of the abstraction (which simplifies, for example, the task of making the number of outlets equal the number of arguments, as [trigger] or [pack] do).
if instead you want to wrap the entire functionality of your abstraction into an external, that's of course also possible (and pretty simple in the realm of C).
Also keep in mind that others might have already coded what you need.
¹ please don't abuse the comment field for follow-up questions. either update your original question (if the follow-up is a mere clarification of the original question) or post a new one.
I have read various old StackOverflow discussions on this general topic but there is still one part of the puzzle which appears, to me at least, to be missing.
It is simply this: what is the actual mechanism by which the anonymous function is serialized? And, where could we find its source code?
Or is it all just magic?
Other relevant SO articles (the third of these itself points to some useful articles outside StackOverflow):
Serialization of Scala Functions
Why Scala can serialize...
How to serialize functions in Scala
I'm going to answer my own question with what I believe is the correct answer. The reason I'm doing it this way is that this aspect of serialization never seems to be explained, and it does appear to work just by magic. I essentially confirmed (to my satisfaction) the answer as part of the research I did to ensure that my question above was indeed appropriate.
But the main reason I'm offering my own answer is that I invite knowledgeable users either to agree with it, to correct it, to expand upon it, or to destroy it. Here goes...
It's all magic. No, I'm just kidding. But essentially the mechanism, once Scala has taken the step of representing the anonymous function as a Class, is provided entirely by Java. In addition, we, the programmers, need to ensure that an anonymous function is as much pure code as possible: no references to any objects that might not be serializable. The secret sauce is to be found in the Java class ObjectStreamClass, which, in turn, is invoked by the Java serialization classes ObjectInputStream and ObjectOutputStream.
Essentially the serialized bytes contain the full pathname of the class, its serialVersionUID, and whatever other relevant information is necessary. When deserializing, the system will simply look up the class in the appropriate classpath and return a reference to it. This obviously assumes that the deserializing system has the class in its classpath. The mechanism for that is a little beyond the scope of my research but it's clear that in a system like Spark, it should be easy to arrange.
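A minimal round trip (a sketch using plain Java serialization on a Scala function value) makes the point that only the class's identity and captured state travel over the wire, never its bytecode:

    import java.io._

    object RoundTrip extends App {
      val f: Int => Int = x => x + 1 // compiled to a serializable class

      // Serialize: the stream records the class's name and serialVersionUID
      // (via ObjectStreamClass) plus any captured fields, not the bytecode.
      val buf = new ByteArrayOutputStream()
      val out = new ObjectOutputStream(buf)
      out.writeObject(f)
      out.close()

      // Deserialize: the class must already be on this JVM's classpath;
      // the class loader supplies the code, the stream supplies the state.
      val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      val g  = in.readObject().asInstanceOf[Int => Int]
      println(g(41)) // 42
    }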
No (additional) compilation or decompilation of bytecode is necessary, as the class loader has everything it needs. I'm slightly surprised to find ObjectStreamClass in java.io rather than in the reflection package, but I suppose there's an argument for it being there, given the tight coupling with ObjectInputStream and ObjectOutputStream.
One thing to keep in mind is that while we think in terms of serializing/deserializing objects, rather than classes, what we are dealing with here is an object of type Class.
One more thing to note is that in Scala 2.12, anonymous functions are implemented differently: as Java 8 lambdas. This has broken the mechanism described above in a rather serious way; so serious, in fact, that Spark is currently having trouble supporting Scala 2.12. The holdup appears to be this issue: SPARK-14540.
The title might be a little confusing, so let me elaborate: I've been reading some criticism regarding Scala. It was an email sent to Typesafe regarding some deficiencies in Scala from Coda Hale (Yammer's infrastructure architect). To quote:
we stopped seeing lambdas as free and started seeing them as syntactic sugar on top of anonymous classes and thus acquired the same distaste for them as we did anonymous classes.
So, from this, I have a couple of questions regarding how lambdas work in Scala:
What is the difference between a free function and a function that is bound to an anonymous class (technically, aren't all functions bound to the main singleton object?)?
What is the impact on performance of using an anonymous class bound function instead of a free function?
Yes, lambdas are still objects, instances of anonymous classes.
This is how the JVM works, all references are objects. You can have either references or values (primitives) and there's no way around it.
Later versions of Java have MethodHandles. But it's worth noting that MethodHandle is also still just an abstract class - albeit one that the JVM specifically knows how to optimise away at runtime.
Also worth noting is that the JVM can often perform escape analysis on abstract classes (such as Scala's functions) and optimise these away too.
On top of this, Scala can use any object with an apply method as though it were a Function. In this case, the explicit call to apply is emitted in the bytecode and you're not dealing with anonymous classes any more.
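A rough sketch of what that means in source terms (the actual emitted bytecode differs, and in 2.12+ lambdas are compiled via invokedynamic instead):

    object LambdaSketch extends App {
      // What you write:
      val inc: Int => Int = x => x + 1

      // Roughly what the compiler produced before Scala 2.12: an anonymous
      // subclass of Function1, an object allocation like any other.
      val incDesugared: Int => Int = new Function1[Int, Int] {
        def apply(x: Int): Int = x + 1
      }

      // Any object with an apply method is callable like a function; this
      // compiles to a direct call to apply, with no Function1 in sight.
      object Doubler { def apply(x: Int): Int = x * 2 }

      println(inc(1) + incDesugared(1) + Doubler(20)) // 44
    }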
Given all of the above, it's impossible to make a general statement about the performance of Scala's function implementation; it depends on your specific code and use case. In general, I wouldn't worry unless you hit a corner case where your profiler pinpoints a problem here (which is very unlikely).
Well, in C, for example, a function is just a 32- or 64-bit pointer to a place in memory to jump to, and the concept of a closure doesn't really apply since you can't declare an anonymous C function. I don't know how C++ lambdas work; I guess the compiler generates a method and passes the fields you want in the closure along with the parameters. Maybe that's what you're looking for. On the JVM you have to wrap your logic in a class, so now you have a virtual table of methods, fields, and some methods related to synchronization and the type system.
What is the impact on performance? I don't know; have you noticed an impact on performance? A lot of the extra Java stuff I described really isn't needed for an anonymous class and might just get optimized out. I imagine there are butterflies that influence the weather more than the extra JVM stuff would affect your software.
So I've got maybe 10 objects, each of which has 1-3 dependencies (which I think is OK as far as loose coupling is concerned) but also some settings that can be used to define behavior (timeout, window size, etc.).
Now, before I started using an Inversion of Control container, I would have created a factory, and maybe even a simple ObjectSettings object for each of the objects requiring more than one setting, to keep the constructor within the recommended "fewer than 4" parameter count. I am now using an inversion of control container and I just don't see much of a point to that any more. Sure, I might get a constructor with 7 parameters, but who cares? It's all being filled in by the IoC anyway.
Am I missing something here or is this basically correct?
The relationship between class complexity and IoC constructor size had not occurred to me before reading this question, but my analysis below suggests that a long IoC constructor argument list is a code smell to watch for when using IoC. Aiming for a short constructor argument list will help you keep the classes themselves simple, and following the single responsibility principle will guide you towards that goal.
I work on a system that currently has 122 classes that are instantiated using the Spring.NET framework. All relationships between these classes are set up in their constructors. Admittedly, the system has its fair share of less than perfect code where I have broken a few rules. (But, hey, our failures are opportunities to learn!)
The constructors of those classes have varying numbers of arguments, which I show in the table below.
    Constructor arguments   Number of classes
    0                       57
    1                       19
    2                       25
    3                       9
    4                       3
    5                       1
    6                       3
    7                       2
    8                       2
The classes with zero arguments are either concrete strategy classes, or classes that respond to events by sending data to external systems.
Those with 5 or 6 arguments are all somewhat inelegant and could use some refactoring to simplify them.
The four classes with 7 or 8 arguments are excellent examples of God objects. They ought to be broken up, and each is already on my list of trouble-spots within the system.
The remaining classes (1 to 4 arguments) are (mostly) simply designed, easy to understand, and conform to the single responsibility principle.
The need for many dependencies (maybe over 8) could be indicative of a design flaw, but in general I think there is no problem as long as the design is cohesive.
Also, consider using a service locator or static gateway for infrastructure concerns such as logging and authorization rather than cluttering up the constructor arguments.
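As a concrete sketch of the static-gateway idea (all names hypothetical), the constructor carries only real collaborators while infrastructure goes through a swappable static access point:

    // Hypothetical static gateway for logging.
    object Log {
      @volatile private var sink: String => Unit = msg => println(msg)
      def install(s: String => Unit): Unit = sink = s // e.g. swapped out in tests
      def info(msg: String): Unit = sink(s"INFO: $msg")
    }

    trait OrderRepo { def save(id: Int): Unit }

    // Only the real dependency appears in the constructor; logging never
    // inflates the argument list.
    class OrderProcessor(repo: OrderRepo) {
      def process(id: Int): Unit = {
        Log.info(s"processing order $id")
        repo.save(id)
      }
    }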
EDIT: 8 probably is too many but I figured there'd be the odd case for it. After looking at Lee's post I agree, 1-4 is usually good.
G'day George,
First off, what are the dependencies between the objects?
Lots of "is-a" relationships? Lots of "has-a" relationships?
Lots of fan-in? Or fan-out?
George's response: "has-a mostly, been trying to follow the composition over inheritance advice...why would it matter though?"
As it's mostly "has-a" you should be all right.
Better make sure that your construction (and destruction) of the components is done correctly though to prevent memory leaks.
And, if this is in C++, make sure you use virtual destructors?
This is a tough one, and it's why I favor a hybrid approach: appropriate properties are mutable, and only immutable properties and required dependencies without a useful default are part of the constructor. Some classes are constructed with the essentials, then tuned if necessary via setters.
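A sketch of that hybrid style (hypothetical names): required dependencies go in the constructor, tunable settings become mutable properties with sensible defaults:

    trait Transport { def send(msg: String): Unit }

    class Relay(transport: Transport) {     // essential dependency: constructor
      private var timeoutMillis: Int = 5000 // useful default: tunable property
      def setTimeoutMillis(ms: Int): Unit = timeoutMillis = ms

      def forward(msg: String): Unit =
        transport.send(s"[timeout=$timeoutMillis ms] $msg")
    }

    object HybridDemo extends App {
      val relay = new Relay(new Transport {
        def send(msg: String): Unit = println(msg)
      })
      relay.setTimeoutMillis(1000) // construct with the essentials, then tune
      relay.forward("hello")       // prints "[timeout=1000 ms] hello"
    }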
It all depends upon what kind of container you use for the IoC and what approach the container takes: whether it uses annotations or a configuration file to populate the object being instantiated. Furthermore, if your constructor parameters are just plain primitive data types then it is not really a big deal; however, if you have non-primitive types then, in my opinion, you can use property-based DI rather than constructor-based DI.