I have been wondering for a couple of days whether NSInvocation really needs the NSMethodSignature.
Let's say we want to write our own NSInvocation; my requirements would be as follows:
A selector (SEL)
The target object to call the selector on
The argument array
Then I would get the IMP from the target and the SEL, and pass the arguments as parameters.
So, my question is, why do we need an NSMethodSignature to construct and use an NSInvocation?
Note: I do know that by having only a SEL and a target, we don't have the arguments and return type for this method, but why would we care about the types of the arguments and the return value?
Each type in C has a different size. (Even the same type can have different sizes on different systems, but we'll ignore that for now.) An int can be 32 or 64 bits, depending on the system. A double takes 64 bits. A char represents 8 bits (but might be passed as a regular int, depending on the system's calling convention). Finally, and most importantly, struct types have various sizes, depending on how many elements they contain and the size of each; there is no bound on how big a struct can be.

So it is impossible to pass arguments the same way regardless of type. How the calling function arranges the arguments, and how the called function interprets them, must depend on the function's signature. (You cannot have a type-agnostic "argument array"; what would be the size of the array elements?)

When normal function calls are compiled, the compiler knows the signature at compile time and can arrange the arguments correctly according to the calling convention. But NSInvocation is for managing an invocation at runtime, so it needs a representation of the method signature to work.
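A quick C illustration of the size problem (exact numbers vary by platform; these are typical 64-bit values):

#include <stdio.h>

struct big { double matrix[16]; };   /* structs can be arbitrarily large */

int main(void) {
    printf("char:   %zu\n", sizeof(char));        /* 1 */
    printf("int:    %zu\n", sizeof(int));         /* typically 4 */
    printf("double: %zu\n", sizeof(double));      /* 8 */
    printf("struct: %zu\n", sizeof(struct big));  /* 128 here; unbounded in general */
    return 0;
}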
There are several things that the NSInvocation can do. Each of those things requires knowledge of the number and types (at least the sizes of the types) of the parameters:
When a message is sent to an object that does not have a method for it, the runtime constructs an NSInvocation object and passes it to -forwardInvocation:. The NSInvocation object contains a copy of all the arguments that are passed, since it can be stored and invoked later. Therefore, the runtime needs to know, at the very least, how big the parameters in total are, in order to copy the right amount of data from registers and/or stack (depending on how arguments are arranged in the calling convention) into the NSInvocation object.
When you have an NSInvocation object, you can query the value of the i'th argument using -getArgument:atIndex:, and set or change it using -setArgument:atIndex:. This requires the invocation to know (1) where in its data buffer the i'th parameter starts, which requires knowing how big all the previous parameters are, and (2) how big the i'th parameter is, so that it can copy the right amount of data (if it copies too little, it produces a corrupt value; if it copies too much when you do getArgument:, it can overwrite the buffer you gave it, and when you do setArgument:, it can overwrite other arguments). See the sketch after this list.
You can have it do -retainArguments, which causes it to retain all arguments of object pointer type. This requires it to distinguish object pointer types from other types, so the type information must include more than just the sizes.
You can invoke the NSInvocation, which causes it to construct and execute the call to the method. This requires it to know, at the very least, how much data to copy from its buffer into the registers/stack to place all the data where the function will expect it. This requires knowing at least the combined size of all parameters, and probably also needs to know the sizes of individual parameters, so that the divide between parameters on registers and parameters on stack can be figured out correctly.
You can get the return value of the call using -getReturnValue:; this has similar issues to the getting of arguments above.
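To illustrate the argument getting/setting described above (the sketch referenced in the list), here is a rough C sketch of the bookkeeping involved. This is not Apple's implementation, just the general idea of why the i'th argument's location depends on all the preceding sizes; ArgBuffer and its fields are made up, and a real implementation must also handle alignment:

#include <stddef.h>
#include <string.h>

typedef struct {
    char   data[256];   /* raw bytes of all stored arguments, packed */
    size_t sizes[16];   /* size of each argument */
    size_t count;       /* number of arguments */
} ArgBuffer;

/* Where does argument `index` start? Requires every preceding size. */
static size_t arg_offset(const ArgBuffer *buf, size_t index) {
    size_t off = 0;
    for (size_t i = 0; i < index; i++)
        off += buf->sizes[i];
    return off;
}

/* Copying out requires the i'th size too, to copy exactly enough. */
static void get_argument(const ArgBuffer *buf, void *out, size_t index) {
    memcpy(out, buf->data + arg_offset(buf, index), buf->sizes[index]);
}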
A thing not mentioned above is that the return type can also have a great effect on the calling mechanism. On x86 and ARM, the common architectures for Objective-C, when the return type is a struct type, the calling convention is very different: effectively, an additional (first) parameter is added before all the regular parameters, a pointer to the space where the struct result should be written. This replaces the regular convention, where the result is returned in a register. (On PowerPC, I believe a double return type is also treated specially.) So knowing the return type is essential for constructing and invoking the NSInvocation.
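At the ABI level, the struct-return convention described above roughly rewrites the first declaration below into the second (a hedged sketch; the exact rules depend on the architecture and the struct's size):

struct rect { double x, y, w, h; };

/* What you write: */
struct rect make_rect(double x, double y);

/* Roughly what the ABI arranges for a large struct return: the caller
 * allocates the space and passes a hidden pointer as an extra first argument. */
void make_rect_abi(struct rect *hidden_result, double x, double y);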
NSMethodSignature is required for the message sending and forwarding mechanism to work properly for invocations. NSMethodSignature and NSInvocation were built as a wrapper around __builtin_call(), which is both architecture-dependent and extremely conservative about the stack space a given function requires. Hence, when invocations are invoked, __builtin_call() gets all the information it needs from the method signature, and can fail gracefully by throwing the call out to the forwarding mechanism, knowing that it, too, receives the proper information about how the stack should look for re-invocation.
That said, you cannot make a primitive NSInvocation without a method signature, short of modifying the C language to support converting arrays to varargs, since objc_msgSend() and its cousins won't allow it. Even if you could get around that, you'd need to calculate the sizes of the arguments and the return type (not too hard, but if you're wrong, you're wrong big time), and manage a proper call to __builtin_call(), which would require an intimate knowledge of the message-sending architecture, or an FFI (which probably drops down to __builtin_call() anyhow).
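For comparison, here is a minimal sketch of a runtime-constructed call using libffi, one such FFI. Note that ffi_prep_cif needs exactly the information a method signature carries, namely the return type and the argument types (the add function here is made up for illustration):

#include <ffi.h>
#include <stdio.h>

static int add(int a, int b) { return a + b; }

int main(void) {
    ffi_cif cif;
    ffi_type *arg_types[2] = { &ffi_type_sint, &ffi_type_sint };

    /* The "signature": return type plus argument types, supplied at runtime. */
    if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint, arg_types) != FFI_OK)
        return 1;

    int a = 2, b = 3;
    void *arg_values[2] = { &a, &b };
    ffi_arg result;  /* libffi wants at least ffi_arg-sized return storage */
    ffi_call(&cif, FFI_FN(add), &result, arg_values);

    printf("%d\n", (int)result);  /* prints 5 */
    return 0;
}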
I have the following function:
import java.time.Instant
import cats.Monoid // assuming the cats library

def timestamp(key: String): String =
  Monoid.combine(key, Instant.now().getEpochSecond.toString)
and wanted to know whether it is pure or not. A pure function, for me, is one that always returns the same output given the same input. But the function above, given the same string, returns a different string with a different time each time it is called, so in my opinion it is not pure.
No, it's not pure by any definition I know of. A good discussion of pure functions is here: https://alvinalexander.com/scala/fp-book/definition-of-pure-function. In Alvin's definition of purity he says:
A pure function has no “back doors,” which means:
...
It cannot depend on any external I/O. It can’t rely on input from files, databases, web services, UIs, etc; it can’t produce output, such as writing to a file, database, or web service, writing to a screen, etc.
Reading the current system time uses I/O, so the function is not pure.
You are right, it is not a pure function, as it returns different results for the same arguments. Mathematically speaking, it is not a function at all.
Definition of a pure function, from Wikipedia:
The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices (usually—see below).
Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices (usually—see below).
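To make the distinction concrete, here is a minimal C sketch of the same situation as the Scala function above (the function names are made up):

#include <stdio.h>
#include <time.h>

/* Pure: the result depends only on the argument. */
static int square(int x) { return x * x; }

/* Impure: the result also depends on the system clock, so two calls
 * with the same argument can return different values. */
static long timestamp_key(long key) { return key + (long)time(NULL); }

int main(void) {
    printf("%d %d\n", square(3), square(3));  /* always 9 9 */
    printf("%ld\n", timestamp_key(42));       /* varies between runs */
    return 0;
}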
I never really got the idea behind the nfds parameter of select(): what is it good for? I also noticed that this parameter is ignored in WinSock2; why is that?
Do Unix systems use this parameter or do they ignore it as well?
Windows' implementation of select() uses linked lists internally, so it doesn't need to use the nfds parameter for anything.
On other OSes, however, the fd_set struct is implemented to hold an array of bits (one bit per socket). For example, here is how it is declared (in sys/_types/_fd_def.h) under Mac OS X:
typedef struct fd_set {
    __int32_t fds_bits[__DARWIN_howmany(__DARWIN_FD_SETSIZE, __DARWIN_NFDBITS)];
} fd_set;
... and in order to do the right thing, the select() call has to loop over the bits in the array to see what they contain. By supplying the nfds parameter, we tell the select() implementation that it only needs to examine the first nfds bits of the array, rather than always having to iterate over the entire array on every call. This allows select() to be more efficient than it would otherwise be.
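For example, here is a minimal sketch of a typical select() call (error handling omitted; wait_readable is a made-up helper), where nfds is the highest descriptor in any set plus one:

#include <sys/select.h>
#include <stddef.h>

/* Block until one of the two sockets is readable. */
int wait_readable(int sock1, int sock2) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(sock1, &readfds);
    FD_SET(sock2, &readfds);

    /* nfds must be the highest fd in any set, plus one, so select()
     * only has to scan that many bits of each fd_set. */
    int nfds = (sock1 > sock2 ? sock1 : sock2) + 1;

    return select(nfds, &readfds, NULL, NULL, NULL);
}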
The documentation specifies the number of CGPoints that the points property of CGPathElement holds as a function of the value of its type property: for instance, there are two points for .AddQuadCurveToPoint, one point for .AddLineToPoint, and... nothing for .CloseSubpath.
The documentation states that, since it's an UnsafeMutablePointer, it is the programmer's responsibility to release it. So, for the cases where there are two points, I call pPointer.dealloc(2), and for those with one point, pPointer.dealloc(1); but what about those with no points at all? Should I still call something or not? I believe I shouldn't, but since I am not used to manual memory management, I might be missing something.
By the way, pPointer is defined outside the switch body.
Say I have a project which is comprised of:
A main script that handles all of the running of my simulation
Several smaller functions
A couple of structs containing the data
Within the script I will be accessing the functions many times within for loops (some over a thousand times within the minute-long simulation). Each function also looks up data contained in the structs as part of its calculations; these are usually parameters that are fixed over the course of the simulation but need to be varied manually between runs to observe their effects.
As these functions typically form the bulk of the runtime, I'm trying to save time: my simulation can't quite run in real time as it stands (the ultimate goal), and I lose a lot of time passing variables/parameters around functions. So I've had three ideas to try to do this:
Load the structs in the main simulation, then pass each variable in turn to the function in the form of a long argument list (the current solution).
Load the structs every time the function is called.
Define the structs as global variables.
In terms of both the efficiency of the system (most relevant as the project develops) and, as I'm no expert programmer, from a "good practice" perspective: what is the best solution for this? Is there another option that I have not considered?
As mentioned above in the comments, the first option is the best one.
Have you used the profiler to find out where your code spends most of its time?
profile on
% run your code
profile viewer
Note: if you are modifying your input structs in your child functions, this will take more time; but if you are just referencing them, it should not be a problem.
Matlab does what's known as a "lazy copy" when passing arguments between functions. This means that it passes a pointer to the data to the function, rather than creating a new instance of that data, which is very efficient memory- and speed-wise. However, if you make any alteration to that data inside the subroutine, then it has to make a new instance of that argument so as to not overwrite the argument's value in the main function. Your response to matlabgui indicates you're doing just that. So, the subroutine may be making an entire new struct every time it's called, even though it's only modifying a small part of that struct's values.
If your subroutine is altering a small part of the array, then your best bet is to just pass that small part to it, then assign your outputs. For instance,
[modified_array] = somesubroutine(mystruct.original_array);  % mystruct, not struct, to avoid shadowing the built-in
mystruct.original_array = modified_array;
You can also do this in just one line. Conceptually, the less data you pass to the subroutine, the smaller the memory footprint is. I'd also recommend reading up on in-place operations, as it relates to this.
Also, as a general rule, don't use global variables in Matlab. I have not personally experienced, nor read of, an instance in which they were genuinely faster.
Please note: by a "pure" function, I don't mean "pure virtual"
I'm referring to this
If a function "reads" some global state, does that automatically render it impure? Or does it depend on other factors?
If it automatically renders it impure, please explain why.
If it depends on other factors, please explain what are they.
A "pure" function is a function whose result depends only on its input arguments. If it reads anything else, it is not a pure function.
In certain specialized instances, yes. For example, if you had a global cache of already-computed values that was only read and written by your function, it would still be mathematically pure, in the sense that the output only depended on the inputs, but it wouldn't be pure in the strictest sense. For example:
#include <stdint.h>

static int cache[256] = {0};

int compute_something(uint8_t input)
{
    if (cache[input] == 0)
        cache[input] = (perform expensive computation on input that won't return 0);
    return cache[input];
}
In this case, so long as no other function touches the global cache, it's still a mathematically pure function, even though it technically depends on external global state. However, this state is just a performance optimization -- it would still perform the same computation without it, just more slowly.
Pure functions are required to construct pure expressions. Constant expressions are pure by definition.
So, if your global 'state' doesn't change, you are fine.
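For instance, reading an immutable global leaves a function pure (a trivial C sketch):

static const double GRAVITY = 9.81;  /* global, but it never changes */

/* Pure: for a given mass, the result can never vary. */
double weight(double mass) { return mass * GRAVITY; }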
Also see referential transparency:
A more subtle example is that of a function that uses a global variable (or a dynamically scoped variable, or a lexical closure) to help it compute its results. Since this variable is not passed as a parameter but can be altered, the results of subsequent calls to the function can differ even if the parameters are identical. (In pure functional programming, destructive assignment is not allowed; thus a function that uses global (or dynamically scoped) variables is still referentially transparent, since these variables cannot change.)
In Haskell, for instance, you can create an endless list of random numbers on the impure side and pass that list to your pure function. Thanks to lazy evaluation, the implementation generates the next number only when your pure function needs it, but the function itself is still pure.
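C has no lazy evaluation, but the same separation can be sketched eagerly: do the impure generation up front, then hand the data to a function that is pure in its inputs (the names here are made up for illustration):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Pure: the result depends only on the array contents passed in. */
static long sum(const int *xs, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += xs[i];
    return total;
}

int main(void) {
    srand((unsigned)time(NULL));   /* impure side: seed and generate */
    int xs[10];
    for (int i = 0; i < 10; i++)
        xs[i] = rand() % 100;
    printf("%ld\n", sum(xs, 10));  /* pure consumption of the data */
    return 0;
}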