Passing in "const string &test" is bugging me - constants

Hello wonderful folks,
I'm scratching my head over this one. const means the contents cannot be changed, but & means the contents can be modified. So how do the two fit together?

& means that the argument is passed by reference, not that it will necessarily be modified. Rather, it means that the object is not copied: its copy constructor isn't invoked to pass it. The confusion may arise because one common use of a reference is to allow the callee to modify the referenced object.
A const reference, as shown here, is a pretty natural way to express that a reference to an object will be passed without copying the object, and that the callee will not modify the object. It's a simple and somewhat effective optimization when the object is expensive to copy, the callee doesn't need to modify it, and there are no relevant lifetime concerns that would require a copy to be taken.
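To make that concrete, here's a minimal sketch (the function and variable names are just for illustration):

#include <iostream>
#include <string>

// The string is passed by reference, so no copy is made, and const
// promises that this function will not modify it.
void greet(const std::string &name) {
    // name += "!";        // would not compile: name is const here
    std::cout << "Hello, " << name << '\n';
}

int main() {
    std::string who = "world";
    greet(who);            // no copy of 'who' is constructed
}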

Related

Java Remote object being exported over being serialized

I was reading through the Java RMI book by Esmond Pitt when I came across this:
"If an exported object is serializable, it is still passed by reference. The fact that the object is exported takes precedence over the fact that it is serializable."
Could anyone elaborate on the reason for this?
Well, at one level, they had to choose one or the other, I suppose. (I wasn't in that meeting :)
But if I have an exported remote object, it needs to serialize to its stub (that is, it needs to be passed by reference); otherwise, you'd never be able to use it as a server-like object.
Did that make sense? If you allowed it to be serialized, it might as well not be a remote object: any time you attempted to pass it over a network, you would fail to send the stub (which is what you need if you want it to be remotely accessible) and would simply send a serialized copy. Ergo, it's not a remote object any more.
Of course, you might be asking "why can't I choose dynamically each time this happens". Well, in that case, how on earth would you manage the process without causing some horrible coupling between unrelated parts of your software?
That's my guess :)
Being exported must take precedence for serialization purposes over being serializable. Otherwise RMI wouldn't work. It would just become a mobile-agent protocol.
UnicastRemoteObject, for example, implements Serializable by virtue of extending RemoteObject, which is serializable. If your remote objects extend UnicastRemoteObject, as most do, they are auto-exported on construction, and automatically replaced by their stubs during serialization, as the section you're quoting from in my book describes.
If it didn't do that, in preference to serializing as itself, it wouldn't be a UnicastRemoteObject, it would be more of a mobile agent, as described in another chapter of the book.
It follows that if you want it serialized, you have to unexport it first.
I think I also mentioned in the book that some versions of Java will then export it at the receiver on arrival, so it becomes a callback. I was never able to make much sense of this feature. The receiver could always have exported it himself if that's what he wanted: I don't see why it was forced on him.
http://docs.oracle.com/javase/tutorial/rmi/implementing.html
I hope this link will be useful.
"The rules governing how arguments and return values are passed are as follows:
Remote objects are essentially passed by reference. A remote object reference is a stub, which is a client-side proxy that implements the complete set of remote interfaces that the remote object implements.
Local objects are passed by copy, using object serialization. By default, all fields are copied except fields that are marked static or transient. Default serialization behavior can be overridden on a class-by-class basis."
Explanation:
This means that if an object is made available to all as a remote object, then when the sender needs to send it, the sender puts a reference into the stream and sends that. The receiver receives a reference to the original object.
If an object is serializable (and not exported) and is being sent, the sender serializes a copy of the object and sends that copy, which is an independent object once created.
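To make the distinction concrete, here's a hedged sketch (the Account names are made up, not from the book or the tutorial). Because the UnicastRemoteObject constructor exports the object, an AccountImpl passed in a remote call arrives as its stub, a reference, rather than as a serialized copy:

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Illustrative names. AccountImpl is serializable (RemoteObject, which
// UnicastRemoteObject extends, implements Serializable) AND exported
// (the UnicastRemoteObject constructor exports it). When an instance is
// passed in a remote call, exporting wins: the receiver gets the stub.
interface Account extends Remote {
    int balance() throws RemoteException;
}

class AccountImpl extends UnicastRemoteObject implements Account {
    private int balance = 100;

    AccountImpl() throws RemoteException {
        super();                     // exports this object
    }

    public int balance() throws RemoteException {
        return balance;
    }
}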

Why use empty parentheses in Scala if we can just use no parentheses to define a function that does not need any arguments?

As far as I understand, in Scala we can define a function with no parameters either by using empty parentheses after its name, or no parentheses at all, and these two definitions are not synonyms. What is the purpose of distinguishing these 2 syntaxes and when should I better use one instead of another?
It's mostly a question of convention. Methods with empty parameter lists are, by convention, evaluated for their side effects; methods declared without parentheses are assumed to be side-effect free.
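In code, the convention looks something like this (a minimal sketch with made-up names):

class Counter {
  private var count = 0

  def increment(): Unit = count += 1   // has a side effect: declared with ()
  def current: Int = count             // side-effect free: no parentheses
}

object ConventionDemo extends App {
  val c = new Counter
  c.increment()        // the call site mirrors the declaration
  println(c.current)   // prints 1, and reads like field access
}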
The Scala Style Guide says to omit parentheses only when the method being called has no side effects:
http://docs.scala-lang.org/style/method-invocation.html
Other answers are great, but I also think it's worth mentioning that parameterless methods allow for nice access to a class's fields, like so:
person.name
Because of parameterless methods, you could easily write a method to intercept reads (or writes) to the name field without breaking calling code, like so:
def name = { log("Accessing name!"); _name }
This is called the Uniform Access Principle.
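A small, self-contained sketch of the principle (Person, _name, and the println logging are placeholders):

class Person(private var _name: String) {
  // Callers keep writing person.name; the read can be intercepted:
  def name: String = { println("Accessing name!"); _name }

  // So can the write, via a setter method named name_= :
  def name_=(n: String): Unit = { println(s"Updating name to $n"); _name = n }
}

object UapDemo extends App {
  val person = new Person("Ada")
  println(person.name)   // looks like field access, runs the def
  person.name = "Grace"  // looks like assignment, runs name_=
}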
I'd like to shed one more light on the usefulness of the convention of declaring side-effecting functions (and thus calling them) with an empty parentheses block: the debugger.
If you add a watch in a debugger on, say, process, meaning it to refer to a boolean in the focused debug context (either as a variable view or as a pure, side-effect-free evaluation), that watch creates a nasty risk for your later troubleshooting.
Indeed, if the debugger keeps trying to re-evaluate that watch whenever you change the context (switch threads, move in the call stack, reach another breakpoint...), which I found to be the case at least in IntelliJ IDEA, and in Visual Studio for other languages, then the side effects of any other process function found in whatever scope you are browsing will be triggered.
Just imagine the puzzling troubleshooting this could lead to if you don't keep that risk in mind, all because of some innocent, regular naming. If the convention were enforced, the process boolean watch would never fall back to a process() function call. Your debugger might still allow you to put process() in the watches explicitly, but then it would be clear that you are not directly accessing an attribute or a local variable, and a fallback to some other process() function in a browsed scope, however unlucky, would at least be far less surprising.

How can I lazily load a Perl variable?

I have a variable that I need to pass to a subroutine. It is very possible that the subroutine will not need this variable, and providing the value for the variable is expensive. Is it possible to create a "lazy-loading" object that will only be evaluated if it is actually being used? I cannot change the subroutine itself, so it must still look like a normal Perl scalar to the caller.
You'll want to look at Data::Lazy and Scalar::Defer. Update: There's also Data::Thunk and Scalar::Lazy.
I haven't tried any of these myself, but I'm not sure they work properly for an object. For that, you might try a Moose class that keeps the real object in a lazy attribute which handles all the methods that object provides. (This wouldn't work if the subroutine does an isa check, though, unless it calls isa as a method, in which case you can override it in your class.)
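Going by its documentation, the Scalar::Defer approach would look roughly like this (an untested sketch; expensive_lookup stands in for whatever is costly to produce):

use strict;
use warnings;
use Scalar::Defer;     # exports lazy(), among others

sub expensive_lookup {
    print "computing...\n";        # should appear at most once
    return 42;
}

# Nothing is computed here; $value looks like an ordinary scalar.
my $value = lazy { expensive_lookup() };

sub maybe_use {
    my ($flag, $v) = @_;
    return "skipped" unless $flag; # $v is never evaluated on this path
    return "got: $v";              # first real use triggers the computation
}

print maybe_use(0, $value), "\n"; # "skipped" -- nothing computed yet
print maybe_use(1, $value), "\n"; # "computing..." then "got: 42"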
Data::Thunk is the most transparent and robust way of doing this that i'm aware of.
However, I'm not a big fan of it, or of any other similar modules or techniques that try to hide themselves from the user. I prefer something more explicit, like having the code that uses the hard-to-compute value simply call a function to retrieve it. That way you don't need to precompute your value, your intent is more clearly visible, and you also have various options for avoiding recomputation, like lexical closures, Perl's state variables, or modules like Memoize.
You might look into tying.
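A sketch of what that could look like (LazyScalar is a made-up class; note that a callee copying its arguments into lexicals with my ($x) = @_ counts as a read, so this only pays off if the sub can avoid touching the value):

package LazyScalar;
use strict;
use warnings;

sub TIESCALAR {
    my ($class, $builder) = @_;
    return bless { builder => $builder }, $class;
}

sub FETCH {
    my $self = shift;
    if (my $builder = delete $self->{builder}) {
        $self->{value} = $builder->();   # computed once, on first read
    }
    return $self->{value};
}

sub STORE {
    my ($self, $v) = @_;
    delete $self->{builder};             # a write makes the builder moot
    $self->{value} = $v;
}

package main;

tie my $data, 'LazyScalar', sub {
    print "expensive computation runs now\n";
    return 42;
};

print "tied, nothing computed yet\n";
print "value: $data\n";                  # FETCH fires here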
I would suggest stepping back and rethinking how you are structuring your program. Instead of passing a variable to a method that it might not need, make that value available in some other way, such as another method call, that can be called as needed (and not when it isn't).
In Moose, data like this is ideally stored in attributes. You can make attributes lazily built, so they are not calculated until they are first needed, but after that the value is saved so it does not need to be calculated a second time.
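In Moose that looks roughly like this (Report and _build_data are illustrative names):

package Report;
use Moose;

has data => (
    is      => 'ro',
    isa     => 'ArrayRef',
    lazy    => 1,
    builder => '_build_data',
);

sub _build_data {
    print "building data now\n";   # runs only on the first ->data call
    return [ 1 .. 1_000_000 ];
}

package main;

my $report = Report->new;          # nothing expensive happens yet
my $rows   = $report->data;        # the builder fires here, exactly once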

How to make Scala's immutable collections hold immutable objects

I'm evaluating Scala and am having a problem with its immutable collections.
I want to make immutable collections that are completely immutable, right down through all the contained objects, the objects they reference, ad infinitum.
Is there a simple way to do this?
The code on http://www.finalcog.com/immutable-containers-scala illustrates what I'm trying to achieve, along with a nasty workaround (ImmutablePoint).
The problem with the workaround is that every time I want to change an object I have to manually make a new copy. I understand that the runtime will have to implement copy-on-write, but can this be made transparent to the developer?
I suppose I'm looking to make Immutable Objects, where methods change the current object state, but all other 'val' (and all immutable container) references to the object retain the 'old' state.
This is not possible out of the box in Scala via some specific language construct, unless you have followed the idiom that all of your objects are immutable, in which case this behaviour comes for free!
With 2.8, named parameters have made "copy constructors" quite nice to use, from a readability perspective. But you are correct, this works as copy-on-write. The behaviour you are asking for, where the "current" object is the only one mutated, goes completely against the way the JVM works, unfortunately (for you)!
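For instance, with a case class (Point is just an illustration), copy plus named parameters gives you the copy-on-write version of an update:

case class Point(x: Int, y: Int)

object CopyDemo extends App {
  val p1 = Point(1, 2)
  val p2 = p1.copy(y = 5)   // a new object; p1 is untouched
  println(p1)               // Point(1,2)
  println(p2)               // Point(1,5)
}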
Actually the phrase "the current object" makes no sense; really you mean "the current reference"! All other references (outside the current lexical scope) which point to the same object, erm, point to the same object! There is only one object!
Hence it's just not possible for this object to appear to be mutable from the perspective of the current lexical scope but immutable to everyone else.
If you're interested in some more general theory on how to handle updates to immutable data structures efficiently,
http://en.wikipedia.org/wiki/Zipper_%28data_structure%29
might prove interesting.

Do I conserve memory in MATLAB by declaring variables global instead of passing them as arguments?

I am new to MATLAB, it wasn't in the job description and I've been forced to take over for the person who wrote and maintained the code my company uses. Life's tough.
The guy I'm taking over from told me that he declared all the big data vectors as global to save memory; more specifically, so that when one function calls another, it doesn't create a copy of the data when passing it over.
Is this true? I read Strategies for Efficient Use of Memory, and it says that
When working with large data sets, be aware that MATLAB makes a temporary copy of an input variable if the called function modifies its value. This temporarily doubles the memory required to store the array, which causes MATLAB to generate an error if sufficient memory is not available.
It says something very similar in Memory Allocation for Arrays, under Function Arguments:
When you pass a variable to a function, you are actually passing a reference to the data that the variable represents. As long as the input data is not modified by the function being called, the variable in the calling function and the variable in the called function point to the same location in memory. If the called function modifies the value of the input data, then MATLAB makes a copy of the original array in a new location in memory, updates that copy with the modified value, and points the input variable in the called function to this new array.
So is it true that using global can be better? It seems a little sloppy to blithely declare all the large data as global, instead of making sure that none of the code modifies its input argument. Am I wrong? Does this really improve RAM usage?
In my experience, provided that none of the code modifies the large data, memory usage is the same whether you use a global variable or an input argument, just as the MATLAB docs say. Further information is in this blog post by a MathWorks employee.
There is quite a bit of folklore about performance issues in MATLAB, and not all of it is right. The internals of MATLAB have changed quite a bit; it may be that in a previous version it was better to use a global variable.
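A small sketch you can use to see the lazy copy in action on your own system (the function names are made up; watch usage with the profiler or the memory command):

function demo
    data = rand(5000);        % 5000x5000 doubles, about 200 MB

    s = sumOnly(data);        % callee never writes -> no copy is made
    d = modifyFirst(data);    % the write inside triggers a copy
end

function s = sumOnly(x)
    s = sum(x(:));            % read-only: x shares memory with the caller
end

function y = modifyFirst(x)
    x(1) = 0;                 % first write: MATLAB copies the array here
    y = x;
end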
This answer may be somewhat tangential, but an additional topic that bears mention here is the use of nested functions to manage memory.
As has already been established in other answers, there is no need for global variables if the data you are passing to the function is not modified (since it will be passed by reference). If it is modified (and is thus passed by value), using a global variable instead will save you memory. However, global variables can be somewhat "uncouth" for the following reasons:
You have to make a declaration like global varName everywhere you need them.
It can be conceptually a little messy trying to keep track of when and how they are modified, especially if they are spread across multiple m-files.
The user can easily break your code with an ill-placed clear global, which clears all global variables.
An alternative to global variables was mentioned in the first set of documentation you cited: nested functions. Immediately following the quote you cited is a code example (which I've formatted slightly differently here):
function myfun
    A = magic(500);
    setrowval(400, 0);
    disp('The new value of A(399:401,1:10) is')
    A(399:401,1:10)

    function setrowval(row, value)
        A(row,:) = value;
    end
end
In this example, the function setrowval is nested inside the function myfun. The variable A in the workspace of myfun is accessible within setrowval (as if it had been declared global in each). The nested function modifies this shared variable, thus avoiding any additional memory allocation. You don't have to worry about the user inadvertently clearing anything and (in my opinion) it's a bit cleaner and easier to follow than declaring global variables.
The solution seems a bit strange to me. As you found out already, it shouldn't have a significant impact on memory usage if the called function does not modify the data array. However, if the called function does modify the data array, there is a functional difference: in one case (making the data array global), the change affects the rest of the code; in the other case (passing it as an argument), the modifications are only local and temporary.
I think you pretty much answered your own question, but a couple more references would be good here:
I made a video on this:
http://blogs.mathworks.com/videos/2008/09/16/new-location-and-memory-allocation/
Similar to what Loren spoke of here:
http://blogs.mathworks.com/loren/2006/05/10/memory-management-for-functions-and-variables/
-Dogu