Recursive reference in Perl

$a=\$a;
The book I'm reading says that in this case $a will NEVER be freed. My question is: why doesn't the Perl interpreter fix this at compile time? When it finds that a variable points at itself, it could simply not increase the refcount.
Why doesn't Perl do that?

Some garbage collectors have cycle detection; Perl, for performance and historical reasons, does not. If you want a reference that doesn't affect the reference count, you can use Scalar::Util::weaken to obtain a weak reference, which removes the need for cycle detection in most situations where you would need to rely on it. There would need to be cycle-detection built into the interpreter to automatically detect whether \$a should be a weak or strong reference, so you just have to do it explicitly.

Related

How can I find all closures?

We've totally forgotten to capture self and its properties when referencing it within a closure. (Note: the compiler didn't warn us.) Now our application is full of strong reference cycles. To fix them, we have to add a capture list to each closure one by one.
How can we find them all? I thought of searching for in, but that returns too many results, including comments and for loops.
In good old Objective-C I could simply search for ^. And the compiler would warn us...

weak vs unowned in Swift. What are the internal differences?

I understand the usage and superficial differences between weak and unowned in Swift:
The simplest example I've seen is that if there is a Dog and a Bone, the Bone may have a weak reference to the Dog (and vice versa) because each can exist independently of the other.
On the other hand, in the case of a Human and a Heart, the Heart may have an unowned reference to the human, because as soon as the Human becomes... "dereferenced", the Heart can no longer reasonably be accessed. That and the classic example with the Customer and the CreditCard.
So this is not a duplicate of questions asking about that.
My question is, what is the point in having two such similar concepts? What are the internal differences that necessitate having two keywords for what seem essentially 99% the same thing? The question is WHY the differences exist, not what the differences are.
Given that we can just set up a variable like this: weak var customer: Customer!, the advantage of unowned variables being non-optional is a moot point.
The only practical advantage I can see of using unowned vs implicitly unwrapping a weak variable via ! is that we can make unowned references constant via let.
... and that maybe the compiler can make more effective optimizations for that reason.
Is that true, or is there something else happening behind the scenes that provides a compelling argument for keeping both keywords (even though the slight distinction is, judging by Stack Overflow traffic, evidently confusing to new and experienced developers alike)?
I'd be most interested to hear from people who have worked on the Swift compiler (or other compilers).
My question is, what is the point in having two such similar concepts? What are the internal differences that necessitate having two keywords for what seem essentially 99% the same thing?
They are not at all similar. They are as different as they can be.
weak is a highly complex concept, introduced when ARC was introduced. It performs the near-miraculous task of allowing you to prevent a retain cycle (by avoiding a strong reference) without risking a crash from a dangling pointer when the referenced object goes out of existence, something that used to happen all the time before ARC was introduced.
unowned, on the other hand, is non-ARC weak (to be specific, it is the same as non-ARC assign). It is what we used to have to risk, it is what caused so many crashes, before ARC was introduced. It is highly dangerous, because you can get a dangling pointer and a crash if the referenced object goes out of existence.
The reason for the difference is that weak, in order to perform its miracle, involves a lot of extra overhead for the runtime, inserted behind the scenes by the compiler. weak references are memory-managed for you. In particular, the runtime must maintain a scratchpad of all references marked in this way, keeping track of them so that if an object weakly referenced goes out of existence, the runtime can locate that reference and replace it by nil to prevent a dangling pointer.
In Swift, as a consequence, a weak reference is always to an Optional (exactly so that it can be replaced by nil). This is an additional source of overhead, because working with an Optional entails extra work, as it must always be unwrapped in order to get anything done with it.
For this reason, unowned is always to be preferred wherever it is applicable. But never use it unless it is absolutely safe to do so! With unowned, you are throwing away automatic memory management and safety. You are deliberately reverting to the bad old days before ARC.
In my usage, the common case arises in situations where a closure needs a capture list involving self in order to avoid a retain cycle. In such a situation, it is almost always possible to say [unowned self] in the capture list. When we do:
It is more convenient for the programmer because there is nothing to unwrap. [weak self] would be an Optional in need of unwrapping in order to use it.
It is more efficient, partly for the same reason (unwrapping always adds an extra level of indirection) and partly because it is one fewer weak reference for the runtime's scratchpad list to keep track of.
A weak reference is actually set to nil when the referent deallocates, and you must check it before use. An unowned reference is also set to nil, but you are not forced to check it.
You can check a weak against nil with if let, guard, ?, etc, but it makes no sense to check an unowned, because you think that is impossible. If you are wrong, you crash.
I have found that, in practice, I never use unowned. There is a minuscule performance penalty, but the extra safety from using weak is worth it to me.
I would leave unowned usage to very specific code that needs to be optimized, not general app code.
The "why does it exist" that you are looking for is that Swift is meant to be able to write system code (like OS kernels) and if they didn't have the most basic primitives with no extra behavior, they could not do that.
NOTE: I had previously said in this answer that unowned is not set to nil. That is wrong; a bare unowned is set to nil. An unowned(unsafe) is not set to nil and could be a dangling pointer. This is for high-performance needs and should generally not be in application code.

Querying Python runtime for all objects in existence

I'm working on a C++ Python wrapper that attempts to encapsulate the awkwardness of reference counting, retaining, releasing.
It has a set of unit tests.
However I want to ensure that after each test, everything has been cleared away properly. i.e. every object created during that test has had its reference count taken down to 0, and has consequently been removed.
Is there any way of querying the Python runtime for this information?
If I could just get the number of objects being stored, that would do. I could then make sure it doesn't change between tests.
EDIT: I believe it is possible to compile Python with a special flag producing a binary that has functions for monitoring reference counting. But this is as much as I know. Maybe more...
That depends on which implementation you use. I'm assuming you're using CPython. Since you're fiddling with the reference counting mechanism, I will further assume that using the garbage collector to find the remaining objects won't be sufficiently reliable for your purpose. (Otherwise, see here.)
The build flag you were thinking about is this one:
It is best to define these options in the EXTRA_CFLAGS make variable:
make EXTRA_CFLAGS="-DPy_REF_DEBUG".
Py_REF_DEBUG introduced in 1.4
named REF_DEBUG before 1.4
Turn on aggregate reference counting. This arranges that extern
_Py_RefTotal hold a count of all references, the sum of ob_refcnt across
all objects. [..]
Special gimmicks:
sys.gettotalrefcount()
Return current total of all refcounts.
(Source: Python git, SpecialBuilds.txt, Debugging builds from the C API reference.)
If you need a list of all pointers to live objects, use Py_TRACE_REFS, directly below that one in the SpecialBuilds file.
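As a rough sketch of how a test harness might use this: the helper names live_object_snapshot and leaked_after below are hypothetical, not part of Python. The code calls sys.gettotalrefcount() when the interpreter was built with Py_REF_DEBUG and otherwise falls back to counting gc-tracked objects, which only covers container objects and is therefore a weaker check.

```python
import gc
import sys


def live_object_snapshot():
    """Return a rough measure of what the interpreter is currently holding.

    Uses sys.gettotalrefcount() on a Py_REF_DEBUG build; otherwise falls
    back to the number of gc-tracked (container) objects.
    """
    gc.collect()  # drop collectable cycles so they are not counted as leaks
    total_refs = getattr(sys, "gettotalrefcount", None)
    if total_refs is not None:
        return total_refs()
    return len(gc.get_objects())


def leaked_after(test_func):
    """Run test_func and return how much the snapshot grew (hypothetical helper)."""
    before = live_object_snapshot()
    test_func()
    after = live_object_snapshot()
    return after - before
```

The difference is rarely exactly zero, because the interpreter keeps internal caches, so in practice one would probably compare it against a small tolerance rather than against 0.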

MATLAB weak references to handle class objects

While thinking about the possibility of a handle class based ORM in MATLAB, the issue of caching instances came up. I could not immediately think of a way to make weak references or a weak map, though I'm guessing that something could be contrived with event listeners. Any ideas?
More Info
In MATLAB, a handle class (as opposed to a value class) has reference semantics. An example included with MATLAB is the containers.Map class. If you instantiate one and pass it to a function, any modifications the function makes to the object will be visible via the original reference. That is, it works like a Java or Python object reference.
Like Java and Python, MATLAB keeps track in one way or another of how many things are referencing each object of a handle class. When there aren't any more, MATLAB knows it is safe to delete the object.
A weak reference is one that refers to the object but does not count as a reference for purposes of garbage collection. So if the only remaining references to the object are weak, then it can be thrown away. Generally an event or callback can be supplied to the weak reference - when the object is thrown away, the weak references to it will be notified, allowing cleanup code to run.
For instance, a weak value map is like a normal map, except that the values (as opposed to the keys) are implemented as weak references. The weak map class can arrange a callback or event on each of these weak references so that when the referenced object is deleted, the key/value entry in the map is removed, keeping the map nice and tidy.
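For comparison only, since the question likens handle objects to Python references: this is roughly what a weak value map with a cleanup callback looks like in Python, where the runtime provides it. Record, load and evicted are made-up names for illustration; weakref.WeakValueDictionary and weakref.finalize are standard library. Nothing here implies MATLAB offers an equivalent.

```python
import gc
import weakref


class Record:
    """Stand-in for an expensive-to-build cached instance (hypothetical)."""

    def __init__(self, key):
        self.key = key


cache = weakref.WeakValueDictionary()  # values are held weakly
evicted = []                           # keys whose objects have been destroyed


def load(key):
    """Return the cached Record for key, building it on a cache miss."""
    obj = cache.get(key)
    if obj is None:
        obj = Record(key)
        cache[key] = obj
        # Callback that runs when obj is destroyed.
        weakref.finalize(obj, evicted.append, key)
    return obj


first = load("42")
same = load("42")          # cache hit: no new Record is built
hit = same is first
del first, same            # drop the last strong references
gc.collect()               # not needed on CPython, harmless elsewhere
remaining = len(cache)     # the entry disappears along with the object
```

The map removes its own entry once the value is gone; the finalize callback is only there to show where cleanup code could hook in.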
These special reference types are really a language-level feature, something you need the VM and GC to do. Trying to implement it in user code will likely end in tears, especially if you lean on undocumented behavior. (Sorry to be a stick in the mud.)
There are a couple of ways you could do something similar. These are just ideas, not endorsements; I haven't actually done them.
Perhaps instead of caching Matlab object instances per se, you could cache expensive computational results using a real Java weak ref map in the JVM embedded inside Matlab. If you can convert your Matlab values to and from Java relatively quickly, this could be a win. If it's relatively flat numeric data, primitives like double[] or double[][] convert quickly using Matlab's implicit conversion.
Or you could make a regular LRU object cache in the Matlab level (maybe using a containers.Map keyed by hashcodes) that explicitly removes the objects inside it when new ones are added. Either use it directly, or add an onCleanup() behavior to your objects that has them automatically add a copy of themselves to a global "recently deleted objects" LRU cache of fixed size, keyed by an externally meaningful id, and mark the instances in the cache so your onCleanup() method doesn't try to re-add them when they're deleted due to expiration from the cache. Then you could have a factory method or other lookup method "resurrect" instances from the cache instead of constructing brand new ones the expensive way. This sounds like a lot of work, though, and really not idiomatic Matlab.
This is not an answer to your question but just my 2 cents.
Weak references are a feature of the garbage collector. In Java and .NET the garbage collector is invoked when the pressure on memory is high, and is therefore nondeterministic.
This MATLAB Digest post says that MATLAB does not use a (nondeterministic) garbage collector. In MATLAB, references are deleted from memory (deterministically) on each stack pop, i.e. on leaving each function.
Thus I do not think that weak references belong to the MATLAB reference handling concept. But MATLAB has always had tons of undocumented features, so I cannot exclude that something is buried somewhere.
In this SO post I asked about the MATLAB garbage collector implementation and got no real answer. One MathWorks staff member, instead of answering my question, accused me of trying to construct a Python vs. MATLAB argument. Another MathWorks staff member wrote something that looked reasonable but was in substance a clever deception, a purposeful distraction from the problem I asked about. And the best answer was:
if you ask this question then MATLAB
is not the right language for you!

Garbage collection in Perl

Unlike Java, Perl uses reference counting for garbage collection. I have tried searching previous questions that discuss C++ RAII, smart pointers and Java GC, but I have not understood how Perl deals with the circular reference problem.
Can anyone explain how Perl's garbage collector deals with circular references? Is there any way to reclaim circularly referenced memory that is no longer used by the program, or does Perl just ignore this problem altogether?
According to my copy of Programming Perl 3rd ed., on exit Perl 5 does an "expensive mark and sweep" to reclaim circular references. You'll want to avoid circular references as much as possible because otherwise they won't be reclaimed until the program exits.
Perl 5 does offer weak references through the Scalar::Util module.
Perl 6 will move to a pluggable garbage collected scheme (well, the underlying VM will have multiple garbage collection options and the behavior of those options can have an effect on Perl). That is, you'll be able to choose between various garbage collectors, or implement your own. Want a copying collector? Sure. Want a coloring collector? You got it. Mark/sweep, compacting, etc? Why not?
The quick answer is that Perl 5 does not handle circular references automatically. Unless you take explicit measures in your code, any of your data structures which include circular references will not be reclaimed until the thread that created them dies. This is considered to be an acceptable tradeoff in that it avoids the need for runtime garbage collection that would slow down execution.
If your code creates data structures with circular references (e.g. a tree whose nodes contain references back to the root), you will want to use the Scalar::Util module to "weaken" the references that point back toward the root node. These weak references will not add to the reference count of whatever they point to, so the entire data structure will be automatically deallocated when the last external reference vanishes.
Example:
use Scalar::Util qw(weaken);
...
my $new_node = { content => $content, root => $root_node };
weaken $new_node->{root};
push @{$root_node->{children}}, $new_node;
If you use code like this whenever you add new nodes to your data structure, then the only references to the root that are actually counted are those from outside of the structure. This is exactly what you want. Then the root, and recursively all of its children, will be reclaimed as soon as the last external reference to it vanishes.
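The same pattern can be sketched outside Perl. The following is a Python analogue, not Perl code: Node is a made-up class, and weakref.ref plays the role of weaken. The cycle collector is switched off for the demonstration so that only reference counting is at work, which relies on CPython behaviour.

```python
import gc
import weakref


class Node:
    """Tree node whose back-reference to the root is weak (illustrative)."""

    def __init__(self, content, root=None):
        self.content = content
        self.children = []
        # Weak back-reference: does not add to the root's reference count.
        self._root = weakref.ref(root) if root is not None else None
        if root is not None:
            root.children.append(self)  # strong reference, root -> child

    @property
    def root(self):
        """Return the root node, or None if it no longer exists."""
        return self._root() if self._root is not None else None


gc.disable()                  # rely on reference counting alone (CPython)
root = Node("root")
child = Node("leaf", root)
probe = weakref.ref(root)     # lets us observe whether root is still alive
del root                      # the last external reference vanishes
root_freed = probe() is None  # the weak back-reference did not keep it alive
gc.enable()
```

Had the child stored a plain strong reference to the root, the two nodes would keep each other alive after del root until some cycle collector ran.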
Have a look at Proxy Objects.
Perl applies an alternate mark-and-sweep GC on some occasions (when a thread dies, I think) in order to reclaim circular references. Note that Perl's "every value is a string" stance makes it difficult to create true circular references; it is feasible, but "normal" Perl code does not do so, which is why reference counting works well with Perl.