Does MATLAB perform tail call optimization? - matlab

I've recently learned Haskell, and am trying to carry the pure functional style over to my other code when possible. An important aspect of this is treating all variables as immutable, i.e. constants. In order to do so, many computations that would be implemented using loops in an imperative style have to be performed using recursion, which typically incurs a memory penalty due to the allocation of a new stack frame for each function call. In the special case of a tail call (where the return value of a called function is immediately returned by the calling function to its own caller), however, this penalty can be bypassed by a process called tail call optimization (in one method, this is done by essentially replacing the call with a jmp after setting up the stack properly). Does MATLAB perform TCO by default, or is there a way to tell it to?

If I define a simple tail-recursive function:
function tailtest(n)
if n==0; feature memstats; return; end
tailtest(n-1);
end
and call it so that it will recurse quite deeply:
set(0,'RecursionLimit',10000);
tailtest(1000);
then it doesn't look as if stack frames are eating a lot of memory. However, if I make it recurse much deeper:
set(0,'RecursionLimit',10000);
tailtest(5000);
then (on my machine, today) MATLAB simply crashes: the process unceremoniously dies.
I don't think this is consistent with MATLAB doing any TCO; the case where a function tail-calls itself, only in one place, with no local variables other than a single argument, is just about as simple as anyone could hope for.
So: No, it appears that MATLAB does not do TCO at all, at least by default. I haven't (so far) looked for options that might enable it. I'd be surprised if there were any.
In cases where we don't blow out the stack, how much does recursion cost? See my comment to Bill Cheatham's answer: it looks like the time overhead is nontrivial but not insane.
... Except that Bill Cheatham deleted his answer after I left that comment. OK. So, I took a simple iterative implementation of the Fibonacci function and a simple tail-recursive one, doing essentially the same computation in both, and timed them both on fib(60). The recursive implementation took about 2.5 times longer to run than the iterative one. Of course the relative overhead will be smaller for functions that do more work than one addition and one subtraction per iteration.
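The answer does not show the two Fibonacci implementations it timed. As a rough illustration of what such a pair could look like, here is a minimal sketch in Python rather than MATLAB (CPython does not eliminate tail calls either). The names fib_iter and fib_tail are made up for this example.

```python
def fib_iter(n):
    """Iterative Fibonacci: one addition and one decrement per step."""
    a, b = 0, 1
    while n > 0:
        a, b = b, a + b
        n -= 1
    return a


def fib_tail(n, a=0, b=1):
    """Tail-recursive Fibonacci: the recursive call is the last action taken."""
    if n == 0:
        return a
    return fib_tail(n - 1, b, a + b)


# Both versions do the same arithmetic; only the control flow differs.
print(fib_iter(60))
print(fib_tail(60))
```

Both functions perform the same additions, so any timing difference between them comes from function-call overhead rather than from the arithmetic.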
(I also agree with delnan's sentiment: highly-recursive code of the sort that feels natural in Haskell is likely to be unidiomatic in MATLAB.)

There is a simple way to check this. Create this function tail_recursion_check:
function r = tail_recursion_check(n)
if n > 1
r = tail_recursion_check(n - 1);
else
error('error');
end
end
and run tail_recursion_check(10), for example. You will see a stack trace with 10 entries: one for the error call itself and nine for the recursive call on line 3. If there were tail call optimization, you would only see one.
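The same check can be tried in other languages. Here is a minimal sketch in Python (not MATLAB), assuming it is enough to count how many frames of the recursive function show up in the traceback. count_frames is a helper invented for this example.

```python
import traceback


def tail_recursion_check(n):
    if n > 1:
        return tail_recursion_check(n - 1)  # tail call
    raise RuntimeError("error")


def count_frames(n):
    """Count the tail_recursion_check frames recorded in the traceback."""
    try:
        tail_recursion_check(n)
    except RuntimeError as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        return sum(1 for f in frames if f.name == "tail_recursion_check")


# With tail call elimination only one frame would survive; CPython keeps all.
print(count_frames(10))
```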

Related

Why is ".map" slower than "while/for loop" in Dart(Flutter)

I saw this article:
https://itnext.io/comparing-darts-loops-which-is-the-fastest-731a03ad42a2
It says that ".map" is slow and backs this up with benchmark results.
But I don't understand why it is slower than a while/for loop.
How does it work at a low level?
I think it's because .map calls an anonymous function like this: (_){ }
Can you explain that in detail?
It's because mapping an array creates a copy of each value rather than modifying the original array.
Since a while/for loop does not copy the values but just accesses them by index, it is a lot faster.
Can you explain that in detail?
It's like saying "I don't understand why hitchhiking on the back of a construction truck is so much slower than taking the high speed train to my destination".
The only detail that is important is that map is not a loop. map() internally probably uses a loop of some kind.
This person is misusing a method call that is meant for something else, just because a side effect of that call, when combined with a call that materializes the iterable, like toList(), is that it loops through the iterable given. It doesn't even have the side effect on its own.
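To make the "no side effect on its own" point concrete, here is a small sketch in Python rather than Dart. Python's built-in map is lazy in a comparable way, so it serves as an analogy here and says nothing about Dart's internals. The record function and the visited list are made up for the example.

```python
visited = []


def record(x):
    visited.append(x)  # side effect, used only to observe when calls happen
    return x * 2


lazy = map(record, [1, 2, 3])  # builds a lazy iterator; record has not run yet
before = list(visited)         # snapshot taken before anything is materialized

doubled = list(lazy)           # materializing the iterator drives the calls
after = list(visited)          # snapshot taken after materialization

print(before, doubled, after)
```

Until the result is materialized, the mapped function is never called. That is why using map purely for its side effects means relying on a second call to do the looping.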
Stop reading "tutorials" or "tips" of people misusing language features. map() is not a loop. If you need a loop, use a loop. The same goes for the ternary operator. It's not an if, if you need an if, use it.
Use language features for what they are meant, stop misusing language features because their side-effect does what you want and then wondering why they don't work as well as the feature actually meant for it.
Sorry if this seems a bit ranty, but I have seen countless examples by now. I don't know where it comes from. My personal guess is "internet tutorials". Because everybody can write one. Please don't read them. Read a good book. It was written by professionals, proofread, edited, and checked. Internet tutorials are free, written by random people, and worth about as much as they cost.

Why variable assignments should not be used in functional programming

I am learning functional programming and I can understand why immutability is preferred over mutable objects.
This article also explains it well.
But I am unable to understand why assignments should not be performed inside pure functions.
One reason that I can understand is that variable mutability leads to locking, and since in a pure function in Scala we mostly use tail recursion, this creates variables/objects on the call stack rather than on the heap.
Is there any other reason why one should avoid variable assignment in functional programming?
There is a difference between assignments and re-assignments. Re-assignments are not allowed in functional programming because mutability is not allowed in pure functions. Variable assignment is allowed.
val a = 1 //assignment allowed
a = 2 //re-assignment not allowed
Reading from the external world in an impure fashion (changing state) is a side effect in functional programming.
So a function that accesses a global variable which can potentially be mutated is not pure.
Just makes life easy.
Generally
When you are disciplined, life is less chaotic. That's exactly what functional programming advocates. When life is less chaotic you can concentrate on better things in life.
So, the main reason for immutability
It becomes hard to reason about the correctness of a program with mutations. In the case of concurrent programs, this is very painful to debug.
That means it becomes hard to keep track of the changes variables undergo in order to understand the code/program or to debug the program.
Mutation is one of the side effects that make a program hard to understand and reason about.
Functional programming enforces this discipline (use of immutability) so that code is maintainable, expressive and understandable.
Mutation is one of the side effects
A pure function is one that does not have side effects.
Side effects:
Mutation of variables
Mutation of mutable data structures
Reading or writing to a file/console (external source)
Throwing exceptions to halt the program
Avoiding the above-mentioned side effects makes a function depend only on its parameters rather than on any outside values or state.
A pure function is the most isolated kind of function: it neither reads from the world nor writes to the world. It does not halt or break the program's control flow.
The above properties make a pure function easy to understand and reason about.
A pure function is a mathematical function
It's a mapping from a domain to a codomain where every value in the domain is mapped to exactly one value in the codomain.
That means if f(2) is equal to 4 then f(2) is 4 irrespective of what the state of the world is.
A pure function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.
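The distinction above can be sketched with two tiny functions. This is an illustration in Python rather than Scala, and the names counter, impure_add and pure_add are hypothetical.

```python
counter = 0  # hypothetical mutable global state


def impure_add(x):
    """Impure: reads and mutates state outside its parameters."""
    global counter
    counter += 1
    return x + counter


def pure_add(x, offset):
    """Pure: the result depends only on the arguments."""
    return x + offset


first = impure_add(2)   # result depends on how often the function was called
second = impure_add(2)  # same argument, different result
print(first, second, pure_add(2, 2), pure_add(2, 2))
```

The impure version returns different values for the same input because it depends on hidden state. The pure version always maps 2 and 2 to 4.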

Implementing a priority queue in matlab in order to solve optimization problems using BRANCH AND BOUND

I'm trying to code a priority queue in MATLAB. I know there is a Simulink toolbox for priority queues, but I'm trying to code it in MATLAB itself. I have pseudocode that uses a priority queue for a method called Best-First Search with Branch and Bound. The branch and bound algorithm design strategy works on a state space tree and is used to solve optimization problems. simple explanation of what is branch and bound
I have read chapter 5, Branch and Bound, of the book 'Foundations of Algorithms' (4th edition) by Richard Neapolitan and Kumarss Naimipour. The text is about designing algorithms, complexity analysis of algorithms, and computational complexity (analysis of problems). It is a very interesting book, and in it I came across this pseudocode:
void BeFS(state_space_tree T, number& best)
{
    priority_queue_of_node PQ;
    node u, v;
    initialize(PQ);              % initialize PQ to be empty
    v = root of T;
    best = value(v);
    insert(PQ, v);               % insert(PQ,v) is a procedure that adds v to the priority queue PQ
    while (!empty(PQ)) {         % remove node with best bound
        remove(PQ, v);           % remove(PQ,v) is a procedure that removes the node with the best bound and assigns its value to v
        if (bound(v) is better than best)    % check if node is still promising
            for (each child u of v) {
                if (value(u) is better than best)
                    best = value(u);
                if (bound(u) is better than best)
                    insert(PQ, u);
            }
    }
}
I don't know how to code it in MATLAB. Branch and bound is an interesting general algorithm for finding optimal solutions to various optimization problems, especially in discrete and combinatorial optimization. I would like to use it instead of heuristics, since branch and bound reduces calculation time and finds the optimal solution faster.
EDIT:
I checked everywhere for an existing implementation before posting a question here. I came here to get ideas on how to get started implementing this code.
I have included this in your post so people can know better what you expect of them. However, 'ideas to get started to implement' is still not much more specific than 'how to write code in MATLAB'.
However, I will still try to answer:
1. Make the structure of the code: write the basic loops and fill them with comments describing what you want to do.
2. Pick (the easiest or first) one of those comments, and see whether you can make it happen in a few lines. You can test it by generating some dummy input for that piece of code.
3. Keep repeating step 2 until all comments have the required code.
If you get stuck in one of the blocks, and have searched but not found the answer to a specific question, then this is not a bad place to ask.
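As one possible starting point for the pseudocode in the question, here is a minimal sketch in Python (not MATLAB). It makes several assumptions the question does not state: the problem is a 0/1 knapsack, "better" means "larger", the bound is the usual fractional-knapsack estimate, and weights are positive. The priority queue is Python's heapq, and all function names are made up for this example.

```python
import heapq
from itertools import count


def bound(level, profit, weight, items, capacity):
    """Optimistic estimate: fill the remaining capacity greedily and allow a
    fraction of the first item that does not fit. Expects items sorted by
    value/weight ratio in descending order, with positive weights."""
    if weight > capacity:
        return 0
    result, total = profit, weight
    for value, w in items[level:]:
        if total + w <= capacity:
            total += w
            result += value
        else:
            result += (capacity - total) * value / w
            break
    return result


def best_first_knapsack(items, capacity):
    """items is a list of (value, weight) pairs; returns the best total value."""
    items = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    tie = count()  # tie-breaker so the heap never has to compare nodes
    best = 0
    # A node is (level, profit, weight). heapq is a min-heap, so store -bound.
    pq = [(-bound(0, 0, 0, items, capacity), next(tie), (0, 0, 0))]
    while pq:
        neg_bound, _, (level, profit, weight) = heapq.heappop(pq)
        if -neg_bound <= best or level == len(items):
            continue  # node is no longer promising
        value, w = items[level]
        # Two children: take the item at this level, or skip it.
        for child in ((level + 1, profit + value, weight + w),
                      (level + 1, profit, weight)):
            c_level, c_profit, c_weight = child
            if c_weight <= capacity and c_profit > best:
                best = c_profit
            c_bound = bound(c_level, c_profit, c_weight, items, capacity)
            if c_bound > best:
                heapq.heappush(pq, (-c_bound, next(tie), child))
    return best


# Hypothetical instance: (value, weight) pairs and a capacity of 16.
print(best_first_knapsack([(40, 2), (30, 5), (50, 10), (10, 5)], 16))
```

The heap stores the negated bound so that the node with the largest bound is popped first, which is the role of remove(PQ, v) in the pseudocode. In MATLAB the same structure could start as a plain array that is searched for the best bound, with a real heap added later if that becomes the bottleneck.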

What is the architecture behind Scratch programming blocks?

I need to build a mini version of the programming blocks that are used in Scratch or later in snap! or openblocks.
The code in all of them is big and hard to follow, especially in Scratch, which is written in some kind of subset of Smalltalk, which I don't know.
Where can I find the algorithm they all use to parse the blocks and transform them into a set of instructions that work on something, such as animations or games as in Scratch?
I am really interested in the algorithm or architecture behind the concept of programming blocks.
This is going to be just a really general explanation, and it's up to you to work out specifics.
Defining a block
There is a Block class that all blocks inherit from. They get initialized with their label (name), shape, and a reference to the method. When they are run/called, the associated method is passed the current context (sprite) and the arguments.
Exact implementations differ among versions. For example, in Scratch 1.x, methods took arguments corresponding to the block's arguments, and the context (this or self) is the sprite. In 2.0, they are passed a single argument containing all of the block's arguments and context. Snap! seems to follow the 1.x method.
Stack (command) blocks do not return anything; reporter blocks do.
Interpreting
The interpreter works somewhat like this. Each block contains a reference to the next one, and any subroutines (reporter blocks in arguments; command blocks in a C-slot).
First, all arguments are resolved. Reporters are called, and their return value stored. This is done recursively for lots of Reporter blocks inside each other.
Then, the command itself is executed. Ideally this is a simple command (e.g. move). The method is called, the Stage is updated.
Continue with the next block.
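The steps above can be sketched roughly as follows. This is a toy illustration in Python, not the actual code of Scratch or Snap!. The Block fields, the dictionary used as a sprite, and the move, add and x_position methods are all assumptions made for the example.

```python
class Block:
    """A block: label, implementing method, argument slots, and the next block."""

    def __init__(self, label, method, args=(), next_block=None):
        self.label = label
        self.method = method
        self.args = args              # literals or nested reporter Blocks
        self.next_block = next_block


def evaluate(arg, sprite):
    """Resolve one argument: run nested reporters recursively, pass literals through."""
    if isinstance(arg, Block):
        values = [evaluate(a, sprite) for a in arg.args]
        return arg.method(sprite, *values)
    return arg


def run_script(block, sprite):
    """Run a stack of command blocks from top to bottom."""
    while block is not None:
        values = [evaluate(a, sprite) for a in block.args]
        block.method(sprite, *values)  # command blocks return nothing
        block = block.next_block


# Hypothetical block implementations; the sprite is just a dict here.
def move(sprite, steps):
    sprite["x"] += steps


def add(sprite, a, b):
    return a + b


def x_position(sprite):
    return sprite["x"]


sprite = {"x": 0}
# Script: move 10, then move ((x position) + 5).
second = Block("move", move,
               args=(Block("+", add, args=(Block("x position", x_position), 5)),))
first = Block("move", move, args=(10,), next_block=second)
run_script(first, sprite)
print(sprite["x"])
```

Here the context is passed as the first argument to every method, and each block's arguments are resolved before its method runs.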
C blocks
C blocks have a slightly different procedure. These are the if <> style, and the repeat <> ones. In addition to their ordinary arguments, they reference their "miniscript" subroutine.
For a simple if/else C block, just execute the subroutine normally if applicable.
When dealing with loops though, you have to make sure to thread properly, and wait for other scripts.
Events
Keypress/click events can be dealt with easily enough. Just execute them on keypress/click.
Something like broadcasts can be done by executing the hat when the broadcast stack is run.
Other events you'll have to work out on your own.
Wait blocks
This, along with threading, is the most confusing part of the interpretation to me. Basically, you need to figure out when to continue with the script. Perhaps set a timer to execute after the time, but you still need to thread properly.
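One common way to get this kind of cooperative threading is to write each script as a generator that yields whenever other scripts should get a turn. The sketch below is in Python and is only an assumption about how it could be done, not a description of how Scratch does it. It counts scheduler ticks instead of real time, and wait, script_a, script_b and run_all are names made up for the example.

```python
def wait(ticks):
    """Wait block: yield once per tick so other scripts can run meanwhile."""
    for _ in range(ticks):
        yield


def script_a(log):
    log.append("a1")
    yield from wait(2)
    log.append("a2")


def script_b(log):
    for i in range(3):      # a repeat block that yields after each iteration
        log.append(f"b{i}")
        yield


def run_all(scripts):
    """Round-robin scheduler: advance every live script by one step per tick."""
    active = list(scripts)
    while active:
        for script in list(active):
            try:
                next(script)
            except StopIteration:
                active.remove(script)  # script has finished


log = []
run_all([script_a(log), script_b(log)])
print(log)
```

The log shows the two scripts interleaving: script_b keeps running while script_a is waiting, and script_a resumes once its ticks have elapsed.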
I hope this helps!

MATLAB takes a long time after last line of a function

I have a function that's taking a long time to run. When I profile it, I find that over half the time (26 out of 50 seconds) is not accounted for in the line by line timing breakdown, and I can show that the time is spent after the function finishes running but before it returns control by the following method:
ts1 = tic;
disp ('calling function');
functionCall(args);
disp (['control returned to caller - ', num2str(toc(ts1))]);
The first line of the function I call is ts2 = tic, and the last line is
disp (['last line of function- ', num2str(toc(ts2))]);
The result is
calling function
last line of function - 24.0043
control returned to caller - 49.857
Poking around on the interwebs, I think this is a symptom of the way MATLAB manages memory. It deallocates on function returns, and sometimes this takes a long time. The function does allocate some large (~1 million element) arrays. It also works with handles, but does not create any new handle objects or store handles explicitly. My questions are:
Is this definitely a memory management problem?
Is there any systematic way to diagnose what causes a problem in this function, as opposed to others which return quickly?
Are there general tips for reducing the amount of time MATLAB spends cleaning up on a function exit?
You are right, it seems to be the time spent on garbage collection. I am afraid it is a fundamental MATLAB flaw; it has been known for years, but MathWorks has not solved it even in the newest MATLAB version, R2010b.
You could try setting variables manually to [] before leaving the function, i.e. doing garbage collection manually. This technique also helps against memory leaks in previous MATLAB versions. MATLAB will then spend the time not on end but on myVar=[];
You could alleviate the problem by working without any kind of references (anonymous functions, nested functions, handle classes) and by not using cellfun and arrayfun.
If you have arrived at the "performance barrier" of MATLAB then maybe you should simply change the environment. I do not see any sense anyway in starting a new project in MATLAB today unless you are using Simulink. Python rocks for technical computing, and with C# you can also do many of the things MATLAB does using free libraries. And both are real programming languages and are free, unlike MATLAB.
I discovered a fix to my specific problem that may be applicable in general.
The function that was taking a long time to exit was called on a basic object that contained a vector of handle objects. When I changed the definition of the basic object to extend handle, I eliminated the lag on the close of the function.
What I believe was happening is this: When I passed the basic object to my function, it created a copy of that object (MATLAB is pass by value by default). This doesn't take a lot of time, but when the function exited, it destroyed the object copy, which caused it to look through the vector of handle objects to make sure there weren't any orphans that needed to be cleaned up. I believe it is this operation that was taking MATLAB a long time.
When I changed the object I was passing to a handle, no copy was made in the function workspace, so no cleanup of the object was required at the end.
This suggests a general rule to me:
If a function is taking a long time to clean up its workspace on exiting and you are passing a lot of data or complex structures by value, try encapsulating the arguments to that function in a handle object.
This will avoid duplication and hence time-consuming cleanup on exit. The downside is that your function can now unexpectedly change your inputs, because MATLAB doesn't have the ability to declare an argument const, as in C++.
A simple fix could be this: pre-allocate the large arrays and pass them as args to your functionCall(). This moves the deallocation issue back to the caller of functionCall(), but it could be that you are calling functionCall more often than its parent, in which case this will speed up your code.
workArr = zeros(1,1e6); % allocate once
...
functionCall(args,workArr); % call with extra argument
...
functionCall(args,workArr); % call again, no realloc of workArr needed
...
Inside functionCall you can take care of initializing and/or re-setting workArr, for instance
workArr(:) = 0; % reset work array