I just read the following article from MathWorks which describes why it is important to avoid the eval function and lists alternatives to many of eval's common uses.
After reading the article, I have the impression that the eval function is neither useful nor necessary. So, my question is this: When is the eval function necessary?
I have found only one useful case for eval, or rather for its evalc variant: calling a function with built-in command-line output (e.g. lines without ; or with disp calls) that you cannot modify. For instance, when you have some obfuscated function that dumps heaps of stuff to your command window. Even then it is best to try to obtain the source code and modify it to your needs, as using evalc will hurt your performance. Otherwise, I have not found a case where eval is the best solution.
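A minimal sketch of that evalc use (noisyFunction is a hypothetical function that spams the command window and whose source you cannot edit):

dump = evalc('y = noisyFunction(42);');   % captured text lands in dump
% y now holds the return value; the command-window output is suppressed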
I wrote an extensive answer detailing why you should try to avoid eval as much as possible here: How to put these images together?
I have used eval when trying to create multiple arrays with different names. This is not really recommended, but it worked for my specific application. For example, if I wanted N matrices with the specific names "matrix1", "matrix2", ..., "matrixN", one solution would be to type these in manually as "matrix1 = something", ..., "matrixN = somethingelse". If N is really large, this is not ideal. Using eval, you can set up a for loop that changes the name of the matrix on every iteration and computes some value based on that same N.
Of course, ideally saving them into a cell array would be better, but I needed the arrays in the format I described.
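A sketch of the pattern described above (not recommended, but it works; rand(k) is just a placeholder computation):

N = 5;
for k = 1:N
    eval(sprintf('matrix%d = rand(k);', k));   % creates matrix1 ... matrix5
end

The cell-array alternative would simply be matrices{k} = rand(k); inside the same loop, with no eval needed.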
My problem is the following:
I have very many (~1000) mutually calling MATLAB scripts, which are very poorly written, regularly damage each other's environments, and have generally become unmanageable.
One of the reasons I even face this problem is that I need to write a test suite covering a big part of them. Luckily, for most of them the main criterion of 'correctness' is 'they don't crash'.
Just running them one by one in a loop is generally not an option, because they regularly call clear classes, close all, clc, shadow built-in functions and operators, et cetera.
So my original aim was to find a way to run a matlab script in sort of an 'isolated environment', but I didn't find a good way to do it. (Suggestions welcome, but it is not the main question.)
Since I will need to convert them all to functions anyway, I am looking for some way to do it auto-magically, or at least semi-automatically.
What I mean by semi-automatically:
Just add a line function varargout = $filename( varargin ) as the first line of the file, and end as the last one. This will at least make them runnable as functions with feval and the like, and (more importantly) prevent them from damaging the test runner.
Do point 1 and scan the file for references to undeclared variables, adding them as function arguments. This should also be doable, since the names of the variables are known. It will not help with identifying output variables, but it will still be a lot of help. For example, we could pack the whole workspace into one big output structure.
Do a runtime version of point 2. This way the 'magical converter' can actually track execution environments (workspaces) and identify which variables are implicitly used as 'input arguments' of a script, and which are later used as 'output arguments'. This option looks EXPHARD, but for a small number of calls it should not be too bad in practice.
Point 1 I can implement myself using sed, and I can also get rid of all the clear classes and clc calls, but options 2 and 3 seem much harder. Is there anything at least remotely resembling options 2 or 3?
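For reference, point 1 can also be done from within MATLAB itself; a minimal sketch (the scripts folder name is an assumption, and files(k).folder requires R2016b or later):

files = dir(fullfile('scripts', '*.m'));
for k = 1:numel(files)
    p = fullfile(files(k).folder, files(k).name);
    [~, name] = fileparts(files(k).name);
    body = fileread(p);                    % original script text
    fid = fopen(p, 'w');
    fprintf(fid, 'function varargout = %s(varargin)\n', name);
    fwrite(fid, body);                     % body written verbatim, no escaping
    fprintf(fid, '\nend\n');
    fclose(fid);
end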
I have a rather bulky program that I've been running as a script from the MATLAB command line. I decided to clean it up a bit with some nested functions (I need to keep everything in one file), but in order for that to work it required me to also make the program itself a function. As a result, the program no longer runs in the base workspace like it did when it was a script. This means I no longer have access to the dozens of useful variables that used to remain after the program runs, which are important for extra calculations and information about the run.
The suggested workarounds I can find are to use assignin, evalin, define the variables as global, or set the output in the definition of the now function-ized program. None of these solutions appeal to me, however, and I would really like to find a way to force the workspace itself to base. Does any such workaround exist? Or is there any other way to do this that doesn't require me to manually define or label each specific variable I want to get out of the function?
Functions should clearly define their input and output variables. Organizing the code differently will make it much more difficult to understand and to modify later on. In the end, working with an unorthodox style will most likely cost you more time than investing in some restructuring.
If you have a huge number of output variables, I would suggest organizing them in structure arrays, which are easier to handle as output variables.
The only untidy workaround I can imagine would use whos, assignin and eval:
function your_function()
x = 'hello';
y = 'world';
variables = whos;                          % snapshot of this workspace
for k = 1:length(variables)
    % copy each local variable into the base workspace under its own name
    assignin('base', variables(k).name, eval(variables(k).name));
end
end
But I doubt that this will help with your aim of cleaning up your program. As mentioned above, I suggest organizing things manually in structures:
function out = your_function()
x = 'hello';
y = 'world';
out.x = x;    % collect everything you need afterwards in one struct
out.y = y;
end
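From the command line this then gives you everything back in one variable:

out = your_function();
disp(out.x)    % 'hello'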
If the functions you would like to define are simple and have a single output, one option is to use anonymous functions.
Another option is to store all the variables you would like to use afterwards in a struct and have your big function return this struct as an output.
function AllVariables = GlobalFunction(varargin)
% bunch of stuff
AllVariables = struct('Variable1', Variable1, 'Variable2', Variable2, …);
end
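Usage then looks like this (Variable1 being whichever hypothetical field you stored above):

AllVariables = GlobalFunction();
AllVariables.Variable1    % retrieve any stored variable by field name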
Here is a rough example of what I am thinking of:
test = 'x > 0';
while str2func(test)
    % do your thing
    x = x - 1;
end
Is it possible to store whole logical operations in a variable like this?
Of course, str2func will break here; if this is possible at all, the right function is probably something else. And I have only put the test variable's content in quotes because I cannot think of what else the storage method would be.
I can see it being useful when sending arguments to functions and the like. But mostly I'm just wondering, because I have never seen it done in any programming language before.
You can store the textual representation of a function in a variable and evaluate it, for example
test = 'x > 0';
eval(test)
should result in 1 or 0 depending on x's value.
But you shouldn't use eval for reasons too-often covered here on SO for me to bother repeating. You should instead become familiar with functions and function handles. For example
test = @(x) x > 0
makes test a handle to a function which tests whether its argument is greater than 0 or not.
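You can then apply it directly, no eval needed:

test = @(x) x > 0;   % a function handle instead of a string
test(5)              % ans = logical 1
test(-2)             % ans = logical 0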
Many languages which are interpreted at run-time, as opposed to compiled languages, have similar capabilities.
I came across the use of the function eval(expression) in somebody else's MATLAB code, for example:
for n = 1:4
    sn = int2str(n);
    eval(['saveas(fig' sn ', [sName' sn '], ''fig'')']);
end
The MathWorks staff pointed out in the MATLAB Help that:
Many common uses of the eval function are less efficient and are more difficult to read and debug than other MATLAB functions and language constructs.
After this, I found this function in many other programming languages as well, such as Python, JavaScript, and PHP.
So I have a few questions:
Will use of this function influence the performance of my code?
If it slows down execution, why does that happen?
If it slows down execution every time it is called, what is the reason to use this function in principle?
The eval function is dangerous and there are very few examples where you actually need it. For example, the code you gave can easily be rewritten if you store the figure handles in an array fig(1), fig(2) etc and write
for n = 1:4
    filename = sprintf('sName%d', n);
    saveas(fig(n), filename, 'fig');
end
which is clearer, uses fewer characters, can be analysed by the Matlab editor's linter, is more easily modifiable if (when) you need to extend the code, and is less prone to weird bugs.
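For completeness, the fig array itself would be built up when the figures are created, along the lines of (a sketch; gobjects needs R2013a or later):

fig = gobjects(1, 4);    % preallocate an array of graphics handles
for n = 1:4
    fig(n) = figure;     % handles are indexed, not baked into variable names
end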
As a rule of thumb, you should never use eval in any language unless you really know what you are doing (i.e. you are writing a complicated Lisp macro or something else equivalent to manipulating the AST of the language - if you don't know what that means, you probably don't need to use eval).
There are almost always clearer, more efficient and less dangerous ways to achieve the same result. Often, a call to eval can be replaced with some form of recursion, a higher-order function or a loop.
Using eval here will certainly be slower than a non-eval version, but most likely it won't be a bottleneck in your code. However, performance is only one issue; maintenance (incl. debugging) and readability are others.
The slowdown occurs because Matlab uses a JIT compiler, and eval lines cannot be optimized.
In most cases, eval is used due to a lack of knowledge of the MATLAB functionality that would be appropriate instead. In this particular case, the issue is that the figure handles are stored in variables named fig1 through fig4. If they had been stored in an array called fig, i.e. fig(1) etc., eval would have been unnecessary.
EDIT Here are two excellent articles by Loren Shure about why eval should be avoided in Matlab. Evading eval, and More on eval.
For the most part, the slowdown occurs because the string has to be parsed into actual code. This isn't such a major issue if used sparingly, but if you ever find yourself using it in code that loops (either an explicit loop or things like JavaScript's setInterval()) then you're in for a major performance drop.
Common uses I've seen for eval that could be done better are:
Accessing property names out of ignorance of [] notation
Calling functions based on an argument name, which could instead be done with a switch (safer, prevents risk of code injection)
Accessing variables named like var1, var2, var3 when they should be arrays instead (see the sketch after this list)
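In MATLAB, that last pattern can be replaced by dynamic field names or a plain array; a minimal sketch:

s = struct();
for k = 1:3
    s.(sprintf('var%d', k)) = k^2;   % dynamic field name, no eval required
end
disp(s.var2)                         % prints 4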
To be honest, I don't think I've ever encountered a single situation where eval is the only way to solve a problem. I guess you could liken it to goto in that it's a substitute for program structure, or useful as a temporary solution to test a program before spending the time making it work optimally.
Here is another implication:
When you compile a program that uses eval (e.g. with MATLAB Compiler), you must add pragmas that tell the compiler that certain functions are needed, because its dependency analysis cannot see function names hidden inside eval strings. For example:
This code will compile and run well:
foo()
But this one needs a pragma added:
%#function foo
eval('foo()')
Otherwise you will encounter a runtime problem.
I'm producing a function for imenu-create-index-function, to index a source code module, for csharp-mode.el
It works, but delivers completely unacceptable performance. Any tips for fixing this?
The Background
I looked at js.el, the rebadged "espresso" that has been included in Emacs since v23.2. It indexes JavaScript files very nicely and does a good job with anonymous functions and the various coding styles and patterns in common use. For example, in JavaScript one can do:
(function() {
    var x = ... ;
    function foo() {
        if (x == 1) ...
    }
})();
...to define a scope where x is "private" or inaccessible from other code. This gets indexed nicely by js.el, using regexps, and it indexes the inner functions (anonymous or not) within that scope also. It works quickly. A big module can be indexed in less than a second.
I tried following a similar approach in csharp-mode, but it's quite a bit more complicated. In Js, everything that gets indexed is a function. So the starting regex is "function" with some elaboration on either end. Once an occurrence of the function keyword is found, then there are 4 - 8 other regexps that get tried via looking-at - the number depends on settings. One nice thing about js mode is that you can turn on or off regexps for various coding styles, to speed things along I suppose. The default "styles" work for most of the code I tried.
This approach doesn't carry over to csharp-mode. It works, but it performs poorly enough to make it not very usable. I think the reasons for this are:
there is no single marker keyword in C# playing the role that function plays in JavaScript. In C# I need to look for namespace, class, struct, interface, enum, and so on.
there's a great deal of flexibility in how C# constructs can be defined. As one example, a class can declare base classes as well as implemented interfaces. Another example: the return type of a method isn't a simple word-like string, but can be something messy like Dictionary<String, List<String>>. The index routine needs to handle all those cases and capture the matches. This makes it run sloooooowly.
I use a lot of looking-back. The marker I use in the current approach is the open curly brace. Once I find one of those, I use looking-back to determine if the curly is a class, interface, enum, method, etc. I read that looking-back can be slow; I'm not clear on how much slower it is than, say, looking-at.
once I find an open-close pair of curlies, I call narrow-to-region in order to index what's inside. I am not sure whether this kills performance or not. I suspect it is not the main culprit, because the perf problems I see happen in modules with one namespace and 2 or 3 classes, which means narrow gets called only 3 or 4 times in total.
What's the Question?
My question is: do you have any tips for speeding up imenu-like indexing in a C# buffer?
I'm considering:
avoiding looking-back. I don't know exactly how to do this, because when re-search-forward finds, say, the keyword class, the cursor is already in the middle of a class declaration; looking-back seems essential.
instead of using open-curly as the marker, use the keywords like enum, interface, namespace, class
avoid narrow-to-region
any hard advice? Further suggestions?
Something I've tried and I'm not really enthused about re-visiting: building a wisent-based parser for C#, and relying on semantic to do the indexing. I found semantic to be very very very (etc) difficult to use, hard to discover, and problematic. I had semantic working for a while, but then upgraded to v23.2, and it broke, and I never could get it working again. Simple things - like indexing the namespace keyword - took a very long time to solve. I'm very dissatisfied with it and don't want to try again.
I don't really know C# syntax, and without looking at your elisp it's hard to give an answer, but here goes anyway.
looking-back can be deadly slow. It's the first thing I'd experiment with. One thing that helps a lot is using the limit arg to, say, restrict your search to the beginning of the current line. A different approach is when you hit the open curly do backward-char then backward-sexp (or whatever) to get to the front of the previous word, then use looking-at.
Using keywords to search around instead of open curly is probably what I would have done. Maybe something like (re-search-forward "\\(enum\\|interface\\|namespace\\|class\\)[ \t\n]*{" nil t) then using match-string-no-properties on the first capture group to see which of the keywords was found. This might help with the looking-back problem as well.
I don't know how expensive narrow-to-region is, but it could be avoided: when you find an open curly, use save-excursion and forward-sexp, and keep that point as a limit for the current iteration of your (I assume recursive) searches.