At which lines in my MATLAB code a variable is accessed? - matlab

I am defining a variable in the beginning of my source code in MATLAB. Now I would like to know at which lines this variable effects something. In other words, I would like to see all lines in which that variable is read out. This wish does not only include all accesses in the current function, but also possible accesses in sub-functions that use this variable as an input argument. In this way, I can see in a quick way where my change of this variable takes any influence.
Is there any possibility to do so in MATLAB? A graphical marking of the corresponding lines would be nice but a command line output might be even more practical.

You may always use "Find Files" to search for a certain keyword or expression. In my R2012a/Windows version is in Edit > Find Files..., with the keyboard shortcut [CTRL] + [SHIFT] + [F].
The result will be a list of lines where the searched string is found, in all the files found in the specified folder. Please check out the options in the search dialog for more details and flexibility.
Later edit: thanks to #zinjaai, I noticed that #tc88 required that this tool should track the effect of the name of the variable inside the functions/subfunctions. I think this is:
very difficult to achieve. The problem of running trough all the possible values and branching on every possible conditional expression is... well is hard. I think is halting-problem-hard.
in 90% of the case the assumption that the output of a function is influenced by the input is true. But the input and the output are part of the same statement (assigning the result of a function) so looking for where the variable is used as argument should suffice to identify what output variables are affected..
There are perverse cases where functions will alter arguments that are handle-type (because the argument is not copied, but referenced). This side-effect will break the assumption 2, and is one of the main reasons why 1. Outlining the cases when these side effects take place is again, hard, and is better to assume that all of them are modified.
Some other cases are inherently undecidable, because they don't depend on the computer states, but on the state of the "outside world". Example: suppose one calls uigetfile. The function returns a char type when the user selects a file, and a double type for the case when the user chooses not to select a file. Obviously the two cases will be treated differently. How could you know which variables are created/modified before the user deciding?
In conclusion: I think that human intuition, plus the MATLAB Debugger (for run time), and the Find Files (for quick search where a variable is used) and depfun (for quick identification of function dependence) is way cheaper. But I would like to be wrong. :-)

Related

Lisp - Extracting info from a list of comma separated values

I've tried searching this but have yet to find something that suits anything close to my needs. I'm trying to create a Autocad LISP that takes a text file, which is a list of comma-separated values, and place a block at coordinates defined by the list. BUT, only for items on the list where the last entry starts with "HP"
So that's sounds a bit complex, but the text file is basically a UTM survey output, and looks like this:
1000,Easting,Northing,Elevation,Identifier
1001,Easting,Northing,Elevation,Identifier
Etc.
The identifier is a variety of values, but I want to extract the Northing,Easting,Elevation, and insert a block (this last part I've got) at that location when the identifier begins with "HP". The list can be long and the number of HPs can be 1 or 5000. I'm assuming there's a "for x=1:end, do" type of loop than can be made that reuses the same variables over and over.
I'm a newbie to LISP so I'm stuck in that spot between "here are I've-never-programmed-before tutorials to make hello world" and "here is a library of the 3000 different commands in alphabetical order"
I believe the functions you are needing to solve this question are open, read-line or read-char, close,strlen, and substr. The first four functions relate to AutoLisp writing and reading a file. The last two functions manipulate the string variables that were pulled from the file. With them, you can find the "HP" within the text. To loop through the same code, three come to my mind: repeat, while, and foreach.
For a list of variables to quickly reference with their descriptions, here's a good starting point. This particular page has the information broken up by category instead of alphabetical order.
https://help.solidworks.com/2022/English/api/draftsightlispreference/html/lisp_functions_overview.htm
Here are a few tutorials where AutoLisp code is used to write and read other files:
https://www.afralisp.net/autolisp/tutorials/file-handling.php
https://www.afralisp.net/autolisp/tutorials/external-data.php
Lastly, here's an example of AutoLisp writing and reading attributes from and to blocks.
https://github.com/GitHubUser5376/AttributeImportExport
You can use Lee-Mac's Reacd-CSV function to get a list of the csv values.
And for the "HP" detection yes you might have to go through(using loop options mentioned above like while, repeat,foreach) each and use
(substr Identifier 1 2)
to validate

How to automatically convert a Matlab script into a Matlab function?

My problem is the following:
I have very many (~1000) mutually calling Matlab scripts, which are very poorly written, regularly damage each other's environments and generally became unmanageable.
One of the reasons I even got this problem is that I need to write a testsuite covering a big part of them. Luckily, for most of them the main criterion of 'correctness' is 'they don't crash'.
Just running them one by one in a loop is generally not an option, because they regularly call clear classes, close all, clc, shadow built-in functions and operators, et cetera.
So my original aim was to find a way to run a matlab script in sort of an 'isolated environment', but I didn't find a good way to do it. (Suggestions welcome, but it is not the main question.)
Since I will need to convert them all to functions anyway, I am looking for some way to do it auto-magically, or at least semi-automatically.
What I can mean semi-automatically:
Just add a line function varargout = $filename( varargin ) as the first line of the file, and end as the last one. This will at least make them runnable as functions with feval and all such functions and (more importantly) prevent them from damaging the test-runner.
Do point 1 and scan the file for referencing undeclared variables and add them as function arguments. This should be also doable, since the names of the variables are known. This will not help identifying output variables, but will still be a lot of help. For example, we could pack the whole workspace into one big output structure.
Do a runtime version of point 2. This way the 'magical converter' can actually track execution environments (workspaces) and identify which variables are implicitly used as 'input arguments' of a script, and which would be later used 'output arguments'. This option looks EXPHARD, but for a small number of calls should be not too bad in practice.
Point 1 I can implement myself using sed, as I also can get rid of all clear classes and clc, but the options 2 and 3 seem much harder. Is there anything at least remotely resembling options 2 or 3?

Can the MATLAB editor show the file from which text is displayed? [duplicate]

In MATLAB, how do you tell where in the code a variable is getting output?
I have about 10K lines of MATLAB code with about 4 people working on it. Somewhere, someone has dumped a variable in a MATLAB script in the typical way:
foo
Unfortunately, I do not know what variable is getting output. And the output is cluttering out other more important outputs.
Any ideas?
p.s. Anyone ever try overwriting Standard.out? Since MATLAB and Java integration is so tight, would that work? A trick I've used in Java when faced with this problem is to replace Standard.out with my own version.
Ooh, I hate this too. I wish Matlab had a "dbstop if display" to stop on exactly this.
The mlint traversal from weiyin is a good idea. Mlint can't see dynamic code, though, such as arguments to eval() or string-valued figure handle callbacks. I've run in to output like this in callbacks like this, where update_table() returns something in some conditions.
uicontrol('Style','pushbutton', 'Callback','update_table')
You can "duck-punch" a method in to built-in types to give you a hook for dbstop. In a directory on your Matlab path, create a new directory named "#double", and make a #double/display.m file like this.
function display(varargin)
builtin('display', varargin{:});
Then you can do
dbstop in double/display at 2
and run your code. Now you'll be dropped in to the debugger whenever display is implicitly called by the omitted semicolon, including from dynamic code. Doing it for #double seems to cover char and cells as well. If it's a different type being displayed, you may have to experiment.
You could probably override the built-in disp() the same way. I think this would be analagous to a custom replacement for Java's System.out stream.
Needless to say, adding methods to built-in types is nonstandard, unsupported, very error-prone, and something to be very wary of outside a debugging session.
This is a typical pattern that mLint will help you find:
So, look on the right hand side of the editor for the orange lines. This will help you find not only this optimization, but many, many more. Notice also that your variable name is highlighted.
If you have a line such as:
foo = 2
and there is no ";" on the end, then the output will be dumped to the screen with the variable name appearing first:
foo =
2
In this case, you should search the file for the string "foo =" and find the line missing a ";".
If you are seeing output with no variable name appearing, then the output is probably being dumped to the screen using either the DISP or FPRINTF function. Searching the file for "disp" or "fprintf" should help you find where the data is being displayed.
If you are seeing output with the variable name "ans" appearing, this is a case when a computation is being done, not being put in a variable, and is missing a ';' at the end of the line, such as:
size(foo)
In general, this is a bad practice for displaying what's going on in the code, since (as you have found out) it can be hard to find where these have been placed in a large piece of code. In this case, the easiest way to find the offending line is to use MLINT, as other answers have suggested.
I like the idea of "dbstop if display", however this is not a dbstop option that i know of.
If all else fails, there is still hope. Mlint is a good idea, but if there are many thousands of lines and many functions, then you may never find the offender. Worse, if this code has been sloppily written, there will be zillions of mlint flags that appear. How will you narrow it down?
A solution is to display your way there. I would overload the display function. Only temporarily, but this will work. If the output is being dumped to the command line as
ans =
stuff
or as
foo =
stuff
Then it has been written out with display. If it is coming out as just
stuff
then disp is the culprit. Why does it matter? Overload the offender. Create a new directory in some directory that is on top of your MATLAB search path, called #double (assuming that the output is a double variable. If it is character, then you will need an #char directory.) Do NOT put the #double directory itself on the MATLAB search path, just put it in some directory that is on your path.
Inside this directory, put a new m-file called disp.m or display.m, depending upon your determination of what has done the command line output. The contents of the m-file will be a call to the function builtin, which will allow you to then call the builtin version of disp or display on the input.
Now, set a debugging point inside the new function. Every time output is generated to the screen, this function will be called. If there are multiple events, you may need to use the debugger to allow processing to proceed until the offender has been trapped. Eventually, this process will trap the offensive line. Remember, you are in the debugger! Use the debugger to determine which function called disp, and where. You can step out of disp or display, or just look at the contents of dbstack to see what has happened.
When all is done and the problem repaired, delete this extra directory, and the disp/display function you put in it.
You could run mlint as a function and interpret the results.
>> I = mlint('filename','-struct');
>> isErrorMessage = arrayfun(#(S)strcmp(S.message,...
'Terminate statement with semicolon to suppress output (in functions).'),I);
>>I(isErrorMessage ).line
This will only find missing semicolons in that single file. So this would have to be run on a list of files (functions) that are called from some main function.
If you wanted to find calls to disp() or fprintf() you would need to read in the text of the file and use regular expresions to find the calls.
Note: If you are using a script instead of a function you will need to change the above message to read: 'Terminate statement with semicolon to suppress output (in scripts).'
Andrew Janke's overloading is a very useful tip
the only other thing is instead of using dbstop I find the following works better, for the simple reason that putting a stop in display.m will cause execution to pause, every time display.m is called, even if nothing is written.
This way, the stop will only be triggered when display is called to write a non null string, and you won't have to step through a potentially very large number of useless display calls
function display(varargin)
builtin('display', varargin{:});
if isempty(varargin{1})==0
keyboard
end
A foolproof way of locating such things is to iteratively step through the code in the debugger observing the output. This would proceed as follows:
Add a break point at the first line of the highest level script/function which produces the undesired output. Run the function/script.
step over the lines (not stepping in) until you see the undesired output.
When you find the line/function which produces the output, either fix it, if it's in this file, or open the subfunction/script which is producing the output. Remove the break point from the higher level function, and put a break point in the first line of the lower-level function. Repeat from step 1 until the line producing the output is located.
Although a pain, you will find the line relatively quickly this way unless you have huge functions/scripts, which is bad practice anyway. If the scripts are like this you could use a sort of partitioning approach to locate the line in the function in a similar manner. This would involve putting a break point at the start, then one half way though and noting which half of the function produces the output, then halving again and so on until the line is located.
I had this problem with much smaller code and it's a bugger, so even though the OP found their solution, I'll post a small cheat I learned.
1) In the Matlab command prompt, turn on 'more'.
more on
2) Resize the prompt-y/terminal-y part of the window to a mere line of text in height.
3) Run the code. It will stop wherever it needed to print, as there isn't the space to print it ( more is blocking on a [space] or [down] press ).
4) Press [ctrl]-[C] to kill your program at the spot where it couldn't print.
5) Return your prompt-y area to normal size. Starting at the top of trace, click on the clickable bits in the red text. These are your potential culprits. (Of course, you may need to have pressed [down], etc, to pass parts where the code was actually intended to print things.)
You'll need to traverse all your m-files (probably using a recursive function, or unix('find -type f -iname *.m') ). Call mlint on each filename:
r = mlint(filename);
r will be a (possibly empty) structure with a message field. Look for the message that starts with "Terminate statement with semicolon to suppress output".

Using variables defined in R environment within knitr

I'd like to use knitr to produce a document which includes numbers, tables, and figures defined separately in R. I have a separate .R script within which I define the variables of interest, and I have executed this script and verified the variables are in memory. Then, within a .Rmd file I have the R markup code, within which I attempt to display the variables I've defined in R. I keep getting error messages whenever I attempt to knit, stating something like:
Error in unique(c("AsIs", oldClass(x))) : object "lamdaQ6" not found...
Clearly the knitr process initiates a new environment which excludes already defined variables. I have rather extensive R code to define the variables I want to include in the document, which I want to keep separate from the R markup code (both for clarity, and because R markup is not a good development environment for R).
Is there some means of preserving awareness of existing variables in R memory within knitr? I have searched extensively and not found a solution, probably because I don't know the correct term.
The solution offered by Paul Roub worked -- I thought it didn't, but this turned-out to be due to my own typo.
A cleaner solution for my needs, rather than source the whole R file, was to use "save" to save just the objects I needed to create the document. I added a line to save the objects at the end of the .R file, then used "load" to load the objects at the top of the .Rmd script.
So, at the end of the .R file, to save objects XX and YY, I have:
save(XX,YY,file="Data4Rmd.RData")
Then, near the beginning of the .Rmd file, to import these objects, I have:
```
{r,echo=FALSE}
setwd("c:\CurrentProjectDirectory")
load(file="Data4Rmd.RData")
```
This allows me to later use statements like:
Decay rate is lambda = `r XX`, and mean lifetime is `r YY`.
Yet another alternative might have been to save and load the whole workspace, but this seemed excessive.
Thanks for the assistance Paul, much appreciated :)

Using regexp to index a file for imenu, performance is unacceptable

I'm producing a function for imenu-create-index-function, to index a source code module, for csharp-mode.el
It works, but delivers completely unacceptable performance. Any tips for fixing this?
The Background
I looked at js.el, which is the rebadged "espresso" now included, since v23.2, into emacs. It indexes Javascript files very nicely, does a good job with anonymous functions and various coding styles and patterns in common use. For example, in javascript one can do:
(function() {
var x = ... ;
function foo() {
if (x == 1) ...
}
})();
...to define a scope where x is "private" or inaccessible from other code. This gets indexed nicely by js.el, using regexps, and it indexes the inner functions (anonymous or not) within that scope also. It works quickly. A big module can be indexed in less than a second.
I tried following a similar approach in csharp-mode, but it's quite a bit more complicated. In Js, everything that gets indexed is a function. So the starting regex is "function" with some elaboration on either end. Once an occurrence of the function keyword is found, then there are 4 - 8 other regexps that get tried via looking-at - the number depends on settings. One nice thing about js mode is that you can turn on or off regexps for various coding styles, to speed things along I suppose. The default "styles" work for most of the code I tried.
This doesn't work in csharp-mode. It works, but it performs poorly enough to make it not very usable. I think the reason for this is that
there is no single marker keyword in C#, as function behaves in javascript. In C# I need to look for namespace, class, struct, interface, enum, and so on.
there's a great deal of flexibility with which csharp constructs can be defined. As one example, a class can define base classes as well as implemented interfaces. Another example: The return type for a method isn't a simple word-like string, but can be something messy like Dictionary<String, List<String>> . The index routine needs to handle all those cases, and capture the matches. This makes it run sloooooowly.
I use a lot of looking-back. The marker I use in the current approach is the open curly brace. Once I find one of those, I use looking-back to determine if the curly is a class, interface, enum, method, etc. I read that looking-back can be slow; I'm not clear on how much slower it is than, say, looking-at.
once I find an open-close pair of curlies, I call narrow-to-region in order to index what's inside. not sure if this is will kill performance or not. I suspect that it is not the main culprit, because the perf problems I see happen in modules with one namespace and 2 or 3 classes, which means narrow gets called 3 or 4 times total.
What's the Question?
My question is: do you have any tips for speeding up imenu-like indexing in a C# buffer?
I'm considering:
avoiding looking-back. I don't know exactly how to do this because when re-search-forward finds, say, the keyword class, the cursor is already in the middle of a class declaration. looking-back seems essential.
instead of using open-curly as the marker, use the keywords like enum, interface, namespace, class
avoid narrow-to-region
any hard advice? Further suggestions?
Something I've tried and I'm not really enthused about re-visiting: building a wisent-based parser for C#, and relying on semantic to do the indexing. I found semantic to be very very very (etc) difficult to use, hard to discover, and problematic. I had semantic working for a while, but then upgraded to v23.2, and it broke, and I never could get it working again. Simple things - like indexing the namespace keyword - took a very long time to solve. I'm very dissatisfied with it and don't want to try again.
I don't really know C# syntax, and without looking at your elisp it's hard to give an answer, but here goes anyway.
looking-back can be deadly slow. It's the first thing I'd experiment with. One thing that helps a lot is using the limit arg to, say, restrict your search to the beginning of the current line. A different approach is when you hit the open curly do backward-char then backward-sexp (or whatever) to get to the front of the previous word, then use looking-at.
Using keywords to search around instead of open curly is probably what I would have done. Maybe something like (re-search-forward "\\(enum\\|interface\\|namespace\\|class\\)[ \t\n]*{" nil t) then using match-string-no-properties on the first capture group to see which of the keywords was found. This might help with the looking-back problem as well.
I don't know how expensive narrow-to-region is, but could be avoided by when you find a open curly do save-excursion forward-sexp and keep point as a limit for the current iteration of your (I assume recursive) searches.