How to identify the script that produced a specific workspace variable? - matlab

I am learning about an existing code, which produces lots of different variables. My goal is to identify the variable in the workspace and localize the script that produced this variable.
I need to localize the script, because in the code I have a script that calls the other three scripts. For this reason it is difficult to identify the script that produced a specific variable. Besides, the three codes are very long.
How can I identify the source script based just on the workspace variables?

I recently had a similar problem, so I hacked a quick function together, which will detect newly created variables, based on an initial state.
function names2 = findNewVariables(state)
persistent names1
if state == 1
% store variables currently in caller workspace
names1 = evalin('caller', 'who');
names2 = [];
elseif state == 2
% which variables are in the caller workspace in the second call
names2 = evalin('caller', 'who');
% find which variables are new, and filter previously stored
ids = ismember(names2,names1) ~= 1;
names2(~ids) = [];
names2(strcmp(names2, 'names1')) = [];
names2(strcmp(names2, 'names2')) = [];
names2(strcmp(names2, 'ans')) = [];
end
To use this, first initialize the function with argument 1 to get the variables currently in the workspace: findNewVariables(1). Then run some code, script, whatever, which will create some variables in the workspace. Then call the function again, and store its output as follows: new_vars = findNewVariables(2). new_vars is a cell array that contains the names of the newly created variables.
Example:
% make sure the workspace is empty at the start
clear
a = 1;
% initialize the function
findNewVariables(1);
test % script that creates b, c, d;
% store newly created variable names
new_vars = findNewVariables(2);
Which will result in:
>> new_vars
new_vars =
3×1 cell array
{'b'}
{'c'}
{'d'}
Note, this will only detect newly created variables (hence a clear is required at the start of the script), and not updated/overwritten variables.

You could use exist for this. Roughly:
assert( ~exist('varName', 'var'), 'Variable exists!');
script1(in1, in2);
assert( ~exist('varName', 'var'), 'Variable exists!');
script2(in1, in2);
assert( ~exist('varname', 'var'), 'Variable exists!');
When the assertion fails, it means that the variable was created.

Related

How can I run a MATLAB script on .csv files in two separate folders at the same time?

So I have an iterative loop that extracts data from .csv files in MATLAB's active folder and plots it. I would like to take it one step further and run the script on two folders, each with their own .csv files.
One folder is called stress and the other strain. As the name implies, they contain .csv files for stress and strain data for several samples, each of which is called E3-01, E3-02, E3-03, etc. In other words, both folders have the same number of files and the same names.
The way I see it, the process would have the following steps:
Look in the stress folder, look inside file E3-01, extract the data in the column labelled Stress
Look in the strain folder, look inside file E3-01, extract the data in the column labelled Strain
Combine the data together for sample E3-01 and plot it
Repeat steps 1-3 for all files in the folders
Like I said, I already have a script that can find the right column and extract the data. What I'm not sure about is how to tell MATLAB to alternate the folder that the script is being run on.
Instead of a script, would a function be better? Something that accepts 4 inputs: the names of the two folders and the columns to extract?
EDIT: Apologies, here's the code I have so far:
clearvars;
files = dir('*.csv');
prompt = {'Plot name:','x label:','y label:','x values:','y values:','Points to eliminate:'};
dlg_title = 'Input';
num_lines = 1;
defaultans = {'Title','x label','y label','Surface component 1.avg(epsY) [True strain]','Stress','0'};
answer = inputdlg(prompt,dlg_title,num_lines,defaultans);
name_plot = answer{1};
x_label = answer{2};
y_label = answer{3};
x_col = answer{4};
y_col = answer{5};
des_cols = {y_col,x_col};
smallest_n = 100000;
points_elim = answer{6};
avg_x_values = [];
avg_y_values = [];
for file = files'
M=xlsread(file.name);
[row,col]=size(M);
if smallest_n > row
smallest_n = row;
end
end
smallest_n=smallest_n-points_elim;
avg_x_values = zeros(smallest_n,size(files,1));
avg_y_values = zeros(smallest_n,size(files,1));
hold on;
set(groot, 'DefaultLegendInterpreter', 'none');
set(gca,'FontSize',20);
ii = 0;
for file = files'
ii = ii + 1;
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,x_col));
x_values = n(1:end-points_elim,col);
[row, col] = find(strcmpi(s,y_col));
y_values = n(1:end-points_elim,col);
plot(x_values,y_values,'DisplayName',s{1,1});
avg_x_values(:,ii)=x_values(1:smallest_n);
avg_y_values(:,ii)=y_values(1:smallest_n);
end
ylabel({y_label});
xlabel({x_label});
title({name_plot});
colormap(gray);
hold off;
avg_x_values = mean(avg_x_values,2);
avg_y_values = mean(avg_y_values,2);
plot(avg_x_values,avg_y_values);
set(gca,'FontSize',20);
ylabel({y_label});
xlabel({x_label});
title({name_plot});
EDIT 2: #Adriaan I tried to write the following function to get a column from a file:
function [out_col] = getcolumn(col,file)
file = dir(file);
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,col));
out_col = n(1:end,col);
end
but I get the error
Function 'subsindex' is not defined for values of class 'struct'.
Error in getcolumn (line 21)
y = x(:,n);
not sure why.
You can do both, of course, and it depends on preference mainly, provided you're the sole user of the script. If others are going to use it as well, use functions instead, as they can contain a proper help file and calling help functionname will then give you that help.
For instance:
folders1 = dir(../strain/*)
folders2 = dir(../stress/*)
for ii 1 = 1:numel(folders)
operand1 = folders1{ii};
operand2 = folders2{ii};
%... rest of script
%
% Or function:
data = YourFunction(folders1{ii},folders2{ii})
end
So all in all you can use both, although from experience I find functions easier to use in the end, as you just pass parameters and don't need to trawl through the complete code to change the parameters each run.
Additionally you can partition off small parts of your program which do a fix task. If you nest your functions, and finally call just a single function in your scripts, you don't have to look at hundreds of lines of code each time you run the script, but rather can just run a single function (which can also be inside a script or function, ad infinitum).
Finally, a function has its own scope; meaning that any variables that are in that function stay within that function unless you explicitly set them as output (apart from global variables, but those are problematic anyway). This can be a good thing, or a bad thing, depending on the rest of your code. If you function would output ~20 variables for further processing, the function probably should include more steps. It'd be a good thing if you create lots of intermediate variables (I always do), because when the function's finished running, the scope of that function will be removed from memory, saving you clear tmpVar1 tmpVar2 tmpVar3 etc every few lines in your script.
For the script the argument in favour would be that it is easier to debug; you don't need dbstop on error and can step a bit easier through the script, keeping check of all your variables. But, after the debugging has been completed, this argument becomes moot, and thus in general I'd start with writing a script, and once it performs as desired, I rework it to a function at minimal extra effort.

Will returned array be copied by value or returned as reference in MATLAB?

I wanted to ask how values in MATLAB are returned? Are they copied or passed by reference?
take a look at this example with matrix A:
function main
A = foo(10);
return;
end
function [resultMatrix] = foo(count)
resultMatrix = zeros(count, count);
return;
end
Does the copy operation take place when function returns matrix and assigns it to variable A ?
MATLAB uses a system known as copy-on-write in which a copy of the data is only made when it is necessary (i.e. when the data is modified). When returning a variable from a function, it is not modified between when it was created inside of the function and when it was stored in a different variable by the calling function. So in your case, you can think of the variable as being passed by reference. Once the data is modified, however, a copy will be made
You can check this behavior using format debug which will actually tell us the memory location of the data (detailed more in this post)
So if we modify your code slightly so that we print the memory location of each variable we can track when a copy is made
function main()
A = foo(10);
% Print the address of the variable A
fprintf('Address of A in calling function: %s\n', address(A));
% Modify A
B = A + 1;
% Print the address of the variable B
fprintf('Address of B in calling function: %s\n', address(B));
end
function result = foo(count)
result = zeros(count);
% Print the address of the variable inside of the function
fprintf('Address of result in foo: %s\n', address(result));
end
function loc = address(x)
% Store the current display format
fmt = get(0, 'format');
% Turn on debugging display and parse it
format debug
loc = regexp(evalc('disp(x)'), '(?<=pr\s*=\s*)[a-z0-9]*', 'match', 'once');
% Revert the display format to what it was
format(fmt);
end
And this yields the following (or similar) output
Address of result in foo: 7f96d9d591c0
Address of A in calling function: 7f96d9d591c0
Address of B in calling function: 7f96d9c74400
As a side-note, you don't need to explicitly use return in your case since the function will naturally return when it encounters the end. return is only necessary when you need to use it to alter the flow of your program and exit a function pre-maturely.

Matlab load mat into variable

When loading data from a .Mat file directly into a variable, it stores an struct instead of the variable itself.
Example:
myData.mat contains var1, var2, var3
if I do:
load myData.mat
it will create the variables var1, var2 and var3 in my workspace. OK.
If I assign what load returns to a variable, it stores an struct. This is normal since I'm loading several variables.
foo = load('myData.mat')
foo =
struct with fields:
var1
var2
var3
However suppose that I'm only interested in var1 and I want to directly store into a variable foo.
Load has an option of loading only specific variables from a .mat file, however it still stores an struct
foo = load('myData.mat', 'var1')
foo =
struct with fields:
var1
I want var1 to be directly assigned to foo.
Of course I can do:
foo = load('myData.mat', 'var1')
foo = foo.var1;
But it should be a way of doing this automatically in one line right?
If the MAT-file contains one variable, use
x = importdata(mat_file_name)
load does not behave this way otherwise load would behave inconsistently depending upon the number of variables that you have requested which would lead to an extremely confusing behavior.
To illustrate this, imagine that you wrote a general program that wanted to load all variables from a .mat file, make some modification to them, and then save them again. You want this program to work with any file so some files may have one variable and some may have multiple variables stored in them.
If load used the behavior you've specified, then you'd have to add in all sorts of logic to check how many variables were stored in a file before loading and modifying it.
Here is what this program would look like with the current behavior of load
function modifymyfile(filename)
data = load(filename);
fields = fieldnames(data);
for k = 1:numel(fields)
data.(fields{k}) = modify(data.(fields{k}));
end
save(filename, '-struct', 'data')
end
If the behavior was the way that you think you want
function modifymyfile(filename)
% Use a matfile to determine the number of variables
vars = whos(matfile(filename));
% If there is only one variable
if numel(vars) == 1
% Assign that variable (have to use eval)
tmp = load(filename, vars(1).name);
tmp = modify(tmp);
% Now to save it again, you have to use eval to reassign
eval([vars(1).name, '= tmp;']);
% Now resave
save(filename, vars(1).name);
else
data = load(filename);
fields = fieldnames(data);
for k = 1:numel(fields)
data.(fields{k}) = modify(data.(fields{k}));
end
save(filename, '-struct', 'data');
end
end
I'll leave it to the reader to decide which of these is more legible and robust.
The best way to do what you're trying to do is exactly what you've shown in your question. Simply reassign the value after loading
data = load('myfile.mat', 'var1');
data = data.var1;
Update
Even if you only wanted the variable to not be assigned to a struct when a variable was explicitly specified, you'd still end up with inconsistent behavior which would make it difficult if my program accepted a list of variables to change as a cell array
variables = {'var1', 'var2'}
data = load(filename, variables{:}); % Would yield a struct
variables = {'var1'};
data = load(filename, variables{:}); % Would not yield a struct
#Suever is right, but in case you wish for a one-line workaround this will do it:
foo = getfield(load('myData.mat'), 'var1');
It looks ugly but does what you want:
foo = subsref(matfile('myData.mat'),struct('type','.','subs','var1'))
Use matfile allows partial loading of variables into memory i.e. it only loads what is necessary. The function subsref does the job of the indexing operator "." in this case.

MATLAB: Loop through the values of a list from 'who' function

I have a long list of variables in my workspace.
First, I'm finding the potential variables I could be interested in using the who function. Next, I'd like to loop through this list to find the size of each variable, however who outputs only the name of the variables as a string.
How could I use this list to refer to the values of the variables, rather than just the name?
Thank you,
list = who('*time*')
list =
'time'
'time_1'
'time_2'
for i = 1:size(list,1);
len(i,1) = length(list(i))
end
len =
1
1
1
If you want details about the variables, you can use whos instead which will return a struct that contains (among other things) the dimensions (size) and storage size (bytes).
As far as getting the value, you could use eval but this is not recommended and you should instead consider using cell arrays or structs with dynamic field names rather than dynamic variable names.
S = whos('*time*');
for k = 1:numel(S)
disp(S(k).name)
disp(S(k).bytes)
disp(S(k).size)
% The number of elements
len(k) = prod(S(k).size);
% You CAN get the value this way (not recommended)
value = eval(S(k).name);
end
#Suever nicely explained the straightforward way to get this information. As I noted in a comment, I suggest that you take a step back, and don't generate those dynamically named variables to begin with.
You can access structs dynamically, without having to resort to the slow and unsafe eval:
timestruc.field = time;
timestruc.('field1') = time_1;
fname = 'field2';
timestruc.(fname) = time_2;
The above three assignments are all valid for a struct, and so you can address the fields of a single data struct by generating the field strings dynamically. The only constraint is that field names have to be valid variable names, so the first character of the field has to be a letter.
But here's a quick way out of the trap you got yourself into: save your workspace (well, the relevant part) in a .mat file, and read it back in. You can do this in a way that will give you a struct with fields that are exactly your variable names:
time = 1;
time_1 = 2;
time_2 = rand(4);
save('tmp.mat','time*'); % or just save('tmp.mat')
S = load('tmp.mat');
afterwards S will be a struct, each field will correspond to a variable you saved into 'tmp.mat':
>> S
S =
time: 1
time_1: 2
time_2: [4x4 double]
An example writing variables from workspace to csv files:
clear;
% Writing variables of myfile.mat to csv files
load('myfile.mat');
allvars = who;
for i=1:length(allvars)
varname = strjoin(allvars(i));
evalstr = strcat('csvwrite(', char(39), varname, '.csv', char(39), ', ', varname, ')');
eval(evalstr);
end

List all environment variables in Matlab

How does one get a list of all defined environment variables in Matlab? I'm aware of getenv but you have to provide a name, and doc getenv offers no help in how to use it to retrieve items in any other way. I can't find any other relevant information online. Is this even possible?
I'm interested in a platform-independent answer (or at least Windows and Linux).
You could use
system('env')
on linux/mac, and
system('set') % hope I remember correctly, no windows at hand
In both cases you'd have to parse the output though, as it comes in the format variable=<variable-value>.
Below is a function that implements two ways to retrieve all environment variables (both methods are cross-platform):
using Java capabilities in MATLAB
using system-specific commands (as #sebastian suggested)
NOTE: As #Nzbuu explained in the comments, using Java's System.getenv() has a limitation in that it returns environment variables captured at the moment the MATLAB process starts. This means that any later changes made with setenv in the current session will not be reflected in the output of the Java method. The system-based method does not suffer from this.
getenvall.m
function [keys,vals] = getenvall(method)
if nargin < 1, method = 'system'; end
method = validatestring(method, {'java', 'system'});
switch method
case 'java'
map = java.lang.System.getenv(); % returns a Java map
keys = cell(map.keySet.toArray());
vals = cell(map.values.toArray());
case 'system'
if ispc()
%cmd = 'set "'; %HACK for hidden variables
cmd = 'set';
else
cmd = 'env';
end
[~,out] = system(cmd);
vars = regexp(strtrim(out), '^(.*)=(.*)$', ...
'tokens', 'lineanchors', 'dotexceptnewline');
vars = vertcat(vars{:});
keys = vars(:,1);
vals = vars(:,2);
end
% Windows environment variables are case-insensitive
if ispc()
keys = upper(keys);
end
% sort alphabetically
[keys,ord] = sort(keys);
vals = vals(ord);
end
Example:
% retrieve all environment variables and print them
[keys,vals] = getenvall();
cellfun(#(k,v) fprintf('%s=%s\n',k,v), keys, vals);
% for convenience, we can build a MATLAB map or a table
m = containers.Map(keys, vals);
t = table(keys, vals);
% access some variable by name
disp(m('OS')) % similar to getenv('OS')