Find variables and the dependencies between them in code - matlab

I am trying to write a function that finds all the variables in a function which contains computing operations. I do this to find out which variables this block of calculations requires as input arguments to perform the calculations.
A calculation function always gets a table with different parameters as input, accesses certain parameters of that table to calculate certain metrics.
For example my data table T contains the double arrays Power and Time. This table is passed to the function calc_energy which then calculates Energy:
function [ T ] = calc_energy( T )
T.Energy = T.Power .* T.Time;
end
Say whenever a new data table is generated, different calculations as the above one are run automatically. Now if Power itself is calculated by the function calc_power, but calc_energy is run before or parallel to calc_power, then I have a problem because calc_energy doesn't find the required variables in T.
In order to prevent such an error, I want my function check_required_variables to be called inside the calc_energy and check beforehand whether T.Power exists. The thing is that check_required_variables should be a general function that is called in every single calculation function and therefore doesn't know the required variables. It has to find them in the function that it is called by.
function [ T ] = calc_energy( T )
OK = check_required_variables( T, 'calc_energy.m' );
if OK == 1
T.Energy = T.Power .* T.Time;
else
error('Required fields not found');
end
end
Are there any functions that spot the variables Energy, Power and Time used in my code? And are there functions that maybe analyse the dependencies between these variables (Energy dependent on Power and Time)? What are generally clever ways to figure out the ocurring variables just from the code? Any ideas?
I know this is a tough one so I'm thankful for any suggestions.

use isfield
function [ T ] = calc_energy( T )
if all(isfield(T,{'Power','Time'}))
T.Energy = T.Power .* T.Time;
else
error('Nooooope')
end
end
However, knowing which ones are required seems a bit harder.... You can always try to read the .m file, and regexp for T.____, then put the input of that to isfield.
This however, just hints of bad software design. A function should know what it requires to run. What happens if OK is false? It just skips the computation? You can then cascade to hundreds of calls (depend on the application) because the original structure failed to have a variable, or had a typo. I'd take the radically opposite approach to software design:
function [ T ] = calc_energy( T )
assert(all(isfield(T,{'Power','Time'}))); %error if there are not.
T.Energy = T.Power .* T.Time;
end

Related

Only add existing variables in Matlab

I would like to add the sums of 5 one-column arrays in Matlab. The catch is that depending on previous inputs, any of these arrays may or may not exist, thus throwing an error when I try to add the sums of these arrays for post-processing.
After doing some digging I found the following function which I can use to return a logical statement if a certain variable exists in the workspace:
exist('my_variable','var') == 1
I'm not sure this helps me in this case though - Currently the line in which I add the sums of the various arrays looks as follows:
tot_deviation = sum(var1) + sum(var2) + sum(var3) + sum(var4) + sum(var4);
Is there a short way to add only the sums of the existing arrays without excessive loops?
Any help would be appreciated!
You can use if statements:
if ~exist('var1','var'), var1 = 0;end
if ~exist('var2','var'), var2 = 0;end
if ~exist('var3','var'), var3 = 0;end
if ~exist('var4','var'), var4 = 0;end
tot_deviation = sum(var1) + sum(var2) + sum(var3) + sum(var4) + sum(var4);
To my knowledge there is no quick way to do this in matlab. It seems to me that, depending on the structure of your code, you have the following alternatives:
1- Initialize all your variables to column arrays of zeros with something like
var = zeros(nbLines, 1);
2- Put all your columns vectors side to side in a single array and the use tot_deviation = sum(sum(MyArray)); which will work no matter how many columns and lines there is in the array.
3- If you pass your variables to a function you can check the number of inputs arguments inside the function using 'nargin' and then only proceed to sum the right number of variables.
I would recommend using the second method for it seem to me that it is the one that allows you to take the most advantage of matlab's array system which good.
The most robust solution is to initialise all of your variables to 0 at the top of the function. Then there is no chance they don't exist, and they influence the summation correctly.
Alternatively...
You could (read: shouldn't) use a really nasty eval trick here for flexibility...
vars = {'var1','var2','var3','var4'};
tot = 0;
for ii = 1:numel(vars)
if exist(vars{ii}, 'var')
tot = tot + eval(var);
end
end
I say it's "nasty" because eval should be avoided (read the linked blog). The check on the variable name existence mitigates some of the strife, but it's still not ideal.
As suggested in the MathWorks blog on evading eval, a better option would be a struct with dynamic field names. You could use almost the same syntax as above, but replace the if statement with
if isfield( myStruct, vars{ii} )
tot = tot + myStruct.(vars{ii});
end
This will avoid dynamically named variables and keep your workspace clean!

matlab: check which lines of a path are used - graphshortestpath

The related problem comes from the power Grid in Germany. I have a network of substations, which are connected according to the Lines. The shortest way from point A to B was calculated using the graphshortestpath function. The result is a path with the used substation ID's. I am interested in the Line ID's though, so I have written a sequential code to figure out the used Line_ID's for each path.
This algorithm uses two for loops. The first for-loop to access the path from a cell array, the second for-loop looks at each connection and searches the Line_ID from the array.
Question: Is there a better way of coding this? I am looking for the Line_ID's, graphshortestpath only returns the node ID's.
Here is the main code:
for i = i_entries
path_i = LKzuLK_path{i_entries};
if length(path_i) > 3 %If length <=3 no lines are used.
id_vb = 2:length(path_i) - 2;
for id = id_vb
node_start = path_i(id);
node_end = path_i(id+1);
idx_line = find_line_idx(newlinks_vertices, node_start, ...
node_end);
Zuordnung_LKzuLK_pathLines(ind2sub(size_path,i),idx_line) = true;
end
end
end
Note: The first and last enrty of path_i are area ID's, so they are not looked upon for the search for the Line_ID's
function idx_line = find_line_idx(newlinks_vertices, v_id_1, v_id_2)
% newlinks_vertices includes the Line_ID, and then the two connecting substations
% Mirror v_id's in newlinks_vertices:
check_links = [newlinks_vertices; newlinks_vertices(:,1), newlinks_vertices(:,3), newlinks_vertices(:,2)];
tmp_dist1 = find(check_links(:,2) == v_id_1);
tmp_dist2 = find(check_links(tmp_dist1,3) == v_id_2,1);
tmp_dist3 = tmp_dist1(tmp_dist2);
idx_line = check_links(tmp_dist3,1);
end
Note: I have already tried to shorten the first find-search routine, by indexing the links list. This step will return a short list with only relevant entries of the links looked upon. That way the algorithm is reduced of the first and most time consuming find function. The result wasn't much better, the calculation time was still at approximately 7 hours for 401*401 connections, so too long to implement.
I would look into Dijkstra's algorithm to get a faster implementation. This is what Matlab's graphshortestpath uses by default. The linked wiki page probably explains it better than I ever could and even lays it out in pseudocode!

Calculations in table based on variable names in matlab

I am trying to find a better solution to calculation using data stored in table. I have a large table with many variables (100+) from which I select smaller sub-table with only two observations and their difference for smaller selection of variables. Thus, the resulting table looks for example similarly to this:
air bbs bri
_________ ________ _________
test1 12.451 0.549 3.6987
test2 10.2 0.47 3.99
diff 2.251 0.078999 -0.29132
Now, I need to multiply the ‘diff’ row with various coefficients that differ between variables. I can get the same result with the following code:
T(4,:) = array2table([T.air(3)*0.2*0.25, T.bbs(3)*0.1*0.25, T.bri(3)*0.7*0.6/2]);
However, I need more flexible solution since the selection of variables will differ between applications. I was thinking that better solution might be using either varfun or rowfun and speficic function that would assign correct coefficients/equations based on variable names:
T(4,:) = varfun(#func, T(3,:), 'InputVariables', {'air' 'bbs' 'bri'});
or
T(4,:) = rowfun(#func, T(3,:), 'OutputVariableNames', T.Properties.VariableNames);
However, the current solution I have is similarly inflexible as the basic calculation above:
function [air_out, bbs_out, bri_out] = func(air, bbs, bri)
air_out = air*0.2*0.25;
bbs_out = bbs*0.1*0.25;
bri_out = bri*0.7*0.6/2;
since I need to define every input/output variable. What I need is to assign in the function coefficients/equations for every variable and the ability of the function to apply it only to the variables that are present in the specific sub-table.
Any suggestions?

Matlab saving without losing previous iteration's value

i'm trying to save values of iterations in a loop. After this function, they will execute other functions before coming to the next iteration. But the problem i'm facing is, it overwrites them and bcomes 000000. Only the last iteration values are seen. How can i fix it ? Is there a way not to use the same variable and save them ? i read about append but will have to use different var name n is not really nice to do so.
function DistanceMatrix(iteration,distance_row)
load('data.mat','oridata')
load('centroids.mat','centroids')
for i = distance_row:(distance_row+3)
for j=1:300 %no.genes
total=0;
for k=1:6
total=total+((oridata(j,k)- centroids(i,k))^2);
end
DistMatrix_Val(i,j)=sqrt(total);
end
end
save('DistanceMatrix.mat','DistMatrix_Val')
DistMatrix_Val;
GroupMatrix(iteration,distance_row)
end
This is the output. I WOULD LIKE TO STORE ALL ITERATION's value and not overwrite them. Can any1 help ?
OK. Use
load('DistanceMatrix.mat','DistMatrix_Val')
or
persistent DistMatrix_Val
just before the first load command you showed to us.
I think this is what you should do:
functon DistanceMatrix = DistanceMatrix(iteration,distance_row)
Rather than saving the variable at the end of the function, you return it.
After returning it you can include the variable in a bigger variable (for example a three dimensional Nx4x300 matrix)
If neccesary you can then save it at the end.

How to reciprocate?

Actually I am coding a Matlab simulation where the AnchorID and SourceID will report to eachother. For example if I take an anchor 30 and source 50 it will collect all the agc values between these anchor and source and calculate rssi_dB and display them.Below mentioned is an example of anchor 30 and source id 50
Note: list of anchors ID's and source ID's are same. for e.g. 30 50 55 58 . These ID are both same for anchor and source.
function A30(BlinkSet)
for i=1:length(BlinkSet)
xAnchorID=30;
xSourceID=50;
a=BlinkSet{i}.AnchorID;
b=BlinkSet{i}.SourceID;
if xAnchorID==a && xSourceID==b
xagc=BlinkSet{i}.agc;
rssi_dB(i)=-(33+xagc*(89-33)/(29-1));
end
end
rssi_dB(rssi_dB==0)=[];
rssi_dBm=sum(rssi_dB(:))/length(rssi_dB);
disp([sprintf('The rssi value is %0.0f',rssi_dBm)]);
When I call the function in Matlab command window I get the rssi value of the above function.
Also my task is when I reciprocate the Anchor ID and source ID say Anchor as 50 and source as 30 like the function I have mentioned below I get an error which is mentioned after the function below.
function A50(BlinkSet)
for i=1:length(BlinkSet)
xAnchorID=50;
xSourceID=30;
a=BlinkSet{i}.AnchorID;
b=BlinkSet{i}.SourceID;
if xAnchorID==a && xSourceID==b
xagc=BlinkSet{i}.agc;
rssi_dB(i)=-(33+xagc*(89-33)/(29-1));
end
end
rssi_dB(rssi_dB==0)=[];
rssi_dBm=sum(rssi_dB(:))/length(rssi_dB);
disp([sprintf('The rssi value is %0.0f',rssi_dBm)]);
When I call this function I get an error as
??? Undefined function or variable "rssi_dB".
Error in ==> A50 at 14
rssi_dB(rssi_dB==0)=[];
Error in ==> main_reduced at 26
A50(BlinkSet);
In main function I have coded like this,
%A30(BlinkSet);
A50(BlinkSet);
Any help is highly appreciated.
In both of these functions, you only create the variable rssi_dB if execution enters the if statement within the loop (i.e., if xAnchorID==a && xSourceID==b is at some point true). Clearly, this code is never executed in your A50 function. Without knowing what is in BlinkSet it's a bit difficult to diagnose the exact problem, but this is the cause at least.
As a side note: it's not a good idea to create two separate functions to do this job when their code is almost identical. You should add an input argument to your function that allows it to do the job of both. In this particular case, all that changes is the value of xAnchorID and xSourceID, so you could just pass these in:
function srcToAnchorRssi(BlinkSet, xSourceID, xAnchorID)
% The rest of the function stays the same!
If you want to provide some defaults for these arguments, you can do, e.g.:
if nargin < 3 || isempty(xAnchorID), xAnchorID = 50; end
if nargin < 2 || isempty(xSourceID), xSourceID = 30; end
It's always a good idea to include an isempty in statements of this sort, so that your function supports syntax like myFunction(myArg1, [], myArg3). Also note that the order of the operands to || is crucial; if you did if isempty(theArgument) || nargin < theArgumentNumber and the user did not pass theArgument, then it would error in the isempty because theArgument would not exist as a local variable. We can get around this by swapping the operands' order because MATLAB is smart enough to know it doesn't have to evaluate the right operand if the left operand is true (note that this is also the case in many other programming languages).