Good practices for formatting simulation output - simulation

This is almost a programming question, but geared towards physicists.
Suppose I am writing a piece of software that takes some system parameters as input and then calculates something from it, in my case a spectral function $A(k,\omega)$.
When I want to just take the output and feed it to gnuplot, I should make the program output a simple table with one column for the $k$-values, one for $\omega$ and one for $A(k,\omega)$.
But then I cannot store there all the additional information, such as what parameters were used. And maybe I want to store in that output some additional debugging information such as intermediate quantities. In my example, the spectral function is obtained from the self energy, so in some situations I might want to look at the self energy directly.
I do not want to constantly hack the source code depending on what output I want. It would be nicer if all the relevant data of a "run" would be present in a single file/entity but so that it is still easy to extract tables I can feed to gnuplot.
Not wanting to reinvent the wheel and develop a full-blown file format, are there some "standards" around that are best used when creating, processing and storing data from calculations or simulations? Maybe even in an SQL database format?

There are dozens of methods, and none too good; I'll share two of mine:
If the program is worth it, I add a small parser for config files. Then I just write a config, say SimA.in, and the simulator produces a set of files with the corresponding data: SimA.paths, SimA.stats, SimA.log, etc. As long as the names are unique and I record the version of the code in the log, this makes the results fully reproducible and the simulation itself portable enough to be easily manageable.
If not, I just wrap the code a bit and use R as a host. Then I just return all the arrays and scalars (R data structures are very flexible, and it is easy to cast between native R and C structs) and use R to manage, save/load and of course visualize and analyse the data. Moreover, with Sweave and CacheSweave the whole execution, analysis and reporting can be bundled into one elegant package, fully reproducible with one command.
If you want an "enterprise" solution, try NetCDF or HDF5. But I feel it may be overkill here.
And of course a version control of the simulator code is a must. But that's obvious =)
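As a minimal sketch of the "everything about a run in one file" idea from the question: gnuplot skips lines that start with `#` in data files, so the parameters can sit in a commented header above the table. The function names `write_run` and `read_params`, the parameter names and the sample numbers are all made up for illustration; they are not taken from any of the tools mentioned above.

```python
import io

def write_run(stream, params, columns, rows):
    """Write one run as a plain table preceded by a commented header."""
    # Header lines start with '#', which gnuplot treats as comments,
    # so the metadata does not get in the way of plotting.
    for key in sorted(params):
        stream.write(f"# {key} = {params[key]!r}\n")
    stream.write("# columns: " + " ".join(columns) + "\n")
    for row in rows:
        stream.write(" ".join(f"{value:.6g}" for value in row) + "\n")

def read_params(stream):
    """Recover the 'key = value' pairs (as strings) from the header."""
    params = {}
    for line in stream:
        if not line.startswith("#"):
            break  # the header ends at the first data line
        body = line[1:].strip()
        if "=" in body:
            key, _, value = body.partition("=")
            params[key.strip()] = value.strip()
    return params

# Hypothetical parameters and a few (k, omega, A) rows, for illustration only.
params = {"coupling": 0.5, "temperature": 0.01}
rows = [(0.0, -1.0, 0.12), (0.0, 0.0, 0.85), (0.1, -1.0, 0.10)]

buffer = io.StringIO()
write_run(buffer, params, ["k", "omega", "A"], rows)
text = buffer.getvalue()
```

Writing to a real file instead of `io.StringIO` gives something gnuplot can plot directly, while `read_params` lets a later script find out which parameters produced the table. Intermediate quantities such as the self energy could go into a second file written the same way.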

For a project I'm currently working on that uses Python and C++ (via SWIG), I'm planning to use a short python script as input file. So, in a way, I'll be 'hacking the source' to change parameters, but in an interpreted language, not a compiled one.
Currently, I plan to have an input file like parameters.py, and use it like from parameters import params. But that might be too dependent on correct syntax.
params = {
    "foods" : ["spam", "beans", "eggs"],
    "costs" : [199, 4, 1],
    "customerAge" : 23,
}
Another option might be to just define the variables at the script level in parameters2.py. This loses the nice dictionary packaging, but makes it a little harder for the user to mess it up. And it probably wouldn't be too hard to write a 'parser' that puts those script-level variables into a nice dictionary. A plus of this method is that the user could parameterize things that weren't originally considered: from parameters2 import * would overwrite previous definitions of those parameters. Of course, this might be bad if the user overwrites something important.
foods = ["spam", "beans", "eggs"]
costs = [199, 4, 1]
customerAge = 23
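One way the 'parser' for script-level variables might look, as a sketch only: the standard-library function `runpy.run_path` executes a file and returns its top-level names as a dictionary, so filtering out the underscore-prefixed entries leaves just the parameters. The helper name `load_params` is hypothetical, and the temporary file merely stands in for parameters2.py.

```python
import os
import runpy
import tempfile

def load_params(path):
    """Run a parameter script and return its public top-level names as a dict."""
    namespace = runpy.run_path(path)
    # Drop entries such as __name__ and __builtins__ that Python adds itself.
    return {name: value for name, value in namespace.items()
            if not name.startswith("_")}

# Throwaway file standing in for parameters2.py.
source = (
    'foods = ["spam", "beans", "eggs"]\n'
    'costs = [199, 4, 1]\n'
    'customerAge = 23\n'
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "parameters2.py")
    with open(path, "w") as handle:
        handle.write(source)
    params = load_params(path)
```

Note that this executes whatever is in the file, and anything the script imports at top level would also show up in the dictionary, so it only suits input files you trust.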
parameters3.py would use a class, though Python's pickiness about indentation counts against it. It would be used as from parameters3 import params:
class params:
    foods = ["spam", "beans", "eggs"]
    costs = [199, 4, 1]
    customerAge = 23
I should also mention, for completeness, that our C++ code also defines a parameters class. That is, in our actual project, parameters.py is a SWIG wrapper for a corresponding C++ class. You'd use it like from parameters4 import params. However, this allows only parameters that are already declared in the C++ class.
import parameters
params = parameters.Parameters()
params.foods = ["spam", "beans", "eggs"]
params.costs = [199, 4, 1]
params.customerAge = 23

Related

How to extract an MSL model, modify the code, and use locally?

I am interested in replacing my own PID-regulator models with MSL/Blocks/Continuous/LimPID. The problem is that this model restricts the limits of the output signal to be parameters and thus does not allow time-varying limits, which I need.
Studying the code I understand that the output limitation is created by a block MSL/Blocks/Nonlinear/Limiter and I just want to change this to the block VariableLimiter.
I can imagine that you need to ensure that the output limits vary on a time scale slower than the regulator in order not to excite unwanted behaviour of the controller. Still, there is a class of problems where it would be very useful to allow these limits to vary slowly.
Thanks for the good input to my question; below is a very simple example to refine it. (The LimPID is more complicated and I will come back to that.)
Let us instead just modify the block Add to a local block in MyModel.
I copy the code from Modelica.Blocks.Math.Add and call it Addb in MyModel. Since there is a dependence on Interfaces.SI2SO, I need to add an import before the extends-clause. I take this import from the ordinary general MSL package, instead of also copying that into MyModel. Then I introduce a new parameter "bias" and modify the equation. The annotation may need some update as well, but we do not bother with that now.
MyModel
...
block Addb "Output the sum of the two inputs"
  import Modelica.Blocks.Interfaces;
  extends Interfaces.SI2SO;
  parameter Real k1=+1 "Gain of input signal 1";
  parameter Real k2=+1 "Gain of input signal 2";
  parameter Real bias=0 "Bias term";
equation
  y = k1*u1 + k2*u2 + bias;
  annotation (...);
end Addb;
MyModel;
This code seems to work.
My added question is whether it is enough to look up "extends-clauses" and other references to MSL and add the proper imports, since the code is now local, or are there more aspects to think of? The LimPID code is rather complex, with procedures for initialization etc., so I just wonder if there is more to do than just bring in a number of import-clauses.
The models in Modelica Standard Library (MSL) should only be seen as exemplary models, not covering all possible applications. MSL is write protected and it is not possible to replace the limiter block in LimPID (and add max/min input connectors). Also, it wouldn't work out if you shared your simulation model with others, expecting their MSL to work like your modified MSL.
Personally, I have my own libraries of components where MSL models are inadequate. For example, I have PID controllers with variable limits, manual/automatic functions and many more functions which are needed in my applications.
Often, I create a copy of an MSL model, place it in the same package in my own library and make the necessary modifications and additions, e.g. MyLibrary.Blocks.Continuous.PID.

How to include a c-header with constants in Matlab Simulink

I'm developing a Simulink model with many C S-functions. For easier handling I want to use the same constants in the C S-functions as in the Simulink model. So I have a C header with preprocessor constants like:
#define THIS_IS_A_CONSANT 10
And here is the question:
How can I include this header in Simulink in such a way that I can use THIS_IS_A_CONSANT, for example in a constant source, like a workspace variable?
Thanks and regards
Alex
There is functionality in Simulink that will allow you to include custom C header files that define constants, variables, etc.; however, as far as I know (and as one might expect) this really is only pertinent in cases where code is being generated and compiled.
So, for the most part, this particular functionality is only relevant when you are using Simulink Coder to generate a stand-alone executable from your model. For example, this link shows how to include parameters stored in an external header file during code generation through the use of Simulink.Parameter objects with Custom Storage Classes and the Code Generation - Custom Code Pane under the model's Configuration Parameters.
This link from the Simulink doc shows how to use the #define custom storage class to achieve similar results.
However, it sounds like neither of these really solves your issue, as you want to make use of the code in the header file during simulation.
That said, considering that there are elements in Simulink, such as Stateflow Charts and MATLAB Function blocks, that generate and build code "under the hood" during simulation, it's (at least hypothetically) possible that you might be able to use some of the concepts described above to access the values in your header file from one of those elements during simulation. For example, I was successfully able to access preprocessor macros in a Stateflow chart just by going to the Simulation Target->Custom Code pane under Configuration Parameters and including the text #include "header.h" under Include custom C code in generated: Header file. (In this case, header.h contained the line of C code that you included in your post)
Although it seems like you should be able to extend this functionality further, this really was the limit of what I was able to achieve as far as accessing the header file during simulation was concerned. For example, I know that running a model in Rapid Accelerator mode actually generates and builds code under the hood, so seemingly you should be able to use some combination of the techniques I described above to be able to access values from the header file during simulation. It looks like the code that Rapid Accelerator mode generates doesn't respect all of the settings defined by those techniques in the same way that Simulink/Embedded Coder do, though, so I just kept running into compilation errors. (Although maybe I'm just missing some creative combination of settings that could make that work).
Hopefully that helps explain Simulink's abilities (and limitations) regarding the inclusion of C header files. So to summarize, according to the links included above, what you are asking for is almost barely possible, but in practice... not really.
So if really all you want is to be able to create workspace variables out of the preprocessor #define's in your header file, it probably is just easiest to manually parse the file with a MATLAB script, as had previously been suggested in the comments. Here is a quick-and-dirty script that loads in a header file, iterates over each line, uses a regular expression (which you can improve upon if needed) to parse #define statements, and then calls eval to create variables from the parsed input.
filename = 'header.h';
% Match lines such as "#define NAME 10" or "#define NAME 0.5"
pattern = '^\s*#define\s+(\w+)\s+(\d*\.?\d+)';
fid = fopen(filename);
tline = fgetl(fid);
while ischar(tline)
    tokens = regexp(tline, pattern, 'tokens', 'once');
    if numel(tokens) == 2
        % Create a workspace variable named after the macro
        eval([tokens{1} ' = ' tokens{2} ';']);
    end
    tline = fgetl(fid);
end
fclose(fid);
You could put this code in a callback so that it will execute every time you load your model. Just go to File->Model Properties->Model Properties, click on the Callbacks tab, and then place the code under whichever callback you desire (such as PreLoadFcn if you want it to run immediately before the model loads).
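If the header also needs to be read outside MATLAB, for example in a pre-processing step, the same regular-expression idea carries over to other languages. The following is only a sketch in Python: the function name `parse_defines` and the macro `GAIN` are invented for the example, and the pattern deliberately handles nothing beyond plain numeric `#define` lines (no expressions, suffixes or trailing comments).

```python
import re

# One "#define NAME <number>" per line; anything more elaborate is skipped.
DEFINE_RE = re.compile(r"^\s*#\s*define\s+(\w+)\s+([-+]?\d*\.?\d+)\s*$")

def parse_defines(text):
    """Return a dict mapping macro names to their numeric values."""
    constants = {}
    for line in text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            name, literal = match.groups()
            # Keep integers as int and anything with a decimal point as float.
            constants[name] = float(literal) if "." in literal else int(literal)
    return constants

# Sample header text; GAIN and NAME are hypothetical macros.
header = """
#define THIS_IS_A_CONSANT 10
#define GAIN 0.25
#define NAME "text"
"""
constants = parse_defines(header)
```

Returning a dictionary instead of creating variables avoids the `eval` step; the non-numeric `NAME` macro is simply ignored.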

Loading parameter files for different data sets

I need to analyse several sets of data which are associated with different parameter sets (one single set of parameters for each set of data). I'm currently struggling to find a good way to store these parameters such that they are readily available when analysing a specific dataset.
The first thing I tried was saving them in a script file parameters.m in the data directory and loading them with run([path_to_data,'/parameters.m']). I understand, however, that this is not good coding practice, and it also gave me scoping problems (I think), as changes in parameters.m were not always reflected in my workspace variables. (Workspace variables only changed after clear all and rerunning the code.)
A clean solution would be to define a function parameters() in each data directory, but then again I would need to add the directory to the search path. Also I fear I might run into namespace collisions if I don't give the functions unique names. Using unique names is not very practical on the other hand...
Is there a better solution?
So define a struct or cell array called parameters and store it in the data directory it belongs in. I don't know what your parameters look like, but ours might look like this:
parameters.relative_tolerance = 10e-6
parameters.absolute_tolerance = 10e-6
parameters.solver_type = 3
.
.
.
and I can write
save('parameter_file', 'parameters')
or even
save('parameter_file', '-struct', 'parameters', *fieldnames*)
The online help reveals how to use -struct to store fields from a structure as individual variables should that be useful to you.
Once you've got the parameters saved you can load them with the load command.
To sum up: create a variable (most likely a struct or cell array) called parameters and save it in the data directory for the experiment it refers to. You then have all the usual Matlab tools for reading, writing and investigating the parameters as well as the data. I don't see a need for a solution more complicated than this (though your parameters may be complicated themselves).
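The same pattern, one parameter file stored next to the data it describes, is not specific to MATLAB. As a rough sketch in Python, assuming the parameters are plain numbers and strings, a JSON file per data directory does the job; the file name `parameters.json` and both helper names are choices made for this example only.

```python
import json
import tempfile
from pathlib import Path

def save_parameters(data_dir, parameters):
    """Write the parameters next to the data they belong to."""
    path = Path(data_dir) / "parameters.json"
    path.write_text(json.dumps(parameters, indent=2, sort_keys=True))
    return path

def load_parameters(data_dir):
    """Read back the parameters stored in a data directory."""
    return json.loads((Path(data_dir) / "parameters.json").read_text())

# A temporary directory stands in for one experiment's data directory.
with tempfile.TemporaryDirectory() as data_dir:
    save_parameters(data_dir, {
        "relative_tolerance": 10e-6,
        "absolute_tolerance": 10e-6,
        "solver_type": 3,
    })
    loaded = load_parameters(data_dir)
```

Because the file lives in the data directory, nothing has to be added to a search path and there are no name collisions between data sets.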

Accessing variable by string name

I need to load experimental data into scicoslab, a (pretty badly designed) clone fork of scilab which happens to support graphical modeling. The documentation on the web is pretty poor, but it's reasonably similar to scilab and octave.
The data I need to process is contained into a certain number of text files: Data_005, Data_010, …, Data_100. Each of them can be loaded using the -ascii flag for the loadmatfile command.
The problem comes from the fact that loadmatfile("foo", "-ascii") loads the file foo.mat into a variable named foo. In order to cycle over the data files, I would need to do something like:
for i = [5:5:100]
    name = sprintf("Data_%03d", i);
    loadmatfile(name, "-ascii");
    x = read_var_from_name(name);
    do_something(x);
end
where what I search for is a builtin read_var_from_name which would allow me to access the internal symbol table by string.
Do you know if such a function exists?
Notes:
There's no way of overriding this behavior if your file is in ascii format;
In this phase I could also use octave (no graphical modelling is involved), although it behaves in the same way.
>> foo = 3.14; name = 'foo'; eval(name)
foo =
3.1400
The above works in MATLAB, and Scilab's documentation says it also has an eval function. Not sure if I understood you correctly, though.
@arne.b has a good answer.
In your case you can also do this in MATLAB:
a=load('filename.mat')
x=a.('variable_name')
Let's go through your points one by one:
"ScicosLab, a (pretty badly designed) clone of Scilab": in my opinion this is an inaccurate way of introducing the software. ScicosLab is not a clone of Scilab but a fork of it. The team behind ScicosLab (INRIA) are the ones who made Scicos (now called Xcos in the Scilab development line). At some point (from Scilab v4) the Scilab team decided to move away from Tcl/Tk towards Java, but the ScicosLab/Scicos team split off and kept using the language (Tcl) and its graphical user interface toolkit (Tk). In fairness to the ScicosLab community, the Scilab documentation and support as a whole are not very good in general. :) (more about Scilab and the forks here)
Regarding the technical question, I'm not sure what you are trying to achieve here. Scilab/ScicosLab still have the eval function, which basically does what you want. However, this function is to be deprecated in favor of evstr. There is also the execstr function, which is worth studying.
The loadmatfile function, as far as I have understood, "tries" to load the variables defined in a MATLAB .mat file (MATLAB's proprietary tabular format) into the Scilab workspace. For example, if there is a variable foo, it will "try" to create the variable foo and load its value from the MATLAB file. Check this example. I would create a variable x(i) = foo in the for loop. Again, your question is not completely clear.
As a side note maybe you could consider exporting your data as CSV instead of .mat files.

Using table-of-contents in code?

Do you use a table of contents listing all the functions (and maybe variables) of a class at the beginning of a big source code file? I know that the alternative to that kind of listing would be to split up big files into smaller classes/files, so that their class declarations would be self-explanatory enough, but some complex tasks require a lot of code. I'm not sure whether it is really worth spending your time subdividing the implementation into multiple files. Or is it OK to create an index listing in addition to the class/interface declaration?
EDIT:
To better illustrate how I use table-of-contents this is an example from my hobby project. It's actually not listing functions, but code blocks inside a function.. but you can probably get the idea anyway..
/*
CONTENTS
Order_mouse_from_to_points
Lines_intersecting_with_upper_point
Lines_intersecting_with_both_points
Lines_not_intersecting
Lines_intersecting_bottom_points
Update_intersection_range_indices
Rough_method
Normal_method
First_selected_item
Last_selected_item
Other_selected_item
*/
void SelectionManager::FindSelection()
{
    // Order_mouse_from_to_points
    ...
    // Lines_intersecting_with_upper_point
    ...
    // Lines_intersecting_with_both_points
    ...
    // Lines_not_intersecting
    ...
    // Lines_intersecting_bottom_points
    ...
    // Update_intersection_range_indices
    for(...)
    {
        // Rough_method
        ....
        // Normal_method
        if(...)
        {
            // First_selected_item
            ...
            // Last_selected_item
            ...
            // Other_selected_item
            ...
        }
    }
}
Notice that the index items don't contain spaces. Because of this I can click on one of them and press F4 to jump to where the item is used, and F2 to jump back (simple Visual Studio find-next/previous shortcuts).
EDIT:
An alternative to this indexing is to use collapsed C# regions. You can configure Visual Studio to show only region names and hide all the code. Of course, keyboard support for that kind of source code navigation is pretty cumbersome...
"I know that the alternative to that kind of listing would be to split up big files into smaller classes/files, so that their class declarations would be self-explanatory enough."
Correct.
"but some complex tasks require a lot of code"
Incorrect. While a "lot" of code may be required, long runs of code (over 25 lines) are a really bad idea.
"actually not listing functions, but code blocks inside a function"
Worse. A function that needs a table of contents must be decomposed into smaller functions.
"I'm not sure whether it is really worth spending your time subdividing the implementation into multiple files."
It is absolutely mandatory that you split things into smaller files. The folks that maintain, adapt and reuse your code need all the help they can get.
"Is it OK to create an index listing in addition to the class/interface declaration?"
No.
If you have to resort to this kind of trick, it's too big.
Also, many languages have tools to generate API docs from the code. Java, Python, C, C++ have documentation tools. Even with Javadoc, epydoc or Doxygen you still have to design things so that they are broken into intellectually manageable pieces.
Make things simpler.
Use a tool to create an index.
If you create a big index you'll have to maintain it as you change your code. Most modern IDEs create a list of class members anyway. It seems like a waste of time to create such an index.
I would never ever do this sort of busy-work in my code. The most I would do manually is insert a few lines at the top of the file/class explaining what this module does and how it is intended to be used.
If a list of methods and their interfaces would be useful, I generate them automatically, through a tool such as Doxygen.
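To illustrate what "generated automatically" can mean in its simplest form, here is a small sketch that builds an index of classes and functions for a piece of Python source using the standard `ast` module. It is not how Doxygen works and it does not handle C++; the function name `outline` and the sample class are made up for the example.

```python
import ast

def outline(source):
    """Return sorted (line number, qualified name) pairs for classes and functions."""
    entries = []
    definition_types = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, definition_types):
                name = prefix + child.name
                entries.append((child.lineno, name))
                # Nested definitions get a dotted, qualified name.
                visit(child, name + ".")
            else:
                visit(child, prefix)

    visit(ast.parse(source), "")
    return sorted(entries)

# Hypothetical module text, loosely modelled on the question's example.
example = '''
class SelectionManager:
    def find_selection(self):
        pass

    def order_points(self):
        pass

def helper():
    pass
'''
index = outline(example)
for line_number, name in index:
    print(f"{line_number:4d}  {name}")
```

Because the index is derived from the code each time, it cannot go stale the way a hand-written table of contents can.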
I've done things like this. Not whole tables of contents, but a similar principle -- just ad-hoc links between comments and the exact piece of code in question. Also to link pieces of code that make the same simplifying assumptions that I suspect may need fixing up later.
You can use Visual Studio's task list to get a listing of certain types of comment. The format of the comments can be configured in Tools|Options, Environment\Task List. This isn't something I ended up using myself but it looks like it might help with navigating the code if you use this system a lot.
If you can split your method like that, you should probably write more methods. After this is done, you can use an IDE to give you the static call stack from the initial method.
EDIT: You can use Eclipse's 'Show Call Hierarchy' feature while programming.