Running NetLogo headless on the cloud

I've written a NetLogo model of agent movement in a landscape. I'd like to run this model from the command prompt, using AWS or Google Compute Engine. The model uses about 500 MB of input rasters and shapefiles and writes rasters and CSV files. It also uses the extensions gis, rnd, cf, table and csv.
Would this be possible using the Controlling API? (https://github.com/NetLogo/NetLogo/wiki/Controlling-API). Can I just use the steps listed in the link? I have not tried running NetLogo from the command prompt before.
Also, I do not want to run BehaviorSpace, as it is not relevant to this model.

A BehaviorSpace experiment can consist of just a single run, so BehaviorSpace may actually be relevant to you here. You only need to write one short XML setup file (or no new files at all, if the experiment setup you want is already part of the model) to do it this way; a sketch of what the headless invocation might look like follows.
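For illustration, assuming the netlogo-headless.sh launcher that ships with NetLogo, such a run could be driven from Java like this (the model, setup-file and experiment names are placeholders):

import java.io.IOException;

// Sketch: run one BehaviorSpace experiment headlessly via netlogo-headless.sh.
// "model.nlogo", "experiments.xml" and "experiment1" are placeholder names.
public class HeadlessRun {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
            "./netlogo-headless.sh",
            "--model", "model.nlogo",
            "--setup-file", "experiments.xml",
            "--experiment", "experiment1",
            "--table", "results.csv");
        pb.inheritIO(); // pass NetLogo's output through to this console
        System.exit(pb.start().waitFor());
    }
}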
Whereas if you go through the controlling API, you will have to write and compile Java (or Scala) code, which is a substantially more complex task.
But if you decide to go the controlling API route: yes, that works too, and it is documented, as you've already noticed.
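If you do go that route, the core pattern looks roughly like the documented example; the model path and commands below are placeholders:

import org.nlogo.headless.HeadlessWorkspace;

// Roughly the documented Controlling API pattern: open the model headlessly,
// run commands, read back a reporter, then clean up. Model path and commands
// are placeholders.
public class ControllingExample {
    public static void main(String[] args) throws Exception {
        HeadlessWorkspace workspace = HeadlessWorkspace.newInstance();
        try {
            workspace.open("models/MyModel.nlogo");
            workspace.command("setup");
            workspace.command("repeat 500 [ go ]");
            System.out.println(workspace.report("count turtles"));
        } finally {
            workspace.dispose();
        }
    }
}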

Related

Data processing of Dymola's results during simulation

I am working on a complex Modelica model that contains a large set of data. I need the simulation to keep running until I terminate the process, possibly for days, so the .mat result file could get very large, and I am having trouble with the data processing. So I'd like to ask if there are any methods that allow me to:
1. Output the data I need at a fixed time step during simulation, rather than from the .mat file after simulation. I am considering using the Modelica.Utilities.Streams.print function to print the data I need into a CSV file, but I would have to write a huge amount of code to print every variable I need, so I think there should be a better solution.
2. Delete the .mat file at a fixed time step during simulation, so that the .mat file stored on my PC doesn't get too large and doesn't affect the normal simulation in Dymola.
A long time ago I wrote a small C program that runs the Dymola simulation executable with two threads. One of them is responsible for terminating the whole simulation after an input time limit is exceeded. I used the executable of this C program within the standard m-files that come with Dymola. I think with some hacking capability, one could meet the requirements you mention.
Have a look at https://github.com/Mathemodica/dymmat; however, I need to warn you that the associated m-files were written for a particular type of model, and the software has not been maintained for a long time. Still, the idea of the C program would be reproducible.
I didn't fully test this, so please treat it more as a "source of inspiration" than a full answer:
In section "4.3.6 Saving periodic snapshots during simulation" of the Dymola 2021 release notes you'll find a description of how to do the following:
The simulator can be instructed to save snapshots of the simulation result file "dsfinal.txt" during simulation.
This can be done periodically using the Simulation Setup option "Complete result snapshots", but I think for your case it could be more useful to trigger it from the model using the function Dymola.Simulation.TriggerResultSnapshot(). A simple example is given as well:
when x > 0 then
  Dymola.Simulation.TriggerResultSnapshot();
end when;
Another property of this function could help as well: by default it creates multiple files without overwriting them:
By default, a time stamp is added to the snapshot file name, e.g.: “dsfinal_0.1.txt”.
The format of the created dsfinal_[TIMESTAMP].txt is a bit overwhelming at first, as it contains all the information needed for initializing the model, but everything you need should be in there...
So some effort is shifted to the post processing, as you will likely need to read multiple files, but I think this is an acceptable trade-off.
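Since the post-processing will have to deal with many timestamped files, here is a small sketch (untested; the naming pattern is taken from the release-note example above) that collects the snapshots and orders them by the time in the file name:

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

// Sketch: gather Dymola snapshot files ("dsfinal_<time>.txt") from the
// working directory and sort them by the time stamp embedded in the name,
// ready for whatever parsing the post-processing step needs.
public class SnapshotCollector {
    public static void main(String[] args) throws IOException {
        List<Path> snapshots = new ArrayList<>();
        try (DirectoryStream<Path> dir =
                Files.newDirectoryStream(Paths.get("."), "dsfinal_*.txt")) {
            for (Path p : dir) {
                snapshots.add(p);
            }
        }
        // "dsfinal_0.1.txt" -> 0.1: strip the "dsfinal_" prefix and ".txt" suffix
        snapshots.sort(Comparator.comparingDouble((Path p) -> {
            String name = p.getFileName().toString();
            return Double.parseDouble(name.substring(8, name.length() - 4));
        }));
        snapshots.forEach(System.out::println);
    }
}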

Sequence Tagging in batch with Mallet cmd prompt

I have tested SimpleTagger for sequence tagging on Mallet's command-prompt interface. I would now like to train over many files and run tests in batches. Is this also possible from Mallet's command prompt? I want to get some sense of the algorithm's performance on the task at hand before I dive into using the Java API.
I have seen that classification tasks can be run in batch from the command prompt.
Is it possible to use SimpleTagger in batch? If not, can someone point me to reference code where sequence tagging has been done in batch using the Java API?
Somewhere I found a reference to "http://mallet.cs.umass.edu/index.php/Command_line_tutorial", but the link seems to be broken.
After some exploration, I learned that it is not possible to readily use cc.mallet.fst.SimpleTagger for batch evaluations. Instead, I found that cc.mallet.examples.TrainCRF is a handy piece of code (which uses SimpleTagger). It takes a training and a test dataset (in Mallet's sequence-tagging format, instances separated by a blank line) as input arguments, and that's it.
I used the mallet-2.0.8 installation available on the Mallet page.
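To batch over several train/test splits, a small driver can simply call TrainCRF's main once per pair. This is only a sketch: it assumes the mallet-2.0.8 jars are on the classpath, and the fold count and file-naming scheme are made up:

// Sketch: run cc.mallet.examples.TrainCRF once per train/test pair.
// Assumes the mallet-2.0.8 jars are on the classpath; the number of folds
// and the naming scheme (train_0.txt, test_0.txt, ...) are hypothetical.
public class BatchTagging {
    public static void main(String[] args) throws Exception {
        int folds = 5;
        for (int i = 0; i < folds; i++) {
            String train = "train_" + i + ".txt";
            String test = "test_" + i + ".txt";
            System.out.println("=== fold " + i + ": " + train + " / " + test + " ===");
            cc.mallet.examples.TrainCRF.main(new String[] { train, test });
        }
    }
}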
Beware not to tune the models based on their performance on the test set. You should avoid that, and perhaps not even check performance on the test set until you have sufficiently tuned the model on the training set.

Can Tableau return non-UI results programmatically?

Tableau is an excellent tool for visualizing data. However, it is designed to be the final stop in a data (ETL) pipeline.
My Tableau workbook uses a bunch of Table Calcs to generate a list of "recommended orders". Rather than view these, I want to automate and execute them. This would make Tableau the engine of a quasi-ML process.
In other words, I would like to make Tableau a part of my ETL pipeline and send data to another tier. How can I write a back-end program that executes my Tableau workbook and receives a results dataset?
See the end of this article for example data I want to automate:
http://robm26.blogspot.com/2015/10/keep-your-factory-humming-with-tableau.html
Any ideas?
You're not going to like the answer I'm going to give you: "Don't do this".
Tableau isn't meant to be a task in a larger ETL pipeline, and the reason you're having trouble making it behave the way you want is that it's not meant to be used that way.
Above and beyond the fact that you've already figured out how to get the result you want in Tableau ("the work is done"), Tableau isn't offering you any real value in the scenario you're describing. Use a tool (like Alteryx) that is purpose-built for this sort of work.
The answer above is correct that tabcmd is the way to pull the data out. We use a function in Python to generate the tabcmd requests so that they can be batched.
import subprocess

run_tabcmd = 'yes'  # set to anything else for a dry run that only prints commands

def runTabCmd(cmd):
    # run a tabcmd command line and display its output
    print(cmd)
    if run_tabcmd == 'yes':
        p = subprocess.Popen(
            cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        for line in p.stdout.readlines():
            print(line.decode().rstrip())
You probably already knew that, but for us it was a way to completely automate pulling the data and loading it into another Python package like scikit-learn for a streamlined ML solution.
I'm editing this answer to agree with Russell's answer. Tableau is not an ETL tool and should not be used as such. If you absolutely have to do something, you can use what I provided. Otherwise, the best practice is to use a tool designed for the job.
You can easily use tabcmd to get the results of a view in CSV, which can be used later in your ETL process. If you need to automate it, you can write a script and execute it with a cron job. I myself have a few views that are exported to CSV and used later in my ETL stream to feed our CRM.
Just remember to create the view exactly as you want it exported to CSV, usually including the order of the fields. Another tip: I don't let it use the default "Measure Names" and "Measure Values"; to make sure everything is right in my CSV, I add the fields manually in the Rows/Columns shelves.
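For example, a minimal sketch of such an export step (written in Java here; a shell script would do equally well). It assumes tabcmd is on the PATH and a session was already established with tabcmd login; the view path and output file name are placeholders:

import java.io.IOException;

// Sketch: export a Tableau view to CSV with "tabcmd export", suitable for a
// cron job. The view path "MyWorkbook/RecommendedOrders" and the output file
// are placeholders; tabcmd login must have been run beforehand.
public class ExportView {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
            "tabcmd", "export", "MyWorkbook/RecommendedOrders",
            "--csv", "-f", "recommended_orders.csv");
        pb.inheritIO(); // show tabcmd's progress messages
        int exit = pb.start().waitFor();
        System.out.println("tabcmd exited with " + exit);
    }
}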

Getting the current Experiment instance at runtime

I'm running JUnit 4 with AnyLogic. In one of my tests, I need access to the Experiment running the test. Is there any clean way to access it at runtime? E.g., is there a static method along the lines of Experiment.getRunningExperiment()?
There isn't a static method that I know of. (If there were, it might be complicated by multi-run experiments, which permit parallel execution; although perhaps not, since there's still a single Experiment, though there would be thread-safety issues.)
However, you can use getEngine().getExperiment() from within the model. You probably need to explain more about your usage context. If you're using AnyLogic Pro and exporting the model to run standalone, you should have access to the experiment instance anyway (see the help topic "Running the model from outside without UI").
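For concreteness, a sketch of that call as it might be pasted into an agent's action code (for example Main's "On startup" action); the trace output is just illustrative:

// getEngine() is available from code running inside the model; cast the
// returned experiment to your concrete experiment class if you need members.
Object exp = getEngine().getExperiment();
traceln("running under experiment: " + exp.getClass().getSimpleName());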
Are you trying to run JUnit tests from within an Experiment? If so, what's your general design? JUnit doesn't sit as well in that scenario, since it 'expects' to be instantiating and running the thing being tested. For my automated tests (where I can't export the model standalone because I don't use AnyLogic Pro), I judged it easier to avoid JUnit (it's just a framework, after all) and implement the tests 'directly': my model components write outputs and, at the end of the run, the Experiment compares the outputs to pre-prepared expected ones and flags whether the test passed or failed. With AnyLogic Pro, you could still export standalone and use JUnit to run the 'already-a-test' Experiments (with the JUnit test checking the Experiment for, say, a testPassed Boolean set at the end).
The fact that you want to get the running experiment suggests that you're potentially doing this while runs are executing. If so, could you explain a bit about your requirements?

Accessing command-line arguments for headless NetLogo in the Matlab extension

I'm running the Matlab extension for NetLogo in headless (non-GUI) mode. I've downloaded the extension source and am trying to access the command-line arguments from the Java code in the extension. The command-line arguments are stored in LabInterface.Settings. I would like to be able to access that object in the extension's Java code. I've been working on this for a couple of days but have had no success. It seems the extension mechanism is designed to create primitives to be used inside NetLogo. These primitives have knowledge of the different NetLogo objects, but there is no way for the extension's Java code to access them. I would appreciate any help.
I would like to be able to run multiple NetLogo-Matlab analyses with varying parameters, in batch mode, across multiple machines, perhaps a Flux cluster. I need to run headless because of the batch nature. Sometimes the runs will be on the same machine, sometimes split across multiple machines, on Flux or Condor. I know similar functionality exists in NetLogo for running varying parameters in a single session. Is there some way to split these across multiple machines?
Currently, I create a series of setup files for NetLogo. Each setup file represents the parameters that vary for that run. Then I submit each NetLogo/setup-file combination as a single run. Each run can be farmed out to a separate machine or processor. Adding the Matlab extension complicates this. The Matlab extension connects its server to port 9999. With multiple servers running, they all get attached to port 9999, and this causes problems. I was hoping to use information from the setup-file name to create independent port numbers tied to the setup file names. That way I could create a unique socket for each setup file, and hence a unique server connection for each NetLogo run.
NetLogo doesn't provide a facility for distributing model runs on a cluster, but various people have done it anyway. See:
http://ccl.northwestern.edu/netlogo/docs/faq.html#cluster
https://github.com/jurnix/netlogo-cluster
http://mass.aitia.ai/index.php/intro/meme
and past threads about it on the netlogo-users group. There is no single standard solution.
As for getting access to LabInterface.Settings: it appears to me, from looking through the NetLogo source code, that the settings object isn't actually stored anywhere. It's just handed off from method to method, ultimately to lab.Lab.run, without ever being kept. So trying to get access to the name of the setup file that way won't work.
So you'll need some other way to make the extension generate unique port numbers. It seems to me there are any number of possible solutions. At the time you generate a setup file you know its name, so you could generate a port number at the same time and include it in the experiment definition contained in the file. Or you could pass a port number in a Java system property (using -D) when you start NetLogo. Or you could generate a port number based on the process ID of the JVM process. Or you could have the extension try port 9999, and if it's already in use, try a different port number. Those are just a few ideas... I could probably come up with ten more.
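As an illustration of the last two ideas, here is a sketch of what the extension's server-startup code might do. Untested; the system-property name matlab.port is made up for this example:

import java.io.IOException;
import java.net.ServerSocket;

// Sketch: pick a port for the Matlab extension's server. Prefer a port passed
// as a Java system property (e.g. -Dmatlab.port=10017 on the NetLogo command
// line), otherwise probe upward from 9999 until a free port is found.
public class PortChooser {
    public static ServerSocket openServerSocket() throws IOException {
        int start = Integer.getInteger("matlab.port", 9999);
        for (int port = start; port < start + 100; port++) {
            try {
                return new ServerSocket(port); // free: claim it and return
            } catch (IOException inUse) {
                // occupied, likely by another NetLogo run; try the next port
            }
        }
        throw new IOException("no free port found near " + start);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket s = openServerSocket()) {
            System.out.println("listening on port " + s.getLocalPort());
        }
    }
}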