I mostly use R and Python for data analysis.
How do I use R, Python, or another general-purpose language to generate a .dzn file for a MiniZinc model?
Although .dzn files are the traditional data-entry files for MiniZinc models, newer versions of the MiniZinc driver can also read JSON, and both R and Python can easily generate JSON from data. Here is an example of generating JSON data for the following model, model.mzn:
int: n;
array[1..n] of bool: arr;
float: f;
Exporting simple Python data:
import json

data = {
    "n": 4,
    "arr": [True, False, False, True],
    "f": 2.75,
}

with open('data.json', 'w') as outfile:
    json.dump(data, outfile)
This Python script creates a file data.json containing the data. MiniZinc can then use the generated file directly:
minizinc --solver gecode model.mzn data.json
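If you want to drive the whole workflow from Python, you can also launch that same command from the script. This is a minimal sketch using only the standard library; the model file, solver name, and data file are the ones from the example above:

import subprocess

# run the same command as above and capture the solver's output
result = subprocess.run(
    ["minizinc", "--solver", "gecode", "model.mzn", "data.json"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)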
The pymzn package supports converting Python dicts to .dzn files: http://paolodragone.com/pymzn/reference/dzn/index.html
An example might look something like this:
from pymzn import dzn

data = {
    "n": 4,
    "arr": [True, False, False, True],
    "f": 2.75,
}

with open("example.dzn", "w") as f:
    f.write("\n".join(dzn.dict2dzn(data)))
The contents of example.dzn will then be:
n = 4;
arr = array1d(1..4, [true, false, false, true]);
f = 2.75;
I have a large csv log file. Here is a simplified sample:
ts,a.b.c,a.b.d,a.b.e,b.c,b.d,c.d.e,c.d.f,c.g
2021-03-29 06:38:39,1.0000,2,3,28.20,1,2,3,4
2021-03-29 06:38:40,1.0000,2,3,28.20,1,2,3,0.000000
I am using MATLAB's Import Data tool to import it but, unfortunately, the tool removes all dots from the header and imports the variables as abc, abd, abe, etc.
What is an efficient way to import a csv like the one above as structs?
I am looking for a way to have the data imported as structs (a, b and c for this particular log file), so that I can easily access variables as a.b.c or c.d.f.
Here is what I came up with, by simply using readtable.
function res = log_import(logfile)
    log_table = readtable(logfile);
    res = [];
    for i = 1:width(log_table)
        % readtable keeps the original header (e.g. 'a.b.c') in VariableDescriptions
        str_path = log_table.Properties.VariableDescriptions{i};
        fields = strsplit(str_path, '.');
        % build the nested struct path and assign the column's data
        res = setfield(res, fields{1:end}, log_table{:, i});
    end
end
I'm using Python 3.7.2.
The code that confuses me looks like this:
def foo(cond):
    if cond:
        z = 1
        return z
    else:
        loc = locals()
        x = 1
        locals()['y'] = 2
        exec("z = x + y")
        print(loc)
        print(locals())

foo(False)
Here is the result it prints:
{'cond': False, 'loc': {...}, 'x': 1, 'y': 2, 'z': 3}
{'cond': False, 'loc': {...}, 'x': 1, 'y': 2}
The exec function changes the local namespace, but when I call locals() afterwards, the variable z has disappeared.
However, if I change the code like this:
def foo(cond):
    if cond:
        return 1
    else:
        loc = locals()
        x = 1
        locals()['y'] = 2
        exec("z = x + y")
        print(loc)
        print(locals())

foo(False)
The result will be:
{'cond': False, 'loc': {...}, 'x': 1, 'y': 2, 'z': 3}
{'cond': False, 'loc': {...}, 'x': 1, 'y': 2, 'z': 3}
I'm really confused. :(
In short: the dictionary returned by locals() is only a copy of the true local variables—which are stored in an array, not in a dictionary. You can manipulate the locals() dictionary, add or remove entries at will. Python does not care since that is merely a copy of the local variables it works with. However, each time you call locals(), Python copies the current values of all local variables into the dictionary, replacing anything else you or exec() put in before. So, if z is a proper local variable (as in the first, but not second version of your code), locals() will restore its current value, which happens to be "not set" or "undefined".
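A small demonstration of this copy-and-refresh behaviour (this relies on the CPython semantics described in this answer, as they apply to the Python version you are using):

def demo():
    a = 1
    d = locals()          # d is the cached dict: {'a': 1}
    d['a'] = 99           # changes only the copy
    print(a)              # -> 1: the real local in the fast array is untouched
    print(locals()['a'])  # -> 1: the refresh overwrites our 99 again

demo()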
Now for the longer story: conceptually, Python stores local variables in a dictionary, which can be retrieved through the function locals(). Internally, however, there really are two arrays. The names of the locals are a property of the code itself and are therefore stored in the code object as code.co_varnames. Their values, however, are stored in the frame in an array of "fast" locals (which is inaccessible from Python code). You can find more on co_varnames and friends in the inspect and dis modules.
The built-in function locals() actually calls a method fastToLocals() on the current frame object. If we were to write the frame with its fastToLocals method in Python, it might look like this (I am leaving out a lot of details here):
import sys

class frame:
    def __init__(self, code):
        self.__locals_dict = None
        self.f_code = code
        self.__fast = [0] * len(code.co_varnames)

    def fastToLocals(self):
        if self.__locals_dict is None:
            self.__locals_dict = {}
        for (key, value) in zip(self.f_code.co_varnames, self.__fast):
            if value is not null:             # Remember: it's actually C-code
                self.__locals_dict[key] = value
            elif key in self.__locals_dict:
                del self.__locals_dict[key]   # <- Here's the magic
        return self.__locals_dict

def locals():
    frame = sys._getframe(1)
    return frame.fastToLocals()
In plain English: the dictionary you get when calling locals() is cached in the current frame. Calling locals() repeatedly will give you the same dictionary (__locals_dict in the code above). However, each time you call locals(), the frame updates the entries in this dictionary with the values the local variables have at that time. As noted in the code above, when a local variable is not set, its entry is deleted from __locals_dict. And that's what this is all about.
In your first version of the code, the third line says z = 1, which makes z a local variable. In the else branch, however, that local variable z is not set (reading it would raise an UnboundLocalError), and it is therefore removed from __locals_dict. In the second version of your code, there is no assignment to z, so it is not a local variable and the locals() function does not care about it.
The set of local variables is actually fixed by the compiler. This means that you cannot add or remove local variables at runtime. This is a problem for exec(), as you are clearly using exec() here to define a local variable z. Python's way out is to let exec() store z as a local variable in the __locals_dict dictionary, although it cannot put it into the array behind this dictionary.
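As an aside: if you actually need the value that exec() computes, the reliable way is to pass your own dictionary as the local namespace, instead of fishing it out of locals() afterwards. A small sketch (the names are only for illustration):

def compute():
    ns = {'x': 1, 'y': 2}             # explicit namespace for exec()
    exec("z = x + y", globals(), ns)  # z ends up in ns, not in the fast array
    return ns['z']

print(compute())  # -> 3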
Conclusion: the values of local variables are actually stored in an array, not in the dictionary returned by locals(). When calling locals(), the dictionary is updated with the true current values taken from the fast array. exec(), on the other hand, can only store its local variables in the dictionary, not in the array. If z is a proper local variable, it will be overwritten by a call to locals() with its current value (which is "does not exist"). If z is not a proper local variable, it will stay in the dictionary untouched.
Python's rules say that any variable you assign to inside a function is a local variable. You can change this default with global or nonlocal, of course. But as soon as you write z = ..., z becomes a local variable. If, on the other hand, your code only contains x = z, the compiler assumes that z is a global variable. That's why the line z = 1 makes all the difference: it marks z as a local variable that receives its slot in the fast array.
Concerning exec(): in general, there is no way the compiler could know what code exec() is going to execute (in your case, with a string literal, it could, but since that is a rare and uninteresting case, it never tries). The compiler therefore cannot know what local (or global) variables the code in exec() might access, and cannot include them in its calculation of how large the array of local variables should be.
By the way: the fact that local variables are managed in arrays instead of proper dictionaries is the reason why an UnboundLocalError might be raised instead of a NameError. In the case of local variables, the Python interpreter recognises the name and knows exactly where its value is to be found (inside the fast array mentioned above). But if that entry is null, it cannot return anything meaningful and therefore raises an UnboundLocalError. For global names, however, Python really goes searching for a variable with the given name in the globals and built-ins dictionaries. In that case, a variable with the requested name might truly not exist.
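You can see the difference with two tiny functions (hypothetical names, purely for illustration):

def f():
    print(a)  # 'a' is never assigned in f, so the compiler treats this as a
              # global lookup -> NameError if no global 'a' exists

def g():
    print(b)  # 'b' IS assigned below, so it has a slot in the fast array,
    b = 1     # but that slot is still empty here -> UnboundLocalError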
Compare your code with the following:
def foo(cond):
    if cond:
        return 1
    else:
        loc = locals().copy()
        # ——————————————^
        x = 1
        locals()['y'] = 2
        exec("z = x + y")
        print(loc)
        print(locals())

foo(False)
It produces
{'cond': False}
{'cond': False, 'loc': {'cond': False}, 'x': 1, 'y': 2, 'z': 3}
as you might expect. The reason is that locals() returns a reference to the cached dictionary, not a snapshot of its contents; .copy() takes that snapshot.
I am using the Ruamel Python library to programmatically edit human-edited YAML files. The source files have keys that are sorted alphabetically.
I'm not sure if this is a basic Python question, or a Ruamel question, but all methods I have tried to sort Ruamel's OrderedDict structure are failing for me.
I am quite confused, for instance, as to why the following code, based on this recipe, isn't working:
import ruamel.yaml
import collections

def read_file(f):
    with open(f, 'r') as _f:
        return ruamel.yaml.round_trip_load(
            _f.read(),
            preserve_quotes=True
        )

def write_file(f, data):
    with open(f, 'w') as _f:
        _f.write(ruamel.yaml.dump(
            data,
            Dumper=ruamel.yaml.RoundTripDumper,
            explicit_start=True,
            width=1024
        ))

data = read_file('in.yaml')
data = collections.OrderedDict(sorted(data.items(), key=lambda t: t[0]))
write_file('out.yaml', data)
But given this input file:
---
bananas: 1
apples: 2
The following output file is produced:
--- !!omap
- apples: 2
- bananas: 1
I.e. it's turned my file into a YAML ordered map.
Is there an easy way to do this? Also, can I simply insert into the data structure somehow?
If you round-trip a mapping in ruamel.yaml¹, it doesn't get represented as a collections.OrderedDict(); it gets represented as a ruamel.yaml.comments.CommentedMap(). The latter can be a subclass of collections.OrderedDict(), depending on which version of Python you are working with (e.g. in Python 2 it uses the much faster C implementation from ruamel.ordereddict).
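You can verify this quickly (a minimal check using the same round_trip_load API as in the question):

import ruamel.yaml

data = ruamel.yaml.round_trip_load("bananas: 1\napples: 2\n")
print(type(data))  # <class 'ruamel.yaml.comments.CommentedMap'>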
The representer doesn't automatically interpret "normal" ordered dictionaries (whether from collections or ruamel.ordereddict) as special in round_trip_dump mode. But if you drop collections and build a CommentedMap instead:
import ruamel.yaml

def read_file(f):
    with open(f, 'r') as _f:
        return ruamel.yaml.round_trip_load(
            _f.read(),
            preserve_quotes=True
        )

def write_file(f, data):
    with open(f, 'w') as _f:
        ruamel.yaml.dump(
            data,
            stream=_f,
            Dumper=ruamel.yaml.RoundTripDumper,
            explicit_start=True,
            width=1024
        )

data = read_file('in.yaml')
data = ruamel.yaml.comments.CommentedMap(sorted(data.items(), key=lambda t: t[0]))
write_file('out.yaml', data)
your out.yaml will be:
---
apples: 2
bananas: 1
Please note that I also removed an inefficiency in your write_file routine. If you don't specify a stream, all data is first streamed to a StringIO instance (in memory) and then returned as a string, which you then wrote to the file with _f.write(); it is much more efficient to write directly to the stream.
As for your final question: yes, you can insert using:
data.insert(1, 'apricot', 3)
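Assuming the sorted mapping from above is loaded again, this inserts apricot at position 1, so dumping it should then give:
---
apples: 2
apricot: 3
bananas: 1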
¹ Disclaimer: I am the author of both ruamel.yaml as well as ruamel.ordereddict.
Elixir has been my go-to language for the past 18 months or so; however, I sometimes find there is a tension between the "no magic" mantra (especially cited with reference to Phoenix vs Rails) and the use of macros.
While I now miss macros when I'm using languages without them, I still wish it was easier to see what they are actually doing. Some part of me always wants to pull back the DSL curtain and see the real code.
Is there a simple way to expand macros and see the code they generate (perhaps via IEx), so that I don't have to dig through layers of defmacro trying to piece it together in my head?
You can expand a macro with Macro.expand/2
iex> Macro.expand((quote do: (if true, do: 1)), __ENV__)
{:case, [optimize_boolean: true],
[true,
[do: [{:->, [],
[[{:when, [],
[{:x, [counter: 6], Kernel},
{:in, [context: Kernel, import: Kernel],
[{:x, [counter: 6], Kernel}, [false, nil]]}]}], nil]},
{:->, [], [[{:_, [], Kernel}], 1]}]]]}
You can then use Macro.to_string/1 to get the output as a string instead of an AST:
iex> Macro.expand((quote do: (if true, do: 1)), __ENV__) |> Macro.to_string()
"case(true) do\n x when x in [false, nil] ->\n nil\n _ ->\n 1\nend"
Finally, you can use IO.puts/1 to print the string to the terminal:
iex> Macro.expand((quote do: (if true, do: 1)), __ENV__) |> Macro.to_string() |> IO.puts()
case(true) do
x when x in [false, nil] ->
nil
_ ->
1
end
Try this trick from Chris McCord:
your_ast |> Macro.expand(__ENV__) |> Macro.to_string |> IO.puts
I have been looking at various discussions here on SO and other places, and the general consensus seems to be that if one is returning multiple dissimilar data structures from an R function, they are best returned as a list(a, b) and then accessed by index. Except, when using an R function via PL/R inside a Perl program, the R list function flattens the list and also stringifies even the numbers. For example:
my $res = $sth->fetchrow_arrayref;
# now, $res is a single, flattened, stringified list
# even though the R function was supposed to return
# list([1, "foo", 3], [2, "bar"])
#
# instead, $res looks like c("1", "\"foo\"", "3", "2", "\"bar\"")
# or some such nonsense
Using a data.frame doesn't work because the two arrays being returned are not symmetrical, and the function croaks.
So, how do I return a single data structure from an R function that is made up of an arbitrary set of nested data structures, and still be able to access each individual bundle from Perl as simply $res->[0], $res->[1] or $res->{'employees'}, $res->{'pets'}? Update: I am looking for an R equivalent of Perl's [[1, "foo", 3], [2, "bar"]] or even [[1, "foo", 3], {a => 2, b => "bar"}].
Addendum: the main thrust of my question is how to return multiple dissimilar data structures from a PL/R function. The stringification noted above is secondary, but still problematic, because I convert the data to JSON, and all those extra quotes just add useless data transferred between the server and the user.
I think you have a few problems here. The first is that you can't just return an array in this case, because it won't pass PostgreSQL's array checks (arrays must be symmetric, all of the same type, etc.). Remember that if you are calling PL/R from Perl across a query interface, PostgreSQL's type constraints are going to be an issue.
You have a couple of options.
You could return setof text[], with one data structure per row.
You could return some sort of structured data using structures PostgreSQL understands, like:
CREATE TYPE ab AS (
    a text,
    b text
);

CREATE TYPE r_retval AS (
    labels text[],
    my_ab  ab
);
This would allow you to return something like:
{labels => [1, "foo", 3], my_ab => {a => 'foo', b => 'bar'} }
But at any rate you have to put it into a data structure that the PostgreSQL planner can understand and that is what I think is missing in your example.