ruamel.yaml ComposerError when using alias as name

I am trying to parse the following document:
hello:
  there: &there_value 1
foo:
  *there_value: 3
This gets correctly parsed with the safe loader:
>>> from ruamel.yaml import YAML
>>> document = """
... hello:
...   there: &there_value 1
... foo:
...   *there_value: 3
... """
>>> yaml=YAML(typ="safe")
>>> yaml.load(document)
{'hello': {'there': 1}, 'foo': {1: 3}}
The round-trip (standard) loader throws an error:
>>> yaml=YAML()
>>> yaml.load(document)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "...\site-packages\ruamel\yaml\main.py", line 434, in load
return constructor.get_single_data()
File "...\site-packages\ruamel\yaml\constructor.py", line 119, in get_single_data
node = self.composer.get_single_node()
File "...\site-packages\ruamel\yaml\composer.py", line 76, in get_single_node
document = self.compose_document()
File "...\site-packages\ruamel\yaml\composer.py", line 99, in compose_document
node = self.compose_node(None, None)
File "...\site-packages\ruamel\yaml\composer.py", line 143, in compose_node
node = self.compose_mapping_node(anchor)
File "...\site-packages\ruamel\yaml\composer.py", line 223, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "...\site-packages\ruamel\yaml\composer.py", line 117, in compose_node
raise ComposerError(
ruamel.yaml.composer.ComposerError: found undefined alias 'there_value:'
in "<unicode string>", line 6, column 3:
*there_value: 3
^ (line: 6)
I am using Python 3.8.10, ruamel.yaml version 0.17.21.

As Anthon suggested in their comment, : is a valid character in an anchor name,
so the alias *there_value: refers to an anchor named there_value: which is not defined; only there_value is.
The solution is to add a space after the alias, *there_value :.
hello:
  there: &there_value 1
foo:
  *there_value : 3
This loads correctly both with round-trip and with safe.
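If the YAML comes from a source that cannot be edited by hand, the same fix could be applied as a pre-processing step before loading. This is only a sketch: space_after_alias_keys is a hypothetical helper, not part of ruamel.yaml, and it assumes that alias names never contain a colon and that each alias key starts its own line (it does not handle flow mappings or keys inside sequences).

```python
import re

# Matches a block-mapping key that is an alias immediately followed by ':'.
# Assumption: alias names contain no ':' and no whitespace.
_ALIAS_KEY = re.compile(r"^([ \t]*\*[^\s:]+):(?=\s|$)", re.MULTILINE)


def space_after_alias_keys(text):
    """Insert a space between an alias used as a key and the ':' after it."""
    return _ALIAS_KEY.sub(r"\1 :", text)


document = """\
hello:
  there: &there_value 1
foo:
  *there_value: 3
"""

# Prints the document with "*there_value : 3" as the last line.
print(space_after_alias_keys(document))
```

The returned string can then be passed to yaml.load() exactly as in the question.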

pyexcel suddenly no longer opens .xlsx (or .xls)

I have all the necessary dependencies installed:
pyexcel==0.7.0
pyexcel-ezodf==0.3.4
pyexcel-io==0.6.6
pyexcel-ods3==0.6.1
pyexcel-xls==0.7.0
(and some others, which I've omitted). Last week, my code was working. Now I am unable to open the very same .xls:
>>> p = Path("data/jr1305221.xls")
>>> p.exists()
True
>>> pyexcel.get_book(file_name=p)
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\core.py", line 47, in get_book
book_stream = sources.get_book_stream(**keywords)
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\core.py", line 36, in get_book_stream
a_source = SOURCE.get_book_source(**keywords)
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\source_plugin.py", line 85, in get_book_source
return self.get_a_plugin(
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\source_plugin.py", line 69, in get_a_plugin
source_cls = self.load_me_now(
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\source_plugin.py", line 41, in load_me_now
if source.is_my_business(action, **keywords):
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\plugins\__init__.py", line 56, in is_my_business
raise IOError("Unsupported file type")
OSError: Unsupported file type
>>>
or the .xlsx:
>>> p = Path('data/dummy.xlsx')
>>> p.exists()
True
>>> pyexcel.get_book(file_name=p)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\core.py", line 47, in get_book
book_stream = sources.get_book_stream(**keywords)
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\core.py", line 36, in get_book_stream
a_source = SOURCE.get_book_source(**keywords)
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\source_plugin.py", line 85, in get_book_source
return self.get_a_plugin(
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\source_plugin.py", line 69, in get_a_plugin
source_cls = self.load_me_now(
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\internal\source_plugin.py", line 41, in load_me_now
if source.is_my_business(action, **keywords):
File "C:\Users\User McUser\AppData\Local\Programs\Python\Python310\lib\site-packages\pyexcel\plugins\__init__.py", line 56, in is_my_business
raise IOError("Unsupported file type")
OSError: Unsupported file type
>>>
Even in my docker container, it's stopped working.
As an alternative, I have also tried
with open('data/dummy.xlsx', 'r') as f:
    pyexcel.get_book(file_name=f)
and get the same error.
I have read the docs again. I have rolled back my code to last week. What have I done to deserve this? Why has god forsaken me?
file_name needs to be a string, whereas I was passing in a pathlib.PosixPath.
That is, this raises the error:
p = Path('my_file')
pyexcel.get_book(file_name=p)
whereas this works:
pyexcel.get_book(file_name='my_file')
You can use pyexcel.get_book(file_name=p.as_posix()) to get the string representation of the PosixPath; str(p) works as well.
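The conversion can be kept in one place so that callers may pass either a string or a pathlib object. A minimal sketch using only the standard library: as_file_name is a hypothetical helper name, and the pyexcel call is shown as a comment rather than executed.

```python
import os
from pathlib import PurePosixPath


def as_file_name(path):
    """Return a plain str for a str or pathlib path argument."""
    return os.fspath(path)


p = PurePosixPath("data/dummy.xlsx")
file_name = as_file_name(p)
print(type(file_name).__name__, file_name)

# The call from the question would then become:
# pyexcel.get_book(file_name=file_name)
```

os.fspath() leaves a str unchanged, so existing string arguments keep working.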

Expected binary or unicode string, got 0.0

I'm training my first model with TensorFlow, but I keep getting this error:
Expected binary or unicode string, got 0.0
I followed the TensorFlow linear model tutorial (https://www.tensorflow.org/tutorials/wide) and applied it to my own dataset.
This is what I get:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 289, in new_func
return func(*args, **kwargs)
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 455, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 953, in _train_model
features, labels = input_fn()
File "<stdin>", line 2, in train_input_fn
File "<stdin>", line 5, in input_fn
File "<stdin>", line 5, in <dictcomp>
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 102, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 473, in make_tensor_proto
append_fn(tensor_proto, proto_values)
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 109, in SlowAppendObjectArrayToTensorProto
tensor_proto.string_val.extend([compat.as_bytes(x) for x in proto_values])
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 109, in <listcomp>
tensor_proto.string_val.extend([compat.as_bytes(x) for x in proto_values])
File "/home/nick/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 65, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got 0.0
Any suggestion?
Thanks

"InterfaceError: connection already closed" when using multiprocessing.Pool on black box function that queries PostgreSQL database

I've been given a Python (2.7) function that takes 3 strings as arguments, and returns a list of dictionaries. Due to the nature of the project, I can't alter the function, which is quite complex, calling several other non-standard Python modules and querying a PostgreSQL database using psycopg2. I think that it's the Postgres functionality that's causing me problems.
I want to use the multiprocessing module to speed up calling the function hundreds of times. I've written a "helper" function so that I can use multiprocessing.Pool.map (which passes only 1 argument to the worker function) with my function:
from function_script import function
def function_helper(args):
    return function(*args)
And my main code looks like this:
from helper_script import function_helper
from multiprocessing import Pool
argument_a = ['a0', 'a1', ..., 'a99']
argument_b = ['b0', 'b1', ..., 'b99']
argument_c = ['c0', 'c1', ..., 'c99']
input = zip(argument_a, argument_b, argument_c)
p = Pool(4)
results = p.map(function_helper, input)
print results
What I'm expecting is a list of lists of dictionaries, however I get the following errors:
Traceback (most recent call last):
File "/local/python/2.7/lib/python2.7/site-packages/variantValidator/variantValidator.py", line 898, in validator
vr.validate(input_parses)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/validator.py", line 33, in validate
return self._ivr.validate(var, strict) and self._evr.validate(var, strict)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/validator.py", line 69, in validate
(res, msg) = self._ref_is_valid(var)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/validator.py", line 89, in _ref_is_valid
var_x = self.vm.c_to_n(var) if var.type == "c" else var
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/variantmapper.py", line 223, in c_to_n
tm = self._fetch_TranscriptMapper(tx_ac=var_c.ac, alt_ac=var_c.ac, alt_aln_method="transcript")
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/decorators/lru_cache.py", line 176, in wrapper
result = user_function(*args, **kwds)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/variantmapper.py", line 372, in _fetch_TranscriptMapper
self.hdp, tx_ac=tx_ac, alt_ac=alt_ac, alt_aln_method=alt_aln_method)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/transcriptmapper.py", line 69, in __init__
self.tx_identity_info = hdp.get_tx_identity_info(self.tx_ac)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/decorators/lru_cache.py", line 176, in wrapper
result = user_function(*args, **kwds)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/dataproviders/uta.py", line 353, in get_tx_identity_info
rows = self._fetchall(self._queries['tx_identity_info'], [tx_ac])
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/dataproviders/uta.py", line 216, in _fetchall
with self._get_cursor() as cur:
File "/local/python/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/dataproviders/uta.py", line 529, in _get_cursor
cur.execute("set search_path = " + self.url.schema + ";")
File "/local/python/2.7/lib/python2.7/site-packages/psycopg2/extras.py", line 144, in execute
return super(DictCursor, self).execute(query, vars)
DatabaseError: SSL error: decryption failed or bad record mac
And:
Traceback (most recent call last):
File "/local/python/2.7/lib/python2.7/site-packages/variantValidator/variantValidator.py", line 898, in validator
vr.validate(input_parses)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/validator.py", line 33, in validate
return self._ivr.validate(var, strict) and self._evr.validate(var, strict)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/validator.py", line 69, in validate
(res, msg) = self._ref_is_valid(var)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/validator.py", line 89, in _ref_is_valid
var_x = self.vm.c_to_n(var) if var.type == "c" else var
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/variantmapper.py", line 223, in c_to_n
tm = self._fetch_TranscriptMapper(tx_ac=var_c.ac, alt_ac=var_c.ac, alt_aln_method="transcript")
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/decorators/lru_cache.py", line 176, in wrapper
result = user_function(*args, **kwds)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/variantmapper.py", line 372, in _fetch_TranscriptMapper
self.hdp, tx_ac=tx_ac, alt_ac=alt_ac, alt_aln_method=alt_aln_method)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/transcriptmapper.py", line 69, in __init__
self.tx_identity_info = hdp.get_tx_identity_info(self.tx_ac)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/decorators/lru_cache.py", line 176, in wrapper
result = user_function(*args, **kwds)
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/dataproviders/uta.py", line 353, in get_tx_identity_info
rows = self._fetchall(self._queries['tx_identity_info'], [tx_ac])
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/dataproviders/uta.py", line 216, in _fetchall
with self._get_cursor() as cur:
File "/local/python/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/local/python/2.7/lib/python2.7/site-packages/hgvs/dataproviders/uta.py", line 526, in _get_cursor
conn.autocommit = True
InterfaceError: connection already closed
Does anybody know what might cause the Pool function to behave like this, when it seems so simple to use in other examples that I've tried? If this isn't enough information to go on, can anyone advise me on a way of getting to the bottom of the problem (this is the first time I've worked with someone else's code)? Alternatively, are there any other ways that I could use the multiprocessing module to call the function hundreds of times?
Thanks
I think what may be happening is that your connection object is shared across all workers. When one worker has completed all its tasks, it closes the connection while the other workers are still working, so when one of those workers next tries to use the database, the connection is already closed.
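If that diagnosis is right, one common remedy is to give every worker process its own connection by creating it in a Pool initializer. The following is a sketch in Python 3 rather than the asker's 2.7: FakeConnection is a hypothetical stand-in for a real psycopg2 connection, and function_helper here only imitates the black-box function. Whether the real function can be made to open its connection inside the worker depends on code not shown in the question.

```python
import os
from multiprocessing import Pool

_conn = None  # set once per worker process by init_worker()


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self):
        # Remember which process opened this connection.
        self.owner_pid = os.getpid()


def init_worker():
    """Open a connection inside the worker, after the process has started."""
    global _conn
    _conn = FakeConnection()


def function_helper(args):
    """Imitates the black-box function: returns a list of dictionaries."""
    a, b, c = args
    if _conn is None or _conn.owner_pid != os.getpid():
        raise RuntimeError("connection was not opened in this process")
    return [{"a": a, "b": b, "c": c, "pid": _conn.owner_pid}]


def run_pool(inputs, processes=4):
    """Map function_helper over inputs with one connection per worker."""
    with Pool(processes, initializer=init_worker) as pool:
        return pool.map(function_helper, inputs)
```

run_pool(list(zip(argument_a, argument_b, argument_c))) would be called from under an if __name__ == "__main__": guard. For a function that cannot be altered, deferring its import until init_worker runs may have the same effect, but that is an assumption about how the module opens its connection.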

using boto3 in a python3 virtual env in AWS Lambda

I am trying to use Python 3.4 and boto3 to walk an S3 bucket and publish some file locations to an RDS instance. The part I am having trouble with is using boto3. My lambda function looks like the following:
import subprocess
def lambda_handler(event, context):
    args = ("venv/bin/python3.4", "run.py")
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    popen.wait()
    output = popen.stdout.read()
    print(output)
and, in my run.py file I have some lines:
import boto3
s3c = boto3.client('s3')
which cause an exception. The rest of run.py is not relevant to this question, so to keep this post concise: I've found that the error can be reproduced by executing the following lambda function:
import subprocess
def lambda_handler(event, context):
    args = ("python3.4", "-c", "import boto3; print(boto3.client('s3'))")
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    popen.wait()
    output = popen.stdout.read()
    print(output)
My logstream reports the error:
Event Data
START RequestId: 2b65421a-664d-11e6-81db-974c7c09d283 Version: $LATEST
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/var/runtime/boto3/__init__.py", line 79, in client
return _get_default_session().client(*args, **kwargs)
File "/var/runtime/boto3/session.py", line 250, in client
aws_session_token=aws_session_token, config=config)
File "/var/runtime/botocore/session.py", line 818, in create_client
client_config=config, api_version=api_version)
File "/var/runtime/botocore/client.py", line 63, in create_client
cls = self._create_client_class(service_name, service_model)
File "/var/runtime/botocore/client.py", line 85, in _create_client_class
base_classes=bases)
File "/var/runtime/botocore/hooks.py", line 227, in emit
return self._emit(event_name, kwargs)
File "/var/runtime/botocore/hooks.py", line 210, in _emit
response = handler(**kwargs)
File "/var/runtime/boto3/utils.py", line 61, in _handler
module = import_module(module)
File "/var/runtime/boto3/utils.py", line 52, in import_module
__import__(name)
File "/var/runtime/boto3/s3/inject.py", line 13, in <module>
from boto3.s3.transfer import S3Transfer
File "/var/runtime/boto3/s3/transfer.py", line 135, in <module>
from concurrent import futures
File "/var/runtime/concurrent/futures/__init__.py", line 8, in <module>
from concurrent.futures._base import (FIRST_COMPLETED,
File "/var/runtime/concurrent/futures/_base.py", line 357
raise type(self._exception), self._exception, self._traceback
^
SyntaxError: invalid syntax
END RequestId: 2b65421a-664d-11e6-81db-974c7c09d283
REPORT RequestId: 2b65421a-664d-11e6-81db-974c7c09d283 Duration: 2673.45 ms Billed Duration: 2700 ms Memory Size: 1024 MB Max Memory Used: 61 MB
I need to use boto3 downstream of run.py. Any ideas on how to resolve this are much appreciated. Thanks!

How can I modify/merge Jinja2 dictionaries?

I have a Jinja2 dictionary and I want a single expression that modifies it - either by changing its content, or merging with another dictionary.
>>> import jinja2
>>> e = jinja2.Environment()
Modify a dict: Fails.
>>> e.from_string("{{ x[4]=5 }}").render({'x':{1:2,2:3}})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "jinja2/environment.py", line 743, in from_string
return cls.from_code(self, self.compile(source), globals, None)
File "jinja2/environment.py", line 469, in compile
self.handle_exception(exc_info, source_hint=source)
File "<unknown>", line 1, in template
jinja2.exceptions.TemplateSyntaxError: expected token 'end of print statement', got '='
Two-stage update: Prints superfluous "None".
>>> e.from_string("{{ x.update({4:5}) }} {{ x }}").render({'x':{1:2,2:3}})
u'None {1: 2, 2: 3, 4: 5}'
Concatenate items(): Fails.
>>> e.from_string("{{ dict(x.items()+ {3:4}.items()) }}").render({'x':{1:2,2:3}})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "jinja2/environment.py", line 868, in render
return self.environment.handle_exception(exc_info, True)
File "<template>", line 1, in top-level template code
TypeError: <lambda>() takes exactly 0 arguments (1 given)
Use dict(x,**y): Fails.
>>> e.from_string("{{ dict((3,4), **x) }}").render({'x':{1:2,2:3}})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "jinja2/environment.py", line 868, in render
return self.environment.handle_exception(exc_info, True)
File "<template>", line 1, in top-level template code
TypeError: call() keywords must be strings
So how does one modify the dictionary x in Jinja2 by changing an attribute or merging with another dictionary?
This question is similar to: How can I merge two Python dictionaries as a single expression? -- insofar as Jinja2 and Python are analogous.
I found another solution that needs no extension:
{% set _dummy = x.update({4:5}) %}
This updates x in place. _dummy only absorbs the None returned by update(); don't use it.
Sounds like the Jinja2 "do" statement extension may help. Enabling this extension would allow you to rewrite:
{{ x.update({4:5}) }} {{ x }}
as
{% do x.update({4:5}) %} {{ x }}
Example:
>>> import jinja2
>>> e = jinja2.Environment(extensions=["jinja2.ext.do",])
>>> e.from_string("{% do x.update({4:5}) %} {{ x }}").render({'x':{1:2,2:3}})
u' {1: 2, 2: 3, 4: 5}'
>>>
I added a filter to merge dictionaries, namely:
>>> def add_to_dict(x,y): return dict(x, **y)
>>> e.filters['add_to_dict'] = add_to_dict
>>> e.from_string("{{ x|add_to_dict({4:5}) }}").render({'x':{1:2,2:3}})
u'{1: 2, 2: 3, 4: 5}'
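Note that the output above comes from Python 2. On Python 3, dict(x, **y) raises a TypeError when y has non-string keys such as 4, so the filter body needs a different merge. A minimal sketch of a variant that works for any hashable keys (it keeps the add_to_dict name from the answer above; the registration line is shown as a comment because it needs the environment e):

```python
def add_to_dict(x, y):
    """Return a new dict with the items of x, overridden by those of y."""
    merged = dict(x)   # copy, so the template variable is not mutated
    merged.update(y)
    return merged


print(add_to_dict({1: 2, 2: 3}, {4: 5}))

# Registered exactly as before:
# e.filters['add_to_dict'] = add_to_dict
```

Unlike the update() approaches, this leaves x itself unchanged, which may or may not be what the template needs.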