Using ruamel.yaml to print out a list with individual elements singlequoted? - ruamel.yaml

I have a dictionary with a few lists, each containing a number of strings.
Example List:
hosts = ['199.168.1.100:1000', '199.168.1.101:1000']
When I try to print this out using ruamel.yaml, the elements show up as
hosts:
- 199.168.1.100:1000
- 199.168.1.101:1000
I want the results to be
hosts:
- '199.168.1.100:1000'
- '199.168.1.101:1000'
So I traversed the list and created a new list with each element being a ruamel SingleQuotedScalarString:
S = ruamel.yaml.scalarstring.SingleQuotedScalarString
new_list = []
for e in hosts:
    new_list.append(S(e))
hosts = new_list
When I print this out, I still end up printing the "hosts" list without any quotes. What am I doing wrong here?

In the following I assume that by printing you mean dumping to YAML.
Your approach is in principle correct, as using the "global"
yaml.default_style = "'"
would also get the key hosts quoted, and that is not what you want.
Maybe you are not reassigning hosts into the actual data structure that
you are dumping, because hosts is just the value of a key-value pair in
that structure.
The following:
import sys
import ruamel.yaml
S = ruamel.yaml.scalarstring.SingleQuotedScalarString
yaml = ruamel.yaml.YAML()
data = dict(hosts = [S(x) for x in ['199.168.1.100:1000', '199.168.1.101:1000']])
yaml.dump(data, sys.stdout)
will give what you want without problem:
hosts:
- '199.168.1.100:1000'
- '199.168.1.101:1000'

Related

PySpark Set variable that's the name of current loop value

I want to spin through a config file in a loop, assigning any options I find to a variable of the same name in my notebook, so the code is shorter and I don't have a load of try/else steps. I initialize default options at the start, and if I find a config option with the same name it updates the default.
Config file
[file_options]
cfgfilename = 'newfilename.csv'
Notebook
import configparser
# default options
cfgfilename = 'dummy.csv'
otheroptions_etc = 'hi'
config = configparser.ConfigParser()
config.read({config file path here})
if config.has_section('file_options'):
    for option in config.options('file_options'):
        {something here to set cfgfilename} = config.get('file_options', option)
print (cfgfilename) # and so it comes out as newfilename.csv not dummy.csv
Thanks samKart, this is how I solved it using a dictionary:
# create dictionary with the dummy value pairs
dict_values = {'filename': 'dummy.csv',
               'otheroptions_etc': 'hi'}
# check it has a file_options section
if config.has_section('file_options'):
    # loop through the options in the section
    for option in config.options('file_options'):
        # if that option is in the dictionary...
        if option in dict_values:
            # ... update it
            dict_values[option] = config.get('file_options', option)
# finally set all the variables to the updated dictionary values
filename = dict_values['filename']
otheroptions_etc = dict_values['otheroptions_etc']
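For completeness, the dictionary-driven approach above can be sketched end-to-end; the config text is inlined here for illustration, and the option names are taken from the question:

```python
# Sketch of the dictionary-driven defaults approach: defaults live in a
# dict, and any matching option found in the config section overrides them.
import configparser

# default options
dict_values = {'filename': 'dummy.csv',
               'otheroptions_etc': 'hi'}

config = configparser.ConfigParser()
config.read_string("""
[file_options]
filename = newfilename.csv
""")

if config.has_section('file_options'):
    for option in config.options('file_options'):
        # configparser lowercases option names by default,
        # so keep the dictionary keys lowercase too
        if option in dict_values:
            dict_values[option] = config.get('file_options', option)

filename = dict_values['filename']
print(filename)  # newfilename.csv
```

Note that configparser lowercases option names on read, which is one reason the dictionary keys should be lowercase.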

Powershell filtering one list out of another list

<updated: added information from Santiago Squarzon's suggestion>
I have two lists, I pull them from csv but there is only one column in each of the two lists.
Here is how I pull in the lists in my script
$orginal_list = Get-Content -Path .\random-word-350k-wo-quotes.txt
$filter_words = Get-Content -Path .\no_go_words.txt
However, I will use a typed list for simplicity in the code example below.
In this example, the $original_list can have some words repeated.
I want to filter out all of the words in $original_list that are in the $filter_words list.
Then add the filtered list to the variable $filtered_list.
In this example, $filtered_list would only have "dirt","turtle" in it.
I know the line I have below where I subtract the two won't work; it's there as a placeholder, as I don't know what to use to get the result.
Of note, the csv file that feeds $original_list could have 300,000 or more rows, and $filter_words could have hundreds of rows, so I would want this to be as efficient as possible.
The filtering is case insensitive.
$orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
$filter_words = "yellow","blue","green","harsh"
$filtered_list = $orginal_list - $filter_words
$filtered_list
dirt
turtle
Use System.Collections.Generic.HashSet`1 and its .ExceptWith() method:
# Note: if possible, declare the lists as [string[]] arrays to begin with.
# Otherwise, use a [string[]] cast in the method calls below, which,
# however, creates a duplicate array on the fly.
[string[]] $orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
[string[]] $filter_words = "yellow","blue","green","harsh"
# Create a hash set based on the strings in $orginal_list,
# with case-insensitive lookups.
$hsOrig = [System.Collections.Generic.HashSet[string]]::new(
    $orginal_list,
    [System.StringComparer]::CurrentCultureIgnoreCase
)
# Reduce it to those strings not present in $filter_words, in-place.
$hsOrig.ExceptWith($filter_words)
# Convert the filtered hash set to an array.
[string[]] $filtered_list = [string[]]::new($hsOrig.Count)
$hsOrig.CopyTo($filtered_list)
# Output the result
$filtered_list
The above yields:
dirt
turtle
To also speed up reading your input files, use the following:
# Note: [System.IO.File]::ReadAllLines() returns a [string[]] instance.
$orginal_list = [System.IO.File]::ReadAllLines((Convert-Path .\random-word-350k-wo-quotes.txt))
$filter_words = [System.IO.File]::ReadAllLines((Convert-Path .\no_go_words.txt))
Note:
.NET generally defaults to (BOM-less) UTF-8; pass a [System.Text.Encoding] instance as a second argument, if needed.
.NET's working directory usually differs from PowerShell's, so using full paths is always advisable in .NET API calls, and that is what the Convert-Path calls ensure.
I have found that using LINQ to filter one list out of another is incredibly easy and incredibly fast (especially for large lists).
# ARRAY OF 1000 STRINGS LOWERCASE (item1 - item1000)
[string[]]$ThousandItems = 1..1000 | %{"item$_"};
# ARRAY OF 100 STRINGS UPPERCASE (ITEM901 - ITEM1000)
[string[]]$HundredItems = 901..1000 | %{"ITEM$_"};
# SUBTRACT THE SECOND ARRAY FROM THE FIRST ONE (CASE INSENSITIVELY)
[string[]]$NineHundred = [Linq.Enumerable]::Except($ThousandItems, $HundredItems, [System.StringComparer]::OrdinalIgnoreCase);
$NineHundred;
Which returns the list of 1000 items minus Item901-Item1000
item1
item2
...
item899
item900
As for speed, removing 100 items from a list...
1,000 Items = 1ms
10,000 Items = 2ms
100,000 Items = 12ms
1,000,000 Items = 259ms
10,000,000 Items = 3,008ms
Note: These times are just on the [Linq.Enumerable]::Except() line. So it's just measuring the time taken to subtract one array from the other. It does not measure the time taken to fill the array.
So to apply this to the original poster's example
$original_list = [System.IO.File]::ReadAllLines((Convert-Path .\random-word-350k-wo-quotes.txt));
$filter_words = [System.IO.File]::ReadAllLines((Convert-Path .\no_go_words.txt));
[string[]]$filtered_list = [Linq.Enumerable]::Except($original_list,$filter_words,[System.StringComparer]::OrdinalIgnoreCase);
For this, I literally inserted 350K strings (the MD5 hashes of the numbers 1 to 350K) into the original list (uppercase), inserted 10K strings (the MD5 hashes of the numbers 1 to 10K) into the filter words list (lowercase), and ran that code.
There were 340K words in the filtered list, and it took only 260ms to read both files, filter, and return the list.
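The same case-insensitive set difference can be sketched in Python for comparison (using the word lists from the earlier example); this mirrors HashSet.ExceptWith / Linq.Enumerable.Except with an ignore-case comparer:

```python
# Case-insensitive set difference: drop every word from original_list
# that appears (in any casing) in filter_words, keeping one copy of
# each surviving word, in order of first appearance.
original_list = ["yellow", "blue", "yellow", "dirt", "blue", "yellow", "turtle", "dirt"]
filter_words = ["yellow", "blue", "green", "harsh"]

filter_set = {w.casefold() for w in filter_words}
seen = set()
filtered_list = []
for word in original_list:
    key = word.casefold()
    if key not in filter_set and key not in seen:
        seen.add(key)
        filtered_list.append(word)

print(filtered_list)  # ['dirt', 'turtle']
```

As with the hash set approach, membership tests against `filter_set` are O(1), so the whole pass is linear in the size of the input list.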

how to explicitly write two references in ruamel.yaml

If I have multiple references and I write them to a YAML file using ruamel.yaml from Python, I get:
<<: [*name-name, *help-name]
but instead I would prefer to have
<<: *name-name
<<: *help-name
Is there an option to achieve this while writing to the file?
UPDATE
descriptions:
  - &description-one-ref
    description: >
helptexts:
  - &help-one
    help_text: |
questions:
  - &question-one
    title: "title test"
    reference: "question-one-ref"
    field: "ChoiceField"
    choices:
      - "Yes"
      - "No"
    required: true
    <<: *description-one-ref
    <<: *help-one
    riskvalue_max: 10
    calculations:
      - conditions:
          - comparator: "equal"
            value: "Yes"
        actions:
          - riskvalue: 0
      - conditions:
          - comparator: "equal"
            value: "No"
        actions:
          - riskvalue: 10
Currently I'm reading such a file, modifying specific values within Python, and then want to write it back. When writing, I get the issue that the references are written as a list and not as outlined above.
The workflow is: I'm reading the doc via
yaml = ruamel.yaml.YAML()
with open('test.yaml') as f:
    data = yaml.load(f)

for k in data.keys():
    if k == 'questions':
        q = data.get(k)
        for i in range(0, len(q)):
            q[i]['title'] = "my new title"

with open('new_file.yaml', 'w') as g:
    yaml.dump(data, g)
No, there is no such option, as it would lead to an invalid YAML file.
The << is a mapping key, for which the value is interpreted
specially, assuming the parser implements the language-independent
merge key specification. And a mapping key must be unique
according to the YAML specification:
The content of a mapping node is an unordered set of key: value node
pairs, with the restriction that each of the keys is unique.
That ruamel.yaml (< 0.15.75) doesn't throw an error on such a
duplicate key is a bug. On duplicate normal keys, ruamel.yaml
does throw an error. The bug is inherited from PyYAML (which is not
specification conformant, and does not throw an error even on
duplicate normal keys).
However, with a little pre- and post-processing, what you want to do can
easily be achieved. The trick is to make the YAML valid before parsing
by making the offending duplicate << keys unique (but recognisable)
and then, when writing the YAML back to file, substituting these
unique keys with <<: * again. In the following, the first occurrence of
<<: * is replaced by [<<, 0]:, the second by [<<, 1]:, etc.
The * needs to be part of the substitution, as there are no anchors in
the document for those aliases.
import sys
import subprocess
import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.indent(sequence=4, offset=2)

class DoubleMergeKeyEnabler(object):
    def __init__(self):
        self.pat = '<<: '  # could be at the root level mapping, so no leading space
        self.r_pat = '[<<, {}]: '  # probably not using sequences as keys
        self.pat_nr = -1

    def convert(self, doc):
        while self.pat in doc:
            self.pat_nr += 1
            doc = doc.replace(self.pat, self.r_pat.format(self.pat_nr), 1)
        return doc

    def revert(self, doc):
        while self.pat_nr >= 0:
            doc = doc.replace(self.r_pat.format(self.pat_nr), self.pat, 1)
            self.pat_nr -= 1
        return doc

dmke = DoubleMergeKeyEnabler()

with open('test.yaml') as fp:
    # we don't do this line by line, that would not work well on flow style mappings
    orgdoc = fp.read()
doc = dmke.convert(orgdoc)

data = yaml.load(doc)
data['questions'][0].anchor.always_dump = True
#######################################
# >>>> do your thing on data here <<< #
#######################################
with open('output.yaml', 'w') as fp:
    yaml.dump(data, fp, transform=dmke.revert)

res = subprocess.check_output(['diff', '-u', 'test.yaml', 'output.yaml']).decode('utf-8')
print('diff says:', res)
which gives:
diff says:
which means the files are the same on round-trip (as long as you don't
change anything before dumping).
Setting preserve_quotes and calling indent() on the YAML instance are necessary to
preserve your superfluous quotes and to keep the indentation, respectively.
Since the anchor question-one has no alias, you need to enable dumping it explicitly by
setting always_dump on that attribute to True. If necessary, you can recursively
walk over data and set anchor.always_dump = True wherever .anchor.value is not None.
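Such a recursive walk can be sketched as follows. This is a hypothetical helper, not part of ruamel.yaml; it relies only on the fact that ruamel's round-trip nodes (CommentedMap/CommentedSeq and anchored scalars) expose an .anchor attribute with .value and .always_dump:

```python
# Hypothetical helper: walk a structure loaded by ruamel.yaml and set
# anchor.always_dump = True on every node that actually carries an anchor.
# Plain strings/ints without an .anchor attribute are skipped via getattr.
def force_anchor_dump(node):
    anchor = getattr(node, 'anchor', None)
    if anchor is not None and getattr(anchor, 'value', None) is not None:
        anchor.always_dump = True
    if isinstance(node, dict):
        for value in node.values():
            force_anchor_dump(value)
    elif isinstance(node, list):
        for item in node:
            force_anchor_dump(item)
```

Calling `force_anchor_dump(data)` after loading would then mark every anchored node for dumping, without having to know paths like `data['questions'][0]` in advance.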

Convert Ansible variable from Unicode to ASCII

I'm getting the output of a command on the remote system and storing it in a variable. It is then used to fill in a file template which gets placed on the system.
- name: Retrieve Initiator Name
command: /usr/sbin/iscsi-iname
register: iscsiname
- name: Setup InitiatorName File
template: src=initiatorname.iscsi.template dest=/etc/iscsi/initiatorname.iscsi
The initiatorname.iscsi.template file contains:
InitiatorName={{ iscsiname.stdout_lines }}
When I run it however, I get a file with the following:
InitiatorName=[u'iqn.2005-03.org.open-iscsi:2bb08ec8f94']
What I want:
InitiatorName=iqn.2005-03.org.open-iscsi:2bb08ec8f94
What am I doing wrong?
I realize I could write this to the file with an "echo "InitiatorName=$(/usr/sbin/iscsi-iname)" > /etc/iscsi/initiatorname.iscsi" but that seems like an un-Ansible way of doing it.
Thanks in advance.
FWIW, if you really do have an array:
[u'string1', u'string2', u'string3']
And you want your template/whatever result to be NOT:
ABC=[u'string1', u'string2', u'string3']
But you prefer:
ABC=["string1", "string2", "string3"]
Then, this will do the trick:
ABC=["{{ iscsiname.stdout_lines | list | join("\", \"") }}"]
(extra backslashes due to my code being in a string originally.)
Use a filter to avoid unicode strings:
InitiatorName = {{ iscsiname.stdout_lines | to_yaml }}
Ansible Playbook Filters
To avoid the 80 symbol limit of PyYAML, just use the to_json filter instead:
InitiatorName = {{ iscsiname.stdout_lines | to_json }}
In my case, I'd like to create a Python array from a comma-separated list. So a,b,c should become ["a", "b", "c"], but without the u prefix, because I need string comparisons (without special chars) with WebSphere. Since the two sides seem not to have the same encoding, the comparison fails. For this reason, I can't simply use var.split(',').
Since the strings contain no special chars, I just use to_json in combination with map('trim'). This fixes the problem that a, b would become "a", " b".
restartApps = {{ apps.split(',') | map('trim') | list | to_json }}
Since JSON also knows arrays, I get the same result that Python would generate, but without the u prefix.
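What that Jinja expression does can be sketched in plain Python (the input string is illustrative): split on commas, trim each piece, and serialize the result as a JSON array.

```python
# Equivalent of: apps.split(',') | map('trim') | list | to_json
import json

apps = "a, b ,c"
restart_apps = json.dumps([item.strip() for item in apps.split(',')])
print(restart_apps)  # ["a", "b", "c"]
```

Without the strip step, the second element would come out as " b" instead of "b", which is exactly the problem map('trim') fixes.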

Silent exporting of globals using %GOF in Caché

I would like to know if it's possible to use ^%GOF without user interaction. I'm using Caché 2008. ^%GO isn't an option, as it's too slow. I'm using input from a temporary file to automatically answer the questions, but that can fail (it rarely happens).
I couldn't find the routine of this utility in %SYS. Where is it located?
Thanks,
Answer: Using "%SYS.GlobalQuery:NameSpaceList" to get the list of globals (excluding system globals).
Set Rset = ##class(%ResultSet).%New("%SYS.GlobalQuery:NameSpaceList")
d Rset.Execute(namespace, "*", 0)
s globals=""
while (Rset.Next()) {
    s globalName=Rset.Data("Name")_".gbl"
    if (globals="") {
        s globals = globalName
    } else {
        s globals = globals_","_globalName
    }
}
d ##class(%Library.Global).Export(namespace, globals, "/tmp/export.gof", 7)
The only drawback is that if the concatenation of global names in a namespace exceeds the maximum length allowed for a string, the program crashes. You should then split the globals list.
I would recommend that you look at the %Library.Global class with output format 7.
classmethod Export(Nsp As %String = $zu(5), ByRef GlobalList As %String, FileName As %String, OutputFormat As %Integer = 5, RecordFormat As %String = "V", qspec As %String = "d", Translation As %String = "") as %Status
Exports a list of globals GlobalList from a namespace Nsp to FileName using OutputFormat and RecordFormat.
OutputFormat can take the values below:
1 - DTM format
3 - VAXDSM format
4 - DSM11 format
5 - ISM/Cache format
6 - MSM format
7 - Cache Block format (%GOF)
RecordFormat can take the values below:
V - Variable Length Records
S - Stream Data
You can find it in the class documentation here: http://docs.intersystems.com/cache20082/csp/documatic/%25CSP.Documatic.cls
I've never used it, it looks like it would do the trick however.
export your global to file
d $system.OBJ.Export("myGlobal.GBL","c:\global.xml")
import global from your file
d $system.OBJ.Load("c:\global.xml")
Export items as an XML file
The extension of the items determines what type they are; it can be one of:
CLS - classes
CSP - Cache Server Pages
CSR - Cache Rule files
MAC - Macro routines
INT - None macro routines
BAS - Basic routines
INC - Include files
GBL - Globals
PRJ - Studio Projects
OBJ - Object code
PKG - Package definition
If you wish to export multiple classes, then separate them with commas,
pass the items("item")="" as an array, or use wild cards.
If filename is empty, then it will export to the current device.
link to docbook
edit: adding "-d" as a qspec value will suppress the terminal output of the export. If you want to use this programmatically, it might get in the way.
And just for completeness' sake:
SAMPLES>s IO="c:\temp\test.gof"
SAMPLES>s IOT="RMS"
SAMPLES>s IOPAR="WNS"
SAMPLES>s globals("Sample.PersonD")=""
SAMPLES>d entry^%GOF(.globals)
SAMPLES>
-> results in c:\temp\test.gof containing the export. You can define up to 65435 globals in your array (named globals in this example).
But I would recommend you go with DAiMor's answer, as this is the more 'modern' way.
To avoid maximum string error, you should use subscripts instead of comma delimited string:
Set Rset = ##class(%ResultSet).%New("%SYS.GlobalQuery:NameSpaceList")
d Rset.Execute(namespace, "*", 0)
while (Rset.Next()) {
    s globals(Rset.Data("Name"))=""  // No need for _".gbl" in recent Cache
}
d ##class(%Library.Global).Export(namespace, .globals, "/tmp/export.gof", 7) // Note dot before globals