How to print out only the children from the findall result? - openxml

Here is the xml markup
<p:transition spd="slow">
<p:push dir="u"/>
</p:transition>
Here is my code:
namespaces = {
    'p': 'http://schemas.openxmlformats.org/presentationml/2006/main'
}
transitions = et.findall('p:transition', namespaces)
if transitions:
    # Only worry about the first node for now
    for p in transitions[0].iter():
        print(p.tag, p.text)
The print output is
{http://schemas.openxmlformats.org/presentationml/2006/main}transition
{http://schemas.openxmlformats.org/presentationml/2006/main}push None
I would like to skip the p:transition node. How can I tell p:transition is a parent node of p:push?
In addition, is there an easy way to strip the expanded namespace prefix from the tag printout: {http://schemas.openxmlformats.org/presentationml/2006/main} (or, in this case, p:push is fine)?
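One possible approach, sketched here with the standard-library xml.etree.ElementTree: iterating over an element directly (rather than calling iter()) yields only its direct children, so the p:transition node itself is skipped. The wrapper element p:sld and the helper names local_name and prefixed_name are assumptions made for this illustration; they are not taken from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document so the snippet is self-contained.
XML = """<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">
  <p:transition spd="slow">
    <p:push dir="u"/>
  </p:transition>
</p:sld>"""

NAMESPACES = {"p": "http://schemas.openxmlformats.org/presentationml/2006/main"}


def local_name(tag):
    """Drop a leading '{namespace-uri}' from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def prefixed_name(tag, namespaces):
    """Rewrite '{uri}name' as 'prefix:name' when the uri is in the mapping."""
    if not tag.startswith("{"):
        return tag
    uri, local = tag[1:].split("}", 1)
    for prefix, ns_uri in namespaces.items():
        if ns_uri == uri:
            return prefix + ":" + local
    return local


root = ET.fromstring(XML)
transitions = root.findall("p:transition", NAMESPACES)
child_names = []
if transitions:
    # Iterating an element yields its direct children only,
    # unlike iter(), which also yields the element itself.
    for child in transitions[0]:
        child_names.append(prefixed_name(child.tag, NAMESPACES))
        print(prefixed_name(child.tag, NAMESPACES), child.attrib)
```

To check which node is the parent, it may be enough to compare each node from iter() against transitions[0] and skip it when they are the same object.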

Dgraph: How do you query for strings that start with a specific sequence of letters?

For example, let's say your database contains a bunch of People{ name } objects, and you want to find everyone whose name starts with the letter "H" or "Mi".
You can use a regular expression: https://docs.dgraph.io/query-language/#regular-expressions
Note, however, that the regex must contain at least 3 characters.
Another option is to use the "has" function and order by the predicate.
https://docs.dgraph.io/query-language/#has
{
me(func: has(name), first: 50, orderasc: name) {
name
}
}
But with the "has" function you don't have full control: the results simply follow alphabetical order.
One way to get this level of control is to create a pseudo-indexing structure. Every time you add a user, you relate it to a tree that plays the role of an index. Through this tree you can count and traverse the edges.
e.g.:
Let's assume you know the UID of your tree node (or that you use an Upsert Block).
_:NewUser <name> "Lucas" .
_:NewUser <pred1> "some1" .
_:NewUser <pred2> "some2" .
# Now we relate Lucas to our tree; <0x33f> is the UID of the node for the letter L of "Lucas"
<0x33f> <firstCharacter> _:NewUser . # This is a reverse record (you don't need to use @reverse though)
The same example using Upsert Block:
upsert {
  query {
    v as var(func: type(myTree)) @filter(eq(letter, "L"))
  }
  mutation {
    set {
      _:NewUser <name> "Lucas" .
      _:NewUser <pred1> "some1" .
      _:NewUser <pred2> "some2" .
      uid(v) <firstCharacter> _:NewUser .
    }
  }
}
A basic search in the tree would be like:
{
  q(func: type(myTree)) @filter(eq(letter, "L")) {
    letter
    firstCharacter { name pred1 pred2 }
  }
}
Or
{
  q(func: type(myTree)) @filter(eq(letter, "L")) {
    letter
    count(firstCharacter)
  }
}
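To make the idea of the pseudo-index easier to see outside of Dgraph, here is a rough in-memory sketch in Python. It is not Dgraph code and does not use any Dgraph client; the class FirstLetterIndex and its methods are hypothetical names chosen for the illustration. Each name is filed under its first letter (the role of the firstCharacter edge above), and a prefix such as "Mi" is resolved by scanning only that letter's bucket.

```python
from collections import defaultdict


class FirstLetterIndex:
    """Toy stand-in for the letter tree: one bucket of names per first letter."""

    def __init__(self):
        self._by_letter = defaultdict(list)

    def add(self, name):
        # Equivalent in spirit to: <letter node> <firstCharacter> <new user>
        self._by_letter[name[:1].upper()].append(name)

    def count(self, letter):
        # Equivalent in spirit to count(firstCharacter) on one letter node.
        return len(self._by_letter.get(letter.upper(), []))

    def starting_with(self, prefix):
        # Only the bucket for the first letter of the prefix is scanned.
        bucket = self._by_letter.get(prefix[:1].upper(), [])
        wanted = prefix.lower()
        return sorted(n for n in bucket if n.lower().startswith(wanted))


index = FirstLetterIndex()
for person in ["Lucas", "Laura", "Hector", "Miguel", "Mary"]:
    index.add(person)

print(index.count("L"))
print(index.starting_with("Mi"))
```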
Cheers.

sorting nodes according to its members

I don't know where to start. I have a long list of nodes comprised of descendant members, for which I want to make a linked tree, a plain text database in the form of child/parent. For example:
N115713
N115713 N96394
N117904 N18574
N140517 N171639 N179536 N208718 N210073 N226737 N4647 N80403
N171639
N171639 N18574
N171639 N208718
N171639 N208718 N210073
N171639 N208718 N210073 N3690
N171639 N208718 N210073 N96585
N171639 N210073
N18574
N18574 N80403
Obviously, "N115713" will go downstream of "N115713 N96394" but I seem unable to turn that recognition into an algorithm. There are several hundred nodes having up to several dozen members. Pointers to get started? I'm using perl.
Thanks!
UPDATE: Well, I have an idea but haven't been able to implement it yet. I'm searching each line in turn for the other lines it's a "member" of then selecting that result which has the next highest number of members as its parent.
Since the main problem here is to check if the input data is consistent and does not have cycles, I recommend using some graph-theoretical module, for example Graph.
If your data allows a child to have multiple parents you have to check if the directed graph produced from your data does not have a cycle.
Otherwise, if your data should be a tree, you have to check that the undirected graph does not have a cycle.
I sketched a simple script that implements these checks and outputs child/parent pairs; it should be fairly self-explanatory:
use strict;
use warnings;
use Graph;
my $g = Graph->new(directed => 1);
while (<>) {
    chomp;
    my @fields = split;
    # this assumes that each line starts with a parent and goes down through its descendants
    # adjust the logic to your needs
    my $parent;
    for my $child (@fields) {
        $g->add_vertex($child);
        if ($parent) {
            $g->add_edge($child, $parent);
        }
        $parent = $child;
    }
}
# check if we have a DAG
my @cycle = $g->find_a_cycle();
if (@cycle) {
    printf "The directed graph has a cycle: %s\n", join ',', @cycle;
}
# check if we have a tree
my $un_g = $g->undirected_copy();
@cycle = $un_g->find_a_cycle();
if (@cycle) {
    printf "The undirected graph has a cycle: %s\n", join ',', @cycle;
}
print "child,parent\n";
for my $edge (sort { $a->[0] cmp $b->[0] } $g->edges) {
    printf "%s,%s\n", $edge->[0], $edge->[1];
}
And the output for your data:
The undirected graph has a cycle: N179536,N171639,N208718
child,parent
N115713,N96394
N117904,N18574
N140517,N171639
N171639,N208718
N171639,N18574
N171639,N179536
N171639,N210073
N179536,N208718
N18574,N80403
N208718,N210073
N210073,N226737
N210073,N96585
N210073,N3690
N226737,N4647
N4647,N80403
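For readers who want to see what the two checks amount to without the Graph module, here is a minimal Python sketch. It is an illustration under assumptions, not a port of the script above: the function names are made up, neighbouring fields on a line are treated as (child, parent) pairs with the parent listed first, and it only reports whether a cycle exists rather than listing its members.

```python
def build_edges(lines):
    """Turn neighbouring fields of each line into (child, parent) pairs."""
    edges = set()
    for line in lines:
        fields = line.split()
        for parent, child in zip(fields, fields[1:]):
            edges.add((child, parent))
    return edges


def has_directed_cycle(edges):
    """Depth-first search; reaching a node still on the stack means a cycle."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set())
    NEW, ACTIVE, DONE = 0, 1, 2
    state = dict.fromkeys(graph, NEW)

    def visit(node):
        state[node] = ACTIVE
        for nxt in graph[node]:
            if state[nxt] == ACTIVE:
                return True
            if state[nxt] == NEW and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == NEW and visit(n) for n in graph)


def has_undirected_cycle(edges):
    """Union-find; an edge joining two already connected nodes closes a cycle."""
    root = {}

    def find(x):
        root.setdefault(x, x)
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    seen = set()
    for a, b in edges:
        key = frozenset((a, b))
        if key in seen:  # a->b and b->a collapse into one undirected edge
            continue
        seen.add(key)
        ra, rb = find(a), find(b)
        if ra == rb:
            return True
        root[ra] = rb
    return False


sample = build_edges(["A B", "B C", "A C"])
print(has_directed_cycle(sample), has_undirected_cycle(sample))
```

If the data is meant to be a tree, the second check is the one that matters: any node reachable from an ancestor by two different routes will show up as an undirected cycle.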

why pyang does not throw error in this scenario

The following when condition refers to a non-existent node. Why doesn't pyang throw an error? It does if I use a wrong prefix, though.
Could you please review the when conditions (embedded in the module)?
Is it allowed for a when expression to refer to schema nodes outside the augment itself?
module mod-w-1 {
namespace "http://example.org/tests/mod-w-1";
prefix m1;
container m1 {
leaf b1 {
type string;
}
}
}
module when-tests {
namespace "http://example.org/tests/when-tests";
prefix wt;
import mod-w-1 {
prefix m1;
}
augment "/m1:m1" {
// when "/m1:m1/b3 = 'abc'";
// there is no b3, so, should be invalid.
// when "/m1:m1/b1 = 'abc'";
// a payload or data situation that has m1/b1 != 'abc' will cause the
// data that fits this augment content will be invalid/rejected.
/* for ex;
<m1>
<b1>fff</b1>
<x>sfsf</x>
<conditional>
<foo>dddd</foo>
</conditional>
</m1>
is invalid, hence, the <x> and <conditional> parts will be
rejected.
*/
leaf x {
type string;
}
container conditional {
leaf foo {
type string;
}
}
}
}
That is because pyang does not validate semantics of XPath expressions at all, only their syntax - and a couple of additional checks, such as function and prefix usage. You will need another YANG compiler to validate those properly.
def v_xpath(ctx, stmt):
    try:
        toks = xpath.tokens(stmt.arg)
        for (tokname, s) in toks:
            if tokname == 'name' or tokname == 'prefix-match':
                i = s.find(':')
                if i != -1:
                    prefix = s[:i]
                    prefix_to_module(stmt.i_module, prefix, stmt.pos,
                                     ctx.errors)
            elif tokname == 'literal':
                # kind of hack to detect qnames, and mark the prefixes
                # as being used in order to avoid warnings.
                if s[0] == s[-1] and s[0] in ("'", '"'):
                    s = s[1:-1]
                    i = s.find(':')
                    # make sure there is just one : present
                    if i != -1 and s[i+1:].find(':') == -1:
                        prefix = s[:i]
                        # we don't want to report an error; just mark the
                        # prefix as being used.
                        my_errors = []
                        prefix_to_module(stmt.i_module, prefix, stmt.pos,
                                         my_errors)
                        for (pos, code, arg) in my_errors:
                            if code == 'PREFIX_NOT_DEFINED':
                                err_add(ctx.errors, pos,
                                        'WPREFIX_NOT_DEFINED', arg)
            elif ctx.lax_xpath_checks == True:
                pass
            elif tokname == 'variable':
                err_add(ctx.errors, stmt.pos, 'XPATH_VARIABLE', s)
            elif tokname == 'function':
                if not (s in xpath.core_functions or
                        s in yang_xpath_functions or
                        (stmt.i_module.i_version != '1' and
                         s in yang_1_1_xpath_functions) or
                        s in extra_xpath_functions):
                    err_add(ctx.errors, stmt.pos, 'XPATH_FUNCTION', s)
    except SyntaxError as e:
        err_add(ctx.errors, stmt.pos, 'XPATH_SYNTAX_ERROR', e)
Line 1993 of statements.py.
Note that an XPath expression referring to a non-existent node is technically not invalid from the XPath specification's perspective. It just means that the location path selects an empty node set (and that your condition will always be false).
Yes, you can refer to nodes that are "above" the augment's target node or are its siblings. In fact, you always should when a when statement is in play (it should not refer to any node that it makes conditional).
Also, you should never attempt to break "module confinement" with a non-prefixed node test (such as b3 and b1). The XPath expression is only able to see names that are defined in imports of the defining module and the defining module itself. For example, even if b3 is later augmented by some unknown third module, your condition would still evaluate to false. It is best to assume that non-prefixed names belong to the defining module's namespace.
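The point about empty node sets can be tried out with a small Python sketch. This uses the standard-library ElementTree and its limited path support on a namespace-free instance document, which is an assumption made to keep the example short; it is not how pyang or a YANG validator evaluates when expressions. The variable names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Simplified instance data for container m1 (namespaces omitted on purpose).
data = ET.fromstring("<m1><b1>abc</b1></m1>")

missing = data.findall("b3")   # no such child: an empty list, not an error
existing = data.findall("b1")

# XPath "nodeset = 'abc'" is true if any selected node has that string value,
# so an empty node set can never satisfy the comparison.
condition_b3 = any(node.text == "abc" for node in missing)
condition_b1 = any(node.text == "abc" for node in existing)

print(len(missing), condition_b3, condition_b1)
```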

Is there an easy way to add/remove/modify query parameters of a URL in Tritium?

I saw a very manual way of doing this in another post: How do I add a query parameter to a URL?
This doesn't seem very intuitive, but someone there mentioned an easier way to accomplish this using the upcoming "URL scope". Is this feature out yet, and how would I use it?
If you're using the stdlib mixer, you should be able to use the URL scope which provides helper functions for adding, viewing, editing, and removing URL params. Here's a quick example:
$original_url = "http://cuteoverload.com/2013/08/01/buttless-monkey-jams?hi=there"
$new_url = url($original_url) {
log(param("hi"))
param("hello", "world")
remove_param("hi")
}
log($new_url)
Tritium Tester example here: http://tester.tritium.io/9fcda48fa81b6e0b8700ccdda9f85612a5d7442f
Almost forgot, link to docs: http://tritium.io/current (You'll want to click on the URL category).
AFAIK, there's no built-in way of doing so.
Here is how I appended a query param while making sure that it does not get duplicated if it is already in the URL:
Inside your functions/main.ts file, you can declare:
# Adds a query parameter to the URL string in scope.
# The parameter is added as the last parameter in
# the query string.
#
# Sample use:
# $("//a[@id='my_link']") {
# attribute("href") {
# value() {
# appendQueryParameter('MVWomen', '1')
# }
# }
# }
#
# That will add MVWomen=1 to the end of the query string,
# but before any hash arguments.
# It also takes care of deciding if a ? or a &
# should be used.
@func Text.appendQueryParameter(Text %param_name, Text %param_value) {
# this beautiful regex is divided in three parts:
# 1. Get anything until a ? or # is found (or we reach the end)
# 2. Get anything until a # is found (or we reach the end - can be empty)
# 3. Get the remainder (can be empty)
replace(/^([^#\?]*)(\?[^#]*)?(#.*)?$/) {
var('query_symbol', '?')
match(%2, /^\?/) {
$query_symbol = '&'
}
# first, it checks if the %param_name with this %param_value already exists
# if so, we don't do anything
match_not(%2, concat(%param_name, '=', %param_value)) {
# We concatenate the URL until ? or # (%1),
# then the query string (%2), which can be empty or not,
# then the query symbol (either ? or &),
# then the name of the parameter we are appending,
# then an equals sign,
# then the value of the parameter we are appending
# and finally the hash fragment, which can be empty or not
set(concat(%1, %2, $query_symbol, %param_name, '=', %param_value, %3))
}
}
}
The other features you want (remove, modify) can be achieved similarly (by creating a function inside functions/main.ts and leveraging some regex magic).
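If it helps to experiment with the three-part regex outside of Tritium, the same logic can be sketched in Python. This is only an illustration of the technique, not Tritium code; the function name append_query_parameter and the example.com URLs are made up. Like the function above, it uses a plain substring check to detect an existing name=value pair, so it shares the same limitation (for example, a=1 is considered present when the query contains aa=1).

```python
import re

# 1. everything up to a ? or #, 2. optional query string, 3. optional fragment
URL_PARTS = re.compile(r"^([^#?]*)(\?[^#]*)?(#.*)?$")


def append_query_parameter(url, name, value):
    """Append name=value to the query string, before any #fragment."""
    match = URL_PARTS.match(url)
    if match is None:  # e.g. the string contains a newline
        return url
    base, query, fragment = (part or "" for part in match.groups())
    pair = name + "=" + value
    if pair in query:  # already present: leave the URL untouched
        return url
    separator = "&" if query.startswith("?") else "?"
    return base + query + separator + pair + fragment


print(append_query_parameter("http://example.com/page?hi=there#top", "MVWomen", "1"))
```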
Hope it helps.

WWW::Mechanize::Firefox looping though links

I am using a foreach to loop through links. Do I need a $mech->back(); to continue the loop, or is that implicit?
Furthermore, do I need a separate $mech2 object for nested foreach loops?
The code I currently have gets stuck (it does not complete) and ends on the first page where td#tabcolor3 is not found.
foreach my $sector ($mech->selector('a.link2'))
{
$mech->follow_link($sector);
foreach my $place ($mech->selector('td#tabcolor3'))
{
if (($mech->selector('td#tabcolor3', all=>1)) >= 1)
{
$mech->follow_link($place);
print $_->{innerHTML}, '\n'
for $mech->selector('td.dataCell');
$mech->back();
}
else
{
$mech->back();
}
}
You cannot access information from a page when it is no longer on display. However, the way foreach works is to build the list first before it is iterated through, so the code you have written should be fine.
There is no need for the call to back as the links are absolute. If you had used click then there must be a link in the page to click on, but with follow_link all you are doing is going to a new URL.
There is also no need to check the number of links to follow, as a for loop over an empty list will simply not be executed.
To make things clearer I suggest that you assign the results of selector to an array before the loop.
Like this
my @sectors = $mech->selector('a.link2');
for my $sector (@sectors) {
    $mech->follow_link($sector);
    my @places = $mech->selector('td#tabcolor3');
    for my $place (@places) {
        $mech->follow_link($place);
        print $_->{innerHTML}, "\n" for $mech->selector('td.dataCell');
    }
}
Update
My apologies. It seems that follow_link is finicky and needs to follow a link on the current page.
I suggest that you extract the href attribute from each link and use get instead of follow_link.
my @selectors = map $_->{href}, $mech->selector('a.link2');
for my $selector (@selectors) {
    $mech->get($selector);
    my @places = map $_->{href}, $mech->selector('td#tabcolor3');
    for my $place (@places) {
        $mech->get($place);
        print $_->{innerHTML}, "\n" for $mech->selector('td.dataCell');
    }
}
Please let me know whether this works on the site you are connecting to.
I recommend using a separate $mech object for this:
foreach my $sector ($mech->selector('a.link2'))
{
my $mech = $mech->clone();
$mech->follow_link($sector);
foreach my $place ($mech->selector('td#tabcolor3'))
{
if (($mech->selector('td#tabcolor3', all=>1)) >= 1)
{
my $mech = $mech->clone();
$mech->follow_link($place);
print $_->{innerHTML}, '\n'
for $mech->selector('td.dataCell');
#$mech->back();
}
# else
# {
# $mech->back();
# }
}
I am using WWW::Mechanize::Firefox to loop over a bunch of URLs with loads of JavaScript. The page does not render immediately, so I need to test whether a particular page element is visible (similar to the suggestion in the WWW::Mechanize::Firefox documentation, except with 2 XPaths in the test) before deciding on the next action.
The page eventually renders an XPath for 'no info' or for the wanted content after about 2-3 seconds. If there is no info, we go to the next URL. I think there is some sort of race condition while neither XPath exists yet, which intermittently causes the MozRepl::RemoteObject: TypeError: can't access dead object error (at the sleep 1 in the loop, oddly enough).
My solution, which seems to improve reliability, is to enclose all the $mech->get and $mech->is_visible calls in an eval {}; like this:
eval{
$mech->get("$url");
$retries = 15; #test to see if element visible = page complete
while ($retries-- and ! $mech->is_visible( xpath => $xpath_btn ) and ! $mech->is_visible( xpath => $xpath_no_info )){
sleep 1;
};
last if($mech->is_visible( xpath => $xpath_no_info) ); #skip rest if no info page
};
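The retry loop itself is a general polling pattern, and it can be sketched independently of Mechanize. The Python helper below is an illustration only; wait_until and its parameters are hypothetical names, and the predicate stands in for "either of the two elements is visible".

```python
import time


def wait_until(predicate, retries=15, delay=1.0):
    """Call predicate() up to `retries` times, sleeping `delay` seconds between tries.

    Returns True as soon as predicate() is true, False if it never is.
    """
    for attempt in range(retries):
        if predicate():
            return True
        if attempt < retries - 1:  # no need to sleep after the last try
            time.sleep(delay)
    return False


# Stand-in for a page that becomes "ready" on the third check.
calls = {"n": 0}


def page_ready():
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(page_ready, retries=5, delay=0))
```

Keeping the wait in one helper makes it easier to wrap just that call in error handling and to decide afterwards which of the two elements actually appeared.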
Others might suggest improvements on this.