Get out edges with properties - networkx

I have a DiGraph G with some nodes and edges:

G.add_edges_from([
    ...
    ('X', 'Y', {'property': 85}),
    ('X', 'T', {'property': 104}),
    ...
])
However, when I run G.out_edges('X'), it returns OutEdgeDataView([('X', 'Y'), ('X', 'T')]). Instead, I want a list of tuples that includes the edge data, like this:
[('X', 'Y', {'property': 85}), ('X', 'T', {'property': 104})]
How should I get these results?
Thanks!

You can use networkx.DiGraph.out_edges(data=True).
import networkx as nx
G = nx.DiGraph()
G.add_edges_from([
    ('X', 'Y', {'property': 85}),
    ('X', 'T', {'property': 104}),
    ('Z', 'X', {'property': 104}),
])
print(G.out_edges('X', data=True))
[('X', 'Y', {'property': 85}), ('X', 'T', {'property': 104})]
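Note that out_edges returns a view object; if you need an actual list, as in the desired output, wrap it in list():

edges = list(G.out_edges('X', data=True))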

I found no better answer than simply writing a list comprehension like this:
def get_out_edges(node, all_edges):
    return [e for e in all_edges if node == e[0]]
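For the graph above it would be called with the data-carrying edge list (a sketch; all_edges is assumed to come from G.edges(data=True)):

get_out_edges('X', G.edges(data=True))
# [('X', 'Y', {'property': 85}), ('X', 'T', {'property': 104})]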


Postgres: How Can I DRY up Multiple WHERE IN (a,b,c,...) Clauses?

In a programming language like JavaScript, instead of doing:
['a', 'b', 'c', 'd'].map(...);
['a', 'b', 'c', 'd'].filter(...);
['a', 'b', 'c', 'd'].forEach(...);
I can do the following:
const names = ['a', 'b', 'c', 'd'];
names.map(...);
names.filter(...);
names.forEach(...);
If I have several SQL statements in a file:
SELECT * FROM foo WHERE something IN ('a', 'b', 'c', 'd');
SELECT * FROM bar WHERE something_else IN ('a', 'b', 'c', 'd');
SELECT * FROM baz WHERE another_thing IN ('a', 'b', 'c', 'd');
Is there a similar way I can "create an array variable" and then use it repeatedly in all those queries? I know things get complicated because ('a', 'b', 'c', 'd') isn't actually an array, and I'm not sure if I should be using a literal variable, a view, or a function to hold the ('a', 'b', 'c', 'd') part.
The closest analogy would be a temporary table.
CREATE TEMP TABLE targets (t text);
COPY targets FROM stdin;
a
b
c
d
...thousands more rows
\.
SELECT foo.* FROM foo JOIN targets ON foo.x = targets.t;
However, needing to match one set of values against multiple tables is less common in a database, because it can be a sign that the schema needs reworking.
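If the queries are issued from application code, another option is to keep the value set in a single variable there and bind it as an array parameter. A minimal sketch in Python with psycopg2 (the connection string is a placeholder; table and column names follow the question):

import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder connection string
targets = ['a', 'b', 'c', 'd']          # defined once, reused below

with conn.cursor() as cur:
    # psycopg2 adapts a Python list to a PostgreSQL array,
    # so = ANY(%s) plays the role of IN ('a', 'b', 'c', 'd').
    cur.execute("SELECT * FROM foo WHERE something = ANY(%s)", (targets,))
    foo_rows = cur.fetchall()
    cur.execute("SELECT * FROM bar WHERE something_else = ANY(%s)", (targets,))
    bar_rows = cur.fetchall()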

IN clause for multiple columns on the same value set

I have a WHERE clause like below:
WHERE t1.col1 IN ('a', 'b', 'c') OR t2.col2 IN ('a', 'b', 'c');
I know that the two IN clauses will always have the same contents. Is there any way I can avoid duplicating them?
Something like:
WHERE (t1.col1 OR t2.col2) IN ('a', 'b', 'c')
You can use the array overlaps operator:
where array[t1.col1, t2.col2] && array['a','b','c']
Alternatively, put the value set in a CTE and reference it from the WHERE clause:
with flt(x) as (values ('{a,b,c}'::text[]))
select
    ...
from ..., flt
where t1.col1 = any(flt.x) or t2.col2 = any(flt.x);
There are two ways to carry a set of constant values around in PostgreSQL: using a CTE, as shown above, or using custom configuration options:
set foo.bar to '{a,b,c}';
select '{c,d,e}'::text[] && current_setting('foo.bar')::text[];
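From application code you can get the same effect by binding one named parameter that appears twice in the statement. A sketch with Python and psycopg2 (the connection string is a placeholder; t1 and t2 are the tables from the question, joined here without a condition purely for illustration):

import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder connection string
vals = ['a', 'b', 'c']

with conn.cursor() as cur:
    # A named parameter can be referenced any number of times,
    # so the value set is written only once.
    cur.execute(
        "select * from t1, t2 "
        "where t1.col1 = any(%(vals)s) or t2.col2 = any(%(vals)s)",
        {'vals': vals},
    )
    rows = cur.fetchall()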

Dash, circular callback inputs

I have a problem with an application I want to build with Dash. I have two checklists (indicator A, indicator B). My goal is that the user can choose multiple options for only one indicator at a time. So choosing a, b, c together with 1 is allowed, and so is c together with 2, 3. A selection like a, c together with 1, 2, on the other hand, should be prevented. My approach is the following code:
app.layout = html.Div([
    html.Label('indicator A'),
    dcc.Checklist(
        id='i_a',
        options=[
            {'label': 'a', 'value': 'a'},
            {'label': 'b', 'value': 'b'},
            {'label': 'c', 'value': 'c'},
        ],
        value=['a'],
    ),
    html.Label('indicator B'),
    dcc.Checklist(
        id='i_b',
        options=[
            {'label': '1', 'value': '1'},
            {'label': '2', 'value': '2'},
            {'label': '3', 'value': '3'},
        ],
        value=['1'],
    ),
])
@app.callback(
    Output('i_b', 'value'),
    Input('i_a', 'value')
)
def change_b(value_a):
    return ['1']

@app.callback(
    Output('i_a', 'value'),
    Input('i_b', 'value')
)
def change_a(value_b):
    return ['a']
This creates an endless loop because the callbacks trigger each other. However, I have no idea how to solve the problem. I am grateful for any help :)
I think what you need is the (relatively new) circular callback support. Check out the examples in the Dash documentation on circular callbacks, which I think are very similar to what you're doing.
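A minimal sketch of that pattern, assuming Dash >= 1.19 (which allows a single callback to both read and write the same components) and the layout above; resetting the other checklist to empty is just illustrative:

import dash
from dash.dependencies import Input, Output

@app.callback(
    [Output('i_a', 'value'), Output('i_b', 'value')],
    [Input('i_a', 'value'), Input('i_b', 'value')],
    prevent_initial_call=True,
)
def sync_checklists(value_a, value_b):
    # callback_context reveals which checklist the user actually changed.
    triggered = dash.callback_context.triggered[0]['prop_id']
    if triggered.startswith('i_a'):
        return value_a, []   # A changed: keep A, clear B
    return [], value_b       # B changed: keep B, clear A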

Join two PipelinedRDDs

I am trying to join two PipelinedRDDs using .join() in a PySpark Jupyter notebook.
First RDD:
primaryType.take(5)
['DECEPTIVE PRACTICE',
'CRIM SEXUAL ASSAULT',
'BURGLARY',
'THEFT',
'CRIM SEXUAL ASSAULT']
Second RDD:
districts.take(5)
['004', '022', '008', '003', '001']
Join RDDs:
rdd_joined = primaryType.join(districts)
rdd_joined.take(5)
Output:
[]
What am I doing wrong here?
.join() operates on pair RDDs, i.e. it needs a key in both RDDs, and yours contain bare strings, so there is nothing to join on. Use rdd.zipWithIndex() to give each RDD an index key and then join on that:
districts.take(5)
['004', '022', '008', '003', '001']
primaryType.take(5)
['DECEPTIVE PRACTICE',
'CRIM SEXUAL ASSAULT',
'BURGLARY',
'THEFT',
'CRIM SEXUAL ASSAULT']
districts = districts.zipWithIndex()
districts.take(5)
[('004', 0), ('022', 1), ('008', 2), ('003', 3), ('001', 4)]
# Tuple-parameter lambdas like lambda (x, y): ... are Python 2 only,
# so index the pair explicitly to swap it into (index, value) form.
districts = districts.map(lambda kv: (kv[1], kv[0]))
primaryType = primaryType.zipWithIndex()
primaryType = primaryType.map(lambda kv: (kv[1], kv[0]))
primaryType.join(districts).map(lambda kv: kv[1]).take(5)
[('DECEPTIVE PRACTICE', '004'), ('CRIM SEXUAL ASSAULT', '001'), ('CRIM SEXUAL ASSAULT', '022'), ('BURGLARY', '008'), ('THEFT', '003')]
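If the two RDDs were read from the same source and share the same partitioning, rdd.zip may be a simpler alternative (a sketch; zip requires both RDDs to have the same number of partitions and the same number of elements per partition):

primaryType.zip(districts).take(5)
# [('DECEPTIVE PRACTICE', '004'), ('CRIM SEXUAL ASSAULT', '022'), ...]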

SQL: PIVOTting Count & Percentage against a column

I'm trying to produce a report that shows, for each Part No, the results of tests on those parts in terms of the numbers passed and failed, and the percentages passed and failed.
So far, I have the following:
SELECT r2.PartNo, [Pass] AS Passed, [Fail] AS Failed
FROM
    (SELECT ResultID, PartNo, Result FROM Results) r1
    PIVOT (Count(ResultID) FOR Result IN ([Pass], [Fail])) AS r2
ORDER BY r2.PartNo
This is half of the solution (the totals for passes and fails); the question is, how do I push on and include percentages?
I haven't tried yet, but I imagine that I can start again from scratch, and build up a series of subqueries, but this is more a learning exercise - I want to know the 'best' (most elegant or most efficient) solution, so I thought I'd seek advice.
Can I extend this PIVOT query, or should I take a different approach?
DDL:
CREATE TABLE Results (
    [ResultID] [int] NOT NULL,
    [SerialNo] [int] NOT NULL,
    [PartNo] [varchar](10) NOT NULL,
    [Result] [varchar](10) NOT NULL);
DML:
INSERT INTO Results VALUES (1, '100', 'ABC', 'Pass')
INSERT INTO Results VALUES (2, '101', 'DEF', 'Pass')
INSERT INTO Results VALUES (3, '100', 'ABC', 'Fail')
INSERT INTO Results VALUES (4, '102', 'DEF', 'Pass')
INSERT INTO Results VALUES (5, '102', 'DEF', 'Pass')
INSERT INTO Results VALUES (6, '102', 'DEF', 'Fail')
INSERT INTO Results VALUES (7, '101', 'DEF', 'Fail')
UPDATE:
My solution, based on bluefeet's answer is:
SELECT r2.PartNo,
       [Pass] AS Passed,
       [Fail] AS Failed,
       ROUND(([Fail] / CAST(([Pass] + [Fail]) AS REAL)) * 100, 2) AS PercentFailed
FROM
    (SELECT ResultID, PartNo, Result FROM Results) r1
    PIVOT (Count(ResultID) FOR Result IN ([Pass], [Fail])) AS r2
ORDER BY r2.PartNo
I've ROUNDed a REAL (rather than CASTing to DECIMAL twice) because it's a tiny bit more efficient, and I've also decided that we only really need the failure percentage.
It sounds like you just need to add columns for PercentPassed and PercentFailed. You can calculate those columns in your PIVOT query.
SELECT r2.PartNo
    , [Pass] AS Passed
    , [Fail] AS Failed
    , ([Pass] / CAST(([Pass] + [Fail]) AS decimal(5, 2))) * 100 AS PercentPassed
    , ([Fail] / CAST(([Pass] + [Fail]) AS decimal(5, 2))) * 100 AS PercentFailed
FROM
(
    SELECT ResultID, PartNo, Result
    FROM Results
) r1
PIVOT
(
    Count(ResultID)
    FOR Result IN ([Pass], [Fail])
) AS r2
ORDER BY r2.PartNo
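For the sample data above, this should return (worked out by hand from the seven inserted rows):

PartNo  Passed  Failed  PercentPassed  PercentFailed
ABC     1       1       50.00          50.00
DEF     3       2       60.00          40.00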