Editing the labels on a flow chart with DiagrammeR - charts

I’m trying to make a flow chart in R. Attached is the chart I made in Word, which is what I'm aiming for. I don’t want to copy and paste it; I want to actually build it in R. I’ve been using DiagrammeR, and my code is below.
The main trouble is with the edge labels: how do I make parts of them bold and place them a comfortable distance from the nodes? I've already added the blue and pink boxes in my code, which I like.
Code:
library(DiagrammeR)
graph <- "
digraph boxes_and_circles{
# Add node statements
# This states that all following nodes have a box shape
node[
shape=box,
style=rounded,
fontname=Helvetica,
penwidth=2,
fixedsize = true
width=4
]
# Connect the nodes with edge statements
edge[
arrowhead = normal,
arrowtail = none
]
# These are the main nodes at top of graph
'@@1'->'@@2'
[label=' Cleaning Function:
Text to lower case
Contractions expanded
Numbers replaced
Abbreviations expanded (Qdap)
NA’s ignored
Kerns replaced
White space removed', fontname=Helvetica, fontsize=20, fontweight=bold]
'@@2'->'@@3'
'@@2'->'@@4'
# Make subnodes with boxes around for tidy text grouping
# graph, node, and edge definitions
graph [
compound = true,
nodesep = 1,
ranksep = 0.25,
color = pink
]
# subgraph for tidy text, direct the flow
subgraph cluster0 {
'@@3'->'@@5'
[label=' -Tokenization
-Lemmatisation
-Stop words removed', fontname=Helvetica, fontsize=20, fontweight=bold]
}
# Make subnodes with boxes around for Dictionary grouping
# graph, node, and edge definitions
graph [
compound = true,
nodesep = 1,
ranksep = .25,
color = blue
]
# subgraph for Dictionary direct the flow
subgraph cluster1 {
node [
fixedsize = true,
width = 3
]
'@@4'->'@@6' [label=' Scoring function (sentimentr)
Inner Join (dplyr)',fontname=Helvetica]
'@@6'->'@@7' [label=' Grouping
Summarise (dplyr)',fontname=Helvetica]
'@@7'->'@@8'
}
#Add a graph statement to change the properties of the graph
graph[nodesep=1] #this modifies distance between nodes
}
# Name the nodes
[1]: 'Response Data'
[2]: 'Clean Data'
[3]: 'Tidy Text'
[4]: 'Dictionary Creation'
[5]: 'Visualisation'
[6]: 'Sentiment Lexicon'
[7]: 'Summarised Text'
[8]: 'Visualisation and Statistics'
"
# Render the graph
grViz(graph)

plotly go Scattermapbox is not showing markers

I have somehow run into an odd bug: when I try to create a Scattermapbox, the markers don't render. The bug came out of nowhere. It was working perfectly fine, then my internet went out, and for the last 8 hours it has not worked.
I've tried running it in different IDEs,
running it in Google Colab to make sure it's not my machine,
and using different data sets.
I am unsure what I have done wrong.
The tooltips do display, however, when I hover over the invisible points,
and if I use the export-to-PNG button everything is shown.
But no matter what, the markers won't show up on the map itself, and I am at my wits' end.
I will include the callback function below.
@app.callback(
    Output('2dmap','figure'),
    [Input('2dgraph', 'clickData'),
     Input('checklist', 'value')])
def update_map_2d(clickData,checklist):
    # =============================================================================
    # P1. Render Map when no point is clicked
    # =============================================================================
    # If No point has been clicked
    if clickData is None:
        #make a map
        maps2d = go.Figure(go.Scattermapbox(
            lat=[], # set lat and long
            lon=[],
            mode='markers',
            marker =({'size':5.5}) # make markers size variable
        ))
        # set up map layout
        maps2d.update_layout(
            autosize=True, # Autosize
            hovermode='closest', # Show info on the closest marker
            showlegend=True, # show legend of colors
            mapbox=dict(
                accesstoken=mapbox_access_token, # token
                bearing=0, # starting facing direction
                # starting location
                center=dict(
                    lat=td.cLat,
                    lon=td.cLong
                ),
                #angle and zoom
                pitch=0,
                zoom=12
            ),
            #height and width
            width=1000,
            height=1000
        )
        return maps2d
    else:
        xCoord = int(clickData['points'][0]['x'])
        yCoord = int(clickData['points'][0]["y"])
        solutionRow = preatoFrontier.loc[(preatoFrontier['x'] == xCoord)&(preatoFrontier['y'] == yCoord)]
        solId = int(solutionRow['SolId'])
        #solId = 49
        solution = td.getSolution(solutions, solId)
        color = []
        for row in solution['upGrade']:
            if row == 0:
                color.append('grey')
            if row == 1:
                color.append('green')
            if row == 2:
                color.append('blue')
            if row == 3:
                color.append('red')
        solution['color'] = color
        solution2 = solution[solution['color'].isin(checklist)]
        maps2d = go.Figure(go.Scattermapbox(
            lat=solution2['lat'],
            lon=solution2['long'],
            mode='markers',
            #marker =({'color':solution['color']},{'size':5.5})
            marker=dict(
                size=12,
                color=solution2['color'], #set color equal to a variable
                colorscale='Viridis', # one of plotly colorscales
                showscale=True
            )
        ))
        #=============================================================================
        # P3. Map Layout
        #=============================================================================
        #set up map layout
        maps2d.update_layout(
            autosize=False, # Autosize
            hovermode='closest', # Show info on the closest marker
            showlegend=True, # show legend of colors
            mapbox=dict(
                accesstoken=mapbox_access_token, # token
                bearing=0, # starting facing direction
                # starting location
                center=dict(
                    lat=td.cLat,
                    lon=td.cLong
                ),
                #angle and zoom
                pitch=0,
                zoom=10
            ),
            #height and width
            width=1000,
            height=1000
        )
        return maps2d
After a lot of hair pulling, I tried creating a new venv, installing packages one by one, and running the app after each install to see where it failed. The last package I installed before it broke was dash-tools, and sure enough that was somehow causing Mapbox to break. So don't install dash-tools.

HierarchicalGraphMachine hiding nested states

I've been experimenting with the HierarchicalGraphMachine class to help visualise the machine structures as I edit them.
from transitions.extensions import HierarchicalGraphMachine as Machine
count_states = ['1', '2', '3', 'done']
count_trans = [
['increase', '1', '2'],
['increase', '2', '3'],
['decrease', '3', '2'],
['decrease', '2', '1'],
['done', '3', 'done'],
['reset', '*', '1']
]
counter = Machine(states=count_states, transitions=count_trans, initial='1')
states = ['waiting', 'collecting', {'name': 'counting', 'children': counter, 'initial': '1'}]
transitions = [
['collect', '*', 'collecting'],
['wait', '*', 'waiting'],
['count', 'collecting', 'counting']
]
collector = Machine(states=states, transitions=transitions, initial='waiting')
collector.get_graph(show_roi=False).draw('count1.png', prog='dot')
This generates the expected graphic showing both the parent and nested states in full (I'm not yet authorised to upload the graphics).
Is there a way to generate the full parent state machine graphic without expanding the nested states? For example, by reducing the nested states to an empty box.
I've tried "show_roi=True", but this only shows the current transition event, and removes all other states.
Depending on whether you use the pygraphviz (default in 0.8.8 and prior) or graphviz backend, get_graph may return a pygraphviz.AGraph object or a custom transitions.Graph. An AGraph is easier to manipulate, while the second is basically the pure graph notation in dot. However, you can manipulate both according to your needs. For instance, you could filter edges and nodes from an AGraph and rebuild a 'flat' version of it:
# your code here ...
collector.collect()
graph = collector.get_graph()
# iterate over all edges; We know that parent and child states are connected
# with an underscore. We just collect the root element of each source
# and target element of each edge. Furthermore, we collect the edge color,
# and the label which is stored either in 'label', 'taillabel' or 'headlabel'
new_edges = [(edge[0].split('_')[0],
              edge[1].split('_')[0],
              edge.attr['color'],
              edge.attr['label']
              or edge.attr['taillabel']
              or edge.attr['headlabel']) for edge in graph.edges()]
# States with children are noted as subgraphs. We collect their name and their
# current color.
new_nodes = [(sgraph.graph_attr['label'], sgraph.graph_attr['color'])
             for sgraph in graph.subgraphs()]
# We add all states that have no children and also do not contain an
# underscore in their name. An underscore would suggest that this node/state
# is a child/substate.
new_nodes += [(node.name, node.attr['color'])
              for node in graph.nodes() if '_' not in node.name]
# remove everything from the graph object
graph.clear()
# and add nodes and edges again
for name, color in new_nodes:
    graph.add_node(name, color=color)
for start, target, color, label in new_edges:
    if label:
        graph.add_edge(start, target, color=color, label=label)
graph.draw('agraph.png', prog='dot')
This results in the following graph:
You see that I also collected the edge and node color to visualize the last transition but graph.clear() removed all the 'default' styling attributes.
They could be copied and restored as well or we could only remove nodes, edges and subgraphs. This depends on how much you are willing to mess with (py)graphviz.
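The flattening step itself does not depend on pygraphviz, so it can be tried out on plain tuples first. What follows is only a minimal sketch, assuming child states are named parent_child as in the answer above; flatten_edges is a hypothetical helper and not part of transitions. It also drops duplicate edges, which the code above does not do.

```python
def flatten_edges(edges, separator="_"):
    """Collapse nested state names to their root state and drop duplicates.

    `edges` is an iterable of (source, target, label) tuples. As in the
    answer, a separator inside a name is taken to mean "child state".
    """
    seen = set()
    flat = []
    for source, target, label in edges:
        root_edge = (
            source.split(separator)[0],
            target.split(separator)[0],
            label,
        )
        # Several child transitions can collapse onto the same root edge.
        if root_edge not in seen:
            seen.add(root_edge)
            flat.append(root_edge)
    return flat


nested = [
    ("collecting", "counting_1", "count"),
    ("counting_1", "counting_2", "increase"),
    ("counting_2", "counting_3", "increase"),
    ("counting_3", "waiting", "wait"),
]
flat = flatten_edges(nested)
```

Transitions between two children of the same parent become a self-loop on the parent, so you may want to filter out edges whose source and target are equal if the goal is an "empty box".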

How to keep selected data persistent through callback in Dash/Plotly's clustered bar chart

I'm using Dash to plot some data. I currently have a clustered bar chart with two data sets (one for each bar in the clusters.) These data sets have their name and the corresponding color of the bars displayed in the top, left-hand corner of the figure. They can be clicked to be toggled on and off, which will remove their corresponding bars from the chart.
Separately, I have a checklist of items that can be displayed in the chart. I am using a callback to update the graph so that it only displays what the user has checked. This updates the graph as expected; however, it also resets the bars/datasets so that both are enabled. I.e., if you select only one of the bars and then select some new checklist items, it will display the new checklist items and both of the bars.
My thinking is that the logical way to do this is to pass some variable as a second input to the callback function, then set up the outputted figure within the function to only display the proper bars. However, I can't seem to find a variable that contains this data.
From what I can tell, the accessible properties of the Plotly graph object are 'id', 'clickData', 'clickAnnotationData', 'hoverData', 'clear_on_unhover', 'selectedData', 'relayoutData', 'figure', 'style', 'className', 'animate', 'animation_options', 'config', and 'loading_state'.
I've investigated all of these, and it seems that none hold the data that I am looking for. Does anyone know of an easy way to access this data?
This is how my callback is working right now:
@app.callback(
    dash.dependencies.Output('theGraph', 'figure'),
    [dash.dependencies.Input('theChecklist','values'),
     dash.dependencies.Input('theGraph', 'clickData')
    ]
)
def updateGraph(checklistValues, figureInput):
    #print to see what the variables hold
    print(checklistValues)
    print(figureInput)
    figure=go.Figure(
        data = [
            go.Bar(
                x = df[df['MyColumnName'].isin(checklistValues)].groupby('MyColumnName').size().index,
                y = df[df['MyColumnName'].isin(checklistValues)].groupby('MyColumnName').size().values,
                name = 'Bar 1'
            ),
            go.Bar(
                x = df[df['MyColumnName'].isin(checklistValues)].groupby('MyColumnName')['# cores'].sum().reset_index()['MyColumnName'],
                y = df[df['MyColumnName'].isin(checklistValues)].groupby('MyColumnName')['# cores'].sum().reset_index()['MyOtherColumnName'],
                name = 'Bar 2'
            )
        ],
        layout=go.Layout(
            title='The Title',
            showlegend=True,
            legend=go.layout.Legend(
                x=0,
                y=1.0
            ),
            margin=go.layout.Margin(l=40, r=40, t=40, b=110)
        )
    )
    return figure

Mapbox heatmap by point value

There is an example of a heatmap at https://www.mapbox.com/mapbox-gl-js/example/heatmap/ that is driven by the number of markers/points in an area. But is there a way to display a heatmap driven by the average value of the pins/markers? For example, if I have 5 pins and the average of their speed property is 3, the cluster/heatmap would be shown in green, and if the average is 6 it would be shown in red.
I found that the "clusterAggregates" property can help, but I can't find any example of using it.
Thanks
I'll leave my approach here. This is an old question that comes up from time to time, and there is no nice solution, so... Turf's hexGrid (http://turfjs.org/docs/#hexGrid) can help:
const hexagons = hexGrid(bbox, size);
const collection = // collection of your points;
const hexagonsWithin = collect(hexagons, collection, "propertyToAgretateFrom", "propertyToAggregateIn");
const notEmptyHexagonValues = hexagonsWithin.features.filter(({ properties }) => properties.propertyToAggregateIn.length !== 0);
const notEmptyHexagons = {
"type": "FeatureCollection",
"features": notEmptyHexagonValues,
};
// at this point you're having not empty hexagons as a geojson, which you can add to the map
collect is another Turf method. Check the docs for what collection should be, because the API changes a lot.
The general idea is to divide the visible part of the map (bbox) into hexagons with the hexGrid method, then aggregate the property you need from every marker inside each hexagon into an array, so that you can compute, for example, an average value and assign a color based on it.
Let's say feature.properties.propertyToAgretateFrom is 4 and 5 in two markers. After the aggregation, if these markers were inside one polygon, you'll have feature.properties.propertyToAggregateIn: [4, 5] on that feature, which is the polygon. From there you can do pretty much anything you want.
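The last step, turning each collected array into an average and then a color, is independent of Turf. Here is a minimal sketch of that logic, assuming each hexagon has already been reduced to its list of collected values; color_bins and the threshold of 5 are hypothetical and only chosen to match the green/red example in the question.

```python
from statistics import mean


def color_bins(collected, threshold=5.0):
    """Return (average, color) for every non-empty list of collected values.

    `collected` holds one list per hexagon, e.g. the contents of
    propertyToAggregateIn. Empty hexagons are skipped, like the filter
    in the snippet above.
    """
    colored = []
    for values in collected:
        if not values:
            continue
        average = mean(values)
        # Hypothetical rule: below the threshold is green, otherwise red.
        color = "green" if average < threshold else "red"
        colored.append((average, color))
    return colored


result = color_bins([[4, 5], [], [6, 6], [3]])
```

In a real map the computed color would be written back into each hexagon's properties so that a fill layer can style by it.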

adding annotations to pdf using perl

I'm using the perl module PDF::API2::Annotation to add annotations to my pdf files.
There is support to decide where the annot will be created using a rect. Something like this:
$annot->text( $text, -rect => [ 10, 10, 10, 10 ] );
This works fine, but I'm having trouble being accurate about where to put my annotations.
I know the lower left corner of the PDF is (0,0). Let's say I want to put an annotation exactly in the middle of the page. Any idea how I can achieve that?
According to this: https://www.leadtools.com/help/leadtools/v18/dh/to/leadtools.topics.pdf~pdf.topics.pdfcoordinatesystem.html
a PDF is divided into points, and each point is 1/72 inch, so for my page size the middle should be
(306,396)
But that's not even close to the middle.
You can get the size of the page media box and then calculate the middle from that:
# get the mediabox
my ($llx, $lly, $urx, $ury) = $page->get_mediabox;
# print out the page coordinates
say "page mediabox: " . join ", ", $llx, $lly, $urx, $ury;
# output: 0, 0, 612, 792 for the strangely-shaped page I created
# calculate the midpoints (the lower-left corner is not always 0,0)
my $midx = ($llx + $urx) / 2;
my $midy = ($lly + $ury) / 2;
my $annot = $page->annotation;
# create an annotation 20 pts wide and high around the midpoint of the mediabox
$annot->text($text, -rect=>[$midx-10,$midy-10,$midx+10,$midy+10]);
As well as the media box, you can also get the page crop box, trim box, bleed box, and art box:
for my $box ('mediabox', 'cropbox', 'trimbox', 'artbox', 'bleedbox') {
my $get = "get_$box";
say "$box dimensions: " . join ", ", $page->$get;
}
These are usually all the same unless the document has been set up for professional printing with a bleed area, etc.
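The midpoint arithmetic is the same for any of these boxes and can be checked outside Perl. Below is a minimal Python sketch, assuming a box is given as (llx, lly, urx, ury) in points; centered_rect is a hypothetical helper, not part of PDF::API2. It averages both corners so that it also works for boxes whose lower-left corner is not at the origin.

```python
def centered_rect(box, width=20, height=20):
    """Return a rect of the given size centered in `box`.

    `box` is (llx, lly, urx, ury) in PDF points, the same order that
    get_mediabox returns in the Perl snippet above.
    """
    llx, lly, urx, ury = box
    mid_x = (llx + urx) / 2
    mid_y = (lly + ury) / 2
    return (
        mid_x - width / 2,
        mid_y - height / 2,
        mid_x + width / 2,
        mid_y + height / 2,
    )


letter_rect = centered_rect((0, 0, 612, 792))
offset_rect = centered_rect((10, 20, 110, 220))
```

The four returned numbers are in the order expected for the -rect option shown earlier.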