plotly r sankey add_trace - r-plotly

I am reading the documentation at https://plotly.com/r/reference/sankey/ and want to change the link colors of a Sankey chart, but I can't quite understand the parameters of the add_trace() function.
Where should I specify the color value?
add_trace(p,type='sankey', color=????)

You haven't provided a minimal reproducible example, so I can't jump right into your code. But I think I can point you in the right direction.
In the documentation you screenshotted, it's saying that the color argument is one key of the list link that defines links in the plot. Using this example from the R plotly documentation for adding links, let's take a look at where that goes:
library(plotly)
library(rjson)
json_file <- "https://raw.githubusercontent.com/plotly/plotly.js/master/test/image/mocks/sankey_energy.json"
json_data <- fromJSON(paste(readLines(json_file), collapse=""))
fig <- plot_ly(
  type = "sankey",
  domain = list(
    x = c(0,1),
    y = c(0,1)
  ),
  orientation = "h",
  valueformat = ".0f",
  valuesuffix = "TWh",
  node = list(
    label = json_data$data[[1]]$node$label,
    color = json_data$data[[1]]$node$color,
    pad = 15,
    thickness = 15,
    line = list(
      color = "black",
      width = 0.5
    )
  ),
  link = list(
    source = json_data$data[[1]]$link$source,
    target = json_data$data[[1]]$link$target,
    value = json_data$data[[1]]$link$value,
    label = json_data$data[[1]]$link$label,
    #### Color goes here! ####
    color = "yellow"
  )
)
fig <- fig %>% layout(
  title = "Energy forecast for 2050<br>Source: Department of Energy & Climate Change, Tom Counsell via <a href='https://bost.ocks.org/mike/sankey/'>Mike Bostock</a>",
  font = list(
    size = 10
  ),
  xaxis = list(showgrid = F, zeroline = F),
  yaxis = list(showgrid = F, zeroline = F)
)
fig
The plotly documentation can be a bit opaque at times. I have found it helpful to sometimes review the documentation for python. For example, this part of the python documentation does give some more guidance about changing link colors.
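The Sankey reference describes the link color as either a single color or an array with one color per link. As a minimal sketch of the second form, the Python snippet below builds one semi-transparent color per link from the color of its source node. The toy nodes and links and the helper name hex_to_rgba are made up for illustration, and the snippet only prepares the list; it does not call plotly.

```python
def hex_to_rgba(hex_color, alpha):
    """Convert a '#rrggbb' string to an 'rgba(r,g,b,a)' string."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgba({r},{g},{b},{alpha})"


# Hypothetical toy data: three nodes and two links.
node_colors = ["#ff0000", "#00ff00", "#0000ff"]
link = {
    "source": [0, 1],
    "target": [2, 2],
    "value": [8, 4],
}

# One color per link, taken from the link's source node and made semi-transparent.
link["color"] = [hex_to_rgba(node_colors[s], 0.4) for s in link["source"]]
print(link["color"])
```

In the R example above, a vector like this would take the place of "yellow" in link = list(..., color = ...), with one entry per link.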

Alluvial plot - reorder lodes

I have created an alluvial plot but, for visibility purposes I would like to move one lode in one of the axes: more specifically I would like the "NA" of the "Type of surgery" to be at the top so the last 4 axes are aligned.
This is the code I used on R:
aes(y = ID, axis1 = Reason, axis2 = Response, axis3=Type_of_surgery, axis4=Margins, axis5=RT_post_op, axis6=Chemo_post_op)) +
  geom_alluvium(aes(fill = Type_of_surgery), width = 1/12,aes.bind = TRUE) +
  geom_flow(aes.bind = TRUE) +
  geom_stratum(width = 1/3, fill = "grey", color = "white") +
  geom_label(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Reason", "Response","Type of surgery", "Margins","RT post op", "Chemo post-op"), expand = c(0.1,0.1)) +
  scale_fill_brewer(type = "qual", palette = "Pastel1") +
  ggtitle("TBC")
This is the plot I obtained:
[Alluvial plot][1]
[1]: https://i.stack.imgur.com/nDCIZ.png
I am new to coding, so any help would be most welcome.
Thank you all for your help,
JB

Correcting satellite image overlays for Rayshader

I'm trying to improve the look of my Rayshader renders by overlaying more recent, higher-detail satellite imagery (which I'm getting with the {leaflet} package), but the overlay doesn't line up with the 3D rendering.
Ideally I'm looking for an open-source solution that can get global satellite imagery. Bonus points if you find finer-detail data for my area of interest, Hawaii.
One method using {geoviz} and {rayshader} uses the slippy_overlay() function to create overlay images from either Mapbox (satellite, mapbox-streets-v8, mapbox-terrain-v2, mapbox-traffic-v1, terrain-rgb, mapbox-incidents-v1) or Stamen. Although I found mapbox-terrain-v2 the best, it still lacks the detail I would like. Since it requires setting up an API for Mapbox, I just use stamen/watercolor below:
library(geoviz)
library(rayshader)
### Maui
lat = 20.785700
lon = -156.259204
square_km = 22
max_tiles = 10
dem <- mapzen_dem(lat, lon, square_km, max_tiles)
elev_matrix = matrix(
raster::extract(dem, raster::extent(dem), buffer=1000),
nrow = ncol(dem),
ncol = nrow(dem)
)
ambmat <- ambient_shade(elev_matrix, zscale = 30)
raymat <- ray_shade(elev_matrix, zscale = 30, lambert = TRUE)
watermap <- detect_water(elev_matrix)
overlay_img <-
slippy_overlay(dem,
image_source = "stamen",
image_type = "watercolor",
png_opacity = 0.3,
max_tiles = max_tiles)
elev_matrix %>%
sphere_shade(sunangle = 270, texture = "imhof4") %>%
add_water(detect_water(elev_matrix), color="imhof4") %>%
add_shadow(ray_shade(elev_matrix,zscale=3,maxsearch = 300),0.5) %>%
add_shadow(ambmat,0.5) %>%
add_overlay(overlay_img) %>%
plot_3d(elev_matrix,
solid = T,
water = T,
waterdepth = 0,
wateralpha = 0.5,
watercolor = "lightblue",
waterlinecolor = "white",
waterlinealpha = 0.5,
zscale= raster_zscale(dem) / 3,
fov=0,theta=135,zoom=0.75,phi=45, windowsize = c(1000,800))
I'm trying to adapt Will Bishop's workflow for getting overlays with the leaflet package, but the result is very odd. Will's approach is a bit different, as it fetches elevation data from USGS, which doesn't include bathymetric elevation. That is a must for me, so I used geoviz instead.
library(leaflet)
# define bounding box with longitude/latitude coordinates
bbox <- list(
p1 = list(long = -156.8037, lat = 20.29737),
p2 = list(long = -155.7351, lat = 21.29577)
)
leaflet() %>%
addTiles() %>%
addRectangles(
lng1 = bbox$p1$long, lat1 = bbox$p1$lat,
lng2 = bbox$p2$long, lat2 = bbox$p2$lat,
fillColor = "transparent"
) %>%
fitBounds(
lng1 = bbox$p1$long, lat1 = bbox$p1$lat,
lng2 = bbox$p2$long, lat2 = bbox$p2$lat,
)
What's the area of my hillshade from geoviz?
dim(dem)
780 780 1
Okay so the overlay image needs to be 780 x 780 so I modify the helper functions to download the overlay with the World_Imagery base map:
define_image_size <- function(bbox, major_dim = 780) {
  # calculate aspect ratio (width/height) from lat/long bounding box
  aspect_ratio <- abs((bbox$p1$long - bbox$p2$long) / (bbox$p1$lat - bbox$p2$lat))
  # define dimensions
  img_width <- ifelse(aspect_ratio > 1, major_dim, major_dim*aspect_ratio) %>% round()
  img_height <- ifelse(aspect_ratio < 1, major_dim, major_dim/aspect_ratio) %>% round()
  size_str <- paste(img_width, img_height, sep = ",")
  list(height = img_height, width = img_width, size = size_str)
}
get_arcgis_map_image <- function(bbox, map_type = "World_Imagery", file = NULL,
width = 780, height = 780, sr_bbox = 4326) {
require(httr)
require(glue)
require(jsonlite)
url <- parse_url("https://utility.arcgisonline.com/arcgis/rest/services/Utilities/PrintingTools/GPServer/Export%20Web%20Map%20Task/execute")
# define JSON query parameter
web_map_param <- list(
baseMap = list(
baseMapLayers = list(
list(url = jsonlite::unbox(glue("https://services.arcgisonline.com/ArcGIS/rest/services/{map_type}/MapServer",
map_type = map_type)))
)
),
exportOptions = list(
outputSize = c(width, height)
),
mapOptions = list(
extent = list(
spatialReference = list(wkid = jsonlite::unbox(sr_bbox)),
xmax = jsonlite::unbox(max(bbox$p1$long, bbox$p2$long)),
xmin = jsonlite::unbox(min(bbox$p1$long, bbox$p2$long)),
ymax = jsonlite::unbox(max(bbox$p1$lat, bbox$p2$lat)),
ymin = jsonlite::unbox(min(bbox$p1$lat, bbox$p2$lat))
)
)
)
res <- GET(
url,
query = list(
f = "json",
Format = "PNG32",
Layout_Template = "MAP_ONLY",
Web_Map_as_JSON = jsonlite::toJSON(web_map_param))
)
if (status_code(res) == 200) {
body <- content(res, type = "application/json")
message(jsonlite::toJSON(body, auto_unbox = TRUE, pretty = TRUE))
if (is.null(file))
file <- tempfile("overlay_img", fileext = ".png")
img_res <- GET(body$results[[1]]$value$url)
img_bin <- content(img_res, "raw")
writeBin(img_bin, file)
message(paste("image saved to file:", file))
} else {
message(res)
}
invisible(file)
}
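As a quick check of what define_image_size() computes for the bounding box above, here is a minimal Python sketch of the same arithmetic. It is a port for illustration only, not part of the original workflow, and the function name image_size is hypothetical.

```python
def image_size(bbox, major_dim=780):
    """Mirror of define_image_size(): scale the longer side to major_dim."""
    (lon1, lat1), (lon2, lat2) = bbox
    # aspect ratio (width/height) in degrees of longitude per degree of latitude
    aspect_ratio = abs((lon1 - lon2) / (lat1 - lat2))
    width = major_dim if aspect_ratio > 1 else round(major_dim * aspect_ratio)
    height = major_dim if aspect_ratio < 1 else round(major_dim / aspect_ratio)
    return {"width": width, "height": height, "size": f"{width},{height}"}


# Bounding box from the question, as (long, lat) pairs.
bbox = ((-156.8037, 20.29737), (-155.7351, 21.29577))
size = image_size(bbox)
print(size)  # {'width': 780, 'height': 729, 'size': '780,729'}
```

For this box the aspect ratio is about 1.07, so the helper would suggest a 780 x 729 image rather than a square one.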
Now download the file, then load it
image_size <- define_image_size(bbox, major_dim = 780)
# fetch overlay image
overlay_file <- "maui_overlay.png"
get_arcgis_map_image(bbox, map_type = "World_Imagery", file = overlay_file,
# width = image_size$width, height = image_size$height,
sr_bbox = 4326)
overlay_img <- png::readPNG("maui_overlay.png")
Okay let's make the plot
elev_matrix %>%
sphere_shade(sunangle = 270, texture = "imhof4") %>%
add_water(detect_water(elev_matrix), color="imhof4") %>%
add_shadow(ray_shade(elev_matrix,zscale=3,maxsearch = 300),0.5) %>%
add_shadow(ambmat,0.5) %>%
add_overlay(overlay_img, alphacolor = 1) %>%
plot_3d(elev_matrix,
solid = T,
water = T,
waterdepth = 0,
wateralpha = 0.5,
watercolor = "lightblue",
waterlinecolor = "white",
waterlinealpha = 0.5,
zscale= raster_zscale(dem) / 3,
fov=0,theta=135,zoom=0.75,phi=45, windowsize = c(1000,800))
As you can see, the overlay image is rotated relative to the hillshade.
I'm also realizing that fetching satellite imagery with a bounding box isn't ideal when you're trying to show bathymetric data. It would be ideal to subset this overlay programmatically somehow, but I'll probably just end up using Inkscape once I've figured out how to rotate the overlay.
I tried {magick}'s image_rotate() function, to no avail:
library(magick)
maui <- magick::image_read("maui_overlay.png")
image_rotate(maui, 30) # -> maui_30
# image_write(maui_30, path = "maui_overlay_30.png", format = "png")
But magick has changed the dimensions:
# A tibble: 1 x 7
format width height colorspace matte filesize density
<chr> <int> <int> <chr> <lgl> <int> <chr>
1 PNG 1068 1068 sRGB TRUE 0 38x38
And will give an error with rayshader:
overlay_img <- png::readPNG("maui_overlay_30.png")
elev_matrix %>%
sphere_shade(sunangle = 270, texture = "imhof4") %>%
add_water(detect_water(elev_matrix), color="imhof4") %>%
add_shadow(ray_shade(elev_matrix,zscale=3,maxsearch = 300),0.5) %>%
add_shadow(ambmat,0.5) %>%
add_overlay(overlay_img, alphacolor = 1) %>%
plot_3d(elev_matrix,
solid = T,
water = T,
waterdepth = 0,
wateralpha = 0.5,
watercolor = "lightblue",
waterlinecolor = "white",
waterlinealpha = 0.5,
zscale= raster_zscale(dem) / 3,
fov=0,theta=135,zoom=0.75,phi=45, windowsize = c(1000,800))
Error in add_overlay(., overlay_img, alpha = 0.8) : argument 3 matches multiple formal arguments
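The answer couldn't have been simpler: the overlay array needed to be transposed, i.e. its row and column dimensions swapped while the colour channels stay in place, with overlay_img = aperm(overlay_img, c(2,1,3)).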
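To make the effect of that permutation concrete: aperm(x, c(2,1,3)) returns an array in which element [j, i, k] equals x[i, j, k]. The Python sketch below does the same thing with nested lists standing in for the PNG array. The tiny 2 x 3 single-channel "image" is made up for illustration, and the function name is hypothetical.

```python
def swap_rows_and_columns(img):
    """Swap the first two axes of a height x width x channels nested list."""
    height = len(img)
    width = len(img[0])
    return [[img[i][j] for i in range(height)] for j in range(width)]


# A 2 x 3 "image" with one channel per pixel (made-up values).
img = [
    [[1], [2], [3]],
    [[4], [5], [6]],
]
out = swap_rows_and_columns(img)
print(len(out), len(out[0]))  # 3 2
print(out)                    # [[[1], [4]], [[2], [5]], [[3], [6]]]
```

Unlike a rotation by an arbitrary angle, this keeps the pixel count unchanged and only swaps height and width.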

different clusters with same method

I am stuck on a problem with hierarchical clustering. I want to make a dendrogram and a heatmap, using a correlation-based distance (d_mydata=dist(1-cor(t(mydata)))) and ward.D2 as the clustering method.
A handy feature of the pheatmap package is that it can plot the dendrogram on the left side of the heatmap to visualize the clusters.
The pipeline of my analysis is:
1. create the dendrogram
2. test how many clusters would be optimal (k)
3. extract the subjects in each cluster
4. create a heatmap
To my surprise, the dendrogram plotted in the heatmap is not the same as the one plotted before, even though the methods are the same.
So I decided to create a pheatmap colouring by the clusters classified before by cutree and test if the colours correspond to the clusters in the dendrogram.
This is my code:
# Create test matrix
test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")
test<-as.data.frame(test)
# Create a dendrogram with this test matrix
dist_test<-dist(test)
hc=hclust(dist_test, method="ward.D2")
plot(hc)
dend<-as.dendrogram(hc, check=F, nodePar=list(cex = .000007),leaflab="none", cex.main=3, axes=F, adjust=F)
clus2 <- as.factor(cutree(hc, k=2)) # cut tree into 2 clusters
groups<-data.frame(clus2)
groups$id<-rownames(groups)
#-----------DATAFRAME WITH mydata AND THE CLASSIFICATION OF CLUSTERS AS FACTORS---------------------
test$id<-rownames(test)
clusters<-merge(groups, test, by.x="id")
rownames(clusters)<-clusters$id
clusters$clus2<-as.character(clusters$clus2)
clusters$clus2[clusters$clus2== "1"]= "cluster1"
clusters$clus2[clusters$clus2=="2"]<-"cluster2"
plot(dend,
main = "test",
horiz = TRUE, leaflab = "none")
d_clusters<-dist(1-cor(t(clusters[,7:10])))
hc_cl=hclust(d_clusters, method="ward.D2")
annotation_col = data.frame(
Path = factor(colnames(clusters[3:12]))
)
rownames(annotation_col) = colnames(clusters[3:12])
annotation_row = data.frame(
Group = factor(clusters$clus2)
)
rownames(annotation_row) = rownames(clusters)
# Specify colors
ann_colors = list(
Path= c(Test1="darkseagreen", Test2="lavenderblush2", Test3="lightcyan3", Test4="mediumpurple", Test5="red", Test6="blue", Test7="brown", Test8="pink", Test9="black", Test10="grey"),
Group = c(cluster1="yellow", cluster2="blue")
)
require(RColorBrewer)
library(RColorBrewer)
cols <- colorRampPalette(brewer.pal(10, "RdYlBu"))(20)
library(pheatmap)
pheatmap(clusters[ ,3:12], color = rev(cols),
scale = "column",
kmeans_k = NA,
show_rownames = F, show_colnames = T,
main = "Heatmap CK14, CK5/6, GATA3 and FOXA1 n=492 SCALE",
clustering_method = "ward.D2",
cluster_rows = TRUE, cluster_cols = TRUE,
clustering_distance_rows = "correlation",
clustering_distance_cols = "correlation",
annotation_row = annotation_row,
annotation_col = annotation_col,
annotation_colors=ann_colors
)
Has anyone had the same issue? Am I making a stupid mistake?
Thank you in advance

Exclude items from training set data

I have my data in two tables, colors and excluded_colors.
colors contains all colors.
excluded_colors contains some colors that I wish to exclude from my training set.
I am trying to split the data into a training and testing set and ensure that the colors in excluded_colors are not in my training set but exist in the testing set.
In order to achieve the above, I did this
var colors = spark.sql("""
  select colors.*
  from colors
  LEFT JOIN excluded_colors
  ON excluded_colors.color_id = colors.color_id
  where excluded_colors.color_id IS NULL
  """
)
val trainer: (Int => Int) = (arg:Int) => 0
val sqlTrainer = udf(trainer)
val tester: (Int => Int) = (arg:Int) => 1
val sqlTester = udf(tester)
val splits = colors.randomSplit(Array(0.7, 0.3))
val train_colors = splits(0).select("color_id").withColumn("test",sqlTrainer(col("color_id")))
val test_colors = splits(1).select("color_id").withColumn("test",sqlTester(col("color_id")))
However, I'm realizing that by doing the above the colors in excluded_colors are completely ignored. They are not even in my testing set.
Question
How can I split the data 70/30 while also ensuring that the colors in excluded_colors are not in training but are present in testing?
What we want to do is remove the "excluded colors" from the training set but have them in the testing and have a training/test split of 70/30.
What we need is a bit of math.
Given the total dataset (TD) and the excluded colors dataset (E) we can say that for train dataset (Tr) and test dataset (Ts) that:
|Tr| = x * (|TD|-|E|)
|Ts| = |E| + (1-x) * (|TD|-|E|)
We also know that |Tr| = 0.7 |TD|
Hence x = 0.7 |TD| / (|TD| - |E|)
Now that we know the sampling factor x, we can say:
Tr = (TD-E).sample(withReplacement = false, fraction = x)
// where (TD - E) is the result of the SQL expr above
Ts = TD.sample(withReplacement = false, fraction = 0.3)
// we sample the test set from the original dataset
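A small worked example of the sampling factor, sketched in Python with made-up sizes (1000 rows in total, 100 of them excluded). The function name is hypothetical, and the test size below assumes the test set is simply every row that did not end up in training.

```python
def train_sampling_fraction(total, excluded, train_share=0.7):
    """Fraction x of the non-excluded rows to sample so that train is train_share of all rows."""
    eligible = total - excluded
    if eligible <= 0:
        raise ValueError("no rows left to train on")
    x = train_share * total / eligible
    if x > 1:
        raise ValueError("too many excluded rows to reach the requested training share")
    return x


total, excluded = 1000, 100            # hypothetical |TD| and |E|
x = train_sampling_fraction(total, excluded)
train_size = x * (total - excluded)    # about 700 rows
test_size = total - train_size         # about 300 rows, the 100 excluded ones among them
print(round(x, 4), round(train_size), round(test_size))  # 0.7778 700 300
```

Note that x only stays at or below 1 when the excluded rows make up no more than 30% of the data, and that Spark's sample() treats the fraction as approximate, so the resulting sizes will be close to these numbers rather than exact.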

Total distance of route using Leaflet routing machine in rMaps/rCharts

I would like to produce a Shiny app that asks for two addresses, maps an efficient route, and calculates the total distance of the route. This can be done with the Leaflet Routing Machine JavaScript library; however, I would like to do a bunch of further calculations with the distance of the route and have it all embedded in a Shiny app.
You can produce the map using rMaps by following this demo by Ramnathv here. But I'm not able to pull out the total distance travelled even though I can see that it has been calculated in the legend or controller. There exists another discussion on how to do this using the javascript library - see here. They discuss using this javascript code:
alert('Distance: ' + routes[0].summary.totalDistance);
Here is my working code for the rMap. If anyone has any ideas for how to pull out the total distance of a route and store it, I would be very grateful. Thank you!
# INSTALL DEPENDENCIES IF YOU HAVEN'T ALREADY DONE SO
library(devtools)
install_github("ramnathv/rCharts@dev")
install_github("ramnathv/rMaps")
# CREATE FUNCTION to convert address to coordinates
library(RCurl)
library(RJSONIO)
construct.geocode.url <- function(address, return.call = "json", sensor = "false") {
  root <- "http://maps.google.com/maps/api/geocode/"
  u <- paste(root, return.call, "?address=", address, "&sensor=", sensor, sep = "")
  return(URLencode(u))
}
gGeoCode <- function(address,verbose=FALSE) {
  if(verbose) cat(address,"\n")
  u <- construct.geocode.url(address)
  doc <- getURL(u)
  x <- fromJSON(doc)
  if(x$status=="OK") {
    lat <- x$results[[1]]$geometry$location$lat
    lng <- x$results[[1]]$geometry$location$lng
    return(c(lat, lng))
  } else {
    return(c(NA,NA))
  }
}
# GET COORDINATES
x <- gGeoCode("Vancouver, BC")
way1 <- gGeoCode("645 East Hastings Street, Vancouver, BC")
way2 <- gGeoCode("2095 Commercial Drive, Vancouver, BC")
# PRODUCE MAP
library(rMaps)
map = Leaflet$new()
map$setView(c(x[1], x[2]), 16)
map$tileLayer(provider = 'Stamen.TonerLite')
mywaypoints = list(c(way1[1], way1[2]), c(way2[1], way2[2]))
map$addAssets(
css = "http://www.liedman.net/leaflet-routing-machine/dist/leaflet-routing-machine.css",
jshead = "http://www.liedman.net/leaflet-routing-machine/dist/leaflet-routing-machine.js"
)
routingTemplate = "
<script>
var mywaypoints = %s
L.Routing.control({
waypoints: [
L.latLng.apply(null, mywaypoints[0]),
L.latLng.apply(null, mywaypoints[1])
]
}).addTo(map);
</script>"
map$setTemplate(
afterScript = sprintf(routingTemplate, RJSONIO::toJSON(mywaypoints))
)
# map$set(width = 800, height = 800)
map
You can easily create a route via the Google Maps API. The returned data frame includes distance information; just sum up the legs for the total distance.
library(ggmap)
x <- gGeoCode("Vancouver, BC")
way1txt <- "645 East Hastings Street, Vancouver, BC"
way2txt <- "2095 Commercial Drive, Vancouver, BC"
route_df <- route(way1txt, way2txt, structure = 'route')
dist<-sum(route_df[,1],na.rm=T) # total distance in meters
#
qmap(c(x[2],x[1]), zoom = 12) +
geom_path(aes(x = lon, y = lat), colour = 'red', size = 1.5, data = route_df, lineend = 'round')