Can I call a pyspark process from an interactive bokeh plot? - pyspark

I have an interactive bokeh plot that I would like to use with a large dataset on the backend. Is there a way to have bokeh spawn a pyspark job that collates the data on the server and returns it so bokeh can plot it?

It should also be possible to accomplish this using the second generation Bokeh Server. However, your question is so broad that it is impossible to provide any specifics. Developing such an example for the Bokeh server would make a great contribution to the Bokeh project that I expect would benefit many people. However, I expect developing it would also take a lot of discussion, collaboration, and iteration, which Stack Overflow is not very good for. (SO is good for self-contained answers to narrow questions.) Accordingly, I encourage you to bring this question to the public mailing list.

Using javascript callbacks
https://docs.bokeh.org/en/latest/docs/user_guide/interaction/callbacks.html
You can call a webservice with ajax, so I guess you can do anything you want.
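As a rough illustration of the web service side of that idea, here is a minimal sketch using only the Python standard library. The names `ROWS`, `collate` and `SummaryHandler` and the `/summary?key=...` URL are made up for this example, and the in-memory list merely stands in for whatever the pyspark job would compute; the point is only that the service should return a small, already collated JSON result rather than the raw data.

```python
import json
from collections import defaultdict
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs

# Hypothetical sample rows; in practice a pyspark job would produce the summary.
ROWS = [("a", 1.0), ("b", 2.5), ("a", 3.0), ("c", 4.0), ("b", 0.5)]


def collate(rows, keys=None):
    """Sum the values per key, optionally restricted to the requested keys."""
    totals = defaultdict(float)
    for key, value in rows:
        if keys is None or key in keys:
            totals[key] += value
    return dict(sorted(totals.items()))


class SummaryHandler(BaseHTTPRequestHandler):
    """Answers GET requests such as /summary?key=a&key=b with a small JSON body."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        keys = set(query["key"]) if "key" in query else None
        body = json.dumps(collate(ROWS, keys)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("localhost", 8000), SummaryHandler).serve_forever()` (with `HTTPServer` imported from `http.server`) should serve it. A JavaScript callback in the plot could then request `/summary?key=a` and update the plot's data from the JSON it gets back.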

Related

How to embed custom plots in the H2O flow UI in Scala or R?

After some investigation, I have found that the Sparkling Water H2O Flow UI offers only a very limited set of plots for data visualization in Scala: just box plots and distributions.
But if I want to use a third-party library (I need recommendations on this; I have already checked the Scala-charts library), how would I embed the generated plots in the H2O Flow UI itself?
I’ve seen a couple of examples of this over the years, but the real answer is this really isn’t supported well.
Here is a pointer to the best example I can remember:
https://github.com/h2oai/sparkling-water/blob/master/examples/flows/2016_H2O_Tour_Chicago.flow
If you really want to do this, the best guide is the source code of H2O Flow.

I want to use Anylogic as a visualization tool for a big log file? Is it possible?

I asked a similar question on the AnyLogic LinkedIn forum, so sorry for the multiple posting (for some of you, possibly). I just heard about the AnyLogic program. My purpose is not simulation; I want to visualize a log file. I want to allow the admin user (who will be the user of the AnyLogic model) to enter some settings that apply filtering, and I want to visualize the whole file with AnyLogic.
The file is a communication log. I will probably show the communication participants and their interactions using AnyLogic. I want to emphasize abnormal patterns in the log using the visual and interactive properties of AnyLogic. There may also be some need for zooming in and out while the model is running.
Is this very difficult to do? I am a Java developer. I understand that I will have to learn AnyLogic. What other skills and development and test environments (IDE etc.) do I need?
I plan to do a series of implementations for several log file types, and I am currently trying to find the best tool, one that will let me change the visualization part of the models easily until I find the best representation of the data.
AnyLogic comes with some built-in examples, but I couldn't find one that suits my situation. I do not know where to start. If someone could help me start the design, I would be very happy :)
Thank you for your attention.
Edit:
I am attaching a sample stereoscopic view model and a sample view. I want to do something similar to this. Is it ok with AnyLogic?
Ferda
Simple answer: yes, it's possible.
Some more comments:
I am currently working on a very similar project actually. For me as an experienced AnyLogic user, it is very natural and AnyLogic offers all the features you ask for.
Is it something very difficult to do?
That depends on how quickly you can learn AL. But if you are experienced with Java, it will not be too hard, I imagine.
What other skills and development and test environments (Ide etc.) do I need?
None, really. You need to figure out how to use the visual elements of AL. All of them can be changed statically via the AL IDE but you can always change them dynamically via Java code. That is very important to realize and play around with.
I am attaching a sample stereoscopic view model and a sample view. I want to do something similar to this. Is it ok with AnyLogic?
Yes, that can be done.
I suggest you try checking the example models that come with AL. If you find something that looks like what you need, try to figure out how they did it. Then try to recreate it in a simple example model for yourself.

does downsampling of big data in python bokeh server work? where documented?

Is there documentation on how to program and use the bokeh server? There is a nice interactive web plot of 4 GB of ocean data on YouTube, https://www.youtube.com/watch?v=B-P3yA-P-sY
but I cannot find any description of how it was done. I have ~10 TB of data from the Green Bank radio telescope, and I would like to write something similar for exploring it.
The server documentation seems broken for this, if I look at
http://docs.bokeh.org/en/0.10.0/docs/user_guide/server.html#downsampling-with-server
it just goes in circles, there isn't anything there.
Can someone help with examples of downsampling big data with the bokeh server?
Perhaps let me know where the code for the 4 GB ocean data example is.
That demo showed off custom downsampling code written for a very old version of Bokeh and Bokeh Server. Nowadays, Datashader provides automatic downsampling integrated fully with Bokeh via the high-level HoloViews package. HoloViews creates a Bokeh object with callbacks already set up for zoom and pan events, calling Datashader to regrid/downsample the data as needed. There should be plenty of examples to follow online now, such as the Datashader LANDSAT example or the HoloViz Advanced Dashboards tutorial. Or just see the Datashader website.
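To make the regridding idea concrete, here is a plain-Python sketch of what happens conceptually on each zoom or pan: the points inside the current viewport are binned into a fixed-size grid of counts, so the amount of data sent to the browser depends on the plot resolution rather than on the dataset size. This is not Datashader's or HoloViews' API; the `regrid` function and its parameters are invented for this illustration.

```python
def regrid(points, x_range, y_range, width, height):
    """Count the points that fall into each cell of a width x height grid.

    Only points inside the half-open viewport [x0, x1) x [y0, y1) are counted.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if x0 <= x < x1 and y0 <= y < y1:
            # Map the coordinate to a cell index; clamp to guard against
            # floating-point rounding at the upper edge.
            col = min(int((x - x0) / (x1 - x0) * width), width - 1)
            row = min(int((y - y0) / (y1 - y0) * height), height - 1)
            grid[row][col] += 1
    return grid


# Hypothetical data: the same points viewed at two zoom levels.
points = [(0.1, 0.1), (0.3, 0.3), (0.9, 0.9), (5.0, 5.0)]
full_view = regrid(points, (0.0, 1.0), (0.0, 1.0), 2, 2)
zoomed_in = regrid(points, (0.0, 0.5), (0.0, 0.5), 2, 2)
```

Zooming in re-runs the same binning over a smaller viewport, which is why detail that was merged into one cell in the full view becomes visible again.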

Big data visualization using "search, show context, and expand on demand" concept [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I'm trying to visualize a really huge network (3M nodes and 13M edges) stored in a database. For real-time interactivity, I plan to show only a portion of the graph based on user queries and expand it on demand. For instance, when a user clicks a node, I expand its neighborhood. (This is called "Search, Show Context, Expand on Demand" in this paper.)
I have looked into several visualization tools, including Gephi, D3, etc. They take a text file as input, but I have no idea how they can connect to a database and update the graph based on user interaction.
The linked paper implemented a system like that, but the authors didn't describe the tools they used.
How can I visualize such data with the above criteria?
There are several solutions out there, but basically every one of them uses the same approach:
create a layer on top of your source that lets you query it at a high level
create a front-end layer to talk to the layer described above
use the visualization tool you want
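The first two layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite table named `edges` stands in for the real database, and `build_demo_db`, `neighbors` and `GraphView` are hypothetical names, not part of any of the tools mentioned here.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory edge table standing in for the real graph store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?)",
        [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")],
    )
    return conn


def neighbors(conn, node, limit=50):
    """Query layer: fetch the nodes adjacent to `node`, in either direction."""
    rows = conn.execute(
        "SELECT dst FROM edges WHERE src = ? "
        "UNION SELECT src FROM edges WHERE dst = ? LIMIT ?",
        (node, node, limit),
    )
    return sorted(row[0] for row in rows)


class GraphView:
    """Front-end layer: tracks the part of the graph currently on screen."""

    def __init__(self, conn, start):
        self.conn = conn
        self.nodes = {start}
        self.edges = set()

    def expand(self, node):
        """Add the neighborhood of `node`; return only the newly shown nodes."""
        added = []
        for other in neighbors(self.conn, node):
            self.edges.add(tuple(sorted((node, other))))
            if other not in self.nodes:
                self.nodes.add(other)
                added.append(other)
        return added


view = GraphView(build_demo_db(), "a")
first = view.expand("a")   # the user clicks node "a"
second = view.expand("b")  # then clicks node "b"
```

Each click only sends the newly added nodes and edges to the visualization tool, so the browser never has to hold the whole graph.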
As miro marchi pointed out, there are several solutions to achieve this goal. Some of them are locked to particular data sources; others offer much more freedom but require some coding skills.
Datasource
I would start with the choice of the source type: given the type of data, I would probably choose Neo4J, Titan or OrientDB (if you fancy something more exotic with some flexibility).
All of them offer a JSON REST API, the first with a proprietary system and language (Cypher) and the other two using the Blueprints / Rexster system.
Neo4J supports the Blueprints stack as well if you prefer Gremlin over Cypher.
For other solutions, such as other NoSQL or SQL databases, you will probably have to code a layer on top with the corresponding REST API. It will work as well, but I wouldn't recommend it for the kind of data you have.
Now, only the third point is left and here you have several choices.
Generic Viz tools
Sigma.js is a free and open source tool for graph visualization, and quite a nice one. As far as I know, Linkurious uses a forked version of it in their product.
Keylines is a commercial graph visualization tool with advanced styling, analytics and layouts, and they provide copy/paste demos if you are using Neo4J or Titan. It is not free, but it does support even older browsers - IE7 onwards...
VivaGraph is another free and open source graph visualization tool, but it has a smaller community than Sigma.js.
D3.js is the factotum of data visualization: you can do basically every kind of visualization with it, but the learning curve is quite steep.
Gephi is another free and open source solution, for the desktop. You will probably have to use an external plugin with it, but it does support most of the formats out there - graphML, CSV, Neo4J, etc...
Vendor specific
Linkurious is a complete commercial tool, specific to Neo4J, for searching and investigating data.
Neo4J web-admin console - it is basic, but it has improved a lot in the newer 2.x.x versions and is based on D3.js.
There are also other solutions that I probably forgot to mention, but the ones above should offer a good variety.
Other notes
The JS tools above will visualize up to 1500/2000 nodes at once well, due to JS limits.
If you want to visualize bigger graphs while expanding, I would recommend desktop solutions such as Gephi.
Disclaimer
I'm part of the Keylines dev team.

BPMN visualization

We need to visualize a BP (business process) as BPMN, but NOT by hand using a modeler. We need to do it automatically in a web-based CRM system written in PHP. I have the input data (e.g. an array or XML, the format doesn't matter, but not BPEL), and I need to turn it into a nice BPMN graph (using SVG).
We have a first, nice-looking implementation. We draw using a matrix: the algorithm makes several passes over the matrix and optimizes the graph each time. It works fast, but it is not flexible: it is hard to rebuild, upgrade, or extend with new features. We wrote this algorithm ourselves (I mean we didn't find it on Google or in books). The problem is that we couldn't find any algorithms on the internet. I suppose we don't know the right keywords. Every attempt led us back to BPEL visualization from BPMN, and "data flow visualization" returned modelers...
Please help us find some algorithms, or give us the right keywords to search for.
Think you're probably looking for "graph layout algorithms". The only library I'm aware of that can (I think) generate BPMN directly is the yFiles library from yWorks. It's not free. They do however offer a free application using the library that does auto-layout. Perhaps you could do some prototyping with that.
If that's not applicable, there are several other options. I'm not aware that any of these can generate BPMN symbols directly; you'd have to construct the symbols yourself. However, all of them will auto-layout graphs according to various algorithms, and all are open source/free.
graphviz. Written in C. Quite old now but well used, stable and scalable.
tulip. Newer than graphviz. Haven't used it but heard good things about flexibility and scalability.
see also this post for javascript based options.
There are many more, just google for graph layout algorithms / libraries.
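To give a feel for what such algorithms do, here is a minimal sketch of one common family, layered layout, in Python. It is not how graphviz, tulip or yFiles work internally; the function name, the spacing parameters and the sample process are made up, and it assumes the process graph has no loops (real layered layouts also handle cycles and reduce edge crossings).

```python
from collections import defaultdict


def layered_layout(edges, x_gap=160, y_gap=80):
    """Place each node in a column given by its longest path from a start node.

    Assumes `edges` (pairs of node names) form an acyclic graph.
    """
    nodes = sorted({name for edge in edges for name in edge})
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)

    layer = {}

    def depth(node):
        # A node sits one column to the right of its deepest predecessor;
        # nodes without predecessors end up in column 0.
        if node not in layer:
            layer[node] = 1 + max((depth(p) for p in preds[node]), default=-1)
        return layer[node]

    for node in nodes:
        depth(node)

    # Stack the nodes that share a column from top to bottom.
    used_rows = defaultdict(int)
    coords = {}
    for node in nodes:
        col = layer[node]
        coords[node] = (col * x_gap, used_rows[col] * y_gap)
        used_rows[col] += 1
    return coords


# Hypothetical process: a check that branches into approve/reject, then ends.
process = [
    ("start", "check"),
    ("check", "approve"),
    ("check", "reject"),
    ("approve", "end"),
    ("reject", "end"),
]
positions = layered_layout(process)
```

The resulting coordinates can be used to place the SVG symbols; the two branches share a column and are stacked vertically.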
hth.