How to pass data in Zeppelin to d3.js for Spark visualization - Scala

The graph options in Zeppelin are pretty basic, so I am looking for an example of how to do something simple, like a bar chart, with d3.js. From what I can tell, that would be the best graphing library for creating stunning graphs.
Anyway, my question is how to pass data to the JavaScript code. With regular Zeppelin charts you write Scala (or other) code and save the result in a DataFrame. Then in the next paragraph you use the %sql interpreter, write a SQL query, and buttons appear that let you graph the data.
But nothing I have found on the internet indicates that data created in the Scala code section can be passed to the Angular section where you put the d3.js code.
Some examples I found, like this one, put all the HTML and JavaScript in one giant print statement inside the Scala code: https://rawkintrevo.org/2016/09/20/gelly-on-apache-flink/
And then there are examples like this one, Using d3.js with Apache Zeppelin, where the Zeppelin paragraph is all JavaScript, but the data is just a locally created array.
So I need (1) an example and (2) some understanding of how RDDs and DataFrames can be passed into the JavaScript code, which of course lives in a different paragraph than the Scala code. How do you bring objects from the Scala section of the notebook into scope for the JavaScript section?

You can refer to the Zeppelin docs for a good getting-started guide to creating custom visualizations. You might also want to check out the code of some of the built-in visualizations.
Regarding how data from DataFrames is passed to JavaScript: I'm pretty sure z.show or %sql triggers dataFrame.take(${zeppelin.spark.maxResult}), which collects the RDD[T] as a Seq[T] on the driver, whose elements are then used to render the graphs.
Alternatively, if you have a JavaScript graph defined in another paragraph, you can use z.angularBind("values", rdd.take(maxResult)) to send the data to the Angular view. There's a really nice answer here on the subject which might help.
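To make that concrete, here is a minimal sketch of the Scala side of that two-paragraph pattern. It assumes a DataFrame named df with a string column and a numeric (long) column; the names df, "label", "value", and "chartData" are all placeholders, and the JSON string is built by hand so the sketch does not depend on how Zeppelin serializes Scala collections:

// Scala paragraph: collect a bounded sample on the driver and bind it for Angular.
val rows = df.limit(1000).collect()
// Assemble a plain JSON array string from the first two columns.
val json = rows
  .map(r => s"""{"label":"${r.getString(0)}","value":${r.getLong(1)}}""")
  .mkString("[", ",", "]")
z.angularBind("chartData", json) // now visible to %angular paragraphs in this note

A following %angular paragraph can then read chartData from its Angular scope (e.g. via {{chartData}} templating or a scope lookup from a script), JSON.parse it, and hand the resulting array to your d3.js bar-chart code.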
Hope you find this helpful.

Related

Is it possible to use a data frame in r-exams?

I would like to paste a data frame from the R environment into the LaTeX part (question or solution part) when creating exercises in r-exams. Later the exercises will be imported into Moodle. Is that possible in r-exams? We saw it is possible when the object is a matrix, via $\Sexpr{toLatex(matrix_obj)}$. But a similar approach does not seem to work with data frames. Thank you!
A data.frame would usually be included as a {tabular} in LaTeX, and there are various packages for automatic conversion, such as xtable, or the kable() function in knitr. For PDF output this works nicely, including all vertical and/or horizontal lines in the table. However, for HTML-based output (as for Moodle), the table itself is converted correctly, but without any lines.
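For illustration, here is a hand-written sketch of the kind of {tabular} such a conversion typically emits for a small two-column data frame; the column names and values are made up, not actual xtable output:

% Hypothetical {tabular} for a data frame with columns "name" and "value":
\begin{tabular}{llr}
  \hline
    & name & value \\
  \hline
  1 & a & 1.00 \\
  2 & b & 2.50 \\
  \hline
\end{tabular}

The \hline rules are exactly what gets lost in the HTML conversion, which is why the CSS-based workarounds below exist.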
An overview of a couple of solutions is available as:
Different copies of question with table for Moodle with R-Exams
Moreover, Kenji Sato has proposed injecting some dedicated CSS code to handle the table formatting in HTML. We are currently working on an automated way of including this in R/exams:
https://www.kenjisato.jp/en/post/2020/07/moodle-bordered-table/

VSCode ANTLR4 Plugin: Export Call Graph to JSON?

The vscode-antlr4 plugin for Visual Studio Code has a nice call-graph feature which visualizes (as a dendrogram) how grammar (and lexer) rules interact. You can save the graphic as an SVG.
Is there a way to export the information as JSON? I wouldn't mind going into the plugin's code to find a way to do it.
My aim is to create reachability graphs for individual rules, i.e. graphs that show from which other rules a particular rule can be reached (transitively). The "calls" and "is-called" information from the call-graph feature would be a nice starting point.
The data for the call graph comes from a source context instance (for each grammar file there's a single source context that manages all details for it). See the function getReferenceGraph, which collects the relations into a map object. You can use that object to generate JSON from it. Or you can create another function, using this one as a template, to generate the JSON directly, without the overhead required for the UI.

MS Azure, Zeppelin: load Scala file

I am following the O'Reilly book "Advanced Analytics with Spark". It seems they expect you to use a command shell (PuTTY) to follow the examples in the book. But I don't want to use that; I'd prefer to use Zeppelin. I'd like to create notebooks and put my own comments into the code, etc.
So, using an Azure subscription, I spin up a Spark cluster and go into Zeppelin. I am able to follow the guide fine for the most part. But there is one bit that trips me up, and it's probably pretty basic.
You are asked to create a Scala file called "StatsWithMissing.scala" with code in it. I do that. I upload it to blob storage at: //user/Zeppelin
(this is where I expect the Zeppelin user directory to be)
Then it asks you to run the following:
":load StatsWithMissing.scala"
At this point it gives the error:
:1: error: illegal start of definition
My first question is: where exactly is this Scala file supposed to be on blob storage for Zeppelin to see it? How do I determine that? Is where I am putting it correct?
And second, what does this message mean? Does it not like the :load statement?
I believe the interpreter set at the top of the page is Livy, and that covers Scala.
Any help would be great.
Regards
Conor

MATLAB: converting a library to a model

I'm working on a script to convert a Simulink library to a plain model, meaning it can be simulated and does not auto-lock, etc.
Is there a way to do this in code, aside from basically copy-pasting every single block into a new model? And if there isn't, what is the most efficient way to do the "copy-paste"?
I was not able to find any clues on how to approach this problem here, on Google, in the official documentation, or on the MathWorks forum, so I'm at a loss on how to proceed.
Thank you in advance!
I don't think it's possible to convert a library to a model, but you can programmatically add library blocks to models like so:
sys = 'testModel';          % name of the new model
new_system(sys);            % create the model
open_system(sys);           % open it in the editor
add_block('Simulink/Sources/Sine Wave', [sys, '/MySineWave']);  % copy a library block into it
save_system(sys);           % write the model to disk
close_system(sys);
sim(sys);                   % the saved model can now be simulated
You could even use the find_system command to list all the blocks in a library and then loop through them all and create a new model for each using the above code.

How to run the example of uima-text-segmenter?

I want to call the API of uima-text-segmenter (https://code.google.com/p/uima-text-segmenter/source/browse/trunk/INSTALL?r=22) to run an example.
But I don't know how to call the API...
The readme says:
With the DocumentAnalyzer, run the following descriptor
`desc/textSegmenter/wst-snowball-C99-JTextTilingAAE.xml` by taking the
uima-examples data as input.
Could anyone give me some code that could be run directly in a main function, for example?
Thanks a lot!
Long answer:
The link describes how you would set up the application from within the Eclipse UIMA environment. This sort of set-up is typically targeted at subject-matter specialists with little or no coding experience. It allows them to work (relatively quickly) with UIMA in a declarative way: all data structures and analysis engines (the computing blocks within UIMA) are declared in XML (with a GUI on top of it), after which the framework takes care of the rest. In this scenario, you would typically run a UIMA pipeline using a run configuration from within Eclipse (or the included UIMA pipeline runner application). Luckily, UIMA allows you to do exactly the same from code, but I would recommend using uimaFIT (http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e137) for this purpose instead of raw UIMA, as it bundles lots of useful things and coding shortcuts.
Short answer:
Using uimaFIT, you can call factory methods that create CollectionReader (read input), AnalysisEngine (process input), and Consumer objects (write or do other things) from (third-party provided) XML files. Use these methods to construct your pipeline, and the SimplePipeline class to run it. To extract the data you need, you would manipulate the CAS object (containing your data) in a Consumer object, possibly with a callback. You could also do this in an AnalysisEngine object. I recommend using DKPro's FeaturePathFactory (https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.api.featurepath-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/api/featurepath/FeaturePathFactory.java?spec=svn1811&r=1811) to quickly access the feature you are after.
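As a minimal sketch of that short answer (in Scala, since uimaFIT is a plain JVM library; the reader descriptor path and the InputDirectory parameter are assumptions based on the standard uima-examples layout, so adjust them to your checkout):

import org.apache.uima.fit.factory.{AnalysisEngineFactory, CollectionReaderFactory}
import org.apache.uima.fit.pipeline.SimplePipeline

object RunSegmenter {
  def main(args: Array[String]): Unit = {
    // Reader that feeds plain-text documents in; the descriptor and its
    // InputDirectory parameter follow the uima-examples convention.
    val reader = CollectionReaderFactory.createReaderFromPath(
      "desc/collection_reader/FileSystemCollectionReader.xml",
      "InputDirectory", "/path/to/uima-examples/data")
    // The aggregate engine descriptor named in the uima-text-segmenter README.
    val segmenter = AnalysisEngineFactory.createEngineDescriptionFromPath(
      "desc/textSegmenter/wst-snowball-C99-JTextTilingAAE.xml")
    // Feed every document from the reader through the segmenter.
    SimplePipeline.runPipeline(reader, segmenter)
  }
}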
Code examples:
http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e137 contains examples, but they all go in the opposite direction (class objects are used in the factory methods instead of XML files; the XML is generated from these classes). Take a look at the uimaFIT API to find the method you need, for example AnalysisEngineDescription from XML: http://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/factory/AnalysisEngineFactory.html#createEngineDescriptionFromPath-java.lang.String-java.lang.Object...-