Is it possible, and if so how, to generate in a Jupyter notebook a pure report without showing any source code?
The use case and motivation for this is business friendliness (read: no code to read, just results). Our data scientists will mine the data, find patterns and phenomena of interest, and then create business-friendly visualizations/reports with text, charts and tables, but without any visible source code.
It looks like this Hide-Input-All extension would do.
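For illustration, here is a minimal sketch of the same idea without an extension. It assumes the notebook has already been loaded as a plain dict (the .ipynb JSON) and sets the `jupyter.source_hidden` cell metadata flag on every code cell. The function name `hide_code_inputs` and the demo notebook are hypothetical, and whether a given front end or exporter honors the flag is an assumption to verify in your setup.

```python
import copy


def hide_code_inputs(notebook):
    """Return a copy of a notebook dict with every code cell's input marked hidden.

    Assumption: the viewer/exporter in use respects the
    metadata.jupyter.source_hidden flag; outputs are left untouched.
    """
    report = copy.deepcopy(notebook)
    for cell in report.get("cells", []):
        if cell.get("cell_type") == "code":
            jupyter_meta = cell.setdefault("metadata", {}).setdefault("jupyter", {})
            jupyter_meta["source_hidden"] = True
    return report


# Hypothetical two-cell notebook: one markdown cell, one code cell.
demo = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Q3 results"},
        {
            "cell_type": "code",
            "metadata": {},
            "source": "print(42)",
            "outputs": [],
            "execution_count": None,
        },
    ],
}

report = hide_code_inputs(demo)
```

To apply this to a file, load it with `json.load`, pass the dict through the function, and write it back with `json.dump`. If a static export is enough, nbconvert's `--no-input` option (for example `jupyter nbconvert --to html --no-input report.ipynb`) may be a simpler route.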
I'm about to submit this project but I want to make sure the GitHub page looks good before I do. For some reason, not all the languages are showing up and I don't know why. I've tried to find ways to edit this under settings but I've yet to find anything.
As you see in the images below, on the homepage it says the repo is 100% Jupyter notebooks, but if you click on "languages" you'll see that there are Python and CSV files as well that seem to be unaccounted for.
If anyone knows how I can change this please let me know. It's not very important but I think it'd look much nicer if the breakdown of languages was more accurate. Thank you!
GitHub uses Linguist to figure out which languages are part of your project. It has a languages.yml file to define the multitude of languages to look for. Some are markup languages (like Jupyter Notebook), some programming languages, etc.
That percentage you see is calculated based on the bytes of code for each language. The more you have of one type, the higher the percentage.
Note, however, that this library excludes all files that it determines to be binary data, vendor code, generated code, documentation, or defined as data (in your case csv) or prose (think markdown), whilst taking into account any overrides.
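As a rough sketch of the byte-based calculation described above (not Linguist's actual implementation), the following computes a language breakdown from file sizes. The `LANGUAGES` table is a hypothetical, heavily simplified stand-in for languages.yml, the file sizes are made up, and the sketch ignores vendored, generated and binary detection as well as overrides.

```python
import os
from collections import defaultdict

# Hypothetical, simplified stand-in for languages.yml:
# file extension -> (language name, language type)
LANGUAGES = {
    ".ipynb": ("Jupyter Notebook", "markup"),
    ".py": ("Python", "programming"),
    ".csv": ("CSV", "data"),
    ".md": ("Markdown", "prose"),
}

# Assumption for this sketch: only these types count towards the statistics.
COUNTED_TYPES = {"programming", "markup"}


def language_breakdown(file_sizes):
    """Map language name -> percentage of counted bytes, rounded to 0.1."""
    totals = defaultdict(int)
    for path, size in file_sizes.items():
        entry = LANGUAGES.get(os.path.splitext(path)[1])
        if entry is None:
            continue  # unknown extension: ignored in this sketch
        language, kind = entry
        if kind in COUNTED_TYPES:
            totals[language] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: round(100 * n / grand_total, 1) for lang, n in totals.items()}


# Made-up repository: one large notebook, two small scripts, data and prose.
repo = {
    "analysis.ipynb": 995_000,
    "clean.py": 3_000,
    "plot.py": 2_000,
    "data.csv": 5_000_000,
    "README.md": 4_000,
}
breakdown = language_breakdown(repo)
print(breakdown)
```

With these made-up sizes the notebook accounts for 99.5% of the counted bytes and Python for 0.5%, which a coarse language bar could plausibly show as almost entirely Jupyter Notebook.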
If your Python code is small enough, even across 2 files, it won't show up. Just write more Python if you want it to show up.
The second screenshot shows what you get when you click on the languages, and its purpose is exactly what you are looking for: to give more detail on the current project and what it comprises. The language bar is just an overview. It need not be 100% accurate.
FYI: it also matters which branch is your main branch, since it takes that into account.
Conclusion - don't worry about it. Whoever needs to see it, will see what your project has in terms of contents.
I am working on a Firefox extension: a context menu for automating bbCode, HTML, Markdown, etc.
It is functional now, but I have a file which is the size of the rest of the code combined, an Excel spreadsheet. (Yes, I know, lame)
I use it as a database for organizing the menu IDs, arguments, internationalization, etc.
It's useful, for me at least, and I want to make it available, but it is not properly a part of the application; it should be a part of the background documentation.
Is there a way in GitHub/Git to separate documentary files from those that are part of the execution of the code?
I've looked through GitHub, and I haven't found a spot (with revision control) to put this.
With the rise of Spark, Scala has gained tremendous momentum as the programming language of choice for data science applications.
To increase the efficiency when working on data science applications, specialized IDEs have been released for
R (e.g. RStudio) and
Python (e.g. Spyder or Rodeo, see Is there something like RStudio for Python?).
Is there something similar for Scala?
Unfortunately, there don't seem to be any dedicated data science IDEs for Scala at this time. I think these would be your best options:
IntelliJ Worksheets:
This is basically a text editor with an output window which gets updated as often as you want. Eclipse has something similar, I just prefer IntelliJ.
Pros:
Backed by IntelliJ's fantastic code completion, error checking, and sbt/maven integration.
You can prototype within the same project setup as your actual development system (if you have one).
Cons:
I am not aware of any caching/selective evaluation, so the entire worksheet is evaluated each time you want an answer, which you may not want if some operations take a long time to complete.
No workspace variables window or plot integration.
Jupyter Notebooks
The Jupyter Notebook is a generalization of the IPython notebook which now supports dozens of interpreted languages (new kernels are being added all of the time).
Pros:
Scala and Spark Scala kernels are fairly easy to install; both have the ability to add Maven/sbt dependencies and JARs.
The cells in the notebook can be run individually (allowing you to train a model once and use it many times, for example).
The cells support Markdown (with LaTeX!) which can be rendered on its own (a GitHub example), allowing you to use your notebooks as a report/demonstration.
Notebooks are backed by a Notebook Server so you could easily use a more powerful computer as your notebook server and then interact with the notebook from another location.
Some kernels have autocompletion.
Looks like there is some plot integration (example) but it is not totally polished.
Cons:
Not all kernels are perfect, some have bugs or limited functionality.
No workspace variables window.
You really need to be careful about the ordering of your cells; failure to do so can cause a lot of confusion.
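The cell-ordering pitfall in the last point can be illustrated with a small simulation. This is only a sketch: the three "cells" are hypothetical code strings executed against a shared namespace, which mimics how a kernel keeps state between cell runs.

```python
# Hypothetical notebook cells, keyed by a short name.
cells = {
    "load": "data = [1, 2, 3]",
    "scale": "data = [x * 10 for x in data]",
    "total": "total = sum(data)",
}


def run(order):
    """Execute the named cells in the given order against one shared namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["total"]


# Running top to bottom gives the intended result.
in_order = run(["load", "scale", "total"])

# Re-running the "scale" cell by accident silently changes the answer,
# because it mutates state left behind by the earlier run.
scale_twice = run(["load", "scale", "scale", "total"])

print(in_order, scale_twice)  # prints: 60 600
```

Restarting the kernel and running all cells from the top before trusting a result is one common way to guard against this.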
For most of the data-sciency stuff I do I use Jupyter but it is far from perfect. In order for Scala to really take over as a Data Science language it really needs more data science libraries (scikit-learn is sooo far ahead here) and it needs a solid plotting library (there are a few options but none I have seen both use idiomatic Scala and are able to run without a server). I think as soon as it has those two elements it will become more popular and hopefully someone will make a nice RStudio-esque IDE.
Your best shot (nothing like RStudio, but this would be your best shot for Scala) is Apache Zeppelin.
I would recommend you look at Scala IDE for Eclipse. But I think it really depends on your personal choice and where you are comfortable writing the code. For testing code piece by piece, I would still use Jupyter Notebook.
I am planning to port an existing application (or at least the part of it where we process data to create graphs interactively) to an IPython-based UI. I am wondering if it is possible to create a menu-based app using the IPython notebook as an engine. Is there any functionality to create menu-based applications in IPython? From my experience with IPython so far, I guess this is not available.
I am thinking of mimicking it by creating HTML code in markdown cells that will produce menus as select lists; choosing and submitting from there would call some CGI on a server that would update lower parts of the notebook using AJAX. Has anyone done something similar?
Nothing prevents you from reusing the components.
We try to make them as reusable as possible, and it should be easy to use our JavaScript to create your own JS frontend; cf. minrk's example here.
If some modification makes a component more standalone and reusable, patches are welcome. At some point we might even have each component (codecell, tooltip, completer) installable with bower/component.io/whatever.
I would recommend not adding menus through JavaScript in markdown cells, as that will be disabled soon.
You might want to have a look at Exhibitionist, which uses the IPython notebook for some nice stuff.
I am working on academic research regarding some very long functions in the Linux kernel (link, link).
For that research, I would like to use a code flow visualization tool that can plot a graph in which each vertex is a decision point and each edge is a piece of code that runs sequentially.
Do you know of any good, open source project that can visualize C code?
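To make the desired output concrete, here is a minimal sketch of the graph described above, emitted as Graphviz DOT text. It does not parse C; the edge list is written by hand for a made-up function, and the helper name `to_dot` is hypothetical.

```python
def to_dot(edges, name="flow"):
    """Render (source, target, code) triples as Graphviz DOT text.

    Vertices are decision points; each edge label carries the straight-line
    code that runs between two decision points.
    """
    def quote(text):
        # Escape backslashes and double quotes so labels stay valid DOT strings.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{"]
    for source, target, code in edges:
        lines.append(f"    {quote(source)} -> {quote(target)} [label={quote(code)}];")
    lines.append("}")
    return "\n".join(lines)


# Hand-written edges for a made-up C function (illustrative only).
edges = [
    ("entry", "if (len > MAX)", "len = strlen(s);"),
    ("if (len > MAX)", "return", "return -1;"),
    ("if (len > MAX)", "while (i < len)", "i = 0;"),
    ("while (i < len)", "while (i < len)", "sum += s[i]; i++;"),
    ("while (i < len)", "return", "return sum;"),
]

dot = to_dot(edges)
print(dot)
```

The resulting text can be rendered with Graphviz, for example by saving it to a file and running `dot -Tpng` on it.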
Perhaps a tool like KCacheGrind would be of help. It generates call graphs based on actual calls and cannot pre-generate a call graph without actually running the program, which may not suit your needs, but then again it may.
History flows are very neat for changes/diffs across multiple versions.
CodePlex has a project, Dependency Visualizer, which also supports C.
Gprof2Dot can render oprofile output; this would also get you dynamic info.
CodeViz (a static tool) would also work.
If you're using gcc, gcc-xml has an introspector plugin to do this as well.
You appear to want to acquire a flowchart of C source code ("decisions", "code blocks").
Something like this C flowchart?
To do this correctly, esp. for Linux kernel code, I'd expect you to have to preprocess the code first to get rid of macros and conditionals. I would assume that GCC constructs such a graph internally and that you ought to be able to get your hands on that graph.
Doxygen does some amount of 'visualization', but you need to work on the code a bit for it to be usable.
Another interesting thing to check would be LXR:
Linux Cross Referencer is a software toolset for indexing and presenting source code repositories. LXR was initially targeted at the Linux source code, but has proved usable for a wide range of software projects. lxr.linux.no is currently running an experimental fork of the LXR software.
I can recommend Sourcetrail. It can work with a compile_commands.json. Not sure if it's still maintained, though. But it's FOSS and you can fork it!