Scala IDE for data science applications (like RStudio / Spyder / Rodeo)

With the rise of Spark, Scala has gained tremendous momentum as the programming language of choice for data science applications.
To increase the efficiency when working on data science applications, specialized IDEs have been released for
R (e.g. RStudio) and
Python (e.g. Spyder or Rodeo, see Is there something like RStudio for Python?).
Is there something similar for Scala?

Unfortunately there don't seem to be any dedicated data science IDEs for Scala at this time. I think these would be your best options:
IntelliJ Worksheets:
This is basically a text editor with an output window which gets updated as often as you want. Eclipse has something similar; I just prefer IntelliJ.
Pros:
Backed by IntelliJ's fantastic code completion, error checking, and sbt/maven integration.
You can prototype within the same project setup as your actual development system (if you have one).
Cons:
I am not aware of any caching or selective evaluation, so the entire worksheet is evaluated each time you want an answer. You may not want that if some of your operations take a long time to complete.
No workspace variables window or plot integration.
Jupyter Notebooks
The Jupyter Notebook is a generalization of the IPython notebook which now supports dozens of interpreted languages (new kernels are being added all the time).
Pros:
Scala and Spark Scala kernels are fairly easy to install, and both can add Maven/sbt dependencies and JARs.
The cells in the notebook can be run individually (allowing you to train a model once and use it many times, for example).
The cells support Markdown (with LaTeX!) which can be rendered on its own (a GitHub example), allowing you to use your notebooks as a report/demonstration.
Notebooks are backed by a Notebook Server so you could easily use a more powerful computer as your notebook server and then interact with the notebook from another location.
Some kernels have autocompletion.
Looks like there is some plot integration (example) but it is not totally polished.
Cons:
Not all kernels are perfect, some have bugs or limited functionality.
No workspace variables window.
You really need to be careful about the order in which you run your cells; cells share state, so running them out of order can cause a lot of confusion.
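To make the cell-ordering pitfall concrete, here is a minimal Python sketch. It is not tied to any particular kernel: the three "cells" and the `run_cells` helper are made up for illustration, and they simply model a notebook as snippets executed against one shared namespace.

```python
# Each "cell" is a snippet of code that runs against one shared namespace,
# which is roughly how a notebook kernel keeps state between cells.
# The cell contents are hypothetical examples.
cells = {
    1: "x = 10",
    2: "x = x * 2",
    3: "result = x + 1",
}

def run_cells(order):
    """Execute the cells in the given order and return the final `result`."""
    namespace = {}
    for number in order:
        exec(cells[number], namespace)
    return namespace["result"]

top_to_bottom = run_cells([1, 2, 3])     # the order the notebook shows
rerun_cell_2 = run_cells([1, 2, 2, 3])   # cell 2 accidentally run twice
skipped_cell_2 = run_cells([1, 3])       # cell 2 never run

print(top_to_bottom, rerun_cell_2, skipped_cell_2)
```

The same three cells give three different answers depending on how they were run, which is why restarting the kernel and running everything top to bottom is a useful sanity check before sharing a notebook.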
For most of the data-sciency stuff I do I use Jupyter but it is far from perfect. In order for Scala to really take over as a Data Science language it really needs more data science libraries (scikit-learn is sooo far ahead here) and it needs a solid plotting library (there are a few options but none I have seen both use idiomatic Scala and are able to run without a server). I think as soon as it has those two elements it will become more popular and hopefully someone will make a nice RStudio-esque IDE.

Your best shot for Scala (though it is nothing like RStudio) is Apache Zeppelin.

I would recommend you look at Scala IDE for Eclipse. But I think it really depends on your personal preference and where you are comfortable writing code. For testing code piece by piece, I would still use a Jupyter notebook.

Related

Tool for testing/grading IPython notebook homeworks

I'll be teaching a scientific computing class with IPython notebook in the next term. Both the course content and the homework will be distributed/returned as IPython notebooks.
I remember that about half a year ago, I had stumbled across a tool designed to hand-in homework as IPython notebook. In my recollection, it had some really nice features such as
tracking of returned homework tasks per student
integrated grading system
auto-testing for errors / code compliance
unit-testing of code segments
auto-grading features based on various metrics (e.g. speed of implementation)
Unfortunately, it seems I never saved the link. Has anyone seen this or any similar tool?
Writing this question actually made me think about the right buzz-words for my web search - et voila "notebook grading system" gives https://github.com/jupyter/nbgrader right at the top (quite in contrast to "ipython notebook homework tool")!
Sorry for the noise...

Easiest way to share group project code written in MATLAB

We are working on a group project written in MATLAB. We all need to be able to access and write the same program, sometimes simultaneously. We are working on a scientific Linux distribution. We are all physicists, so we would rather find a very simple, ideally GUI-based, solution.
It sounds like GitHub would enable us to write simultaneously and merge mismatched code, but it seems so complicated. We don't really understand the push/pull/fork/commit terminology and we would rather not study it if there is an easier option.
What is the path of least resistance for a group project in MATLAB?
I regularly use Subversion for group MATLAB projects. It has what I find a slightly simpler workflow than Git/GitHub.
The latest versions of MATLAB integrate directly with Subversion, so you can check things in and out directly from within the MATLAB workspace. Alternatively you can use TortoiseSVN, which integrates within Windows Explorer (I believe there is an equivalent for Linux as well).
However, I'll speak bluntly - Git and GitHub are really not that hard, and I'm pretty sure that anyone who's clever enough to be a physicist working with MATLAB is clever enough to understand them as well. Although Subversion is a bit simpler to learn, Git and GitHub have a lot of advantages, they integrate well with many other services, and they're just overall kind of better. The latest versions of MATLAB integrate directly with them in the same way as with Subversion.

Using IntelliJ like IntelliJ

I am a backend developer working on a cocktail of JVM-based languages (mostly Java). I had been using the Eclipse IDE for nearly 4 years until, a week ago, I was mandated to use IntelliJ. I had a look at the IntelliJ documentation to figure out the advantages it offers me over Eclipse, NetBeans, STS, etc., but it was information overload. Currently I have changed the keymap to Eclipse. I believe IntelliJ has a lot more potential waiting to be unearthed. What specific advantages does it offer over Eclipse with respect to exposing/testing REST APIs, connecting to NoSQL DBs, refactoring code, etc.?
I won't be able to produce a detailed comparison matrix between IntelliJ and Eclipse but, for me, here are the most common things I miss when using Eclipse on the workstation of some colleagues:
code navigation only with the keyboard
refactorings and their associated shortcuts (unlike Alt+Shift+L, Ctrl+Alt+V is actually "smart")
efficient "Find usages" (Ctrl+Alt+F7 or Alt+F7)
While debugging, real code editing with autocompletion (watches, code expressions, code fragments)
The new debugging visualization right in the editor (introduced by IntelliJ 14)
Those are only the few things I use when helping colleagues. I'm not even talking about smart autocompletion, property files, ...
I admit that since I switched over to IntelliJ (~5 years ago) I never dove into Eclipse again. I'm sure they've improved a lot since but, from what I see on a daily basis with some coworkers, I'd rather stop programming than use Eclipse =D
Give it a real and serious try. It takes a little time to get used to it, but once you learn the key mapping and most of its productivity features, I'm almost certain you won't regret it. So far, I know nobody who has.

What are the efficiencies afforded by Emacs or Vim vs Eclipse? [closed]

I started coding around 5 years ago. I was introduced through Java and Eclipse which both have substantial stigma attached in the programming community. A number of people at the company I currently intern at prefer emacs or vim. I can't see how a basic text editor is faster or easier than an IDE in general although I appreciate some things like building tend to be faster from the command line.
Is this a case of the 'old-boys' club or can it be more efficient to program a project in this way?
Can you provide some use cases to demonstrate? If I were advocating Eclipse I'd say refactoring and auto-completion were pretty handy tools.
Gav
Vim / Emacs
Very fast/efficient code writing
Low memory footprint
Quick access to command line
Infinite possibilities through scripting/plugins
Never have to leave the keyboard
Eclipse
Full-featured IDE for many languages
Great refactoring support
All of them
Cross-platform
Feature rich
Extensible through plugins
I typically find myself writing volumes of code through vim and performing debugging tasks through my IDE. Familiarity with the code base is certainly a factor, as an IDE is a great tool for jumping around and learning unfamiliar source code.
I got started in IDEs like Eclipse, but switched to Vim about 2 years ago.
Reasons you may want to use a text-mode editor:
It can be used as an IDE for just about any language (you learn it once and use it for everything)
It can do all those fancy things like auto-completion, refactoring and many more complex operations, which you can extend by adding macros or plug-ins
It works just about everywhere (and can be used through an SSH shell)
You don't need a GB of ram to run it
If you really persevere, you will find that working in an editor like this will eventually be faster, and in fact becomes ingrained as a sort of 'muscle memory'. This means you can code without slowing down to think about the process.
The argument "Eclipse for Java" is a different argument than "Eclipse for [something that isn't java]". Eclipse does rock for Java.
I mean, vi is like a screwdriver, or maybe a swiss army knife, and Eclipse is like a big CNC combo mill and asphalt spreader. You don't exactly compare them, you kind of just use both.
Also, are you working inside something giant, which you know little about, but which Eclipse understands? An example would be working on Eclipse itself. Here, Eclipse has perfect visibility, total language support, and you need the toast prompts and the documentation links.
But if you are typing in a 100-line Ruby program to convert an SQL database, Eclipse doesn't add much value, especially considering its baggage.
It's also critical to set up vi right, or you won't grok the appeal. Autoindent, showmatch, tab handling, and various other options should be set. You should have an easy way to generate a tags file. Google can find tag generators, or just write one from scratch with a few lines of shell code and sed(1).
I don't consider refactoring to be a criterion. That's not something you do once an hour or even once a day. Sure, fire up the big IDE when you need to refactor. Oh, and don't expect automated refactoring of anything except Java.
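As a rough idea of how small a from-scratch tag generator can be, here is a sketch in Python rather than shell and sed. It assumes Python-style `def`/`class` definitions and writes each tag as `name<TAB>file<TAB>line-number`, which is one of the address forms a tags file accepts. The regular expression and the helper names `build_tags` and `write_tags` are illustrative assumptions, not part of any existing tool.

```python
import re
from pathlib import Path

# Matches Python-style "def name" / "class name" at the start of a line.
# The pattern is an illustrative assumption; adapt it to your language.
DEFINITION = re.compile(r"^\s*(?:def|class)\s+([A-Za-z_]\w*)")

def build_tags(source_text, filename):
    """Return sorted tag lines of the form 'name<TAB>file<TAB>line-number'."""
    tags = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        match = DEFINITION.match(line)
        if match:
            tags.append(f"{match.group(1)}\t{filename}\t{line_number}")
    return sorted(tags)  # the editor expects tags sorted by name

def write_tags(paths, tags_path="tags"):
    """Scan the given files and write one combined, sorted tags file."""
    lines = []
    for path in paths:
        lines.extend(build_tags(Path(path).read_text(), str(path)))
    Path(tags_path).write_text("\n".join(sorted(lines)) + "\n")

# Small in-memory demonstration that does not touch the file system.
example = "class Shape:\n    def area(self):\n        return 0\n"
example_tags = build_tags(example, "shape.py")
print("\n".join(example_tags))
```

Calling `write_tags` on your source files from the project root would leave a `tags` file there. Line-number addresses go stale as soon as the files change, so a real generator (or a rerun on save) is the better choice for anything long-lived.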
Finally, vi can actually do a lot of things that the IDE can't begin to do. The grouped regular expression global substitution is kind of a generalized refactor-anything engine. To appreciate the vi gestalt you need to learn the line (":") mode. Briefly, it's like having sed(1) inside your editor.
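For readers who have not seen a grouped substitution before, here is the same idea expressed with Python's `re` module, next to the roughly equivalent Vim command. The function names `add_pair` and `sum_of` are made up for illustration.

```python
import re

source = """\
total = add_pair(width, height)
area = add_pair(base, offset)
"""

# Rough equivalent of the Vim command
#   :%s/add_pair(\(\w\+\), \(\w\+\))/sum_of(\2, \1)/g
# The two groups capture the arguments so the replacement can reuse them,
# here in swapped order, while renaming the function at the same time.
pattern = re.compile(r"add_pair\((\w+), (\w+)\)")
refactored = pattern.sub(r"sum_of(\2, \1)", source)

print(refactored)
```

Unlike an IDE refactoring, this is purely textual: it knows nothing about scopes or types, so it works on any language or file format but will also happily rewrite matches inside comments and strings.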
It all depends on what you want/expect and what your usage model is.
If you're looking for a Java IDE, Eclipse is difficult to beat. It's written in Java, for Java, by Java folks.
If you're looking for a tool to edit files from the command line quickly, Emacs or vi both fit the bill.
If you're looking for a tool from which you never have to leave because it can do anything you want (send/read mail, manage projects, todo lists, compile, debug, etc. etc. etc.), then Emacs is more "efficient".
If you're looking for reasons to switch editors, figure out what you want. If you want a better Eclipse, vi and Emacs won't give you that, stick to Eclipse.
If you're looking for a small, nimble editor, vi will fit the bill.
If you're looking for the ultimately extensible editor, Emacs is the way.
Whichever tool you decide to go with, immerse yourself. Learn all of the ins and outs, extend it to meet your needs. Use it to its limits and become efficient in its use.
Emacs can be a powerful IDE, but having gone from Emacs to Eclipse, I have to say I would never go back. Eclipse just offers so many features that you can't get within Emacs.
Mylyn and scoped views of the data and files I'm using, the debugging UI, CVS UI, are all built in and easy to get and use. I'll use the mouse a little to get'em.
First things first. VIM is more productive for programming than Eclipse. Your personal productivity in VIM may be abysmal, but the potential cap of VIM is much higher. This is a fact.
VIM is a martial art. It feels unnatural when you first use it. And you can't even make it work. It takes years of practice to gradually become productive. You focus on mastering a little detail at first. Slowly all these bits you master add up until text is flowing effortlessly out of your fingertips onto the screen. Complicated edits that would make your co-worker sigh will jump from your hands before he can finish his exhale. There are few people who can use VIM. Fewer who can use it productively. And you may never meet a master in your lifetime. But they are rumored to exist.
VIM is designed to keep your hands on the home row. Moving your hand from the keyboard to the mouse is demoralizing. It's a gross motor movement. Moving your arm has a psychological effect that hurts your motivation. Using VIM, someone could bolt your wrists onto the keyboard and you could still easily open up files, split windows, open tabs, build the project, search/replace, change fonts, change colors, etc. And all at lightning speed.
VIM is modal. That means you don't have to do complex key combinations where you hold down Control+Shift+Key, which hurts your hands in the long run. Instead you execute commands. There is no need for key combos due to the modal nature.
We store data in our memory like computers do. Our memory can only hold a few values at a time. See how many distinct integers you can hold in your head before they start to slip away. We overcome this human limitation by writing stuff down. If data falls out of our memory we can easily look at what we wrote down to get it back. If your time is spent doing gross motor, physical things you are losing time that could have been spent on processing data in your brain. You want your mind to flow onto the screen without any effort at all. It may not sound like much but VIM's ability to effortlessly transfer what is in your mind to the screen is a BIG productivity boost. It's hard to put in words what I'm trying to say.
VIM supports code completion. Both textual and look-up based. It can pull text from multiple files. Anything you desire can be had in VIM. Either make it yourself or use something someone else cooked up.
VIM supports going to definitions with ctags. You can also find all references of an item. Again, anything you desire can be had in VIM.
The scripting of VIM is huge. You can download or create thousands of color schemes and change colors in an instant. Try to change fonts or colors in Microsoft Visual Studio and it will hang for 20 seconds while it loads data. It won't let you store color schemes and you must spend 30 minutes tweaking your colors and fonts every time you want a change of scenery. In VIM you can set line spacing to zero to fit more lines of code on the screen. I get over 80 lines. Visual Studio uses 2 pixels of spacing for every line and you can't adjust it!!! Fewer lines = more scrolling = less productivity = forced to use small fonts for more lines = eye strain.
Split windows are opened in an instant in VIM. It's useful when you need to look at data in one section of the code that's far from the place you are typing (or in a different file). You don't have to spend time resizing windows, or worry about GUI windows overlapping each other and falling behind each other. Unrelated code windows can be opened in tabs so as not to take up screen space, while still allowing quick switching.
VIM as an IDE: http://www.youtube.com/watch?v=MQy2rVOf-z0&feature=fvwrel
VIM the revenge: http://www.youtube.com/watch?v=lQNFfhC4QI8
I've used vi for years to edit code in a variety of languages, and really love it. But I've found IDEs like Eclipse to be even nicer for Java development, and now I tend to work in Eclipse almost entirely. I drop out to vi from time to time for a few specialized activities, mainly rote edits that aren't well automated in Eclipse, such as bulk-inserting copyright notices. I also have my Windows .java file type mapped to vi for when I just want to look at a source file without waiting for Eclipse to open up.
Some of the attractive features in Eclipse are:
method name completion
error highlighting
pop-up javadoc comments
refactoring
I do find it a lot more efficient than vi in general, so you should try it out and see if that holds true for you too.
I remember reading somewhere about a study which showed that people perceived keyboard shortcuts as more efficient than mousing, when in fact they weren't always.
Another psychological effect is that we attach value to things which are expensive, i.e. since Emacs is harder to learn it must be better in some way.
I think those effects could explain a lot of the extreme affection some people have for Emacs/Vi.
However, in the case of Eclipse, I find it can be very slow and even crash occasionally, but that is not a case against IDEs in general.
I use Eclipse, VS and Emacs regularly. I would use TextMate too, but I don't have a Mac anymore. It depends on what I am doing, more specifically, what system best supports my language and tools.
I know people who spend considerably more time programming their editor than they spend doing something useful. Some of them even admit that they only do it for the challenge. Other people often claim that Emacs/Vi can do much more than IDEs because they are scriptable. Well, most IDEs (including Eclipse) can also be scripted. In that sense almost all editors are equivalent (though, I admit, some editors are more easily scripted than others).
If you like IDEs, my advice is to keep using one. There is no One True Editor.
EDIT:
This seems to be the article Nick Bastin is referring to. I agree that it is far from a definitive source. However, I think my point about perceived and real productivity not being the same thing still holds.
It depends on the languages.
For Java or .NET use an IDE (Eclipse, NetBeans, Visual Studio...).
For almost all the other languages (C, C++, Ruby, Python, Haskell, Lisp...) vi and Emacs are better in my opinion.
The efficiency provided from vim/emacs is mostly afforded by their heavy keyboard use. In these programs you can do most anything directly from the keyboard, rather than having to stop and use the mouse.
I would go for Emacs over Eclipse any time. I also have to say that bare-bones Emacs is not that great, but after some tweaking you will never want to let it go. In particular, I will tell you how helpful Emacs was while writing my master's thesis, which should make clear why Eclipse is inferior simply because it is less versatile.
For my master's thesis I wrote code in C++, Python and R. In addition, I had to write the thesis itself, for which I used LaTeX. Moreover, I had to write a bunch of shell scripts and CMake scripts. Guess what? Emacs has great support for all of it. In particular, it was a pleasure to work with AUCTeX to produce LaTeX documents. Then, Emacs provides the great ESS mode for working with R. Likewise, it provides facilities for Python. Once I had my CMake scripts for building the C++ code, I only had to call compile from within Emacs and I was done. Eclipse cannot do all these things together, so you would need to learn many different programs. Note taking? There is org-mode for that, and it is great!
And then, my program needed a very powerful computer (not like any laptop). So I could just do everything remotely from within Emacs! Using TRAMP, I found myself doing remote interactive evaluation of R code, and remote compiling, executing and debugging of C++ code, all within the same good local Emacs window I had been using. In contrast, my friends who used a separate tool for everything were much slower in developing software that was meant to run on another computer.
I have other stories like this, but I think this will give you a good idea of the things you can do with Emacs. All in all, I think choosing to use Emacs (despite the learning curve) has been the most productive decision I have ever made.
Hope it helps.
I'd say the actual vim/emacs editors are far superior to the Eclipse text editor in terms of the shortcuts they offer. However, I completely agree with you about refactoring.
Most people have to write scripts to do the sort of level of refactoring Eclipse is capable of. I think part of it is bragging rights or people just doing it the same way they always have.
We've had this argument at work recently. My take was that the one feature I couldn't do without is Emacs's autocomplete. Eclipse's autocomplete is based on syntactic analysis: the code gets parsed, and as you type you're offered a choice of completions. Emacs's autocomplete is based on simple textual analysis. That means it works in plain text, in comments, in documentation, everywhere. I keep saying that Emacs's autocomplete is what IntelliSense wants to be when it grows up.
Update:
Eclipse does offer Alt-/ which is supposed to be similar to Emacs. Not sure how well it works, though.
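The purely textual completion described in this answer is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how Emacs implements it: the `complete` helper is hypothetical, and ranking candidates by frequency is my own assumption.

```python
import re

def complete(prefix, text):
    """Return words in `text` that start with `prefix`, most frequent first."""
    counts = {}
    for word in re.findall(r"\w+", text):
        if word.startswith(prefix) and word != prefix:
            counts[word] = counts.get(word, 0) + 1
    # Sort by descending frequency, then alphabetically for stable output.
    return sorted(counts, key=lambda word: (-counts[word], word))

# Code, comments and prose are all treated the same: just text.
buffer_text = """
# The connection_pool is shared; see the connection_timeout setting.
connection_pool = make_pool()
Notes: a connector is not the same thing as a connection_pool.
"""

suggestions = complete("conn", buffer_text)
print(suggestions)
```

Because nothing is parsed, the same function works on a README, a commit message or a half-broken source file, which is exactly the property the answer is praising. The trade-off is that it can only suggest words already present in the text.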
The only place I prefer an IDE is for debugging. I set up my vim environment for debugging but it was so painful to use, so clunky, that I now just switch to my IDE (NetBeans) when I need to debug. vim is great for text editing; the IDE is great for more complex stuff (like debugging and some project-management-related tasks).
Like some of the posts above, I started out with an IDE (Eclipse). From there I moved to Emacs, and then I moved back to a rich text-editor (TextMate).
For me, the efficiency came from having an editor at the interface level, which allowed me to integrate other services (my own or other people's) into my pseudo-IDE environment.

Emacs in the era of IDEs

I am relatively new to software development. I have noticed that in some cases a text editor with extended text-processing capabilities (I use Notepad++) makes me more productive than an IDE (I use Eclipse and NetBeans). In the era of IDEs, does it make sense to learn Emacs (or some other tool that you suggest)?
Yes, and no.
Yes, for the exact same reason a doctor should be able to reach an approximate diagnosis from your symptoms by using his experience, rather than by putting the list of symptoms into a Google query to find the answer.
Yes for the exact same reason why airliner pilots are taught to fly without fly-by-wire even if all airplanes are today fly-by-wire, so almost everybody is able to keep them flying.
No, because if you need specific tools to make your life easier, such as GUI designers, Intellisense, access to documentation, then clearly Emacs is not enough.
Still, I remember that many developers at Microsoft organized a fundraiser for the Ugandan children's charity that Vim supports.
Summing up, you need to use what makes you more productive. In many cases, emacs (or vim) is more productive than a huge IDE that makes coffee.
Even if you use an IDE, it's still useful to know Emacs/Vim. You don't have your IDE around all the time, and while doing something via SSH you don't really have any other option (yeah, you can use nano, but that's not very effective).
When you do software development, you often deal with a lot of text besides code. I may use an IDE for most of my coding, but often I'll use Vim for plain ol' text viewing and manipulation.
Sometimes I need to view code, SQL scripts, XML, CSV, or TXT files. Other times I may want to perform bulk replacements on those files, or extract out certain chunks of text from it.
IDEs are good for writing and refactoring code, but aren't meant to be used for generic text viewing and manipulation. For that, I'd recommend having the full power of something like Emacs or Vim. Notepad++ can be good too.
In short, use the right tool for the job.
Everyone here seems to think that Emacs/Vim are light-weight compared to IDE's. This couldn't be further from the truth. Even the best IDE's do not have the features that Emacs does. In what IDE's can you program completely without moving your hands from the keyboard, read your email, chat with Jabber, run an integrated debugger, view your calendar, program your own functions, and send dbus commands? That's only the surface of what Emacs allows; I'm sure Vim has similar capabilities.
Ignoring productivity completely; remember why you started programming in the first place. You like to create stuff, you like to know how stuff works, you like creating clever solutions to obscure problems, you like tinkering, you like learning new things, you like creating tools that help you do things.
With this in mind the answer is yes, learn it if you think it will add delight to your days. Maybe you will also get some work done along the way. Fiddling with Emacs will not make you melt. You might even make a lifelong friend of it. Happy Hacking.
Emacs is an IDE. In fact, you could argue it's a whole operating system.
vi, on the other hand, is an editor.
Yes, it makes sense to learn vi, since it's about the only editor you can use on anything vaguely POSIX, even if the GUI isn't running or the network is incredibly slow (vi is usable at 300 baud). Basically, it's the Unix administrator's safety net of an editor. I've used it to rescue myself from broken device drivers on an OS X server that would only come up as far as single-user mode, so even the most GUIfied Unix out there can still be saved by humble vi.
It makes some sense to learn emacs too, but perhaps not quite so much these days.
I would say it helps to learn emacs (or say vim).
Personally I'm comfortable using an IDE (Eclipse) for Java development; when I code in Perl or Python I prefer to use Emacs. Also, if you are in a resource-constrained environment, starting an IDE like Eclipse (which would crawl with anything less than 1 GB of RAM) to write a Perl script might not be feasible. In such cases Vim would be a neat tool of choice.