How to profile Akka applications?

I have a small Akka application that passes many messages between its actors and each actor does some calculations on the data it receives. What I want is to profile this application in order to see which parts of the code take up most time and so on.
I tried VisualVM but I cannot really understand what's going on. I added a picture of the profiler output.
My questions are:
What, for example, is the first line (scala.concurrent.forkjoin.ForkJoinPool.scan()), and why does it take up so much time?
Can Akka applications be profiled well at all, given their asynchronous behaviour?
Can I see for instance how long one specific actor(-type) works for one specific message(-type) it receives?
Are there other best-practices for profiling Akka applications?

Some packages are not profiled by default, and their time is attributed to scala.concurrent.forkjoin.ForkJoinPool.scan() in the profile. If all the hidden packages are allowed to be sampled, the true CPU time consumers are revealed. For example, the before/after profiles show that the threads spend most of their time parked in sun.misc.Unsafe.park, waiting to be unparked.
Akka applications can be profiled quite well with proper instrumentation and call tracing. Google's well-known paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure" explains the technique in detail. Twitter created Zipkin based on it. It is open source and has an extension for distributed tracing of Akka. Follow its wiki for a good explanation of how to set up a system that allows you to:
trace call hierarchies inside an actor system;
debug request processing pipelines (you can log to traces, annotate them with custom key-value pairs);
see dependencies between derived requests and their contribution to resulting response time;
find and analyse slowest requests in your system.
There is also a new kid on the block, Kamon. It is a reactive-friendly toolkit for monitoring applications that run on the JVM, aimed especially at applications built with the Typesafe Reactive Platform. That definitely includes Akka: the integration comes as the kamon-akka and kamon-akka-remote modules, which use bytecode instrumentation to gather metrics and propagate trace context automatically on your behalf. Explore the documentation, starting from the Akka Integration Overview, to understand what it can do and how to set it up.

Just a couple of days ago Typesafe announced that the Typesafe Console is now free. I don't know what could be better for profiling Scala/Akka applications. Of course, you can also try JProfiler for JVM languages; I've used it with Java projects, but it is not free and is geared toward Java.

I have been thinking about profiling and metrics in code, since I also use Akka/Scala a lot for building production applications, but I am also eager to hear about alternative ways to make sure that an application is healthy.
Metrics (like Dropwizard)
Very good tool for collecting metrics in the code, with good documentation and embedded support for Graphite, Ganglia, Logback, etc.
It has extensive tools for collecting in-app statistics such as gauges, counters, histograms and timers: information that tells you the current state of your app, how many actors were created, whether they are alive, what state the majority of actors are in, and so on.
Agreed, it's a bit different from profiling, but it helps a lot in finding the root of a problem, especially if integrated with a chart-building tool.
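To make the idea of in-code timers concrete, here is a minimal sketch in plain Java that records how many times each message type was handled and how long handling took in total. It uses only the JDK and is not the Dropwizard Metrics API; the class MessageTimer and its methods are hypothetical names made up for this illustration, and the message type key is just a string you choose.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical helper (not part of Akka or Dropwizard Metrics):
// accumulates call counts and total handling time per message type.
class MessageTimer {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> nanos = new ConcurrentHashMap<>();

    // Runs the handler and records its wall-clock duration under the given key.
    <T> T time(String messageType, Supplier<T> handler) {
        long start = System.nanoTime();
        try {
            return handler.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            counts.computeIfAbsent(messageType, k -> new LongAdder()).increment();
            nanos.computeIfAbsent(messageType, k -> new LongAdder()).add(elapsed);
        }
    }

    // Number of handled messages of this type (0 if never seen).
    long count(String messageType) {
        LongAdder adder = counts.get(messageType);
        return adder == null ? 0 : adder.sum();
    }

    // Total handling time in nanoseconds for this type (0 if never seen).
    long totalNanos(String messageType) {
        LongAdder adder = nanos.get(messageType);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        MessageTimer timer = new MessageTimer();
        for (int i = 0; i < 1000; i++) {
            timer.time("Compute", () -> Math.sqrt(42.0));
        }
        System.out.println("Compute: " + timer.count("Compute") + " calls, "
                + timer.totalNanos("Compute") + " ns total");
    }
}
```

In an actor you would wrap the body of the message handler in timer.time(...), keyed by something like the actor type plus the message class name. A real metrics library adds histograms and reporters on top of this basic idea.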
Profilers (like VisualVM, XRebel)
I'm a big fan of monitoring, but it answers a slightly different question: what is the state of my application right now?
There is another matter that may concern us: which part of my code is slow (or sloppy)?
For that, we have VisualVM and the other answers to this question on how to profile Akka actors with VisualVM.
I'd also suggest trying the XRebel profiler, which adds a bit more firepower to the process of figuring out what code makes the app slower. It is also paid, but on my project it saved a lot of time dealing with sloppy code.
New Relic
I'd suggest it for some playground projects since you can get some monitoring/profiling solutions for free, but on more serious projects I'd go for things I highlighted above.
I hope this overview was helpful.

Related

Simple JVM to JVMs communication framework?

I know there are lots of options out there, and sorry to ask such a similar question again, but it's different enough to warrant it -- I think. I have one Java app, let's call it the "master", that will do some work, and then it needs to inform other Java apps in other JVMs about it. Today they are on the same machine, but this will not always be the case.
I'd prefer something that has an easy way to add/remove listeners (i.e., other JVMs), etc...so RMI or Web Services are not suitable as there'd be too much manual coding there to look after who is what, etc.
I'd also like the ability to add new Java apps (again, in other JVMs obviously) to the master's 'notify list', whatever it may be, without much effort -- preferably without needing to rebuild the master app.
What I'd really like is an easy messaging/communication framework, which requires some simple configuration.
I'm overwhelmed by the number of frameworks and options out there: JMS, JGroups, the various MQ frameworks, RMI, Jini, Web Services, etc.
I'm looking for fast, simple, reliable, and easy! Any suggestions? I don't need complex or particularly advanced features.

Your master will have to be a server which is always available and the clients will have to register/unregister.
Maybe you can have a look at http://mina.apache.org/mina-project/userguide/ch2-basics/sample-tcp-server.html
MINA is also integrated into the Apache Camel project. (Warning: Camel is a very addictive framework. There is a risk you will try to use it for all your future background processing :)
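The register/unregister contract described above can be sketched without committing to a transport. This is only an in-process illustration in plain Java: the class Master and its methods are hypothetical names, and each callback stands in for whatever connection (a MINA session, a socket, a JMS topic) would actually carry the event to another JVM.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch of the master's "notify list".
// The Consumer is a stand-in for a real network connection to a client JVM.
class Master {
    private final Map<String, Consumer<String>> listeners = new ConcurrentHashMap<>();

    // A client announces itself; registering again under the same id replaces the old callback.
    void register(String clientId, Consumer<String> callback) {
        listeners.put(clientId, callback);
    }

    // A client leaves; unknown ids are ignored.
    void unregister(String clientId) {
        listeners.remove(clientId);
    }

    // Sends the event to every registered client and returns how many were notified.
    int notifyClients(String event) {
        int delivered = 0;
        for (Consumer<String> callback : listeners.values()) {
            callback.accept(event);
            delivered++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        Master master = new Master();
        master.register("reporting", e -> System.out.println("reporting got: " + e));
        master.register("billing", e -> System.out.println("billing got: " + e));
        master.notifyClients("work finished");
        master.unregister("billing");
        master.notifyClients("more work finished");
    }
}
```

Because clients register themselves at runtime, new listeners can be added without rebuilding the master, which is the property the question asks for.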

For which kinds of applications/services/components is the Actor model (Scala, Erlang) best suited?

Besides the benefits of this model over the shared-memory model, I'm trying to understand where to apply it for higher-level use cases.

As for Scala, the Actor model fits most of the multi-threaded cases one can think of:
Swing GUI application
Web Applications (see Lift framework)
Application Server in multicore environment:
Batch processing of requests/data
Background tracking tasks
Notifications & Scheduled tasks
The Actor model makes the design much clearer and greatly simplifies interprocess communication.

OTP framework: provides a really good framework for network-based applications.
Helps in building fault-tolerant applications (process restarts using supervisors in OTP).
Both synchronous and asynchronous modes of communication can be implemented with gen_server.
Event-based callbacks can be implemented with gen_event.
State machines can be programmed easily with gen_fsm (in case you need to follow some states in your application).
A process crash does not bring the whole application down. Only that particular process crashes.
Functional programming language.
A lot easier to program at binary level.
Garbage collection.
Native compilation option.
Fair amount of good useful modules are available.
Able to make good solid concurrent applications easily.
And lots more. I really enjoyed working on some applications in Erlang; building those in C/C++ would have been very difficult.

Real-time system proof-of-concept project

I'm taking an introductory course (3 months) about real-time systems design, but without any implementation.
I would like to build something that lets me better understand what I'll learn in theory, but since I have never built a real-time system I can't estimate how long any project will take. It would be a proof-of-concept project, or something like that, given my available time and knowledge.
Could you please give me some ideas? Thank you in advance.
I program in T-SQL, Delphi and C#, but I would have no problem learning another language.

Suggest you consider exploring the Real-Time Specification for Java (RTSJ). While it is not a traditional environment for constructing real-time software, it is an up-and-coming technology with a lot of interest. Even better, you can witness some of the ongoing debate about what matters and what doesn't in real-time systems.
Sun's JavaRTS is freely available for download, and has some interesting demonstrations available to show deterministic behavior, and show off their RT garbage collector.
In terms of a specific project, I suggest you start simple: 1) Build a work-generator that you can tune to consume a given amount of CPU time; 2) Put this into a framework that can produce a distribution of work-generator tasks (as threads, or as chunks of work executed in a thread) and a mechanism for logging the work produced; 3) Produce charts of the execution time, sojourn time, deadline, slack/overrun of these tasks versus their priority; 4) demonstrate that tasks running in the context of real-time threads (vice timesharing) behave differently.
Bonus points if you can measure the overhead in the scheduler by determining at what supplied load (total CPU time produced by your work generator tasks divided by wall-clock time) your tasks begin missing deadlines.
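Steps 1 to 3 above can be started with very little code. The following is a rough sketch in ordinary Java (not RTSJ, so there are no real-time threads here); the class WorkGenerator, the 2 ms work amount and the 5 ms deadline are all assumptions chosen for illustration. Note that it measures elapsed wall-clock time with System.nanoTime(), which only approximates CPU time when the thread is not preempted.

```java
// Hypothetical work generator: burns a tunable amount of time and reports slack.
class WorkGenerator {

    // Busy-spins until at least targetNanos have elapsed; returns the actual elapsed time.
    static long burn(long targetNanos) {
        long start = System.nanoTime();
        while (System.nanoTime() - start < targetNanos) {
            // spin: the loop itself is the "work"
        }
        return System.nanoTime() - start;
    }

    // Slack is deadline minus execution time; a negative value means the task overran.
    static long slackNanos(long deadlineNanos, long executionNanos) {
        return deadlineNanos - executionNanos;
    }

    public static void main(String[] args) {
        long work = 2_000_000L;     // 2 ms of busy work per task (assumed value)
        long deadline = 5_000_000L; // 5 ms relative deadline per task (assumed value)
        for (int i = 0; i < 5; i++) {
            long exec = burn(work);
            System.out.printf("task %d: exec=%d us, slack=%d us%n",
                    i, exec / 1000, slackNanos(deadline, exec) / 1000);
        }
    }
}
```

Logging these numbers per task, and then rerunning the same tasks under different thread priorities or load levels, gives you the data for the charts in step 3.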

Try to think of real-time tasks that are time-critical, for instance video-playing, which fails if tasks are not finished (e.g. calculating the next frame) in time.
You can also think of some industrial solutions, but they are probably more difficult to study in your local environment.

You should definitely consider building your system using a hardware development board equipped with a small processor (ARM, PIC, AVR, any one will do). This really helped remove my fear of the low-level when I started developing. You'll have to use C or C++ though.
You will then have two alternatives: either go bare-metal, or use a real-time OS.
Going bare-metal, you can learn:
How to initialize your processor from scratch and, most importantly, how to use interrupts, which are the fastest way you have to respond to an external event
How to implement lightweight threads with fast context switching, something every real-time OS implements
In order to ease this a bit, look for a dev kit which comes with lots of documentation and source code. I used Embedded Artists ARM boards and they give you a lot of material.
Going with the RT OS:
You'll fast-track your project, and will be able to learn how to fine-tune an RT OS
You may try your hand at an open-source OS, such as Linux or the BSDs, and learn a lot from the source code
Either choice is good, you will get a really cool hands-on project to show off and hopefully better understand your course material. Good luck!

As most realtime systems are still implemented in C or C++ it may be good to brush up your knowledge of these programming languages. Many realtime systems are also embedded systems, so you might want to play around with a cheap open source one like BeagleBoard (http://beagleboard.org/). This will also give you a chance to learn about cross compiling etc.

How do I plan an enterprise level web application?

I'm at a point in my freelance career where I've developed several web applications for small to medium sized businesses that support things such as project management, booking/reservations, and email management.
I like the work but find that eventually my applications get to a point where the overhead for maintenance is very high. I look back at code I wrote 6 months ago and find I have to spend a while just relearning how I originally coded it before I can make a fix or add a feature. I do try to use frameworks (I've used Zend Framework before, and am considering Django for my next project).
What techniques or strategies do you use to plan out an application that is capable of handling a lot of users without breaking and still keeping the code clean enough to maintain easily?
If anyone has any books or articles they could recommend, that would be greatly appreciated as well.

Although there are certainly good articles on that topic, none of them is a substitute for real-world experience.
Maintainability is nothing you can plan straight ahead, except on very small projects. It is something you need to take care of during the whole project. In fact, creating loads of classes and infrastructure code in advance can produce code which is even harder to understand than naive spaghetti code.
So my advice is to clean up your existing projects by continuously refactoring them. Look at the parts which were a pain to change, and strive for simpler solutions that are easier to understand and to adjust. If the code is too bad even for that, consider rewriting it from scratch.
Don't start new projects and expect them to succeed just because you read some more articles or used a new framework. Instead, identify the failures of your existing projects and fix their specific problems. Whenever you need to change your code, ask yourself how to restructure it to support similar changes in the future. This is what you need to do anyway, because there will be similar changes in the future.
By doing those refactorings you'll stumble across various specific questions you can ask and read articles about. That way you'll learn more than by just asking general questions and reading general articles about maintenance and frameworks.
Start cleaning up your code today. Don't defer it to your future projects.
(The same is true for documentation. Everyone's first docs were very bad. After several months they turn out to be too verbose and filled with unimportant stuff. So complement the documentation with solutions to the problems you really had, because chances are good that next year you'll be confronted with a similar problem. Those experiences will improve your writing style more than any "how to write good" style guide.)

I'd honestly recommend looking at Martin Fowler's Patterns of Enterprise Application Architecture. It discusses a lot of ways to make your application more organized and maintainable. In addition, I would recommend using unit testing to give you better comprehension of your code. Kent Beck's book on Test Driven Development is a great resource for learning how to address change to your code through unit tests.

To improve the maintainability you could:
If you are the sole developer then adopt a coding style and stick to it. That will give you confidence later when navigating through your own code about things you could have possibly done and the things that you absolutely wouldn't. Being confident where to look and what to look for and what not to look for will save you a lot of time.
Always take time to bring documentation up to date. Include the task in the development plan; include that time in the plan as part of any change or new feature.
Keep documentation balanced: some high-level diagrams, meaningful comments. The best comments tell what cannot be read from the code itself, like business reasons or the "whys" behind certain chunks of code.
Include in the plan the effort to keep code structure, folder names, namespaces, object, variable and routine names up to date and reflective of what they actually do. This will go a long way in improving maintainability. Always call a spade a "spade". Avoid large chunks of code; structure it by the means available within your language of choice, and give chunks meaningful names.
Low coupling and high cohesion. Make sure you are up to date with techniques for achieving these: design by contract, dependency injection, aspects, design patterns, etc.
From a task management point of view, you should estimate more time and charge a higher rate for non-continuous pieces of work. Do not hesitate to make the customer aware that you need extra time to do small non-continuous changes spread over time, as opposed to bigger continuous projects and ongoing maintenance, since the administration and analysis overhead is greater (you need to manage and analyse each change separately, including its impact on the existing system). One benefit your customer is going to get is greater life expectancy of the system. The other is accurate documentation that will preserve their option to seek someone else's help should they decide to do so. Both protect the customer's investment and are strong selling points.
Use source control if you don't do that already
Keep a detailed log of everything done for the customer plus any important communication (a simple computer or paper based CMS). Refresh your memory before each assignment.
Keep a log of issues left open, ideas, suggestions per customer; again refresh your memory before beginning an assignment.
Plan ahead how the post-implementation support is going to be conducted, and discuss it with the customer. Make sure your systems are easy to maintain. Plan for parameterisation, monitoring tools, built-in sanity checks. Sell post-implementation support to the customer as part of the initial contract.
Expand by hiring, even if you need someone just to provide that post-implementation support, do the admin bits.
Recommended reading:
"Code Complete" by Steve McConnell
Anything on design patterns also belongs on the list of recommended reading.

The most important advice I can give, having helped grow an old web application into a highly available, high-demand web application, is to encapsulate everything. In particular:
Use good MVC principles and frameworks to separate your view layer from your business logic and data model.
Use a robust persistence layer so that your business logic is not coupled to your data model.
Plan for statelessness and asynchronous behaviour.
Here is an excellent article on how eBay tackles these problems
http://www.infoq.com/articles/ebay-scalability-best-practices

Use a framework / MVC system. The more organised and centralized your code is the better.
Try using Memcache. PHP has a built-in extension for it; it takes about ten minutes to set up and another twenty to put in your application. You can cache whatever you want in it. I cache all my database records in it, for every application. It does wonders.
I would recommend using a source control system such as Subversion if you aren't already.

You should maybe consider using SharePoint. It's an environment that is already designed to do everything you have mentioned, and has many other features you maybe haven't thought about (but may need in the future :-) )
Here's some information from the official site.
There are 2 different SharePoint environments you can use: Windows SharePoint Services (WSS) or Microsoft Office SharePoint Server (MOSS). WSS is free and ships with Windows Server 2003, while MOSS isn't free, but has many more features and covers almost all your enterprise's needs.

Testing a client-server application

I am coding a client-server application using Eclipse's RCP.
We are having trouble testing the interaction between the two sides, as they both contain a lot of GUI and provide no command-line or other remote API.
Got any ideas?

I have about 1.5 years' worth of experience with the RCP framework, and I really liked it. We simply used JUnit for testing.
It's sort of cliche to say, but if it's not easy to test, maybe the design needs some refactoring?
Java and the RCP framework provide great facilities for keeping GUI code and logic code separate. We used the MVC pattern with the Observer/Observable constructs that are available in Java.
If you don't know about the Observer/Observable constructs in Java, I would HIGHLY recommend you take a look at this: http://www.javaworld.com/javaworld/jw-10-1996/jw-10-howto.html. You will use them all the time and your apps will be easier to test.
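As a small illustration of why this separation helps testing, here is a sketch of a GUI-free model that notifies observers when it changes. It uses java.beans.PropertyChangeSupport rather than java.util.Observable, because Observable has been deprecated since Java 9; the class ConnectionModel and its "status" property are hypothetical names for this example.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model class: holds state and notifies listeners, with no GUI code in it.
class ConnectionModel {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String status = "disconnected";

    // Views (or tests) subscribe here.
    void addListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    String getStatus() {
        return status;
    }

    // Updates the state and fires an event; no event is fired if the value did not change.
    void setStatus(String newStatus) {
        String old = status;
        status = newStatus;
        changes.firePropertyChange("status", old, newStatus);
    }

    public static void main(String[] args) {
        ConnectionModel model = new ConnectionModel();
        List<Object> seen = new ArrayList<>();
        // In the application this listener would update a widget; in a test it records events.
        model.addListener(event -> seen.add(event.getNewValue()));
        model.setStatus("connected");
        System.out.println("observed changes: " + seen);
    }
}
```

A JUnit test can attach a listener exactly like the one in main and assert on the recorded events, without starting any part of the UI.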

As a former Test & Commissioning manager, I would strongly argue for a test API. It does not remove the need for user interface testing, but you will be able to add automated tests and non-regression tests.
If that is absolutely impossible, I would set up a test proxy, where you will be able to:
Do nothing (transparent proxy). Your app should behave normally.
Spy / Log data traffic. Add a filter mechanism so you don't grab everything
Block specific messages. Your filter system is very useful here
Corrupt specific messages (this is more difficult)
If you need some sort of network testing:
Limit general throughput (some libraries do this)
Delay messages (same remark)
Change packet order (quite difficult)
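The first three proxy modes above (transparent, spy with a filter, block with a filter) can be sketched as follows. This is an in-memory illustration only: the class TestProxy is a hypothetical name, messages are plain strings, and the target Consumer stands in for the real connection to the server. Corrupting, delaying and reordering messages are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical test proxy sitting between client and server.
class TestProxy {
    private final Consumer<String> target;              // stand-in for the real server connection
    private final List<String> log = new ArrayList<>();
    private Predicate<String> spyFilter = m -> false;   // by default nothing is logged
    private Predicate<String> blockFilter = m -> false; // by default nothing is blocked

    TestProxy(Consumer<String> target) {
        this.target = target;
    }

    // Log only the messages matching the filter, so you don't grab everything.
    void spyOn(Predicate<String> filter) {
        spyFilter = filter;
    }

    // Drop the messages matching the filter.
    void block(Predicate<String> filter) {
        blockFilter = filter;
    }

    List<String> log() {
        return log;
    }

    // Forwards the message unless blocked; returns true if it was forwarded.
    boolean send(String message) {
        if (spyFilter.test(message)) {
            log.add(message);
        }
        if (blockFilter.test(message)) {
            return false;
        }
        target.accept(message);
        return true;
    }

    public static void main(String[] args) {
        TestProxy proxy = new TestProxy(m -> System.out.println("server received: " + m));
        proxy.send("PING");                       // transparent: forwarded untouched
        proxy.spyOn(m -> m.startsWith("LOGIN"));
        proxy.block(m -> m.contains("secret"));
        proxy.send("LOGIN alice");                // logged and forwarded
        proxy.send("LOGIN secret");               // logged but blocked
        System.out.println("spy log: " + proxy.log());
    }
}
```

With no filters set, the proxy is transparent and the application should behave normally, which is a useful first check before enabling the other modes.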

Have you considered using a UI functional testing tool? You could check out HP's QuickTest Professional, which covers a wide variety of UI technologies.

We are developing a client-server application using EJB (J2EE), Eclipse and MySQL as the database. Please suggest an open-source tool for functional testing.
Thanks
Hitesh Shah

Separate your client-server communication into a pure logic module (or package). Test this separately - either have a test server, or use mock objects.
Then, have your UI actions invoke the communications layer. Also, have a look at the command design pattern, using it may help you.
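One way this could look, as a rough sketch: the communication layer hides behind an interface, each UI action becomes a command that talks to that interface, and tests substitute a fake server. All names here (ServerConnection, Command, SaveDocumentCommand, FakeServer) and the "SAVE ..." wire format are hypothetical and only illustrate the shape.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical communication layer: pure logic, no GUI.
interface ServerConnection {
    String request(String payload);
}

// Command pattern: each UI action is an object that can run against any connection.
interface Command {
    String execute(ServerConnection connection);
}

// One concrete UI action; the message format is made up for this sketch.
class SaveDocumentCommand implements Command {
    private final String documentName;

    SaveDocumentCommand(String documentName) {
        this.documentName = documentName;
    }

    @Override
    public String execute(ServerConnection connection) {
        return connection.request("SAVE " + documentName);
    }
}

// Mock object for tests: records what was sent and answers with a canned reply.
class FakeServer implements ServerConnection {
    final List<String> received = new ArrayList<>();

    @Override
    public String request(String payload) {
        received.add(payload);
        return "OK";
    }
}

class CommandDemo {
    public static void main(String[] args) {
        FakeServer server = new FakeServer();
        // A button handler in the GUI would do exactly this, but with the real connection.
        String reply = new SaveDocumentCommand("report.txt").execute(server);
        System.out.println("reply=" + reply + ", server saw " + server.received);
    }
}
```

The GUI then only constructs commands and displays replies, so the client-server interaction can be tested with JUnit against FakeServer or against a real test server.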
