I'm trying to test the performance of RocksDB. I installed both DB-Bench for performance testing and also RocksDB. I have no idea how to start the RocksDB service on Ubuntu and run the performance benchmark using db-bench. Any help/advice is greatly appreciated.
RocksDB is not an application; it is a library that you build into your application, so there is no service to start. I think RocksDB ships with a simple demonstrator called rdb, but it also includes tools for benchmarking, of which db_bench is the one primarily suggested. That will tell you how fast RocksDB is on your system compared to others, but in the end the performance will depend greatly on how you use it in your application.
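As a rough illustration of the answer above, here is a minimal sketch that assembles a db_bench command line. It assumes db_bench has been built and sits in the current directory as ./db_bench, and that /tmp/rocksdb_bench is a scratch directory you are happy to fill with data; both paths and the helper name db_bench_command are hypothetical. The flags used (--benchmarks, --db, --num, --value_size, --threads) are standard db_bench options, but check `./db_bench --help` on your build before relying on them.

```python
import shlex


def db_bench_command(db_dir, benchmarks=("fillseq", "readrandom"),
                     num_keys=1_000_000, value_size=100, threads=1,
                     binary="./db_bench"):
    """Build the argv list for one db_bench run (nothing is executed here).

    The binary path and db_dir are assumptions; adjust them to your setup.
    """
    return [
        binary,
        f"--benchmarks={','.join(benchmarks)}",  # workloads, run in this order
        f"--db={db_dir}",                        # directory holding the test DB
        f"--num={num_keys}",                     # number of keys to write/read
        f"--value_size={value_size}",            # bytes per value
        f"--threads={threads}",                  # concurrent benchmark threads
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell.
    print(shlex.join(db_bench_command("/tmp/rocksdb_bench")))
```

Running the printed command in a shell (or passing the list to subprocess.run) starts the benchmark directly; db_bench opens the database itself, so no separate server process is involved.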
I implemented a server application with Play Framework.
I built native packages for different Operating Systems (Linux, Windows, Mac OS X) with SBT Native Packager.
This application requires a NoSQL Database. In particular, I am using MongoDB. Is there a way to embed MongoDB binary/package in my native package? Is this the best practice? Or do you suggest to install MongoDB and my Play application with two different packages?
If it is not possible / recommended to embed MongoDB in a package, do you suggest another DBMS (for instance Nitrite Database)? Thanks
This is not really best practice. Play has an H2 in-memory DB embedded, but this is only intended for development (because it is quicker than something that also reads from and writes to disk).
You really want to have your Mongo (or whatever other data store you decide to use) instance running in a different process, and packaged, deployed, stopped, started separately from your Play application.
You could probably figure out how to package it with your Play application and then have some script run during app startup to set up the database and load any existing data from the --dbpath directory, i.e. whenever you redeploy/restart your application. But then you would have to stop/redeploy your Mongo binaries each time you redeploy a code change. You may update your application several times over a year, but you are unlikely to want to update your Mongo binaries as often. I could go on, but don't do it. It is best practice to manage your data stores separately from your applications.
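To make the "separate process" advice concrete, here is a minimal sketch of how an application can locate an externally managed MongoDB instead of bundling one. It is written in Python for brevity rather than as Play/Scala code, and the environment variable name APP_MONGO_URI and the helper mongo_uri are hypothetical; only the mongodb:// and mongodb+srv:// URI schemes and the default port 27017 come from MongoDB itself.

```python
import os

# Default points at a MongoDB that was installed and started separately.
DEFAULT_MONGO_URI = "mongodb://localhost:27017"


def mongo_uri(env=None):
    """Return the MongoDB connection URI from the environment.

    APP_MONGO_URI is a made-up variable name for this sketch; the point is
    that the application only knows an address, not how Mongo is deployed.
    """
    env = os.environ if env is None else env
    uri = env.get("APP_MONGO_URI", DEFAULT_MONGO_URI)
    if not uri.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError(f"not a MongoDB connection URI: {uri!r}")
    return uri


if __name__ == "__main__":
    print(mongo_uri())
```

With this split, redeploying the application changes nothing about the database process, and the same package can point at a local, staging, or production instance by changing one setting.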
Production system : HDP-2.5.0.0 using Ambari 2.4.0.1
Plenty of requests are coming in to execute a range of code (Java MR etc., Scala, Spark, R) on top of HDP, but from an IDE on a Windows desktop machine.
For Spark and R, we have R-Studio set-up.
The challenge lies with Java, Scala and so on, also, people use a range of IDEs from Eclipse to IntelliJ Idea.
I am aware that the Eclipse Hadoop plugin is NOT actively maintained and also has plenty of bugs when working with the latest versions of Hadoop. For IntelliJ Idea, I couldn't find reliable information on the official website.
I believe the Hive and HBase client API is a reliable way to connect from Eclipse etc. but I am skeptical about executing MR or other custom Java/Scala code.
I referred to several threads like this and this; however, I still have the question: does any IDE like Eclipse/IntelliJ Idea have official support for Hadoop? Even Spring Data for Hadoop seems to have lost traction; it didn't work as expected 2 years ago anyway ;)
As a realistic alternative, which tool/plugin/library should be used to test the MR and other Java/Scala code 'locally', i.e. on the desktop machine using a standalone version of the cluster?
Note: I do not wish to work against/in the sandbox; it's about connecting to the prod. cluster directly.
I don't think that there is a general solution which would work for all Hadoop services equally. Each service has its own development, testing and deployment scenarios, as they are different standalone products. For the MR case you can use MRUnit to simulate your job locally from the IDE. Another option is LocalJobRunner. They both allow you to check your MR logic directly from the IDE. For Storm you can use the backtype.storm.Testing library to simulate a topology's workflow. But they are all used from the IDE without direct cluster communication, unlike the Spark and RStudio integration.
As for the MR recommendation, your job should ideally pass through the following lifecycle: write the job and test it locally using MRUnit, then run it on a development cluster with some test data (see MiniCluster as an option), and then run it on the real cluster with some custom counters, which will help you locate malformed data and maintain the job properly.
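The first and last steps of that lifecycle (checking the map/reduce logic locally, and using custom counters to find malformed data) can be sketched without any Hadoop dependency. The example below is plain Python, not MRUnit or LocalJobRunner; the record format ("key,value" lines), the counter name MALFORMED_RECORDS and all function names are assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def map_record(line, counters):
    """Mapper: parse a 'key,value' line; count and skip malformed input."""
    parts = line.split(",")
    if len(parts) != 2:
        counters["MALFORMED_RECORDS"] += 1
        return []
    key, raw_value = parts
    try:
        value = int(raw_value)
    except ValueError:
        counters["MALFORMED_RECORDS"] += 1
        return []
    return [(key.strip(), value)]


def reduce_values(key, values):
    """Reducer: sum all values seen for one key."""
    return key, sum(values)


def run_local(lines):
    """Run map, group-by-key (the 'shuffle'), and reduce in one process."""
    counters = Counter()
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_record(line, counters):
            grouped[key].append(value)
    results = dict(reduce_values(k, v) for k, v in sorted(grouped.items()))
    return results, counters


if __name__ == "__main__":
    results, counters = run_local(["a,1", "b,2", "a,3", "broken", "c,x"])
    print(results, dict(counters))
```

Keeping the mapper and reducer as small pure functions like this is what makes them easy to exercise from an IDE; in a real Java job the same shape is what MRUnit drives, and the counter would be a Hadoop counter visible in the job's output on the cluster.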
Pardon if I can't give more pointers, but I'm really a noob at wildfly. I'm using version 9.0.2.
I have deployed jbpm-console, drools, and dashboard - no problems here. I restart wildfly using the jboss CLI, and when I log in again, the repositories don't appear in the web interface or on disk (at least nothing that grep or find will show).
I'm using the H2 database. I'm not even sure where to look, does anyone have any idea?
Thanks in advance!
After enough reading through the docs, it would seem that it's necessary to configure jBPM to persist. From the docs:
"By default, the engine does not save runtime data persistently. This means you can use the engine completely without persistence (so not even requiring an in memory database) if necessary, for example for performance reasons, or when you would like to manage persistence yourself. It is, however, possible to configure the engine to do use persistence by configuring it to do so. This usually requires adding the necessary dependencies, configuring a datasource and creating the engine with persistence configured."
https://docs.jboss.org/jbpm/v5.3/userguide/ch.core-persistence.html
I'd like to use zookeeper in one of my applications for distributed configuration management. The application is currently running in distributed environment and having to restart nodes for configuration files changes is a headache.
However, we want the zookeeper process to be started from within the application. The point is to reduce startup dependencies and operational cost. We already have startup/shutdown scripts for the application and we need to reduce the impact on the operations team.
Has anyone done something similar? Is this setup recommended, or are there better solutions? Any tip or feedback is appreciated.
I have a blog post that describes how to embed Zookeeper in an application. The Zookeeper developers don't recommend it, though, and I would tend to agree now, though I had the same rationale for embedding it that you do - to reduce the number of moving parts.
You want to keep your ZK cluster stable, but you will need to restart your app for code updates etc., and each restart impacts the ZK cluster's stability.
Ultimately you will end up using your ZK cluster for multiple apps and those extra moving parts will be amortized over a number of projects.
I want to know how I can speed up RSA 7.5 (an IDE by IBM with Eclipse under the hood and WebSphere server runtimes), mainly the server start. The first time I start it after a computer reboot it loads fine, but after that it takes forever to start/stop the server. The server's debug mode also takes forever to start.
I am using server 7 run time for IBM RSA 7.5.
So basically RAD/RSA has WebSphere runtimes, which allow you to configure the server runtime start/stop within RAD/RSA. The runtime allows you to develop webapps and test them on the server or deploy them on the WebSphere runtime.
The problem I am facing is with the websphere runtime which works fine after computer reboot but is very slow after several deployments/publishing of the same web app.
I would be grateful if you could give tips for speeding up RSA server start/shutdown, as well as overall performance tips. I have plenty of memory (12 GB) and a 6-core i7 on Win7.
Of course, if you are running the server in debug mode it's going to be a lot slower, but you have a few options, like putting the server in development mode or doing some fine tuning as to which applications should start. Take a look at these articles:
Rational Application Developer Performance Tips
Case study: Tuning WebSphere Application Server V7 and V8 for performance
Performance tuning WebSphere Application Server 7 on AIX 6.1
WebSphere tuning for the impatient: How to get 80% of the performance improvement with 20% of the effort
WebSphere Performance Monitoring & Tuning
Some of these are a bit dated, but they have some good information that may still be relevant to your issues, especially the first one.
Make sure that the workspaces are stored on a local disk.
edit - forgot this: buy an SSD. It makes a huge difference when developing.
If you have a virus scanner, disable on-access scan in the SDP installation directory including the server plugin, and in all your workspaces.
Uninstall any applications (ears) you are not using - the more you have installed the longer the server takes to startup. If your server is taking too long to start, RAD/RSA will assume it has timed out and stop it before it finishes starting - if this happens then increase the start timeout limit by double-clicking your server in the Servers tab and modifying the values in the Timeouts section.
Oh, and if you have a lot of datasources defined, and autostarting connection pools with a lot of connections, it may also take a while to start the pools.
But that can't explain it all... I haven't tested, but since WAS and RSA seem to spend a lot of time doing absolutely nothing, I am starting to suspect they are trying to download schemas or something. If you have the time, you could try to trace and see if you find something like that...
I came across this post while trying to troubleshoot my RSA performance. I figured I would update it with a recommendation for improving performance on RSA 8.0.4.
http://publib.boulder.ibm.com/infocenter/radhelp/v8/index.jsp?topic=/com.ibm.performance.doc/topics/cperformance.html has some excellent tips on improving performance in the "Performance Tips" section. After implementing just some of the "Always" tips, I've found my memory usage dropping significantly and performance being much faster.
You should start with the "Always" tips and then move to the "Sometimes" and "Rarely" ones for finer tuning.