Unable to run YCSB successfully for ElasticSearch - nosql

I am new to both YCSB and ElasticSearch. I was able to run YCSB easily for Cassandra. However, I have not been able to do the same with ES (or perhaps I have, but I am not sure).
Following the steps documented in YCSB/elasticsearch, I was able to start the test and I even got results. What I cannot tell is which ES instance the test ran against. For Cassandra, I had to start the Cassandra server myself and then run the tests (providing the host details along with the ycsb command). ES, on the other hand, does not seem to require anything of that sort. So how does YCSB run these tests? I didn't even have my local ES instance up, but the tests still gave results.
Any insights would really help.
Thanks!
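
One possible explanation, worth checking against the README of your YCSB version: the elasticsearch binding defaults to `es.remote=false`, in which case it starts an embedded local node inside the YCSB process, so results appear even with no ES server running. Assuming that is what happened here, the sketch below builds a command line that points YCSB at an external cluster through the `es.remote` and `es.hosts.list` properties. The helper name, the host, and the workload path are illustrative assumptions, not part of YCSB.

```python
def es_ycsb_command(phase, hosts=None, workload="workloads/workloada"):
    """Build a YCSB argument list for the elasticsearch binding.

    Hypothetical helper: with no hosts, the binding's defaults apply
    (assumed here to be an embedded local node).
    """
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    cmd = ["./bin/ycsb", phase, "elasticsearch", "-s", "-P", workload]
    if hosts:
        # Point the binding at an external cluster instead of the embedded node.
        cmd += ["-p", "es.remote=true", "-p", "es.hosts.list=" + ",".join(hosts)]
    return cmd


if __name__ == "__main__":
    # localhost:9300 is an assumed transport address; adjust to your cluster.
    print(" ".join(es_ycsb_command("load", ["localhost:9300"])))
    print(" ".join(es_ycsb_command("run", ["localhost:9300"])))
```

If the remote run fails while your ES server is down and succeeds once it is up, that would confirm the earlier results came from the embedded node.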

Related

Can I catch events such as on Executor start in Apache Spark?

What I want is for the executor to start a program, such as a profiling tool, when it starts (that is, before it starts executing any task). That way, it would be possible to monitor things like the CPU usage of an executor. Does Spark provide such hooks/callbacks? I have used SparkListener, but that runs on the driver side. Is there a similar mechanism for executors?
This should work for your requirement.
http://spark.apache.org/developer-tools.html#profiling
Set up YourKit to work with both the driver and the slaves (executors). It doesn't start profiling until you tell it to. Connect to the master or a slave, start profiling, and then run your tests.
Happy profiling!!

Where does Spark store its profiling results?

One can view profiling statistics of a Spark program through a browser on port 4040. However, the cluster I am running on doesn't have a browser, and I'm not the admin. Is this information also logged to a file, so that I can write my own tool to visualize the statistics by reading that file?
Note that I run Spark over YARN. But sometimes I also run using local standalone mode. So any answer related to that is also appreciated.

How to monitor how much resources a query uses in mongodb

I'm using MongoDB on Ubuntu 14.04 and I need to test CPU/Memory usage of different types of queries.
I'm wondering whether there is a script I can write or a method I can use. I have tried iotop, but it doesn't seem to be useful. Some guidance would be appreciated.

Seeking examples of scripts/syntax for testing MongoDB with YCSB

I'm testing the performance of MongoDB on a single system using YCSB. I'd like to get a sense of the performance using SSDs compared to spinning disks.
I have CentOS, MongoDB, and YCSB installed. I have stumbled around a bit with basic examples, but have yet to see a step-by-step walkthrough that goes from this setup to loading, running, and reviewing the results. I keep seeing bits and pieces, but not enough to get me up and running.
If anyone could please provide a command line for these steps, it would be most appreciated!
Thanks
Here are guides on how to run the Yahoo! Cloud Serving Benchmark (YCSB) against MongoDB:
https://github.com/samanca/YCSB/tree/master/mongodb
https://github.com/brianfrankcooper/YCSB/wiki
A working example that uses Python and Java to test MongoDB:
https://github.com/richcar58/MongoDBTools/blob/master/RunYcsb/runycsb/fabfile.py
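
As a rough sketch of the load, run, and review steps asked about: the first helper below builds the command lines for the `mongodb` binding (using its `mongodb.url` property), and the second parses the `[SECTION], Metric, value` summary lines that YCSB prints at the end of a run. The function names, the default URL, and the workload path are assumptions for illustration; check them against your installation.

```python
def mongo_ycsb_command(phase, url="mongodb://localhost:27017/ycsb",
                       workload="workloads/workloada"):
    """Build a YCSB argument list for the mongodb binding (hypothetical helper)."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    return ["./bin/ycsb", phase, "mongodb", "-s", "-P", workload,
            "-p", "mongodb.url=" + url]


def parse_ycsb_summary(text):
    """Parse '[SECTION], Metric, value' lines into {(section, metric): float}."""
    results = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue  # skip status output and anything that is not a summary line
        try:
            results[(parts[0].strip("[]"), parts[1])] = float(parts[2])
        except ValueError:
            continue  # value was not numeric
    return results


if __name__ == "__main__":
    # Step 1: load the data set. Step 2: run the workload.
    for phase in ("load", "run"):
        print(" ".join(mongo_ycsb_command(phase)))
    # Step 3: review. These sample lines are made up to show the format only.
    sample = "[OVERALL], RunTime(ms), 1000\n[OVERALL], Throughput(ops/sec), 500.0\n"
    for key, value in parse_ycsb_summary(sample).items():
        print(key, value)
```

To compare SSDs with spinning disks, one approach is to redirect the output of each run to a file per disk type, feed both files to the parser, and compare the `OVERALL` throughput and the per-operation latency entries.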

mongodb sharding issue with 2.5.5 development version

I am trying to run performance tests for one of my applications using MongoDB. I am using the 2.5.5 development version. Sharding works fine when I read and write data through mongos.
For the performance test I need to start 600-700 threads, each with its own connection to mongos. Each thread queries around 2000 documents, which are distributed across two shards. The test runs fine for a few minutes, but after some time it stops working with the error "Connection refused by one of the shard". Looking closely, I found that the server runs out of ports when this many threads request data.
Could anyone please have a look and let me know whether this is a MongoDB bug in the dev version or something I am doing wrong while connecting to the database?
Your help will be much appreciated.
Thanks,
Vibhu
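
Whether 2.5.5 itself is at fault is hard to say from the description. If the port exhaustion comes from every thread opening its own connection, one common mitigation is to share a bounded pool of connections across all worker threads. The sketch below shows only that pattern, using standard-library primitives and a stand-in `FakeConnection` class; it is not MongoDB driver code, and all names in it are hypothetical.

```python
import queue
import threading


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""
    opened = 0
    _lock = threading.Lock()

    def __init__(self):
        with FakeConnection._lock:
            FakeConnection.opened += 1  # count how many connections get opened

    def query(self, n_docs):
        return n_docs  # pretend n_docs documents were fetched


class BoundedPool:
    """Fixed set of connections shared by any number of threads."""

    def __init__(self, size):
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(FakeConnection())

    def run(self, n_docs):
        conn = self._free.get()  # blocks until a connection is free
        try:
            return conn.query(n_docs)
        finally:
            self._free.put(conn)  # always hand the connection back


def run_workers(pool, workers, docs_per_query):
    """Start `workers` threads that each issue one query through the pool."""
    results = []
    lock = threading.Lock()

    def work():
        fetched = pool.run(docs_per_query)
        with lock:
            results.append(fetched)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    pool = BoundedPool(size=10)
    done = run_workers(pool, workers=600, docs_per_query=2000)
    print(len(done), "queries served by", FakeConnection.opened, "connections")
```

With a real driver, the equivalent is usually to create one client object for the whole process and let its built-in pool cap the number of sockets, rather than constructing a client per thread.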