MongoLab and Elasticsearch - mongodb

My Mongo database is hosted at MongoLab. I'd like to use Elasticsearch as a full-text search engine on top of my DB.
As I understand it, MongoDB needs to run as a replica set for this, but I don't have any control over how the database runs. I'm currently using the 500 MB free plan.
On top of that, I'm using the Scala Play Framework.
Has anyone been successful with these technologies and services?
Update:
In the end I stopped using MongoDB and went straight for an Elasticsearch-only solution.
I found this nice cloud host providing a 500 MB free plan: http://facetflow.com/
It was very useful for my development.
I didn't find any satisfying Scala library for ES, so I'm using Dispatch to make direct HTTP requests to the ES instance.
I hope that someone will find this useful.
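For readers wondering what "direct HTTP requests to the ES instance" looks like, here is a minimal sketch. It is written in Python with only the standard library rather than Scala/Dispatch, so it shows the shape of the request, not the code from the update above. The host, the index name `articles`, and the field name `title` are hypothetical placeholders.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text, size=10):
    """Build (but do not send) a full-text search request for Elasticsearch.

    base_url, index and field are placeholders chosen by the caller.
    """
    # A "match" query runs an analyzed full-text search on a single field.
    body = {"query": {"match": {field: text}}, "size": size}
    url = f"{base_url.rstrip('/')}/{index}/_search"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(request, timeout=5.0):
    """Send the request and return the _source of each matching document."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [hit["_source"] for hit in payload["hits"]["hits"]]
```

Usage would be something like `search(build_search_request("http://localhost:9200", "articles", "title", "mongodb"))`. A hosted instance will normally also need credentials, which this sketch leaves out.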

Just a quick note: MongoHQ has oplog support with their MongoDB Elastic Deployments, which could help you with using Elasticsearch and a river.
http://blog.mongohq.com/elastic-deployments-now-with-oplog-access/

I haven't looked into this too deeply, but you might want to check out Searchly: http://www.searchly.com/features/ . The features page mentions:
"Built-in crawler for crawling web pages and databases. (Currently MongoDB)"
If you try this out, please let me know how it goes. I will do the same.

Update:
I haven't tried Searchly, but I was able to start a MongoDB instance in replica set mode on OpenShift.
I also have an Elasticsearch server running on the same OpenShift "gear".
Now I need time to try connecting the two, and then the fun will start :-)

Related

AWS Glue with MongoDB Atlas

I've tried multiple things to connect AWS Glue to MongoDB Atlas. Has anyone been successful in doing so? If so, could you please help me with the steps?
The AWS documentation claims that it should work with any compatible MongoDB link, but it doesn't.
I am facing a similar issue. I checked with the AWS support team, and it seems they have a huge backlog of similar requests from customers who want to connect to MongoDB Atlas. Unfortunately, they don't have an ETA for this.
You can either migrate to Amazon DocumentDB and then use Glue to crawl your data store, or find some other way to get your data from Atlas into a layer that Glue supports properly.

How to monitor how many resources a query uses in MongoDB

I'm using MongoDB on Ubuntu 14.04 and I need to test the CPU/memory usage of different types of queries.
I'm wondering whether there is a script I can write or a method I can use. I have tried using iotop, but it doesn't seem to be useful. Some guidance would be appreciated.

Seeking examples of scripts/syntax for testing MongoDB with YCSB

I'm testing the performance of MongoDB on a single system using YCSB. I'd like to get a sense of the performance using SSDs compared to spinning disks.
I have CentOS, MongoDB, and YCSB installed. I have stumbled around a bit with basic examples, but have yet to see a step-by-step walkthrough that goes from this setup through loading, running, and reviewing the results. I keep seeing bits and pieces, but not enough to get me up and running.
If anyone could please provide a command line for these steps, it would be most appreciated!
Thanks
Here's a guide on how to run the Yahoo! Cloud Serving Benchmark (YCSB) against MongoDB:
https://github.com/samanca/YCSB/tree/master/mongodb
https://github.com/brianfrankcooper/YCSB/wiki
A working example using Python and Java to test MongoDB:
https://github.com/richcar58/MongoDBTools/blob/master/RunYcsb/runycsb/fabfile.py
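Since the question asks for command lines, here is a small sketch that assembles the two usual YCSB phases (load, then run) as argument lists. It assumes it is run from the root of a YCSB checkout that includes the MongoDB binding. The connection URL, thread count, and record/operation counts are illustrative values, not recommendations.

```python
import shlex


def ycsb_command(phase, workload, mongo_url, threads=1, props=None):
    """Return the argv list for one YCSB phase against the mongodb binding.

    phase is "load" (insert the data set) or "run" (execute the workload).
    """
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    argv = [
        "./bin/ycsb", phase, "mongodb",
        "-s",                      # print periodic status to stderr
        "-P", workload,            # workload property file
        "-threads", str(threads),  # number of client threads
        "-p", f"mongodb.url={mongo_url}",
    ]
    # Extra properties such as recordcount or operationcount.
    for key, value in sorted((props or {}).items()):
        argv += ["-p", f"{key}={value}"]
    return argv


# Hypothetical local instance; point this at the mongod under test.
URL = "mongodb://localhost:27017/ycsb"
LOAD = ycsb_command("load", "workloads/workloada", URL, threads=4,
                    props={"recordcount": 100000})
RUN = ycsb_command("run", "workloads/workloada", URL, threads=4,
                   props={"operationcount": 100000})

if __name__ == "__main__":
    # Print shell-ready lines; redirect each command's stdout to a file to review results.
    print(shlex.join(LOAD))
    print(shlex.join(RUN))
```

To compare SSDs with spinning disks, one approach is to repeat the same load and run pair against a mongod whose data directory sits on each kind of disk, then compare the throughput and latency figures YCSB prints at the end of each run.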

Run MongoDB in Azure

How do I run MongoDB on Windows Azure? Should the instance be deployed on a virtual machine? Are there any out-of-the-box solutions, such as images for virtual machines? How do I run replica sets on Windows Azure?
I saw this article http://docs.mongodb.org/ecosystem/platforms/windows-azure/ but I feel like it is already out of date. Is it?
Any best practices, help or info would be appreciated!
The article that you refer to describes the options quite well. You have three options:
1. Run MongoDB in worker roles (as linked to in the article). Before Azure VMs, worker roles were the only option, but I wouldn't recommend this.
2. Try one of the MongoDB database-as-a-service offerings available in the add-ons store. This would be a good way to try it out. For the longer term, you will have to ask around for people's experience.
3. Run MongoDB on a Linux VM, which is what I recommend. That way you have full control and support from the Linux/MongoDB community. Replica sets would then work out of the box. The article links to a walkthrough on a CentOS image. You can also get a pre-built image from VMDepot, such as an Ubuntu one. The VMDepot images seem to work very well and are a good start for people with less Linux experience.
Edit: MongoLab seems to be gaining traction, and is getting support from Scott Guthrie. As a service that has affinity with Azure datacentres, it is worth evaluating.
You can use MongoLab; there is a tutorial for it on Azure.
With MongoLab, all the maintenance (at least of the DB engine itself) is taken care of by the MongoLab team, which removes a lot of maintenance overhead on your side.

MongoDB on Azure Cloud

Is MongoDB for Azure production-ready?
Can anyone share some experience with it?
It looks like the comfort level for using it in production is missing.
What do you think?
Edit: Since my question was misunderstood, I will try to restate it.
What I'm looking for from the community is experience from someone who is actually running MongoDB on Windows Azure.
By experience I don't mean how to run it in the cloud (we already have the manual in 10gen's FAQ) or how many bugs it has (we can see that in the mongo-azure JIRA).
What I am looking for is: how is the performance?
Are there any problems (side effects) from running MongoDB on Azure?
How does MongoDB handle VM recycling?
Has anyone tried sharding?
In the end, is 10gen's mongo-azure worker role stable enough to use in production?
I hope this clears things up.
A bit of clarification here. MongoDB itself is production-ready. And MongoDB works just fine in Windows Azure, as long as you set up the scaffolding to get it to work in the environment. This typically entails setting up an Azure Drive, to give you durable storage. Alternatively, using a replica set, you effectively have eventual consistency across the set members. Then, you could consider going with a standalone (or standalone with hot standby). Personally, I prefer a replica set model, and that's typical guidance for production MongoDB systems.
As far as 10gen's support for Windows Azure: While the page @SyntaxC4 points to does clarify the wrapper is in a preview state, note that the wrapper is the scaffolding code that launches MongoDB. This scaffolding was initially released in December 2011, and has had a few tweaks since then. It uses the production MongoDB bits (and works just fine with version 2.0.5, which was published on May 9). One caveat is that the MongoDB replica set roles are deployed alongside your application's roles, since the client app needs visibility to all replica set nodes (to properly build the set). To avoid this limitation, you'd need to run mongos as the entry point (and that's not part of 10gen's scaffolding solution).
Forgetting the preview scaffolding a moment: I have customers running MongoDB in production, with custom scaffolding. One of them is running a rather large deployment, with multiple shards, using a replica set per shard.
So... does it work in Windows Azure? Yes. Should you take advantage of 10gen's supplied scaffolding? If you're just looking for a simple way to launch a replica set, I think it's fine. If you want a standalone model, or a shard model, or if you need a separate deployment for MongoDB, you'd currently need to do this on your own (or modify the project 10gen published).
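To illustrate the point about the client needing visibility to every replica set node: a replica set connection string lists all members plus the set name. The sketch below only builds such a string; the host names, database name, and the set name `rs0` are hypothetical.

```python
def replica_set_uri(hosts, database, replica_set):
    """Build a standard MongoDB connection string for a replica set.

    hosts is a list of "host:port" strings, one per replica set member.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # Members are comma-separated; the replicaSet option names the set.
    return f"mongodb://{','.join(hosts)}/{database}?replicaSet={replica_set}"


# Hypothetical three-member set.
URI = replica_set_uri(
    ["mongo-0:27017", "mongo-1:27017", "mongo-2:27017"], "appdb", "rs0"
)
```

Each listed host has to be reachable from the client, which is why the scaffolding deploys the MongoDB roles alongside the application's roles.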
MongoLab is now offering MongoDB as a service on Azure (see the MongoLab blog).
A free demo account with 0.5 GB of storage is available in the Windows Azure Store.
The warning message on their site says that it's a preview. This would mean that there would be no support for it at a product level in Windows Azure.
If you want to form your own opinion on a comfort level, you can take a look at their bug tracking system and get a feeling for what people are currently reporting as issues.