Specifying storage for mongoDB server in fresh install on Azure

Specifying storage for mongoDB server in fresh install on Azure - mongodb

I'm planning to install MongoDB server 4.4 on a Ubuntu Linux vm on Azure. In the marketplace I see images already available. Why are these chargeable hourly? It's the community edition which is free, so I cannot understand the charge
Secondly, I plan on adding a 10TB data disk for storing the ever increasing data. But if I use preinstalled image then I suppose mongoDB server will get installed on the OS disk(128GB) whereas I would prefer it used the data disk(10TB).
Does it make sense to then use the preinstalled image? I found a mongodb setting at https://docs.mongodb.com/manual/reference/configuration-options/#mongodb-setting-storage.dbPath to specify storage path. Both the OS disk and data disk will be premium SSD. Will it make a difference if the software is installed on OS disk and collections/databases get stored in data disk?

The reason why you see pricing is because it's neither added by Canonical nor MongoDB publishers. The golden image you shared is being offered by Cloud Infrastructure Services. They not only offer MongoDB CE images but also packer, docker compose etc., built on top of Opensource OS.
As you say both OS & Data disks will be provisioned using Premium SSDs. I wouldn't worry much about performance.
To cut down costs, I would propose below things.
Create MongoDB image yourself by picking free version of Ubuntu/Centos or whatever. You can use packer or Azure Image Builder. I would setup automation to periodically build images with different versions of Mongo & Linux flavor & version them
Provision VM, install MongoDB app on top of the VM & regularly patch, upgrade both linux & application packages
If you have MongoDB server on-prem, you could create a snapshot of OS with MongoDB application packaged in it, convert it into vhd file & export the image onto Azure. You can follow documentation here
If MongoDB offers golden images as private offers & you have existing license contract with them. Check out with MongoDB, register your subscription & you may be eligible to access their golden images on Azure appearing as "Private Offers". RedHat offers it's images to customers this way.
The image being offered on Azure doesn't make much financial sense. Let's assume you have 3 node cluster.
Pricing for using image is :: 0.025 euros per hour
per day it would be :: 24*0.025 = 0.6 euros
yearly cost could be :: 365*0.6 = 219 euros
combined cost for 3 nodes :: 219*3 = 657 euros
If you have multiple environments, multiple databases in different regions, cost just gets multiplied on & on.
Now, the choice is yours!

Related

MongoDB Performance in Docker

I did an experiment by running a python app that is writing 2000 records into mongoDB.
The details of my setup of the experiment as follows:
Test 1: Local PC - Python App running on Local PC with mongoDB on Local PC (baseline)
Test 2: Docker - Python App on Linux Container with mongoDB on Linux Container with persist volume
Test 3: Docker - Python App on Linux Container with mongoDB on Linux Container without persist volume
I’ve generated the result in chart - on average writing data on local PC is about 30 secs. Where else on Docker, it takes about 80plus secs. Hence it seems like writing on Docker is almost 3 times slower than writing on local PC itself.
Should I want to improve the write speed or performance of the mongoDB in docker container, what is the recommended practice? Or should I put the mongoDB as a external volume without docker?
Thank you!
graph

Your system is not consistent in many ways - dynamic storage and CPU performance, other processes, dynamic system settings etc. There are a LOT of underlying things under storage only.
60 sec tests are not enough for anything
Simple operations are not good enough for baseline comparisons
There is ZERO performance impact with storage and CPU in case of containers, there is an impact in networking, but i assume, this is not applicable here
Databases and database management systems must be optimized in special ways, there is no "install and run" approach. We, sysadmins/db admins usually need days to have it running smoothly. Also, performance changes over time.

After couple of weeks of testing and troubleshooting. I finally got the answer and I shall share my findings with the rest of the DevOps or anyone who facing the same issue as me
Correct this statement if needed, Docker Container was started off with Linux, Microsoft join the container bandwagon late and in order to for the container works (with Linux), the DevOps team need to install Linux WSL2 in Windows. And that cost extra overheads which resultant in the process speed.
So to improve the performance speed with containers, the setup should be in Linux OS instead of Windows OS. (and yes the speed reduce drastically)

Setting up backup strategy for backing up postgresql database on cloud foundry

We have setup a community postgresql service on Cloud Foundry (IBM Blumix). This is a free service and no automated backup and recovery is supported out of the box.
Is there a way to set up a standby server or a regular backup in case there is any data corruption/failure?
IBM compose and ElephantSQL can provide this service at a cost, butwe are not ready for it yet.

PostgreSQL is an experimental service and there is not a dashboard and other advanced features (Daily backup for example) that you can find in other services that you mentioned. If you want to do a backup you could write an ad-hoc script that 'saves'\exports all tables as you want and run it every day.
If you need PostegreSQL you can create a PostegreSQL by compose service $17.50 / mo for the first GB and $12 for Extra GB )

We used Postgresql Studio and deployed it on IBM Bluemix. The database service was connected to the pgstudio interface (This restricts the access to only connected databases). We also had to make minor changes to pgstudio so that we could use pg_dump with the interface.
The result: We could manually dump the data. This solution works well as we could take regular dumps (though manually).

In the free tier you are right in saying that you cant get the backup. Those features are available only in Compose for PostgresSQL service - but that's a paid service.

Tool to create mongodb sharded cluster

I need a tool to manage a cluster of mongodbs. With an increasing number of machines, it is hard to maintain each machine without a tool.
More details:
The database grows around 50 MB per day, so they are approximately 1.5 GB per month. The mongodb is great for this because just increase a machine in your cluster resolve the size problem. The problem is that this change requires entering the host configuration and make the changes manually. I'd like to optimize the time of the team with a tool that allows remote execution, for example, run and save scripts or similar.

You can use Juju to create mongodb cluster :
https://github.com/charms/mongodb
http://www.youtube.com/playlist?list=PLyZVZdGTRf8g-5E7ppGGpxrraStyi493V
http://www.jorgecastro.org/2014/03/17/introducing-juju-bundles/

You need a provisioning tool, like Vagrant. Or SSH wrapper like Fabric.

MongoDB Cloud Manager has a Public API that you can use for cluster deployment automation.
Here's the link to the (very well-written) official docs.

Does azure support things like mongodb and redis?

Can you use mongodb and redis/memcached with azure?
I'm guessing no but just want to make sure.
It turns out they do support things other than .net, are they using linux servers then?

You can very easily run mongodb in Windows Azure. I presented this at MongoSV - video here.
EDIT: In December 2011, 10gen published their official MongoDB+Azure code on github. This contains a project for replica-sets, as well as a demo ASP.NET MVC application (taken from the Windows Azure Platform Training Kit) that uses a replica set for its storage.
Standalone servers are straightforward, except you have to deal with scale-out: you can't have multiple instances of a standalone server simultaneously, so you'll need to plan for this: take all but one out of the load balancer, or only launch mongod if you can acquire the Cloud Drive lock.
Replicasets are doable, as I demonstrated at MongoSV. However, I didn't cover the intricacies of graceful shutdown of a replicaset to ensure zero data loss.
You can run memcached as well - see David Aiken's post about this. Note: Now that the AppFabric Cache service is live, you should look into the pros/cons of using that over memcached. Cost-wise, AppFabric Cache should run much less, as you don't have to pay for role instances to host your cache. More info about AppFabric Cache here.

You now also have the option of running Redis in Windows Azure on Linux virtual machines ! In the case of Redis, this would allow you to use the "official" build instead of the "unsupported" Windows build ... For MongoDB, both choices seem equally valid (running on Linux virtual machines, "plain" Windows virtual machines, or using 10gen's package to run on "managed" VMs (Cloud Services).

FYI, there's now a Redis installer for Windows Azure available from MS Open Tech (my team). Here's a tutorial on how to use it: http://ossonazure.interoperabilitybridges.com/articles/how-to-deploy-redis-to-windows-azure-using-the-command-line-tool

[UPDATE] Azure now supports MongoDB and Redis.
http://azure.microsoft.com/blog/2014/04/22/announcing-new-mongodb-instances-on-microsoft-azure/
http://azure.microsoft.com/en-us/services/cache/

In the Azure Store you can now select Redis Cloud as an add-on.
Heres the Azure store description:
"Redis Cloud is a fully-managed cloud service for hosting and running Redis in a highly-available and scalable manner, with predictable and stable top performance. Tell us how much memory you need and get started instantly with your new Redis database."
PUBLISHED DATE 3/31/2014
You can access the store by selecting the "New" button in the Azure portal then "Store". I have yet to use it but it looks promising.

Azure now has a first-party Redis service, currently in preview:
http://azure.microsoft.com/en-us/documentation/articles/cache-dotnet-how-to-use-azure-redis-cache/

Running JIRA on a VM

Anyone have any success or failure running Jira on a VM?
I am setting up a new source control and defect tracking server. My server room is near full and my services group suggested a VM. I saw that a bunch of people are running SVN on VM (including NCSA). The VM would also free me from hardware problems and give me high availability. Finally, it frees me from some red tape and it can be implemented faster.
So, does anyone know of any reason why I shouldn't put Jira on a VM?
Thanks

We just did the research for this, this is what we found:
If you are planning to have a small number of projects (10-20) with 1,000 to 5,000 issues in total and about 100-200 users, a recent server (2.8+GHz CPU) with 256-512MB of available RAM should cater for your needs.
If you are planning for a greater number of issues and users, adding more memory will help. We have reports that allocating 1GB of RAM to JIRA is sufficient for 100,000 issues.
For reference, Atlassian's JIRA site (http://jira.atlassian.com/) has over 33,000 issues and over 30,000 user accounts. The system runs on a 64bit Quad processor. The server has 4 GB of memory with 1 GB dedicated to JIRA.
For our installation (<10000 issues, <20 concurrent sessions at a time) we use very little server resources (<1GB Ram, running on a quad-core processor we typically use <5% with <30% peak), and VM didn't impact performance in any measurable ammount.

I don't see why you shouldn't run jira off a vm - but jira needs a good amount of resources, and if your vm resides on a heavily loaded machine, it may exhibit poor performance. Why not log a support request (support.atlassian.com) and ask?

We run Jira on a virtual machine - VMWare running Windows Server 2003 SE and storing data on our SQL Server 2000 server. No problems, works well.

My company moved our JIRA instance from a hosted physical server to an Amazon EC2 instance recently, and everything is holding up pretty well. We're using an m1.large instance (64-bit o/s with 4 virtual cores and 8GB RAM), but that's way more than we need just for JIRA; we're also hosting Confluence and our corporate Web site on the same EC2 instance.
Note that we are a relatively small outfit; our JIRA instance has 25 users (with maybe 15 of them active) and about 1000 JIRA issues so far.

We run our JIRA (and other Atlassian apps) instance on Linux-based VM instances. Everything run very nicely.

Disk access speed with JIRA on VM...
http://confluence.atlassian.com/display/JIRA/Testing+Disk+Access+Speed
I'm wondering if the person who is using JIRA with VM (Chris Latta) is running ESX underneath - that may be faster than a windows host.

I have managed to run Jira, Bamboo, and FishEye from a set of virtual machines all hosted from the same server. Although I would not recommend this setup for production in most shops. Jira has fairly low requirements by today's standards. Just be sure you can allow enough resources from your host machine things should run fine.

If, by VM, you mean a virtual instance of an OS, such as an instance of linux running on Xen, VMWare, or even Amazon EC2, then Jira will run just fine. The only time you need to worry about virtual systems is if you're doing something that depends on hardware, such as running graphical 3D apps, or say something that uses a fax modem or a Digium telephony card with Asterisk.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse