Easy way to create services on HDP cluster - service

We have a company with a HDP 2.6.4 cluster and an outsource / offshore team that handles different ops tasks. Due to quite strict data access policies, that cannot have access to quite a few datasets. However, we do need them to be able to 24/7 monitor and (ideally) execute different jobs.
So I'm in position as someone in Big data team to enable them to do so, but without access to data. Not being sure what would HDP 2.6 have to offer, I do know that there are certain tools that enable devs to develop all kind of API endpoints, which could then be mapped to different ETL jobs / shell scripts etc.
Would this be optimal approach from an architectural standpoint?
I was thinking of getting us something like Dreamfactory, but opensource and something I can run on premise. Any ideas?
Cheers!

DreamFactory can generate REST APIs for a multitude of databases, among them MySQL, Microsoft SQL Server, Oracle, PostgreSQL, and MongoDB. Along with the API, DreamFactory will also auto-generate an extensive set of interactive Swagger documentation for your API. Are you looking to connect databases?

Related

Deploying Strapi on Kubernetes (GKE)

I want to deploy Strapi on GKE (Kubernetes), I have a docker-compose file, and I think I can use kcompose to create the deployment.
My questions is, has anyone used Mongodb Atlas + GKE or should I deploy Mongo on my own?
The question is more opinion based. It all depends on your needs.
If your needs match one of below you should stay with MongoDB:
Your app runs on-prem and contracts or privacy statements dont allow you to store data with a 3rd party.
You need large storage but not much query power.
There is other privacy/compliance issues.
Your app does not have internet access (firewalls, isolated environments)
You are running 3rd party applications that require a very old version of MongoDB
Here are some MongoDB Altas advantages:
Easily deploy, modify, and elastically scale their database clusters with a few clicks or an API call
Gain complete visibility into the performance the database and the underlying instances
Focus more on development, with built-in operational and security best practices such as geographically distributed, auto-healing clusters, and always-on authentication and encryption.
The best way would be if you will check how work with MongoDB Atlas on GCP looks alike. You can check this tutorial.

Is there any free WebUI tools for administrating and monitoring MongoDb server instance similar to Atlas Cloud?

I am searching for tool, which should provide collections overview, queries to them, replicas configurations, instance performance dashboards.
You can look for zabbix as elementary monitoring. However for querying the DB for data you need to use custom shell scripts or metric reporting tools to get that information

MongoDB on Azure worker role

I m developing an application using SignalR to manage websockets and allow my clients to dialog between each other.
I m planning to host this back-office on an Azure worker role. As my SignalR requests carry data that is most of the time saved in the database, I m wondering if NoSQL's MongoDB instead of the classic SQL Server/Entity Framework couple should be a good approach.
Assuming that my application's data types will be strings for most of them, I think MongoDB will be a reliable and a performant solution, and it will allow me to get rid of Azure's SQL's database costs.
For information, the Azure worker role will be running on a machine with the following hardware: 1 core CPU, 3.5GB RAM and 50GB SSD storage.
Do you think I m on a good start with this architecture ?
Thanks
Do you think I m on a good start with this architecture?
In a word, no.
A user asked a similar question regarding running Redis on Worker Roles - Setting up Redis on Azure cloud service worker role - all of the content on that Q/A is relevant in the MongoDb context.
I'd suggest that you read my answer as it goes into more detail, but as an overview of why this is a bad architectural approach:
You cannot guarantee when a Worker Role will be restarted by the Azure Service Fabric.
In a real-world implementation of Mongo, you would run multiple nodes within a cluster, with a single Worker Role (as you have suggested in your question) this won't be possible.
You will need to manage your MongoDb installation within the Worker Role and they simply aren't designed for this.
If you are really fixed on using Mongo, I would suggest that you use a hosted solution such as MongoLabs (as suggested in earlier answers), or consider hosting it on Azure IaaS VM's.
If you are not fixed on using Mongo, I would sincerely suggest that you look at Azure DocumentDb (also suggested above), Microsoft's Azure NoSQL offering - I have used it in several production systems already and it is certainly a capable NoSQL solution; granted, it may not have all of the features available with MongoDb.
If you are looking at a NoSQL solution for caching of data (i.e. not long term storage), I would suggest you take a look at Azure Redis Cache, which is a very capable Redis offering.
Azure has its own native NoSQL Document database called DocumentDB, have you had a look at it? If I were you I would use DocumentDB unless there are some special requirements that you have that you have not mentioned, but from what little requirement info that you have posted DocumentDB would do just fine. I don't think that it is quite similar to MongoDB in terms of the basic functionality, see this article for a comparison between Azure DocumentDB and MongoDB.

MongoDB on Azure Cloud

Is MongoDB for Azure production ready ?
Can anyone share some experience with it ?
Looks like comfort is missing for using it for prod.
What do you think ?
Edit: Since there is a misunderstanding in my question i will try to redefine it.
The information i look into from the community is sharing an info of someone who is running mongo on windows azure to share experience from it.
What i mean by experience is not how to run it in the cloud(we already have the manual on 10gens faq) nor how many bugs it have(we can see that in mongo-azure jira).
What i am looking for is that how it is going with performance ?
Are there any problems(side effects) from running mongodb on azure ?
How does mongodb handle VM recycling ?
Does anyone tried sharding ?
In the end, is the mongo-azure worker role from 10gens stable for using it in production ?
Hope this clears out.
A bit of clarification here. MongoDB itself is production-ready. And MongoDB works just fine in Windows Azure, as long as you set up the scaffolding to get it to work in the environment. This typically entails setting up an Azure Drive, to give you durable storage. Alternatively, using a replicaset, you effectively have eventual consistency across the set members. Then, you could consider going with a standalone (or standalone with hot standby). Personally, I prefer a replicaset model, and that's typical guidance for production MongoDB systems.
As far as 10gen's support for Windows Azure: While the page #SyntaxC4 points to does clarify the wrapper is in a preview state, note that the wrapper is the scaffolding code that launches MongoDB. This scaffolding was initially released in December 2011, and has had a few tweaks since then. It uses the production MongoDB bits (and works just fine with version 2.0.5 which was published on May 9). One caveat is that the MongoDB replicaset roles are deployed alongside your application's roles, since the client app needs visibility to all replica set nodes (to properly build the set). To avoid this limitation, you'd need to run mongos and the entry point (and that's not part of 10gen's scaffolding solution).
Forgetting the preview scaffolding a moment: I have customers running MongoDB in production, with custom scaffolding. One of them is running a rather large deployment, with multiple shards, using a replicaset per shard.
So... does it work in Windows Azure? Yes. Should you take advantage of 10gen's supplied scaffolding? If you're just looking for a simple way to launch a replicaset, I think it's fine. If you want a standalone model, or a shard model, or if you need a separate deployment for MongoDB, you'd currently need to do this on your own (or modify the project 10gen published).
MongoLab is now offering Mongo as a service on Azure MongoLab Blog
Free Demo account is 0.5 GB storage are available in the Windows Azure Store
The warning message on their site says that it's a preview. This would mean that there would be no support for it at a product level in Windows Azure.
If you want to form your own opinion on a comfort level, you can take a look at their bug tracking system and get a feeling for what people are currently reporting as issues.

Does azure support things like mongodb and redis?

Can you use mongodb and redis/memcached with azure?
I'm guessing no but just want to make sure.
It turns out they do support things other than .net, are they using linux servers then?
You can very easily run mongodb in Windows Azure. I presented this at MongoSV - video here.
EDIT: In December 2011, 10gen published their official MongoDB+Azure code on github. This contains a project for replica-sets, as well as a demo ASP.NET MVC application (taken from the Windows Azure Platform Training Kit) that uses a replica set for its storage.
Standalone servers are straightforward, except you have to deal with scale-out: you can't have multiple instances of a standalone server simultaneously, so you'll need to plan for this: take all but one out of the load balancer, or only launch mongod if you can acquire the Cloud Drive lock.
Replicasets are doable, as I demonstrated at MongoSV. However, I didn't cover the intricacies of graceful shutdown of a replicaset to ensure zero data loss.
You can run memcached as well - see David Aiken's post about this. Note: Now that the AppFabric Cache service is live, you should look into the pros/cons of using that over memcached. Cost-wise, AppFabric Cache should run much less, as you don't have to pay for role instances to host your cache. More info about AppFabric Cache here.
You now also have the option of running Redis in Windows Azure on Linux virtual machines ! In the case of Redis, this would allow you to use the "official" build instead of the "unsupported" Windows build ... For MongoDB, both choices seem equally valid (running on Linux virtual machines, "plain" Windows virtual machines, or using 10gen's package to run on "managed" VMs (Cloud Services).
FYI, there's now a Redis installer for Windows Azure available from MS Open Tech (my team). Here's a tutorial on how to use it: http://ossonazure.interoperabilitybridges.com/articles/how-to-deploy-redis-to-windows-azure-using-the-command-line-tool
[UPDATE] Azure now supports MongoDB and Redis.
http://azure.microsoft.com/blog/2014/04/22/announcing-new-mongodb-instances-on-microsoft-azure/
http://azure.microsoft.com/en-us/services/cache/
In the Azure Store you can now select Redis Cloud as an add-on.
Heres the Azure store description:
"Redis Cloud is a fully-managed cloud service for hosting and running Redis in a highly-available and scalable manner, with predictable and stable top performance. Tell us how much memory you need and get started instantly with your new Redis database."
PUBLISHED DATE 3/31/2014
You can access the store by selecting the "New" button in the Azure portal then "Store". I have yet to use it but it looks promising.
Azure now has a first-party Redis service, currently in preview:
http://azure.microsoft.com/en-us/documentation/articles/cache-dotnet-how-to-use-azure-redis-cache/