Glue Python Shell - Private Subnet Access - amazon-redshift

I have a Redshift cluster in my private subnet.
I am trying to write an UNLOAD job using Glue Python Shell, but I am not able to connect to my cluster since it resides in the private subnet.
I tried adding a JDBC and a Redshift connection, and I am still unsuccessful.
I went through this article, and unfortunately I still cannot understand the workflow.
How do I connect Glue Python Shell to a Redshift cluster in a private subnet?
It would be great if someone could help me understand this workflow.

I did the following steps to connect my Glue Python Shell job to the Redshift cluster in the private subnet.
Define the JDBC Connection
● Go to Glue Console
● Under Connections Add a new JDBC Connection
● Provide the necessary details for your Redshift endpoint like
-> JDBC URL : jdbc:redshift://host:port/database
-> Username and Password
● For VPC ID, choose the VPC of the Redshift cluster itself
● For Subnet ID, also choose the same subnet as the Redshift cluster
● Security Group: choose the same security group used for the Redshift cluster
● Once done, save this connection
Change the Security Group: Navigate to the Redshift security group that we selected in the first step and make the following changes.
● Copy the Security Group ID
● Edit the Security Group
● Under Inbound Rules, choose ALL TCP and in Source paste the Security Group ID (basically we are self-referencing the security group for ALL TCP)
● Save the Security Group
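The same self-referencing rule can also be added with the AWS SDK. A minimal sketch using boto3 (the security group ID is a placeholder, and the live API call sits in its own function so nothing touches your account unless you invoke it):

```python
def self_referencing_all_tcp(sg_id: str) -> list:
    """Build an inbound rule allowing ALL TCP from the group itself."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": 0,
        "ToPort": 65535,
        "UserIdGroupPairs": [{"GroupId": sg_id}],
    }]

def apply_rule(sg_id: str) -> None:
    import boto3  # assumed available where this runs
    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=sg_id,
        IpPermissions=self_referencing_all_tcp(sg_id),
    )

# apply_rule("sg-0123456789abcdef0")  # placeholder group ID
```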
Navigate to the Glue console again and, under Connections, choose the connection defined in Step 1 and test it; this option is available in the console itself.
If the configuration is fine, you will see a success message.
Now just go to your job and, under Connections, choose the connection defined above, and you can access the cluster.
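With the connection attached, the job can then issue the UNLOAD over a regular database connection. A minimal sketch, assuming the pg8000 driver (verify it is available in your job environment); the endpoint, credentials, S3 path, and IAM role below are placeholders, not values from the original post:

```python
def build_unload(query: str, s3_path: str, iam_role: str) -> str:
    """Build an UNLOAD statement from a SELECT query."""
    escaped = query.replace("'", "''")  # quotes inside the SELECT must be doubled
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        "ALLOWOVERWRITE"
    )

def run_unload() -> None:
    import pg8000  # one driver option for a Python Shell job
    conn = pg8000.connect(
        host="my-cluster.xxxxxx.us-east-1.redshift.amazonaws.com",  # placeholder
        port=5439, database="mydb", user="admin", password="...",
    )
    try:
        cur = conn.cursor()
        cur.execute(build_unload(
            "SELECT * FROM sales",
            "s3://my-bucket/unload/sales_",                          # placeholder
            "arn:aws:iam::123456789012:role/RedshiftUnloadRole",     # placeholder
        ))
        conn.commit()
    finally:
        conn.close()
```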
References :
How can I access aws resources in VPC from AWS glue?
https://docs.aws.amazon.com/glue/latest/dg/setup-vpc-for-glue-access.html
https://docs.aws.amazon.com/glue/latest/dg/connection-JDBC-VPC.html
https://aws.amazon.com/blogs/big-data/how-to-access-and-analyze-on-premises-data-stores-using-aws-glue/
https://docs.aws.amazon.com/glue/latest/dg/how-it-works.html
Hope it helps!

Related

postgres redirect queries to standby?

I am trying to create a connection pooling system with load balancing. From what I understand, PgBouncer doesn't have a load balancing option; all I can do is create a file with all the users and passwords and configure the databases/clusters. But with this option I cannot direct connections to a specific cluster. To explain: inserts will go to the primary and selects will go to the standby. What is possible is to let user "user1" connect to the cluster on port 5432 to DB "database123".
How can I redirect queries to standby with other tools?
I tried to do this with pgpool, but for some reason the standby always stays in "waiting" status --> Cannot configure pgpool with master and slave nodes
It is impossible to tell from an SQL statement if it will modify data or not. What about SELECT delete_my_data();?
So all tools that try to figure that out by looking at the SQL statement are potentially problematic.
The best you can do is to write your application so that it uses two data sources: one for reading and one for writing, and you determine what goes where.
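A minimal sketch of that two-data-source approach (hostnames, database, and user below are hypothetical, and psycopg2 is just one driver choice):

```python
# The application, not a proxy, decides where each statement goes, which
# sidesteps the SELECT delete_my_data() problem entirely.
PRIMARY_DSN = "host=primary.example.com port=5432 dbname=database123 user=user1"
STANDBY_DSN = "host=standby.example.com port=5432 dbname=database123 user=user1"

def pick_dsn(for_write: bool) -> str:
    """Writes (and anything that might write, e.g. functions with side
    effects) go to the primary; known read-only work goes to the standby."""
    return PRIMARY_DSN if for_write else STANDBY_DSN

# In real code you would then open the connection with your driver, e.g.:
#   import psycopg2
#   conn = psycopg2.connect(pick_dsn(for_write=True))
```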

mongodb mms monitoring agent does not find group members

I have installed the latest MongoDB MMS agent (6.5.0.456) on Ubuntu 16.04 and initialised the replica set. Hence I am running a single-node replica set with the monitoring agent enabled. The agent works fine; however, it does not seem to actually find the replica set member:
[2018/05/26 18:30:30.222] [agent.info] [components/agent.go:Iterate:170] Received new configuration: Primary agent, Assigned 0 out of 0 plus 0 chunk monitor(s)
[2018/05/26 18:30:30.222] [agent.info] [components/agent.go:Iterate:182] Nothing to do. Either the server detected the possibility of another monitoring agent running, or no Hosts are configured on the Group.
[2018/05/26 18:30:30.222] [agent.info] [components/agent.go:Run:199] Done. Sleeping for 55s...
[2018/05/26 18:30:30.222] [discovery.monitor.info] [components/discovery.go:discover:746] Performing discovery with 0 hosts
[2018/05/26 18:30:30.222] [discovery.monitor.info] [components/discovery.go:discover:803] Received discovery responses from 0/0 requests after 891ns
I can see two processes for monitor agents:
/bin/sh -c /usr/bin/mongodb-mms-monitoring-agent -conf /etc/mongodb-mms/monitoring-agent.config >> /var/log/mongodb-mms/monitoring-agent.log 2>&1
/usr/bin/mongodb-mms-monitoring-agent -conf /etc/mongodb-mms/monitoring-agent.config
However if I terminate one, it also tears down the other, so I do not think that is the problem.
So, the question is: what is the Group that the agent is referring to? Where is that configured? Or how do I find out which Group the agent refers to, and how do I check whether the group is configured correctly?
The rs.config() output looks fine, with one replica set member whose host field looks just fine; I can use that value to connect to the instance with the mongo command. No auth is configured.
EDIT
It kind of looks like Cloud Manager now needs to be configured with the seed host; then it starts to discover all the other nodes in the replica set. This seems to be different from pre-Cloud-Manager days, where the agent was able to track the rs, if I remember correctly... Probably there is still an easier way to get this done, so I am leaving this question open for now...
So, the question is: what is the Group that the agent is referring to? Where is that configured? Or how do I find out which Group the agent refers to, and how do I check whether the group is configured correctly?
Configuration values for the Cloud Manager agent (such as mmsGroupId and mmsApiKey) are set in the config file, which is /etc/mongodb-mms/monitoring-agent.config by default. The agent needs this information in order to communicate with the Cloud Manager servers.
For more details, see Install or Update the Monitoring Agent and Monitoring Agent Configuration in the Cloud Manager documentation.
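For reference, a minimal sketch of what /etc/mongodb-mms/monitoring-agent.config typically contains (all values below are placeholders):

```
mmsGroupId=<your Cloud Manager group/project ID>
mmsApiKey=<agent API key for that group>
mmsBaseUrl=https://cloud.mongodb.com
```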
It kind of looks that the cloud manager now needs to be configured with the seed host. Then it starts to discover all the other nodes in the replicaset.
Unless a MongoDB process is already managed by Cloud Manager automation, I believe it has always been the case that you need to add an existing MongoDB process to monitoring to start the process of initial topology discovery. Once a deployment is monitored, any changes in deployment membership should automatically be discovered by the Cloud Manager agent.
Production deployments should have authentication and access control enabled, so in addition to adding a seed hostname and port via the Cloud Manager UI, you usually need to provide appropriate credentials.

Cannot connect to Cloud SQL read replica that is replicating from external master

Following the documentation to configure external master replication: https://cloud.google.com/sql/docs/mysql/replication/configure-external-master
I have created a First Generation read replica that is replicating from an external master, but I cannot connect to the Cloud SQL read replica. The documentation states you need to create a user account on the read replica; attempting to do this gives you Operation not allowed for a read replica. I also see a root and an (anonymous) user already, but I cannot change their passwords; I get the same error message, Operation not allowed for a read replica.
See this screenshot:
I was able to connect to the Cloud SQL replica by using the 'root' user with no password. The documentation suggests you can add users to a replica, but if you try, the server gives you an error. You can add a password to your root user, though.
You should follow the docs regarding External Master configuration. Especially in:
Before you begin
...
You must have the external IP address and port of the external master instance, and the username and password information for the replication user on the master instance.
...
Also in Requirements and Tips for Configuring Replication:
The MySQL settings of the master instance are propagated to the replica, including root password and changes to the user table.
...
To summarise: the user and password must be set on the replication master, and you use those to connect to the read replica.

AWS + Elastic Beanstalk + MongoDB

I am trying to set up my microservices architecture using AWS Elastic Beanstalk and Docker. That is very easy to do, but when I launch the environment, it launches into the default VPC, thus giving public IPs to the instances. Right now, that's not too much of a concern.
What I am having a problem with is how to set up the MongoDB architecture. I have read: recommended way to install mongodb on elastic beanstalk but still remain unsure on how to set this up.
So far I have tried:
Using the CloudFormation template from AWS here: http://docs.aws.amazon.com/quickstart/latest/mongodb/step2b.html to launch a primary with a two-replica-node setup into the default VPC, but this assigns public access to the Mongo nodes. I am also not sure how to connect my application, since this does not add a NAT instance - do I simply connect directly to the primary node? In case this node fails, will the secondary node's IP become the same as that of the primary node so that all connections remain consistent? Or do I need to add my own NAT instance?
I have also tried launching MongoDB into its own VPC (https://docs.aws.amazon.com/quickstart/latest/mongodb/step2a.html) and giving access via the NAT, but this means having two different VPCs (one for my EB instances and one for the MongoDB). In this case would I connect to the NAT from my EB VPC in order to route requests to the databases?
I have also tried launching a new VPC for the MongoDB architecture first and then trying to launch EB into this VPC. For some reason, the load balancing setup won't let me add into the subnets, giving me the error: "Custom Availability Zones option not supported for VPC environments".
I am trying to launch all this in us-west-1. It's been two days now and I have no idea where to go or what the right way is to tackle this issue. I want the databases to be private (no public access) with a NAT gateway, so ideally my third method seems what I want, but I cannot seem to add the new EB instances/load balancer into the newly-created MongoDB VPC. This is the setup I'm going for: http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/images/default-vpc-diagram.png but I am trying to use the templates to do this.
What am I doing wrong here? Any help would be much, much appreciated. I have read up a lot about this but still am not sure where to go from here.
Thanks a lot in advance!
I'm having this same issue. There seems to be a complete lack of documentation on how to connect an Elastic Beanstalk node.js/express app with the AWS Quick Start MongoDB cluster setup.
When I run the AWS Mongo Quick Start, though, it launches a NAT which is public, and also a private primary node... maybe this is part of your issue?

AWS Security Group Error

On Amazon Web Services, I'm connecting an Elastic Beanstalk environment to an RDS database, per the tutorial. Launching the database instance worked fine; I connected it to a security group.
Adding the security group to my environment then fails. If I try to add the group name rds-launch-wizard, I get an error telling me to use the group ID instead. If I try to add the group ID sg-10bea66b, I get the error Security Group does not exist.
The security group does exist. What's going on?
Your RDS instance is inside a VPC, whereas your Elastic Beanstalk application is in EC2-Classic (outside any VPC).
With some exceptions, only security groups that are in the same VPC can be added to each other.
Resolution: Put your EB application in the same VPC as your RDS instance.
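One way to confirm the mismatch is to look at the VpcId of each security group. A sketch using boto3 (group IDs are placeholders; in a describe-security-groups response, a group with no VpcId field is an EC2-Classic group):

```python
def groups_by_vpc(descriptions: list) -> dict:
    """Bucket security-group descriptions by VPC; EC2-Classic groups
    have no 'VpcId' key in the describe response."""
    buckets = {}
    for desc in descriptions:
        key = desc.get("VpcId", "EC2-Classic")
        buckets.setdefault(key, []).append(desc["GroupId"])
    return buckets

def fetch_and_bucket(group_ids: list) -> dict:
    import boto3  # live API call; run deliberately
    ec2 = boto3.client("ec2")
    resp = ec2.describe_security_groups(GroupIds=group_ids)
    return groups_by_vpc(resp["SecurityGroups"])
```

If the two groups land in different buckets, they cannot reference each other.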