Running two containers on Fargate using CDK - mongodb

I'd like to use Fargate to run two containers - one for the main project's backend, and another for the database (MongoDB). The basic example included in the GitHub repo shows how to run a single container on Fargate using CDK, but I still have two issues:
The example doesn't show how to run two containers.
I'd like to scale the database containers but have them share the data storage (so that the data gets stored in a central place and stays synchronized between the different containers).
I've figured out how to (sort of) fix the first issue, similarly to how ecs.LoadBalancedFargateService is implemented, but the second issue still remains.
For reference, this is what I have so far in stack.ts (the rest is the basic boilerplate cdk init app --language typescript generates for you):
import cdk = require("@aws-cdk/cdk");
import ec2 = require("@aws-cdk/aws-ec2");
import ecs = require("@aws-cdk/aws-ecs");
import elbv2 = require("@aws-cdk/aws-elasticloadbalancingv2");

const {ApplicationProtocol} = elbv2;
export class AppStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create VPC and Fargate Cluster
    const vpc = new ec2.VpcNetwork(this, "FargateVPC", {
      maxAZs: 2
    });
    const cluster = new ecs.Cluster(this, "Cluster", {vpc});

    // Create task definition
    const fargateTaskDefinition = new ecs.FargateTaskDefinition(this, "FargateTaskDef", {
      memoryMiB: "512",
      cpu: "256"
    });

    // Create container from local `Dockerfile`
    const appContainer = fargateTaskDefinition.addContainer("Container", {
      image: ecs.ContainerImage.fromAsset(this, "Image", {
        directory: ".."
      })
    });

    // Set port mapping
    appContainer.addPortMappings({
      containerPort: 5000
    });

    // Create container from DockerHub image
    const mongoContainer = fargateTaskDefinition.addContainer("MongoContainer", {
      image: ecs.ContainerImage.fromDockerHub("mongo")
    });

    // Set port mapping
    mongoContainer.addPortMappings({
      containerPort: 27017
    });

    // Create service
    const service = new ecs.FargateService(this, "Service", {
      cluster,
      taskDefinition: fargateTaskDefinition,
      desiredCount: 2
    });

    // Configure task auto-scaling
    const scaling = service.autoScaleTaskCount({
      maxCapacity: 5
    });
    scaling.scaleOnCpuUtilization("CpuScaling", {
      targetUtilizationPercent: 70
    });

    // Create an application load balancer for the service
    const loadBalancer = new elbv2.ApplicationLoadBalancer(this, "AppLB", {
      vpc,
      internetFacing: true
    });

    // Allow incoming connections
    loadBalancer.connections.allowFromAnyIPv4(new ec2.TcpPort(5000), "Allow inbound HTTP");

    // Create a listener and listen to incoming requests
    const listener = loadBalancer.addListener("Listener", {
      port: 5000,
      protocol: ApplicationProtocol.Http
    });
    listener.addTargets("ServiceTarget", {
      port: 5000,
      protocol: ApplicationProtocol.Http,
      targets: [service]
    });

    // Output the DNS where you can access your service
    new cdk.Output(this, "LoadBalancerDNS", {
      value: loadBalancer.dnsName
    });
  }
}
Thanks in advance.

Generally, running a database in a Fargate container is not recommended since there is not currently a good solution for persisting data. You could integrate a hook that copies data into something like S3 prior to a task stopping, but generally those kinds of solutions are very fragile and not recommended.
You may want to check out DocumentDB as an alternative to running your own MongoDB instances, though support for DocumentDB constructs in the CDK is not yet fully fleshed out.
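For reference, newer CDK releases do ship an aws-docdb module; a minimal sketch of provisioning a cluster with it (written against aws-cdk-lib v2, so the names differ from the pre-1.0 imports in the question, and this is only an illustration, not part of the original answer) could look like this:
// Minimal sketch, aws-cdk-lib v2. Assumes the backend stays on Fargate and
// only the database moves to a managed, MongoDB-compatible DocumentDB cluster.
import { CfnOutput, Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import * as docdb from "aws-cdk-lib/aws-docdb";

export class DocDbStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = new ec2.Vpc(this, "Vpc", { maxAzs: 2 });

    // Managed cluster; the master password is generated for you and stored
    // in Secrets Manager.
    const dbCluster = new docdb.DatabaseCluster(this, "DocDb", {
      masterUser: { username: "clusteradmin" },
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.R5, ec2.InstanceSize.LARGE),
      instances: 2,
      vpc,
    });

    // The backend service would connect to this endpoint instead of a
    // MongoDB container running inside the task.
    new CfnOutput(this, "DocDbEndpoint", {
      value: dbCluster.clusterEndpoint.socketAddress,
    });
  }
}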
Another alternative is to run regular ECS tasks on EC2 and attach an EBS volume to your EC2 instance. Then you can use Docker volumes to mount the EBS volume into your container. With this approach, you'll need to tag the instance metadata (i.e. set a custom ECS attribute on the container instance) and use an ECS placement constraint to ensure that your task gets placed on the instance that has the EBS volume attached.
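To make that concrete, here is a rough sketch of the host-volume plus placement-constraint wiring, written against the newer aws-cdk-lib v2 API (construct names differ from the pre-1.0 imports above). Attaching the EBS volume, mounting it at /mnt/mongo-data, and setting the has-ebs attribute on the instance are all assumed to happen outside of CDK, and the attribute name is just an example.
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import * as ecs from "aws-cdk-lib/aws-ecs";

export class MongoOnEc2Stack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = new ec2.Vpc(this, "Vpc", { maxAzs: 2 });
    const cluster = new ecs.Cluster(this, "Cluster", { vpc });

    // EC2 capacity is required; Fargate tasks cannot mount host volumes.
    cluster.addCapacity("Capacity", {
      instanceType: new ec2.InstanceType("t3.small"),
      desiredCapacity: 1,
    });

    const taskDef = new ecs.Ec2TaskDefinition(this, "MongoTaskDef");

    // Host volume pointing at the EBS mount point on the container instance
    // (assumed to be prepared via user data or by hand).
    taskDef.addVolume({
      name: "mongo-data",
      host: { sourcePath: "/mnt/mongo-data" },
    });

    const mongo = taskDef.addContainer("Mongo", {
      image: ecs.ContainerImage.fromRegistry("mongo"),
      memoryLimitMiB: 512,
    });
    mongo.addMountPoints({
      containerPath: "/data/db",
      sourceVolume: "mongo-data",
      readOnly: false,
    });
    mongo.addPortMappings({ containerPort: 27017 });

    // Pin the task to the instance carrying the (hypothetical) has-ebs attribute,
    // i.e. the one with the EBS volume attached.
    new ecs.Ec2Service(this, "MongoService", {
      cluster,
      taskDefinition: taskDef,
      desiredCount: 1,
      placementConstraints: [
        ecs.PlacementConstraint.memberOf("attribute:has-ebs == true"),
      ],
    });
  }
}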
If either of these approaches works for you, feel free to open a feature request on the CDK repository. Hope this helps!

Is AWS Fargate a hard requirement?
If not, you could opt for plain ECS + EC2, which supports the use of persistent data volumes:
Fargate tasks only support nonpersistent storage volumes.
For EC2 tasks, use data volumes in the following common examples:
To provide persistent data volumes for use with a container
To define an empty, nonpersistent data volume and mount it on multiple containers
To share defined data volumes at different locations on different containers on the same container instance
To provide a data volume to your task that is managed by a third-party volume driver
I haven't tried it myself, but it seems that CDK has stable support for ECS + EC2 (see the sketch below).
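To illustrate that, here is a small sketch (again against aws-cdk-lib v2, not the pre-1.0 API from the question) of a task-level volume mounted into two containers of the same EC2 task, along the lines of the second and third bullets above. The helper name, image names, and paths are placeholders.
import * as ecs from "aws-cdk-lib/aws-ecs";
import { Construct } from "constructs";

// Hypothetical helper: an EC2 task definition where the app and MongoDB
// containers share one data volume at different mount locations.
export function sharedVolumeTaskDef(scope: Construct): ecs.Ec2TaskDefinition {
  const taskDef = new ecs.Ec2TaskDefinition(scope, "SharedVolumeTaskDef");

  // Host volume: data lives on the container instance at /mnt/data and
  // survives container restarts (but not instance replacement).
  taskDef.addVolume({ name: "shared-data", host: { sourcePath: "/mnt/data" } });

  const app = taskDef.addContainer("App", {
    image: ecs.ContainerImage.fromRegistry("amazon/amazon-ecs-sample"), // placeholder image
    memoryLimitMiB: 512,
  });
  app.addMountPoints({
    containerPath: "/var/app-data",
    sourceVolume: "shared-data",
    readOnly: false,
  });

  const mongo = taskDef.addContainer("Mongo", {
    image: ecs.ContainerImage.fromRegistry("mongo"),
    memoryLimitMiB: 512,
  });
  mongo.addMountPoints({
    containerPath: "/data/db",
    sourceVolume: "shared-data",
    readOnly: false,
  });

  return taskDef;
}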
PS: the link to the basic example is broken; I tried to find its new location in the new example repository, but without success.

Related

connection to external server (mongodb server) fails from fargate container deployed using cdk

I created a simple node.js/express app, built a Docker image, and successfully pushed it to AWS ECR.
Next, I created a CDK project to deploy this container to Fargate with a public application load balancer, using ecs_patterns.ApplicationLoadBalancedFargateService.
Although the deployment command (cdk deploy) was successful, the cluster page in the AWS console shows "No tasks running", the Services tab within the cluster shows a red bar with "0/1 Tasks running", and the Tasks tab shows tasks being created and stopped over and over (every minute or two a task is created, eventually stopped, and replaced by a new one, and this goes on forever).
Opening a stopped task and looking at its Log tab shows:
ERROR: Connecting to MongoDB failed. Please check if MongoDB server is running at the correct host/port.
This is the error message my app prints when the connection to MongoDB fails during server initialization.
The DB credentials and connection URL are valid (see below), and the database runs on a separate EC2 instance with an EIP and a domain name. In fact, I can connect to the DB from my dev machine, which is outside AWS.
Also, just as a trial, I created a stack manually through the console by creating security groups (for the load balancer and the service), a target group, an application load balancer, a listener (port 80 HTTP), a cluster, a task definition (with the correct DB credentials set in an env var), a service, etc., and it works without any issue.
All I want is to create a similar stack using CDK (I don't want to create/maintain it manually).
Any clue as to why the connection to an external server/DB fails from a Fargate container would be very useful. I'm unable to compare the CDK-generated CloudFormation template (which isn't working) with the manually created stack (which is working), as there are too many items in the autogenerated template.
Here is the cdk code based on aws sample code:
const vpc = new ec2.Vpc(this, "MyVpc", { maxAzs: 2 });
const cluster = new ecs.Cluster(this, "MyCluster", { vpc });
const logDriver = ecs.LogDriver.awsLogs({ streamPrefix: "api-log" });
const ecrRepo = ecr.Repository.fromRepositoryName(this, "app-ecr", "abcdef");
new ecs_patterns.ApplicationLoadBalancedFargateService(
  this, "FargateService", {
    assignPublicIp: true,
    cluster,
    desiredCount: 1,
    memoryLimitMiB: 1024,
    cpu: 512,
    taskImageOptions: {
      containerName: "api-container",
      image: ecs.ContainerImage.fromEcrRepository(ecrRepo),
      enableLogging: true,
      logDriver,
      environment: { MONGO_DB_URL: process.env.DB_URL as string }
    },
    publicLoadBalancer: true,
    loadBalancerName: "api-app-lb",
    serviceName: "api-service"
  }
);
It turned out to be a silly mistake! Instead of MONGO_DB_URL it should be DB_URL because that's what my node.js/express server in the container is using.
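Concretely, only the environment entry inside taskImageOptions needs to change so that the key matches what the Node/Express server actually reads (ecrRepo and logDriver are the variables from the snippet above); something like:
// Only the environment key changes; the value stays the same.
const taskImageOptions: ecs_patterns.ApplicationLoadBalancedTaskImageOptions = {
  containerName: "api-container",
  image: ecs.ContainerImage.fromEcrRepository(ecrRepo),
  enableLogging: true,
  logDriver,
  environment: { DB_URL: process.env.DB_URL as string }, // was MONGO_DB_URL
};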

Configure Spring Data Redis to perform all operations via Elasticache configuration endpoint?

Description
Is it possible for Spring Data Redis to use Elasticache's configuration endpoint to perform all cluster operations (i.e., reading, writing, etc.)?
Long Description
I have a Spring Boot application that uses a Redis cluster as data store. The Redis cluster is hosted on AWS Elasticache running in cluster-mode enabled. The Elasticache cluster has 3 shards spread out over 12 nodes. The Redis version that the cluster is running is 6.0.
The service isn't correctly writing or retrieving data from the cluster. Whenever performing any of these operations, I get a message similar to the following:
io.lettuce.core.RedisCommandExecutionException: MOVED 16211 10.0.7.254:6379
From searching the internet, it appears that the service isn't correctly configured for a cluster. The fix seems to be to set the spring.redis.cluster.nodes property to a list of all the nodes in the Elasticache cluster (see here and here). I find this rather needless, considering that the Elasticache configuration endpoint is supposed to be usable for all read and write operations (see the "Finding Endpoints for a Redis (Cluster Mode Enabled) Cluster" section here).
My question is this: can Spring Data Redis use Elasticache's configuration endpoint to perform all reads and writes, the way the AWS documentation describes? I'd rather not hand over a list of all the nodes if Spring Data Redis can use the configuration endpoint the way it's meant to be used. This seems like a serious limitation to me.
Thanks in advance!
Here is what I found works:
@Bean
public RedisConnectionFactory lettuceConnectionFactory()
{
    LettuceClientConfiguration config =
        LettucePoolingClientConfiguration
            .builder()
            // ... your client configuration settings ...
            .build();

    // Seed the cluster configuration with the Elasticache configuration endpoint only;
    // the client discovers the remaining nodes from the cluster topology.
    RedisClusterConfiguration clusterConfig = new RedisClusterConfiguration();
    clusterConfig.addClusterNode(new RedisNode("xxx.v1tc03.clustercfg.use1.cache.amazonaws.com", 6379));

    return new LettuceConnectionFactory(clusterConfig, config);
}
where xxx is the name of your Elasticache cluster (the host shown is the cluster's configuration endpoint).

How to read from redis replica with go-redis

We have a Go service that queries Redis to fetch data for each request, and we want to read data from the Redis replica (slave) nodes as well. We went through the documentation of Redis and the go-redis library and found that, in order to read from a replica, the READONLY command has to be issued on the Redis side. We are using ClusterOptions in the go-redis library to set up a read-only connection to Redis.
redis.NewClusterClient(&redis.ClusterOptions{
    Addrs:    []string{redisAddress},
    Password: "",
    ReadOnly: true,
})
After doing all this, we can see (using monitoring) that read requests are still handled by the master nodes only. I assume this is not expected and that I am missing something or doing it wrong. Any pointers to solve this problem would be appreciated.
Some more context:
redisAddress in the above code is a single Kubernetes cluster IP. Redis is deployed using a Kubernetes operator with 3 masters and 1 replica for each master.
I've done it by setting the option RouteRandomly: true

specify more than one endpoint in java s3 api for ceph to connect to ceph cluster?

Hello everyone. I have just started to get my hands dirty with Ceph object storage, i.e. radosgateway, and for this purpose have spun up a very basic single-node ceph/daemon Docker container, which works perfectly fine for both s3cmd and the Java S3 API (the mgr dashboard doesn't work, though: the container shuts down when issuing the command ceph mgr module enable dashboard). One thing I can't seem to figure out is how to specify more than one endpoint for our Java S3 client to connect to the cluster. Does it have something to do with HTTP front-ends? I need some pointers, or a sample example would be great. Following is my code to connect to a single-node Ceph cluster built using the ceph/daemon image's Docker container.
String accessKey = "demoKey";
String secretKey = "demoKey";

try {
    ClientConfiguration clientConfig = new ClientConfiguration();
    clientConfig.setProtocol(Protocol.HTTP);

    System.setProperty(SDKGlobalConfiguration.DISABLE_CERT_CHECKING_SYSTEM_PROPERTY, "true");
    if (SDKGlobalConfiguration.isCertCheckingDisabled()) {
        System.out.println("Cert checking is disabled");
    }

    AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
    AmazonS3 conn = new AmazonS3Client(credentials);
    conn.setEndpoint("http://ubuntu:8080"); // more than one endpoint ??

    List<Bucket> buckets = conn.listBuckets();
    for (Bucket bucket : buckets) {
        System.out.println(bucket.getName() + "\t" +
                StringUtils.fromDate(bucket.getCreationDate()));
    }
} catch (Exception ex) {
    ex.printStackTrace();
}
Finally, my Ceph version:
ceph version 14.2.4 nautilus (stable)
The Ceph Object Gateway can have multiple instances. These are combined behind some load balancer, so you have one endpoint that distributes the load onto the Ceph Object Gateway instances. The load balancer itself can be scaled as well (e.g. round-robin DNS or whatnot).
I found a nice use case here. Maybe it helps. Have a look at the media storage architecture.

How to convert/migrate existing google cloud platform infrastructure to terraform or other IaC

Currently we have our kubernetes cluster master set to zonal, and require it to be regional. My idea is to convert the existing cluster and all workloads/nodes/resources to some infrastructure-as-code - preferably terraform (but could be as simple as a set of gcloud commands).
I know that with GCP I can generate the raw command lines for commands I'm about to run, but I don't know how (or whether I even can) to convert existing infrastructure the same way.
Based on my research, it looks like it isn't exactly possible to do what I'm trying to do [in a straight-forward fashion]. So I'm looking for any advice, even if it's just to read some other documentation (for a tool I'm not familiar with maybe).
TL;DR: I'm looking to take my existing Google Cloud Platform Kubernetes cluster and rebuild it in order to change the location type from zonal to regional - I don't actually care how this is done. What is a currently accepted best-practice way of doing this? If there isn't one, what is a quick and dirty way of doing this?
If you require me to specify further, I will - I have intentionally left out linking to specific research I've done.
Creating a Kubernetes cluster with terraform is very straightforward, because ultimately making a Kubernetes cluster in GKE is straightforward: you just use the google_container_cluster and google_container_node_pool resources, like so:
resource "google_container_cluster" "primary" {
name = "${var.name}"
region = "${var.region}"
project = "${var.project_id}"
min_master_version = "${var.version}"
addons_config {
kubernetes_dashboard {
disabled = true
}
}
maintenance_policy {
daily_maintenance_window {
start_time = "03:00"
}
}
lifecycle {
ignore_changes = ["node_pool"]
}
node_pool {
name = "default-pool"
}
}
resource "google_container_node_pool" "default" {
name = "default"
project = "${var.project_id}"
region = "${var.region}"
cluster = "${google_container_cluster.primary.name}"
autoscaling {
min_node_count = "${var.node_pool_min_size}"
max_node_count = "${var.node_pool_max_size}"
}
management {
auto_repair = "${var.node_auto_repair}"
auto_upgrade = "${var.node_auto_upgrade}"
}
lifecycle {
ignore_changes = ["initial_node_count"]
}
node_config {
machine_type = "${var.node_machine_type}"
oauth_scopes = [
"https://www.googleapis.com/auth/cloud-platform",
]
}
depends_on = ["google_container_cluster.primary"]
}
For a more fully featured experience, there are terraform modules available like this one
Converting an existing cluster is considerably more fraught. If you want to bring it under Terraform's control, you can use terraform import:
terraform import google_container_cluster.mycluster us-east1-a/my-cluster
However, in your comment, you mentioned wanting to convert a zonal cluster to a regional cluster. Unfortunately, that's not possible at this time:
You decide whether your cluster is zonal or regional when you create it. You cannot convert an existing zonal cluster to regional, or vice versa.
Your best bet, in my opinion, is to:
Create a regional cluster with terraform, giving the cluster a new name
Back up your existing zonal cluster, either using an etcd backup, or a more sophisticated backup using heptio-ark
Restore that backup to your regional cluster
I wanted to achieve exactly that: Take existing cloud infrastructure and bring it to infrastructure as code (IaC), i.e. put it in *.tf files
There were basically 2 options that I found and took into consideration:
terraform import (Documentation)
Because of the following limitation, terraform import did not achieve exactly what I was looking for: it requires you to manually create the resource configuration.
The current implementation of Terraform import can only import resources into the state. It does not generate configuration. A future version of Terraform will also generate configuration.
Because of this, prior to running terraform import it is necessary to write manually a resource configuration block for the resource, to which the imported object will be mapped.
Terraformer (GitHub Repo)
A CLI tool that generates tf/json and tfstate files based on existing infrastructure (reverse Terraform).
This tool is provider-agnostic and follows the same flow as terraform, i.e. plan and import. It was able to import specific resources as well as entire workspaces and convert them into *.tf files.