When creating an EMR cluster via the UI, I can click 'enable debugging'.
Via the CLI, I can add the parameter --enable-debugging.
How can I do it via CloudFormation? I did give a LogUri, and I do see the logs there, but the EMR web UI keeps telling me 'Debugging not configured' when running Spark jobs.
It just gets added as the first step in the Steps list, so you need the equivalent of the below in your CloudFormation config.
"Steps": [{
"Name": "Setup Hadoop Debugging",
"ActionOnFailure": "TERMINATE_CLUSTER",
"HadoopJarStep": {
"Jar": "command-runner.jar",
"Args": ["state-pusher-script"]
}
}]
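For comparison, the --enable-debugging flag mentioned in the question presumably injects this same step when the cluster is created from the CLI. A rough sketch (the release label, instance settings and S3 bucket are placeholders, not from the question):

aws emr create-cluster \
    --name "debug-test" \
    --release-label emr-5.29.0 \
    --applications Name=Spark \
    --log-uri s3://my-bucket/emr-logs/ \
    --enable-debugging \
    --use-default-roles \
    --instance-type m5.xlarge \
    --instance-count 3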
The accepted answer is almost correct.
I received an error while deploying: "CloudFormation currently only supports long-running clusters, set ActionOnFailure to CANCEL_AND_WAIT or CONTINUE."
So the following worked for me (this is also the YAML version, for those who would like to copy-paste that format):
Steps:
  - Name: SetupHadoopDebugging
    ActionOnFailure: CONTINUE
    HadoopJarStep:
      Jar: command-runner.jar
      Args:
        - 'state-pusher-script'
I have a cluster on ECS and everything works well! When I used AWS CLI v1, I could update my service with a command like this: aws ecs update-service --cluster [cluster-name] --service [service-name] --task-definition [task-name] --force-new-deployment. After updating the CLI to v2, I try to use this command and everything just gets stuck! I didn't find any changes in the AWS documentation. Do you have any ideas?
Update (see my screenshot): the problem is that everything starts well, without errors or warnings, it just gets stuck!
The AWS CLI version 2 uses a client-side pager. As stated in the documentation, you can "use the --no-cli-pager command line option to disable the pager for a single command use" (or use any of the other options described there).
The command did work. What you are seeing is the output, as documented. You can scroll through the output using the up and down arrow keys or hit the Q key to dismiss it.
Alternately, you can redirect the output to a file for review later:
aws ecs update-service ... > result.json
Or just ignore the output entirely:
aws ecs update-service ... > /dev/null
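For reference, the other documented options boil down to an environment variable or a config setting; a quick sketch (either one on its own is enough):

# Disable the pager for the current shell session
export AWS_PAGER=""

# Or disable it permanently in the CLI configuration
aws configure set cli_pager ""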
I am running CloudFormation updates to ECS. Triggered by CodePipeline. I would like to abort the CloudFormation deployment and rollback to the previous version after a timeout.
What is the best way to accomplish this? I saw something about WaitConditions but I'm not sure that is the right mechanism.
I also found that you can configure a TimeoutInMinutes on nested stacks https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-stack.html#cfn-cloudformation-stack-timeoutinminutes - but it sounds like you cannot apply a similar property at the top level of the stack, or to an arbitrary resource?
Is there another way that I can use the combination where I can abort the Codepipeline->Cloudformation->ECS deployment after a few minutes if it doesn't succeed?
This is a general gripe with the CodePipeline ECS deploy action (standard ECS, not ECS blue/green): if you push a bad image, you have to wait an hour for the timeout to occur before you can retry the pipeline.
At the moment, CodePipeline doesn't support rollbacks. You can detect a failed pipeline using CloudWatch [1] and take some action. The action will probably be roll-forward to a good version.
[1] Detect and React to Changes in Pipeline State with Amazon CloudWatch Events - https://docs.aws.amazon.com/codepipeline/latest/userguide/detect-state-changes-cloudwatch-events.html
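A rough sketch of wiring that up with the CLI (the rule name, pipeline name and SNS topic ARN are placeholders, not from the answer):

# Match failed executions of a specific pipeline
aws events put-rule --name "pipeline-failed" --event-pattern '{
  "source": ["aws.codepipeline"],
  "detail-type": ["CodePipeline Pipeline Execution State Change"],
  "detail": { "state": ["FAILED"], "pipeline": ["my-pipeline"] }
}'

# Send matching events to whatever should react, e.g. an SNS topic
aws events put-targets --rule "pipeline-failed" \
  --targets '[{"Id": "1", "Arn": "arn:aws:sns:us-east-1:123456789012:pipeline-alerts"}]'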
We don't use CodePipeline, we're using Sceptre. But I guess my workaround could still work.
My workaround for this problem is to run a script in the background before triggering a deployment.
./deployment-breaker.sh &
And for the script
#!/bin/bash
# Give the deployment some time before checking on it
sleep 600
# Extract the stack status (jq filter left as a placeholder)
deploymentStatus=$(aws cloudformation describe-stacks --stack-name STACK_NAME | jq XXX)
if [[ $deploymentStatus == YOUR_TERMINATE_CONDITION ]]; then
    aws cloudformation cancel-update-stack --stack-name STACK_NAME
fi
I have created a Task Definition on Elastic Container Service and have successfully run it in a Fargate cluster. However when I create a Scheduled Task in said cluster the option for "Launch Type" is hardcoded to EC2. Is there a way, perhaps through the command line to schedule the task to run on Fargate?
Heads up! This is now supported in AWS:
https://aws.amazon.com/about-aws/whats-new/2018/08/aws-fargate-now-supports-time-and-event-based-task-scheduling/
Although not in some regions - as of April 2019 it still wasn't supported in eu-west-2 (London). Check the table at the top of this page to see if it's supported in the region you want: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/scheduled_tasks.html
There seems to be no way of scheduling a task on FARGATE.
The only way it can be done right now seems to be by having your 'scheduler' external to ECS. I did it with a Lambda. You can also use something like Jenkins or a simple cron job that fires the aws-cli command to ECS, but in both of those cases you will need an instance that is always running.
I wrote a lambda that accepts the params (overrides) to be sent to the ECS task and has the schedule the task was supposed to have.
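For reference, the aws-cli call such an external scheduler would fire looks roughly like this (cluster name, task definition, subnet and security group are placeholders):

aws ecs run-task \
  --cluster my-cluster \
  --launch-type FARGATE \
  --task-definition my-task:1 \
  --count 1 \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],securityGroups=[sg-0123456789abcdef0],assignPublicIp=ENABLED}'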
Update:
It seems there is a schedule tab in the FARGATE cluster details now that will allow you to set cron schedules on ECS tasks.
While the AWS documentation gives you ways to do this through CloudFormation, it seems they had not actually released the feature yet. I have been trying to do something similar and ran into the same issue.
Once it does become available, this link from the AWS docs should be useful. Here's how they suggest doing it, but I keep running into errors saying NetworkConfiguration is not recognized and LaunchType is not recognized.
"EcsParameters": {
"Group": "string",
"LaunchType": "string",
"NetworkConfiguration": {
"awsvpcConfiguration": {
"AssignPublicIp": "string",
"SecurityGroups": [ "string" ],
"Subnets": [ "string" ]
}
},
Update: here is an alternative that did end up working for me, through the aws events put-targets command in the aws cli!
Make sure your aws cli is up to date; this method fails for older versions of the cli. Run this to update: pip install awscli --upgrade --user
After that, you should be good to go. Use the aws events put-targets --rule <value> --targets <value> command. Make sure that before you run this command you have a rule already defined on your cluster. If not, you can do that with the aws events put-rule command too. Refer to the AWS docs for put-rule and for put-targets.
An example of a rule from the documentation is given below:
aws events put-rule --name "DailyLambdaFunction" --schedule-expression "cron(0 9 * * ? *)"
The put-targets command that worked for me is this:
aws events put-targets --rule cli-RS-rule --targets '{
    "Arn": "arn:aws:ecs:1234/cluster/clustername",
    "EcsParameters": {
        "LaunchType": "FARGATE",
        "NetworkConfiguration": {
            "awsvpcConfiguration": {
                "AssignPublicIp": "ENABLED",
                "SecurityGroups": [ "sg-id1233" ],
                "Subnets": [ "subnet-1234" ]
            }
        },
        "TaskCount": 1,
        "TaskDefinitionArn": "arn:aws:ecs:1234:task-definition/taskdef"
    },
    "Id": "sampleID111",
    "RoleArn": "arn:aws:iam:1234:role/eventrole"
}'
You can create a CloudWatch rule that uses a schedule as the event source and an ECS task as the target.
No this is not supported yet unfortunately. There is an open issue here. Hopefully it gets done soon as I would like to use it as well!
Disclosure: I work for SenseDeep, which provides PowerDown (https://www.powerdown.io).
Other services provide this functionality. PowerDown gives you the ability to schedule Fargate services. This is at the service level, not the task level, but it is easy to create services for tasks. For example: you could schedule a CI/CD pipeline container to run 9-5, M-F.
It's not possible to have EC2 instances and Fargate instances in the same cluster.
It is possible to schedule a Fargate instance: create a specific service and update it from the AWS tools, e.g.:
aws ecs update-service --service my-http-service --task-definition
https://docs.aws.amazon.com/cli/latest/reference/ecs/update-service.html
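One common way to put that into practice, not spelled out in the answer above, is to switch the service on and off on a schedule by changing its desired count from a cron job or Lambda (cluster and service names are placeholders):

# Scale the service up at the start of the window...
aws ecs update-service --cluster my-cluster --service my-http-service --desired-count 1

# ...and back down to zero at the end of it
aws ecs update-service --cluster my-cluster --service my-http-service --desired-count 0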
Useful resources:
You could use the ECS AWS tools and execute them from Lambda or Travis.
Check out this medium post:
https://medium.com/#joseignaciocastelli92/how-to-create-a-continuous-deployment-process-using-ecs-fargate-docker-travis-410d84b4d99e
At the bottom it has this repository with the AWS commands:
https://github.com/JicLotus/ecs-farate-scripts-to-deploy-and-build
Best
I am trying to deploy an application to AWS ECS using codeship. I have my docker-compose file and everything is ready to be deployed. Codeship documentation says to do something like this in the codeship-steps.yml file:
aws ecs register-task-definition --cli-input-json file:///deploy/tasks/backend.json
aws ecs update-service --service my-backend-service --task-definition backend
My question is: is this file, file:///deploy/tasks/backend.json, something I have to provide manually, or is it created automatically along with the ECS task? I ask because I keep getting this error from Codeship:
An error occurred (ClientException) when calling the RunTask operation: TaskDefinition not found.
file:///deploy/tasks/backend.json is something you provide to aws ecs register-task-definition.
Looked it up here, this will generate the structure of that file you want:
aws ecs register-task-definition --generate-cli-skeleton
You can then throw it in backend.json, for instance (i.e. redirect the output of the command into a file called backend.json), and modify it.
If you look at the suggested service definition:
awsdeployment:
  image: codeship/aws-deployment
  encrypted_env_file: aws-deployment.env.encrypted
  environment:
    - AWS_DEFAULT_REGION=us-east-1
  volumes:
    - ./:/deploy
You can see that ./ is mapped to the /deploy mount point. This means that if, in your repo, you create a directory called tasks and place your JSON file there, you should be all set.
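Putting the two answers together, a minimal sketch (the skeleton still needs to be filled in with your real container definitions before it is usable):

# From the root of your repo, which the service definition mounts at /deploy
mkdir -p tasks
aws ecs register-task-definition --generate-cli-skeleton > tasks/backend.json
# After editing it, the step can reference it as file:///deploy/tasks/backend.json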
I'm not able to locate error logs or messages from println calls in Scala while running jobs on Spark in EMR.
Where can I access these?
I'm submitting the Spark job, written in Scala, to EMR using script-runner.jar with the arguments --deploy-mode set to cluster and --master set to yarn. It runs the job fine.
However, I do not see my println statements in the Amazon EMR UI where it lists stderr, stdout, etc. Furthermore, if my job errors out, I don't see why it had an error. All I see is this in the stderr:
15/05/27 20:24:44 INFO yarn.Client: Application report from ResourceManager:
application identifier: application_1432754139536_0002
appId: 2
clientToAMToken: null
appDiagnostics:
appMasterHost: ip-10-185-87-217.ec2.internal
appQueue: default
appMasterRpcPort: 0
appStartTime: 1432758272973
yarnAppState: FINISHED
distributedFinalState: FAILED
appTrackingUrl: http://10.150.67.62:9046/proxy/application_1432754139536_0002/A
appUser: hadoop
With a deploy mode of cluster on YARN, the Spark driver, and hence the user code being executed, will be within the Application Master container. It sounds like you had EMR debugging enabled on the cluster, so logs should have also been pushed to S3. In the S3 location, look at task-attempts/<applicationid>/<firstcontainer>/*.
If you SSH into the master node of your cluster then you should be able to find the stdout, stderr, syslog and controller logs under:
/mnt/var/log/hadoop/steps/<stepname>
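If YARN log aggregation is enabled on the cluster, you can also pull everything for a finished application from the master node; a quick sketch using the application id from the output above:

yarn logs -applicationId application_1432754139536_0002 > app_logs.txt
less app_logs.txt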
I also spent a lot of time figuring this out. Found logs in the following location:
EMR UI Console -> Summary -> Log URI -> Containers -> application_xxx_xxx -> container_yyy_yy_yy -> stdout.gz.
The event logs, the ones required for the spark-history-server, can be found at:
hdfs:///var/log/spark/apps
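To check what is actually there, you can list that directory from the master node, e.g.:

hdfs dfs -ls hdfs:///var/log/spark/apps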
If you submit your job with emr-bootstrap, you can specify the log directory as an S3 bucket with --log-uri.