Can we trigger an AWS Lambda function from an AWS Glue PySpark job?

Currently I'm able to run a Glue PySpark job, but is it possible to call a Lambda function from that job? Using the code below, my PySpark Glue job calls a Lambda function.
import boto3

lambda_client = boto3.client('lambda', region_name='us-west-2')
response = lambda_client.invoke(FunctionName='test-lambda')
Error:
botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the Invoke operation: User: arn:aws:sts::208244724522:assumed-role/AWSGlueServiceRoleDefault/GlueJobRunnerSession is not authorized to perform: lambda:InvokeFunction on resource: arn:aws:lambda:us-west-2:208244724522:function:hw-test
But I added the proper Lambda roles to my Glue IAM role and still get the error above. Is there a specific role or permission I need to add?
Thanks.

To invoke AWS Lambda you can use the following policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowToExampleFunction",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:lambda:<region>:<123456789012>:function:<example_function>"
        }
    ]
}
The policies you attached are not suitable for invoking Lambda:
AWSLambdaBasicExecutionRole – grants permissions only for the Amazon CloudWatch Logs actions needed to write logs. You can use this policy if your Lambda function does not access any other AWS resources except writing logs.
AWSLambdaVPCAccessExecutionRole – grants permissions for Amazon Elastic Compute Cloud (Amazon EC2) actions to manage elastic network interfaces (ENIs).
Please see the AWS documentation about these managed policies.
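If you want to attach that permission programmatically rather than through the console, here is a minimal boto3 sketch; the role name comes from the error message in the question, while the policy name and function ARN are placeholders, not verified values.

import json
import boto3

iam = boto3.client('iam')

invoke_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowToExampleFunction",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            # Placeholder ARN: replace region, account id and function name.
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:test-lambda"
        }
    ]
}

# Attach the statement as an inline policy on the Glue job role.
iam.put_role_policy(
    RoleName='AWSGlueServiceRoleDefault',   # role from the error message
    PolicyName='AllowInvokeTestLambda',     # arbitrary inline policy name
    PolicyDocument=json.dumps(invoke_policy),
)

Once the policy is in place, the invoke call from the question should succeed; you can also pass a Payload argument and read response['Payload'].read() to get the function's result.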

Related

Trigger a DAG in Amazon Managed Workflows for Apache Airflow (MWAA) as part of CI/CD

Wondering if there is any way (blueprint) to trigger an Airflow DAG in MWAA on the merge of a pull request (preferably via GitHub Actions)? Thanks!
You need to create a role in AWS:
Set permissions with a policy allowing airflow:CreateCliToken:
{
    "Action": "airflow:CreateCliToken",
    "Effect": "Allow",
    "Resource": "*"
}
Add a trust relationship (scoped to your account and repository):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::{account_id}:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": "repo:{repo-name}:*"
                }
            }
        }
    ]
}
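If you prefer to script this setup, a rough boto3 sketch is below. The role and policy names are placeholders, and it assumes the GitHub OIDC provider (token.actions.githubusercontent.com) is already registered in your account.

import json
import boto3

iam = boto3.client('iam')

ACCOUNT_ID = "123456789012"   # placeholder AWS account id
REPO = "my-org/my-repo"       # placeholder GitHub repository

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/token.actions.githubusercontent.com"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
            "StringLike": {"token.actions.githubusercontent.com:sub": f"repo:{REPO}:*"}
        }
    }]
}

cli_token_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Action": "airflow:CreateCliToken", "Effect": "Allow", "Resource": "*"}]
}

# Create the role GitHub Actions will assume, then attach the MWAA permission inline.
iam.create_role(
    RoleName="github-actions-mwaa-trigger",   # placeholder role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.put_role_policy(
    RoleName="github-actions-mwaa-trigger",
    PolicyName="AllowCreateCliToken",
    PolicyDocument=json.dumps(cli_token_policy),
)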
In the GitHub Action you need to configure AWS credentials with role-to-assume and give the job the required permissions:
permissions:
  id-token: write
  contents: read

- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v1
  with:
    role-to-assume: arn:aws:iam::{account_id}:role/{role-name}
    aws-region: {region}
Call MWAA using the CLI; see the AWS reference on how to create a CLI token and trigger a DAG run.
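As an illustration of that last step, here is a rough Python sketch that creates a CLI token and triggers a DAG through the MWAA CLI endpoint. The environment name, region and DAG id are placeholders, and the request/response format should be double-checked against the AWS reference mentioned above.

import base64
import boto3
import requests

ENV_NAME = "my-mwaa-environment"   # placeholder MWAA environment name
DAG_ID = "my_dag"                  # placeholder DAG id

mwaa = boto3.client('mwaa', region_name='us-east-1')
token = mwaa.create_cli_token(Name=ENV_NAME)

# The MWAA web server exposes an Airflow-CLI proxy endpoint.
resp = requests.post(
    f"https://{token['WebServerHostname']}/aws_mwaa/cli",
    headers={
        "Authorization": f"Bearer {token['CliToken']}",
        "Content-Type": "text/plain",
    },
    data=f"dags trigger {DAG_ID}",
)
result = resp.json()
# stdout/stderr of the CLI command come back base64-encoded.
print(base64.b64decode(result.get("stdout", "")).decode())
print(base64.b64decode(result.get("stderr", "")).decode())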
(Answering for Airflow without specific context to MWAA)
Airflow offers a REST API with a trigger-DAG endpoint, so in theory you can configure a GitHub Action that runs after a PR is merged and triggers a DAG run via a REST call. In theory this should work.
In practice this will not work as you expect.
Airflow is not synchronous with your merges (even if the merge dumps code straight into the DAG folder and there is no additional wait time for GitSync). Airflow has a DAG file processing service that scans the DAG folder and looks for changes in files. It processes the changes, and only then is the DAG registered in the database. Only after that can Airflow use the new code. This serialization process is important: it ensures the different parts of Airflow (webserver etc.) don't need access to your DAG folder.
This means that if you trigger a DAG run right after a merge, you risk executing an older version of your code.
I don't know why you need such a mechanism (it's not a very typical requirement), but I'd advise you not to try to force this idea into your deployment.
To clarify:
If, under a specific deployment, you can confirm that the code you deployed has been parsed and registered as a DAG in the database, then there is no risk in doing what you are after. This is probably a very rare and unique case.
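To illustrate that caveat, here is a rough sketch against the stable Airflow REST API: check that the DAG's last_parsed_time is newer than your merge before triggering. The base URL, credentials and DAG id are placeholders, and the exact field names are worth verifying against your Airflow version.

from datetime import datetime, timezone
import requests

BASE = "https://airflow.example.com/api/v1"   # placeholder Airflow webserver URL
AUTH = ("user", "password")                   # placeholder credentials
DAG_ID = "my_dag"                             # placeholder DAG id
merge_time = datetime(2024, 1, 1, tzinfo=timezone.utc)  # time of the merge

# 1. Check when the scheduler last parsed this DAG.
dag = requests.get(f"{BASE}/dags/{DAG_ID}", auth=AUTH).json()
last_parsed = datetime.fromisoformat(dag["last_parsed_time"].replace("Z", "+00:00"))

if last_parsed > merge_time:
    # 2. Only trigger once the new code has been picked up.
    requests.post(f"{BASE}/dags/{DAG_ID}/dagRuns", auth=AUTH, json={"conf": {}})
else:
    print("DAG not re-parsed yet; wait before triggering.")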

How to implement cross-account RBAC using Cognito User groups and API Gateway?

I have 2 AWS accounts. The front end, along with Cognito, is hosted in Account 1 and the backend with the API Gateway is hosted in Account 2. I want to set up RBAC using Cognito groups to prevent the users in a Cognito group from calling DELETE on the APIs. I have created a permission policy as below, attached it to a role, and then attached the role to the Cognito group. I have then created an authorizer for the API Gateway in Account 2 using the Cognito user pool available in Account 1 and attached the authorizer to the API's DELETE method request.
Deny Policy, where I have replaced the resource parameters with my account/API details:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "execute-api:Invoke"
            ],
            "Resource": [
                "arn:aws:execute-api:region:account-id:api-id/stage/METHOD_HTTP_VERB/Resource-path"
            ]
        }
    ]
}
But when I invoke the DELETE method, it still succeeds, whereas I expect an unauthorized response with this setup. I can see the Cognito user group details when I decode the token, so my guess is that the Cognito call to API Gateway is happening properly, but the role/deny policy attached to the group is not being enforced. Can someone please help me understand what I am doing wrong? Since this is cross-account, do I have to do something else with the IAM role I attached to the Cognito group, or is there an issue with the policy I am using?
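As a side note, the group claim mentioned above can be inspected by decoding the ID token's payload; a minimal sketch, assuming a standard Cognito JWT (this only inspects the claim, it does not verify the signature):

import base64
import json

def jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)   # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

# claims = jwt_payload(id_token)           # id_token: the ID token returned by Cognito
# print(claims.get("cognito:groups"))      # groups the user belongs to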

Lambda Authorizer Policy not restricting access to Api Gateway proxy resource

I have a Lambda authorizer (python) that returns a resource-based policy similar to the following:
import json

def lambda_handler(event, context):
    resource = "*"
    headerValue = _get_header_value(event, 'my-header')   # helper omitted in the question
    if headerValue == 'a':
        resource = "arn:aws:execute-api:*:*:*/*/GET/a"
    return {
        "principalId": "somebody",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow",
                    "Resource": resource
                }
            ]
        }
    }
Basically, this authorizer will return an unrestricted API resource policy by default, using *. However, if a specific header value is passed, the policy will restrict access to only allow GET /a.
On the API Gateway side of things, the only resource I have is ANY /{proxy+}, which proxies into a .NET Core WebApi using APIGatewayProxyFunction. Inside the APIGatewayProxyFunction/WebApi, I have a number of controllers and routes available, including GET /a. After all this is deployed into AWS, I can construct an HTTP request using my-header with value a. I'm expecting this request to only provide access to GET /a, and return a 403 in all other cases. Instead, it provides access to everything in the API, similar to the star policy.
Is this the expected behavior when using a Lambda Authorizer in front of a proxy resource? It seems to really only enforce Allow * or Deny *. Thank you.
Note - When using the same authorizer against an API Gateway where all the resources are defined inside it (instead of inside .NET controllers by proxy), the expected behavior does appear to happen - the HTTP request with my-header set to 'a' will grant access to GET /a, but return 403 otherwise.
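One way to see what API Gateway evaluates the returned policy against in the proxy setup is to log the method ARN the authorizer receives. A small diagnostic variant of the authorizer above (purely illustrative, not a fix):

def lambda_handler(event, context):
    # Log the method ARN API Gateway passes in, so it can be compared with the
    # Resource in the policy the real authorizer returns.
    method_arn = event.get("methodArn", "*")
    print("methodArn:", method_arn)
    # Allow exactly the ARN that was received, to observe how it is matched.
    return {
        "principalId": "somebody",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": "Allow", "Resource": method_arn}
            ]
        }
    }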

Not able to associate elastic ip address to my AWS ECS instance

I have created an AWS ECS instance in the ca-central region. It works with a dynamic public IP, which changes every time I update the service. Everything is good so far.
As I need a static public IP, I created an Elastic IP in the same region and tried to associate it with the ECS instance.
Resource Type: Network Interface
Reassociation: Allow this Elastic IP address to be reassociated (checked)
When I try this, it throws an error like this:
Elastic IP address could not be associated.
Elastic IP address nn.nn.nn.nn: You do not have permission to access the specified resource.
It seems the EIP you are trying to associate with the ECS container instance is already associated with another resource (e.g. a NAT gateway?). Please make sure the EIP is not currently associated with any other resource, then try again.
Also confirm the user performing these actions has the following permission:
"ec2:AssociateAddress"
To apply the various EC2 Elastic IP permissions in the AWS console, you can basically follow the instructions in this link below.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-policies-ec2-console.html#ex-eip
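To check programmatically whether the address is still attached somewhere and then associate it with the task's network interface, here is a rough boto3 sketch; the allocation ID and ENI ID are placeholders you would look up in your account.

import boto3

ec2 = boto3.client('ec2', region_name='ca-central-1')

ALLOCATION_ID = 'eipalloc-0123456789abcdef0'   # placeholder EIP allocation ID
ENI_ID = 'eni-0123456789abcdef0'               # placeholder ENI of the ECS instance/task

# 1. See whether the Elastic IP is already associated with something.
addr = ec2.describe_addresses(AllocationIds=[ALLOCATION_ID])['Addresses'][0]
if addr.get('AssociationId'):
    print("Already associated with:", addr.get('NetworkInterfaceId'))

# 2. Associate it with the network interface.
ec2.associate_address(
    AllocationId=ALLOCATION_ID,
    NetworkInterfaceId=ENI_ID,
    AllowReassociation=True,   # same as the console's "Reassociation" checkbox
)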
I wanted to make sure that my IAM user had all the permissions necessary to view, allocate, associate, and release Elastic IPs, so I added permissions through IAM to the specific IAM group by:
Opening the Permissions tab and selecting Add permissions -> Create inline policy
After naming the policy, adding the JSON below in the JSON tab:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeAddresses",
                "ec2:AllocateAddress",
                "ec2:DescribeInstances",
                "ec2:AssociateAddress",
                "ec2:ReleaseAddress",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeCoipPools",
                "ec2:DescribePublicIpv4Pools"
            ],
            "Resource": "*"
        }
    ]
}

Restrict gcloud service account to specific bucket

I have 2 buckets, prod and staging, and I have a service account. I want to restrict this account to only have access to the staging bucket. I saw on https://cloud.google.com/iam/docs/conditions-overview that this should be possible. I created a policy.json like this:
{
    "bindings": [
        {
            "role": "roles/storage.objectCreator",
            "members": [
                "serviceAccount:staging-service-account@lalala-co.iam.gserviceaccount.com"
            ],
            "condition": {
                "title": "staging bucket only",
                "expression": "resource.name.startsWith(\"projects/_/buckets/uploads-staging\")"
            }
        }
    ]
}
But when I fire gcloud projects set-iam-policy lalala policy.json I get:
The specified policy does not contain an "etag" field identifying a
specific version to replace. Changing a policy without an "etag" can
overwrite concurrent policy changes.
Replace existing policy (Y/n)?
ERROR: (gcloud.projects.set-iam-policy) INVALID_ARGUMENT: Can't set conditional policy on policy type: resourcemanager_projects and id: /lalala
I feel like I misunderstood how roles, policies and service-accounts are related. But in any case: is it possible to restrict a service account in that way?
Following the comments, I was able to solve my problem. Apparently bucket permissions are somewhat special, but I was able to set a policy on the bucket itself that allows access for my service account, using gsutil:
gsutil iam ch serviceAccount:staging-service-account@lalala.iam.gserviceaccount.com:objectCreator gs://lalala-uploads-staging
After running this, access works as expected. I found it a little confusing that this is not reflected in the service account's own policy:
% gcloud iam service-accounts get-iam-policy staging-service-account@lalala.iam.gserviceaccount.com
etag: ACAB
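For completeness, here is a rough equivalent of the gsutil command above using the google-cloud-storage Python client; the bucket and service account names are taken from the question, and the call pattern should be checked against the current client library docs.

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("lalala-uploads-staging")

# Bucket IAM bindings live on the bucket, not on the service account,
# which is why they do not show up in the service account's own policy.
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectCreator",
    "members": {"serviceAccount:staging-service-account@lalala.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)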
Thanks everyone