Unzip artifact for REST API Gateway in CDK - aws-api-gateway

I'm currently passing, through parameterOverrides, both the S3 bucket name and the object key.
However, the key in fact points to a zipped file (which contains the YAML):
export class BusinessAssetApi extends SpecRestApi {
  constructor(scope: Construct, id: string, bucketName: string, key: string) {
    const bucket = Bucket.fromBucketName(scope, "openapi-bucket", bucketName)
    super(scope, id, {
      deploy: true,
      deployOptions: {
        stageName: STAGE_NAME,
      },
      apiDefinition: ApiDefinition.fromBucket(bucket, key),
    })
  }
}
Now, I want to know if there's a smart way to unzip the file and get the YAML file instead, or if there is a smarter way to save the artifact with a specific filename and/or file extension?
TIA
Fares

fromBucket is intended for cases where you are storing your configuration files, or other needed files, directly in an S3 bucket - it isn't really intended for an artifact (I'm assuming you are getting this artifact from an earlier step in a CodePipeline?) - and as such you have run into the primary drawback of this design: zip files are not the configuration, and fromBucket does not unzip them.
If you have a repo as your base point, and it's part of your pipeline or is where you are running cdk deploy from, you can use fromAsset instead, but this is a bit more convoluted in terms of getting that file there.
The only solution I know of in this situation is to store the file, unzipped, directly in an S3 bucket as part of your pipeline process and then pass that as part of the parameters into your next stack.
I suppose alternatively, if you really have no other choice, you could write a bit of code to grab the zip out of the artifact and the keys for it out of the pipeline event, unzip it in code, and use fromInline instead... but that probably won't work as expected.
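If you go the route of storing the unzipped file in S3 as part of your pipeline, a minimal sketch of that extract-and-re-upload step (assuming a small script step with boto3 available; the bucket, key and file names here are hypothetical) might look like this:

import io
import zipfile

import boto3

s3 = boto3.client("s3")

# Download the zipped artifact, pull the OpenAPI spec out of it, and re-upload
# it as a plain YAML object that ApiDefinition.fromBucket() can point at.
artifact = s3.get_object(Bucket="my-artifact-bucket", Key="artifact.zip")
with zipfile.ZipFile(io.BytesIO(artifact["Body"].read())) as zf:
    spec = zf.read("openapi.yaml")  # the file name inside the zip is an assumption
s3.put_object(Bucket="my-artifact-bucket", Key="openapi.yaml", Body=spec)

The stack can then keep using ApiDefinition.fromBucket, pointed at the plain YAML key instead of the zip.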

Related

How to exclude artifact files after web deployment?

We have a build/release pipeline that is finally working correctly, but the developer asked that we exclude the stage config files (Web.Dev.config, Web.Test.config, Web.Prod.config) as well as the artifact archive itself from the site/wwwroot.
As you can see, every time we deploy, these zip files get stored in the site root as well. They aren't harmful, but they don't look good:
This is the Release App Service Web Deploy YAML:
steps:
- task: AzureRmWebAppDeployment@4
  displayName: 'Azure App Service Deploy: project-123'
  inputs:
    azureSubscription: 'Azure Dev Service Connection'
    WebAppName: 'project-123'
    packageForLinux: '$(System.DefaultWorkingDirectory)/Project123 Dev Build Artifact/Release'
    enableCustomDeployment: true
    enableXmlTransform: true
How do we exclude those files after successful deployment?
Kudu dir structure:
Building on @theWinterCoder's answer: unfortunately, there doesn't appear to be a way to honor the MSDeploySkipRules defined in the csproj file. Instead, files and folders can be skipped by defining the AdditionalArguments parameter of the Azure App Service Deploy (AzureRmWebAppDeployment) task.
Since there doesn't appear to be any official documentation for the -skip rules, and the MSDeploy.exe documentation that Azure Pipelines references is out of date, richard-szalay's 2012 article, "Demystifying MSDeploy skip rules", provides a lot of detail for anyone requiring additional control.
Brief Explanation:
The dirPath objectName refers to the Web Deploy provider used to skip a directory, whilst filePath is used to skip an individual file.
The dirPath starts at wwwroot.
For ASP.NET Core applications, there’s another wwwroot under wwwroot; as such, the absolutePath in that case would look like this: absolutePath=wwwroot\\somefoldername which would map to D:\home\site\wwwroot\wwwroot\somefoldername
Solution:
Therefore, since I'm skipping files, I set the Web Deploy provider to filePath, and since we're not using .NET Core, the absolutePath is set to Web.Dev.config. That maps to D:\home\site\wwwroot\Web.Dev.config.
The same thing applies to the zip artifact; however, if we don't prepend \\ before the wildcard, it fails with the following error:
Error: Error: The regular expression '*.zip' is invalid. Error: parsing "*.zip" - Quantifier {x,y} following nothing. Error count: 1.
-skip:objectName=filePath,absolutePath=Web.Dev.config
-skip:objectName=filePath,absolutePath=Web.Prod.config
-skip:objectName=filePath,absolutePath=Web.Test.config
-skip:objectName=filePath,absolutePath=\\*.zip
or with a regular expression:
-skip:objectName=filePath,absolutePath="Web.Dev.config|Web.Prod.config|Web.Test.config|\\*.zip"
That's it 😃
You can add an additional arguments line to the yml that will tell it to skip certain files. It will look something like this:
AdditionalArguments: '-skip:objectName=dirPath,absolutePath=wwwroot\\Uploads'
More details can be found in this thread

get s3 url path of metaflow artifact

Is there a way to get the full s3 url path of a metaflow artifact, which was stored in a step?
I looked at Metaflow's DataArtifact class but didn't see an obvious s3 path property.
Yep, you can do
Flow('MyFlow')[42]['foo'].task.artifacts.bar._object['location']
where MyFlow is the name of your flow, 42 is the run ID, foo is the step under consideration and bar is the artifact from that step.
Based on @Savin's answer, I've written a helper function to get the S3 URL of an artifact given a Run ID and the artifact's name:
from metaflow import Flow, Metaflow, Run, namespace
from typing import List, Union


def get_artifact_s3url_from_run(
    run: Union[str, Run], name: str, legacy_names: List[str] = [], missing_ok: bool = False
) -> str:
    """
    Given a Metaflow Run and a key, scans the run's tasks and returns the S3 URL of the artifact with that key.

    NOTE: use get_artifact_from_run() if you want the artifact itself, not the S3 URL to the artifact.

    This allows us to find data artifacts even in flows that did not finish. If we change the name of an artifact,
    we can support backwards compatibility by also passing in the legacy keys. Note: we can avoid this by resuming a
    specific run and adding a node which re-maps the artifact to another key. This will assign the run a new ID.

    Args:
        missing_ok: whether to allow an artifact to be missing
        name: name of the attribute to look for in task.data
        run: a metaflow.Run() object, or a run ID
        legacy_names: backup names to check

    Returns:
        the S3 URL of the artifact

    Raises:
        DataArtifactNotFoundError: if the artifact is not found and missing_ok=False
        ValueError: if the Flow is not found
        ValueError: if the Flow is found but the run ID is not
    """
    namespace(None)  # allows us to access all runs in all namespaces
    names_to_check = [name] + legacy_names

    if isinstance(run, str):
        try:
            run = Run(run)
        except Exception as e:
            # run ID not found; check whether the flow exists so we can list its runs
            flow = run.split(sep="/")[0]
            try:
                flow = Flow(flow)
            except Exception as e2:
                raise ValueError(f"Could not find flow {flow}. Available flows: {Metaflow().flows}") from e2
            # the flow exists, so the run ID must be wrong
            raise ValueError(f"Could not find run ID {run}. Possible values: {flow.runs()}") from e

    for name_ in names_to_check:
        for step_ in run:
            for task in step_:
                print(f"task {task} artifacts: {task.artifacts} \n \n")
                if task.artifacts is not None and name_ in task.artifacts:
                    # https://stackoverflow.com/a/66361249/4212158
                    return getattr(task.artifacts, name_)._object["location"]

    if not missing_ok:
        # DataArtifactNotFoundError is a custom exception defined elsewhere in this codebase
        raise DataArtifactNotFoundError(
            f"No data artifact with name {name} found in {run}. Also checked legacy names: {legacy_names}"
        )
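Example usage (the flow name, run ID and artifact names here are just placeholders):

url = get_artifact_s3url_from_run("MyFlow/42", name="bar", legacy_names=["bar_legacy"])
print(url)  # e.g. s3://<your-metaflow-datastore-bucket>/MyFlow/...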

automate uploading of glue script

We are currently using CloudFormation to create a Glue job (via CodeBuild and CodePipeline). The one thing we are stuck on is how to automate the code that goes into the Glue job.
Our current relevant piece of the cloudformation template looks like this:
MyJob:
  Type: AWS::Glue::Job
  Properties:
    Command:
      Name: glueetl
      ScriptLocation: "s3://aws-glue-scripts//your-script-file.py"
    DefaultArguments:
      "--job-bookmark-option": "job-bookmark-enable"
    ExecutionProperty:
      MaxConcurrentRuns: 2
    MaxRetries: 0
    Name: cf-job1
    Role: !Ref MyJobRole
The problem is the "ScriptLocation". It looks like it is required to be an S3 location. How can we automate the upload of this? The code is in a .py file in our Git repository, and I assume it is uploaded to the artifact repository as part of the CodeBuild process, but how do we access it?
Would like to hear how others are doing this. Thanks!
EDIT: I was able to find a similar Stack Overflow post: AWS Glue automatic job creation. But the answers there don't really give a solution or understand the question posed.
I've written a tool to handle the upload of stack dependencies, including CloudFormation nested templates and non-inline Lambda functions.
Currently AWS Glue is not handled, since I haven't tried it in any project yet, but it should be easy to extend the tool to support Glue.
The dependencies are defined in a separate config file, and a piece of code within the tool is responsible for handling the config. Here are the sample configs:
Nested CloudFormation templates:
# DEPENDS=( <ParameterName>=<NestedTemplate> )
#
# Required: Yes if has nested template, otherwise No
# Default: None
# Syntax:
#   <ParameterName>: The name of template parameter that is referred at the
#                    value of nested template property `TemplateURL`.
#   <NestedTemplate>: A local path or a S3 URL starting with `s3://` or
#                     `https://` pointing to the nested template.
#                     The nested templates at local is going to be uploaded
#                     to S3 Bucket automatically during the deployment.
# Description:
#   Double quote the pairs which contain whitespaces or special characters.
#   Use `#` to comment out.
# ---
# Example:
#   DEPENDS=(
#     NestedTemplateFooURL=/path/to/nested/foo/stack.json
#     NestedTemplateBarURL=/path/to/nested/bar/stack.json
#   )
Lambda functions:
# LAMBDA=( <S3BucketParameterName>:<S3KeyParameterName>=<LambdaFunction> )
#
# Required: Yes if has None-inline Lambda Function, otherwise No
# Default: None
# Syntax:
#   <S3BucketParameterName>: The name of template parameter that is referred
#                            at the value of Lambda property `Code.S3Bucket`.
#   <S3KeyParameterName>: The name of template parameter that is referred
#                         at the value of Lambda property `Code.S3Key`.
#   <LambdaFunction>: A local path or a S3 URL starting with `s3://` pointing
#                     to the Lambda Function.
#                     The Lambda Functions at local is going to be zipped and
#                     uploaded to S3 Bucket automatically during the deployment.
# Description:
#   Double quote the pairs which contain whitespaces or special characters.
#   Use `#` to comment out.
# ---
# Example:
#   LAMBDA=(
#     S3BucketForLambdaFoo:S3KeyForLambdaFoo=/path/to/LambdaFoo.py
#     S3BucketForLambdaBar:S3KeyForLambdaBar=s3://mybucket/LambdaBar.py
#   )
The tool is written in Bash and comes in two parts:
xsh: It works as a bash library framework.
xsh-lib/aws: It's a library of xsh.
The code you may need to expand is located in xsh-lib/aws/functions/cfn/deploy.sh.
The example deploy command looks like:
$ xsh aws/cfn/deploy -C /path/to/your/template-and-config-dir -t stack.json -c sample.conf
I'm considering abstracting the dependencies, such as CloudFormation templates, Lambda functions and Glue scripts, into a single interface for both configs and handlers.
This will make it easier to add new dependency handlers to the deployer.
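For reference, the core step being automated here - uploading the Glue script to S3 during the build and passing its location into the stack - can be sketched with boto3; the bucket, key and file names below are hypothetical:

import boto3

def upload_glue_script(local_path: str, bucket: str, key: str) -> str:
    """Upload the Glue job script to S3 and return the s3:// URL for ScriptLocation."""
    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"

# e.g. run in a CodeBuild step; the returned URL can then be passed as a
# CloudFormation parameter and referenced by the job's ScriptLocation.
script_location = upload_glue_script("glue/your-script-file.py", "my-glue-scripts", "jobs/your-script-file.py")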

Replace values in multiple appsettings files stored in different artifacts during VSTS Release pipeline

I am trying to replace appsettings.json and e2e-appsettings.json variables stored in two different artifacts in the release pipeline.
As per the configuration below, ONLY appsettings.json is updating, but the second line gives an error:
error: NO JSON file matched with specific pattern: e2e/XXX.EndToEnd.Integration.Tests/e2e-appsettings.json
As per the documentation, it should be the path relative to the root, but in this case I'm not sure what the root is, as there are two build artifacts.
Further, the download artifact log says:
2019-04-09T02:33:55.9132583Z Downloaded e2e/XXX.EndToEnd.Integration.Tests/e2e-appsettings.json to D:\a\r1\a\EstimationCore\e2e\XXX.EndToEnd.Integration.Tests\e2e-appsettings.json
The artifact with the other appsettings.json file, which is working fine, is a zip file. Log: Downloading app/app.zip to D:\a\r1\a\EstimationCore\app\app.zip
I have already tried the following patterns, which gave the same error:
- NO JSON file matched with specific pattern: e2e/XXX.EndToEnd.Integration.Tests/e2e-appsettings.json.
- NO JSON file matched with specific pattern: **/*e2e-appsettings.json
- NO JSON file matched with specific pattern: d:\a\r1\a\EstimationCore\e2e\XXX.EndToEnd.Integration.Tests\e2e-appsettings.json.
- NO JSON file matched with specific pattern: d:\a\r1\a\EstimationCore\**\**\e2e-appsettings.json.

Terraform - Pass in Variable to "Source" Parameter

I'm using Terraform in a modular fashion in order to build out my infrastructure. I do this by having a configuration file that calls in the different modules. I want to pass an infrastructure variable which picks which tagged version of the GitHub repository the application should be built from. Most importantly, I'm trying to figure out how to do string concatenation in the "source" parameter of the configuration file.
module "athenaelb" {
source = "${concat("git::https://github.com/ORG/REPONAME.git?ref=",var.infra_version)}"
aws_access_key = "${var.aws_access_key}"
aws_secret_key = "${var.aws_secret_key}"
aws_region = "${var.aws_region}"
availability_zones = "${var.availability_zones}"
subnet_id = "${var.subnet_id}"
security_group = "${var.athenaelb_security_group}"
branch_name = "${var.branch_name}"
env = "${var.env}"
sns_topic = "${var.sns_topic}"
s3_bucket = "${var.elb_s3_bucket}"
athena_elb_sns_topic = "${var.athena_elb_sns_topic}"
infra_version = "${var.infra_version}"
}
I want it to compile and for the source to look like this (for example): git::https://github.com/ORG/REPONAME.git?ref=v1
Anyone have any thoughts on how to make this work?
Thanks,
Keren
This is not possible currently in Terraform itself.
The only way to achieve something like this is to use a separate script to interact with the git repository that Terraform clones into a subdirectory of the .terraform/modules directory and switch it to a different tag depending on which version you need. This is non-ideal since Terraform organizes these into directories based on a hash of the module path, but if you can identify the module in question it is safe to run git checkout within these repositories as long as you do not run terraform get again afterwards.
For more details and discussion on this issue, see issue #1439 in Terraform's issue tracker, where this feature was requested.
You could use envsubst or Python Jinja, and call these wrapper scripts in your pipeline's deploy script to build the Terraform files from .envsubst and .jinja templates before your terraform plan/apply:
https://github.com/uvoo/process-templates/tree/main/scripts
I wish Terraform would support this, but my guess is they never will, so just add some simple functions/files to your deploy scripts, which is usually the best way to deploy.
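As a rough sketch of the Jinja approach (assuming the jinja2 package is installed; the template file name main.tf.jinja is made up):

from pathlib import Path

from jinja2 import Template

# Render main.tf from a Jinja template before running `terraform plan/apply`.
# The template would contain, for example:
#   source = "git::https://github.com/ORG/REPONAME.git?ref={{ infra_version }}"
template = Template(Path("main.tf.jinja").read_text())
Path("main.tf").write_text(template.render(infra_version="v1"))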