How to Create Dynamodb Global Secondary Index using AWS CLI? - command-line

The AWS CLI for Dynamodb create-table is a little bit confusion when it comes to create global secondary index. In the CLI document, it says global secondary index could be expressed with the following expression (shorthand):
IndexName=string,KeySchema=[{AttributeName=string,KeyType=string},{AttributeName=string,KeyType=string}],Projection={ProjectionType=string,NonKeyAttributes=[string,string]},ProvisionedThroughput={ReadCapacityUnits=long,WriteCapacityUnits=long} ...
My interpretation is, I should do
--global-secondary-indexes IndexName=requesterIndex,Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}
Note that I am not including KeySchema here to deduce complexity. The console gives me the following error:
Parameter validation failed:
Missing required parameter in GlobalSecondaryIndexes[0]: "KeySchema"
Unknown parameter in GlobalSecondaryIndexes[0]: "WriteCapacityUnits", must be one of: IndexName, KeySchema, Projection, ProvisionedThroughput
Invalid type for parameter GlobalSecondaryIndexes[0].ProvisionedThroughput, value: ReadCapacityUnits=1, type: <class 'str'>, valid types: <class 'dict'>
So somehow AWS CLI does not recognize the map expression for ProvisionedThroughput. I tried several ways to express it and could not make it work. I also failed to find any web page in Google describing how to do it.

This is the cli call I used to create the Reply sample in the aws documentation from the command line. The $EP i used at the end can be set in the environment to EP="--endpoint-url http://localhost:8000" to create the table on your local dynamodb instead of aws.
aws dynamodb create-table --table-name Reply --attribute-definitions \
AttributeName=Id,AttributeType=S AttributeName=ReplyDateTime,AttributeType=S \
AttributeName=PostedBy,AttributeType=S AttributeName=Message,AttributeType=S \
--key-schema AttributeName=Id,KeyType=HASH \
AttributeName=ReplyDateTime,KeyType=RANGE --global-secondary-indexes \
IndexName=PostedBy-Message-Index,KeySchema=["\
{AttributeName=PostedBy,KeyType=HASH}","\
{AttributeName=Message,KeyType=RANGE}"],Projection="{ProjectionType=INCLUDE \
,NonKeyAttributes=["ReplyDateTime"]}",ProvisionedThroughput="\
{ReadCapacityUnits=10,WriteCapacityUnits=10}" --provisioned-throughput \
ReadCapacityUnits=5,WriteCapacityUnits=4 $EP

Read through AWS CLI source code on Github, it could parse double quote content. So adding double quote in the script solved the issue. There is the new code -
--global-secondary-indexes IndexName=requesterIndex,Projection={ProjectionType=ALL},ProvisionedThroughput="{ReadCapacityUnits=${CURRENT_READUNIT},WriteCapacityUnits=${CURRENT_WRITEUNIT}}"

Define the table structure in a JSON file, including the index structures. Use following to create a template structure.
aws dynamodb create-table --generate-cli-skeleton
Run the cli command with the table definition input json
aws dynamodb create-table --cli-input-json file://path-to-yourtable-definition.json

Related

Error in Google Cloud Shell Commands while working on the lab (Securing Google Cloud with CFT Scorecard)

I am working in a GCP lab (Securing Google Cloud with CFT Scorecard). All instructions for the lab are given.
First I have to run the following two commands to set environment variables
export GOOGLE_PROJECT=$DEVSHELL_PROJECT_ID
export CAI_BUCKET_NAME=cai-$GOOGLE_PROJECT
In the second command given above I don't know what to replace with my own credentials? May be that is the reason I am getting error.
Now I have to enable the "cloudasset.googleapis.com" gcloud service. For this they gave the following command.
gcloud services enable cloudasset.googleapis.com \
--project $GOOGLE_PROJECT
Error for this is given in the screeshot attached herewith:
Error in the serviec enabling command
Next step is to clone the policy: The given command for that is:
git clone https://github.com/forseti-security/policy-library.git
After that they said: "You realize Policy Library enforces policies that are located in the policy-library/policies/constraints folder, in which case you can copy a sample policy from the samples directory into the constraints directory".
and gave this command:
cp policy-library/samples/storage_blacklist_public.yaml policy-library/policies/constraints/
On running this command I received this:
error on running the directory command
Finally they said "Create the bucket that will hold the data that Cloud Asset Inventory (CAI) will export" and gave the following command:
gsutil mb -l us-central1 -p $GOOGLE_PROJECT gs://$CAI_BUCKET_NAME
I am confused in where to replace my own credentials like in the place of project_Id I wrote my own project id.
Also I don't know these errors are ocurring. Kindly help me.
I'm unable to access the tutorial.
What happens if you run the following:
echo ${DEVSHELL_PROJECT_ID}
I suspect you'll get an empty result because I think this environment variable isn't actually set.
I think it should be:
echo ${DEVSHELL_GCLOUD_CONFIG}
Does that return a result?
If so, perhaps try using that variable instead:
export GOOGLE_PROJECT=${DEVSHELL_GCLOUD_CONFIG}
export CAI_BUCKET_NAME=cai-${GOOGLE_PROJECT}
It's not entirely clear to me why this tutorial is using this approach but, if the above works, it may get you further along.
We're you asked to create a Google Cloud Platform project?
As per the shared error, this seems to be because your env variable GOOGLE_PROJECT is not set. You can verify it by using echo $GOOGLE_PROJECT and seeing whether it returns the project ID or not. You could also use echo $DEVSHELL_PROJECT_ID. If that returns the project ID and the former doesn't, it means that you didn't export the variable as stated at the beginning.
If the problem is that GOOGLE_PROJECT doesn't have any value, there are different approaches on how to solve it.
Set the env variable as you explained at the beginning. Obviously this will only work if the variable DEVSHELL_PROJECT_ID is also set.
export GOOGLE_PROJECT=$DEVSHELL_PROJECT_ID
Manually set the project ID into that variable. This is far from ideal because in Qwiklabs they create a new temporal project on every lab, so this would've only worked if you were still on that project. The project ID can be seen on both of your shared screenshots.
export GOOGLE_PROJECT=qwiklabs-gcp-03-c6e1787dc09e
Avoid using the argument --project. According to the documentation, the aforementioned argument is optional and if none is used the command will take the one by default, which will be on the configuration settings. You can get the current project by using this:
gcloud config get-value project
If the previous command matches the project ID you want to use, you can simply issue the following command:
gcloud services enable cloudasset.googleapis.com
Notice that the project ID is not being explicitly mentioned using --project.
Regarding your issue with the GitHub file, I have checked the repository and the file storage_blacklist_public.yaml doesn't seem to be in the directory policy-library/samples. There seems to be a trace that it was once there, but it isn't anymore, they should probably update the lab as it isn't anymore.
About your credentials confusion, you don't have to use your own project ID, just the one given on your lab. If I recall properly all the needed data should be on the left side of the lab. Still, you shouldn't need to authenticate in a normal situation as you are already logged in your temporal project if you are accessing it form the Cloud Shell, which is where you should be doing all this.
Adding this for the later versions
in the gcloud shell you can set a temp variable for the current project id with
PROJECT_ID="$(gcloud config get-value project)"
then use like
--project ${PROJECT_ID}

Grant permissions to the service account error

Following step by step the example at,
https://cloud.google.com/text-to-speech/docs/reference/libraries#client-libraries-usage-python
on Step2, I get the following error.
D:\google-api-python\py>gcloud projects add-iam-policy-binding [my-project-for-tts] --member "serviceAccount:[tts-python-1]#[my-project-for-tts].iam.gserviceaccount.com" --role "roles/owner"
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Request contains an invalid argument.
What argument seems to be invalid here?
My intention is to use the Text-to-Speech services by loading a custom input text file and save the output in my local disk. I installed successfully google-sdk and python.
It seems like the square brackets are the invalid argument here. Using my own service account and running your command I get the same error, but removing the the [] runs fine.
If you remove the [] then the command will work fine.
As per the Google Documentation, you need = to provide member and role value. Your command should look like below:
gcloud projects add-iam-policy-binding my-project-for-tts --member='serviceAccount:tts-python-1#my-project-for-tts.iam.gserviceaccount.com' --role='roles/owner'
Also, make sure you are using the latest gcloud package.

Seems to conflict between Serverless syntax and CloudFormation syntax

The below is the part of CloudForamtion file loaded by Serverless.
# resource.yml
.
.
.
{"Fn::Sub": "arn:aws:sqs:*:${AWS::AccountId}:sqs-spoon-*-${env:SERVICE}"}
# serverless.yml
.
.
resources:
- ${file:resource.yml}
${AWS::AccountId} is CloudFormation Pseudo Parameter and ${env:SERVICE} is Serverless variable.
When I run sls deploy, it returns the error.
Invalid variable reference syntax for variable AWS::AccountId. You can only reference env vars, options, & files. You can check our docs for more info.
It seems to say that Serverless recognize ${AWS::AccountId} as Serverless variable, not as CloudFormation Pseudo Parameter.
Right?
If so, how to have Serverless not to parse Pseudo Parameter so that it will be parsed by CloudFormation later?
I can solve it with the plugin.
With the plugin, It cloud be solved by replacing ${AWS::AccountId} with #{AWS::AccountId}.
{"Fn::Sub": "arn:aws:sqs:*:#{AWS::AccountId}:sqs-spoon-*-${env:SERVICE}"}
You can accomplish support for the native AWS syntax with a single config line in serverless.yml to define the variableSyntax. Details can be found here https://github.com/serverless/serverless/pull/3694.
provider:
name: aws
runtime: nodejs8.10
variableSyntax: "\${((env|self|opt|file|cf|s3)[:\(][ :a-zA-Z0-9._,\-\/\(\)]*?)}"

EMR Spark Fails to Save Dataframe to S3

I am using the RunJobFlow command to spin up a Spark EMR cluster. This command sets the JobFlowRole to an IAM Role which has the policies AmazonElasticMapReduceforEC2Role and AmazonRedshiftReadOnlyAccess. The first policy contains an action to allow all s3 permissions.
When the EC2 instances spin up, they assume this IAM role, and generate temporary credentials via STS.
The first thing which I do is read a table from my Redshift cluster into a Spark Dataframe using the com.databricks.spark.redshift format and using the same IAM Role to unload the data from redshift as I did for the EMR JobFlowRole.
So far as I understand, this runs an UNLOAD command on Redshift to dump into the S3 bucket I specify. Spark then loads the newly unloaded data into a Dataframe. I use the recommended s3n:// protocol for the tempdir option.
This command works great, and it always successfully loads the data into the Dataframe.
I then run some transformations and attempt to save the dataframe in the csv format to the same S3 bucket Redshift Unloaded into.
However, when I try to do this, it throws the following error
java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively)
Okay. So I don't know why this happens, but I tried to hack around it by setting the recommended hadoop configuration parameters. I then used DefaultAWSCredentialsProviderChain to load the AWSAccessKeyID and AWSSecretKey and set via
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", <CREDENTIALS_ACCESS_KEY>)
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", <CREDENTIALS_SECRET_ACCESS_KEY>)
When I run it again it throws the following error:
java.io.IOException: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId;
Okay. So that didn't work. I then removed setting the hadoop configurations and hardcoded an IAM user's credentials in the s3 url via s3n://ACCESS_KEY:SECRET_KEY#BUCKET/KEY
When I ran this it spit out the following error:
java.lang.IllegalArgumentException: Bucket name should be between 3 and 63 characters long
So it tried to create a bucket.. which is definitely not what we want it to do.
I am really stuck on this one and would really appreciate any help here! It works fine when I run it locally, but completely fails on EMR.
The problem was the following:
EC2 Instance Generated Temporary Credentials on EMR Bootstrap Phase
When I queried Redshift, I passed the aws_iam_role to theDatabricks driver. The driver then re-generated temporary credentials for that same IAM role. This invalidated the credentials the EC2 instance generated.
I then tried to upload to S3 using the old credentials (and the credentials which were stored in the instance's metadata)
It failed because it was trying to use out-of-date credentials.
The solution was to remove redshift authorization via aws_iam_role and replace it with the following:
val credentials = EC2MetadataUtils.getIAMSecurityCredentials
...
.option("temporary_aws_access_key_id", credentials.get(IAM_ROLE).accessKeyId)
.option("temporary_aws_secret_access_key", credentials.get(IAM_ROLE).secretAccessKey)
.option("temporary_aws_session_token", credentials.get(IAM_ROLE).token)
On amazon EMR, try usong the prefix s3:// to refer to an object in S3.
It's a long story.

Comma separator in Lambda function environment settings using the Serverless Framework

I am trying to add a MongoDB cluster as part of a Serverless deployment, but I can't set the environment variable.
Here is part of the serverless.yml file:
service: serverless-test
plugins:
- serverless-offline
provider:
name: aws
runtime: nodejs4.3
environment:
MONGO_URI: "mongodb://mongo-6:27000,mongo-7:27000,mongo-8:27000/db-dev?replicaSet=mongo"
How do I pass the MONGO_URI to contain the cluster as a comma separated value?
Any advise is much appreciated.
Unfortunately, you can't use commas in Lambda environment variables. This is an AWS limitation and not a Serverless issue.
For example, browse the AWS console and try to add a environment variable that contains a comma:
When you save, you will get the following error:
1 validation error detected: Value at 'environment.variables' failed to satisfy constraint: Map value must satisfy constraint: [Member must satisfy regular expression pattern: [^,]*]
The error message says that the regex [^,]* must be satisfied and what this small regex explicitly says is to not (^) accept the comma (,). Any other char is acceptable.
I don't know why they don't accept the comma and this is not explained in their documentation, but at least their error message shows that it is intentional.
As a workaround, you can replace your commas by another symbol (like #) to create the env var and replace it back to comma after reading the variable, or you will need to create multiple env vars to store the endpoints.