Here is a template that works correctly and saves a named query to Athena. But how do I save more than one query in a single template?
{
"Resources": {
"AthenaNamedQuery": {
"Type": "AWS::Athena::NamedQuery",
"Properties": {
"Database": "swfnetadata",
"Description": "A query that selects all aggregated data",
"Name": "MostExpensiveWorkflow",
"QueryString": "SELECT workflowname, AVG(activitytaskstarted) AS AverageWorkflow FROM swfmetadata WHERE year='17' AND GROUP BY workflowname ORDER BY AverageWorkflow DESC LIMIT 10"
}
}
}
}
Just stick another resource in the template:
{
"Resources": {
"AthenaNamedQuery": {
"Type": "AWS::Athena::NamedQuery",
"Properties": {
"Database": "swfnetadata",
"Description": "A query that selects all aggregated data",
"Name": "MostExpensiveWorkflow",
"QueryString": "SELECT workflowname, AVG(activitytaskstarted) AS AverageWorkflow FROM swfmetadata WHERE year='17' AND GROUP BY workflowname ORDER BY AverageWorkflow DESC LIMIT 10"
}
},
"AnotherAthenaNamedQuery": {
"Type": "AWS::Athena::NamedQuery",
"Properties": {
"Database": "swfnetadata",
"Description": "Another query",
"Name": "AnotherQuery",
"QueryString": "SELECT 1"
}
}
}
}
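If you want to sanity-check the result after the stack is deployed, a minimal Python sketch using boto3 (assuming credentials and a default region are already configured) can list the named queries that were created:
import boto3

# List every named query in this account/region and print its name and database.
# (Pagination via NextToken is omitted for brevity.)
athena = boto3.client("athena")
for query_id in athena.list_named_queries()["NamedQueryIds"]:
    named_query = athena.get_named_query(NamedQueryId=query_id)["NamedQuery"]
    print(named_query["Name"], "->", named_query["Database"])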
I have the following output from a Web activity:
{
"value": [
{
"id": "/subscriptions/xy_csv",
"name": "xy_csv",
"type": "Microsoft.code",
"etag": "6200",
"properties": {
"folder": {
"name": "samplecodes"
},
"content": {
"query": "select * from table 1",
"metadata": {
"language": "sql"
},
"currentConnection": {
"databaseName": "demo",
"poolName": "Built-in"
},
"resultLimit": 5000
},
"type": "SqlQuery"
}
},
{
"id": "/subscriptions/ab_csv",
"name": "ab_csv",
"type": "Microsoft.code",
"etag": "6200",
"properties": {
"folder": {
"name": "livecode"
},
"content": {
"query": "select * from table 2",
"metadata": {
"language": "sql"
},
"currentConnection": {
"databaseName": "demo",
"poolName": "Built-in"
},
"resultLimit": 5000
},
"type": "SqlQuery"
}
}
]
}
I would like to create a Filter activity after the Web activity, just to filter down to the items that are saved under the folder name "livecode".
On the Filter activity's Items field I have: #activity('Web1').output.value
On the Condition field I have: #startswith(item().properties.folder.name,'livecode')
The Web activity succeeds, but the Filter activity fails with this error:
{
"errorCode": "InvalidTemplate",
"message": "The execution of template action 'FilterFilter1' failed: The evaluation of 'query' action 'where' expression '#startswith(item().properties.folder.name,'sql')' failed: 'The expression 'startswith(item().properties.folder.name,'sql')' cannot be evaluated because property 'folder' doesn't exist, available properties are 'content, type'.",
"failureType": "UserError",
"target": "Filter1",
"details": ""
}
It feels like I am going wrong in how I have written the condition's dynamic content to navigate to properties.folder.name. I am not sure what is missing in my condition. Can anyone help? Thanks, much appreciated.
The error occurs because the Web activity output's properties object might not contain the folder key in every item.
I took the following JSON and got the same error:
{
"value":[
{
"id":"/subscriptions/xy_csv",
"name":"xy_csv",
"type":"Microsoft.code",
"etag":"6200",
"properties":{
"content":{
"query":"select * from table 1",
"metadata":{
"language":"sql"
},
"currentConnection":{
"databaseName":"demo",
"poolName":"Built-in"
},
"resultLimit":5000
},
"type":"SqlQuery"
}
},
{
"id":"/subscriptions/ab_csv",
"name":"ab_csv",
"type":"Microsoft.code",
"etag":"6200",
"properties":{
"folder":{
"name":"livecode"
},
"content":{
"query":"select * from table 2",
"metadata":{
"language":"sql"
},
"currentConnection":{
"databaseName":"demo",
"poolName":"Built-in"
},
"resultLimit":5000
},
"type":"SqlQuery"
}
}
]
}
So, you have to modify the filter condition to first check whether the item contains a folder key, using the following dynamic content. I took your Web activity output as a parameter value and extracted the folder key from the properties object:
#startswith(if(contains(item().properties,'folder'),item().properties.folder.name,''),'livecode')
When I debug the pipeline, I get the desired result.
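For what it's worth, the null-safe logic of that expression can be illustrated with a small Python sketch over a trimmed-down copy of the Web activity output (sample data only); the empty-string fallback is what keeps startswith from failing on items without a folder key:
# Trimmed sample of the Web activity output; 'folder' is missing on the first item.
items = [
    {"name": "xy_csv", "properties": {"content": {"query": "select * from table 1"}}},
    {"name": "ab_csv", "properties": {"folder": {"name": "livecode"},
                                      "content": {"query": "select * from table 2"}}},
]

# Equivalent of: startswith(if(contains(item().properties,'folder'), item().properties.folder.name, ''), 'livecode')
filtered = [
    item for item in items
    if item["properties"].get("folder", {}).get("name", "").startswith("livecode")
]
print([item["name"] for item in filtered])  # ['ab_csv']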
I am able to get the fields related to an object by doing a GET request:
https://[...].suitetalk.api.netsuite.com/services/rest/record/v1/metadata-catalog/customer
and sending the header Accept: application/schema+json.
In the response I get a custom field called "custentity_companypublicprivate", which in the UI is a dropdown, and the values in this dropdown come from a custom list.
Manually, I was able to find the related custom list: "customlist_compnaypublicprivate"
But I need to get this relationship through code, since I need to do this for every field backed by a custom list, and in the response I cannot find any information that tells me which custom list is related.
This is the definition of the object:
"custentity_companypublicprivate": {
"type": "object",
"properties": {
"id": {
"title": "Internal identifier",
"type": "string"
},
"refName": {
"title": "Reference Name",
"type": "string"
},
"externalId": {
"title": "External identifier",
"type": "string"
},
"links": {
"title": "Links",
"type": "array",
"readOnly": true,
"items": {
"$ref": "/services/rest/record/v1/metadata-catalog/nsLink"
}
}
}
}
Is there a way to do this? Thanks in advance for your help.
Another option is to query using SuiteQL.
https://#########.suitetalk.api.netsuite.com/services/rest/query/v1/suiteql?limit=1000&offset=0
Query:
{
"q": "SELECT * FROM CustomField where fieldtype = 'ENTITY'"
}
It returns the internal ID of the related list/record in the fieldvaluetyperecord column.
Other queries:
{
"q": "SELECT * FROM CustomRecordType"
}
{
"q": "SELECT * FROM CustomList"
}
{
"q": "SELECT * FROM customlist_compnaypublicprivate"
}
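If you need to run these SuiteQL queries from code, a rough Python sketch looks like the one below. The account ID, the token-based OAuth 1.0 credentials, and the Prefer: transient header reflect my own setup and are assumptions, so adapt them to however you authenticate:
import requests
from requests_oauthlib import OAuth1  # pip install requests requests-oauthlib

ACCOUNT = "1234567"  # hypothetical NetSuite account ID
url = f"https://{ACCOUNT}.suitetalk.api.netsuite.com/services/rest/query/v1/suiteql?limit=1000&offset=0"

# Token-based authentication; NetSuite expects HMAC-SHA256 signatures.
auth = OAuth1("consumer_key", "consumer_secret", "token_id", "token_secret",
              realm=ACCOUNT, signature_method="HMAC-SHA256")

payload = {"q": "SELECT * FROM CustomField WHERE fieldtype = 'ENTITY'"}
response = requests.post(url, json=payload, auth=auth,
                         headers={"Prefer": "transient", "Content-Type": "application/json"})
response.raise_for_status()

# fieldvaluetyperecord holds the internal id of the backing list/record.
for row in response.json().get("items", []):
    print(row.get("scriptid"), "->", row.get("fieldvaluetyperecord"))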
Hi all, I'm trying to put together a script to create multiple partitions within a tabular data model. I can do one at a time, but multiples seem to error with the following message.
Unrecognized JSON property: partitions. Check path 'create.partitions'
I'm using the following (anonymised) generated script.
{
"create": {
"parentObject": {
"database": "MY_TABULAR",
"table": "MY_TABLE"
},
"partitions": [{
"name": "MY_TABLE 12 2018-09",
"source": {
"query": "SELECT * FROM [Fact].[MY_TABLE] WHERE PlanKey = 12 AND dateKey BETWEEN 20180901 AND 20180930",
"dataSource": "MY_DW"
}
},
{
"name": "MY_TABLE 12 2018-10",
"source": {
"query": "SELECT * FROM [Fact].[MY_TABLE] WHERE PlanKey = 12 AND dateKey BETWEEN 20181001 AND 20181031",
"dataSource": "MY_DW"
}
},
{
"name": "MY_TABLE 12 2018-11",
"source": {
"query": "SELECT * FROM [Fact].[MY_TABLE] WHERE PlanKey = 12 AND dateKey BETWEEN 20181101 AND 20181130",
"dataSource": "MY_DW"
}
}]
}
}
As far as I can tell from looking at the reference documentation this is correct, but SSMS doesn't appear to like it.
You can do this by using the Sequence command to execute multiple createOrReplace commands that create the partitions. The Sequence command does have an optional maxParallelism property; however, only refresh operations run in parallel (per MSDN). The example below details this further.
{
"sequence":
{
"operations": [
{
"createOrReplace": {
"object": {
"database": "YourTabularDatabase",
"table": "YourTable",
"partition": "Partition 1"
},
"partition": {
"name": "Partition 1",
"dataView": "full",
"source": {
"query": "SELECT * FROM [dbo].[SourceTable] where DateKey < 20180901",
"dataSource": "YourDataSource"
}
}
}
},
{
"createOrReplace": {
"object": {
"database": "YourTabularDatabase",
"table": "YourTable",
"partition": "Partition 2"
},
"partition": {
"name": "Partition 2,
"source": {
"query": "SELECT * FROM [dbo].[SourceTable] where DateKey >= 20180901",
"dataSource": "YourDataSource"
}
}
}
}]
}
}
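Since the partition definitions are generated anyway, you can also generate the whole sequence command in one go. This is just a sketch that mirrors the structure above, reusing the database, table, and query pattern from the question as placeholders:
import json

# One (label, start dateKey, end dateKey) tuple per monthly partition.
months = [("2018-09", 20180901, 20180930),
          ("2018-10", 20181001, 20181031),
          ("2018-11", 20181101, 20181130)]

operations = []
for label, start_key, end_key in months:
    name = f"MY_TABLE 12 {label}"
    operations.append({
        "createOrReplace": {
            "object": {"database": "MY_TABULAR", "table": "MY_TABLE", "partition": name},
            "partition": {
                "name": name,
                "source": {
                    "query": (f"SELECT * FROM [Fact].[MY_TABLE] "
                              f"WHERE PlanKey = 12 AND dateKey BETWEEN {start_key} AND {end_key}"),
                    "dataSource": "MY_DW"
                }
            }
        }
    })

# Paste the printed TMSL into an XMLA query window in SSMS and execute it.
print(json.dumps({"sequence": {"operations": operations}}, indent=2))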
I am trying to familiarize myself with the Reindex API of Elasticsearch and the use of Painless scripts.
I have the following model:
"mappings": {
"customer": {
"properties": {
"firstName": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"lastName": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"dateOfBirth": {
"type": "date"
}
}
}
}
I would like to reindex all documents from test-v1 to test-v2, apply a few transformations to them (for example, extract the year part of dateOfBirth, convert a date value to a timestamp, etc.), and save the result as a new field. But I ran into an issue when I tried to access the date field.
When I made the following call, I got an error:
POST /_reindex?pretty=true&human=true&wait_for_completion=true HTTP/1.1
Host: localhost:9200
Content-Type: application/json
{
"source": {
"index": "test-v1"
},
"dest": {
"index": "test-v2"
},
"script": {
"lang": "painless",
"inline": "ctx._source.yearOfBirth = ctx._source.dateOfBirth.getYear();"
}
}
And the response:
{
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"ctx._source.yearOfBirth = ctx._source.dateOfBirth.getYear();",
" ^---- HERE"
],
"script": "ctx._source.yearOfBirth = ctx._source.dateOfBirth.getYear();",
"lang": "painless"
}
],
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"ctx._source.yearOfBirth = ctx._source.dateOfBirth.getYear();",
" ^---- HERE"
],
"script": "ctx._source.yearOfBirth = ctx._source.dateOfBirth.getYear();",
"lang": "painless",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Unable to find dynamic method [getYear] with [0] arguments for class [java.lang.String]."
}
},
"status": 500
}
According to this tutorial, date fields are exposed as ReadableDateTime, so they support methods like getYear and getDayOfWeek, and indeed the reference lists those as supported methods.
Still, the response mentions [java.lang.String] as the type of the dateOfBirth property. I could just parse it into, e.g., an OffsetDateTime, but I wonder why it is a string.
Does anyone have a suggestion as to what I'm doing wrong?
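For now, the parsing workaround I mentioned can be sketched roughly like this (a Python call via requests against localhost:9200, assuming dateOfBirth is stored as an ISO yyyy-MM-dd string in _source); it simply avoids calling date methods on the String, but I would still like to understand why the field is a string:
import requests

reindex_body = {
    "source": {"index": "test-v1"},
    "dest": {"index": "test-v2"},
    "script": {
        "lang": "painless",
        # Take the year from the raw string instead of calling getYear() on it.
        "inline": ("if (ctx._source.dateOfBirth != null) {"
                   " ctx._source.yearOfBirth ="
                   " Integer.parseInt(ctx._source.dateOfBirth.substring(0, 4)); }")
    }
}

resp = requests.post("http://localhost:9200/_reindex?pretty=true&wait_for_completion=true",
                     json=reindex_body)
print(resp.json())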
I am trying to execute an unload command on Redshift via Data Pipeline. The script looks something like:
unload ($$ SELECT *, count(*) FROM (SELECT APP_ID, CAST(record_date AS DATE) WHERE len(APP_ID)>0 AND CAST(record_date as DATE)=$1) GROUP BY APP_ID $$) to 's3://test/unload/' iam_role 'arn:aws:iam::xxxxxxxxxxx:role/Test' delimiter ',' addquotes;
The pipeline looks something like this:
{
"objects": [
{
"role": "DataPipelineDefaultRole",
"subject": "SuccessNotification",
"name": "SNS",
"id": "ActionId_xxxxx”,
"message": "SUCCESS: #{format(minusDays(node.#scheduledStartTime,1),'MM-dd-YYYY')}",
"type": "SnsAlarm",
"topicArn": "arn:aws:sns:us-west-2:xxxxxxxxxx:notification"
},
{
"connectionString": “connection-url”,
"password": “password”,
"name": “Test”,
"id": "DatabaseId_xxxxx”,
"type": "RedshiftDatabase",
"username": “username”
},
{
"subnetId": "subnet-xxxxxx”,
"resourceRole": "DataPipelineDefaultResourceRole",
"role": "DataPipelineDefaultRole",
"name": "EC2",
"id": "ResourceId_xxxxx”,
"type": "Ec2Resource"
},
{
"failureAndRerunMode": "CASCADE",
"resourceRole": "DataPipelineDefaultResourceRole",
"role": "DataPipelineDefaultRole",
"pipelineLogUri": "s3://test/logs/",
"scheduleType": "ONDEMAND",
"name": "Default",
"id": "Default"
},
{
"database": {
"ref": "DatabaseId_xxxxxx”
},
"scriptUri": "s3://test/script.sql",
"name": "SqlActivity",
"scriptArgument": "#{format(minusDays(node.#scheduledStartTime,1),"MM-dd-YYYY”)}”,
"id": "SqlActivityId_xxxxx”,
"runsOn": {
"ref": "ResourceId_xxxx”
},
"type": "SqlActivity",
"onSuccess": {
"ref": "ActionId_xxxxx”
}
}
],
"parameters": []
}
However, I keep getting the error: The column index is out of range: 1, number of columns: 0.
I just can't get it to work. I have tried using ?, $1, and I even tried putting the expression #{format(minusDays(node.#scheduledStartTime,1),'MM-dd-YYYY')} directly in the script. None of them work.
I have looked at the answers to Amazon Data Pipeline: How to use a script argument in a SqlActivity? but none of them are helpful.
Does anyone have an idea how to use a script argument in a SQL script in AWS Data Pipeline?
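For reference, the literal value I expect the scriptArgument expression to produce (the scheduled start date minus one day, formatted as MM-dd-YYYY) can be reproduced with this small Python sketch; the date below is just a hypothetical example of node.#scheduledStartTime:
from datetime import datetime, timedelta

scheduled_start = datetime(2018, 10, 2)  # hypothetical scheduled start time
script_argument = (scheduled_start - timedelta(days=1)).strftime("%m-%d-%Y")
print(script_argument)  # 10-01-2018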