What query language is used by AWS Cloudwatch Logs Insights? - aws-cloudwatch-log-insights

The AWS guides for Insights queries aren't very specific, but I wondered what query language it uses, or at least most resembles.
This is the example they give when you navigate to Logs Insights:
fields @timestamp, @message
| sort @timestamp desc
| limit 20
Thanks.

Related

Masking the logs of kafka connector

I have a property file that contains some secrets (credentials) and certificates, and I don't want to log them.
Is there a way to store those credentials somewhere else, or to keep them out of the logs?
Does Apache Kafka have something called masking?
If you happen to be using KSQL for streaming queries, you can use the masking function MASK().
CREATE STREAM MASKED_PURCHASES AS
  SELECT MASK(CUSTOMER_NAME) AS CUSTOMER_NAME,
         MASK_RIGHT(DATE_OF_BIRTH, 12) AS DATE_OF_BIRTH,
         ORDER_ID, PRODUCT, ORDER_TOTAL_USD, TOWN, COUNTRY
  FROM PURCHASES;
ksql> SELECT CUSTOMER_NAME, DATE_OF_BIRTH, PRODUCT, ORDER_TOTAL_USD FROM MASKED_PURCHASES LIMIT 1;
Xxxxxx-Xxxxxx | 1908-03-nnXnn-nn-nnX | Langers - Mango Nectar | 5.80
Documentation source is here.
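As the sample output suggests, MASK replaces uppercase letters with X, lowercase letters with x, and digits with n. If you need the same effect outside KSQL, here is a minimal Python sketch of that masking rule (my own helper, not part of any Kafka or KSQL API, and KSQL's handling of punctuation may differ):

```python
def mask(value: str) -> str:
    """Mask a string KSQL-style: uppercase -> X, lowercase -> x, digit -> n.

    Other characters are left as-is here; KSQL's defaults for them may differ.
    """
    out = []
    for ch in value:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("n")
        else:
            out.append(ch)
    return "".join(out)

print(mask("Rick Astley-1987"))  # Xxxx Xxxxxx-nnnn
```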

Detecting if stopwords are present in a field

I am using tsvectors to search for similar entries in a list of keywords I have. This way, I can treat the following keywords as identical:
IT security governance
it security government
The problem is that, due to the stop-word logic, the following entries are also considered similar:
IT environment
Environment
So I would like a way to detect which records contain stopwords, so that I can treat them differently.
I could add a boolean value to the record to indicate whether it contains stopwords.
Any ideas?
You can use ts_debug to find the lexemes that are generated by full text search:
SELECT array_agg(lexemes[1]) FILTER (WHERE lexemes[1] IS NOT NULL)
FROM ts_debug('english', 'IT security governance')
WHERE alias IN ('asciiword', 'word');
array_agg
----------------
{secur,govern}
(1 row)
To find out whether a stopword is present, check whether the lexeme is NULL:
SELECT token
FROM ts_debug('english', 'IT security governance')
WHERE alias IN ('asciiword', 'word')
AND lexemes[1] IS NULL;
token
-------
IT
(1 row)
Based on the suggestion of @Laurenz-Albe I came up with this more general solution; based on the count returned by these queries I can distinguish entries with stopwords from those that don't have any. (Note: a stopword token yields an empty lexemes array, and array_length on an empty array returns NULL rather than 0, so the check is lexemes[1] IS NULL, as above.)
select count(*)
from ts_debug('english', 'IT security governance')
where alias in ('asciiword', 'word')
  and lexemes[1] is null;
select count(*)
from ts_debug('english', 'advanced security governance')
where alias in ('asciiword', 'word')
  and lexemes[1] is null;

Retrieve Redshift error messages

I'm running queries on a Redshift cluster using DataGrip that take upwards of 10 hours to run, and unfortunately these often fail. Alas, DataGrip doesn't keep the connection to the database open long enough for me to see the error message the queries fail with.
Is there a way to retrieve these error messages later, e.g. from internal Redshift tables? Alternatively, is there a way to make DataGrip keep the connection open long enough?
Yes, you can!
Query the stl_connection_log table to find the pid: the recordtime column shows when your connection was initiated, and the dbname, username, and duration columns help narrow it down.
select * from stl_connection_log order by recordtime desc limit 100
Once you have the pid, query the stl_query table to confirm you are looking at the right query.
select * from stl_query where pid='XXXX' limit 100
Then check the stl_error table for your pid. This will show the error you are looking for.
select * from stl_error where pid='XXXX' limit 100
If I’ve made a bad assumption please comment and I’ll refocus my answer.

Maximum number of user per cluster in redshift

We are trying to create multiple users on our Redshift cluster to implement WLM. Can anyone please tell us the maximum number of users Redshift supports per cluster?
Although I don't know the actual limit on the number of users, I confirmed that more than 10,000 users can be created on a Redshift cluster.
dev=# select count(*) from pg_user;
count
-------
10003
(1 row)

Proper GROUP BY syntax

I'm fairly proficient in MySQL and MSSQL, but I'm just getting started with Postgres. I'm sure this is a simple issue, so to be brief:
SQL error:
ERROR: column "incidents.open_date" must appear in the GROUP BY clause or be used in an aggregate function
In statement:
SELECT date(open_date), COUNT(*)
FROM incidents
GROUP BY 1
ORDER BY open_date
The type of open_date is timestamp with time zone, and I get the same error if I use GROUP BY date(open_date).
I've tried going over the postgres docs and some examples online, but everything seems to indicate that this should be valid.
The problem is with the unadorned open_date in the ORDER BY clause.
This should do it:
SELECT date(open_date), COUNT(*)
FROM incidents
GROUP BY date(open_date)
ORDER BY date(open_date);
This would also work (though I prefer not to use integers to refer to columns for maintenance reasons):
SELECT date(open_date), COUNT(*)
FROM incidents
GROUP BY 1
ORDER BY 1;
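As a runnable illustration of grouping and ordering by the same expression, here is a sketch using Python's sqlite3 with made-up sample rows (SQLite's date() stands in for the Postgres cast; note SQLite is laxer about GROUP BY than Postgres and would not raise the original error):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE incidents (open_date TEXT)")  # timestamps as ISO-8601 text
con.executemany(
    "INSERT INTO incidents VALUES (?)",
    [("2024-01-01 10:00:00",), ("2024-01-01 17:30:00",), ("2024-01-02 09:15:00",)],
)

# Group and order by the same expression, mirroring the corrected query.
rows = con.execute(
    "SELECT date(open_date), COUNT(*) FROM incidents "
    "GROUP BY date(open_date) ORDER BY date(open_date)"
).fetchall()
print(rows)  # [('2024-01-01', 2), ('2024-01-02', 1)]
```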
"open_date" is not in your select list, "date(open_date)" is.
Either of these will work:
order by date(open_date)
order by 1
You can also name your columns in the select statement, and then refer to that alias:
select date(open_date) "alias" ... order by alias
Some databases require the keyword AS before the alias in your select list.