AWS Redshift: FATAL: connection limit "500" exceeded for non-bootstrap users

Hope you're all okay.
We hit this limit quite often. We know there is no way to raise the 500-connection limit for concurrent user connections in Redshift. We also know certain views (pg_user_info) provide info about a user's actual limit.
We are looking for some answers not found in this forum plus any guidance based on your experience.
Questions:
Would recreating the cluster with bigger EC2 instances yield a higher limit?
Would adding new nodes to the existing cluster yield a higher limit?
From the app development perspective: what specific strategies/actions would you recommend to spot or predict a situation where this limit will be hit?
Txs - Jimmy

Okay folks.
Thanks to all who answered.
I opened a support ticket with AWS and this is the recommendation. I'm pasting it all here; it's long, but I hope it helps the many people running into this issue. The idea is to catch the situation before it happens:
To monitor the number of connections made to the database, you can create a CloudWatch alarm based on the DatabaseConnections metric that triggers a Lambda function when a certain threshold is reached. This Lambda function can then call a procedure that terminates idle connections.
Please find below the query that creates a procedure to log and terminate long-running inactive sessions:
1. Add a view to get all current inactive sessions in the cluster
CREATE OR REPLACE VIEW inactive_sessions as (
select a.process,
trim(a.user_name) as user_name,
trim(c.remotehost) as remotehost,
a.usesysid,
a.starttime,
datediff(s,a.starttime,sysdate) as session_dur,
b.last_end,
datediff(s,case when b.last_end is not null then b.last_end else a.starttime end,sysdate) idle_dur
FROM
(
select starttime,process,u.usesysid,user_name
from stv_sessions s, pg_user u
where
s.user_name = u.usename
and u.usesysid>1
and process NOT IN (select pid from stv_inflight where userid>1
union select pid from stv_recents where status != 'Done' and userid>1)
) a
LEFT OUTER JOIN (
select
userid,pid,max(endtime) as last_end from svl_statementtext
where userid>1 and sequence=0 group by 1,2) b ON a.usesysid = b.userid AND a.process = b.pid
LEFT OUTER JOIN (
select username, pid, remotehost from stl_connection_log
where event = 'initiating session' and username <> 'rsdb') c on a.user_name = c.username AND a.process = c.pid
WHERE (b.last_end > a.starttime OR b.last_end is null)
ORDER BY idle_dur
);
2. Add a table for logging information about long-running transactions that were terminated
CREATE TABLE IF NOT EXISTS terminated_inactive_sessions (
process int,
user_name varchar(50),
remotehost varchar(50),
starttime timestamp,
session_dur int,
idle_dur int,
terminated_on timestamp DEFAULT GETDATE()
);
3. Add a procedure to log and terminate any inactive transactions running for longer than 'n' seconds
CREATE OR REPLACE PROCEDURE terminate_and_log_inactive_sessions (n INTEGER)
AS $$
DECLARE
expired RECORD ;
BEGIN
FOR expired IN SELECT process, user_name, remotehost, starttime, session_dur, idle_dur FROM inactive_sessions where idle_dur >= n
LOOP
EXECUTE 'INSERT INTO terminated_inactive_sessions (process, user_name, remotehost, starttime, session_dur, idle_dur) values (' || expired.process || ' , ''' || expired.user_name || ''' , ''' || expired.remotehost || ''' , ''' || expired.starttime || ''' , ' || expired.session_dur || ' , ' || expired.idle_dur || ');';
EXECUTE 'SELECT PG_TERMINATE_BACKEND(' || expired.process || ')';
END LOOP ;
END ;
$$ LANGUAGE plpgsql;
4. Execute the procedure by running the following command:
call terminate_and_log_inactive_sessions(100);
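Before wiring this into a Lambda, it's worth previewing what the procedure would terminate. A minimal sketch against the inactive_sessions view created above (100 seconds is just the example threshold from the call):
SELECT process, user_name, remotehost, session_dur, idle_dur
FROM inactive_sessions
WHERE idle_dur >= 100;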
Here is a sample lambda function that attempts to close idle connections by querying the view 'inactive_sessions' created above, which you can use as a reference.
import datetime
import logging
import sys

import psycopg2

# db_database, db_user, db_password, db_port, db_host, and session_idle_limit
# are expected to come from the Lambda's configuration/environment (omitted in
# the original sample).

# Current time
now = datetime.datetime.now()
query = "SELECT process, user_name, session_dur, idle_dur FROM inactive_sessions where idle_dur >= %d"
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        conn = psycopg2.connect("dbname=" + db_database + " user=" + db_user +
                                " password=" + db_password + " port=" + db_port +
                                " host=" + db_host)
        conn.autocommit = True
    except Exception:
        logger.error("ERROR: Unexpected error: Could not connect to Redshift cluster.")
        sys.exit()
    logger.info("SUCCESS: Connection to Redshift cluster succeeded")
    with conn.cursor() as cur:
        cur.execute(query % (session_idle_limit))
        if cur.rowcount >= 1:
            for row in cur.fetchall():
                print("terminating session with pid %s that has been idle for %d seconds at %s"
                      % (row[0], row[3], now))
                cur.execute("SELECT PG_TERMINATE_BACKEND(%s);" % (row[0]))
    conn.close()

As you said, this is a hard limit in Redshift and there is no way to raise it. Redshift is not a high-concurrency / high-connection database.
I expect that if you need the large-data analytic horsepower of Redshift, you can get around this with connection pooling. Pgpool is a common tool for this.
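On the "spot or predict" question: besides alarming on the DatabaseConnections CloudWatch metric, you can watch the count from inside the cluster. A minimal sketch using the STV_SESSIONS system view (one row per open session):
-- Total open sessions vs. the hard 500 limit
SELECT COUNT(*) AS total_sessions FROM stv_sessions;
-- Break the count down by user to find the worst offenders
SELECT TRIM(user_name) AS user_name, COUNT(*) AS session_count
FROM stv_sessions
GROUP BY 1
ORDER BY session_count DESC;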

Related

Get postgres query log statement and duration as one record

I have log_min_duration_statement=0 in the config.
When I check the log file, the SQL statement and duration are saved in different rows.
(Not sure what I have wrong, but the statement and duration are not saved together as this answer suggests.)
As I understand it, the session_line_num of the duration record always equals the session_line_num of the relevant statement + 1, within the same session of course.
Is this correct? Is the query below reliable for getting the statement and its duration in one row?
(csv log imported into postgres_log table):
WITH
sql_cte AS(
SELECT session_id, session_line_num, message AS sql_statement
FROM postgres_log
WHERE
message LIKE 'statement%'
)
,durat_cte AS (
SELECT session_id, session_line_num, message AS duration
FROM postgres_log
WHERE
message LIKE 'duration%'
)
SELECT
t1.session_id,
t1.session_line_num,
t1.sql_statement,
t2.duration
FROM sql_cte t1
LEFT JOIN durat_cte t2
ON t1.session_id = t2.session_id AND t1.session_line_num + 1 = t2.session_line_num;
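For what it's worth, a sketch of an alternative that avoids hard-coding the +1 offset: pair each statement row with the next log row in the same session via a window function, and keep the pair only when that next row is a duration line (assumes the postgres_log table and message prefixes from the question):
SELECT session_id, session_line_num, sql_statement, next_message AS duration
FROM (
    SELECT session_id,
           session_line_num,
           message AS sql_statement,
           lead(message) OVER (PARTITION BY session_id ORDER BY session_line_num) AS next_message
    FROM postgres_log
    WHERE message LIKE 'statement%' OR message LIKE 'duration%'
) pairs
WHERE sql_statement LIKE 'statement%'
  AND next_message LIKE 'duration%';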

SSRS SQL-Restart subscription runs all scheduled Reports

I am using the following SQL script to restart failed SSRS mailing subscriptions:
DECLARE @ScheduledReportName varchar(200)
DECLARE @JobID uniqueidentifier
DECLARE @LastRunTime datetime
DECLARE @JobStatus varchar(100)
--------------------------------------------------------
DECLARE @RunAllReport CURSOR
SET @RunAllReport = CURSOR FAST_FORWARD
FOR
SELECT
CAT.[Name] AS RptName
, res.ScheduleID AS JobID
, sub.LastRuntime
, CASE WHEN job.[enabled] = 1 THEN 'Enabled'
ELSE 'Disabled'
END AS JobStatus
FROM
dbo.Catalog AS cat
INNER JOIN dbo.Subscriptions AS sub
ON CAT.ItemID = sub.Report_OID
INNER JOIN dbo.ReportSchedule AS res
ON CAT.ItemID = res.ReportID
AND sub.SubscriptionID = res.SubscriptionID
INNER JOIN msdb.dbo.sysjobs AS job
ON CAST(res.ScheduleID AS VARCHAR(36)) = job.[name]
INNER JOIN msdb.dbo.sysjobschedules AS sch
ON job.job_id = sch.job_id
INNER JOIN dbo.Users U
ON U.UserID = sub.OwnerID
----------------Filter the subscriptions----------------
where sub.subscriptionid in
(
SELECT subscriptionid
FROM Subscriptions AS S
LEFT OUTER JOIN [Catalog] AS C
ON C.ItemID = S.Report_OID
WHERE S.LastStatus like 'Failure sending mail%'
)
----------------Filter the subscriptions----------------
ORDER BY U.UserName, RptName
OPEN @RunAllReport
FETCH NEXT FROM @RunAllReport
INTO @ScheduledReportName, @JobID, @LastRunTime, @JobStatus
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @ScheduledReportName --& ' ' & @JobID
    EXEC msdb.dbo.sp_start_job @job_name = @JobID
    FETCH NEXT FROM @RunAllReport
    INTO @ScheduledReportName, @JobID, @LastRunTime, @JobStatus
END
CLOSE @RunAllReport
DEALLOCATE @RunAllReport
I run this if a subscription fails. In my example I send the same report to multiple people as subscriptions with different parameters. Sometimes one subscription fails and I want to restart the job. The query in the script above provides the specific SubscriptionID of the failed one.
But even though the ScheduleID is handed over as JobID, all the reports are being resent to all people.
Is there something wrong with the script?
Please help me.
Turns out that the procedure in my old query:
EXEC msdb.dbo.sp_start_job @job_name = @JobID
runs all the subscriptions under the ScheduleID passed as JobID. So all reports with a shared schedule are launched again by a job restart.
To restart only a specific subscription, I have to run the following command in its place, with the SubscriptionID instead of the ScheduleID:
EXEC [dbo].[AddEvent] 'TimedSubscription', @SubscriptionID;
This will add an event to the [dbo].[Event] table that runs the subscription passed in the parameter, so only the required report will be sent.
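Putting the two pieces together, a hedged sketch (run in the ReportServer database; assumes the same LastStatus filter from the question's script) that fires the event once per failed subscription:
DECLARE @SubscriptionID varchar(40)
DECLARE failed_subs CURSOR FAST_FORWARD FOR
    SELECT CONVERT(varchar(40), S.SubscriptionID)
    FROM dbo.Subscriptions AS S
    WHERE S.LastStatus LIKE 'Failure sending mail%'
OPEN failed_subs
FETCH NEXT FROM failed_subs INTO @SubscriptionID
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC [dbo].[AddEvent] 'TimedSubscription', @SubscriptionID
    FETCH NEXT FROM failed_subs INTO @SubscriptionID
END
CLOSE failed_subs
DEALLOCATE failed_subs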

ColdFusion 2016 and stored proc throwing invalid character error

I am trying existing code in a CF 2016 install... I get this error
"[Macromedia][DB2 JDBC Driver][DB2]ILLEGAL SYMBOL =; VALID SYMBOLS ARE ..."
the line identified is a param of a stored proc call that looks like this:
<cfstoredproc datasource="#application.dsn#" procedure="LIVE.STOPS">
<cfprocparam type="In" cfsqltype="CF_SQL_BIGINT" dbvarname="STOPID" value="#val( variables.procstopid )#" null="no">
<cfprocparam type="In" cfsqltype="CF_SQL_INTEGER" dbvarname="TRIPID" value="#val( url.tripId )#" null="no">
</cfstoredproc>
I cannot find any mention online of a change in the stored proc tag - maybe the DB2 driver? I'm looking for any input. Thanks.
Other info:
Windows 10, Apache 2.4, connecting to DB2 v10.
@pendo, here is the stored proc - it should be noted that I abbreviated some of the SQL, but the SP works and has for a long time in the app running CF10.
CREATE OR REPLACE PROCEDURE LIVE.STOP(
IN stopId BIGINT DEFAULT 0,
IN tripId INTEGER DEFAULT 0
) LANGUAGE SQL
BEGIN
DECLARE updateTripId INTEGER DEFAULT 0;
DECLARE minStopId BIGINT DEFAULT 0;
DECLARE maxStopId BIGINT DEFAULT 0;
DECLARE TripSearch_cursor CURSOR FOR
SELECT s1.fkTripsId
FROM live.paymentsTripsStops s1
JOIN live.Trips t ON s1.fkTripsId = t.Id
WHERE s1.fkStopsId = stopId
FETCH FIRST 1 ROWS ONLY;
DECLARE minMaxStop_cursor CURSOR FOR
SELECT
COALESCE(
(
SELECT s.Id
FROM live.Stops s
JOIN live.Trips t ON s.fkTripsId = t.Id
ORDER BY s.Sequence
FETCH FIRST 1 ROWS ONLY
),
0
) AS firstStopId,
COALESCE(
(
SELECT s.Id
FROM live.Stops s
JOIN live.Trips t ON s.fkTripsId = t.Id
ORDER BY s.Sequence DESC
FETCH FIRST 1 ROWS ONLY
),
0
) AS lastStopId
FROM live.Trips t
WHERE t.Id = updateTripId
FETCH FIRST 1 ROWS ONLY;
IF TripId > 0
THEN SET updateTripId = TripId;
ELSE OPEN TripSearch_cursor;
FETCH FROM TripSearch_cursor INTO updateTripId;
CLOSE TripSearch_cursor;
END IF;
IF updateTripId > 0
THEN OPEN minMaxStop_cursor;
FETCH FROM minMaxStop_cursor INTO minStopId, maxStopId;
CLOSE minMaxStop_cursor;
UPDATE live.Trips
SET fkFirstStopId = minStopId,
fkLastStopId = maxStopId
WHERE intId = updateTripId;
END IF;
END

Convert value to unique value (ex John to John_1)

The user writes his name and I want to store it in the database. If the name is already in the database, I want to append a postfix, i.e. convert 'John' to the first one available among ('John_1', 'John_2', ... etc).
This is my way of doing it so far, but I'm sure there's a better way.
select n from
(
select 'John' n ,0 v
union
select 'John'||'_'||generate_series(1,100),generate_series(1,100)
) possible_names
where n not in
(select my_name from all_names u)
order by v
limit 1
Any suggestions?
If you need to worry about concurrency, the simplest way to guarantee uniqueness is to issue insert statements until one succeeds. (This assumes you have a unique constraint, of course.)
Pseudocode:
counter = 0
postfix = ''
while true
  if db.execute(insert_sql, [..., name + postfix, ...])
    break
  end
  counter += 1
  postfix = '_' + counter
end
You can make the procedure run in a shorter amount of time by starting at the maximum existing postfix (see the other answers with approaches to do that).
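With a unique constraint on the name column, each attempt from the pseudocode above can be a single statement (PostgreSQL 9.5+); an empty result means the candidate was taken and you try the next postfix. A minimal sketch, assuming the all_names table and my_name column from the question:
INSERT INTO all_names (my_name)
VALUES ('John_1')                 -- the current candidate name
ON CONFLICT (my_name) DO NOTHING
RETURNING my_name;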
An awkward alternative would be to find the maximum existing postfix using a select statement, and then to try to acquire an advisory lock on something unique to the applicable name and postfix, e.g. 'username:' + name + postfix. It's much less robust though, because it opens up the possibility of two transactions finding the same max_postfix, and then one transaction trying to acquire the lock immediately after other is done committing its insert and releasing that lock -- thus resulting in a duplicate.
SELECT CASE WHEN num IS NULL THEN 'John' ELSE 'John' || '_' || num END AS new_name
FROM (
    SELECT max(substr(my_name, position('_' in my_name) + 1)::int) + 1 AS num
    FROM all_names
    WHERE my_name ilike 'John' || '\_%' -- backslash-escape the underscore so LIKE doesn't treat it as a single-character wildcard
) new_number
With all three instances of 'John' being where you pass in the name entered. (This is assuming that the user can't make an underscore part of their name and a number will always follow the underscore.)
Edit: This is also assuming that 'John' and 'john' should be treated the same. If they shouldn't, then replace the ilike with like instead.
CREATE FUNCTION get_username_proposal(text) RETURNS text AS $$
SELECT
CASE WHEN (SELECT COUNT(*) FROM all_names WHERE my_name = $1)=0 THEN
$1
ELSE
$1 || '_' || COALESCE(MAX(LTRIM(SUBSTRING(my_name FROM '_[0-9]+$'), '_')::int), 0)+1
END
FROM
all_names
WHERE
my_name ~ ($1 || '_[0-9]+$');
$$ LANGUAGE SQL STABLE;
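A quick usage sketch, assuming the function above is installed and the all_names table from the question:
SELECT get_username_proposal('John');  -- 'John' if free, otherwise 'John_<max+1>'
INSERT INTO all_names (my_name) VALUES (get_username_proposal('John'));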

SQL Server Trigger not evaluating as Insert or Update properly

I want to have one trigger handle updates and inserts. Most of the SQL actions in the trigger are for both. The only exception is the fields I'm using to record the date and username for an insert versus an update. This is what I have, but the updates of the fields used to track inserts and updates are not firing right. If I insert a new record, I get CreatedBy, CreatedOn, LastEditedBy, LastEditedOn populated, with LastEditedOn 1 second after CreatedOn (which I don't want to happen). When I update the record, only LastEditedBy & LastEditedOn change (which is correct). I'm including my full trigger for reference:
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
-- =================================================================================
-- Author: Paul J. Scipione
-- Create date: 2/15/2012
-- Update date: 6/5/2012
-- Description: To concatenate several fields into a set formatted UnitDescription,
-- to total Span & Loop footages, to set appropriate AcctCode, & track
-- user inserts
-- =================================================================================
IF OBJECT_ID('ProcessCable', 'TR') IS NOT NULL
DROP TRIGGER ProcessCable
GO
CREATE TRIGGER ProcessCable
ON Cable
AFTER INSERT, UPDATE
AS
BEGIN
SET NOCOUNT ON;
-- IF TRIGGER_NESTLEVEL() > 1 RETURN
IF ((SELECT TRIGGER_NESTLEVEL()) > 1 )
RETURN
ELSE
BEGIN
-- record user and date of insert or update
IF EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
ELSE IF NOT EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
-- reset Suffix if applicable
UPDATE Cable SET Suffix = NULL WHERE Suffix = 'n/a'
-- create UnitDescription value
UPDATE Cable SET UnitDescription =
isnull (Type, '') +
isnull (CONVERT (NVARCHAR (10), Size), '') +
'-' +
isnull (CONVERT (NVARCHAR (10), Gauge), '') +
CASE
WHEN ExtraTrench IS NOT NULL AND ExtraTrench > 0 THEN
CASE
WHEN Suffix IS NULL THEN 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')'
ELSE 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')' + Suffix
END
ELSE isnull (Suffix, '')
END
-- convert any accidental negative numbers entered
UPDATE Cable SET Length = ABS(Length)
-- sum Length with LoopFootage into TotalFootage
UPDATE Cable SET TotalFootage = isnull(Length, 0) + isnull(LoopFootage, 0)
-- set proper AcctCode based on Type
UPDATE Cable SET AcctCode =
CASE
WHEN Type IN ('SEA', 'CW', 'CJ') THEN '32.2421.2'
WHEN Type IN ('BFC', 'BJ', 'SEB') THEN '32.2423.2'
WHEN Type IN ('TIP','UF') THEN '32.2422.2'
WHEN Type = 'unknown' OR Type IS NULL THEN 'unknown'
END
WHERE AcctCode IS NULL OR AcctCode = ' '
END
END
GO
A few things jump out at me when I look at your trigger:
You are doing several additional updates rather than a single update (performance-wise, a single update would be better).
Your update statements are unconstrained (there is no JOIN to the inserted/deleted tables to limit the number of records that you perform these additional updates on).
Most of this logic feels like it should be in the application layer rather than in the database; Or, perhaps in some cases implemented differently.
Some quick examples:
Suffix of "n/a" should be removed before inserted.
Cable length absolute value should be done before inserted (with a CHECK CONSTRAINT to verify that bad data cannot be inserted).
TotalFootage should be a computed column so it is always correct (see the sketch after this list).
The Type/AcctCode relationship seems like it should be a column value in a foreign key reference.
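A hedged sketch of the CHECK constraint and computed-column suggestions, assuming the Cable columns referenced in the trigger (if TotalFootage already exists as a regular column, it would need to be dropped first):
-- Reject negative lengths up front instead of silently fixing them in the trigger
ALTER TABLE Cable ADD CONSTRAINT CK_Cable_Length CHECK (Length >= 0);
-- Keep TotalFootage correct by definition instead of via trigger updates
ALTER TABLE Cable ADD TotalFootage AS (ISNULL(Length, 0) + ISNULL(LoopFootage, 0));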
But ultimately, I think the reason you are seeing the unexpected dates is because of the unconstrained updates. Without addressing any of the other concerns I brought up above, the statement that sets the audit fields should be more like this:
-- update case: the row already existed, so it appears in "deleted"
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN deleted ON Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn

-- insert case: the row is in "inserted" with no matching row in "deleted"
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN inserted ON Cable.PrimaryKeyColumn = inserted.PrimaryKeyColumn
LEFT JOIN deleted ON Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
WHERE deleted.PrimaryKeyColumn IS NULL