How to run SQL statements sequentially? - amazon-redshift

I am running a set of SQL statements sequentially in the AWS Redshift Query editor.
sql-1
sql-2
sql-3
....
sql-N
However, the Redshift Query editor does not let me run multiple SQL statements at once, so I am currently running them one by one manually.
What is the alternative approach for me? It looks like I could use DBeaver.
Is there a more programmatic approach, i.e. just a simple bash script?

If you have a Linux instance that can access the cluster, you can use the psql command-line tool. For example:
yum install postgresql
psql -h my-cluster.cjmul6ivnpa4.us-east-2.redshift.amazonaws.com \
-p 5439 \
-d my_db \
-f my_sql_script.sql
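If the statements live in separate files, a minimal bash sketch along the same lines (the file names are placeholders; the host and database are taken from the example above, and credentials are assumed to come from PGPASSWORD or ~/.pgpass):
#!/bin/bash
set -e   # abort the loop as soon as any file fails
for f in sql-1.sql sql-2.sql sql-3.sql; do
  psql -v ON_ERROR_STOP=1 \
       -h my-cluster.cjmul6ivnpa4.us-east-2.redshift.amazonaws.com \
       -p 5439 \
       -d my_db \
       -f "$f"     # ON_ERROR_STOP makes psql exit non-zero on the first error
done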
We recently announced a way to schedule queries: https://aws.amazon.com/about-aws/whats-new/2020/10/amazon-redshift-supports-scheduling-sql-queries-by-integrating-with-amazon-eventbridge/ And even more recently we published this blog post that walks you through all the steps using the Console or the CLI: https://aws.amazon.com/blogs/big-data/scheduling-sql-queries-on-your-amazon-redshift-data-warehouse/ Hope these links help.

Related

Keeping backups of everything in DB2 version 7

I am trying to keep backups of almost everything from a DB2 version 7 database, including Tables, Views, Triggers and Indexes. I am currently experimenting with the db2look and db2move commands as shown below, but I am only able to keep the DDL and the Tables.
Command to keep the DDL
db2look -d CC_DEV -e -a -o CC_DEV_DDL.sql
Command to keep the Tables with their data
db2move CC_DEV export
Is this enough to keep what I need in case of a disaster? Do I need anything else?

Issues when upgrading and dockerising a Postgres v9.2 legacy database using pg_dumpall and pg_dump

I am using an official postgres v12 docker image that I want to initialise with two SQL dump files that are gathered from a remote legacy v9.2 postgres server during the docker build phase:
RUN ssh $REMOTE_USER@$REMOTE_HOST "pg_dumpall -w -U $REMOTE_DB_USER -h localhost -p $REMOTE_DB_PORT --clean --globals-only -l $REMOTE_DB_NAME" >> dump/a_globals.sql
RUN ssh $REMOTE_USER@$REMOTE_HOST "pg_dump -w -U $REMOTE_DB_USER -h localhost -p $REMOTE_DB_PORT --clean --create $REMOTE_DB_NAME" >> dump/b_db.sql
By placing both a_globals.sql and b_db.sql into the docker image folder docker-entrypoint-initdb.d, the database is initialised with the legacy SQL files when the v12 container starts (as described here). Docker is working correctly and the dump files are retrieved successfully. However, I am running into problems initialising the container's database and need guidance:
When the container starts to initialise its DB, it stops with ERROR: role $someDBRole does not exist. This is because the v9.2 dump SQL files DROP roles before re-creating them, and the container DB does not like this. Unfortunately, it was not until PostgreSQL v9.4 that pg_dumpall and pg_dump gained the --if-exists option (see the pg_dumpall v9.2 documentation). What would you suggest that I do to remedy this? I could manually edit the SQL dump files, but this would be impractical as the snapshots of the legacy DB need to be automated. Is there a way to suppress this error during container startup?
If I want to convert from ASCII to UTF-8, is it adequate to simply set the encoding option for pg_dumpall and pg_dump? Or do I need to take into consideration other issues when upgrading?
Is there a way to suppress the removal and re-adding of the postgres superuser that is in the dump SQL?
In general, are there any other gotchas when containerising and/or upgrading a Postgres DB?
I'm not familiar with Docker, so I don't know how straightforward it'll be to do these things, but in general, pg_dump/pg_dumpall output, when it's in SQL format, will work just fine after having gone through some ugly string manipulation.
Pipe it through sed -e 's/DROP ROLE/DROP ROLE IF EXISTS/', ideally when writing the .sql files, but it's fine to just run sed -i -e <...> to munge the files in place after they're created if you don't have a full shell available. Make it sed -e 's/^DROP ROLE/DROP ROLE IF EXISTS/' (anchored to the start of the line) if you're worried about strings containing DROP ROLE in your data; the ^ anchor works in plain POSIX sed, so it costs nothing in portability.
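A minimal sketch of how that could look in the Dockerfile from the question, applying the substitution as the globals dump is written (this reuses the $REMOTE_* variables and pg_dumpall flags from the question; nothing else is assumed):
RUN ssh $REMOTE_USER@$REMOTE_HOST \
      "pg_dumpall -w -U $REMOTE_DB_USER -h localhost -p $REMOTE_DB_PORT --clean --globals-only -l $REMOTE_DB_NAME" \
    | sed -e 's/^DROP ROLE/DROP ROLE IF EXISTS/' \
    >> dump/a_globals.sql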
Yes. It's worth checking the data in pg12 to make sure it got imported correctly, but in the general case, pg_dump has been aware of encoding considerations since time immemorial, and a dump->load is absolutely the best way to change your DB encoding.
Sure. Find the lines that do it in your .sql, copy enough of it to be unique, and pipe it through grep -v <what you copied> :D
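For instance, a hedged sketch that assumes the offending lines start with DROP ROLE postgres, CREATE ROLE postgres and ALTER ROLE postgres (check your own dump for the exact text before copying this):
grep -v '^DROP ROLE postgres' dump/a_globals.sql \
  | grep -v '^CREATE ROLE postgres' \
  | grep -v '^ALTER ROLE postgres' \
  > dump/a_globals_filtered.sql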
I can't speak to the containerizing aspect of things, but - and this is more of a general practice, not even really PG-specific - if you're dealing with a large DB that's getting migrated, prepare a small one to test with, as similar as possible to the real one but omitting any bulky data. Get everything working against that, so doing the real migration is just a matter of changing some vars (I guess $REMOTE_HOST and $REMOTE_PORT in your case). If it's not large, then just be comfortable blowing away any pg12 containers that failed partway through the import, figure out and apply whatever fixes the failure, and start from the top again until it works end-to-end.

Dump Contents of RDS Postgres Query

Short Version of this Question:
I'd like to dump the contents of a Postgres query from a db instance hosted in RDS inside of a shell script.
Complete Version:
Right now I'm writing a shell script that should dump the results of a query from a source database into a .dump file and then run that file against a destination database instance. Both db instances are hosted in RDS.
MySQL allows you to do this using the mysqldump tool, but the recommended answer to this problem in Postgres seems to be to use the COPY command. However, the COPY command isn't available in RDS instances. The recommended solution in this case seems to be the '\copy' command, which does the same thing locally using the psql tool. However, it doesn't seem like this is a supported option inside of a shell script.
What's the best way to accomplish this?
Thank you!
I am not familiar with shell scripting, but I have used a batch file on Windows to dump the output of a query to a file and to import the file on another instance.
Here is what I used to export from Postgres RDS to a file on Windows.
SET PGPASSWORD=your_password
cd "C:\Program Files (x86)\pgAdmin 4\v3\runtime"
psql -h your_host -U your_username -d your_databasename -c "\copy (your_query) TO path\file_name.sql"
All above commands are in one batch file.
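A roughly equivalent sketch as a bash shell script (the host, user, database, query, and output file name are placeholders; \copy runs on the client side, which is why it works against RDS):
#!/bin/bash
export PGPASSWORD=your_password
psql -h your_host -U your_username -d your_databasename \
     -c "\copy (your_query) TO 'file_name.csv' WITH CSV"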

How to execute Redshift queries in parallel

I need to run some basic load testing against my Redshift cluster, and I need to execute around 20 SELECT queries in parallel.
Since stored procedures are not supported by Redshift, I would love to get some ideas on how I can accomplish this.
To initiate the selects in parallel, install this
https://github.com/gbb/par_psql
and then you can run parallel sql commands against redshift like this
export PGPASSWORD=your_pw; par_psql -h your_redshift -p 5439 -U your_username -d mydb --file=myscript.sql
Check the WLM Query Slot Count
Check Route Queries to Queues
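If installing par_psql is not an option, a minimal bash sketch that gets a similar effect by firing each SELECT as a background psql process (file names and connection details are placeholders):
#!/bin/bash
export PGPASSWORD=your_pw
for i in $(seq 1 20); do
  psql -h your_redshift -p 5439 -U your_username -d mydb \
       -f "query_${i}.sql" &     # & runs each query in the background
done
wait                             # return only once all 20 queries have finished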

Custom pg:dump options with Heroku pg:backups capture?

When developing, I need to pull the latest database so I know I'm working with the latest data. However, we keep a table full of Archives that I don't need to bother downloading because it's a very large table.
I know pg_dump allows for custom parameters that will let you exclude a certain table from being dumped.
Without doing anything crazy like having 2 databases, 1 for data and 1 for archives, is there any way to download everything BUT the archives table from Heroku?
I still need it to keep backups of the archives table, but I don't want to be downloading it. Can I just do a pg_dump when needed that is separate from the backups?
I know it's a long shot, but any suggestions would be greatly appreciated.
You can't add any custom pg_dump options when using heroku pg:backups capture. This command actually calls an undocumented Heroku Postgres API and it doesn't pass any parameters (see here for the code if you are curious).
What you can do is run your own pg_dump command that points to the Heroku Postgres instance.
Get the connection info with pg:credentials, where DATABASE_URL can also be the database color if you have more than one database attached to the app:
> heroku pg:credentials DATABASE_URL --app app_name
Connection info string:
"dbname=zzxcasdqwe host=ec2-1-1-1-1.compute-1.amazonaws.com port=1111 user=asdfasdf password=qwertyqwerty sslmode=require"
Connection URL:
postgres://asdfasdf:qwertyqwerty@ec2-1-1-1-1.compute-1.amazonaws.com:1111/zzxcasdqwe
Take either the connection info string or the connection URL and include that as the first argument to pg_dump, then add your custom options:
pg_dump "dbname=zzxcasdqwe host=ec2-1-1-1-1.compute-1.amazonaws.com port=1111 user=asdfasdf password=qwertyqwerty sslmode=require"\
-n schema -t table -O -x -Fc -f dump.out
# OR
pg_dump postgres://asdfasdf:qwertyqwerty@ec2-1-1-1-1.compute-1.amazonaws.com:1111/zzxcasdqwe \
-n schema -t table -O -x -Fc -f dump.out
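For the original use case - everything except the archives table - the same idea works with an exclude flag instead of the include flags above; a sketch, assuming the table is literally named archives:
pg_dump postgres://asdfasdf:qwertyqwerty@ec2-1-1-1-1.compute-1.amazonaws.com:1111/zzxcasdqwe \
  --exclude-table=archives -O -x -Fc -f dump.out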
I also co-wrote a Heroku plugin (parse_db_url) that will parse DATABASE_URLs into other formats like pg_dump, pg_restore, pgpass etc. I find it useful when dealing with several different Heroku databases.