DBT Cloud: Trying to conditionally create tables in a database based on a variable - amazon-redshift

In my profiles.yml file I have vars declared:
vars:
  database: dbtworkshop
And then within my models I have the following:
{{ config(database= var('database') ) }}
select
*
from {{var('database')}}.staging.orders
The query appears to compile correctly; however, I'm getting an error when passing the vars at runtime:
dbt run --vars '{"database":"a"}'
I get the error:
Postgres adapter: Postgres error: cross-database reference to database "a" is not supported
I'm not sure whether doing something like this (i.e. having models that I want to create in a database which changes based on the variable passed at runtime) is even possible with dbt, or if I'm just doing something wrong.

When using Redshift with dbt, make sure to set ra3_node: true in your profiles.yml, since the older DC2 node type does not support cross-database queries. RA3 nodes and Redshift Serverless do support them, and for those the setting is required.
See https://docs.getdbt.com/reference/warehouse-setups/redshift-setup
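As a sketch, the relevant Redshift target in profiles.yml might look like the following; the profile name, host, credentials, and thread count are placeholders, not values from the question:

my_profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: my-cluster.example.us-east-1.redshift.amazonaws.com   # placeholder
      user: dbt_user
      password: "{{ env_var('REDSHIFT_PASSWORD') }}"
      port: 5439
      dbname: dbtworkshop
      schema: staging
      threads: 4
      ra3_node: true   # needed for cross-database queries; only RA3 and Serverless support them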

Prisma with Bit.io connection issue

I have a Next.js app set up with Prisma (v3.13) as the ORM. I am testing out bit.io for DB hosting, and I am getting this error when trying to connect with the client. Everything works as intended when I use a local Postgres DB. I'm currently using a connection string that looks like the following:
DATABASE_URL="postgresql://[username]:[password]@db.bit.io/[username]/[dbname]"
I am trying to run prisma db push and getting the following error
Environment variables loaded from .env
Prisma schema loaded from prisma/schema.prisma
Datasource "db": PostgreSQL database "eli-front/rankstl", schema "public" at "db.bit.io:5432"
Error: P1000: Authentication failed against database server at `db.bit.io`, the provided database credentials for `(not available)` are not valid.
Please make sure to provide valid database credentials for the database server at `db.bit.io`.
I am assuming the core of the issue has to do with the part of the error that says credentials for '(not available)', as if something isn't loading correctly.
Using the failing connection string with psql works completely fine, but not with prisma.
There are two things that need to be done in order for bit.io to work with Prisma.
Database names must be formatted as username.dbname rather than username/dbname. bit.io supports a number of different separator characters in the database name because different clients have different requirements around permissible characters in database names.
You have to create a second database on bit.io to use as a "shadow database." By default, Prisma handles this automatically: a shadow database is created, used, and deleted. However, most cloud database providers don't allow use of the CREATE DATABASE statement, so a shadow database must be created explicitly. See the Prisma docs for details.
See the bit.io docs on connecting with Prisma for more details on setting up a minimum working connection.
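A minimal sketch of what that could look like; the SHADOW_DATABASE_URL variable name, the credentials, and the database names are assumptions and need to match your own setup:

# .env
DATABASE_URL="postgresql://myuser:mypassword@db.bit.io:5432/myuser.mydb"
SHADOW_DATABASE_URL="postgresql://myuser:mypassword@db.bit.io:5432/myuser.mydb_shadow"

// schema.prisma
datasource db {
  provider          = "postgresql"
  url               = env("DATABASE_URL")
  shadowDatabaseUrl = env("SHADOW_DATABASE_URL")
}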

AWS Glue ETL Job Missing collection name

I have Data Catalog tables generated by crawlers: one is a data source from MongoDB, and the second is a PostgreSQL (RDS) data source. The crawlers run successfully and the connection tests pass.
I am trying to define an ETL job from MongoDB to PostgreSQL (a simple transform).
In the job I defined the source as the AWS Glue Data Catalog (MongoDB) table and the target as the Data Catalog Postgres table.
When I run the job I get this error:
IllegalArgumentException: Missing collection name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.collection' property
It looks like this is related to the MongoDB part. I tried setting the 'database' and 'collection' parameters in the Data Catalog tables, but it didn't help.
The script generated for the source is:
AWSGlueDataCatalog_node1653400663056 = glueContext.create_dynamic_frame.from_catalog(
    database="data-catalog-db",
    table_name="data-catalog-table",
    transformation_ctx="AWSGlueDataCatalog_node1653400663056",
)
What could be missing?
I had the same problem; just add the additional_options parameter as shown below.
AWSGlueDataCatalog_node1653400663056 = glueContext.create_dynamic_frame.from_catalog(
    database="data-catalog-db",
    table_name="data-catalog-table",
    transformation_ctx="AWSGlueDataCatalog_node1653400663056",
    additional_options={"database": "data-catalog-db",
                        "collection": "data-catalog-table"},
)
Additional parameters can be found in the AWS documentation:
https://docs.aws.amazon.com/glue/latest/dg/connection-mongodb.html

How to log all DDLs on a gcloud SQL instance?

I would like to log all DDL queries that are run on a Cloud SQL instance. I tried looking into plugins, but they are not allowed.
Edit: It's a MySQL instance.
You can set the Cloud SQL flag log_statement to the value mod to log all data definition language (DDL) statements; mod also logs data-modifying statements such as INSERT, UPDATE, and DELETE.
Reference:
https://cloud.google.com/sql/docs/postgres/flags#postgres-l
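As a sketch, the flag can be set with the gcloud CLI (the instance name my-instance is a placeholder); note that changing database flags may restart the instance:

gcloud sql instances patch my-instance --database-flags=log_statement=mod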

How can I remove an enum label from pg_enum table in Google Cloud SQL?

I have a Node app (Express) that I built recently, which uses Sequelize to connect to a PostgreSQL instance. I just deployed this to GCP and used Cloud SQL to set up the database. This works quite well and the app properly connects to the DB instance.
However, when running my migrations I get this error: permission denied for table pg_enum
This happens on a single migration that I have which tries to remove an enum value from the database:
module.exports = {
  up: (queryInterface, Sequelize) =>
    queryInterface.sequelize.query(
      `DELETE FROM pg_enum WHERE enumlabel = 'to' AND enumtypid = (SELECT oid FROM pg_type WHERE typname = 'enum_Subscriptions_emailType')`
    ),
  down: (queryInterface, Sequelize) =>
    queryInterface.sequelize.query(`ALTER TYPE "enum_Subscriptions_emailType" ADD VALUE 'to';`)
}
I've read here that since Cloud SQL is a managed service, it doesn't provide superuser privileges to customers like me. Is there some other way that I can get this migration to run?
I've also tried running a Cloud Build process, but that also fails with the following error: ERROR: connect ENOENT /cloudsql/<project-id>:<project-region>:<db-instance>/.s.PGSQL.5432. For reference, my cloudbuild.yaml file looked like this:
steps:
  - name: 'gcr.io/cloud-builders/yarn'
    args: ['install']
  - name: 'gcr.io/cloud-builders/yarn'
    args: ['migrate']
  - name: 'gcr.io/cloud-builders/gcloud'
    args: ['app', 'deploy']
and my package.json scripts were:
"scripts": {
"start": "node index.js",
"migrate": "npx sequelize-cli db:migrate",
"dev": "set DEBUG=app:* && nodemon index.js",
"deploy": "gcloud builds submit --config cloudbuild.yaml ."
}
What else can I do to get around this, so that the migrations I have following this one can run and my app can function?
Removing a value from an enum is not supported by PostgreSQL. You can only add new ones or rename existing ones.
While modifying the system catalogue might work somewhat reliably, even that is not officially supported and requires superuser permissions for a reason, so without superuser access there is no way to do it.
The supported way to do what you want to do is to recreate the type without the value.
CREATE TYPE new_enum AS ENUM('a','b','c');
ALTER TABLE table_using_old_enum
ALTER COLUMN colum_using_old_enum
SET DATA TYPE new_enum
USING colum_using_old_enum::text::new_enum;
DROP TYPE old_enum;
ALTER TYPE new_enum RENAME TO old_enum;
This obviously only works if no entry in table_using_old_enum is still set to the value that you want to remove; you need to ensure this first, otherwise the typecast in the USING clause will fail for those values.
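Wrapped in a Sequelize migration, this could look roughly like the following sketch; the table name "Subscriptions", the column "emailType", and the remaining labels 'cc' and 'bcc' are assumptions based on Sequelize's naming conventions and need to match your actual schema:

module.exports = {
  // Recreate the enum type without the 'to' label (the supported approach).
  up: (queryInterface) =>
    queryInterface.sequelize.query(`
      CREATE TYPE "enum_Subscriptions_emailType_new" AS ENUM ('cc', 'bcc');
      ALTER TABLE "Subscriptions"
        ALTER COLUMN "emailType"
        SET DATA TYPE "enum_Subscriptions_emailType_new"
        USING "emailType"::text::"enum_Subscriptions_emailType_new";
      DROP TYPE "enum_Subscriptions_emailType";
      ALTER TYPE "enum_Subscriptions_emailType_new"
        RENAME TO "enum_Subscriptions_emailType";
    `),
  // Adding a label back is supported directly by PostgreSQL.
  down: (queryInterface) =>
    queryInterface.sequelize.query(
      `ALTER TYPE "enum_Subscriptions_emailType" ADD VALUE 'to';`
    ),
};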
--
Alternatively, you could also just comment out that query: if you already know that a later migration will remove the type completely, it shouldn't cause an issue if that value stays in until then.

How to set default database in Postgresql database dump script?

I need to initialize a Postgres instance in a Docker container from a SQL dump file. Otherwise it works fine, but the problem is that I cannot set the database to be anything other than "postgres". Creating a new database works fine, but schema statements, e.g. CREATE TABLE, end up going nowhere.
I tried to set the default database with the --env option in the docker run command, but it returns the error "--env requires a value".
Is there any way to set the default database, preferably in the SQL script itself?
Apparently you need to use \connect "dbname=[database name]" before the schema statements in order to point the script towards the correct database.
This wasn't (quite understandably) included in the script, since the dump was generated for a single database only instead of for the whole cluster.
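A minimal sketch of the top of such a dump script, assuming a placeholder database name mydb and an example table:

CREATE DATABASE mydb;
-- switch the psql session to the new database before any schema statements
\connect "dbname=mydb"

CREATE TABLE example (
    id serial PRIMARY KEY,
    name text
);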