Error when creating external table in Redshift Spectrum with dbt: cross-database reference not supported - amazon-redshift

I want to create an external table in Redshift Spectrum from CSV files. When I try doing so with dbt, I get a strange error. But when I manually remove some double quotes from the SQL generated by dbt and run it directly, I get no such error.
First I run this in Redshift Query Editor v2 on the default database, dev, in my cluster:
CREATE EXTERNAL SCHEMA example_schema
FROM DATA CATALOG
DATABASE 'example_db'
REGION 'us-east-1'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;
Database dev now has an external schema named example_schema (and the Glue Data Catalog registers example_db).
I then upload example_file.csv to the S3 bucket s3://example_bucket. The file looks like this:
col1,col2
1,a,
2,b,
3,c
Then I run dbt run-operation stage_external_sources in my local dbt project and get this output with an error:
21:03:03 Running with dbt=1.0.1
21:03:03 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.example_project.example_models
21:03:03 1 of 1 START external source example_schema.example_table
21:03:03 1 of 1 (1) drop table if exists "example_db"."example_schema"."example_table" cascade
21:03:04 Encountered an error while running operation: Database Error
cross-database reference to database "example_db" is not supported
I try running the generated SQL in Query Editor:
DROP TABLE IF EXISTS "example_db"."example_schema"."example_table" CASCADE
and get the same error message:
ERROR: cross-database reference to database "example_db" is not supported
But when I run this SQL in Query Editor, it works:
DROP TABLE IF EXISTS "example_db.example_schema.example_table" CASCADE
Note that I just removed some quotes.
What's going on here? Is this a bug in dbt-core, dbt-redshift, or dbt_external_tables--or just a mistake on my part?
To confirm, I can successfully create the external table by running this in Query Editor:
DROP SCHEMA IF EXISTS example_schema
DROP EXTERNAL DATABASE
CASCADE
;
CREATE EXTERNAL SCHEMA example_schema
FROM DATA CATALOG
DATABASE 'example_db'
REGION 'us-east-1'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;
CREATE EXTERNAL TABLE example_schema.example_table (
col1 SMALLINT,
col2 CHAR(1)
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 's3://example_bucket'
TABLE PROPERTIES ('skip.header.line.count'='1')
;
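One way to read the difference between the two DROP TABLE statements compared above: in SQL, everything inside a pair of double quotes is a single delimited identifier, dots included. The sketch below is a hypothetical helper (split_qualified_name is not part of dbt or Redshift) that splits a dotted name the way a SQL parser conventionally would. It suggests the hand-edited statement names one table whose name contains dots, so DROP TABLE IF EXISTS probably finds nothing and does nothing, rather than dropping example_table.

```python
def split_qualified_name(name: str) -> list:
    """Split a dotted SQL name into parts, keeping each "..." span as one identifier.

    Hypothetical helper for illustration only; it is not a dbt or Redshift API.
    """
    parts, current, in_quotes = [], [], False
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == '"':
            if in_quotes and i + 1 < len(name) and name[i + 1] == '"':
                current.append('"')  # a doubled quote inside quotes is an escaped quote
                i += 1
            else:
                in_quotes = not in_quotes
        elif ch == "." and not in_quotes:
            # an unquoted dot separates database / schema / table
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


dbt_name = '"example_db"."example_schema"."example_table"'
edited_name = '"example_db.example_schema.example_table"'

# Three parts: database, schema, table
print(split_qualified_name(dbt_name))
# One part: a single table name that happens to contain dots
print(split_qualified_name(edited_name))
```

If that reading is right, the statement generated by dbt is a genuine three-part reference whose first part is example_db, which would explain why Redshift treats it as a cross-database reference.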
dbt config files
models/example/schema.yml (modeled after this example):
version: 2
sources:
  - name: example_source
    database: dev
    schema: example_schema
    loader: S3
    tables:
      - name: example_table
        external:
          location: 's3://example_bucket'
          row_format: >
            serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
            with serdeproperties (
            'strip.outer.array'='false'
            )
        columns:
          - name: col1
            data_type: smallint
          - name: col2
            data_type: char(1)
dbt_project.yml:
name: 'example_project'
version: '1.0.0'
config-version: 2
profile: 'example_profile'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"
models:
  example_project:
    example:
      +materialized: view
packages.yml:
packages:
  - package: dbt-labs/dbt_external_tables
    version: 0.8.0

Related

Trying to use cloud storage on Db2 external tables

I am trying to use Db2 external tables with the Swift or Amazon S3 object storage option.
I use my own OpenStack Swift server.
Here are the steps in detail:
[i1156@lat111 ~]$ swift --auth-version 3 --os-auth-url http://myip:5000/v3 --os-project-name myproject --os-project-domain-name default --os-username user1 --os-password mypass list container1
outfile
[i1156@lat111 ~]$ db2 "CREATE EXTERNAL TABLE TB_EXTERNAL(COL1 VARCHAR(5)) USING (FORMAT TEXT DELIMITER '|' QUOTEDVALUE DOUBLE CCSID 1208 NULLVALUE 'NULL' NOLOG TRUE DATAOBJECT 'outfile' SWIFT ('http://myip:5000/v3', 'user1', 'mypass', 'container1'))"
DB20000I The SQL command completed successfully.
[i1156@lat111 ~]$ db2 "select * from TB_EXTERNAL"
COL1
-----
SQL20569N The external table operation failed due to a problem with the
corresponding data file or diagnostic files. File name: "outfile". Reason
code: "1". SQLSTATE=428IB
The result is the same when I try to use AWS S3 storage. Any ideas?
Thanks

liquibase default schema ignored in sql changelog

Problem: Liquibase can't find a table unless the schema is set explicitly in the SQL script.
How do I tell Liquibase to use the default schema in a SQL changelog?
Before adding the SQL changelog (which adds a check constraint), I created all tables without setting a schema. The schema was set in application.properties, and all tables were created correctly in $RM_DB_SCHEMA.
RM_DB_SCHEMA: MANAGER
RM_DB_URL: "jdbc:h2:file:~/rmdb;MODE=PostgreSQL;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE;AUTO_SERVER=TRUE;INIT=CREATE SCHEMA IF NOT EXISTS ${RM_DB_SCHEMA}"
RM_DB_USER: sa
RM_DB_PASSWORD: admin
RM_LB_USER: ${RM_DB_USER}
RM_LB_PASSWORD: ${RM_DB_PASSWORD}
spring:
  datasource:
    hikari:
      schema: ${RM_DB_SCHEMA}
      username: ${RM_DB_USER}
      password: ${RM_DB_PASSWORD}
      jdbc-url: ${RM_DB_URL}
  liquibase:
    change-log: "classpath:db/manager-changelog.xml"
    default-schema: ${RM_DB_SCHEMA}
    user: ${RM_LB_USER}
    password: ${RM_LB_PASSWORD}
  jpa:
    database: postgresql
Caused by: liquibase.exception.LiquibaseException: liquibase.exception.MigrationFailedException: Migration failed for change set changelog.xml::d::d:
Reason: liquibase.exception.DatabaseException: Table "STATUS" not found; SQL statement:
ALTER TABLE TEST ADD CONSTRAINT STATUS_ID CHECK (exists (SELECT 1 FROM STATUS s WHERE STATUS_ID = s.id)) [42102-200] [Failed SQL: (42102) ALTER TABLE TEST ADD CONSTRAINT STATUS_ID CHECK (exists (SELECT 1 FROM STATUS s WHERE STATUS_ID = s.id))]
I found another solution.
The problem occurred only in local development with H2, which always starts with PUBLIC as the current schema. I now add SET SCHEMA right after creating the schema.
In the test properties:
jdbc-url: 'jdbc:h2:file:~/rmdb;MODE=PostgreSQL;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE;AUTO_SERVER=TRUE;INIT=CREATE SCHEMA IF NOT EXISTS ${application.database.schema}\;SET SCHEMA ${application.database.schema}'

DataJpaTest: Numeric scale default seems to be 0 with spring-boot-starter 2.7.1

I have a DataJpaTest with some schema.sql and data.sql for preparing the postgresql in-memory database. I've just upgraded spring-boot-starter-parent from 2.6.3 to 2.7.1, and now the test fails.
schema:
CREATE TABLE IF NOT EXISTS some_table(
id BIGSERIAL,
name TEXT,
problematic_number NUMERIC NOT NULL
);
data:
INSERT INTO some_table (name, problematic_number) VALUES ('something', 1.4321);
For some reason a test is failing now with:
org.opentest4j.AssertionFailedError:
Expected :1.4321
Actual :1
I also connected to the H2 database and confirmed that the stored value really is "1" instead of "1.4321". Before the Spring Boot upgrade, the test passed.
Did the default scale for NUMERIC change? If I change my schema.sql to NUMERIC(10,4), the test succeeds.
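The symptom described here (1.4321 read back as 1, fixed by declaring NUMERIC(10,4)) is what a column with scale 0 would produce. The sketch below only illustrates that arithmetic with Python's decimal module; it is not H2 or Spring code. The store function is hypothetical, and the half-up rounding mode is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP


def store(value: Decimal, scale: int) -> Decimal:
    """Round a value to a fixed number of fractional digits (hypothetical model of a NUMERIC column)."""
    quantum = Decimal(1).scaleb(-scale)  # scale 0 -> 1, scale 4 -> 0.0001
    # ROUND_HALF_UP is an assumption; for 1.4321 most rounding modes give the same results
    return value.quantize(quantum, rounding=ROUND_HALF_UP)


inserted = Decimal("1.4321")

scale_0 = store(inserted, 0)  # models a NUMERIC column whose scale defaults to 0
scale_4 = store(inserted, 4)  # models NUMERIC(10,4)

print(scale_0, scale_4)
```

Under that assumption, declaring the precision and scale explicitly in schema.sql keeps the test independent of whatever default the database version applies.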

Delta Lake 1.1.0 location in spark local mode not work properly

I've updated some ETLs to spark 3.2.1 and delta lake 1.1.0. After doing this my local tests started to fail. After some debugging, I found that when I create an empty table with a specified location it is registered in the metastore with some prefix.
Say I try to create a table in the bronze database with spark-warehouse/users as the specified location:
spark.sql("""CREATE DATABASE IF NOT EXISTS bronze""")
spark.sql("""CREATE TABLE bronze.users (
| name string,
| active boolean
|)
|USING delta
|LOCATION 'spark-warehouse/users'""".stripMargin)
I end up with:
spark-warehouse/bronze.db/spark-warehouse/users registered on the metastore but with the actual files in spark-warehouse/users! This makes any query to the table fail.
I generated a sample repository: https://github.com/adrianabreu/delta-1.1.0-table-location-error-example/blob/master/src/test/scala/example/HelloSpec.scala
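The registered path looks like the relative LOCATION appended to the database directory. The sketch below reproduces only that string behaviour with plain path joining. It is a guess at the mechanism, not a confirmed description of Spark or Delta Lake internals. The resolve_location function and the /tmp path are hypothetical.

```python
import posixpath


def resolve_location(database_dir: str, location: str) -> str:
    """Join a table location onto a database directory (hypothetical model).

    posixpath.join appends a relative location and keeps an absolute one unchanged.
    """
    return posixpath.join(database_dir, location)


database_dir = "spark-warehouse/bronze.db"

relative = resolve_location(database_dir, "spark-warehouse/users")
absolute = resolve_location(database_dir, "/tmp/spark-warehouse/users")  # hypothetical absolute path

print(relative)  # matches the prefixed path seen in the metastore
print(absolute)  # an absolute location is left as-is by this model
```

If the real behaviour matches this model, passing an absolute location in the test setup might avoid the mismatch, though that is untested here.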

Spring Boot 2 - H2 Database - @SpringBootTest - Failing on org.h2.jdbc.JdbcSQLException: Table already exists

Unable to test Spring Boot & H2 with a script for creation of table using schema.sql.
So, what’s happening is that I have the following properties set:
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.initialization-mode=always
spring.datasource.username=sa
spring.datasource.password=
spring.datasource.platform=h2
spring.datasource.url=jdbc:h2:mem:city;MODE=PostgreSQL;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
spring.jpa.generate-ddl=false
spring.jpa.hibernate.ddl-auto=update
spring.jpa.show-sql=true
and I expect the tables to be created using schema.sql. The application works fine when I run gradle bootRun. However, when I run the tests using gradle test, my Repository tests pass, but the one for my Service fails, stating that it's trying to create the table when the table already exists:
Exception raised:
Caused by: org.h2.jdbc.JdbcSQLException: Table "CITY" already exists;
SQL statement:
CREATE TABLE city ( id BIGINT NOT NULL, country VARCHAR(255) NOT NULL, map VARCHAR(255) NOT NULL, name VARCHAR(255) NOT NULL, state VARCHAR(2555) NOT NULL, PRIMARY KEY (id) ) [42101-196]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.command.ddl.CreateTable.update(CreateTable.java:117)
at org.h2.command.CommandContainer.update(CommandContainer.java:101)
at org.h2.command.Command.executeUpdate(Command.java:260)
at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:192)
at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:164)
at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:95)
at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java)
at org.springframework.jdbc.datasource.init.ScriptUtils.executeSqlScript(ScriptUtils.java:471)
... 105 more
The code is set up and ready to reproduce the scenario. The README has all the information ->
https://github.com/tekpartner/learn-spring-boot-data-jpa-h2
If the tests are run individually, they pass. I think the problem is due to schema.sql being executed twice against the same database. It fails the second time as the tables already exist.
As a workaround, you could set spring.datasource.continue-on-error=true in application.properties.
Another option is to add the @AutoConfigureTestDatabase annotation where appropriate so that a unique embedded database is used for each test.
There are 2 other possible solutions you could try:
Add a drop table if exists [tablename] in your schema.sql before you create the table.
Change the statement from CREATE TABLE to CREATE TABLE IF NOT EXISTS
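The two schema.sql suggestions above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for H2 (a swapped-in database, chosen only because it needs no setup), with a simplified version of the city table. It runs the same DDL twice against one in-memory database, as the answer suspects happens in the failing test run.

```python
import sqlite3

# One shared in-memory database, standing in for the H2 instance reused across tests
conn = sqlite3.connect(":memory:")

plain = "CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
guarded = "CREATE TABLE IF NOT EXISTS city (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"

# Running the plain statement twice fails the second time
conn.execute(plain)
try:
    conn.execute(plain)
    second_plain_run = "ok"
except sqlite3.OperationalError as exc:
    second_plain_run = str(exc)
print(second_plain_run)

# Option 1: CREATE TABLE IF NOT EXISTS can be run repeatedly
conn.execute(guarded)
conn.execute(guarded)

# Option 2: drop first, then create; also safe to repeat, but discards existing rows
conn.execute("DROP TABLE IF EXISTS city")
conn.execute(plain)
table_count = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'table' AND name = 'city'"
).fetchone()[0]
print(table_count)
```

The drop-then-create variant resets the table on every run, which may or may not be wanted when data.sql is also loaded. The IF NOT EXISTS variant leaves existing data in place.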