Create table in kafka using ksqldb-server - apache-kafka

I am trying to create a kafka table using the (Confluent) ksqldb-server via its REST interface using the following code (bash script):
KSQLDB_COMMAND="CREATE TABLE sample_table \
(xkey VARCHAR, \
xdata VARCHAR) \
WITH (KAFKA_TOPIC=\'sample-topic\', \
VALUE_FORMAT=\'JSON\', \
KEY=\'xkey\'); "
COMMAND="curl -X 'POST' '$KSQLDB_SERVER' \
-H 'Content-Type: application/vnd.ksql.v1+json; charset=utf-8' \
-d '{ \"ksql\": \"$KSQLDB_COMMAND\" }' "
eval $COMMAND
The following error output message is returned:
{"@type":"statement_error","error_code":40001,"message":"Failed to prepare statement: Invalid config variable(s) in the WITH clause: KEY","statementText":"CREATE TABLE sample_table (xkey VARCHAR, xdata VARCHAR) WITH (KAFKA_TOPIC='sample-topic', VALUE_FORMAT='JSON', KEY='xkey');","entities":[]}
The message points to a problem in the statement itself, in particular with the KEY attribute.
I can get basic commands ("LIST STREAMS" etc.) working through the REST interface but cannot create tables, so I figure the problem is either in the KSQL statement or in how I build the bash command (in the "COMMAND" variable).
Any help is appreciated.

I spent a fair bit of time experimenting and got this simple example working. My original attempt required too many bash variable substitutions to be useful or maintainable, so this version is simplified quite a bit. I also found that KSQLDB table names must follow regular SQL naming conventions (i.e. letters, underscores, etc., but no hyphens); hyphens caused a bunch of errors in my original attempt. I should have read the documentation more carefully.
The following works (you may need to change your KSQLDB server address), and with minimal changes just about any KSQLDB command can be executed:
####
# NOTE: table MUST be alpha (underscores are OK)... hyphens are not allowed
####
KSQLDB_SERVER="http://localhost:8088/ksql"
KSQLDB_TABLE="some_table"
KSQLDB_TOPIC="some_topic"
VALUE_FORMAT="JSON"
FMT="{ \"ksql\": \"CREATE TABLE %s (key VARCHAR PRIMARY KEY, data VARCHAR) WITH (KAFKA_TOPIC='%s', VALUE_FORMAT='%s');\" }"
JSON_DATA=$(printf "$FMT" "$KSQLDB_TABLE" "$KSQLDB_TOPIC" "$VALUE_FORMAT")
curl -X "POST" "$KSQLDB_SERVER" \
-H "Content-Type: application/vnd.ksql.v1+json; charset=utf-8" \
-d "$JSON_DATA"
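If you want the name check to happen before the request is sent, the payload can be built in a small function. This is only a sketch: build_create_table_payload is a hypothetical helper name, and the "letters, digits and underscores only" test is a simplified reading of the naming rule described above, not the full ksqlDB identifier grammar.

```shell
# Hypothetical helper: build the JSON body for a CREATE TABLE request.
# Assumption: a valid table name contains only letters, digits and underscores.
build_create_table_payload() {
    table=$1 topic=$2 format=$3
    case $table in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid table name: $table" >&2
            return 1 ;;
    esac
    fmt="{ \"ksql\": \"CREATE TABLE %s (key VARCHAR PRIMARY KEY, data VARCHAR) WITH (KAFKA_TOPIC='%s', VALUE_FORMAT='%s');\" }"
    # shellcheck disable=SC2059  # the format string is fixed above
    printf "$fmt" "$table" "$topic" "$format"
}

payload=$(build_create_table_payload some_table some_topic JSON) || exit 1
printf '%s\n' "$payload"

# To send it (assumes a ksqlDB server listening at this address):
# curl -X POST "http://localhost:8088/ksql" \
#   -H "Content-Type: application/vnd.ksql.v1+json; charset=utf-8" \
#   -d "$payload"
```

A hyphenated name such as some-table makes the function return non-zero, so the script stops before curl is ever called.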

You can't specify KEY for a table; KEY is used for streams. For a table, use PRIMARY KEY in the column declaration, like this:
CREATE OR REPLACE TABLE TABLE_1
(
ID INT PRIMARY KEY,
EMAILADDRESS VARCHAR,
ISPRIMARY BOOLEAN,
USERID INT,
PARANT INT
)WITH (KAFKA_TOPIC='test_1', VALUE_FORMAT='AVRO', KEY_FORMAT='AVRO');

Related

Columns combined when using `select ...pgp_sym_decrypt ()` from postgresql database through Bash psql

System is Debian 11.2 with PostgreSQL 11.5.
I created a database and table as below:
CREATE DATABASE dbname OWNER=postgres
ENCODING= 'UTF8';
\c dbname
CREATE TABLE test(
id serial primary key,
site varchar(100) NOT NULL,
username char(30) NOT NULL,
password char(300) NOT NULL,
note varchar(200) DEFAULT NULL
);
I created a bash file as below:
#!/bin/bash
res_user='me'
db_user='postgres'
db_name='dbname'
table_name='test'
sym_key='key'
#insert 4 columns
su $db_user <<EOFU
psql -d "$db_name" -U "$db_user" << EOF
INSERT INTO $table_name (site,username,password,note) VALUES ('v4','u3',pgp_sym_encrypt('password','key','cipher-algo=aes128,compress-algo=0,convert-crlf=1,sess-key=0,s2k-mode=3'),'note3');
EOF
EOFU
#column note has no output
password_arr=($(su $db_user <<EOFU
psql -tAq --field-separator= -d "$db_name" -U "$db_user" << EOF
SELECT "username",pgp_sym_decrypt(password::bytea,'key'),"note" FROM "$table_name" WHERE "site" LIKE '%v4%';
EOF
EOFU
))
echo "${password_arr[1]}" #output is passwordnote3
echo "${password_arr[2]}" #no output?
The expected output is:
${password_arr[1]} is `password`
${password_arr[2]} is `note3`
When I run the above bash script, "${password_arr[2]}" has no value and "${password_arr[1]}" is passwordnote3. Where is the problem?
I found the issue: --field-separator was set to an empty string instead of a space. It should be --field-separator=" ". With an empty separator, the output of pgp_sym_decrypt() runs straight into note. The username field still appeared separated, probably because char(30) values are padded with spaces to their fixed width of 30.
I also suggest limiting the query to a single row and enabling the noglob option (set -f) whenever you rely on word splitting. You can also use read to extract the needed fields. See How to split a string into an array in Bash?.
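As a rough illustration of the read approach, here is a sketch that splits one line on '|', which is the separator psql -tA uses when --field-separator is not given. The sample line is made up to mimic one output row (the padded username comes from the char(30) column); in the real script it would come from the psql call. Because read does not perform pathname expansion, set -f is not needed for this form.

```shell
# Made-up sample of one row as printed by: psql -tAq ... (default '|' separator)
line='u3                            |password|note3'

# Split on '|' only; spaces inside a field are left alone.
IFS='|' read -r username password note <<EOF
$line
EOF

# Strip the trailing padding that char(30) adds to username.
username=${username%"${username##*[! ]}"}

printf 'username=[%s] password=[%s] note=[%s]\n' "$username" "$password" "$note"
```

This still breaks if a decrypted password itself contains '|', so pick a separator that cannot occur in the data.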

Docker PSQL Query Issue: "Column <column_name> does not exist"

I'm trying to perform a db query through a docker inline command within a shell script.
myscript.sh:
docker run -it --rm -c "psql -U ${DB_USER} -d ${DB_NAME} -h ${DB_HOST}\
-c 'select col1, col2 , col3 from table1\
where table1.col2 = \"matching_text\" order by col1;'"
But I get an odd error:
ERROR: column "matching_text" does not exist
LINE 1: ...ndow where table1.col2 = "matching_t...
For some reason when I run this, psql thinks the matching_text in my query is referring to a column name. How would I get around this?
Note: Our database is implemented as a psql docker container.
The Postgres manual explains you need to use single quotes:
A string constant in SQL is an arbitrary sequence of characters bounded by single quotes ('), for example 'This is a string'. To include a single-quote character within a string constant, write two adjacent single quotes, e.g., 'Dianne''s horse'. Note that this is not the same as a double-quote character (").
See section 4.1.2.1 of the postgres manual.
Double quotes are for table or column identifiers:
There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing an arbitrary sequence of characters in double-quotes ("). A delimited identifier is always an identifier, never a key word. So "select" could be used to refer to a column or table named "select", whereas an unquoted select would be taken as a key word and would therefore provoke a parse error when used where a table or column name is expected. The example can be written with quoted identifiers like this:
UPDATE "my_table" SET "a" = 5;
See section 4.1.1 of the same manual.
A combination of the answer here and another post solved the issue:
Use single quotes for the string literal in the query.
Use double quotes around the argument to -c in the psql command (Answer thread).
docker run -it --rm -c "psql -U ${DB_USER} -d ${DB_NAME} -h ${DB_HOST}\
-c \"select col1, col2 , col3 from table1\
where table1.col2 = 'matching_text' order by col1;\""
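When the matching text comes from a shell variable, the quoting rule from the manual (wrap in single quotes, double any embedded single quote) can be applied in a small helper before the query is assembled. This is a sketch only: sql_quote_literal is a hypothetical name, and it assumes the server runs with standard_conforming_strings on (the default since PostgreSQL 9.1), so backslashes need no special treatment.

```shell
# Hypothetical helper: turn a shell string into a single-quoted SQL literal.
sql_quote_literal() {
    # Double every embedded single quote, then wrap the result in single quotes.
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

match='matching_text'
sql="select col1, col2, col3 from table1 where table1.col2 = $(sql_quote_literal "$match") order by col1;"
printf '%s\n' "$sql"
```

The resulting string can then be passed as the double-quoted argument of psql -c, as in the command above.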

NextVal of postgresql usage

CREATE SEQUENCE :schema.empseq;
CREATE TABLE emp(empid bigint NOT NULL DEFAULT NEXTVAL(':schema.empseq'));
I run it like this: psql -d dbname -U username -f emp.sql -v schema=post
and get this error:
schema ":schema" does not exist
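The documentation here talks about how psql interpolates values into SQL. psql does not substitute variables inside quoted SQL literals, so ':schema.empseq' reaches the server unchanged; writing :'schema' instead yields the variable's value as a quoted literal.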
CREATE SEQUENCE :schema.empseq;
CREATE TABLE emp(empid bigint NOT NULL DEFAULT NEXTVAL(:'schema' || '.empseq'));
might work for you.
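To try this from a script, the SQL file can be written with a quoted heredoc so that the shell leaves the psql variables alone. A sketch, using a temporary file in place of emp.sql; the psql call is left commented out because the connection details are the question's own.

```shell
sql_file=$(mktemp)

# The quoted delimiter ('SQL') stops the shell from expanding anything inside.
cat > "$sql_file" <<'SQL'
CREATE SEQUENCE :schema.empseq;
CREATE TABLE emp(empid bigint NOT NULL DEFAULT NEXTVAL(:'schema' || '.empseq'));
SQL

cat "$sql_file"

# With -v schema=post, psql sends these statements to the server:
#   CREATE SEQUENCE post.empseq;
#   CREATE TABLE emp(empid bigint NOT NULL DEFAULT NEXTVAL('post' || '.empseq'));
# psql -d dbname -U username -f "$sql_file" -v schema=post
```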

Psql COPY with constraint fails

I have a table like this in the server:
CREATE TABLE example_table (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(70) NOT NULL,
status VARCHAR(70) NOT NULL CONSTRAINT status_enum CHECK (status IN ('old', 'new')),
UNIQUE (id, name)
);
And I have an SQL file, example.sql, whose first line contains a header:
name_of_class,status
'CLASSNAME','old';
And I try to run a psql \copy against a Google server:
PGPASSWORD=password psql -d database --username username --port 5432 --host 11.111.111 << EOF
BEGIN;
\copy example_table(name,status) FROM example.sql DELIMITER ',' CSV Header
COMMIT;
EOF
I then get this error:
ERROR: new row for relation "example_table" violates check constraint "status_enum"
DETAIL: Failing row contains (1, 'CLASSNAME', 'old';).
CONTEXT: COPY example_table, line 2: "'CLASSNAME','old';"
ROLLBACK
Any idea how to solve this? 🙂
It appears that your source CSV uses ' (single quote) to quote every column. By default, \copy in CSV mode expects double quotes, so it tries to load 'old' (quotes included) into the status column, whose check constraint only accepts old or new. The extra quotes violate the constraint.
You can declare the single quote as the quote character with the QUOTE option:
\copy example_table(name,status) FROM example.sql DELIMITER ',' CSV Header QUOTE ''''
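Four single quotes are required: the outer two delimit the string literal, and the inner two are an escaped single quote, which is the actual quote character.
Note that the failing row also shows the trailing ; as part of the value ('old';), so that character will likely have to be removed from the data file as well.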
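One way to prepare the file before loading is to strip a trailing ; from every line, since anything after the closing quote is read as data. A minimal sketch: it recreates the two-line sample from the question in a temporary directory, and example.csv is a made-up name for the cleaned copy.

```shell
workdir=$(mktemp -d)

# Recreate the sample file from the question.
printf '%s\n' "name_of_class,status" "'CLASSNAME','old';" > "$workdir/example.sql"

# Drop a trailing ';' from each line and write the cleaned copy.
sed 's/;$//' "$workdir/example.sql" > "$workdir/example.csv"
cat "$workdir/example.csv"

# The cleaned file can then be loaded with the single quote as quote character:
# \copy example_table(name,status) FROM example.csv DELIMITER ',' CSV HEADER QUOTE ''''
```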

Generate DDL programmatically on Postgresql

How can I generate the DDL of a table programmatically on Postgresql? Is there a system query or command to do it? Googling the issue returned no pointers.
Use pg_dump with these options:
pg_dump -U user_name -h host database -s -t table_or_view_names -f table_or_view_names.sql
Description:
-s or --schema-only : Dump only the DDL (the object definitions, i.e. the schema), without data.
-t or --table : Dump only tables (or views or sequences) matching the given table pattern.
Examples:
# dump the DDL of each table elon built
$ pg_dump -U elon -h localhost -s -t spacex -t tesla -t solarcity -t boring > companies.sql
Sorry if this is off topic. I just want to help anyone who googles "psql dump ddl" and lands on this thread.
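If the list of tables varies, the repeated -t options can be generated in a small wrapper. This is just a sketch: dump_ddl and the PG_DUMP override are hypothetical names (the override exists only so the command line can be printed with echo as a dry run), and somedb is a placeholder database name.

```shell
# Hypothetical wrapper: dump_ddl USER HOST DB OUTFILE TABLE...
dump_ddl() {
    user=$1 host=$2 db=$3 out=$4
    shift 4
    # Rewrite the remaining arguments from "a b c" into "-t a -t b -t c".
    for t in "$@"; do
        set -- "$@" -t "$t"
        shift
    done
    "${PG_DUMP:-pg_dump}" -U "$user" -h "$host" -s "$@" -f "$out" "$db"
}

# Dry run: print the arguments pg_dump would receive instead of running it.
PG_DUMP=echo dump_ddl elon localhost somedb companies.sql spacex tesla solarcity boring
```

Drop the PG_DUMP=echo prefix to run the real pg_dump.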
You can use the pg_dump command to dump the contents of the database (both schema and data). The --schema-only switch will dump only the DDL for your table(s).
Why would shelling out to psql not count as "programmatically?" It'll dump the entire schema very nicely.
Anyhow, you can get data types (and much more) from the information_schema (8.4 docs referenced here, but this is not a new feature):
=# select column_name, data_type from information_schema.columns
-# where table_name = 'config';
    column_name     | data_type
--------------------+-----------
 id                 | integer
 default_printer_id | integer
 master_host_enable | boolean
(3 rows)
The answer is to check the source code for pg_dump and follow the switches it uses to generate the DDL. Somewhere inside the code there's a number of queries used to retrieve the metadata used to generate the DDL.
Here is a good article on how to get the meta information from information schema,
http://www.alberton.info/postgresql_meta_info.html.
I saved 4 functions that partially mock up the behaviour of pg_dump -s, based on the \d+ metacommand. The usage would be something like:
\pset format unaligned
select get_ddl_t(schemaname,tablename) as "--" from pg_tables where tableowner <> 'postgres';
Of course, you have to create the functions beforehand.
Working sample here at rextester