Using PARTITION BY in KSQL, but the topic generated by the STREAM has garbled keys - apache-kafka

CREATE STREAM TEST_STREAM_JSON
  (id INT,
   age INT,
   name VARCHAR)
WITH (KAFKA_TOPIC = 'test_partition_key_stream', VALUE_FORMAT = 'JSON');

CREATE STREAM TEST_STREAM_AVRO WITH (PARTITIONS = 3, FORMAT = 'AVRO')
AS SELECT ID AS IDPK, AS_VALUE(ID) AS ID, AGE, NAME FROM TEST_STREAM_JSON
PARTITION BY ID;
But the keys of the generated topic TEST_STREAM_AVRO show up as garbled characters on the UI side.
I have tried setting KEY_FORMAT to JSON; it does not work.
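One likely cause (an assumption, since the UI output isn't shown): FORMAT = 'AVRO' applies to the key as well as the value, so the key is written in the Avro wire format (a magic byte and schema-registry ID prefixed to the payload), which a UI expecting a plain string renders as garbage. A sketch that keeps the key in the plain KAFKA format while only the value is Avro:

```sql
-- Sketch: serialize only the value as Avro; keep the key as a plain
-- Kafka-serialized INT so topic UIs can display it.
-- Assumes the same source stream TEST_STREAM_JSON as above.
CREATE STREAM TEST_STREAM_AVRO
  WITH (PARTITIONS = 3, KEY_FORMAT = 'KAFKA', VALUE_FORMAT = 'AVRO')
  AS SELECT ID AS IDPK, AS_VALUE(ID) AS ID, AGE, NAME
  FROM TEST_STREAM_JSON
  PARTITION BY ID;
```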

Related

Flink SQL-CLi: bring header records

I'm new to the Flink SQL CLI and I want to create a sink from my Kafka cluster.
I've read the documentation, and as I understand it the headers are a MAP<STRING, BYTES> type and they carry all the important information.
Using the SQL CLI, I try to create a sink table with this command:
CREATE TABLE KafkaSink (
`headers` MAP<STRING, BYTES> METADATA
) WITH (
'connector' = 'kafka',
'topic' = 'MyTopic',
'properties.bootstrap.servers' ='LocalHost',
'properties.group.id' = 'MyGroypID',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'json'
);
But when I try to read the data with select * from KafkaSink limit 10; it returns null records.
I've tried to run queries like
select headers.col1 from a limit 10;
And also, I've tried to create the sink table with different structures at selecting columns part:
...
`headers` STRING
...
...
`headers` MAP<STRING, STRING>
...
...
`headers` ROW(COL1 VARCHAR, COL2 VARCHAR...)
...
But it returns nothing; however, when I select the offset column from the Kafka cluster it returns the offset but not the headers.
Can someone explain my error?
I want to create a Kafka sink with the Flink SQL CLI.
OK, as far as I can see, when I changed to
'format' = 'debezium-json'
I could see the JSON more clearly.
I followed the JSON schema; in my case it was
{
  "data": {...},
  "metadata": {...}
}
So instead of reading the headers I'm reading the data with all the columns I need: the data as a string, and the columns as, for example,
data.col1, data.col2
To see the records, just a
select
  json_value(data, '$.Col1') as Col1
from Table;
works!
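For reference, a sketch of how header metadata is declared on a Kafka table in Flink SQL (the value columns and topic name here are assumptions; header values arrive as raw bytes and typically need a CAST or UDF to decode):

```sql
-- Sketch: a source table exposing both value columns and record headers.
-- Column names (id, name) and connection settings are placeholders.
CREATE TABLE KafkaSource (
  id BIGINT,
  name STRING,
  `headers` MAP<STRING, BYTES> METADATA  -- read-only Kafka header map
) WITH (
  'connector' = 'kafka',
  'topic' = 'MyTopic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'MyGroupID',
  'scan.startup.mode' = 'earliest-offset',
  'value.format' = 'json'
);
```

Note that if `headers` comes back NULL for every row, the records on the topic may simply have been produced without any headers attached.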

KSQLDB Create Stream Select AS with Headers

Is there a way in ksqlDB to add headers while creating a stream with AS SELECT?
For example, I have a stream DomainOrgs (DomainId INT, OrgId INT, someId INT); now I need to create a stream with all the values in DomainOrgs, and DomainId should also go into a header.
I tried to create it like
CREATE STREAM DomainOrgs_with_header AS
SELECT DomainId,
OrgId,
someId,
DomainId HEADER('DomainId')
FROM DomainOrgs
EMIT CHANGES;
Also tried
CREATE STREAM DomainOrgs_with_header
(
DomainId INT,
OrgId INT,
someId INT,
DomainId_Header Header('DomainId')
)
INSERT INTO DomainOrgs_with_header
SELECT DomainId,OrgId,someId,DomainId FROM DomainOrgs
Here, the stream will be created, but the INSERT INTO will fail.
Is there any way to select data into a stream with headers?
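A hedged note on why both attempts fail: in ksqlDB, HEADER columns are read-only metadata. They can only be declared on a source stream to *read* the headers of incoming records; as far as the documentation goes, there is no syntax to *write* headers from CREATE STREAM ... AS SELECT or INSERT INTO. A sketch of the supported (read) direction, with an assumed topic name:

```sql
-- Sketch: HEADER columns must be BYTES and only expose incoming headers.
-- Topic name 'DomainOrgs' is an assumption.
CREATE STREAM DomainOrgs_src (
  DomainId INT,
  OrgId INT,
  someId INT,
  DomainId_Header BYTES HEADER('DomainId')  -- reads the 'DomainId' header
) WITH (KAFKA_TOPIC = 'DomainOrgs', VALUE_FORMAT = 'JSON');
```

Writing headers would have to happen in whatever producer populates the underlying topic, not in ksqlDB itself.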

How to use a few fields in the key of a Kafka stream when joining a stream and a table?

I have a stream with coupon info from a topic:
CREATE STREAM personal_coupons
(Coupon VARCHAR KEY,
CouponType VARCHAR,
MarketingArea VARCHAR,
CouponCode VARCHAR,
CouponName VARCHAR) WITH
(KAFKA_TOPIC = 'Coupons_Personal',
VALUE_FORMAT = 'JSON');
And I have a table with two fields, Coupon and GUID:
CREATE TABLE coupon_and_guid(Coupon varchar PRIMARY KEY,
guid varchar)
WITH (KAFKA_TOPIC = 'Coupon_GUID',VALUE_FORMAT = 'JSON');
I try to join them with:
CREATE STREAM coupon_with_guid WITH (KEY_FORMAT = 'JSON', VALUE_FORMAT = 'JSON') AS
SELECT
personal_coupons.Coupon,
COUPON_AND_GUID.guid,
CouponType,
MarketingArea,
AS_VALUE(personal_coupons.Coupon),
CouponCode,
CouponName
FROM personal_coupons
LEFT JOIN coupon_and_guid ON personal_coupons.Coupon = coupon_and_guid.coupon
PARTITION BY personal_coupons.Coupon,COUPON_AND_GUID.guid EMIT CHANGES;
And I get a message in this format:
key: {"PERSONAL_COUPONS_COUPON":"{\"Coupon\":\"1-2NAZTM69\"}","GUID":null}
value: {"COUPONTYPE":"MULTI","COUPONCONTACTRELATIONSHIPTYPE":"03","MARKETINGAREA":"VKUSOMANIA","KSQL_COL_0":"{\"Coupon\":\"1-2NAZTM69\"}","COUPONORIGIN":"Siebel","COUPONSTATUS":"01","LANGUAGE":"RU","COUPONCODE":"9001196300379670","COUPONNAME":"1000275479000214"}
But I want to get:
key: {"Coupon":"1-2NAZTM69","GUID":null}
value: {"COUPONTYPE":"MULTI","COUPONCONTACTRELATIONSHIPTYPE":"03","MARKETINGAREA":"VKUSOMANIA","Coupon":"1-2NAZTM69","COUPONORIGIN":"Siebel","COUPONSTATUS":"01","LANGUAGE":"RU","COUPONCODE":"9001196300379670","COUPONNAME":"1000275479000214"}
What did I do wrong and how can I fix it?
I found this solution, using the EXTRACTJSONFIELD ksql function:
CREATE STREAM coupon_with_guid WITH (KEY_FORMAT = 'JSON', VALUE_FORMAT = 'JSON') AS
SELECT
EXTRACTJSONFIELD(personal_coupons.Coupon, '$.Coupon') as Coupon,
COUPON_AND_GUID.guid,
CouponType,
MarketingArea,
AS_VALUE(personal_coupons.Coupon),
CouponCode,
CouponName
FROM personal_coupons
LEFT JOIN coupon_and_guid ON personal_coupons.Coupon = coupon_and_guid.coupon
PARTITION BY EXTRACTJSONFIELD(personal_coupons.Coupon, '$.Coupon'),COUPON_AND_GUID.guid EMIT CHANGES;
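The fact that EXTRACTJSONFIELD works suggests the source topic's key is itself a JSON object like {"Coupon":"..."} that was declared as a plain VARCHAR. An alternative sketch (the key layout is an assumption) declares the key as a STRUCT up front, so no JSON parsing is needed downstream:

```sql
-- Sketch: declare the JSON key as a struct rather than a VARCHAR.
-- Assumes the topic key is a JSON object of the form {"Coupon": "..."}.
CREATE STREAM personal_coupons_struct (
  K STRUCT<Coupon VARCHAR> KEY,
  CouponType VARCHAR,
  MarketingArea VARCHAR,
  CouponCode VARCHAR,
  CouponName VARCHAR
) WITH (
  KAFKA_TOPIC = 'Coupons_Personal',
  KEY_FORMAT = 'JSON',
  VALUE_FORMAT = 'JSON'
);

-- The join could then partition by the unwrapped field directly, e.g.:
--   ... PARTITION BY K->Coupon, coupon_and_guid.guid EMIT CHANGES;
```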

ksqlDB get message's value as string column from Json topic

There is a topic containing plain JSON messages. I'm trying to create a new stream by extracting a few columns from the JSON, plus another VARCHAR column holding the whole message value.
Here is a sample message in a topic_json
{
"db": "mydb",
"collection": "collection",
"op": "update"
}
Creating a stream like:
CREATE STREAM test1 (
  db VARCHAR,
  collection VARCHAR,
  VAL STRING
) WITH (
  KAFKA_TOPIC = 'topic_json',
  VALUE_FORMAT = 'JSON'
);
The output of this stream will contain only the db and collection columns. How can I add another column holding the message's whole value, i.e. "{\"db\":\"mydb\",\"collection\":\"collection\",\"op\":\"update\"}"?
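One possible approach (a sketch, not a verified answer): read the value as a single raw string using the KAFKA format, then pull the individual fields out with EXTRACTJSONFIELD, keeping the original string as its own column:

```sql
-- Sketch: first expose the whole message value as one VARCHAR column.
-- The KAFKA format reads the value as a primitive, so no JSON parsing
-- happens at ingest time. Stream name topic_json_raw is an assumption.
CREATE STREAM topic_json_raw (VAL VARCHAR)
  WITH (KAFKA_TOPIC = 'topic_json', VALUE_FORMAT = 'KAFKA');

-- Then derive the typed columns, keeping VAL alongside them.
CREATE STREAM test1 AS
  SELECT EXTRACTJSONFIELD(VAL, '$.db')         AS db,
         EXTRACTJSONFIELD(VAL, '$.collection') AS collection,
         VAL
  FROM topic_json_raw;
```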

Grails doesn't create foreign key column in PostgreSQL

I am having trouble generating my tables in PostgreSQL from Grails. I have simple Email and EmailAttachment domain classes with a hasMany and belongsTo relationship. This setup worked well on our production server (AS400 DB2), but when I try to run my program on PostgreSQL (the new dev environment), the Email class does not have the attachment_id column.
Email.groovy:
class Email {
static hasMany = [attachments:EmailAttachment]
Integer id
Integer version = 0
String subject
String recipients
String sender
Date sentDate
String plainTextMessage
Set attachments
static mapping = {
datasources(['DEFAULT'])
table name:'error_email', schema: Appointment.schema
sort sentDate:'desc'
}
static constraints = {
subject nullable:true
version nullable:true
recipients nullable:true
sender nullable:true
sentDate nullable:true
plainTextMessage nullable:true
attachments nullable:true
}
def String toString(){
return subject
}
}
EmailAttachment.groovy:
class EmailAttachment {
static belongsTo = [email:ErrorEmail]
ErrorEmail email
String filename
byte[] content
static mapping = {
datasources(['DEFAULT'])
table name:'error_email_attachment', schema: Appointment.schema
}
static constraints = {
filename nullable:true
content nullable:true
}
}
Also, here are the relevant lines from schema-export:
alter table program.email_attachment drop constraint FK2E592AFD1D80E229;
drop table program.email cascade;
drop table program.email_attachment cascade;
drop sequence hibernate_sequence;
create table program.email (id int4 not null, version int4, plain_text_message varchar(255), recipients varchar(255), sender varchar(255), sent_date timestamp, subject varchar(255), primary key (id));
create table program.email_attachment (id int8 not null, version int8 not null, content bytea, email_id int4 not null, filename varchar(255), primary key (id));
alter table program.email_attachment add constraint FK2E592AFD1D80E229 foreign key (email_id) references program.error_email;
create sequence hibernate_sequence;
I've tried specifying a join table (attachments joinTable: [name: 'email_table', column: 'attachment_id', key: 'id']) to no avail, as well as leaving it out and trying other collection types for attachments. Thanks in advance for your time and brain cells.
The email table doesn't have an attachment_id column because it's the one side of the one-to-many. The many side, email_attachment in this case, holds the reference to its owning email in the email_id int4 not null column. Given an email with id 42, you (and Grails/GORM/Hibernate) can find all of its attachments by querying for all rows in that table with email_id = 42.
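The lookup described above can be sketched as a plain query against the generated schema (table and column names are taken from the schema-export output; the id 42 is just an example):

```sql
-- Find all attachments belonging to the email with id 42.
-- The foreign key lives on the many side (email_attachment.email_id),
-- so no attachment_id column is needed on the email table.
SELECT id, filename
FROM program.email_attachment
WHERE email_id = 42;
```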