CREATE STREAM TEST_STREAM_JSON
  (id INT,
   age INT,
   name VARCHAR)
WITH (KAFKA_TOPIC = 'test_partition_key_stream', VALUE_FORMAT = 'JSON');

CREATE STREAM TEST_STREAM_AVRO WITH (PARTITIONS = 3, FORMAT = 'AVRO')
AS SELECT ID AS IDPK, AS_VALUE(ID) AS ID, AGE, NAME FROM TEST_STREAM_JSON
PARTITION BY ID;
But the keys of the generated topic TEST_STREAM_AVRO come out as garbled characters in the UI.
I have tried setting KEY_FORMAT to JSON, but it doesn't work.
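One variant that might be worth comparing (a sketch only, on the assumption that the garbling comes from the key being Avro-serialized binary): keep the Avro value but pin the key format to the plain Kafka serializer, so the INT key is written as-is.

```sql
-- Sketch: same query as above, but with an explicit KEY_FORMAT so the
-- key is a plain serialized INT rather than Avro binary.
CREATE STREAM TEST_STREAM_AVRO
  WITH (PARTITIONS = 3, KEY_FORMAT = 'KAFKA', VALUE_FORMAT = 'AVRO')
AS SELECT ID AS IDPK, AS_VALUE(ID) AS ID, AGE, NAME FROM TEST_STREAM_JSON
PARTITION BY ID;
```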
I'm new to the Flink SQL CLI and I want to create a sink from my Kafka cluster.
I've read the documentation, and as I understand it the headers are a MAP<STRING, BYTES> type, and all the important information comes through them.
When I use the sql-cli, I try to create a sink table with the following command:
CREATE TABLE KafkaSink (
`headers` MAP<STRING, BYTES> METADATA
) WITH (
'connector' = 'kafka',
'topic' = 'MyTopic',
'properties.bootstrap.servers' ='LocalHost',
'properties.group.id' = 'MyGroypID',
'scan.startup.mode' = 'earliest-offset',
'value.format' = 'json'
);
But when I try to read the data with select * from KafkaSink limit 10; it returns null records.
I've tried to run queries like
select headers.col1 from a limit 10;
I've also tried to create the sink table with different structures in the column-definition part:
...
`headers` STRING
...
...
`headers` MAP<STRING, STRING>
...
...
`headers` ROW(COL1 VARCHAR, COL2 VARCHAR...)
...
But it returns nothing. However, when I bring in the offset column from the Kafka cluster, I get the offset but not the headers.
Can someone explain my error?
I want to create a Kafka sink with the Flink SQL CLI.
OK, as far as I can tell, when I changed the format to
'format' = 'debezium-json'
I could see the JSON much more clearly.
I followed the JSON schema; in my case it was
{
"data": {...},
"metadata":{...}
}
So instead of bringing in the headers, I'm bringing in the data with all the columns I need: the data as a string, and the columns as, for example,
data.col1, data.col2
To see the records, a simple
select
json_value(data, '$.Col1') as Col1
from Table;
works!
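Put together, the read side (a sketch; Col1 and Col2 are hypothetical fields inside the data payload, and KafkaSink is the table created above) looks like:

```sql
-- Hypothetical sketch: `data` is the payload exposed as a string column;
-- JSON_VALUE pulls individual fields out of it by JSON path.
SELECT
  JSON_VALUE(data, '$.Col1') AS Col1,
  JSON_VALUE(data, '$.Col2') AS Col2
FROM KafkaSink;
```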
Is there a way in ksqlDB to add headers while creating a stream with AS SELECT?
For example, I have a stream DomainOrgs(DomainId INT, OrgId INT, someId INT). Now I need to create a stream with all the values in DomainOrgs, and DomainId should also go into the header.
I tried to create it like this:
CREATE STREAM DomainOrgs_with_header AS
SELECT DomainId,
OrgId,
someId,
DomainId HEADER('DomainId')
FROM DomainOrgs
EMIT CHANGES;
I also tried:
CREATE STREAM DomainOrgs_with_header
(
DomainId INT,
OrgId INT,
someId INT,
DomainId_Header Header('DomainId')
)
INSERT INTO DomainOrgs_with_header
SELECT DomainId,OrgId,someId,DomainId FROM DomainOrgs
Here, the stream will be created, but the INSERT INTO will fail.
Is there any way to select data into a stream with headers?
I have a stream with coupon info from a topic:
CREATE STREAM personal_coupons
(Coupon VARCHAR KEY,
CouponType VARCHAR,
MarketingArea VARCHAR,
CouponCode VARCHAR,
CouponName VARCHAR) WITH
(KAFKA_TOPIC = 'Coupons_Personal',
VALUE_FORMAT = 'JSON');
And I have a table with two fields, Coupon and GUID:
CREATE TABLE coupon_and_guid(Coupon varchar PRIMARY KEY,
guid varchar)
WITH (KAFKA_TOPIC = 'Coupon_GUID',VALUE_FORMAT = 'JSON');
I try to join them with:
CREATE STREAM coupon_with_guid WITH (KEY_FORMAT = 'JSON', VALUE_FORMAT = 'JSON') AS
SELECT
personal_coupons.Coupon,
COUPON_AND_GUID.guid,
CouponType,
MarketingArea,
AS_VALUE(personal_coupons.Coupon),
CouponCode,
CouponName
FROM personal_coupons
LEFT JOIN coupon_and_guid ON personal_coupons.Coupon = coupon_and_guid.coupon
PARTITION BY personal_coupons.Coupon,COUPON_AND_GUID.guid EMIT CHANGES;
And I get messages in this format:
key: {"PERSONAL_COUPONS_COUPON":"{\"Coupon\":\"1-2NAZTM69\"}","GUID":null}
value: {"COUPONTYPE":"MULTI","COUPONCONTACTRELATIONSHIPTYPE":"03","MARKETINGAREA":"VKUSOMANIA","KSQL_COL_0":"{\"Coupon\":\"1-2NAZTM69\"}","COUPONORIGIN":"Siebel","COUPONSTATUS":"01","LANGUAGE":"RU","COUPONCODE":"9001196300379670","COUPONNAME":"1000275479000214"}
But I want to get:
key: {"Coupon":"1-2NAZTM69","GUID":null}
value: {"COUPONTYPE":"MULTI","COUPONCONTACTRELATIONSHIPTYPE":"03","MARKETINGAREA":"VKUSOMANIA","Coupon":"1-2NAZTM69","COUPONORIGIN":"Siebel","COUPONSTATUS":"01","LANGUAGE":"RU","COUPONCODE":"9001196300379670","COUPONNAME":"1000275479000214"}
What did I do wrong, and how can I fix it?
I found this solution, using the EXTRACTJSONFIELD ksql function:
CREATE STREAM coupon_with_guid WITH (KEY_FORMAT = 'JSON', VALUE_FORMAT = 'JSON') AS
SELECT
EXTRACTJSONFIELD(personal_coupons.Coupon, '$.Coupon') as Coupon,
COUPON_AND_GUID.guid,
CouponType,
MarketingArea,
AS_VALUE(personal_coupons.Coupon),
CouponCode,
CouponName
FROM personal_coupons
LEFT JOIN coupon_and_guid ON personal_coupons.Coupon = coupon_and_guid.coupon
PARTITION BY EXTRACTJSONFIELD(personal_coupons.Coupon, '$.Coupon'),COUPON_AND_GUID.guid EMIT CHANGES;
There is a topic containing plain JSON messages. I'm trying to create a new stream by extracting a few columns from the JSON, plus another column, as VARCHAR, holding the message's entire value.
Here is a sample message in topic_json:
{
"db": "mydb",
"collection": "collection",
"op": "update"
}
I'm creating a stream like this:
CREATE STREAM test1 (
db VARCHAR,
collection VARCHAR,
VAL STRING
) WITH (
KAFKA_TOPIC = 'topic_json',
VALUE_FORMAT = 'JSON'
);
The output of this stream will contain only the db and collection columns. How can I add another column containing the message's value, i.e. "{\"db\":\"mydb\",\"collection\":\"collection\",\"op\":\"update\"}"?
I am having trouble generating my tables in PostgreSQL from Grails. I have simple Email and EmailAttachment domain classes with a hasMany and belongsTo relationship. This setup worked well on our production server (AS400 DB2), but when I try to run my program on PostgreSQL (the new dev environment), the Email class does not have the attachment_id column.
Email.groovy:
class Email {
static hasMany = [attachments:EmailAttachment]
Integer id
Integer version = 0
String subject
String recipients
String sender
Date sentDate
String plainTextMessage
Set attachments
static mapping = {
datasources(['DEFAULT'])
table name:'error_email', schema: Appointment.schema
sort sentDate:'desc'
}
static constraints = {
subject nullable:true
version nullable:true
recipients nullable:true
sender nullable:true
sentDate nullable:true
plainTextMessage nullable:true
attachments nullable:true
}
def String toString(){
return subject
}
}
EmailAttachment.groovy:
class EmailAttachment {
static belongsTo = [email:ErrorEmail]
ErrorEmail email
String filename
byte[] content
static mapping = {
datasources(['DEFAULT'])
table name:'error_email_attachment', schema: Appointment.schema
}
static constraints = {
filename nullable:true
content nullable:true
}
}
Also, here are the relevant lines from schema-export:
alter table program.email_attachment drop constraint FK2E592AFD1D80E229;
drop table program.email cascade;
drop table program.email_attachment cascade;
drop sequence hibernate_sequence;
create table program.email (id int4 not null, version int4, plain_text_message varchar(255), recipients varchar(255), sender varchar(255), sent_date timestamp, subject varchar(255), primary key (id));
create table program.email_attachment (id int8 not null, version int8 not null, content bytea, email_id int4 not null, filename varchar(255), primary key (id));
alter table program.email_attachment add constraint FK2E592AFD1D80E229 foreign key (email_id) references program.error_email;
create sequence hibernate_sequence;
I've tried specifying joinTable: attachments joinTable:[name: 'email_table', column: 'attachment_id', key: 'id'] to no avail, as well as leaving it out, and trying other collection types for attachment. Thanks in advance for your time and brain cells.
The email doesn't have an attachment_id column because it's the one side of the one-to-many. The many side, attachment in this case, has a reference to its owning email in the email_id int4 not null column. Given an email with id 42 you (and Grails/GORM/Hibernate) can find all of the attachments by querying for all rows in that table with email_id=42.
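For example, against the schema-export DDL above, that lookup is just a query on the foreign-key column of the many side:

```sql
-- All attachments belonging to the email with id 42, found via the
-- email_id foreign-key column on program.email_attachment.
SELECT id, filename
FROM program.email_attachment
WHERE email_id = 42;
```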