ClickHouse JSON parse exception: Cannot parse input: expected ',' before

I'm trying to add JSON data to ClickHouse from Kafka. Here's the simplified JSON:
{
  ...
  "sendAddress": {
    "sendCommChannelTypeId": 4,
    "sendCommChannelTypeCode": "SMS",
    "sendAddress": "789345345945"
  },
  ...
}
Here are the steps: create a table in ClickHouse, create a second table using the Kafka engine, and create a MATERIALIZED VIEW to connect the two tables (and thereby connect ClickHouse with Kafka).
Creating the first table:
CREATE TABLE tab
(
    ...
    sendAddress Tuple(sendCommChannelTypeId Int32, sendCommChannelTypeCode String, sendAddress String),
    ...
)
ENGINE = MergeTree()
PARTITION BY applicationId
ORDER BY (applicationId);
Creating a second table with the Kafka engine SETTINGS:
CREATE TABLE tab_kfk
(
    ...
    sendAddress Tuple(sendCommChannelTypeId Int32, sendCommChannelTypeCode String, sendAddress String),
    ...
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'topk2',
         kafka_group_name = 'group1',
         kafka_format = 'JSONEachRow',
         kafka_row_delimiter = '\n';
Creating the MATERIALIZED VIEW:
CREATE MATERIALIZED VIEW tab_mv TO tab AS
SELECT ... sendAddress, ...
FROM tab_kfk;
Then I try to SELECT everything, or specific columns, from the first table (tab) and get nothing. The log shows the parse exception quoted in the title.
OK. I tried adding '[]' around the curly braces in sendAddress, like this:
"authkey":"some_value",
"sendAddress":[{
"sendCommChannelTypeId":4,
"sendCommChannelTypeCode":"SMS",
"sendAddress":"789345345945"
}]
And I still get an error, just a slightly different one.
What should I do to fix this problem? Thanks!

There are 3 ways to fix it:
Don't use nested objects: flatten the messages before inserting them into the Kafka topic. For example:
{
  ..
  "authkey": "key",
  "sendAddress_CommChannelTypeId": 4,
  "sendAddress_CommChannelTypeCode": "SMS",
  "sendAddress": "789345345945",
  ..
}
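For completeness, a flattened Kafka-engine table matching that message might look like this (a sketch: the applicationId/authkey columns and settings are carried over from the question, the table name and everything else are assumptions):
CREATE TABLE tab_kfk_flat
(
    applicationId Int32,
    authkey String,
    sendAddress_CommChannelTypeId Int32,
    sendAddress_CommChannelTypeCode String,
    sendAddress String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'topk2',
         kafka_group_name = 'group1',
         kafka_format = 'JSONEachRow';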
Use a Nested data structure, which requires changing both the JSON message schema and the table schema:
{
  ..
  "authkey": "key",
  "sendAddress.sendCommChannelTypeId": [4],
  "sendAddress.sendCommChannelTypeCode": ["SMS"],
  "sendAddress.sendAddress": ["789345345945"],
  ..
}
CREATE TABLE tab_kfk
(
    applicationId Int32,
    ..
    sendAddress Nested(
        sendCommChannelTypeId Int32,
        sendCommChannelTypeCode String,
        sendAddress String),
    ..
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'topk2',
         kafka_group_name = 'group1',
         kafka_format = 'JSONEachRow',
         kafka_row_delimiter = '\n',
         input_format_import_nested_json = 1; /* <--- */
Take note of the setting input_format_import_nested_json.
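The MergeTree target and the materialized view then need matching Nested columns; a minimal sketch (only the fields shown above; the rest of the schema is assumed to mirror the original tab table):
CREATE TABLE tab_nested
(
    applicationId Int32,
    sendAddress Nested(
        sendCommChannelTypeId Int32,
        sendCommChannelTypeCode String,
        sendAddress String)
)
ENGINE = MergeTree()
PARTITION BY applicationId
ORDER BY applicationId;

CREATE MATERIALIZED VIEW tab_mv_nested TO tab_nested AS
SELECT
    applicationId,
    sendAddress.sendCommChannelTypeId,
    sendAddress.sendCommChannelTypeCode,
    sendAddress.sendAddress
FROM tab_kfk;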
Interpret the input JSON message as a string and parse it manually (see GitHub issue #16969):
CREATE TABLE tab_kfk
(
    message String
)
ENGINE = Kafka
SETTINGS
    ..
    kafka_format = 'JSONAsString', /* <--- */
    ..

CREATE MATERIALIZED VIEW tab_mv TO tab
AS
SELECT
    ..
    JSONExtractString(message, 'authkey') AS authkey,
    JSONExtract(message, 'sendAddress', 'Tuple(Int32,String,String)') AS sendAddress,
    ..
FROM tab_kfk;
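Once the view is in place, the tuple elements can be read back positionally with tupleElement, for example:
SELECT
    tupleElement(sendAddress, 1) AS sendCommChannelTypeId,
    tupleElement(sendAddress, 2) AS sendCommChannelTypeCode,
    tupleElement(sendAddress, 3) AS sendAddressValue
FROM tab;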

Reading this commit, I believe that as of release 23.1 a new setting, input_format_json_read_objects_as_strings, allows reading nested JSON objects as String.
Example:
SET input_format_json_read_objects_as_strings = 1;
CREATE TABLE test (id UInt64, obj String, date Date) ENGINE=Memory();
INSERT INTO test FORMAT JSONEachRow {"id" : 1, "obj" : {"a" : 1, "b" : "Hello"}, "date" : "2020-01-01"};
SELECT * FROM test;
Result:
id    obj                         date
1     {"a" : 1, "b" : "Hello"}    2020-01-01
See the docs.
Of course, you can still use a materialized view to convert the String object to the correct column types, using the same techniques as for parsing the JSONAsString format.
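A minimal sketch of that follow-up conversion, reusing the test table from the example above (the tuple type is an assumption based on the sample object):
SELECT
    id,
    JSONExtract(obj, 'Tuple(a Int64, b String)') AS parsed,
    date
FROM test;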

Related

Flink SQL-CLI: bringing in header records

I'm new to the Flink SQL CLI and I want to create a sink from my Kafka cluster.
I've read the documentation, and as I understand it the headers are of type MAP<STRING, BYTES> and all the important information comes through them.
When using the SQL CLI, I try to create a sink table with this command:
CREATE TABLE KafkaSink (
    `headers` MAP<STRING, BYTES> METADATA
) WITH (
    'connector' = 'kafka',
    'topic' = 'MyTopic',
    'properties.bootstrap.servers' = 'LocalHost',
    'properties.group.id' = 'MyGroypID',
    'scan.startup.mode' = 'earliest-offset',
    'value.format' = 'json'
);
But when I try to read the data with select * from KafkaSink limit 10; it returns null records.
I've tried to run queries like
select headers.col1 from a limit 10;
I've also tried creating the sink table with different structures for the column definitions:
...
`headers` STRING
...
...
`headers` MAP<STRING, STRING>
...
...
`headers` ROW(COL1 VARCHAR, COL2 VARCHAR...)
...
But it returns nothing; when I bring in the offset column from the Kafka cluster I get the offset, but not the headers.
Can someone explain my error?
I want to create a Kafka sink with the Flink SQL CLI.
OK, as far as I can tell, when I changed to
'format' = 'debezium-json'
I could see the JSON much better.
I followed the JSON schema; in my case it was:
{
  "data": {...},
  "metadata": {...}
}
So instead of bringing in the headers, I'm bringing in the data with all the columns I need: the data as a string, and the columns as, for example, data.col1, data.col2.
To see the records, something as simple as
select
    json_value(data, '$.Col1') as Col1
from Table;
works!
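For reference, a loose sketch of that approach (the table name and payload column are assumptions based on the description above, and whether the nested object can be read straight into a STRING column depends on the format and Flink version):
CREATE TABLE MySource (
    `data` STRING
) WITH (
    'connector' = 'kafka',
    'topic' = 'MyTopic',
    'properties.bootstrap.servers' = 'LocalHost',
    'properties.group.id' = 'MyGroupID',
    'scan.startup.mode' = 'earliest-offset',
    'format' = 'debezium-json'
);

SELECT
    json_value(`data`, '$.Col1') AS Col1,
    json_value(`data`, '$.Col2') AS Col2
FROM MySource;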

Error while creating a new TYPO3 extension with Extension Builder

I'm trying to create a new extension with two tables:
- Alert, with 3 fields:
  - title (string type)
  - content (relation: we must be able to select a content element in the back office)
  - news (relation: we must be able to select a news record from the existing table tx_news_domain_model_news)
- AlertUserMM (this table is used to link the Alert table and the User table), with 2 fields:
  - Alert (relation with the table Alert)
  - User (relation with the table fe_users)
Here is my Extension Builder setup (screenshot omitted).
When I try to save it, I get a few errors:
Warning! The configuration for table "pages" is not compatible with extbase. You have to configure it yourself if you want to map to this table (Error 606)
For this error, I can choose to save anyway or not.
When I delete the relation with the table pages, I get this TYPO3 error:
Argument 1 passed to EBT\ExtensionBuilder\Domain\Model\ClassObject\MethodParameter::setTypeHint() must be of the type string, null given, called in /home/dev/rta/htdocs/typo3conf/ext/extension_builder/Classes/Service/ClassBuilder.php on line 394
I don't know where the problem is; does anyone have an idea?
UPDATE
Here is my updated Extension Builder setup (screenshot omitted).
I managed to build my extension, and I modified my TypoScript like this to map the "pages" table:
config.tx_extbase {
    persistence {
        enableAutomaticCacheClearing = 1
        updateReferenceIndex = 0
        classes {
            Ewill\EwillAlerte\Domain\Model\Contenu {
                mapping {
                    tableName = pages
                    recordType = Tx_EwillAlerte_Contenu
                    columns {
                        uid.mapOnProperty = uid
                        title.mapOnProperty = title
                        sorting.mapOnProperty = sorting
                    }
                }
            }
            Ewill\EwillAlerte\Domain\Model\Actualite {
                mapping {
                    tableName = tx_news_domain_model_news
                    recordType = Tx_EwillAlerte_Actualite
                }
            }
            Ewill\EwillAlerte\Domain\Model\Utilisateur {
                mapping {
                    tableName = fe_users
                    recordType = Tx_EwillAlerte_Utilisateur
                }
            }
        }
    }
}
But when I install my extension in the Extension Manager, I get this error:
[SQL Error] line 0, col 22: Error: Expected BIT, TINYINT, SMALLINT, MEDIUMINT, INT, INTEGER, BIGINT, REAL, DOUBLE, FLOAT, DECIMAL, NUMERIC, DATE, TIME, TIMESTAMP, DATETIME, YEAR, CHAR, VARCHAR, BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, ENUM, SET, or JSON, got ';' in statement: CREATE TABLE pages ( );
Do I have to modify my ext_tables.sql? With only the fields that I map in my TypoScript? Is there anything else to add? Any particular syntax?
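For reference, ext_tables.sql only needs to declare columns you add yourself; partial definitions for existing tables such as pages are merged with the core schema, and an empty CREATE TABLE pages ( ); statement like the one in the error above is not valid SQL. A sketch with a hypothetical extra field:
-- Hypothetical: declare only additional columns for the existing pages table.
CREATE TABLE pages (
    tx_ewillalerte_example varchar(255) DEFAULT '' NOT NULL
);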

how to convert map<anydata> to json

In my CRUD REST service I do an insert into a DB and want to respond to the caller with the newly created record. I am looking for a nice way to convert the map to JSON.
I am running on Ballerina 0.991.0 and using PostgreSQL.
The return of the update ("INSERT ...") is a map.
I tried with convert and stamp, but it did not work for me.
import ballerinax/jdbc;
...
jdbc:Client certificateDB = new({
    url: "jdbc:postgresql://localhost:5432/certificatedb",
    username: "USER",
    password: "PASS",
    poolOptions: { maximumPoolSize: 5 },
    dbOptions: { useSSL: false }
});
...
var ret = certificateDB->update("INSERT INTO certificates(certificate, typ, scope_) VALUES (?, ?, ?)", certificate, typ, scope_);
// here is the data; it is map<anydata>
ret.generatedKeys
The map should know which data type it holds, right? Then it should be easy to convert it to JSON, like this:
{"certificate":"{certificate:
"-----BEGIN
CERTIFICATE-----\nMIIFJjCCA...tox36A7HFmlYDQ1ozh+tLI=\n-----END
CERTIFICATE-----", typ: "mqttCertificate", scope_: "QARC", id_:
223}"}
Right now I do a foreach and build the JSON manually. Quite ugly. Maybe somebody has tips on how to do this in a nicer way.
It can't be ruled out that this is due to my lack of programming skills :-)
The return value of the JDBC update remote function is sql:UpdateResult|error.
sql:UpdateResult is a record with two fields (refer to https://ballerina.io/learn/api-docs/ballerina/sql.html#UpdateResult):
- updatedRowCount, of type int: the number of rows affected/updated by the given statement execution.
- generatedKeys, of type map: a map of the auto-generated column values produced by the update operation (only if the corresponding table has auto-generated columns), given as key-value pairs of column name and column value. So this map contains only the auto-generated column values.
But your requirement is to get the entire row inserted by the given update function. That can't be returned by the update operation itself. To get it, you have to execute a JDBC select operation with the matching criteria. The select operation returns a table or an error, and that table can easily be converted to JSON using the convert() function.
For example, let's say the certificates table has an auto-generated primary key column named 'cert_id'. Then you can retrieve that id value with the following code.
int generatedID = <int>updateRet.generatedKeys.CERT_ID;
Then use that generated id to query the data:
var ret = certificateDB->select("SELECT certificate, typ, scope_ FROM certificates where id = ?", (), generatedID);
json convertedJson = {};
if (ret is table<record {}>) {
    var jsonConversionResult = json.convert(ret);
    if (jsonConversionResult is json) {
        convertedJson = jsonConversionResult;
    }
}
Refer to the example at https://ballerina.io/learn/by-example/jdbc-client-crud-operations.html for more details.
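Putting the pieces together, a sketch (assuming, as above, an auto-generated key column CERT_ID and a primary key column id_; untested against 0.991.0):
import ballerina/sql;
import ballerinax/jdbc;

// Insert a row, read back the generated key, then select the full row
// and convert the resulting table to json.
function insertAndFetch(jdbc:Client db, string certificate, string typ, string scope_) returns json {
    var updateRet = db->update("INSERT INTO certificates(certificate, typ, scope_) VALUES (?, ?, ?)",
                               certificate, typ, scope_);
    json convertedJson = {};
    if (updateRet is sql:UpdateResult) {
        int generatedID = <int>updateRet.generatedKeys.CERT_ID;
        var selectRet = db->select("SELECT certificate, typ, scope_, id_ FROM certificates WHERE id_ = ?",
                                   (), generatedID);
        if (selectRet is table<record {}>) {
            var jsonConversionResult = json.convert(selectRet);
            if (jsonConversionResult is json) {
                convertedJson = jsonConversionResult;
            }
        }
    }
    return convertedJson;
}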

Kotlin Exposed/Postgresql is lower-casing my table name in queries; how to use capitalized table names?

I have the following SQL query using Kotlin Exposed to a Postgres server with a capitalized table name:
object Table : IntIdTable("Table") {
    val tC = text("Text")
    val vC = text("Value")
}

Database.connect("jdbc:postgresql://...", driver = "org.postgresql.Driver")
transaction {
    logger.addLogger(StdOutSqlLogger)
    val query = Table.select {
        Table.id eq 5
    }
    query.forEach {
        println(it[Table.tC])
    }
}
But I am getting back:
Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation "table" does not exist
Usually I would simply be able to quote the table name ("Table") to use the capitalized name, but I can't seem to do that with Kotlin Exposed. So, is there a way to use the capitalized table name by preventing it from being lowercased?
I was able to resolve this by using escaped quotes within the table name string; an example for the above question would be as follows:
object Table : IntIdTable("\"Table\"") {
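For context, here is the full object and query with the escaped name (a sketch: the import paths vary between Exposed versions, and whether the capitalized column names also need the same escaping depends on how the table was created on the server):
import org.jetbrains.exposed.dao.IntIdTable
import org.jetbrains.exposed.sql.Database
import org.jetbrains.exposed.sql.select
import org.jetbrains.exposed.sql.transactions.transaction

// Pre-quoting the table name keeps Postgres from folding it to lower case.
object Table : IntIdTable("\"Table\"") {
    val tC = text("Text")
    val vC = text("Value")
}

fun main() {
    Database.connect("jdbc:postgresql://...", driver = "org.postgresql.Driver")
    transaction {
        Table.select { Table.id eq 5 }.forEach { println(it[Table.tC]) }
    }
}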
Could you provide the whole sample and point to the place where the exception is thrown? From the current code it's unclear who is trying to create the relation to the table, and how.

F#: insert into MongoDB using records

I've been trying for a while to insert into MongoDB using only records, with no success.
My problem is that I want to create a simple insert function to which I pass a generic value, and it gets inserted into the database. Like so:
let insert (value: 'a) =
    let collection = MongoClient().GetDatabase("db").GetCollection<'a> "col"
    collection.InsertOne value
Using this function, I tried inserting the following records:
// Error that it can't set the Id
type t1 = {
    Id: ObjectId
    Text: string
}

// Creates the record perfectly but doesn't generate a new Id
type t2 = {
    Id: string
    Text: string
}

// Creates the record and autogenerates the Id, but doesn't insert the Text,
// and there are two Ids (_id, Id#)
type t3 = {
    mutable Id: ObjectId
    Text: string
}

// Creates the record and autogenerates the Id, but for every property it
// generates two on MongoDB (_id, Id#, Text, Text#)
type t4 = {
    mutable Id: ObjectId
    mutable Text: string
}
So, can anyone think of a solution for this, or am I stuck with having to use a class?
// Works!!!
type t5() =
    member val Id = ObjectId.Empty with get, set
    member val Name = "" with get, set
Also, does anyone have any idea why, when the C# MongoDB library translates the mutable fields, the properties end up with a # at the end?
I would be fine with having all my properties mutable, although that wouldn't be my first choice; having it create multiple properties in the DB is quite bad.
You could try annotating your records with CLIMutable (and no mutable fields).
The #s end up in the DB because MongoDB uses reflection, and F# implements mutable record fields with backing fields named fieldName#.
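A minimal sketch of that suggestion (assumes the same insert function as in the question; untested):
open MongoDB.Bson
open MongoDB.Driver

// CLIMutable makes the compiler emit a parameterless constructor and
// property setters for the record, so the driver can set Id without the
// fieldName# backing-field properties leaking into the document.
[<CLIMutable>]
type T1 = {
    Id: ObjectId
    Text: string
}

let insert (value: 'a) =
    let collection = MongoClient().GetDatabase("db").GetCollection<'a> "col"
    collection.InsertOne value

insert { Id = ObjectId.Empty; Text = "hello" }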