Issue with subqueries in OrientDb - orientdb

I am trying to write a query that returns information across classes (I plan to convert the results to edges later using the output of this query).
This is my test setup (I am using OrientDB 3.0.14):
CREATE CLASS users
INSERT INTO users CONTENT {"name": "user1", "state_code": "CA"}
INSERT INTO users CONTENT {"name": "user2", "state_code": "VA"}
INSERT INTO users CONTENT {"name": "other3", "state_code": "FL"}
CREATE CLASS states
INSERT INTO states CONTENT {"code": "CA", "name": "California"}
INSERT INTO states CONTENT {"code": "VA", "name": "Virginia"}
INSERT INTO states CONTENT {"code": "FL", "name": "Florida"}
Now, this query works fine; I can see the expected results:
SELECT name, state_code, $state FROM users LET $state=(SELECT FROM states WHERE code=$parent.$current.state_code)
+----+------+----------+---------------------------+
|# |name |state_code|$state |
+----+------+----------+---------------------------+
|0 |user1 |CA |[{code:CA,name:California}]|
|1 |user2 |VA |[{code:VA,name:Virginia}] |
|2 |other3|FL |[{code:FL,name:Florida}] |
+----+------+----------+---------------------------+
So I tried to add a subquery to filter and return only some of the records in users:
SELECT name, state_code, $state FROM (SELECT FROM users WHERE name LIKE 'user%') LET $state=(SELECT FROM states WHERE code=$parent.$current.state_code)
+----+-----+----------+---------------------------+
|# |name |state_code|$state |
+----+-----+----------+---------------------------+
|0 |user1|CA |[{code:CA,name:California}]|
|1 |user2|VA |[{code:CA,name:California}]|
+----+-----+----------+---------------------------+
I cannot find a way to make the calculated value $state return the proper values; it seems to be stuck on the first record.
I really appreciate any help you can give me to figure out how to fix this query.

Related

The stream created in ksqlDB shows NULL value

I am trying to create a stream in ksqlDB to read data from a Kafka topic and query it.
CREATE STREAM test_location (
  id VARCHAR,
  name VARCHAR,
  location VARCHAR
)
WITH (KAFKA_TOPIC='public.location',
      VALUE_FORMAT='JSON',
      PARTITIONS=10);
The data in the topic public.location is in JSON format.
UPDATED: here is a topic message.
print 'public.location' from beginning limit 1;
Key format: ¯\_(ツ)_/¯ - no data processed
Value format: JSON or KAFKA_STRING
rowtime: 2021/05/23 11:27:39.429 Z, key: <null>, value: {"sourceTable":{"id":"1","name":Sam,"location":Manchester,"ConnectorVersion":null,"connectorId":null,"ConnectorName":null,"DbName":null,"DbSchema":null,"TableName":null,"payload":null,"schema":null},"ConnectorVersion":null,"connectorId":null,"ConnectorName":null,"DbName":null,"DbSchema":null,"TableName":null,"payload":null,"schema":null}, partition: 3
After creating the stream, performing a SELECT on it returns NULL in the output, although the topic has data.
select * from test_location
>EMIT CHANGES limit 5;
+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+
|ID |NAME |LOCATION |
+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+
|null |null |null |
|null |null |null |
|null |null |null |
|null |null |null |
|null |null |null |
Limit Reached
Query terminated
Here are the details from the Docker Compose file:
version: '2'
services:
  ksqldb-server:
    image: confluentinc/ksqldb-server:0.18.0
    hostname: ksqldb-server
    container_name: ksqldb-server
    depends_on:
      - schema-registry
    ports:
      - "8088:8088"
    environment:
      KSQL_LISTENERS: "http://0.0.0.0:8088"
      KSQL_BOOTSTRAP_SERVERS: "broker:29092"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      # Configuration to embed Kafka Connect support.
      KSQL_CONNECT_GROUP_ID: "ksql-connect-01"
      KSQL_CONNECT_BOOTSTRAP_SERVERS: "broker:29092"
      KSQL_CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      KSQL_CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
      KSQL_CONNECT_CONFIG_STORAGE_TOPIC: "_ksql-connect-01-configs"
      KSQL_CONNECT_OFFSET_STORAGE_TOPIC: "_ksql-connect-01-offsets"
      KSQL_CONNECT_STATUS_STORAGE_TOPIC: "_ksql-connect-01-statuses"
      KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      KSQL_CONNECT_PLUGIN_PATH: "/usr/share/kafka/plugins"
Update:
Here is a message in the topic as I see it in Kafka:
{
  "sourceTable": {
    "id": "1",
    "name": Sam,
    "location": Manchester,
    "ConnectorVersion": null,
    "connectorId": null,
    "ConnectorName": null,
    "DbName": null,
    "DbSchema": null,
    "TableName": null,
    "payload": null,
    "schema": null
  },
  "ConnectorVersion": null,
  "connectorId": null,
  "ConnectorName": null,
  "DbName": null,
  "DbSchema": null,
  "TableName": null,
  "payload": null,
  "schema": null
}
Which step or configuration am I missing?
Given your payload, you need to declare the schema as nested, because id, name, and location are not top-level fields in the JSON; they are nested within sourceTable.
CREATE STREAM est_location (
  sourceTable STRUCT<id VARCHAR, name VARCHAR, location VARCHAR>
)
WITH (KAFKA_TOPIC='public.location',
      VALUE_FORMAT='JSON');
It's not possible to "unwrap" the data when defining the schema; the schema must match what is in the topic. In addition to sourceTable you could also add ConnectorVersion etc. to the schema, as they are also top-level fields in your JSON. The bottom line is that columns in ksqlDB can only be declared on top-level fields; everything else is nested data that you access using the STRUCT type.
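For completeness, here is a sketch of a fuller declaration that also captures some of the top-level metadata fields; the stream name est_location_full and the chosen fields are illustrative, based on the sample message above:

```sql
-- Sketch: nested payload as a STRUCT, plus two of the top-level fields.
-- Field names are taken from the sample message; adjust to your topic.
CREATE STREAM est_location_full (
  sourceTable STRUCT<id VARCHAR, name VARCHAR, location VARCHAR>,
  ConnectorVersion VARCHAR,
  ConnectorName VARCHAR
)
WITH (KAFKA_TOPIC='public.location',
      VALUE_FORMAT='JSON');
```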
Of course, later, when you query est_location, you can refer to individual fields via sourceTable->id etc.
It would also be possible to declare a derived STREAM if you want to unnest the schema:
CREATE STREAM unnested_est_location AS
  SELECT sourceTable->id AS id,
         sourceTable->name AS name,
         sourceTable->location AS location
  FROM est_location;
Of course, this would write the data into a new topic.

Postgresql Query - select json

I have a PostgreSQL query whose result I want to save as .json, but only a specific part of the query result:
SELECT info FROM d.tests where tag like 'HMIZP'
The result of this query is:
{"blabla":{a lot of blabla}, "Body":[{....
I just want everything after "Body" (including "Body").
How can I do it?
You can combine the extraction with building a JSON object:
SELECT json_build_object(
         'Body',
         json_extract_path('{"blabla": { "a": "a lot of blabla"},"Body": [{"a": [1,2]}, {"b":2}]}', 'Body')
       )
| json_build_object |
| :--------------------------------- |
| {"Body" : [{"a": [1,2]}, {"b":2}]} |
db<>fiddle here
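Applied to the original table, the same idea would look roughly like the following; this assumes info is a json column, and uses PostgreSQL's -> field-access operator, which is equivalent here to json_extract_path with a single key:

```sql
-- Sketch: extract only the "Body" key from info and re-wrap it under "Body"
SELECT json_build_object('Body', info -> 'Body') AS body_json
FROM d.tests
WHERE tag LIKE 'HMIZP';
```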

KSQLDB coalesce always returns null despite parameters

I have the following ksql query:
SELECT
event->acceptedevent->id as id1,
event->refundedevent->id as id2,
coalesce(event->acceptedevent->id, event->refundedevent->id) as coalesce_col
FROM events
EMIT CHANGES;
Based on the documentation, (https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/scalar-functions/#coalesce) COALESCE returns the first non-null parameter.
Query returns the following:
+-----------------------------------------------+-----------------------------------------------+-----------------------------------------------+
|ID1 |ID2 |COALESCE_COL |
+-----------------------------------------------+-----------------------------------------------+-----------------------------------------------+
|1 |null |null |
|2 |null |null |
|3 |null |null |
Since ID1 is clearly not null and is the first parameter to the call, I was expecting COALESCE to return the same value as ID1, but it returns null. What am I missing?
I am using confluentinc/cp-ksqldb-server:6.1.1 and use avro for the value serde.
EventMessage.avsc:
{
  "type": "record",
  "name": "EventMessage",
  "namespace": "com.example.poc.processor2.avro",
  "fields": [
    {
      "name": "event",
      "type": [
        "com.example.poc.processor2.avro.AcceptedEvent",
        "com.example.poc.processor2.avro.RefundedEvent"
      ]
    }
  ]
}
This is probably a bug in how the data is deserialized, or in the COALESCE function.
What KSQL version are you running?
How is your data serialized in the topic?
I tried with JSON format and it worked:
ksql> describe events;
Name : EVENTS
Field | Type
------------------------------------------------------------------------------------
EVENT | STRUCT<ACCEPTEDEVENT STRUCT<ID INTEGER>, REFUNDEDEVENT STRUCT<ID INTEGER>>
------------------------------------------------------------------------------------
ksql> print 'events' from BEGINNING;
Key format: ¯\_(ツ)_/¯ - no data processed
Value format: JSON or KAFKA_STRING
rowtime: 2021/03/24 13:57:27.403 Z, key: <null>, value: {"event":{"acceptedevent":{"id":1}, "refundedevent":{}}}, partition:
ksql> select event->acceptedevent->id, event->refundedevent->id, coalesce(event->acceptedevent->id, event->refundedevent->id) from events emit changes;
+----------------------------------------------------------+----------------------------------------------------------+----------------------------------------------------------+
|ID |ID_1 |KSQL_COL_0 |
+----------------------------------------------------------+----------------------------------------------------------+----------------------------------------------------------+
|1 |null |1 |
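As an untested workaround sketch while the Avro behaviour is investigated, a CASE expression expresses the same first-non-null logic as COALESCE; note that if the problem lies in deserializing the Avro union, this may show the same symptom:

```sql
-- Sketch: first-non-null logic without COALESCE (untested with the Avro union above)
SELECT
  CASE
    WHEN event->acceptedevent->id IS NOT NULL THEN event->acceptedevent->id
    ELSE event->refundedevent->id
  END AS coalesce_col
FROM events
EMIT CHANGES;
```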

Double flattening of arrays. Ksqldb 0.8.1

DATA.
kafkacat -b 127.0.0.1 -t group-topic -P
{"groups":[{"name":"Roberth","surname":"Smith","origin":"England","albums":["Wish","Desintegration"],"group":"The Cure"},{"name":"Peter","surname":"Murphy","origin":"England","albums":["Mask","In The Flat Field"],"group":"Bauhaus"}]};
// STRUCTURE STREAM
SET 'auto.offset.reset' = 'earliest';
CREATE STREAM GROUPS_01
(groups ARRAY<STRUCT<
albums ARRAY<VARCHAR>,
name VARCHAR,
surname VARCHAR
>>)
WITH (kafka_topic='group-topic', value_format='JSON');
SELECT
EXPLODE(groups)->name AS name,
EXPLODE(groups)->surname AS surname,
EXPLODE(groups)->albums AS albums
FROM GROUPS_01
EMIT CHANGES;
// I have
NAME SURNAME ALBUMS
Roberth Smith [Wish,Desintegration]
Peter Murphy [Mask,In The Flat Field]
// I need
NAME SURNAME ALBUM
Roberth Smith Wish
Roberth Smith Desintegration
Peter Murphy Mask
Peter Murphy In The Flat Field
// TRY
EXPLODE(groups)->EXPLODE(albums)->album AS album
EXPLODE(albums)->album AS album
For clarity, here's the source data you provided:
{
  "groups": [
    {
      "name": "Roberth",
      "surname": "Smith",
      "origin": "England",
      "albums": [
        "Wish",
        "Desintegration"
      ],
      "group": "The Cure"
    },
    {
      "name": "Peter",
      "surname": "Murphy",
      "origin": "England",
      "albums": [
        "Mask",
        "In The Flat Field"
      ],
      "group": "Bauhaus"
    }
  ]
}
First, explode out the root array:
ksql> CREATE STREAM EX1A AS SELECT EXPLODE(GROUPS) AS GROUP_SINGLE FROM GROUPS_01 EMIT CHANGES;
Message
-----------------------------------
Created query with ID CSAS_EX1A_5
-----------------------------------
This gives us:
ksql> SELECT * FROM EX1A EMIT CHANGES;
+----------------+-------+-----------------------------------------------------------+
|ROWTIME |ROWKEY |GROUP_SINGLE |
+----------------+-------+-----------------------------------------------------------+
|1585666857714 |null |{ALBUMS=[Wish, Desintegration], NAME=Roberth, SURNAME=Smith|
| | |} |
|1585666857714 |null |{ALBUMS=[Mask, In The Flat Field], NAME=Peter, SURNAME=Murp|
| | |hy} |
Now use the -> operator to access the nested structure and explode the ALBUMS array:
CREATE STREAM ALBUMS_EXPLODED AS
SELECT GROUP_SINGLE->NAME AS NAME,
GROUP_SINGLE->SURNAME AS SURNAME,
EXPLODE(GROUP_SINGLE->ALBUMS) AS ALBUM
FROM EX1A
EMIT CHANGES;
ksql> SELECT NAME, SURNAME, ALBUM FROM ALBUMS_EXPLODED EMIT CHANGES;
+-------------------+----------------------+-------------------+
|NAME |SURNAME |ALBUM |
+-------------------+----------------------+-------------------+
|Roberth |Smith |Wish |
|Roberth |Smith |Desintegration |
|Peter |Murphy |Mask |
|Peter |Murphy |In The Flat Field |

Retrieving the property information in Orientdb

I have created many classes with properties in OrientDB. Now I want to retrieve only the property information.
In MySQL we use the query DESC tableName.
In OrientDB, which query is used to get the property details without the data embedded in it?
Try:
select #type, #rid, #version, #class from v
where v is your class.
You might be interested in the example queries for metadata documented at http://orientdb.com/docs/2.2/SQL.html#query-metadata
You can retrieve the property names with:
select name from (select expand(properties) from ( select expand(classes) from metadata:schema ) where name = 'OUser')
+----+--------+
|# |name |
+----+--------+
|0 |status |
|1 |password|
|2 |name |
|3 |roles |
+----+--------+
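To inspect one of your own classes, substitute its name in the inner filter. For example, for a hypothetical class named users (only properties declared in the schema, e.g. via CREATE PROPERTY, appear here; schemaless fields do not):

```sql
-- 'users' is a hypothetical class name; other attributes such as type
-- and mandatory should also be available on the expanded property records
SELECT name, type FROM (
  SELECT expand(properties) FROM (
    SELECT expand(classes) FROM metadata:schema
  ) WHERE name = 'users'
)
```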