Run Time of the Trigger in Siddhi - postgresql

I want to read a Postgres table from Siddhi, and I am using a trigger:
#From(eventtable='rdbms', jdbc.url='jdbc:postgresql://localhost:5432/pruebabg', username='postgres', password='Easysoft16', driver.name='org.postgresql.Driver', table.name='Trazablack')
define table Trazablack (sensorValue double);
define trigger FiveMinTriggerStream at every 10 min;
from FiveMinTriggerStream join Trazablack as t
select t.sensorValue as sensorValue
insert into StreamBlack;
But I have a problem: the query runs every 10 minutes, whereas I need it to run whenever a new event arrives.
Is this possible? These are the queries that consume StreamBlack:
from sensorStream#window.length(2)
JOIN StreamBlack#window.length(1)
on sensorStream.sensorValue==StreamBlack.sensorValue
select sensorStream.meta_timestamp, sensorStream.meta_sensorName, sensorStream.correlation_longitude,
sensorStream.correlation_latitude, sensorStream.sensorValue as valor1, StreamBlack.sensorValue as valor2
insert INTO StreamPaso;
from sensorStream#window.length(2)
LEFT OUTER JOIN StreamBlack#window.length(1)
on sensorStream.sensorValue==StreamBlack.sensorValue
select sensorStream.meta_timestamp, sensorStream.meta_sensorName, sensorStream.correlation_longitude,
sensorStream.correlation_latitude, sensorStream.sensorValue as valor1, StreamBlack.sensorValue as valor2
insert INTO StreamPaso;

In this case you can simply do the join with the input stream instead of the trigger stream, using a window. We can use a length-0 window, since we don't need to store anything in the window and we only need the query to trigger when an event arrives on the stream. Also, we can use the 'current events' clause (i.e. incoming events) to make sure that the 'expired events' (i.e. events kept and emitted by the window) of the window are not considered (refer to the Siddhi QL documentation on windows for more information).
E.g. (assuming sensorStream is the input stream):
from sensorStream#window.length(0) join Trazablack as t
select t.sensorValue as sensorValue
insert current events into StreamBlack;
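If the intermediate StreamBlack exists only to feed the comparison, the same length-0 window technique can join the sensor stream with the table directly. A sketch, reusing the attribute names from the question (the join condition is an assumption based on the original StreamPaso query):
from sensorStream#window.length(0) join Trazablack as t
on sensorStream.sensorValue == t.sensorValue
select sensorStream.meta_timestamp, sensorStream.meta_sensorName, sensorStream.correlation_longitude, sensorStream.correlation_latitude, sensorStream.sensorValue as valor1, t.sensorValue as valor2
insert current events into StreamPaso;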

Related

REPEATABLE READ isolation level - successive SELECTs return different values

I'm trying to check the REPEATABLE READ isolation level in PostgreSQL, specifically this statement from the documentation:
successive SELECT commands within a single transaction see the same
data, i.e., they do not see changes made by other transactions that
committed after their own transaction started.
To check it, I run two scripts in two Query Editor windows in pgAdmin.
The 1st one:
begin ISOLATION LEVEL REPEATABLE READ;
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
select * into t1 from topic where id=1;
select pg_sleep(5);
select * into t2 from topic where id=1 ;
commit;
Here we have two successive SELECTs with a pause of 5 seconds between them, just to leave time to run the 2nd UPDATE script. t1 and t2 are tables used to save the results; thanks to them I can inspect the selected data after the scripts have finished.
I run the 2nd script immediately after the first:
begin ISOLATION LEVEL REPEATABLE READ;
update topic set title='new value' where id=1;
commit;
It should commit after the 1st SELECT but before the 2nd one.
The problem is that the two successive SELECTs return different values: the results in t1 and t2 differ. I expected them to be the same. Could you explain why this happens?
Maybe pg_sleep starts a transaction implicitly?
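For reference, a minimal sketch of the same check in two interactive psql sessions (assuming the topic table has id and title columns, as in the UPDATE script). Note that in PostgreSQL the REPEATABLE READ snapshot is taken at the first non-transaction-control statement of the transaction, not at BEGIN:
-- Session 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT title FROM topic WHERE id = 1;   -- the snapshot is taken at this first query
SELECT pg_sleep(5);                     -- leave time for session 2 to commit
SELECT title FROM topic WHERE id = 1;   -- must return the same title as above
COMMIT;

-- Session 2 (run during the pg_sleep):
UPDATE topic SET title = 'new value' WHERE id = 1;   -- autocommitted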

Can I convert from Table to Stream in KSQL?

I am working with Kafka and KSQL. I would like to find the last row within 5 minutes for each DEV_NAME (the ROWKEY). Therefore, I have created a stream and an aggregated table for further joining.
With the KSQL below, I created the table that finds the last row within 5 minutes for each DEV_NAME:
CREATE TABLE TESTING_TABLE AS
SELECT ROWKEY AS DEV_NAME, max(ROWTIME) as LAST_TIME
FROM TESTING_STREAM WINDOW TUMBLING (SIZE 5 MINUTES)
GROUP BY ROWKEY;
Then, I would like to join them together:
CREATE STREAM TESTING_S_2 AS
SELECT *
FROM TESTING_S S
INNER JOIN TESTING_T T
ON S.ROWKEY = T.ROWKEY
WHERE
S.ROWTIME = T.LAST_TIME;
However, it raised this error:
Caused by: org.apache.kafka.streams.errors.StreamsException: A serializer (org.apache.kafka.streams.kstream.TimeWindowedSerializer) is not compatible to the actual key type (key type: org.apache.kafka.connect.data.Struct). Change the default Serdes in StreamConfig or provide correct Serdes via method parameters.
It seems the WINDOW TUMBLING clause changed the format of my ROWKEY
(e.g. DEV_NAME_11508 -> DEV_NAME_11508 : Window{start=157888092000 end=-}).
Therefore, without setting the Serdes, could I convert the table to a stream and set PARTITION BY DEV_NAME?
As you've identified, the issue is that your table is a windowed table, meaning the key of the table is windowed, and you cannot look up into a windowed table with a non-windowed key.
Your table, as it stands, will generate one unique row per ROWKEY for each 5-minute window. Yet it seems you don't care about anything but the most recent window. It may be that you don't need the windowing in the table at all, e.g.:
CREATE TABLE TESTING_TABLE AS
SELECT
ROWKEY AS DEV_NAME,
max(ROWTIME) as LAST_TIME
FROM TESTING_STREAM
WHERE ROWTIME > (UNIX_TIMESTAMP() - 300000)
GROUP BY ROWKEY;
This will track the max timestamp per key, ignoring any event whose timestamp is more than 5 minutes old. (Of course, this check is only done at the time the event is received; the row isn't removed after 5 minutes.)
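If the goal is simply to attach the latest timestamp per device to each incoming event, a plain stream-table join against this non-windowed table then becomes possible. A sketch (TESTING_ENRICHED is a name made up here, and the race-condition caveat below still applies):
CREATE STREAM TESTING_ENRICHED AS
SELECT T.DEV_NAME, T.LAST_TIME
FROM TESTING_STREAM S
INNER JOIN TESTING_TABLE T
ON S.ROWKEY = T.ROWKEY;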
Also, this join:
CREATE STREAM TESTING_S_2 AS
SELECT *
FROM TESTING_S S
INNER JOIN TESTING_T T
ON S.ROWKEY = T.ROWKEY
WHERE
S.ROWTIME = T.LAST_TIME;
almost certainly isn't doing what you think and wouldn't work the way you want, due to race conditions.
It's not clear what you're trying to achieve. Adding more information about your source data and required output may help people to provide you with a solution.

KSQL SELECT code works, but CREATE TABLE `..` AS SELECT code returns error - io.confluent.ksql.util.KsqlStatementException: Column cannot be resolved

I've got a problem creating a table or stream in KSQL.
I've done everything as shown in the official examples, and I don't see why my code does not work.
Example from https://docs.confluent.io/current/ksql/docs/tutorials/examples.html#joining :
CREATE TABLE pageviews_per_region_per_session AS
SELECT regionid,
windowStart(),
windowEnd(),
count(*)
FROM pageviews_enriched
WINDOW SESSION (60 SECONDS)
GROUP BY regionid;
Now my code. I tried running the SELECT in the command prompt and it WORKS WELL:
SELECT count(*) as attempts_count, "computer", (WINDOWSTART() / 1000) as row_time
FROM LOG_FLATTENED
WINDOW TUMBLING (SIZE 20 SECONDS)
WHERE "event_id" = 4625
GROUP BY "computer"
HAVING count(*) > 2;
But when I try to create a table based on this SELECT (from the KSQL command-line tool):
CREATE TABLE `incorrect_logins` AS
SELECT count(*) as attempts_count, "computer", (WINDOWSTART() / 1000) as row_time
FROM LOG_FLATTENED
WINDOW TUMBLING (SIZE 20 SECONDS)
WHERE "event_id" = 4625
GROUP BY "computer"
HAVING count(*) > 2;
I GET AN ERROR - io.confluent.ksql.util.KsqlStatementException: Column COMPUTER cannot be resolved. But this column exists, and the SELECT without the CREATE TABLE statement works perfectly.
I'm using the latest stable KSQL image (confluentinc/cp-ksql-server:5.3.1)
First of all, I apologize for my bad English; if anything I say is not clear enough, do not hesitate to reply and I'll try to explain it better.
I don't know a lot of KSQL, but I'll try to help you, based on my experience creating STREAMs like your TABLE.
1) As you probably know, KSQL processes everything as upper case unless you specify otherwise.
2) KSQL doesn't support double quotes in a SELECT inside a CREATE query; in fact, KSQL ignores these characters and handles your field as an upper-case column. That is why the error returned to you mentions COMPUTER and not "computer".
A workaround for this issue is:
First, create an empty table with the lower-case fields:
CREATE TABLE "incorrect_logins" ("attempts_count" INTEGER, "computer" VARCHAR, "row_time" INTEGER) WITH (KAFKA_TOPIC='topic_that_you_want', VALUE_FORMAT='avro')
(If the topic doesn't exist, you'll have to create it first.)
Once the table has been created, you can insert data into it using your SELECT query:
INSERT INTO "incorrect_logins" SELECT count(*) as "attempts_count", "computer", (WINDOWSTART() / 1000) as "row_time"
FROM LOG_FLATTENED
WINDOW TUMBLING (SIZE 20 SECONDS)
WHERE "event_id" = 4625
GROUP BY "computer"
HAVING count(*) > 2;
Hope it helps you!

How can you use 'For update skip locked' in postgres without locking rows in all tables used in the query?

You want to use Postgres's SELECT FOR UPDATE SKIP LOCKED functionality to ensure that two different users reading from a table and claiming tasks neither block each other nor pick up tasks already being read by the other user.
A join is used in the query that retrieves tasks. We do not want any table other than the one containing the main info to take row-level locks. Sample query below; only the rows of the task table should be locked:
SELECT v.someid , v.info, v.parentinfo_id, v.stage FROM task v, parentinfo pi WHERE v.stage = 'READY_TASK'
AND v.parentinfo_id = pi.id
AND pi.important_info_number = (
SELECT MAX(important_info_number) FROM parentinfo )
ORDER BY v.id limit 200 for update skip locked;
Now if user A is retrieving some 200 rows of this table, user B should be able to retrieve another set of 200 rows.
EDIT: As per the comment below, the query has been changed to:
SELECT v.someid , v.info, v.parentinfo_id, v.stage FROM task v, parentinfo pi WHERE v.stage = 'READY_TASK'
AND v.parentinfo_id = pi.id
AND pi.important_info_number = (
SELECT MAX(important_info_number) FROM parentinfo) ORDER BY v.id limit 200 for update of v skip locked;
How best to place the ORDER BY so that the rows come back ordered? While the order will be affected when multiple users invoke this command, some ordering of the returned rows should still be maintained.
Also, does this ensure that multiple threads invoking the same SELECT query retrieve different sets of rows, or is the locking only done for UPDATE commands?
Just experimented with this a little: multiple SELECT queries do end up retrieving different sets of rows, and ORDER BY does determine the order of the final result.
Yes,
FOR UPDATE OF table_name SKIP LOCKED
will lock only the rows of table_name (in your query, the alias v, i.e. the task table).
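For illustration, a common way to make the claim stick beyond the locking transaction is to mark the claimed rows in the same statement. A sketch built on the query from the question; the 'IN_PROGRESS' marker value and the single-statement CTE form are assumptions, not part of the original schema:
WITH claimed AS (
    SELECT v.someid
    FROM task v
    JOIN parentinfo pi ON v.parentinfo_id = pi.id
    WHERE v.stage = 'READY_TASK'
      AND pi.important_info_number = (SELECT MAX(important_info_number) FROM parentinfo)
    ORDER BY v.id
    LIMIT 200
    FOR UPDATE OF v SKIP LOCKED   -- row locks on task only; parentinfo stays unlocked
)
UPDATE task t
SET stage = 'IN_PROGRESS'         -- assumed marker value
FROM claimed c
WHERE t.someid = c.someid
RETURNING t.someid, t.info;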

select from final table(update table) concurrent select i.e select from two threads

I have a concurrency issue in my project: two threads come in together and do a SELECT at the same time, and both receive the same values, which ideally should not happen. After selecting a value, a thread should perform an update, and the second thread should then select the updated value.
I am using DB2.
I thought of using this approach:
select number from final table (update tablename set columnname='' where ...)
My question is: would this approach lock out the other thread when it comes in to select the value, since there is an update within the select? And would it solve my concurrency issue?
OR
I was browsing and found another approach:
update table (.....) select col from table where ... wait for outcome
Would this select wait until the first thread finishes its select?
One thing you can certainly do to avoid multiple reads of the same value before it gets updated by one of the readers:
LOCK TABLE tablename IN EXCLUSIVE MODE;
SELECT id, ... FROM tablename WHERE ...;
UPDATE tablename SET id=newval WHERE ...;
COMMIT;
This will of course lock the whole table, which may not be what you want!
An alternative approach (relatively standard, but with somewhat more involved programming logic):
1) SELECT id, ... FROM tablename WHERE ...;
2) SELECT count(1) FROM FINAL TABLE (UPDATE tablename SET id=newval WHERE ... AND id=oldval);
3) While this count(1) is zero (meaning: someone else updated the row in the meantime), repeat from 1).
Note that the guard in step 2 compares against oldval, the value read in step 1, so the UPDATE only succeeds when the row is still unchanged.
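For concreteness, a minimal sketch of one iteration of that loop; the pk column and the values 42, 17 and 18 are hypothetical:
-- Step 1: read the current value (suppose it returns 17).
SELECT id FROM tablename WHERE pk = 42;

-- Step 2: conditional update; the count tells us whether we won the race.
SELECT count(1) FROM FINAL TABLE (
    UPDATE tablename
    SET id = 18          -- the new value we want to write
    WHERE pk = 42
      AND id = 17        -- only if the value read in step 1 is still there
);

-- Step 3: if the count is 0, another thread updated the row first;
--         repeat from step 1 with the freshly-read value.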
--Peter Vanroose,
ABIS Training & Consulting,
Leuven, Belgium.