Selecting count of occurrences of values in kdb

How can I count the occurrences of the distinct values from one column in another column in kdb? The idea is to return, for each value in the first column, how many times it appears in the second column.
The table looks like
Col1: x,y,z and Col2: x,x,l
The goal is to find the count of occurrences of x,y,z from Col1 in Col2, which in the above case is 2,0,0

You could try this:
q)tab:([]col1:`x`y`z;col2:`x`x`w)
q)exec([]distinct col1)!0^([]count each group col2)distinct col1 from tab
col1| col2
----| ----
x   | 2
y   | 0
z   | 0

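The desired value can be found by building a map of Col2 occurrence counts and then looking it up with the values from Col1. Values of Col1 that never occur in Col2 come back as nulls, which 0^ fills with zero: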
t: ([] Col1:`x`y`z; Col2:`x`x`l);
update Col1Col2Count: 0^(count each group Col2) Col1 from t
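For readers less familiar with q, the same count-then-look-up idea can be sketched in Python. This is only an illustration of the logic, not kdb code, and the variable names are made up for the example.

```python
from collections import Counter

# Sample data from the question (hypothetical Python stand-ins for the table columns).
col1 = ["x", "y", "z"]
col2 = ["x", "x", "l"]

# Count how often each value occurs in col2 (analogue of `count each group Col2`).
col2_counts = Counter(col2)

# Look up each col1 value; Counter returns 0 for missing keys (analogue of `0^`).
col1_col2_count = [col2_counts[value] for value in col1]

print(col1_col2_count)  # [2, 0, 0]
```

Building the map once and then indexing into it avoids rescanning Col2 for every value of Col1.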

Related

DB2 - Concat all values in a column into a Single string

Let's say I have a table like this:
test
Col1 Col2
A 1
B 1
C 1
D 2
I am running the query select col1 from test where col2 = 1;
This returns a column with the values A, B and C in 3 separate rows.
I want the SQL to return a single row with the value A|B|C. Is this possible? If so, how should I do it?
You can use the LISTAGG function like this, with '|' as the separator to match the desired output:
SELECT LISTAGG(col1, '|')
FROM test
WHERE col2 = 1
If LISTAGG is not available, it can be reproduced with XMLAGG:
SELECT SUBSTR(XMLSERIALIZE(XMLAGG(XMLTEXT('|'||"Col1"))),2)
FROM "test"
WHERE "Col2" = 1
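The XMLAGG workaround is easier to follow when broken into steps. The Python sketch below only mimics what each SQL function contributes, using the sample rows from the question; it is not Db2 code.

```python
# Sample rows from the question as (Col1, Col2) pairs.
rows = [("A", 1), ("B", 1), ("C", 1), ("D", 2)]

# XMLTEXT('|' || "Col1") with WHERE "Col2" = 1: prefix each selected value with the separator.
prefixed = ["|" + col1 for col1, col2 in rows if col2 == 1]

# XMLAGG + XMLSERIALIZE: concatenate the pieces into one string.
concatenated = "".join(prefixed)

# SUBSTR(..., 2): drop the leading separator (SQL string positions are 1-based).
result = concatenated[1:]

print(result)  # A|B|C
```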

Filter rows from a Postgres table based on specific conditions without missing relevant rows

I have a table with the following columns in Postgres.
col1 col2 col3
1 Other a
2 Drug b
1 Procedure c
3 Combination Drug d
4 Biological e
3 Behavioral f
3 Drug g
5 Drug g
6 Procedure h
I would like to filter rows based on the following filter.
select col1, col2, col3
from tbl
where col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
But this query will exclude the rows below:
1 Procedure c
3 Behavioral f
The desired output is:
col1 col2 col3
1 Other a
1 Procedure c
2 Drug b
3 Combination Drug d
3 Behavioral f
3 Drug g
4 Biological e
5 Drug g
How can I achieve the above output without missing the mentioned rows?
Any suggestions would be really helpful. Thanks
I think you want every row whose col1 value appears in at least one row where col2 is one of the values in the list:
select col1, col2, col3
from tbl
where col1 in (
select col1 from tbl
where col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
)
order by col1;
Or with EXISTS:
select t.col1, t.col2, t.col3
from tbl t
where exists (
select 1 from tbl
where col1 = t.col1
and col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
)
order by col1;
See the demo.
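The subquery approach can be checked locally against the sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption made for convenience; the query itself is plain SQL and is unchanged).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "Other", "a"), (2, "Drug", "b"), (1, "Procedure", "c"),
        (3, "Combination Drug", "d"), (4, "Biological", "e"),
        (3, "Behavioral", "f"), (3, "Drug", "g"), (5, "Drug", "g"),
        (6, "Procedure", "h"),
    ],
)

# Keep every row whose col1 has at least one row with a listed col2 value.
query = """
    select col1, col2, col3
    from tbl
    where col1 in (
        select col1 from tbl
        where col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
    )
    order by col1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the row with col1 = 6 is dropped, because none of its rows has a col2 value from the list.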

Counting all entries with KSQL

Is it possible to use KSQL to not only count entries of a specific column via GROUP BY but instead get an aggregate over all the entries that stream through the application?
I'm searching for something like this:
| Count all | Count id1 | count id2 |
| ---245----|----150----|----95-----|
Or more like this in KSQL:
[some timestamp] | Count all | 245
[some timestamp] | Count id1 | 150
[some timestamp] | Count id2 | 95
.
.
.
Thank you
- Tim
You cannot get both the overall count and the count for each key in the same query. You can use two queries here: one that counts each value in the given column and another that counts all values in the given column.
Let's assume you have a stream with two columns, col1 and col2.
To count each value in col1 with infinite window size you can use the following query:
SELECT col1, count(*) FROM mystream1 GROUP BY col1;
Counting all the rows takes two queries, since KSQL always needs a GROUP BY clause for aggregation. First create a new column with a constant value; then count the values in that column. Because the value is constant, the count is the count of all rows. Here is an example:
CREATE STREAM mystream2 AS SELECT 1 AS col3 FROM mystream1;
SELECT col3, count(*) FROM mystream2 GROUP BY col3;
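The constant-key trick can be illustrated outside KSQL. The Python sketch below processes a made-up batch of col1 values (the event list is hypothetical, and this is batch code rather than a stream) to show why grouping by a constant yields the total count.

```python
from collections import Counter

# Hypothetical col1 values flowing through the stream.
events = ["id1", "id2", "id1", "id1", "id2"]

# GROUP BY col1: one counter per distinct key.
per_key = Counter(events)

# GROUP BY col3 where col3 is always 1: every row lands in the same group,
# so that single group's count equals the total number of rows.
total = Counter(1 for _ in events)

print(dict(per_key))  # {'id1': 3, 'id2': 2}
print(total[1])       # 5
```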
This also works to get the total row count for a table:
ksql> SELECT COUNT(*) FROM `mytable` GROUP BY 1 EMIT CHANGES;
+------------+
|KSQL_COL_0  |
+------------+
|2298        |
You can also run an extended describe on the stream or table to see the total number of messages:
ksql> describe extended <stream or table name>
Sample output:
Local runtime statistics
------------------------
messages-per-sec: 0 total-messages: 2415888 last-message: 2019-12-06T02:29:43.005Z

Postgres function to attach random integers to selected rows

I want a function or a trigger so that, when a row is inserted, all rows matching its criteria are given a random integer between 1 and the number of matching rows, in order to randomise the rows on a select.
E.g. if I have the data
Col1 Col2 Order
A 1
B 2
B 2
B 3
A 2
and I insert another row with Col1=B and Col2=2 then I want to end up with
Col1 Col2 Order
A 1
B 2 2
B 2 3
B 3
A 2
B 2 1
Where Order is a number between 1 and the number of matching rows, with each number appearing only once?
There is no need to store this; you can generate such a number when you retrieve the data.
select col1,
col2,
row_number() over (partition by col1, col2 order by random()) as random_order
from the_table
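What the window function computes can be sketched in plain Python: split the rows into (col1, col2) partitions, shuffle each partition, and number its rows from 1. The data is the question's example after the insert; the variable names are illustrative only.

```python
import random
from collections import defaultdict

# Rows from the question as (col1, col2) pairs, including the newly inserted ("B", 2).
rows = [("A", 1), ("B", 2), ("B", 2), ("B", 3), ("A", 2), ("B", 2)]

# partition by col1, col2: collect the row indexes that share the same key.
partitions = defaultdict(list)
for index, key in enumerate(rows):
    partitions[key].append(index)

# order by random() + row_number(): shuffle each partition and number from 1.
random_order = [None] * len(rows)
for indexes in partitions.values():
    shuffled = random.sample(indexes, k=len(indexes))
    for position, index in enumerate(shuffled, start=1):
        random_order[index] = position

for row, order in zip(rows, random_order):
    print(row, order)
```

Unlike the question's example, every partition gets numbered, so rows that are alone in their partition receive 1.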

Generate Dynamic Update Statement via TSQL

I have a table with a number of columns:
col1
col2
col3
coln....
I need to generate a dynamic UPDATE statement like the one below, which will be used in production for bulk updates:
UPDATE TableA
SET TableA.ColA = ValueOfCol2
WHERE
TableA.ColB='A'
Could anyone please share a TSQL script that generates n UPDATE statements like the one above?
Thank you
Unless I'm misunderstanding your problem, your example code works:
UPDATE [updateDemo] SET [updateDemo].[col2] = [updateDemo].[col3] WHERE [col4] = 'A'
This is based on the assumption that your table is something like this:
col1 | col2 | col3 | col4
1    | P    | Z    | A
2    | Y    | Z    | A
3    | K    | S    | V
The above update query would result in (changes in square brackets):
col1 | col2 | col3 | col4
1    | [Z]  | Z    | A
2    | [Z]  | Z    | A
3    | K    | S    | V
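The claimed result can be reproduced with a quick local check. The sketch below uses Python's built-in sqlite3 module rather than SQL Server (an assumption made for convenience), so the table-qualified column in the SET clause is written unqualified, since SQLite does not accept a table prefix there.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE updateDemo (col1 INTEGER, col2 TEXT, col3 TEXT, col4 TEXT)")
conn.executemany(
    "INSERT INTO updateDemo VALUES (?, ?, ?, ?)",
    [(1, "P", "Z", "A"), (2, "Y", "Z", "A"), (3, "K", "S", "V")],
)

# Same update as in the answer, with the SET column left unqualified for SQLite.
conn.execute("UPDATE [updateDemo] SET [col2] = [col3] WHERE [col4] = 'A'")

updated = conn.execute("SELECT * FROM updateDemo ORDER BY col1").fetchall()
for row in updated:
    print(row)
```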