How to regexp split to table regexed splitted table - postgresql

I have table with combined string and I want to split it to first parts. I have results from query with regexped split to table.
Now i have split from this: 1:9,5:4,4:8,6:9,3:9,2:5,7:8,34:8,24:6
to this table:
campaign_skill
----------------
1:9
5:4
4:8
6:9
3:9
2:5
7:8
34:8
24:6
with this expression:
select *
from regexp_split_to_table((select user_skill from users where user_token = 'ded8ab43-efe2-4aea-894d-511ed3505261'), E'[\\s,]+') as campaign_skill
How to split actual results to tables like this:
campaign | skill
---------|------
1 | 9
5 | 4
4 | 8
6 | 9
3 | 9
2 | 5
7 | 8
34 | 8
24 | 6

You can use split_part() for that.
select split_part(t.campaign_skill, ':', 1) as campaign,
split_part(t.campaign_skill, ':', 2) as skill
from users u,
regexp_split_to_table(u.user_skill, E'[\\s,]+') as t(campaign_skill)
where u.user_token = 'ded8ab43-efe2-4aea-894d-511ed3505261';

Related

Is there a way to select elements associated with checked items without using multiple SELECT statements?

I'm trying to make a query that selects the neighborhoods ids of places that only have all the transport checked in a checkbox list. For instance, if 'Bus' and 'Railway' are checked, it should give me 7,8, and if only 'Railway' is checked, it should give me 7,8,11. The 'transporte' table is like this
b_codigo | tipo_transporte
----------+-----------------
1 | Underground
1 | Bus
2 | Bus
2 | Underground
3 | Bus
3 | Underground
4 | Bus
4 | RENFE
4 | Underground
5 | RENFE
5 | Underground
5 | Bus
5 | Tram
6 | Bus
6 | Underground
7 | RENFE
7 | Underground
7 | Bus
7 | Railway (FGC)
8 | Underground
8 | Railway (FGC)
8 | Bus
9 | Underground
9 | Bus
10 | Underground
10 | Bus
11 | Railway (FGC)
11 | Underground
12 | Bus
I tried with a query of the form
SELECT DISTINCT b_codigo
FROM transporte
WHERE (b_codigo, 'checked1') IN (SELECT * FROM transporte)
AND (b_codigo, 'checked2') IN (SELECT * FROM transporte)
AND ...
and another of the form
SELECT b_codigo
FROM transporte
WHERE tipo_transporte = 'checked1'
INTERSECT
SELECT b_codigo
FROM transporte
WHERE tipo_transporte = 'checked2'
INTERSECT
...;
and both give me the same results, but I'm worried about the efficiency of this two queries.
Is there a way of doing the same query without using N SELECT statements with N the number of checked boxes?
One way to do it, is to use aggregation:
select b_codigo
from transporte
where tipo_transporte in ('Bus', 'Railway (FGC)')
group by b_codigo
having count(distinct tipo_transporte) = 2
The number to compare to with the HAVING clause, needs to match the number of elements for the IN clause.

A Postgres query to get subtraction of a value in a row by the value in the next row

I have a table like(mytable):
id | value
=========
1 | 4
2 | 5
3 | 8
4 | 16
5 | 8
...
I need a query to give me subtraction on each rows by next row:
id | value | diff
=================
1 | 4 | 4 (4-Null)
2 | 5 | 1 (5-4)
3 | 8 | 3 (8-5)
4 | 16 | 8 (16-8)
5 | 8 | -8 (8-16)
...
Right now I use a python script to do so, but I guess it's faster if I create a view from this table.
You should use window functions - LAG() in this case:
SELECT id, value, value - LAG(value, 1) OVER (ORDER BY id) AS diff
FROM mytable
ORDER BY id;

PostgresQL for each row, generate new rows and merge

I have a table called example that looks as follows:
ID | MIN | MAX |
1 | 1 | 5 |
2 | 34 | 38 |
I need to take each ID and loop from it's min to max, incrementing by 2 and thus get the following WITHOUT using INSERT statements, thus in a SELECT:
ID | INDEX | VALUE
1 | 1 | 1
1 | 2 | 3
1 | 3 | 5
2 | 1 | 34
2 | 2 | 36
2 | 3 | 38
Any ideas of how to do this?
The set-returning function generate_series does exactly that:
SELECT
id,
generate_series(1, (max-min)/2+1) AS index,
generate_series(min, max, 2) AS value
FROM
example;
(online demo)
The index can alternatively be generated with RANK() (example, see also #a_horse_­with_­no_­name's answer) if you don't want to rely on the parallel sets.
Use generate_series() to generate the numbers and a window function to calculate the index:
select e.id,
row_number() over (partition by e.id order by g.value) as index,
g.value
from example e
cross join generate_series(e.min, e.max, 2) as g(value);

How would you create a group identifier based on one column, but sorted by another?

I am attempting to create column Group via T-SQL.
If a cluster of accounts are in a row, consider that as one group. if the account is seen again lower in the list (cluster or not), then consider it a new group. This seems straight forward, but I cannot seem to see the solution... Below there are three clusters of account 3456, each having a different group number (Group 1,4, and 6)
+-------+---------+------+
| Group | Account | Sort |
+-------+---------+------+
| 1 | 3456 | 1 |
| 1 | 3456 | 2 |
| 2 | 9878 | 3 |
| 3 | 5679 | 4 |
| 4 | 3456 | 5 |
| 4 | 3456 | 6 |
| 4 | 3456 | 7 |
| 5 | 1295 | 8 |
| 6 | 3456 | 9 |
+-------+---------+------+
UPDATE: I left this out of the original requirements, but a cluster of accounts could have more than two accounts. I updated the example data to include this scenario.
Here's how I'd do it:
--Sample Data
DECLARE #table TABLE (Account INT, Sort INT);
INSERT #table
VALUES (3456,1),(3456,2),(9878,3),(5679,4),(3456,5),(3456,6),(1295,7),(3456,8);
--Solution
SELECT [Group] = DENSE_RANK() OVER (ORDER BY grouper.groupID), grouper.Account, grouper.Sort
FROM
(
SELECT t.*, groupID = ROW_NUMBER() OVER (ORDER BY t.sort) +
CASE t.Account WHEN LEAD(t.Account,1) OVER (ORDER BY t.sort) THEN 1 ELSE 0 END
FROM #table AS t
) AS grouper;
Results:
Group Account Sort
------- ----------- -----------
1 3456 1
1 3456 2
2 9878 3
3 5679 4
4 3456 5
4 3456 6
5 1295 7
6 3456 8
Update based on OPs comment below (20190508)
I spent a couple days banging my head on how to handle groups of three or more; it was surprisingly difficult but what I came up with handles bigger clusters and is way better than my first answer. I updated the sample data to include bigger clusters.
Note that I include a UNIQUE constraint for the sort column - this creates a unique index. You don't need the constraint for this solution to work but, having an index on that column (clustered, nonclustered unique or just nonclustered) will improve the performance dramatically.
--Sample Data
DECLARE #table TABLE (Account INT, Sort INT UNIQUE);
INSERT #table
VALUES (3456,1),(3456,2),(9878,3),(5679,4),(3456,5),(3456,6),(1295,7),(1295,8),(1295,9),(1295,10),(3456,11);
-- Better solution
WITH Groups AS
(
SELECT t.*, Grouper =
CASE t.Account WHEN LAG(t.Account,1,t.Account) OVER (ORDER BY t.Sort) THEN 0 ELSE 1 END
FROM #table AS t
)
SELECT [Group] = SUM(sg.Grouper) OVER (ORDER BY sg.Sort)+1, sg.Account, sg.Sort
FROM Groups AS sg;
Results:
Group Account Sort
----------- ----------- -----------
1 3456 1
1 3456 2
2 9878 3
3 5679 4
4 3456 5
4 3456 6
5 1295 7
5 1295 8
5 1295 9
5 1295 10
6 3456 11

Take new columns as output table - KDB

I have a query which returns results of data, which runs on a frequent basis. The new table will contain results of the old table as well but I only want to take whatever is in new in the most recent run of the new table and send that as an email. I already have the line for the email and trade data but just need a way to be able to:
display the results of the new table to be emailed
save the complete results of the new table to be used in the next run of the query
e.g.
Old results: tbl
| idx | name | age |
| 0 | Tom | 30 |
| 1 | Jerry | 25 |
| 2 | Bob | 30 |
| 3 | Ken | 45 |
New results: tbl
| idx | name | age |
| 0 | Tom | 30 |
| 1 | Jerry | 25 |
| 2 | Bob | 30 |
| 3 | Ken | 45 |
| 4 | Sam | 40 |
output required:
| 4 | Sam | 40 |
and then save the New results to be used in the next run
Thanks! :)
If the only changes between runs is that records are being appended onto the new table, you could just keep a variable denoting the last index seen and then select only those rows where idx is larger than that.
If the indexes are always increasing, this could be achieved using a query like
lastidx:exec last idx from tbl
select from tbl where idx>lastidx
If the idx values don't always increase monotonically, you could keep a count of the number of rows instead and only
lasti:count tbl
select from tbl where i>=lasti
This doesn't require saving the whole table in memory for use in the next iteration.
E.g to start with the old table had 4 rows so lasti = 4
q)tbl
idx name age
-------------
0 Tom 30
1 Jerry 25
2 Bob 30
3 Ken 45
q)lasti
4
The new table comes in and running the command selects the new row
q)tbl
idx name age
-------------
0 Tom 30
1 Jerry 25
2 Bob 30
3 Ken 45
4 Sam 40
q)select from tbl where i>lasti
idx name age
------------
4 Sam 40
lasti can then be updated to reflect the new count
q)lasti:count tbl
q)lasti
5
One way you can get this done, assuming the idx is the unique key :
q)old:([] idx:0 1 2 3; name:`T`J`B`K; age: 30 25 30 45)
q)new:old,enlist `idx`name`age!(4; `S;40) //new output from your query
q)out:()
q)if[0<count i:new[`idx] except old[`idx] ; out:new i ; old:new]
q)out
idx name age
------------
4 S 40
Another way, if your new records are always added to the last of old records:
q)old:([] idx:0 1 2 3; name:`T`J`B`K; age: 30 25 30 45)
q)i:count old
q)new:old,enlist `idx`name`age!(4; `S;40) //new output from your query
q)out:()
q)if[i<c:count new ; out:(i-c)#new ; old:new; i:c]
q)out
idx name age
------------
4 S 40