How to pass value from another table to jsonb in postgres - postgresql

I have two tables.
Table A
id | json
----+------------------
a | {"st":[{"State": "TX", "Value":"0.02"}, {"State": "CA", "Value":"0.2" ...
----+------------------
b | {"st":[{"State": "TX", "Value":"0.32"}, {"State": "CA", "Value":"0.47" ...
Table B
idx | state| dir
----+-------+----------
1 | TX | 123
----+-------+----------
2 | CA | 15
I want to filter Table A using the state column from Table B, where the Table B row is selected by its idx value.
For each row of Table A, I want to select the value whose state equals the state in a temporary table built from Table B for a given idx.
Let's say idx is equal to 2. That means I can create the temporary table using the following SQL query:
with tempT AS(
SELECT *
FROM tableB
where idx = 2);
This is what I am trying to achieve
idx | state| value
----+-------+----------
2 | CA | 0.2
----+-------+----------
2 | CA | 0.47
How can I do that?

You should use jsonb_array_elements like:
WITH A AS
(SELECT 'a' AS id,
'{"st":[{"State": "TX", "Value":"0.02"}, {"State": "CA", "Value":"0.2"}]}'::jsonb AS json
UNION SELECT 'b' AS id,
'{"st":[{"State": "TX", "Value":"0.32"}, {"State": "CA", "Value":"0.47"}]}'::jsonb AS json),
B AS
(SELECT 1 AS idx,
'TX' AS state,
123 AS dir
UNION SELECT 2 AS idx,
'CA' AS state,
15 AS dir)
SELECT B.idx, B.state, A.obj->>'Value' AS value
FROM
(SELECT A.id,
jsonb_array_elements(A.json->'st') AS obj
FROM A) AS A
inner JOIN B on B.state = obj->>'State'::text
where B.idx = 2;
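To make the logic of that query easier to follow, here is a minimal sketch in plain Python rather than PostgreSQL. It only emulates what the unnest-and-join does on the sample rows from the answer; the names table_a, table_b and values_for_idx are hypothetical and chosen for illustration.

```python
import json

# Sample rows as written in the answer's CTEs (hypothetical Python stand-ins).
table_a = [
    ("a", '{"st":[{"State": "TX", "Value":"0.02"}, {"State": "CA", "Value":"0.2"}]}'),
    ("b", '{"st":[{"State": "TX", "Value":"0.32"}, {"State": "CA", "Value":"0.47"}]}'),
]
table_b = [(1, "TX", 123), (2, "CA", 15)]


def values_for_idx(idx):
    """Emulate: unnest A.json->'st', join to B on state, keep rows with B.idx = idx."""
    result = []
    for _id, raw in table_a:
        # Like jsonb_array_elements: one output row per element of the "st" array.
        for obj in json.loads(raw)["st"]:
            for b_idx, b_state, _dir in table_b:
                # Join condition (state match) plus the WHERE filter on idx.
                if b_idx == idx and b_state == obj["State"]:
                    result.append((b_idx, b_state, obj["Value"]))
    return result
```

Calling values_for_idx(2) yields one (idx, state, value) tuple per row of Table A, which is the shape asked for in the question.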

Related

Postgres JSONB, query first appearance of inner field

I'm trying to query product_id and the inner JSONB value "quantity_on_hand" with "unit:one". Below is an example table:
Products table
| product_id | data |
| -------- | -------------|
| 00445 | {"storage"...}| - rest of data specified right below
{
"storage": [
{
"unit": "one",
"quantity": 3
},
{
"unit": "two",
"quantity": 2
}
]
}
I found a query:
SELECT product_id, e -> 'quantity' as quant
FROM Products t,
jsonb_array_elements(t.value->'storage') e
WHERE (product_id) IN (('00445'));
The query returns following output:
product_id | quant
00445 | 3
00445 | 2
Please advise how to restrict the result to "quantity_on_hand" with "unit:one", so that it returns only:
product_id | quant
00445 | 3
Thanks
You can add a clause for filtering the result of the jsonb_array_elements to only include elements where the JSON key "unit"'s value is "one":
SELECT product_id,
e -> 'quantity' AS quant
FROM Products t,
JSONB_ARRAY_ELEMENTS(t.value -> 'storage') e
WHERE (product_id) IN (('00445'))
AND e ->> 'unit' = 'one';
This should give:
product_id | quant
------------+-------
00445 | 3
(1 row)
See https://www.postgresql.org/docs/14/functions-json.html for more information on JSONB operators and functions in Postgres.

Creating clusters of related columns

I have a table named Stores with columns:
StoreCode NVARCHAR(10),
OldStoreCode NVARCHAR(10)
Here is a sample of my data:
| StoreCode | OldStoreCode |
|-----------|--------------|
| A | B |
| B | A |
| D | E |
| E | F |
| M | K |
| J | K |
| K | L |
|-----------|--------------|
I want to create clusters of related Stores. Related store means there is a one way relation between StoreCodes and OldStoreCodes.
Expected result table:
| StoreCode | ClusterId |
|-----------|-----------|
| A | 1 |
| B | 1 |
| D | 2 |
| E | 2 |
| F | 2 |
| M | 3 |
| K | 3 |
| J | 3 |
| L | 3 |
|-----------|-----------|
There is no maximum number of hops. There may be a StoreCode A which has an OldStoreCode B, which has an OldStoreCode C, which has an OldStoreCode D, etc.
How can I cluster stores like this?
Try it like this:
EDIT: With changes by OP taken from comment
DECLARE @tbl TABLE(ID INT IDENTITY, StoreCode VARCHAR(100),OldStoreCode VARCHAR(100));
INSERT INTO @tbl VALUES
('A','B'),('B','A'),('D','E'),('E','F'),('M','K'),('J','K'),('K','L');
WITH Related AS
(
SELECT DISTINCT t1.ID,Val
FROM @tbl AS t1
INNER JOIN @tbl AS t2 ON t1.StoreCode=t2.StoreCode
OR t1.OldStoreCode=t2.OldStoreCode
OR t1.OldStoreCode=t2.StoreCode
OR t1.StoreCode=t2.OldStoreCode
CROSS APPLY(SELECT DISTINCT Val
FROM
(VALUES(t1.StoreCode),(t2.StoreCode),(t1.OldStoreCode),(t2.OldStoreCode)) AS A(Val)
) AS valsInCols
)
,ClusterKeys AS
(
SELECT r1.ID
,(
SELECT r2.Val AS [*]
FROM Related AS r2
WHERE r2.ID=r1.ID
ORDER BY r2.Val
FOR XML PATH('')
) AS ClusterKey
FROM Related AS r1
GROUP BY r1.ID
)
,ClusterIds AS
(
SELECT ClusterKey
,MIN(ID) AS ID
FROM ClusterKeys
GROUP BY ClusterKey
)
SELECT r.ID
,r.Val
FROM ClusterIds c
INNER JOIN Related r ON c.ID = r.ID
The result
ID Val
1 A
1 B
3 D
3 E
3 F
5 J
5 K
5 L
5 M
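For comparison, the clustering itself is a connected-components problem, and it can be sketched outside SQL with a union-find structure. This is not the approach of either answer here, just a minimal Python illustration of the grouping they aim for; cluster_stores is a hypothetical helper name, and cluster ids are numbered in order of first appearance.

```python
def cluster_stores(pairs):
    """Map every store code to a cluster id, given (StoreCode, OldStoreCode) pairs."""
    parent = {}

    def find(code):
        # Register unseen codes as their own root, then walk up to the root.
        parent.setdefault(code, code)
        while parent[code] != code:
            parent[code] = parent[parent[code]]  # path halving
            code = parent[code]
        return code

    # Union the two codes of every pair; direction of the relation is ignored.
    for store, old in pairs:
        root_a, root_b = find(store), find(old)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the clusters in order of first appearance of any member.
    cluster_ids = {}
    result = {}
    for code in list(parent):
        root = find(code)
        cluster_ids.setdefault(root, len(cluster_ids) + 1)
        result[code] = cluster_ids[root]
    return result
```

Because every pair is merged regardless of direction, chains of any length end up in one cluster, which covers the "no maximum number of hops" requirement from the question.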
This should do it:
SAMPLE DATA:
IF OBJECT_ID('tempdb..#Temp1') IS NOT NULL
BEGIN
DROP TABLE #Temp1;
END;
CREATE TABLE #Temp1(StoreCode NVARCHAR(10)
, OldStoreCode NVARCHAR(10));
INSERT INTO #Temp1(StoreCode
, OldStoreCode)
VALUES
('A'
, 'B'),
('B'
, 'A'),
('D'
, 'E'),
('E'
, 'F'),
('M'
, 'K'),
('J'
, 'K'),
('K'
, 'L');
QUERY:
;WITH A -- get all distinct new and old storecodes
AS (
SELECT StoreCode
FROM #Temp1
UNION
SELECT OldStoreCode
FROM #Temp1),
B -- give a unique number id to each store code
AS (SELECT rn = RANK() OVER(ORDER BY StoreCode)
, StoreCode
FROM A),
C -- combine the store codes and the unique number id's in one table
AS (SELECT b2.rn AS StoreCodeID
, t.StoreCode
, b1.rn AS OldStoreCodeId
, t.OldStoreCode
FROM #Temp1 AS t
LEFT OUTER JOIN B AS b1 ON t.OldStoreCode = b1.StoreCode
LEFT OUTER JOIN B AS b2 ON t.StoreCode = b2.StoreCode),
D -- assign a row number for each entry in the data set
AS (SELECT rn = RANK() OVER(ORDER BY StoreCode)
, *
FROM C),
E -- derive first and last store in the path
AS (SELECT FirstStore = d2.StoreCode
, LastStore = d1.OldStoreCode
, GroupID = d1.OldStoreCodeId
FROM D AS d1
RIGHT OUTER JOIN D AS d2 ON d1.StoreCodeID = d2.OldStoreCodeId
AND d1.rn - 1 = d2.rn
WHERE d1.OldStoreCode IS NOT NULL) ,
F -- get the stores which led to the last store with one hop
AS (SELECT C.StoreCode
, E.GroupID
FROM E
INNER JOIN C ON E.LastStore = C.OldStoreCode)
-- combine to get the full grouping
SELECT A.StoreCode, ClusterID = DENSE_RANK() OVER (ORDER BY A.GroupID) FROM (
SELECT C.StoreCode,F.GroupID FROM C INNER JOIN F ON C.OldStoreCode = F.StoreCode
UNION
SELECT * FROM F
UNION
SELECT E.LastStore,E.GroupID FROM E) AS A ORDER BY StoreCode, ClusterID

Update Count column in Postgresql

I have a single table laid out as such:
id | name | count
1 | John |
2 | Jim |
3 | John |
4 | Tim |
I need to fill out the count column such that the result is the number of times the specific name shows up in the column name.
The result should be:
id | name | count
1 | John | 2
2 | Jim | 1
3 | John | 2
4 | Tim | 1
I can get the count of occurrences of unique names easily using:
SELECT COUNT(name)
FROM table
GROUP BY name
But that doesn't fit into an UPDATE statement due to it returning multiple rows.
I can also get it narrowed down to a single row by doing this:
SELECT COUNT(name)
FROM table
WHERE name = 'John'
GROUP BY name
But that doesn't allow me to fill out the entire column, just the 'John' rows.
You can do that with a common table expression:
with counted as (
select name, count(*) as name_count
from the_table
group by name
)
update the_table
set "count" = c.name_count
from counted c
where c.name = the_table.name;
Another (slower) option would be to use a correlated subquery:
update the_table
set "count" = (select count(*)
from the_table t2
where t2.name = the_table.name);
But in general it is a bad idea to store values that can easily be calculated on the fly:
select id,
name,
count(*) over (partition by name) as name_count
from the_table;
Another method: using a derived table
UPDATE tb
SET count = t.count
FROM (
SELECT count(NAME)
,NAME
FROM tb
GROUP BY 2
) t
WHERE t.NAME = tb.NAME
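To try the idea without a PostgreSQL server, here is a small sketch that runs the correlated-subquery form of the UPDATE against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here, and fill_counts is a hypothetical helper name; the table and column names follow the first answer.

```python
import sqlite3


def fill_counts(names):
    """Insert the names, fill "count" per name, and return all rows ordered by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE the_table (id INTEGER PRIMARY KEY, name TEXT, "count" INTEGER)'
    )
    conn.executemany("INSERT INTO the_table (name) VALUES (?)", [(n,) for n in names])
    # For each row, count how many rows share its name and store that number.
    conn.execute(
        'UPDATE the_table SET "count" = '
        "(SELECT COUNT(*) FROM the_table t2 WHERE t2.name = the_table.name)"
    )
    rows = conn.execute(
        'SELECT id, name, "count" FROM the_table ORDER BY id'
    ).fetchall()
    conn.close()
    return rows
```

Calling fill_counts(["John", "Jim", "John", "Tim"]) reproduces the sample table from the question, with ids assigned automatically from 1.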

SQL to remove rows with duplicated value while keeping one

Say I have this table
id | data | value
-----------------
1 | a | A
2 | a | A
3 | a | A
4 | a | B
5 | b | C
6 | c | A
7 | c | C
8 | c | C
I want to remove those rows with duplicated value for each data while keeping the one with the min id, e.g. the result will be
id | data | value
-----------------
1 | a | A
4 | a | B
5 | b | C
6 | c | A
7 | c | C
I know a way to do it is to do a union like:
SELECT 1 [id], 'a' [data], 'A' [value] INTO #test UNION SELECT 2, 'a', 'A'
UNION SELECT 3, 'a', 'A' UNION SELECT 4, 'a', 'B'
UNION SELECT 5, 'b', 'C' UNION SELECT 6, 'c', 'A'
UNION SELECT 7, 'c', 'C' UNION SELECT 8, 'c', 'C'
SELECT * FROM #test WHERE id NOT IN (
SELECT MIN(id) FROM #test
GROUP BY [data], [value]
HAVING COUNT(1) > 1
UNION
SELECT MIN(id) FROM #test
GROUP BY [data], [value]
HAVING COUNT(1) <= 1
)
But this solution has to repeat the same GROUP BY twice (consider that the real case is a massive GROUP BY with more than 20 columns).
I would prefer a simpler answer with less code as opposed to a complex one. Is there a more concise way to code this?
Thank you
You can use one of the methods below:
Using WITH CTE:
WITH CTE AS
(SELECT *,RN=ROW_NUMBER() OVER(PARTITION BY data,value ORDER BY id)
FROM TableName)
DELETE FROM CTE WHERE RN>1
Explanation:
This query will select the contents of the table along with a row number RN. And then delete the records with RN >1 (which would be the duplicates).
This Fiddle shows the records which are going to be deleted using this method.
Using NOT IN:
DELETE FROM TableName
WHERE id NOT IN
(SELECT MIN(id) as id
FROM TableName
GROUP BY data,value)
Explanation:
With the given example, inner query will return ids (1,6,4,5,7). The outer query will delete records from table whose id NOT IN (1,6,4,5,7).
This fiddle shows the records which are going to be deleted using this method.
Suggestion: Use the first method since it is faster than the latter. Also, it manages to keep only one record if id field is also duplicated for the same data and value.
I want to add a MySQL solution for this question.
Suggestion 1: MySQL prior to version 8.0 doesn't support the WITH clause.
Suggestion 2: it throws this error: You can't specify target table 'TableName' for update in FROM clause.
So the solution will be:
DELETE FROM TableName WHERE id NOT IN
(SELECT MIN(id) as id
FROM (select * from TableName) as t1
GROUP BY data,value);
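As a quick way to check the NOT IN / MIN(id) approach on the sample data, here is a minimal sketch using an in-memory SQLite database via Python's sqlite3 module. SQLite is an assumption made for convenience (it accepts the plain subquery, so the extra derived table needed by MySQL is not used), and dedupe is a hypothetical helper name.

```python
import sqlite3


def dedupe(rows):
    """Keep only the row with the smallest id for each (data, value) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TableName (id INTEGER PRIMARY KEY, data TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO TableName VALUES (?, ?, ?)", rows)
    # Delete every row whose id is not the minimum id of its (data, value) group.
    conn.execute(
        "DELETE FROM TableName WHERE id NOT IN "
        "(SELECT MIN(id) FROM TableName GROUP BY data, value)"
    )
    remaining = conn.execute(
        "SELECT id, data, value FROM TableName ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining
```

Feeding it the eight rows from the question leaves ids 1, 4, 5, 6 and 7, which matches the desired result table.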

Select bitmask in Postgresql

I have a table with columns "one" and "two":
a | x
a | y
a | z
b | x
b | z
c | y
I want to write a query to complement it with missing nested values
b | null | y
c | null | x
c | null | z
Then I will select it with array_agg(two) group by one, such that
a {1 1 1}
b {1 0 1}
c {0 1 0}
And eventually export it to a CSV file with a COPY query.
What query should I write for the first step?
You can use a CROSS JOIN to build all the possible pairs of elements then a LEFT JOIN to check if each pair of elements exists:
SELECT
T1.one,
T2.two,
CASE WHEN your_table.one IS NULL THEN 0 ELSE 1 END AS is_present
FROM (SELECT DISTINCT one FROM your_table) T1
CROSS JOIN (SELECT DISTINCT two FROM your_table) T2
LEFT JOIN your_table
ON T1.one = your_table.one AND T2.two = your_table.two
You can then add a GROUP BY T1.one and an ARRAY_AGG(...) to this query.
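Here is a minimal sketch of the whole pipeline, run against an in-memory SQLite database through Python's sqlite3 module as a stand-in for PostgreSQL. The join is the same as above; the grouping into one mask per value of "one" is done in Python instead of ARRAY_AGG, and presence_masks is a hypothetical helper name.

```python
import sqlite3


def presence_masks(pairs):
    """Return {one: [0/1, ...]} with one flag per distinct "two", ordered by "two"."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (one TEXT, two TEXT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT T1.one,
               CASE WHEN your_table.one IS NULL THEN 0 ELSE 1 END AS is_present
        FROM (SELECT DISTINCT one FROM your_table) T1
        CROSS JOIN (SELECT DISTINCT two FROM your_table) T2
        LEFT JOIN your_table
               ON T1.one = your_table.one AND T2.two = your_table.two
        ORDER BY T1.one, T2.two
        """
    ).fetchall()
    conn.close()
    # Collect the flags per "one"; rows already arrive ordered by (one, two).
    masks = {}
    for one, is_present in rows:
        masks.setdefault(one, []).append(is_present)
    return masks
```

In PostgreSQL itself, the Python grouping step would presumably become GROUP BY T1.one with ARRAY_AGG(... ORDER BY T2.two), so that the flags keep a stable column order.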