how to make atomic rows using function regexp_split_to_table #postgresql


I have a table that stores the amenities of a room (wifi, TV, etc.). A room can have many amenities, and I want to produce a column in which every amenity is atomic:
id  amenity_name
1   tv
2   wifi
3   bed
4   Smoking allowed
Current table:
id  Another header
1   Wifi,Breakfast
2   Wifi,Kitchen,Smoking allowed,Pets allowed,Heating,Washer,Essentials,Lock on bedroom door,24-hour check-in,Hangers,Hair dryer,Laptop friendly workspace
I have tried using regexp_split_to_table, but I can't get anything useful out of this function.
Any ideas?
Thanks.

Try a lateral join:
SELECT tab.id, a.name
FROM tab
CROSS JOIN LATERAL regexp_split_to_table(tab.amenity_name, ',') AS a(name);
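To make the effect of the lateral join concrete, here is a rough Python sketch (not PostgreSQL) of what the query computes: for each source row, the split function yields one output row per piece. The sample rows and the variable names `tab` and `atomic_rows` are assumptions made up for illustration.

```python
import re

# Hypothetical rows of the source table: (id, comma-separated amenities).
tab = [
    (1, "Wifi,Breakfast"),
    (2, "Wifi,Kitchen,Smoking allowed"),
]

# For every source row, emit one output row per piece of the split string,
# which is roughly what CROSS JOIN LATERAL regexp_split_to_table(...) does.
atomic_rows = [
    (row_id, name)
    for row_id, amenities in tab
    for name in re.split(",", amenities)
]

for row in atomic_rows:
    print(row)
```

Each id is repeated once per amenity, so the result can be inserted into a table with one amenity per row.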

Related

Loop a result set and feed two tables

I have a select query that returns a huge result set (500k records). But for this example let's say it has only two records:
SELECT * FROM INVENTORY I
INNER JOIN PARTS P
ON I.partcode = P.partcode
ORDER BY I.partcode
The result will look more or less like this:
pk partcode genericname partname stock
1 001 mouse logitech 10
2 002 keyboard genius 8
I have to loop the result above and feed two tables (product and variant).
I first have to insert two of the columns into 'product' table, like this:
INSERT INTO PRODUCT
(p_code,product_name) values (partcode,genericname)
pk p_code product_name
5 001 mouse
6 002 keyboard
Then I have to grab the pk that was automatically generated into the table above (say ppk) and then insert it together with the other two columns into the 'variant' table, like this:
INSERT INTO VARIANT
(product_pk,variant_name,in_stock) values (ppk,partname,stock)
pk product_pk variant_name in_stock
10 5 logitech 10
11 6 genius 8
At the end I should have the product and the variant tables with 2 records each.
I could write VB code to do that, but I think it can be done in pure SQL, and I am just not sure of the best approach.
Could someone give me some help with this?
Thank you!

You could use a SQL cursor to loop through and insert a row at a time into PRODUCT and then use SCOPE_IDENTITY() to get the newly assigned identity value to insert a corresponding row into VARIANT, but best practice is to avoid cursors if there's another way. (There usually is, but not always.)
If the partcode/genericname combination will uniquely identify 1 record in PRODUCT, you could do this:
INSERT INTO PRODUCT (p_code,product_name)
SELECT I.partcode, genericname
FROM INVENTORY I INNER JOIN PARTS P ON I.partcode = P.partcode
(I would eliminate the ORDER BY from your query unless you care about the order the identity values are assigned.)
Then, run this:
INSERT INTO VARIANT
(product_pk,variant_name,in_stock)
SELECT pr.pk, i.partname, i.stock
FROM inventory i INNER JOIN parts p ON i.partcode = p.partcode
INNER JOIN product pr on i.partcode = pr.p_code and i.genericname = pr.product_name
You may have to clean up the aliases between i and p in the 2nd query. I can't tell which table (inventory or parts) the variant_name and in_stock fields are coming from so I just used i.
Again - this assumes that partcode/genericname combination is unique in the PRODUCT table.
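The two set-based inserts can be tried end to end with a small sketch. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and it assumes that partname and stock live in inventory and genericname lives in parts; the question does not say which table holds which column, so treat that layout as hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed column layout; the original post does not specify it.
CREATE TABLE inventory (pk INTEGER PRIMARY KEY, partcode TEXT, partname TEXT, stock INTEGER);
CREATE TABLE parts (partcode TEXT PRIMARY KEY, genericname TEXT);
CREATE TABLE product (pk INTEGER PRIMARY KEY AUTOINCREMENT, p_code TEXT, product_name TEXT);
CREATE TABLE variant (pk INTEGER PRIMARY KEY AUTOINCREMENT, product_pk INTEGER,
                      variant_name TEXT, in_stock INTEGER);
INSERT INTO inventory VALUES (1, '001', 'logitech', 10), (2, '002', 'genius', 8);
INSERT INTO parts VALUES ('001', 'mouse'), ('002', 'keyboard');
""")

# Step 1: fill product from the joined source rows.
conn.execute("""
INSERT INTO product (p_code, product_name)
SELECT i.partcode, p.genericname
FROM inventory i JOIN parts p ON i.partcode = p.partcode
""")

# Step 2: join back to product on the natural key to pick up the generated pk.
conn.execute("""
INSERT INTO variant (product_pk, variant_name, in_stock)
SELECT pr.pk, i.partname, i.stock
FROM inventory i
JOIN parts p ON i.partcode = p.partcode
JOIN product pr ON pr.p_code = i.partcode AND pr.product_name = p.genericname
""")

rows = conn.execute("""
SELECT pr.product_name, v.variant_name, v.in_stock
FROM variant v JOIN product pr ON pr.pk = v.product_pk
ORDER BY pr.p_code
""").fetchall()
print(rows)
```

No loop and no identity lookup is needed, because the second insert finds each generated key by joining on the (p_code, product_name) pair, which is why that pair has to be unique.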

How to use DISTINCT in VIEWS correctly

I have two tables from which I want to make a join with some columns to provide a view for my java/hibernate application. It looks like this:
CREATE VIEW cc AS
SELECT DISTINCT ON (cust.id) cust.id,
cust.company,
cust.zip,
...
con.name,
con.forename,
...
FROM contacts con
LEFT JOIN customer cust ON con.customer = cust.id
ORDER BY cust.id
So far so good. Very simple.
If I make a SELECT on the view like:
SELECT *
FROM cc
WHERE name ilike '%schult%'
I get 13 results.
If I run the same query directly, using the view's statement:
SELECT DISTINCT ON (cust.id) cust.id,
cust.company,
cust.zip,
...
con.name,
con.forename,
...
FROM contacts con
LEFT JOIN customer cust ON con.customer = cust.id
WHERE name ilike '%schult%'
ORDER BY cust.id
I get 75 results!
I figured out that it is the DISTINCT that corrupts the result. But why?
And how can I use it correctly?

Your two queries (view-based and direct) apply the condition in a different order:
the direct query filters on %schult% and then applies DISTINCT ON
the view applies DISTINCT ON and then filters on %schult%
Are you aware of how DISTINCT ON works?
For each set of rows that share the given expressions, it keeps the first row (which one that is can be nondeterministic if no suitable sort order is defined) and discards the others.
For instance:
Let's say we have a customer with id=1 and two connected contacts, one with name='Schultz' and one with name='Schmidt'.
The view-based select first applies DISTINCT ON and keeps the customer with just one contact (the first one, which is nondeterministic in this case); only then is name ilike '%schult%' applied. If DISTINCT ON happened to keep Schmidt, Schultz has already been discarded and the customer drops out of the result.
Recommended reading:
https://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-DISTINCT
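The difference in evaluation order can be mimicked with a short Python sketch. It is only a model of the behaviour described above, not PostgreSQL itself; the sample rows and helper names are invented for illustration, and the row order stands in for whatever order the database happens to produce.

```python
# Hypothetical joined rows: (customer id, contact name), in arrival order.
rows = [(1, "Schmidt"), (1, "Schultz"), (2, "Schulte"), (3, "Meyer")]


def distinct_on_customer(rs):
    """Keep only the first row per customer id, like DISTINCT ON (cust.id)."""
    seen = set()
    kept = []
    for cust_id, name in rs:
        if cust_id not in seen:
            seen.add(cust_id)
            kept.append((cust_id, name))
    return kept


def name_matches(rs):
    """Keep rows whose name contains 'schult', like name ILIKE '%schult%'."""
    return [(cust_id, name) for cust_id, name in rs if "schult" in name.lower()]


# Direct query: filter first, then DISTINCT ON.
direct = distinct_on_customer(name_matches(rows))
# Query on the view: DISTINCT ON first, then filter.
via_view = name_matches(distinct_on_customer(rows))

print(direct)
print(via_view)
```

Customer 1 appears in `direct` but not in `via_view`, because the view kept the Schmidt row before the filter ever ran.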

How to group rows of the table without using Aggregate functions?

Sample example:
SELECT
Equip.name, equip.type, region.regionname, region.equipfk
FROM
Equip
INNER JOIN
region on Equip.equippk = region.equipfk
GROUP BY
Equip.Name, region.equipfk
Expected output :
name type regionname equipfk
----------------------------------------
TCT Steel detroit 8235
TCT steel detroit 8235
GTH COPPER michigan 8569
GCT COPPER michigan 8569
I have a table containing 10 columns. I want to group the rows by some ids using a GROUP BY clause, but I don't want to use any aggregate functions in the select query. Is there any alternative way to use the GROUP BY clause and avoid aggregate functions in the select query?
Thanks in advance.

It sounds like you're using GROUP BY in an attempt to sort the rows (which it only does by a coincidental side effect that isn't guaranteed). If that's the case, you need ORDER BY instead:
SELECT
Equip.name, equip.type, region.regionname, region.equipfk
FROM
Equip
INNER JOIN
region on Equip.equippk = region.equipfk
ORDER BY
Equip.Name, region.equipfk
Both GROUP BY and DISTINCT may produce output that appears to be sorted - but this is only a side effect of how they've been implemented, and many things (load on the server, server configuration, hot fixes, service packs, major upgrades) may change the order in which rows are returned - unless you also have an ORDER BY clause.

query gives two of the same results

I have the following SQL query but I got a problem:
When I execute it I got two of the same serial numbers from the "sn" column in the "products" table.
SELECT specifications.productname,
products.sn, specifications.year,
lendings.lending_date
FROM products
INNER JOIN lendings ON products.id = lendings.product_id
INNER JOIN specifications ON products.sn LIKE CONCAT(\'%\', specifications.sn, \'%\') OR products.type LIKE CONCAT(\'%\', specifications.type, \'%\')
WHERE lendings.user_id = ?
EDIT:
lendings table:
user_id product_id
1 1
1 2
2 3
Specifications table:
productname year type sn
name1 2012 1 1234
name2 2011 2 4321
name3 2010 3 3241
products table:
id sn
1 AAAAAAAA1234
2 BBBBBBBB4321
3 CCCCCCCC3241
EDIT2:
SELECT products.id,
specifications.productname,
products.sn,
specifications.year,
lendings.lending_date
FROM products
INNER JOIN lendings ON products.id = lendings.product_id
INNER JOIN specifications ON products2.sn LIKE CONCAT(specifications.sn, \'%\') OR products.type = specifications.type
WHERE lendings.user_id = ?

Then one of your JOIN ... ON conditions is too loose;
for instance, two lendings records may point to the same product.

Usually, that means you don't have all the necessary join columns present in one of your joins and you are getting a partial Cartesian product. In database terms, this means you are joining to a table and expecting to match a single row, but multiple rows match the criteria, so you are actually joining to more than one row. When this happens, you will get the same row multiple times (the products row in your example) in your result.
It would have been better if you had posted some test data so this scenario could be confirmed, but since you didn't, I would recommend checking each of your joins to make sure you are not getting multiple rows back for the given products row.
One part of your query I find particularly suspect is this join:
INNER JOIN specifications ON products.sn LIKE CONCAT(\'%\', specifications.sn, \'%\') OR products.type LIKE CONCAT(\'%\', specifications.type, \'%\')
You're joining using a LIKE operator, which seems to have a high chance of getting multiple rows.
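A small sketch shows how such a join can return the same serial number twice. It uses SQLite through Python's sqlite3 module (so `||` replaces CONCAT), and the `type` value stored on the product row is a hypothetical assumption, since the posted products table shows no type column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER, sn TEXT, type TEXT);
CREATE TABLE specifications (productname TEXT, year INTEGER, type TEXT, sn TEXT);
-- The type value '2' on this product is an assumption for illustration.
INSERT INTO products VALUES (1, 'AAAAAAAA1234', '2');
INSERT INTO specifications VALUES ('name1', 2012, '1', '1234'),
                                  ('name2', 2011, '2', '4321');
""")

# The sn pattern matches name1 and the type pattern matches name2,
# so the OR lets one product row join to two specification rows.
rows = conn.execute("""
SELECT p.sn, s.productname
FROM products p
JOIN specifications s
  ON p.sn LIKE '%' || s.sn || '%'
  OR p.type LIKE '%' || s.type || '%'
ORDER BY s.productname
""").fetchall()
print(rows)
```

If only one of the two conditions is supposed to identify the specification, dropping the other (or combining them with AND) brings the result back to one row per product.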

Fully matching sets of records of two many-to-many tables

I have Users, Positions and Licenses.
Relations are:
users may have many licenses
positions may require many licenses
So I can easily get license requirements per position(s) as well as effective licenses per user(s).
But I wonder what would be the best way to match the two sets? As logic goes user needs at least those licenses that are required by a certain position. May have more, but remaining are not relevant.
I would like to get results with users and eligible positions.
PersonID PositionID
1 1 -> user 1 is eligible to work on position 1
1 2 -> user 1 is eligible to work on position 2
2 1 -> user 2 is eligible to work on position 1
3 2 -> user 3 is eligible to work on position 2
4 ...
As you can see I need a result for all users, not a single one per call, which would make things much much easier.
There are actually 5 tables here:
create table Person ( PersonID, ...)
create table Position (PositionID, ...)
create table License (LicenseID, ...)
and relations
create table PersonLicense (PersonID, LicenseID, ...)
create table PositionLicense (PositionID, LicenseID, ...)
So basically I need to find positions that a particular person is licensed to work on. There's of course a much more complex problem here, because there are other factors, but the main objective is the same:
How do I match multiple records of one relational table to multiple records of the other. This could as well be described as an inner join per set of records and not per single record as it's usually done in TSQL.
I'm thinking of TSQL language constructs:
rowsets but I've never used them before and don't know how to use them anyway
intersect statements maybe although these probably only work over whole sets and not groups

Final solution (for future reference)
In the meantime while you fellow developers answered my question, this is something I came up with and uses CTEs and partitioning which can of course be used on SQL Server 2008 R2. I've never used result partitioning before so I had to learn something new (which is a plus altogether). Here's the code:
with CTEPositionLicense as (
select
PositionID,
LicenseID,
checksum_agg(LicenseID) over (partition by PositionID) as RequiredHash
from PositionLicense
)
select per.PersonID, pos.PositionID
from CTEPositionLicense pos
join PersonLicense per
on (per.LicenseID = pos.LicenseID)
group by pos.PositionID, pos.RequiredHash, per.PersonID
having pos.RequiredHash = checksum_agg(per.LicenseID)
order by per.PersonID, pos.PositionID;
So I made a comparison between these three techniques that I named as:
Cross join (by Andriy M)
Table variable (by Petar Ivanov)
Checksum - this one here (by Robert Koritnik, me)
Mine already orders the results by person and position, so I added the same ordering to the other two to make them return identical results.
Resulting estimated execution plan
Checksum: 7%
Table variable: 2% (table creation) + 9% (execution) = 11%
Cross join: 82%
I also changed the Table variable version into a CTE version (a CTE was used instead of the table variable), removed the order by at the end, and compared their estimated execution plans. For reference, the CTE version took 43%, while the original version took 53% (10% + 43%).

One way to write this efficiently is to join PositionLicense with PersonLicense on the LicenseId. Then count the matched rows grouped by position and person and compare that with the count of all licenses for the position. If the two are equal, then that person qualifies:
DECLARE @tmp TABLE (PositionId INT, LicenseCount INT)
INSERT INTO @tmp
SELECT PositionId AS PositionId,
COUNT(1) AS LicenseCount
FROM PositionLicense
GROUP BY PositionId
SELECT per.PersonID, pos.PositionId
FROM PositionLicense AS pos
INNER JOIN PersonLicense AS per ON (pos.LicenseId = per.LicenseId)
GROUP BY pos.PositionId, per.PersonID
HAVING COUNT(1) = (
SELECT LicenseCount FROM @tmp WHERE PositionId = pos.PositionId
)
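The counting idea can be checked on a tiny data set. The sketch below runs on SQLite through Python's sqlite3 module rather than on SQL Server, replaces the table variable with a correlated subquery, and uses invented sample data, so it illustrates the technique rather than the exact T-SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PersonLicense (PersonID INTEGER, LicenseID INTEGER);
CREATE TABLE PositionLicense (PositionID INTEGER, LicenseID INTEGER);
-- Hypothetical data: position 1 needs license 1; position 2 needs 2 and 3.
INSERT INTO PositionLicense VALUES (1, 1), (2, 2), (2, 3);
INSERT INTO PersonLicense VALUES (1, 1), (1, 2), (1, 3),
                                 (2, 1), (2, 2),
                                 (3, 2), (3, 3);
""")

# A person qualifies when the number of matched licenses equals
# the number of licenses the position requires.
eligible = conn.execute("""
SELECT per.PersonID, pos.PositionID
FROM PositionLicense AS pos
JOIN PersonLicense AS per ON per.LicenseID = pos.LicenseID
GROUP BY per.PersonID, pos.PositionID
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM PositionLicense AS req
                   WHERE req.PositionID = pos.PositionID)
ORDER BY per.PersonID, pos.PositionID
""").fetchall()
print(eligible)
```

The comparison relies on (PositionID, LicenseID) and (PersonID, LicenseID) pairs being unique; duplicate rows in either table would distort the counts.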

I would approach the problem like this:
Get all the (distinct) users from PersonLicense.
Cross join them with PositionLicense.
Left join the resulting set with PersonLicense using PersonID and LicenseID.
Group the results by PersonID and PositionID.
Filter out those (PersonID, PositionID) pairs where the number of licenses in PositionLicense does not match the number of those in PersonLicense.
And here's my implementation:
SELECT
u.PersonID,
pl.PositionID
FROM (SELECT DISTINCT PersonID FROM PersonLicense) u
CROSS JOIN PositionLicense pl
LEFT JOIN PersonLicense ul ON u.PersonID = ul.PersonID
AND pl.LicenseID = ul.LicenseID
GROUP BY
u.PersonID,
pl.PositionID
HAVING COUNT(pl.LicenseID) = COUNT(ul.LicenseID)
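Stripped of SQL, the condition every approach here tests is a subset check: a person is eligible for a position when the position's required licenses are a subset of the licenses the person holds. A minimal Python sketch of that rule, using invented sample data and hypothetical names:

```python
# Hypothetical sample data: licenses held per person, required per position.
person_licenses = {1: {1, 2, 3}, 2: {1, 2}, 3: {2, 3}}
position_licenses = {1: {1}, 2: {2, 3}}

# Pair every person with every position (the cross join) and keep the pairs
# where every required license is held (set <= set is a subset test).
eligible = sorted(
    (person, position)
    for person, held in person_licenses.items()
    for position, required in position_licenses.items()
    if required <= held
)
print(eligible)
```

Comparing COUNT(pl.LicenseID) with COUNT(ul.LicenseID) after the left join expresses the same test in SQL: the counts differ exactly when some required license has no match.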
