PostgreSQL Hierarchical, category tree - postgresql

ENV : postgresql-8.4
I'm trying to build a category tree. Basically, I'm expecting final output such as:
categoryName
categoryPath
leafcategory
e.g. :
Digital Camera
Electronics ::: Digital Camera
true
The table structure is
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
categoryid bigint,
categoryparentid bigint,
categoryname text,
status integer DEFAULT 0,
lang text,
eysiteid text,
country text,
tempid text,
leafcategory boolean
);
So far I've got this, but it is not working. Any help would be highly appreciated:
WITH RECURSIVE tree (CategoryID, CategoryParentID, CategoryName, category_tree, depth)
AS (
SELECT
CategoryID,
CategoryParentID,
CategoryName,
CategoryName AS category_tree,
0 AS depth
FROM categories
WHERE CategoryParentID IS NULL
UNION ALL
SELECT
c.CategoryID,
c.CategoryParentID,
c.CategoryName,
tree.category_tree || '/' || c.CategoryName AS category_tree,
depth+1 AS depth
FROM tree
JOIN categories c ON (tree.category_tree = c.CategoryParentID)
)
SELECT * FROM tree ORDER BY category_tree;
Sample from database
cat=> select * from categories;
id | categoryid | categoryparentid | categoryname | status | lang | eysiteid | country | tempid | leafcategory
-------+------------+------------------+--------------------------------+--------+------+------------+---------+--------+--------------
1 | -1 | 0 | Root | 1 | en | 0 | us | | f
2 | 20081 | -1 | Antiques | 1 | en | 0 | us | | f
17 | 1217 | 20081 | Primitives | 0 | en | 0 | us | | t
23 | 22608 | 20081 | Reproduction Antiques | 0 | en | 0 | us | | t
24 | 12 | 20081 | Other | 0 | en | 0 | us | | t
25 | 550 | -1 | Art | 1 | en | 0 | us | | f
29 | 2984 | -1 | Baby | 1 | en | 0 | us | | f

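It appears you were joining on the wrong field: the recursive step compares tree.category_tree (the text path) with c.CategoryParentID, when it should compare tree.CategoryID.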
-- create some test data
DROP SCHEMA IF EXISTS tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE categories
-- ( id SERIAL PRIMARY KEY
( categoryid SERIAL PRIMARY KEY
, categoryparentid bigint REFERENCES categories(categoryid)
, categoryname text
-- , status integer DEFAULT 0
-- , lang text
-- , ebaysiteid text
-- , country text
-- , tempid text
-- , leafcategory boolean
);
INSERT INTO categories(categoryid,categoryparentid) SELECT gs, 1+(gs/6)::integer
FROM generate_series(1,50) gs;
UPDATE categories SET categoryname = 'Name_' || categoryid::text;
UPDATE categories SET categoryparentid = NULL WHERE categoryparentid <= 0;
UPDATE categories SET categoryparentid = NULL WHERE categoryparentid >= categoryid;
WITH RECURSIVE tree (categoryid, categoryparentid, categoryname, category_tree, depth)
AS (
SELECT
categoryid
, categoryparentid
, categoryname
, categoryname AS category_tree
, 0 AS depth
FROM categories
WHERE categoryparentid IS NULL
UNION ALL
SELECT
c.categoryid
, c.categoryparentid
, c.categoryname
, tree.category_tree || '/' || c.categoryname AS category_tree
, depth+1 AS depth
FROM tree
JOIN categories c ON tree.categoryid = c.categoryparentid
)
SELECT * FROM tree ORDER BY category_tree;
EDIT: the other ("non-function") notation for the recursive CTE, without a column list after the name, seems to work better:
WITH RECURSIVE tree AS (
SELECT
categoryparentid AS parent
, categoryid AS self
, categoryname AS treepath
, 0 AS depth
FROM categories
WHERE categoryparentid IS NULL
UNION ALL
SELECT
c.categoryparentid AS parent
, c.categoryid AS self
, t.treepath || '/' || c.categoryname AS treepath
, depth+1 AS depth
FROM categories c
JOIN tree t ON t.self = c.categoryparentid
)
SELECT * FROM tree ORDER BY parent,self
;
UPDATE: in the original query, you should replace
WHERE CategoryParentID IS NULL
by:
WHERE CategoryParentID = 0
or maybe even:
WHERE COALESCE(CategoryParentID, 0) = 0
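Putting the corrected join and the `= 0` root condition together against the sample rows from the question could look like the sketch below. It uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained; the WITH RECURSIVE syntax used here is the same), keeps only the columns the tree needs, and uses the ' ::: ' separator from the expected output.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; only the relevant columns are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    categoryid       INTEGER,
    categoryparentid INTEGER,
    categoryname     TEXT,
    leafcategory     INTEGER
);
INSERT INTO categories VALUES
    (-1,    0,     'Root',                  0),
    (20081, -1,    'Antiques',              0),
    (1217,  20081, 'Primitives',            1),
    (22608, 20081, 'Reproduction Antiques', 1),
    (12,    20081, 'Other',                 1),
    (550,   -1,    'Art',                   0),
    (2984,  -1,    'Baby',                  0);
""")

rows = conn.execute("""
WITH RECURSIVE tree (categoryid, categoryname, categorypath, leafcategory) AS (
    -- anchor: the root row is the one whose parent is 0 in the sample data
    SELECT categoryid, categoryname, categoryname, leafcategory
    FROM categories
    WHERE categoryparentid = 0
    UNION ALL
    -- recursive step: join children on the parent's id, not on the path
    SELECT c.categoryid, c.categoryname,
           tree.categorypath || ' ::: ' || c.categoryname,
           c.leafcategory
    FROM tree
    JOIN categories c ON c.categoryparentid = tree.categoryid
)
SELECT categoryname, categorypath, leafcategory
FROM tree
ORDER BY categorypath
""").fetchall()

for name, path, leaf in rows:
    print(name, "|", path, "|", bool(leaf))
```

Each row carries the name, the full path from the root, and the leaf flag, which are the three values the question asks for.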

Have a look at this gist; it is more or less what you want to do. In your case, I would rather have used PostgreSQL's ltree extension (materialized paths).

Related

Maintaining order in DB2 "IN" query

This question is based on this one. I'm looking for a solution to that question that works in DB2. Here is the original question:
I have the following table
DROP TABLE IF EXISTS `test`.`foo`;
CREATE TABLE `test`.`foo` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(45) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Then I try to get records based on the primary key
SELECT * FROM foo f where f.id IN (2, 3, 1);
I then get the following result
+----+--------+
| id | name |
+----+--------+
| 1 | first |
| 2 | second |
| 3 | third |
+----+--------+
3 rows in set (0.00 sec)
As one can see, the result is ordered by id. What I'm trying to achieve is to get the results ordered in the sequence I'm providing in the query. Given this example it should return
+----+--------+
| id | name |
+----+--------+
| 2 | second |
| 3 | third |
| 1 | first |
+----+--------+
3 rows in set (0.00 sec)
You could use a derived table with the IDs you want, and the order you want, and then join the table in, something like...
SELECT ...
FROM mcscb.mcs_premise prem
JOIN mcscb.mcs_serv_deliv_id serv
ON prem.prem_nb = serv.prem_nb
AND prem.tech_col_user_id = serv.tech_col_user_id
AND prem.tech_col_version = serv.tech_col_version
JOIN (
SELECT 1, '9486154876' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 2, '9403149581' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 3, '9465828230' FROM SYSIBM.SYSDUMMY1
) B (ORD, ID)
ON serv.serv_deliv_id = B.ID
WHERE serv.tech_col_user_id = 'CRSSJEFF'
AND serv.tech_col_version = '00'
ORDER BY B.ORD
You can use a derived column to do custom ordering.
select
case
when serv.SERV_DELIV_ID = '9486154876' then 1
when serv.SERV_DELIV_ID = '9403149581' then 2
else 3
end as custom_order,
...
...
ORDER BY custom_order
To make the logic a little bit more evident you might modify the solution provided by bhamby like so:
WITH ordered_in_list (ord, id) as (
VALUES (1, '9486154876'), (2, '9403149581'), (3, '9465828230')
)
SELECT ...
FROM mcscb.mcs_premise prem
JOIN mcscb.mcs_serv_deliv_id serv
ON prem.prem_nb = serv.prem_nb
AND prem.tech_col_user_id = serv.tech_col_user_id
AND prem.tech_col_version = serv.tech_col_version
JOIN ordered_in_list il
ON serv.serv_deliv_id = il.ID
WHERE serv.tech_col_user_id = 'CRSSJEFF'
AND serv.tech_col_version = '00'
ORDER BY il.ORD
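The ordered-list idea can be tried on the small foo table from the question. The sketch below runs it through Python's sqlite3 module, which is only a stand-in for DB2 here (an assumption for the sake of a self-contained example); the CTE name ordered_in_list follows the answer above.

```python
import sqlite3

# Assumption: SQLite stands in for DB2; table and data come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO foo VALUES (1, 'first'), (2, 'second'), (3, 'third');
""")

rows = conn.execute("""
WITH ordered_in_list (ord, id) AS (
    -- ord is the position each id should take in the result
    VALUES (1, 2), (2, 3), (3, 1)
)
SELECT f.id, f.name
FROM foo f
JOIN ordered_in_list il ON il.id = f.id
ORDER BY il.ord
""").fetchall()

print(rows)
```

The join doubles as the IN filter, so the list of ids only has to be written once.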

Fetch records with distinct value of one column while replacing another col's value when multiple records

I have two tables that I need to join on distinct rid, replacing the code column's value when a rid has different values across multiple rows. It is better explained with the example set below.
CREATE TABLE usr (rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(12) NOT NULL,
email VARCHAR(20) NOT NULL);
CREATE TABLE usr_loc
(rid INT NOT NULL,
code CHAR NOT NULL,
loc_id INT NOT NULL,
PRIMARY KEY (rid, code, loc_id));
INSERT INTO usr VALUES
(1,'John','john#product'),
(2,'Linda','linda#product'),
(3,'Greg','greg#product'),
(4,'Kate','kate#product'),
(5,'Johny','johny#product'),
(6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
(1,'A',4532),
(1,'I',4538),
(1,'I',4545),
(2,'I',3123),
(3,'A',4512),
(3,'A',4527),
(4,'I',4567),
(4,'A',4565),
(5,'I',4512),
(6,'I',4567),
(6,'I',4569);
Required Result Set
+-----+-------+------+-----------------+
| rid | name | Code | email |
+-----+-------+------+-----------------+
| 1 | John | B | 'john#product' |
| 2 | Linda | I | 'linda#product' |
| 3 | Greg | A | 'greg#product' |
| 4 | Kate | B | 'kate#product' |
| 5 | Johny | I | 'johny#product' |
| 6 | Mary | I | 'mary#test' |
+-----+-------+------+-----------------+
I have tried some queries that join and some that count, but I'm lost on finding one that satisfies the whole scenario.
The query I came up with is
SELECT distinct(a.rid) as rid, a.name, a.email, 'B' as code
FROM usr a
JOIN usr_loc b ON a.rid=b.rid
WHERE a.rid IN (SELECT rid FROM usr_loc GROUP BY rid HAVING COUNT(*) > 1);
You need to group by user and count how many distinct codes each one has in usr_loc. If there is more than one, replace the code with B (counting rows instead would also flag Greg and Mary, who have several rows with the same code). See below:
select
rid,
name,
case when cnt > 1 then 'B' else min_code end as code,
email
from (
select u.rid, u.name, u.email, min(l.code) as min_code, count(distinct l.code) as cnt
from usr u
join usr_loc l on l.rid = u.rid
group by u.rid, u.name, u.email
) x;
Seems to me that you are using MySQL, rather than IBM DB2. Is that so?
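To check the grouping logic against the required result set, here is a small sketch that loads the question's rows and replaces the code with B only when a user has more than one distinct code. It runs on Python's sqlite3 module as a neutral stand-in (an assumption; the asker's actual database is unclear), and the usr_loc table is declared without key constraints to keep the setup short.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database; data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usr (rid INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL);
CREATE TABLE usr_loc (rid INTEGER NOT NULL, code TEXT NOT NULL, loc_id INTEGER NOT NULL);
INSERT INTO usr VALUES
    (1,'John','john#product'), (2,'Linda','linda#product'), (3,'Greg','greg#product'),
    (4,'Kate','kate#product'), (5,'Johny','johny#product'), (6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
    (1,'A',4532), (1,'I',4538), (1,'I',4545), (2,'I',3123), (3,'A',4512), (3,'A',4527),
    (4,'I',4567), (4,'A',4565), (5,'I',4512), (6,'I',4567), (6,'I',4569);
""")

rows = conn.execute("""
SELECT u.rid,
       u.name,
       -- more than one distinct code for a user becomes 'B'
       CASE WHEN COUNT(DISTINCT l.code) > 1 THEN 'B' ELSE MIN(l.code) END AS code,
       u.email
FROM usr u
JOIN usr_loc l ON l.rid = u.rid
GROUP BY u.rid, u.name, u.email
ORDER BY u.rid
""").fetchall()

codes = [(rid, code) for rid, _name, code, _email in rows]
print(codes)
```

Greg (two 'A' rows) and Mary (two 'I' rows) keep their single code, which is what the required result set shows.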

EAV data in SQL Server

I have no control over the data or the database structure. I have this EAV type of data, where a consultant can speak one or many languages, can travel to one or many countries in Europe, and has many skills.
FYI, there are 10 different main categories in my data.
Some consultants speak 10 languages while others speak only one.
The data looks a bit like this
____________________________________________
| ConsultantID | Category | Value |
--------------------------------------------
| 1 | Language | English |
| 1 | Language | French (fluent) |
| 1 | Language | Spanish (working)|
| 1 | Country | Ireland |
| 1 | Country | Italy |
| 1 | Country | Germany |
| 1 | Country | Belgium |
| 456 | Language | French (working) |
| 456 | Country | Belgium |
| 847 | Language | English |
| 847 | Country | Belgium |
--------------------------------------------
I want to list all consultants who are willing to travel to Belgium and who speak French (working or fluent). Based on my current example, that would be #1 and #456.
I wrote the query below, which lists all values matching a category for a consultant (note that this is not dynamic, as the number of values in my example is capped at 5, so it is already a poor design).
SELECT
ID, category,
MAX(CASE seq WHEN 1 THEN value ELSE '' END ) +
MAX(CASE seq WHEN 2 THEN ',' + value ELSE '' END ) +
MAX(CASE seq WHEN 3 THEN ',' + value ELSE '' END ) +
MAX(CASE seq WHEN 4 THEN ',' + value ELSE '' END ) +
MAX(CASE seq WHEN 5 THEN ',' + value ELSE '' END )
FROM
(SELECT
p1.ID, p1.category, p1.value,
(SELECT COUNT(*)
FROM tblWebPracticeInfo p2
WHERE p2.category = p1.category
AND p2.ID = P1.ID
AND p2.value <= p1.value)
FROM
tblWebPracticeInfo p1) D (ID, category, value, seq )
GROUP BY
ID, category
ORDER BY
ID;
I would then need to query this result...
But even without a WHERE clause it already takes 2 seconds to execute.
I have something else that is more basic (but similarly inefficient):
select *
from tblWebMemberInfo m
where
m.ID in (select p.id from tblWebPracticeInfo p
where p.category = 'Language' and p.value like 'French%')
and m.ID in (select p.id from tblWebPracticeInfo p
where p.category = 'Country' and p.value = 'Belgium')
order by m.ID
That's basically where I am. As you can see, nothing genius and nothing that really works.
Can you point me to the right track?
I'm using SQL Server 2005 - v9.00.1
Many thanks in advance for your time and help.
If you just need to list the consultants then you can use exists():
select p.Id ...
from Person p /* Assuming you have a regular table for people,
if not, use distinct or group by */
where exists (
select 1
from tblWebPracticeInfo l
where l.Id = p.Id
and l.Category = 'Language'
and l.Value like 'French%' /* values carry a suffix such as '(fluent)' */
)
and exists (
select 1
from tblWebPracticeInfo c
where c.Id = p.Id
and c.Category = 'Country'
and c.Value = 'Belgium'
)
You could also use aggregation and having like so:
select ConsultantID
from tblWebPracticeInfo p
where (p.category = 'Language' and p.value like 'French%')
or (p.category = 'Country' and p.value = 'Belgium')
group by ConsultantID
having count(distinct p.category) = 2 /* number of categories to match is 2 */
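The aggregation approach can be checked against the sample rows from the question. This sketch uses Python's sqlite3 module in place of SQL Server (an assumption, chosen only to keep the example runnable anywhere) and names the key column ID as in the question's own queries. It counts distinct categories rather than rows, so a consultant with two French entries and no Belgium entry would not match by accident.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server 2005; rows come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblWebPracticeInfo (ID INTEGER, category TEXT, value TEXT);
INSERT INTO tblWebPracticeInfo VALUES
    (1,   'Language', 'English'),
    (1,   'Language', 'French (fluent)'),
    (1,   'Language', 'Spanish (working)'),
    (1,   'Country',  'Ireland'),
    (1,   'Country',  'Italy'),
    (1,   'Country',  'Germany'),
    (1,   'Country',  'Belgium'),
    (456, 'Language', 'French (working)'),
    (456, 'Country',  'Belgium'),
    (847, 'Language', 'English'),
    (847, 'Country',  'Belgium');
""")

rows = conn.execute("""
SELECT p.ID
FROM tblWebPracticeInfo p
WHERE (p.category = 'Language' AND p.value LIKE 'French%')
   OR (p.category = 'Country'  AND p.value = 'Belgium')
GROUP BY p.ID
-- both categories must be present among the rows that survived the filter
HAVING COUNT(DISTINCT p.category) = 2
ORDER BY p.ID
""").fetchall()

consultant_ids = [r[0] for r in rows]
print(consultant_ids)
```

Adding a third condition means adding one more OR branch and raising the number in the HAVING clause to 3.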

Postgres - updates with join gives wrong results

I'm having a hard time understanding what I'm doing wrong.
This query writes the same value into every row instead of updating each row with its own result.
My DATA
I'm trying to update a table of stats over a set of businesses
business_stats ( id SERIAL,
pk integer not null,
b_total integer,
PRIMARY KEY(pk)
);
the details of each business are stored here
business_details (id SERIAL,
category CHARACTER VARYING,
feature_a CHARACTER VARYING,
feature_b CHARACTER VARYING,
feature_c CHARACTER VARYING
);
and here is a table that associates the pk with the category
datasets (id SERIAL,
pk integer not null,
category CHARACTER VARYING,
PRIMARY KEY(pk)
);
WHAT I DID (wrong)
UPDATE business_stats
SET b_total = agg.total
FROM business_stats b,
( SELECT d.pk, count(bd.id) total
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
GROUP BY d.pk
) agg
WHERE b.pk = agg.pk;
The result of this query is
| id | pk | b_total |
+----+----+-----------+
| 1 | 14 | 273611 |
| 2 | 15 | 273611 |
| 3 | 16 | 273611 |
| 4 | 17 | 273611 |
but if I run just the SELECT the results of each pk are completely different
| pk | agg.total |
+----+-------------+
| 14 | 273611 |
| 15 | 407802 |
| 16 | 179996 |
| 17 | 815580 |
THE QUESTION
why is this happening?
why is the WHERE clause not working?
Before writing this question I've used as reference these posts: a, b, c
Do the following (I always recommend against joins in Updates)
UPDATE business_stats bs
SET b_total =
( SELECT count(bd.id) total
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
where d.pk=bs.pk
)
/*optional*/
where exists (SELECT *
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
where d.pk=bs.pk)
The issue is your FROM clause. The repeated reference to business_stats means you aren't restricting the join like you expect to. You're joining agg against the second unrelated mention of business_stats rather than the row you want to update.
Something like this is what you are after (warning not tested):
UPDATE business_stats AS b
SET b_total = agg.total
FROM
(...) agg
WHERE b.pk = agg.pk;
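A small reproduction makes the per-row behaviour easy to verify. The sketch below uses the correlated-subquery form of the update, run through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption, so the example needs no server). The table contents are invented for illustration, and the tables are trimmed to the columns the update touches.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the rows below are made-up test data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business_stats   (pk INTEGER PRIMARY KEY, b_total INTEGER);
CREATE TABLE datasets         (pk INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE business_details (id INTEGER PRIMARY KEY, category TEXT);

INSERT INTO business_stats (pk) VALUES (14), (15), (16);
INSERT INTO datasets VALUES (14, 'cafe'), (15, 'bar'), (16, 'museum');
INSERT INTO business_details (category) VALUES ('cafe'), ('cafe'), ('cafe'), ('bar');
""")

# The subquery is evaluated once per updated row, tied to it by business_stats.pk,
# so each row receives the count for its own category.
conn.execute("""
UPDATE business_stats
SET b_total = (
    SELECT COUNT(bd.id)
    FROM business_details bd
    JOIN datasets d ON bd.category = d.category
    WHERE d.pk = business_stats.pk
)
""")

totals = conn.execute("SELECT pk, b_total FROM business_stats ORDER BY pk").fetchall()
print(totals)
```

Without the optional EXISTS filter, a pk whose category has no businesses is set to 0, as pk 16 is here.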

PostgreSQL Advanced Join

I am new to PostgreSQL.
I have two tables like this
Table A
id |value | type1 | type2 | type3
bigint |text | bigint | bigint | bigint
Table B
Id | description
bigint | text
Table A's type1, type2, and type3 columns hold ids from Table B, but there is no foreign key constraint.
I have to retrieve like this
Select a.id,
a.value,
b.description1(as of a.type1),
b.description2(as of a.type2),
b.description3(as of a.type3)
If you have too many columns, you should consider changing your database design.
TableA
id | value
1 | <something>
2 | <something>
TableAType
id | TableA.id | type_id | typeValue
1 | 1 | type1 | bigint
2 | 1 | type2 | bigint
3 | 1 | type3 | bigint
.....
4 | 1 | typeN | bigint
TableB (type_description)
Id | description
bigint | text
Then your query becomes simpler and isn't affected when you add or remove types.
SELECT TB.Description, TT.TypeValue
FROM TableAType TT
JOIN TableB TB
ON TT.Type_id = TB.id
Then you can pivot the rows (in PostgreSQL, for example, with crosstab from the tablefunc extension) to get the tabular data. Again, the advantage is that you can add or remove types without changing your query; you only need to update the types table.
You should (LEFT) JOIN tableB 3 times, and use a different alias for each one.
select ta.id, ta.value,
ta.type1, t1.description descri1,
ta.type2, t2.description descri2,
ta.type3, t3.description descri3
from tableA ta
left join tableB t1
on ta.type1 = t1.id
left join tableB t2
on ta.type2 = t2.id
left join tableB t3
on ta.type3 = t3.id;
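The triple join is easy to try on a couple of rows. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption, to keep it self-contained), with invented sample data. The second row has a NULL type2 and a type3 that matches nothing in tableB, which shows why LEFT JOIN is the safer choice when no foreign key enforces the reference.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (id INTEGER, value TEXT, type1 INTEGER, type2 INTEGER, type3 INTEGER);
CREATE TABLE tableB (id INTEGER, description TEXT);
INSERT INTO tableB VALUES (10, 'ten'), (20, 'twenty'), (30, 'thirty');
INSERT INTO tableA VALUES
    (1, 'x', 10, 20,   30),
    (2, 'y', 20, NULL, 99);   -- 99 has no matching row in tableB
""")

rows = conn.execute("""
SELECT ta.id, ta.value,
       t1.description AS descri1,
       t2.description AS descri2,
       t3.description AS descri3
FROM tableA ta
LEFT JOIN tableB t1 ON ta.type1 = t1.id   -- one alias of tableB per type column
LEFT JOIN tableB t2 ON ta.type2 = t2.id
LEFT JOIN tableB t3 ON ta.type3 = t3.id
ORDER BY ta.id
""").fetchall()

print(rows)
```

With inner joins instead, the second row would disappear from the result entirely.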