I have relational database and would like to use string_agg() since it seems fit to my need.
I want :
product_id | quiz_id
-----------+----------
1 | 1,6
2 | 2,7
3 | 3,8
4 | 4
Here is my database.
select quiz_id , product_id, lastmodified from dugong.quiz;
quiz_id | product_id | lastmodified
---------+------------+-------------------------------
1 | 1 | 2015-11-11 14:46:55.619162+07
2 | 2 | 2015-11-11 14:46:55.619162+07
3 | 3 | 2015-11-11 14:46:55.619162+07
4 | 4 | 2015-11-11 14:46:55.619162+07
5 | 5 | 2015-11-11 14:46:55.619162+07
6 | 1 | 2015-11-11 14:46:55.619162+07
7 | 2 | 2015-11-11 14:46:55.619162+07
8 | 3 | 2015-11-11 14:46:55.619162+07
My attempt :
Refer to document.
How to concatenate strings of a string field in a PostgreSQL 'group by' query?
http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES
select product_id , string_agg(quiz_id, ',' order by lastmodified) from dugong.quiz;
ERROR: function string_agg(integer, unknown) does not exist
LINE 1: select product_id , string_agg(quiz_id, ',' order by lastmod...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Postgres version :
PostgresApp 9.4.4.1
Update :
#code-monk
It still error.
select product_id , string_agg(quiz_id::int, ',' order by lastmodified) from dugong.quiz;
ERROR: function string_agg(integer, unknown) does not exist
LINE 1: select product_id , string_agg(quiz_id::int, ',' order by la...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Question :
What is wrong with my query?
Try this:
select product_id , string_agg(quiz_id::character varying, ',' order by lastmodified)
from quiz group by product_id;
String_agg function works with String values only ,You are getting the error because quiz_id is integer.
I have converted it to character varying and added group by for grouping the data product ID wise.
SQL Fiddle Example:http://sqlfiddle.com/#!15/9dafe/1
Related
I have an unusual problem I'm trying to solve with SQL where I need to generate sequential numbers for partitioned rows but override specific numbers with values from the data, while not breaking the sequence (unless the override causes a number to be used greater than the number of rows present).
I feel I might be able to achieve this by selecting the rows where I need to override the generated sequence value and the rows I don't need to override the value, then unioning them together and somehow using coalesce to get the desired dynamically generated sequence value, or maybe there's some way I can utilise recursive.
I've not been able to solve this problem yet, but I've put together a SQL Fiddle which provides a simplified version:
http://sqlfiddle.com/#!17/236b5/5
The desired_dynamic_number is what I'm trying to generate and the generated_dynamic_number is my current work-in-progress attempt.
Any pointers around the best way to achieve the desired_dynamic_number values dynamically?
Update:
I'm almost there using lag:
http://sqlfiddle.com/#!17/236b5/24
step-by-step demo:db<>fiddle
SELECT
*,
COALESCE( -- 3
first_value(override_as_number) OVER w -- 2
, 1
)
+ row_number() OVER w - 1 -- 4, 5
FROM (
SELECT
*,
SUM( -- 1
CASE WHEN override_as_number IS NOT NULL THEN 1 ELSE 0 END
) OVER (PARTITION BY grouped_by ORDER BY secondary_order_by)
as grouped
FROM sample
) s
WINDOW w AS (PARTITION BY grouped_by, grouped ORDER BY secondary_order_by)
Create a new subpartition within your partitions: This cumulative sum creates a unique group id for every group of records which starts with a override_as_number <> NULL followed by NULL records. So, for instance, your (AAA, d) to (AAA, f) belongs to the same subpartition/group.
first_value() gives the first value of such subpartition.
The COALESCE ensures a non-NULL result from the first_value() function if your partition starts with a NULL record.
row_number() - 1 creates a row count within a subpartition, starting with 0.
Adding the first_value() of a subpartition with the row count creates your result: Beginning with the one non-NULL record of a subpartition (adding the 0 row count), the first following NULL records results in the value +1 and so forth.
Below query gives exact result, but you need to verify with all combinations
select c.*,COALESCE(c.override_as_number,c.act) as final FROM
(
select b.*, dense_rank() over(partition by grouped_by order by grouped_by, actual) as act from
(
select a.*,COALESCE(override_as_number,row_num) as actual FROM
(
select grouped_by , secondary_order_by ,
dense_rank() over ( partition by grouped_by order by grouped_by, secondary_order_by ) as row_num
,override_as_number,desired_dynamic_number from fiddle
) a
) b
) c ;
column "final" is the result
grouped_by | secondary_order_by | row_num | override_as_number | desired_dynamic_number | actual | act | final
------------+--------------------+---------+--------------------+------------------------+--------+-----+-------
AAA | a | 1 | 1 | 1 | 1 | 1 | 1
AAA | b | 2 | | 2 | 2 | 2 | 2
AAA | c | 3 | 3 | 3 | 3 | 3 | 3
AAA | d | 4 | 3 | 3 | 3 | 3 | 3
AAA | e | 5 | | 4 | 5 | 4 | 4
AAA | f | 6 | | 5 | 6 | 5 | 5
AAA | g | 7 | 999 | 999 | 999 | 6 | 999
XYZ | a | 1 | | 1 | 1 | 1 | 1
ZZZ | a | 1 | | 1 | 1 | 1 | 1
ZZZ | b | 2 | | 2 | 2 | 2 | 2
(10 rows)
Hope this helps!
The real world problem I was trying to solve did not have a nicely ordered secondary_order_by column, instead it would be something a bit more randomised (a created timestamp).
For the benefit of people who stumble across this question with a similar problem to solve, a colleague solved this problem using a cartesian join, who's solution I'm posting below. The solution is Snowflake SQL which should be possible to adapt to Postgres. It does fall down on higher override_as_number values though unless the from table(generator(rowcount => 1000)) 1000 value is not increased to something suitably high.
The SQL:
with tally_table as (
select row_number() over (order by seq4()) as gen_list
from table(generator(rowcount => 1000))
),
base as (
select *,
IFF(override_as_number IS NULL, row_number() OVER(PARTITION BY grouped_by, override_as_number order by random),override_as_number) as rownum
from "SANDPIT"."TEST"."SAMPLEDATA" order by grouped_by,override_as_number,random
) --select * from base order by grouped_by,random;
,
cart_product as (
select *
from tally_table cross join (Select distinct grouped_by from base ) as distinct_grouped_by
) --select * from cart_product;
,
filter_product as (
select *,
row_number() OVER(partition by cart_product.grouped_by order by cart_product.grouped_by,gen_list) as seq_order
from cart_product
where CONCAT(grouped_by,'~',gen_list) NOT IN (select concat(grouped_by,'~',override_as_number) from base where override_as_number is not null)
) --select * from try2 order by 2,3 ;
select base.grouped_by,
base.random,
base.override_as_number,
base.answer, -- This is hard coded as test data
IFF(override_as_number is null, gen_list, seq_order) as computed_answer
from base inner join filter_product on base.rownum = filter_product.seq_order and base.grouped_by = filter_product.grouped_by
order by base.grouped_by,
random;
In the end I went for a simpler solution using a temporary table and cursor to inject override_as_number values and shuffle other numbers.
I have one table, say buchas containing a list of products. I had to create another table buchas_type with the type (class if you will) of the products in buchas.
Now I have to reference the class of product in buchas, by adding a type_id column.
So, my problem can be divided in two parts:
I need a subquery to "guess" the product type by its name. That should be somewhat easy, since the name of the product contains somewhere the name of its type. Example:
-----------------Name----------------- | ---Type---
BUCHA GOB 1600 GOB
The problem is I have a type GOB(2) that will mess things up with the type GOB. (I have also other look alike types raising the same problem).
So far, I got this:
SELECT buchas_type.id
FROM buchas_type
JOIN buchas ON buchas.description LIKE '%' || buchas_type.type || '%'
ORDER BY length(type) desc LIMIT 1;
That will solve the problem with the look alike types, since it returns the longest match. However, I need a WHERE clause otherwise I get always the same buchas_type_id. If I try for instance Where buchas.id = 50 I get a correct result.
The second part of the problem is the UPDATE command itself. I need to fill up the recently created buchas_type_id from the subquery I showed in (1). So, for every row in buchas, it has to search for the type in buchas_type using the current row's description as a parameter.
How to achieve that?
Assuming you have a primary key on buchas, the first part of your problem can be solved with distinct on.
select * from buchas;
buchas_id | description | buchas_type_id
-----------+-------------------+----------------
1 | BUCHA GOB 1600 |
2 | BUCHA GOB(2) 1700 |
(2 rows)
select * from buchas_type;
buchas_type_id | type
----------------+--------
1 | GOB
2 | GOB(2)
(2 rows)
select distinct on (b.buchas_id) b.buchas_id, t.buchas_type_id, b.description, t.type
from buchas b
join buchas_type t
on b.description ilike '%'||t.type||'%'
order by b.buchas_id, length(t.type) desc;
buchas_id | buchas_type_id | description | type
-----------+----------------+-------------------+--------
1 | 1 | BUCHA GOB 1600 | GOB
2 | 2 | BUCHA GOB(2) 1700 | GOB(2)
(2 rows)
With this solved, you can do an update...from:
with type_map as (
select distinct on (b.buchas_id) b.buchas_id, t.buchas_type_id
from buchas b
join buchas_type t
on b.description ilike '%'||t.type||'%'
order by b.buchas_id, length(t.type) desc
)
update buchas
set buchas_type_id = t.buchas_type_id
from type_map t
where t.buchas_id = buchas.buchas_id;
UPDATE 2
select * from buchas b join buchas_type t on t.buchas_type_id = b.buchas_type_id;
buchas_id | description | buchas_type_id | buchas_type_id | type
-----------+-------------------+----------------+----------------+--------
1 | BUCHA GOB 1600 | 1 | 1 | GOB
2 | BUCHA GOB(2) 1700 | 2 | 2 | GOB(2)
(2 rows)
I want to create web app which will use tree data structures. Users will be able to create, update and delete trees. I have the following table in PostgreSQL called nodes in database:
id INTEGER PRIMARY KEY,
name VARCHAR(50) NOT NULL UNIQUE,
parent_id INTEGER NULL REFERENCE nodes(id)
Getting data
I want to get data in the following form:
id | name | children
---|------|--------------
1 | a | [2,3]
2 | b | []
3 | c | [4]
4 | d | []
I created query which returns data in form
id | name | parent_id
---|------|--------------
1 | a |
2 | b | 1
3 | c | 1
4 | d | 3
And here is code:
WITH RECURSIVE nodes_cte(id, name, parent_id, level) AS (
SELECT nodes.id, nodes.name, nodes.parent_id, 0 AS level
FROM nodes
WHERE name = 'a'
UNION ALL
SELECT nodes.id, nodes.name, nodes.parent_id, level+1
FROM nodes
JOIN nodes_cte
ON nodes_cte.id = nodes.parent_id
)
SELECT * FROM nodes_cte;
Can I change SQL code to get what I want or should I do that in app??
Inserting data
I want to know what are the ways to insert data into the table. I think that following approach will work for me:
create sequence in database
increase sequence for number of elements in tree
manually compute ids in app and insert elements in the table
Are there better ways?
CREATE TABLE nodes
( id INTEGER PRIMARY KEY
, name VARCHAR(50) NOT NULL UNIQUE
, parent_id INTEGER NULL REFERENCES nodes(id)
);
-- I created query which returns data in form
INSERT INTO nodes(id,name,parent_id)VALUES
( 1 , 'a' , NULL)
,( 2 , 'b' , 1)
,( 3 , 'c' , 1)
,( 4 , 'd' , 3)
;
SELECT p.id, p.name
, array_agg(c.id) AS children
FROM nodes p
LEFT JOIN nodes c ON c.parent_id = p.id
GROUP BY p.id, p.name
;
Result:
id | name | children
----+------+----------
1 | a | {2,3}
2 | b | {NULL}
3 | c | {4}
4 | d | {NULL}
(4 rows)
Extra: using generate_series() to insert a bunch of records. Each record having id/3 as parent, (except when zero).
INSERT INTO nodes(id,name,parent_id)
SELECT gs, 'zzz_'|| gs::text, NULLIF(gs/3 , 0)
FROM generate_series ( 5,25) gs
;
INSERTING/UPDATING DATA
Normally, your front-end should not mess with sequences, but leave that to the DBMS. You already have a UNIQUE constraint on name, because it is a natural key . So, your front-end should use that key to address rows in the nodes table, like in:
CREATE TABLE nodes2
( id SERIAL NOT NULL PRIMARY KEY
, name VARCHAR(50) NOT NULL UNIQUE
, parent_id INTEGER NULL REFERENCES nodes(id)
);
INSERT INTO nodes2(name,parent_id)
SELECT 'Omg_'|| gs::text, NULLIF(gs/3 , 0)
FROM generate_series ( 1,15) gs
;
PREPARE upd (text, text) AS
-- child, parent
UPDATE nodes2 c
SET parent_id = p.id
FROM nodes2 p
WHERE p.name = $2 -- parent
AND c.name = $1 -- child
;
EXECUTE upd( 'Omg_12', 'Omg_11');
EXECUTE upd( 'Omg_15', 'Omg_11');
Result:
CREATE TABLE
INSERT 0 15
PREPARE
UPDATE 1
UPDATE 1
id | name | children
----+--------+-----------
1 | Omg_1 | {3,4,5}
2 | Omg_2 | {6,7,8}
3 | Omg_3 | {9,10,11}
4 | Omg_4 | {13,14}
5 | Omg_5 | {NULL}
6 | Omg_6 | {NULL}
7 | Omg_7 | {NULL}
8 | Omg_8 | {NULL}
9 | Omg_9 | {NULL}
10 | Omg_10 | {NULL}
11 | Omg_11 | {15,12}
12 | Omg_12 | {NULL}
13 | Omg_13 | {NULL}
14 | Omg_14 | {NULL}
15 | Omg_15 | {NULL}
(15 rows)
I have a table containing, for example, this data:
id | value | name | date
1 | 1 | 'one' | 2015-01-02
2 | 1 | 'two' | 2015-02-03
3 | 2 | 'three'| 2014-01-03
4 | 2 | 'four' | 2014-01-02
I want for each distinct value, the name of the row with the latest date. So:
value | name | date
1 | 'two' | 2015-02-03
2 | 'three'| 2014-01-03
I currently have this query: SELECT value, MAX(date) FROM table GROUP BY value, which gives me the value and date columns I'm looking for. How do I modify the query to add the name field? Simply adding it to the SELECT clause won't work, as Postgres will (understandably) complain I have to add it to the GROUP BY clause. But doing so will add it to the uniqueness check, and my query will return all 4 rows. All I need is the name of the row where it found the latest date.
distinct on() is the most efficient way to do this with Postgres
select distinct on (value) id, value, name, date
from the_table
order by value, date;
SQLFiddle example: http://sqlfiddle.com/#!15/dff68/1
This will give you all required fields:
select t1.* from table t1
inner join (
SELECT value, MAX(date) as date FROM table GROUP BY value
)t2 on t1.date=t2.date;
SQL Fiddle: http://sqlfiddle.com/#!15/9491f/2
I have following table:
CREATE TABLE Kundendaten (
beschreiben_knr INTEGER REFERENCES Kunde(knr) DEFERRABLE INITIALLY DEFERRED,
erstelldatum DATE,
anschrift VARCHAR(40),
sonderrabat INTEGER,
PRIMARY KEY (erstelldatum, beschreiben_knr)
);
If i make this query:
select * from Kundendaten ORDER BY erstelldatum DESC;
i get:
beschreiben_knr | erstelldatum | anschrift | sonderrabat
-----------------+--------------+---------------+-------------
1 | 2015-11-01 | Winkelgasse 5 | 0
2 | 2015-11-01 | Badeteich 7 | 10
3 | 2015-11-01 | Senfgasse 7 | 15
1 | 2015-10-30 | Sonnenweg 3 | 5
But i need to get only the entry for the highest date entry if there are more then one. In this case the last row should not appear.
How can i achieve this in postgresql?
You want something like WHERE erstelldatum = MAX(DATE) but that doesn't work. You can use a sub-query to get the newest date.
SELECT *
FROM Kundendaten
WHERE erstelldatum = (
SELECT MAX(erstelldatum) FROM Kundendaten
);
(SQL Fiddle)
Postgres will optimize that subquery so it is only run once, but you'll want to make sure erstelldatum is indexed.