UPDATE in a specific order - postgresql

So let's say I have a table:
SELECT * from test_table order by name;
 name | ord
------+-----
 a    |   4
 a    |   5
 b    |   2
 c    |   3
 d    |   1
And I want to change the ord such that it matches the alphabetized result of the "order by name" clause. My goal, therefore, is:
SELECT * from test_table order by name;
 name | ord
------+-----
 a    |   1
 a    |   2
 b    |   3
 c    |   4
 d    |   5
Is there a good way in Postgres to do this? I have a new sequence I can pull from; I'm just not sure how to do this cleanly in place, or if that's even possible. Or should I just store the results of the selection, then iterate over it and update each row, assigning a new ord value? (They all have unique IDs, so the repeated names shouldn't matter.)

You don't need any sequence for this.
The first step is to determine the new data:
SELECT
    *
FROM test_table AS test_table_old
LEFT JOIN (
    SELECT
        *, row_number() OVER (ORDER BY name, ord) AS ord_new
    FROM test_table
) AS test_table_new USING (name, ord)
;
Then convert this to an update:
UPDATE test_table SET
    ord = test_table_new.ord_new
FROM test_table AS test_table_old
LEFT JOIN (
    SELECT
        *, row_number() OVER (ORDER BY name, ord) AS ord_new
    FROM test_table
) AS test_table_new USING (name, ord)
WHERE (test_table.name, test_table.ord) = (test_table_old.name, test_table_old.ord)
;
If you need the values to come from a new sequence instead, replace "row_number() OVER (ORDER BY name, ord)" with "nextval('the_new_sequence_name')" and make sure the rows are ordered before nextval() is evaluated (see the sketch below).
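A minimal sketch of that variant, assuming a sequence named the_new_sequence_name already exists (the question says one is available); the ordering is pushed into an inner subquery so that, in practice, nextval() is called on rows that are already sorted by name:

-- assumed sequence; create it first if needed: CREATE SEQUENCE the_new_sequence_name;
UPDATE test_table SET
    ord = test_table_new.ord_new
FROM (
    SELECT
        *, nextval('the_new_sequence_name') AS ord_new
    FROM (
        -- sort first, so the sequence values follow the alphabetical order
        SELECT * FROM test_table ORDER BY name, ord
    ) AS sorted_rows
) AS test_table_new
WHERE (test_table.name, test_table.ord) = (test_table_new.name, test_table_new.ord)
;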

Related

finding a given item value in an array of a given column

I have a table like this:
mTable:
| id | text[] mVals |
|----|--------------|
| 1  | {"a,b"}      |
| 2  | {"a"}        |
| 3  | {"b"}        |
I'd like a query to return both rows, {a,b} and {b}, if I specify only b; but it doesn't return the rows having at least one of the specified values, it only returns rows with exactly the value specified.
I have tried this:
SELECT mVals
FROM mTable
WHERE ARRAY['a'] && columnarray; -- Returns only {'a'}
Also tried:
SELECT mVals
FROM mTable
WHERE mVals && '{"a"}'; -- Returns only {'a'}
Nothing seems to be working as it should. What could be the issue?
To me it looks like it is working as expected, recreating your case with:
create table test(id int, mvals text[]);
insert into test values(1, '{a}');
insert into test values(2, '{a,b}');
insert into test values(3, '{b}');
A query similar to the first one you posted works:
SELECT mVals
FROM test
WHERE ARRAY['a'] && mvals;
Results
mvals
-------
{a}
{a,b}
(2 rows)
and with b instead of a
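presumably the same query with 'b' substituted, i.e.:
SELECT mVals
FROM test
WHERE ARRAY['b'] && mvals;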
mvals
-------
{a,b}
{b}
(2 rows)
P.S.: you should probably use the containment operator @> to check whether a value (or an array) is contained in another array.
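For example, against the test table created above (a small sketch: @> asks whether the left array contains every element of the right one, while && asks whether the two arrays overlap at all):
SELECT mvals FROM test WHERE mvals @> ARRAY['b'];      -- contains b: {a,b}, {b}
SELECT mvals FROM test WHERE mvals && ARRAY['a','b'];  -- overlaps {a,b}: {a}, {a,b}, {b}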

Getting earliest date by matching two columns, and returning array

I have a query I'm trying to write, but I cannot get the syntax quite right. In the table below, I have a set of dates with an id; if a parent does not exist for an id, its parent_id is NULL.
I'm trying to get an output of all the children of a parent that have the same date as the parent. As shown in the expected output below, [D#P, Z#Z] would be assigned to A because they have the same date and their parent_id is A; however, Q#L would not be assigned to A because its date is not 1/1/2019. Nothing is assigned to B or D because they have no children on their created dates.
I've found some posts on how to do this in Postgres, however because I'm using Redshift some of the operations don't work.
Any help would be appreciated.
date     | id  | parent_id
---------|-----|----------
1/1/2019 | A   | NULL
1/1/2019 | B   | NULL
1/1/2019 | C   | NULL
1/1/2019 | D#P | A
1/1/2019 | Z#Z | A
1/1/2019 | K#H | C
1/2/2019 | Q#L | A
1/3/2019 | D   | NULL
1/4/2019 | H#Q | C
Expected Output:
date     | id | children
---------|----|------------
1/1/2019 | A  | [D#P, Z#Z]
1/1/2019 | C  | [K#H]
Current Work:
SELECT
first_value(case
when parent_id
then date
end)
over (
partition by parent_id
order by date
rows between unbounded preceding and unbounded following)
as first_date)
id,
list_agg(parent_id)
FROM foo
I don't know why I am getting an error when using the LISTAGG aggregate function, so I decided to use SELECT DISTINCT with the LISTAGG window function instead:
WITH input as (
    SELECT '1/1/2019' as date, 'A' as id, NULL as parent_id UNION ALL
    SELECT '1/1/2019', 'B', NULL UNION ALL
    SELECT '1/1/2019', 'C', NULL UNION ALL
    SELECT '1/1/2019', 'D#P', 'A' UNION ALL
    SELECT '1/1/2019', 'Z#Z', 'A' UNION ALL
    SELECT '1/1/2019', 'K#H', 'C' UNION ALL
    SELECT '1/2/2019', 'Q#L', 'A' UNION ALL
    SELECT '1/3/2019', 'D', NULL UNION ALL
    SELECT '1/4/2019', 'H#Q', 'C'
), parents as (
    SELECT *
    FROM input
    WHERE parent_id IS NULL
), children as (
    SELECT *
    FROM input
    WHERE parent_id IS NOT NULL
)
SELECT DISTINCT
    parents.date,
    parents.id,
    listagg(children.id, ',') WITHIN GROUP (ORDER BY children.id) OVER (PARTITION BY parents.id, parents.date) as children
FROM parents JOIN children
    ON parents.id = children.parent_id
    AND parents.date = children.date
Outputs:
date     | id | children
---------|----|---------
1/1/2019 | A  | D#P,Z#Z
1/1/2019 | C  | K#H
A solution with GROUP BY and the LISTAGG aggregate function would, to me, be a more natural way of solving your problem:
WITH input as (
    [...]
SELECT
    parents.date,
    parents.id,
    listagg(children.id, ',') WITHIN GROUP (ORDER BY children.id)
FROM parents JOIN children
    ON parents.id = children.parent_id
    AND parents.date = children.date
GROUP BY parents.id, parents.date
Sadly it returns an error which I don't really understand:
[XX000][500310] Amazon Invalid operation: One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT, etc; java.lang.RuntimeException: com.amazon.support.exceptions.ErrorException: Amazon Invalid operation: One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT, etc;
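The error seems to come from the sample data being built purely from literals: LISTAGG runs only on the compute nodes, and a query that references no user-created table is executed entirely on the leader node. Run against the real table (foo in the attempt above), the GROUP BY form should be accepted; a hedged sketch using the question's column names:
SELECT p.date,
       p.id,
       LISTAGG(c.id, ',') WITHIN GROUP (ORDER BY c.id) AS children
FROM foo p
JOIN foo c
    ON c.parent_id = p.id
    AND c.date = p.date
WHERE p.parent_id IS NULL
GROUP BY p.date, p.id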

Select only the rows with the latest date in postgres

I only want the latest date for each house. The number of entries per house varies; sometimes there is one sale, sometimes multiple.
Date of sale | house number | street | price  | uniqueref
-------------|--------------|--------|--------|----------
15-04-1990   | 1            | castle | 100000 | 1xzytt
15-04-1995   | 1            | castle | 200000 | 2jhgkj
15-04-2005   | 1            | castle | 800000 | 3sdfsdf
15-04-1995   | 2            | castle | 200000 | 2jhgkj
15-04-2005   | 2            | castle | 800000 | 3sdfsdf
What I have working is as follows:
I create a VIEW (v_orderedhouses) ordered by house number and street, with the date ordered DESC so that the latest date is returned first. I then feed that into another VIEW (v_latesthouses) using DISTINCT ON (house number, street) - see the sketch after the result table. Which gives me:
Date of sale | house number | street | price  | uniqueref
-------------|--------------|--------|--------|----------
15-04-2005   | 1            | castle | 800000 | 3sdfsdf
15-04-2005   | 2            | castle | 800000 | 3sdfsdf
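For reference, the two-view approach described above might look roughly like this; the table and column names (houses, house_number, street, date_of_sale) are assumptions, and the second view relies on the row order coming out of the first one to decide which row DISTINCT ON keeps:
-- hypothetical definitions of the two views described above
CREATE VIEW v_orderedhouses AS
    SELECT *
    FROM houses
    ORDER BY house_number, street, date_of_sale DESC;

CREATE VIEW v_latesthouses AS
    SELECT DISTINCT ON (house_number, street) *
    FROM v_orderedhouses;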
This works but seems like there should be a more elegant solution. Can I get to the filtered view in one step?
You do not need to create a bunch of views, just:
select distinct on (street, house_number)
    *
from your_table
order by
    street, house_number, -- these fields must lead the "order by" because they appear in the "distinct on" expression
    date_of_sale desc;
To make this query faster you could create an index matching the ORDER BY:
create index index_name on your_table(street, house_number, date_of_sale desc);
Do not forget to analyse your tables regularly (depending on how fast they grow):
analyse your_table;
You can use the window function row_number() for this:
select *
from (
    select your_table.*,
           row_number() over (partition by street, house_number order by date_of_sale desc) as rn
    from your_table
) tt
where rn = 1
This is what I use and it works fast (it is a generic solution; as far as I have tested, every database can do this):
SELECT t1.date_of_sale, t1.house_number, t1.street, t1.price, t1.uniqueref
FROM your_table t1
LEFT JOIN your_table t2
    ON t2.house_number = t1.house_number
    AND t2.street = t1.street
    AND t2.date_of_sale > t1.date_of_sale
WHERE t2.uniqueref IS NULL

HQL select rows with max

I got this table:
+---+----+----+----+
|ID |KEY1|KEY2|COL1|
+---+----+----+----+
|001|aaa1|bbb1|ccc1|
|101|aaa1|bbb1|ddd2|
|002|aaa2|bbb2|eee3|
|102|aaa2|bbb2|fff4|
|003|aaa3|bbb3|ggg5|
|103|aaa3|bbb3|hhh6|
+---+----+----+----+
The result must contain, for each group of rows with equal KEY1 and KEY2, the row with the highest ID.
+---+----+----+----+
|ID |KEY1|KEY2|COL1|
+---+----+----+----+
|101|aaa1|bbb1|ddd2|
|102|aaa2|bbb2|fff4|
|103|aaa3|bbb3|hhh6|
+---+----+----+----+
Since in HQL I can't do a subquery like:
select * from (select....)
How can I perform this query?
**SOLUTION**
Actually, the solution was a little bit more complex, so I want to share it: KEY1 and KEY2 were on another table, which joins to the first table on two keys.
+-----+-------+-------+-------+
|t1.ID|t2.KEY1|t2.KEY2|t1.COL1|
+-----+-------+-------+-------+
| 001| aaa1| bbb1| ccc1|
| 101| aaa1| bbb1| ddd2|
| 002| aaa2| bbb2| eee3|
| 102| aaa2| bbb2| fff4|
| 003| aaa3| bbb3| ggg5|
| 103| aaa3| bbb3| hhh6|
+-----+-------+-------+-------+
I used this CORRECT query:
SELECT t1.ID, t2.KEY1, t2.KEY2, t1.COL1
FROM yourTable1 t1, yourTable2 t2
WHERE
t1.JoinCol1 = t2.JoinCol1 and t1.JoinCol2=t2.JoinCol2 and
t1.ID = (SELECT MAX(s1.ID) FROM yourTable1 s1, yourTable2 s2
WHERE
s1.JoinCol1 = s2.JoinCol1 and s1.JoinCol2=s2.JoinCol2 and
s2.KEY1 = t2.KEY1 AND s2.KEY2 = t2.KEY2)
If we were writing this query to be run directly on a regular database, such as MySQL or SQL Server, we might be tempted to join to a subquery. However, from what I read here, subqueries in HQL can only appear in the SELECT or WHERE clauses. We can phrase your query as follows, using the WHERE clause to implement your logic.
The query will be:
SELECT t1.ID, t1.KEY1, t1.KEY2, t1.COL1
FROM yourTable t1
WHERE t1.ID = (SELECT MAX(t2.ID) FROM yourTable t2
WHERE t2.KEY1 = t1.KEY1 AND t2.KEY2 = t1.KEY2)
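For comparison, the subquery join we might be tempted to write on a regular database such as MySQL or SQL Server (but which HQL does not allow in the FROM clause) could look roughly like this:
SELECT t1.ID, t1.KEY1, t1.KEY2, t1.COL1
FROM yourTable t1
JOIN (
    SELECT KEY1, KEY2, MAX(ID) AS max_id
    FROM yourTable
    GROUP BY KEY1, KEY2
) m ON m.KEY1 = t1.KEY1 AND m.KEY2 = t1.KEY2 AND m.max_id = t1.ID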

How to group by in DB2 IBM and get the first item in each group?

I have a table like this:
|sub_account|name|email|
|-----------|----|-----|
// same account and same name: email different
|a1 |n1 |e1 |
|a1 |n1 |e2 |
// same account, name and email
|a2 |n2 |e3 |
|a2 |n2 |e3 |
I would like a query to get a table like this:
|sub_account|name|email|
|-----------|----|-----|
// nothing to do here
|a1 |n1 |e1 |
|a1 |n1 |e2 |
// remove the one that is exactly the same, but leave at least one
|a2 |n2 |e3 |
I've tried:
select sub_account, name, first(email)
from table
group by sub_account, name
but as you know, "first" doesn't exist in DB2; what is the alternative to it?
thanks
select sub_account, name, email
from table
group by sub_account, name, email
I am not sure about DB2. In SQL Server, you can use DISTINCT for your issue; you may try:
SELECT DISTINCT sub_account, name, email
from TABLE
Create a subquery with the table values + a counter (pos) that gets increased for each row and gets reset to 1 each time a new sub-account+name is reached.
The final query filters out all results from the subquery other than those with pos 1 (i.e. first entries of the group):
select *
from (
    select sub_account, name, email,
           ROW_NUMBER() OVER (PARTITION BY sub_account, name
                              ORDER BY email DESC) AS pos
    from table
) AS t
where pos = 1
I found a way:
SELECT sub_account,
       name,
       CASE WHEN split_index = 0 THEN MyList
            ELSE SUBSTR(MyList, 1, LOCATE('|', MyList) - 1)
       END
FROM (select sub_account, name,
             LISTAGG(email, '|') as MyList,
             LOCATE('|', LISTAGG(email, '|')) AS split_index
      from TABLE
      group by sub_account, name) AS TABLEA
This aggregates the emails into a '|'-separated list, then splits it and takes the first one.