PostgreSQL duplicating rows with two column duplicates - postgresql

In PostgreSQL, I am looking for a solution to the following problem.
There are two columns, 'start' and 'end', together with a 'date' column. Currently a single row for a date can have both 'start' and 'end' filled in.
I would like every row to carry a value in only one of 'start' or 'end', duplicating the date where necessary.
current:
id date start end
1 2017-03-13 a [null]
2 2017-03-14 [null] a
3 2017-03-14 b [null]
4 2017-03-16 [null] b
5 2017-03-16 c c
wish:
id date start end
1 2017-03-13 a [null]
2 2017-03-14 [null] a
3 2017-03-14 b [null]
4 2017-03-16 [null] b
5 2017-03-16 c [null]
6 2017-03-16 [null] c
Does anyone have an idea?

If I understood your problem correctly, and you want exactly one of start and "end" to be set, and the combination with date unique, you can do this:
ALTER TABLE tab
ADD CHECK(start IS NULL AND "end" IS NOT NULL
OR start IS NOT NULL AND "end" IS NULL);
CREATE UNIQUE INDEX ON tab (date, COALESCE(start, "end"));
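The constraint and index above only guard the data; they do not turn row 5 of the sample into two rows, and adding the CHECK is rejected while a row with both columns set still exists. A rough sketch of a query that produces the wished-for layout follows. It assumes the table is called tab as in the answer; the names val, is_end and s are made up for the example, and the id is simply regenerated with row_number().

```sql
-- Split every row into at most one "start" row and one "end" row.
SELECT row_number() OVER (ORDER BY date, val, is_end) AS id,
       date,
       CASE WHEN NOT is_end THEN val END AS start,
       CASE WHEN is_end THEN val END     AS "end"
FROM (
    -- one row per non-null start value
    SELECT date, start AS val, false AS is_end
    FROM tab
    WHERE start IS NOT NULL
    UNION ALL
    -- one row per non-null end value
    SELECT date, "end", true
    FROM tab
    WHERE "end" IS NOT NULL
) AS s
ORDER BY id;
```

The result could be loaded into a fresh table (for example with CREATE TABLE ... AS) before the CHECK constraint and the unique index are added.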

Related

Postgresql unique index: only one date for one foreign key

I have a table "foo":
ID PRODUCT_ID END_DATE
------------------------------
1 1 NULL
2 1 2022-02-02
3 1 2022-01-06 <- this row should not be allowed
4 2 NULL
5 2 2022-01-23
6 3 NULL
7 3 NULL
How can I create a unique index so that each product ID can have only one row with a date?
Create a conditional unique index:
CREATE UNIQUE INDEX u_productid_end_date_not_null
ON foo(product_Id)
WHERE end_date IS NOT NULL; -- this one will do the trick
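A partial index only contains the rows that match its WHERE clause, so any number of rows with a NULL end_date remain allowed per product. Note that the index cannot be created while the sample data still holds rows 2 and 3 (two dates for product 1). The sketch below assumes the index is already in place; the ids 8 to 10 and the dates are invented for illustration.

```sql
-- Accepted: rows with a NULL end_date are not part of the index.
INSERT INTO foo (id, product_id, end_date) VALUES (8, 3, NULL);

-- Accepted: first dated row for product 3.
INSERT INTO foo (id, product_id, end_date) VALUES (9, 3, '2022-03-01');

-- Rejected with a unique violation: product 3 already has a dated row.
INSERT INTO foo (id, product_id, end_date) VALUES (10, 3, '2022-03-02');
```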

Type bigint but expression is of type character varying

I'm hoping someone can lend a hand with this:
Trying to insert one row per order_id in a database that is running in RedShift, and sometimes subscription_id contains more than 1 value. This creates duplicate rows, so I figured I would LISTAGG. This is the line:
LISTAGG(DISTINCT CAST(script.subscription_id AS VARCHAR), ',') AS subscription_id
The subscription_id column is an int8. After getting the character varying error I tried to CAST, but for some reason that does not work either. Does LISTAGG not support this kind of nested CAST? If not, is there a way to achieve this?
ORIGINAL:
order_id subscription_id
1 123
2 124
3 125
1 126
2 127
IDEAL:
order_id subscription_id
1 123,126
2 124,127
3 125
Both columns are of int8.
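One possible reading of the error, offered as an assumption since the full INSERT is not shown: LISTAGG always returns a character value, so its result cannot be stored in a target column that is still int8, no matter how the input is cast. A minimal sketch under that assumption, where order_subscriptions is a hypothetical target table and script is the source relation named in the question:

```sql
-- Hypothetical target table: the aggregated list is text, not a number.
CREATE TABLE order_subscriptions (
    order_id         BIGINT,
    subscription_ids VARCHAR(256)   -- holds values such as '123,126'
);

-- One row per order_id, with its subscription ids joined by commas.
INSERT INTO order_subscriptions (order_id, subscription_ids)
SELECT order_id,
       LISTAGG(DISTINCT CAST(subscription_id AS VARCHAR), ',') AS subscription_ids
FROM script
GROUP BY order_id;
```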

Inconsistency datetime PostgreSQL default value now()

I have this situation, I have this table named balance:
id -> Integer Auto Increment not null
balance -> numeric(19) not null
date -> datetime default now() not null
id | balance | date
 1 | 100     | 2019-09-12 16:15:29.091720
 2 | 99      | 2019-09-12 16:15:33.404119
 3 | 98      | 2019-09-12 16:15:33.412087
 4 | 97      | 2019-09-12 16:15:33.425252
 5 | 96      | 2019-09-12 16:15:33.442137
 6 | 95      | 2019-09-12 16:15:33.513825  <- time A
 7 | 94      | 2019-09-12 16:15:33.444407  <- time B
Then I insert into the table, supplying only one column, balance. Example insert, executed from multiple threads:
INSERT INTO balance(balance) VALUES(100)
Time B is earlier than time A, even though the ids are 6 and then 7. Are the rows processed in id order?
For example, if id 6 is processed first and 7 is inserted after it, shouldn't 6 have an earlier time than 7?
Any clue why this happens?
If the transaction that inserted the row with id = 7 started earlier than the one that inserted id = 6, then this is possible. Note that what matters is the time the transaction started, not the time the insert was executed within that transaction.
As documented in the manual, now() returns the time at the start of the transaction (it is the same as transaction_timestamp()), not the "current" time.
If you need the actual current time, change your default value to clock_timestamp().
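The difference can be seen inside a single transaction, and the default can be switched with ALTER TABLE. This is a small sketch against the balance table from the question; the one-second pg_sleep is only there to make the gap visible.

```sql
BEGIN;
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;
SELECT pg_sleep(1);
-- txn_start is unchanged here, while wall_clock has advanced by about a second.
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;
COMMIT;

-- Use the wall-clock time for newly inserted rows.
ALTER TABLE balance ALTER COLUMN date SET DEFAULT clock_timestamp();
```

Even with clock_timestamp(), concurrent sessions are not guaranteed to produce ids and timestamps in exactly the same order, because the id and the timestamp are generated at slightly different moments.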

How to select only one result per condition met inside an individual table (no joins)?

I have a table containing all the trips taken by different cars. I've filtered this table down to trips that had multiple stops. Now all I want to do is get the first stop that each car had.
What I've got is:
Car ID
Date_depart
Date_arrive
Count (from a previous table creation)
I've filtered this table by grouping on Car ID + Date Depart and keeping the groups with multiple date_arrive values for a single date_depart. Now I'm trying to figure out how to get back only the first stop, but I am completely stuck. Other than the lateral join X, order by Z limit 1 method, I have no idea how to get back only the first result in this table.
Here's some sample data:
Car ID Date_depart Date_arrive Count
949 2017-01-01 2017-01-05 2
949 2017-01-01 2017-01-09 2
1940 2017-01-09 2017-01-11 3
1940 2017-01-09 2017-01-14 3
1940 2017-01-09 2017-01-28 3
949 2018-04-19 2018-04-23 2
949 2018-04-19 2018-04-26 2
and the expected result would be:
Car ID Date_depart Date_arrive Count
949 2017-01-01 2017-01-05 2
1940 2017-01-09 2017-01-11 3
949 2018-04-19 2018-04-23 2
Any help?
You need DISTINCT ON
SELECT DISTINCT ON (date_depart, car_id)
*
FROM
trips
ORDER BY date_depart, car_id, date_arrive
This gives you the first (ordered) row of each group (date_depart, car_id)
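DISTINCT ON is specific to PostgreSQL. If a more portable form is preferred, the same "first row per group" result can be sketched with the row_number() window function. The table name trips and the lowercase column names follow the answer above; rn and ranked are names made up for the example.

```sql
SELECT car_id, date_depart, date_arrive
FROM (
    SELECT t.*,
           -- number the stops of each (car, departure) group by arrival date
           row_number() OVER (PARTITION BY car_id, date_depart
                              ORDER BY date_arrive) AS rn
    FROM trips AS t
) AS ranked
WHERE rn = 1                 -- keep only the earliest arrival per group
ORDER BY date_depart, car_id;
```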

PostgreSQL group by error

I have a relation in a PostgreSQL database called 'processed_data' having the following schema:
Date -> date type, shop_id -> integer type, item_category_id -> integer type, sum_item_cnt_day -> real type.
Displaying the first 5 rows of the relation is as follows:
date | shop_id | item_category_id | sum_item_cnt_day
------+-----------+--------------------+------------------
2014-12-29 | 49 | 3 | 4
2014-12-29 | 49 | 6 | 1
2014-12-29 | 49 | 7 | 1
2014-12-29 | 49 | 12 | 3
2014-12-29 | 49 | 16 | 1
Now, 'shop_id' has 60 unique shops numbered 0-59. Each shop sells items that are grouped by 'item_category_id', and 'sum_item_cnt_day' is the number of items sold by a shop in that item category.
I am now trying to further aggregate the data by just trying to get the following columns as final result-
date, shop_id, sum_item_cnt_day
So that, data is aggregated according to number of all items sold in 'item_category_id' per shop (denoted by 'shop_id') and calculating sum of 'sum_item_cnt_day'.
When I try to execute the following SQL command-
select date, shop_id, sum(sum_item_cnt_day) from processed_data group by shop_id;
It gives the error-
ERROR: column "processed_data.date" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select date, shop_id, sum(sum_item_cnt_day) from processed_d...
^
Even the following SQL command-
select date, shop_id, sum(sum_item_cnt_day) from processed_data where date between '2013-01-01' and '2013-01-31' group by shop_id;
Gives the error-
ERROR: column "processed_data.date" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select date, shop_id, sum(sum_item_cnt_day) from processed_d...
^
Any suggestions as to what's going wrong and what am I missing?
Thanks!
The simplest fix, which is what I think you want, would be to just add date to the GROUP BY clause:
SELECT date, shop_id, SUM(sum_item_cnt_day)
FROM processed_data
GROUP BY date, shop_id;
If you really don't want sums taken for each shop on each day, but rather for each shop over all days, then you will have to think of which of the many dates you want to display.
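For that second case, one option is to aggregate the date as well, for example by showing the first and last date that contributed to each shop's total. A sketch, where the aliases first_date, last_date and total_items are invented for the example:

```sql
SELECT shop_id,
       min(date)             AS first_date,   -- earliest day in the shop's data
       max(date)             AS last_date,    -- latest day in the shop's data
       sum(sum_item_cnt_day) AS total_items
FROM processed_data
GROUP BY shop_id
ORDER BY shop_id;
```

Every selected column is now either in the GROUP BY clause or wrapped in an aggregate, which is the rule the error message refers to.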