How to increment a value in a counter table - PostgreSQL

In my table I have the following schema:
id - integer | date - text | name - text | count - integer
I just want to count some actions.
I want to insert a row with count = 1 when no row for date = '30-04-2019' exists yet.
I want to add +1 to count when such a row already exists.
My idea is:
UPDATE "call" SET count = (1 + (SELECT count
FROM "call"
WHERE date = '30-04-2019'))
WHERE date = '30-04-2019'
But it does not work when the row doesn't exist yet.
Is this possible without extra triggers, etc.?

You can use a writeable CTE to achieve this. Additionally, the UPDATE statement can be simplified to a simple set count = count + 1; there is no need for a sub-select.
with updated as (
update "call"
set count = count + 1
where date = '30-04-2019'
returning id
)
insert into "call" (date, count)
select '30-04-2019', 1
where not exists (select *
from updated);
If the update did not find a row, the where not exists condition will be true and the insert will be executed.
Note that the above is not safe for concurrent execution from multiple transactions. If you want to make this safe, create a unique index on the date column. Then use an INSERT ... ON CONFLICT instead:
insert into "call" (date, count)
values ('30-04-2019', 1)
on conflict (date)
do update
set count = "call".count + 1;
Again: the above requires a unique index (or constraint) on the date column.
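For example, a minimal sketch of such an index, using the table and column names from the question (the index name is arbitrary):
create unique index call_date_uidx on "call" (date);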
Unrelated to the immediate problem, but: storing dates in a text column is a really, really bad idea. You should change your table definition and change the data type for the "date" column to date.
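If you do change the column, a sketch of the migration could look like this, assuming every stored value uses the DD-MM-YYYY format shown above:
alter table "call"
    alter column "date" type date
    using to_date("date", 'DD-MM-YYYY');  -- fails if any existing value does not match the format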

Related

PostgreSQL array of data composite update element using where condition

I have a composite type:
CREATE TYPE mydata_t AS
(
user_id integer,
value character(4)
);
Also, I have a table that uses this composite type as an array of mydata_t.
CREATE TABLE tbl
(
id serial NOT NULL,
data_list mydata_t[],
PRIMARY KEY (id)
);
Here I want to update the mydata_t element in data_list whose user_id is 100000.
But I don't know which array element has user_id equal to 100000, so I first have to search for that element; that's my problem, I don't know how to write the query. In other words, I want to update the value of the array element whose user_id equals 100000 (and where the id of tbl is, for example, 1). What should my query be?
Something like this (I know it's wrong !!!)
UPDATE "tbl" SET "data_list"[i]."value"='YYYY'
WHERE "id"=1 AND EXISTS (SELECT ROW_NUMBER() OVER() AS i
FROM unnest("data_list") "d" WHERE "d"."user_id"=10000 LIMIT 1)
For example, this is my tbl data:
Row1 => id = 1, data = ARRAY[ROW(5,'YYYY'),ROW(6,'YYYY')]
Row2 => id = 2, data = ARRAY[ROW(10,'YYYY'),ROW(11,'YYYY')]
Now I want to update tbl where id is 2 and set the value of the tbl.data element whose user_id equals 11 to 'XXXX'.
In fact, the final result of Row2 will be this:
Row2 => id = 2, data = ARRAY[ROW(10,'YYYY'),ROW(11,'XXXX')]
If you know the current value of value, you can use the array_replace() function to make the change:
UPDATE tbl
SET data_list = array_replace(data_list, (11, 'YYYY')::mydata_t, (11, 'XXXX')::mydata_t)
WHERE id = 2
If you do not know the current value of value, the situation becomes more complex:
UPDATE tbl SET data_list = data_arr
FROM (
-- UPDATE doesn't allow aggregate functions so aggregate here
SELECT array_agg(new_data) AS data_arr
FROM (
-- For the id value, get the data_list values that are NOT modified
SELECT (user_id, value)::mydata_t AS new_data
FROM tbl, unnest(data_list)
WHERE id = 2 AND user_id != 11
UNION
-- Add the values to update
VALUES ((11, 'XXXX')::mydata_t)
) x
) y
WHERE id = 2
You should keep in mind, though, that there is an awful lot of work going on in the background that cannot be optimised. The array of mydata_t values has to be examined from start to finish and you cannot use an index for this. Furthermore, an update actually inserts a new row in the underlying file on disk, and if your array has more than a few entries this involves substantial work. This gets even more problematic when your arrays are larger than the page size of your PostgreSQL server, typically 8kB.
Even though array_replace sounds like changes are made in place (and in memory they indeed are), the UPDATE command writes a completely new tuple to disk. So if you have 4,000 array elements, at least 40kB of data has to be read (8 bytes for the mydata_t type on a typical system x 4,000 = 32kB in a TOAST file, plus the main page of the table, 8kB) and then written back to disk after the update. A real performance killer.
As #klin pointed out, this design may be more trouble than it is worth. Should you make data_list a separate table (as I would do), the update query becomes:
UPDATE data_list SET value = 'XXXX'
WHERE id = 2 AND user_id = 11
This will have MUCH better performance, especially if you add the appropriate indexes. You could then still create a view to publish the data in an aggregated form with a custom type if your business logic so requires.
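A minimal sketch of that separate table, assuming the data_list array column is then dropped from tbl and keeping the column names used in the query above:
CREATE TABLE data_list
(
    id      integer NOT NULL REFERENCES tbl (id),  -- which tbl row this entry belongs to
    user_id integer NOT NULL,
    value   character(4),
    PRIMARY KEY (id, user_id)                      -- assumes a user appears at most once per tbl row
);
-- optional: a view that publishes the data in the old aggregated form
CREATE VIEW tbl_data_view AS
SELECT id, array_agg((user_id, value)::mydata_t) AS data_list
FROM data_list
GROUP BY id;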

SQL: How to select first record per day, assuming that each day contains more than 1 value

I am trying to write a SQL query where the results would show the first value (ID) per user per day for the last year.
I tried using the query below and am able to get results for one day but when I try to change the time range to > 2021-06-01, it does not give me the results I expect.
select * from table
where value in
(
SELECT min(value)
FROM table
WHERE valueid = x
group by user
) and Time = '2022-05-30' and value is not null
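For reference, a common PostgreSQL way to get the first row per user per day is DISTINCT ON; this is only a sketch reusing the placeholder table and column names from the query above:
select distinct on ("user", time::date) *
from "table"
where time > '2021-06-01'   -- "for the last year", per the question
  and value is not null
order by "user", time::date, time;  -- the earliest row in each (user, day) group wins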

unique date field postgresql default value

I have a date column which I want to be unique once populated, but want the date field to be ignored if it is not populated.
In MySQL the way this is accomplished is to set the date column to "not null" and give it a default value of '0000-00-00' - this allows all other fields in the unique index to be "checked" even if the date column is not populated yet.
This does not work in PostgreSQL because '0000-00-00' is not a valid date, so you cannot store it in a date field (this makes sense to me).
At first glance, leaving the field nullable seemed like an option, but this creates a problem:
=> create table uniq_test(NUMBER bigint not null, date DATE, UNIQUE(number, date));
CREATE TABLE
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> select * from uniq_test;
number | date
--------+------
1 |
1 |
1 |
1 |
(4 rows)
NULL apparently "isn't equal to itself" and so it does not count towards constraints.
If I add an additional unique constraint only on the number field, it checks only number and not date, and so I cannot have two rows with the same number and different dates.
I could select a default date that is a 'valid date' (but outside working scope) to get around this, and could (in fact) get away with that for the current project, but there are actually cases I might be encountering in the next few years where it will not in fact be evident that the date is a non-real date just because it is "a long time ago" or "in the future."
The advantage the '0000-00-00' mechanic had for me was precisely that this date isn't real and therefore indicated a non-populated entry (where 'non-populated' was a valid uniqueness attribute). When I look around for solutions to this on the internet, most of what I find is "just use NULL" and "storing zeros is stupid."
TL;DR
Is there a PostgreSQL best practice for needing to include "not populated" as a possible value in a unique constraint including a date field?
Not clear what you want. This is my guess:
create table uniq_test (number bigint not null, date date);
create unique index i1 on uniq_test (number, date)
where date is not null;
create unique index i2 on uniq_test (number)
where date is null;
There will be a unique constraint for non-null dates and another one for null dates, effectively turning the (number, date) tuples into distinct values.
Check partial index
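A quick sketch of how those two partial indexes behave, using the uniq_test table as recreated in this answer:
insert into uniq_test(number) values (1);                     -- ok: first row with a null date
insert into uniq_test(number) values (1);                     -- fails: i2 forbids a second (1, null)
insert into uniq_test(number, date) values (1, '2019-04-30'); -- ok: date is not null
insert into uniq_test(number, date) values (1, '2019-04-30'); -- fails: i1 forbids the duplicate pair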
It's not a best practice, but you can do it this way:
t=# create table so35(i int, d date);
CREATE TABLE
t=# create unique index i35 on so35(i, coalesce(d,'-infinity'));
CREATE INDEX
t=# insert into so35 (i) select 1;
INSERT 0 1
t=# insert into so35 (i) select 2;
INSERT 0 1
t=# insert into so35 (i) select 2;
ERROR: duplicate key value violates unique constraint "i35"
DETAIL: Key (i, (COALESCE(d, '-infinity'::date)))=(2, -infinity) already exists.
STATEMENT: insert into so35 (i) select 2;

updatexml for particular rows only

Context: I want to increase the allowance value of some employees from £1875 to £7500, and update their balance to be £7500 minus whatever they have currently used.
My Update statement works for one employee at a time, but I need to update around 200 records, out of a table containing about 6000.
I am struggling to work out how to modify the below to update more than one record, but only the 200 records I need to update.
UPDATE employeeaccounts
SET xml = To_clob(Updatexml(Xmltype(xml),
'/EmployeeAccount/CurrentAllowance/text()',187500,
'/EmployeeAccount/AllowanceBalance/text()',
750000 - (SELECT Extractvalue(Xmltype(xml),
'/EmployeeAccount/AllowanceBalance',
'xmlns:ts=\"http://schemas.com/\", xmlns:xt=\"http://schemas.com\"'
)
FROM employeeaccounts
WHERE id = '123456')))
WHERE id = '123456'
Example of the xml column (stored as a CLOB) that I want to update. The table has a column ID that holds the employee's ID as its primary key, e.g. 123456.
<EmployeeAccount>
<LastUpdated>2016-06-03T09:26:38+01:00</LastUpdated>
<MajorVersion>1</MajorVersion>
<MinorVersion>2</MinorVersion>
<EmployeeID>123456</EmployeeID>
<CurrencyID>GBP</CurrencyID>
<CurrentAllowance>187500</CurrentAllowance>
<AllowanceBalance>100000</AllowanceBalance>
<EarnedDiscount>0.0</EarnedDiscount>
<NormalDiscount>0.0</NormalDiscount>
<AccountCreditLimit>0</AccountCreditLimit>
<AccountBalance>0</AccountBalance>
</EmployeeAccount>
You don't need a subquery to get the old balance; you can use the value from the current row. That means you don't need to correlate a subquery and can just use an IN () list in the main statement:
UPDATE employeeaccounts
SET xml = To_clob(Updatexml(Xmltype(xml),
'/EmployeeAccount/CurrentAllowance/text()',187500,
'/EmployeeAccount/AllowanceBalance/text()',
750000 - Extractvalue(Xmltype(xml),
'/EmployeeAccount/AllowanceBalance',
'xmlns:ts=\"http://schemas.com/\", xmlns:xt=\"http://schemas.com\"')
))
WHERE id in (123456, 654321, ...);

PostgreSQL and pl/pgsql SYNTAX to update fields based on SELECT and FUNCTION (while loop, DISTINCT COUNT)

I have a large database in which I want to apply some logic to populate new fields.
The primary key is id for the table harvard_assignees
The LOGIC GOES LIKE THIS
Select all of the records based on id
For each record (WHILE), if (state is NOT NULL && country is NULL), update country_out = "US" ELSE update country_out=country
I see step 1 as a PostgreSQL query and step 2 as a function. Just trying to figure out the easiest way to implement natively with the exact syntax.
====
The second function is a little more interesting, requiring (I believe) DISTINCT:
Find all DISTINCT foreign_keys (a bivariate key of pat_type,patent)
Count Records that contain that value (e.g., n=3 records have fkey "D","388585")
Update those 3 records to identify percent as 1/n (e.g., UPDATE 3 records, set percent = 1/3)
For the first one:
UPDATE
harvard_assignees
SET
country_out = (CASE
WHEN (state is NOT NULL AND country is NULL) THEN 'US'
ELSE country
END);
At first it had condition "id = ..." but I removed that because I believe you actually want to update all records.
And for the second one:
UPDATE
example_table
SET
percent = (SELECT 1.0/cnt FROM (SELECT count(*) AS cnt FROM example_table AS x WHERE x.fn_key_1 = example_table.fn_key_1 AND x.fn_key_2 = example_table.fn_key_2) AS tmp WHERE cnt > 0)
That one will be kind of slow, though.
I'm thinking of a solution based on window functions; you may want to explore those too.
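A sketch of a faster variant that counts each group once and joins back, assuming the two key columns are named pat_type and patent as described in the question:
UPDATE example_table AS t
SET percent = 1.0 / g.cnt
FROM (
    SELECT pat_type, patent, count(*) AS cnt
    FROM example_table
    GROUP BY pat_type, patent
) AS g
WHERE t.pat_type = g.pat_type
  AND t.patent = g.patent;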