How to update some columns of multiple rows in PostgreSQL

I have a table called Configuration. The first column of the table is deviceId, the second column is parameter, and the third column is value.
The table holds many devices. Each device has exactly one device ID (column 1) and many config parameters (column 2), e.g. VER, DATE, COMPANY; all devices share the same set of parameters. The parameter values (column 3) may or may not be the same across devices.
I want to update two config parameters, say VER and DATE, with new values (column 3) for the device whose ID is "id12345".
How can I achieve this in one PostgreSQL query?

This is what you could do, although I don't think it's a terrific idea.
UPDATE configuration
SET value =
    CASE parameter
        WHEN 'VER' THEN new_ver_value
        WHEN 'DATE' THEN new_date_value
    END
WHERE deviceid = 'id12345'
AND parameter IN ('VER', 'DATE');
Note the filter on parameter: without it, the CASE expression returns NULL for every other parameter of that device and wipes out its value.
Parameter tables like this are generally considered a bad idea, as the complexity of this query helps illustrate. And since you say that all devices have the same parameters, using this layout instead of a dedicated column for each parameter doesn't seem to achieve anything useful.
Also, if possible, use a number for deviceid instead of a string.
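For illustration only, here is a minimal sketch of that one-column-per-parameter layout (the column names are hypothetical); the same change becomes a plain single-row update:
CREATE TABLE configuration
(deviceid integer PRIMARY KEY,
 ver integer,
 date integer,
 company integer);

UPDATE configuration
SET ver = 11,
    date = 20200101
WHERE deviceid = 12345;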
As requested, to update an additional field, such as a time field set to current_time, you could do the following.
UPDATE configuration
SET value =
    CASE parameter
        WHEN 'VER' THEN new_ver_value
        WHEN 'DATE' THEN new_date_value
    END,
    time = current_time
WHERE deviceid = 'id12345'
AND parameter IN ('VER', 'DATE');
For the first query, here is a test to show the result.
CREATE TABLE configuration
(deviceid character varying (20),
 parameter character varying (10),
 value integer);

INSERT INTO configuration
VALUES ('id12345', 'VER', 1),
       ('id12345', 'DATE', 20190101),
       ('id12345', 'COMPANY', 55),
       ('id33333', 'VER', 2),
       ('id33333', 'DATE', 20180101),
       ('id33333', 'COMPANY', 6);
SELECT * FROM configuration;
id12345  VER      1
id12345  DATE     20190101
id12345  COMPANY  55
id33333  VER      2
id33333  DATE     20180101
id33333  COMPANY  6
UPDATE configuration
SET value =
    CASE parameter
        WHEN 'VER' THEN 11
        WHEN 'DATE' THEN 2020010
    END
WHERE deviceid = 'id12345'
AND parameter IN ('VER', 'DATE');
SELECT * FROM configuration;
id12345  COMPANY  55
id33333  VER      2
id33333  DATE     20180101
id33333  COMPANY  6
id12345  VER      11
id12345  DATE     2020010

Create rows from part of column names

Source data
I am working on an ELT project to load data from CSV files into PostgreSQL, where I will transform it. The CSV files have many columns that are consistent across files, but they also contain activity columns whose names vary, like Date (05/19/2020), Type (05/19/2020), etc.
In the loading script I am merging all of the columns with dates in the column name into one jsonb column so I don't have to constantly add new columns to the raw data table.
The resulting jsonb column in the raw data table looks like this:
id        activity
12345678  {"Date (05/19/2020)": null, "Type (05/19/2020)": null, "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}
98765432  {"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B", "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"}
JSON to columns
Using the amazing create_jsonb_flat_view function from this post I can convert the jsonb to columns like this:
id        Date (05/19/2020)  Type (05/19/2020)  Date (06/03/2020)  Type (06/03/2020)  Date (10/23/2020)  Type (10/23/2020)
12345678  null               null               06/01/2020         E                  null               null
98765432  05/18/2020         B                  null               null               10/26/2020         T
Need to move part of column name to row
Now, this is where I'm stuck. I need to remove the portion of the column name that is the Activity Date (e.g. (05/19/2020)) and create a row for each id and ActivityDate with additional columns for Date and Type like this:
id        ActivityDate  Date        Type
12345678  05/19/2020    null        null
12345678  06/03/2020    06/01/2020  E
98765432  05/19/2020    05/18/2020  B
98765432  10/23/2020    10/26/2020  T
I followed your link to the create_jsonb_flat_view article yesterday and then forgot this question. While I thank you for pointing me there, I think that mentioning it worked against you.
A more conventional approach using regexp_replace() works here. I left the date values as strings, but you can convert them with to_date() if needed:
with parse as (
    select id, e.k, e.v,
           regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
           regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
    from rawinput
    cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
db<>fiddle here
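If you do want real dates rather than strings, here is a minimal variation of the final select (assuming the MM/DD/YYYY format shown in the sample data):
select id,
       to_date(k_date_only, 'MM/DD/YYYY') as activity_date,
       to_date(min(v) filter (where k_no_date = 'Date'), 'MM/DD/YYYY') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;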
@Mike-Organek's answer works beautifully!
However, I was curious whether the regexp_replace() calls might be slowing the query down, and it seemed I could get the same results with a simpler function.
Since Mike gave me a great example to start with, I modified it to split on the space between Date and (05/19/2020).
For 20,000 rows, the query went from an average of 7 seconds on my local machine to an average of 0.9 seconds.
Here is the resulting query:
with parse as (
    select id, e.k, e.v,
           split_part(e.k, ' ', 1) as k_no_date,
           trim(split_part(e.k, ' ', 2), '()') as k_date_only
    from rawinput
    cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;

Firebird list domains and data types

I want to list all domains, their datatypes, and size.
Background
I've managed to do the query, based on this SO answer.
The basic query selects everything:
SELECT *
FROM rdb$fields
I found that I could get fields from rdb$fields:
filter fields from this request by RDB$FIELD_NAME
get field type code from RDB$FIELD_TYPE
get field length from RDB$FIELD_LENGTH
Reference:
https://firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref-appx04-fields.html
Question
How to combine all this to list all domains, their datatypes, and size?
I want to get only domains created by users, not automatic ones.
The code:
select
    t.RDB$FIELD_NAME Name,
    case t.RDB$FIELD_TYPE
        when 7 then 'SMALLINT'
        when 8 then 'INTEGER'
        when 10 then 'FLOAT'
        when 12 then 'DATE'
        when 13 then 'TIME'
        when 14 then 'CHAR'
        when 16 then 'BIGINT'
        when 27 then 'DOUBLE PRECISION'
        when 35 then 'TIMESTAMP'
        when 37 then 'VARCHAR'
        when 261 then 'BLOB'
    end Type_Name,
    t.RDB$CHARACTER_LENGTH Chr_Length
from RDB$FIELDS t
where coalesce(t.rdb$system_flag, 0) = 0
and not (t.rdb$field_name starting with 'RDB$')
Also interesting: I could not find a system table with the datatype names, so I had to hardcode them from the reference.
Thanks for the help in the comments:
@MarkRotteveel
RDB$TYPES contains the type names, but names them differently:
You can find all data types in RDB$TYPES for RDB$FIELD_NAME = 'RDB$FIELD_TYPE' (although you will need to map some types, as it lists SMALLINT as SHORT, INTEGER as LONG, BIGINT as INT64 and VARCHAR as VARYING)
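If you would rather not hardcode the names, here is an untested sketch joining RDB$TYPES instead (accepting the internal names above):
select
    f.RDB$FIELD_NAME as Name,
    t.RDB$TYPE_NAME as Type_Name,
    f.RDB$CHARACTER_LENGTH as Chr_Length
from RDB$FIELDS f
join RDB$TYPES t
    on t.RDB$TYPE = f.RDB$FIELD_TYPE
    and t.RDB$FIELD_NAME = 'RDB$FIELD_TYPE'
where coalesce(f.RDB$SYSTEM_FLAG, 0) = 0
and not (f.RDB$FIELD_NAME starting with 'RDB$');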
Need to use field RDB$CHARACTER_LENGTH instead of RDB$FIELD_LENGTH.
Note that RDB$FIELD_LENGTH is the wrong column for char/varchar
columns as it is the length in bytes (which depends on the character
set), you need to use RDB$CHARACTER_LENGTH for the length in
characters, and for numerical fields, you'll more likely need
RDB$FIELD_PRECISION (+ RDB$FIELD_SCALE), you are also ignoring sub
type information.
I needed the length of varchars only, and it appears that RDB$FIELD_LENGTH = RDB$CHARACTER_LENGTH for single-byte character sets (1 byte = 1 char).
That holds if you use a 1-byte character set [1 byte = 1 char], but UTF-8, for example, is (max) 4 bytes per character, so then field_length = 4 x character_length
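For illustration, a hypothetical UTF-8 domain (the name is made up) shows that 4x relationship:
create domain D_UTF8_NAME as varchar(10) character set utf8;
-- in RDB$FIELDS: RDB$CHARACTER_LENGTH = 10 (characters),
-- RDB$FIELD_LENGTH = 40 (bytes: up to 4 per character in UTF8)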
@Arioch
The most reliable way to get user domains:
To an extent one may use select * from rdb$fields where coalesce(rdb$system_flag, 0) = 0 and not (rdb$field_name starting with 'RDB$'), however nothing prohibits a user from manually/explicitly creating a column named "RDB$1234567".

Unique date field PostgreSQL default value

I have a date column which I want to be unique once populated, but want the date field to be ignored if it is not populated.
In MySQL the way this is accomplished is to set the date column to "not null" and give it a default value of '0000-00-00' - this allows all other fields in the unique index to be "checked" even if the date column is not populated yet.
This does not work in PostgreSQL because '0000-00-00' is not a valid date, so you cannot store it in a date field (this makes sense to me).
At first glance, leaving the field nullable seemed like an option, but this creates a problem:
=> create table uniq_test(NUMBER bigint not null, date DATE, UNIQUE(number, date));
CREATE TABLE
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> select * from uniq_test;
number | date
--------+------
1 |
1 |
1 |
1 |
(4 rows)
NULL apparently "isn't equal to itself" and so it does not count towards constraints.
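That is standard SQL: NULL compared with anything, even itself, yields NULL rather than true. You can see it directly in psql (the blank cell is the NULL):
=> select (null = null) as null_eq_null;
 null_eq_null
--------------

(1 row)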
If I add an additional unique constraint on just the number field, it checks only number and not date, so I cannot have two rows with the same number but different dates.
I could pick a default date that is a valid date but outside the working scope, and could in fact get away with that for the current project, but there are cases I may encounter in the next few years where it will not be evident that the date is a placeholder just because it is "a long time ago" or "in the future."
The advantage the '0000-00-00' mechanic had for me was precisely that this date isn't real and therefore indicated a non-populated entry (where 'non-populated' was a valid uniqueness attribute). When I look around for solutions to this on the internet, most of what I find is "just use NULL" and "storing zeros is stupid."
TL;DR
Is there a PostgreSQL best practice for needing to include "not populated" as a possible value in a unique constraint including a date field?
Not clear what you want. This is my guess:
create table uniq_test (number bigint not null, date date);
create unique index i1 on uniq_test (number, date)
where date is not null;
create unique index i2 on uniq_test (number)
where date is null;
There will be one unique constraint for non-null dates and another for null dates, effectively turning the (number, date) tuples into distinct values.
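As a quick sanity check, an untested sketch of how inserts behave against those two partial indexes:
insert into uniq_test (number) values (1);                      -- ok
insert into uniq_test (number) values (1);                      -- fails: duplicate key in i2
insert into uniq_test (number, date) values (1, '2020-01-01');  -- ok
insert into uniq_test (number, date) values (1, '2020-01-01');  -- fails: duplicate key in i1
insert into uniq_test (number, date) values (1, '2020-01-02');  -- ok, different date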
See the documentation on partial indexes.
It's not a best practice, but you can do it such way:
t=# create table so35(i int, d date);
CREATE TABLE
t=# create unique index i35 on so35(i, coalesce(d,'-infinity'));
CREATE INDEX
t=# insert into so35 (i) select 1;
INSERT 0 1
t=# insert into so35 (i) select 2;
INSERT 0 1
t=# insert into so35 (i) select 2;
ERROR: duplicate key value violates unique constraint "i35"
DETAIL: Key (i, (COALESCE(d, '-infinity'::date)))=(2, -infinity) already exists.
STATEMENT: insert into so35 (i) select 2;

Access version 2000 & 2013 SQL pull latest date, MAX doesn't work

I have a table that needs to pull the latest date from different categories, and the date might not always be filled out. I have tried MAX, MIN, etc.; it has not worked.
e.g.
ID    1st Game Date  2nd Game Date  3rd Game Date
Joe   6/1/16         missing        missing
Anna  missing        7/2/16         7/6/16
Rita  missing        7/31/16        missing
Needs to return:
ID    Date
Joe   6/1/16
Anna  7/6/16
Rita  7/31/16
I do have this SQL that works well, but it requires that all the dates be filled in; otherwise it doesn't return the latest date:
ApptDate: Switch(
    [Pt1stApptDate]>=[2ndApptDate] And [Pt1stApptDate]>=[3rdApptDate], [Pt1stApptDate],
    [2ndApptDate]>=[Pt1stApptDate] And [2ndApptDate]>=[3rdApptDate], [2ndApptDate],
    [3rdApptDate]>=[Pt1stApptDate] And [3rdApptDate]>=[2ndApptDate], [3rdApptDate])
Much appreciation in advance for all your help
Use the Nz function:
ApptDate: Switch(
    Nz([Pt1stApptDate],0)>=Nz([2ndApptDate],0) And Nz([Pt1stApptDate],0)>=Nz([3rdApptDate],0), Nz([Pt1stApptDate],0),
    Nz([2ndApptDate],0)>=Nz([Pt1stApptDate],0) And Nz([2ndApptDate],0)>=Nz([3rdApptDate],0), Nz([2ndApptDate],0),
    Nz([3rdApptDate],0)>=Nz([Pt1stApptDate],0) And Nz([3rdApptDate],0)>=Nz([2ndApptDate],0), Nz([3rdApptDate],0))
Having said that, your table design is incorrect.
You should be storing each ApptDate per ID in a separate row:
ApptID  ID    ApptDate   ApptNr
1       Joe   6/1/2016   1
2       Anna  7/2/2016   2
3       Anna  7/6/2016   3
4       Rita  7/31/2016  2
where ApptID is an autonumber and ApptNr is a sequence per ID (what you seem to call a category).
When you are having problems writing what should be simple queries (SQL DML), you should consider that you may have design flaws (in your SQL DDL).
The missing values are causing you to avoid the MAX set function and compel you to handle nulls in queries (note the Nz() function will cause errors outside of the Access UI). Better to model missing data by simply not adding a row to the table. Think about it: you want the smallest amount of data possible in your database and can infer the remainder, e.g. if Joe was not gaming on 1 Jan, 2 Jan, 3 Jan, 4 Jan, etc., then simply don't add anything to your database for those dates.
The following SQL DDL requires ANSI-92 Query Mode (but you can create the same tables/views using the Access GUI tools):
CREATE TABLE Attendance
( gamer_name VARCHAR( 35 ) NOT NULL REFERENCES Gamers ( gamer_name ),
  game_sequence INTEGER NOT NULL CHECK ( game_sequence BETWEEN 1 AND 3 ),
  game_date DATETIME NOT NULL,
  UNIQUE ( game_date, game_sequence ) );
INSERT INTO Attendance VALUES ( 'Joe', 1, '2016-06-01' );
INSERT INTO Attendance VALUES ( 'Anna', 2, '2016-07-02' );
INSERT INTO Attendance VALUES ( 'Anna', 3, '2016-07-06' );
INSERT INTO Attendance VALUES ( 'Rita', 1, '2016-07-31' );
CREATE VIEW MostRecentAttendance
AS
SELECT gamer_name, MAX ( game_date ) AS game_date
FROM Attendance
GROUP BY gamer_name;
SELECT *
FROM Attendance a
WHERE EXISTS ( SELECT *
               FROM MostRecentAttendance r
               WHERE r.gamer_name = a.gamer_name
               AND r.game_date = a.game_date );
To find the missing sequence values for players, create a table of all possible sequence numbers { 1, 2, 3 } to which you can 'anti-join' (e.g. NOT EXISTS).
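A rough sketch of that anti-join (SequenceNumbers is a hypothetical helper table, and a Gamers table with one row per gamer_name is assumed):
CREATE TABLE SequenceNumbers
( game_sequence INTEGER NOT NULL PRIMARY KEY );
INSERT INTO SequenceNumbers VALUES ( 1 );
INSERT INTO SequenceNumbers VALUES ( 2 );
INSERT INTO SequenceNumbers VALUES ( 3 );

SELECT g.gamer_name, s.game_sequence
FROM Gamers g, SequenceNumbers s
WHERE NOT EXISTS ( SELECT *
                   FROM Attendance a
                   WHERE a.gamer_name = g.gamer_name
                   AND a.game_sequence = s.game_sequence );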

Why does usage of lower() change the order of the result set?

I have a table where I store information about users. The table has the following structure:
CREATE TABLE PERSONS
(
    ID NUMBER(20, 0) NOT NULL,
    FIRSTNAME VARCHAR2(40),
    LASTNAME VARCHAR2(40),
    BIRTHDAY DATE,
    CONSTRAINT PERSONEN_PK PRIMARY KEY (ID) ENABLE
);
After inserting some test data:
SET DEFINE OFF;
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('1','Max','Mustermann',to_date('31.10.89','DD.MM.RR'));
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('2','Max','Mustermann',to_date('31.10.89','DD.MM.RR'));
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('3','Carl','Carlchen',to_date('01.01.12','DD.MM.RR'));
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('4','Max','Mustermann',to_date('31.10.89','DD.MM.RR'));
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('5','Max','Mustermann',to_date('31.10.89','DD.MM.RR'));
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('6','Carl','Carlchen',to_date('01.01.12','DD.MM.RR'));
I want to select all duplicates of a given user. Let's use "Max Mustermann" for example:
SELECT p.id,p.firstname,p.lastname,p.birthday
FROM persons p
WHERE p.firstname = 'Max'
AND p.lastname = 'Mustermann'
AND p.birthday = to_date('31.10.1989','dd.mm.yyyy')
ORDER BY p.firstname,p.lastname;
This gives me a result like this:
id first last birthday
=================================
1 Max Mustermann 31.10.89
2 Max Mustermann 31.10.89
4 Max Mustermann 31.10.89
5 Max Mustermann 31.10.89
I want to do a case insensitive compare, so I change the query using lower (and trim) like this:
SELECT p.id,p.firstname,p.lastname,p.birthday
FROM persons p
WHERE lower(trim(p.firstname)) = lower(trim('mAx '))
AND lower(trim(p.lastname)) = lower(trim(' musteRmann '))
AND p.birthday = to_date('31.10.1989','dd.mm.yyyy')
ORDER BY p.lastname,p.firstname;
Now, surprise: the order has changed!
id first last birthday
=================================
1 Max Mustermann 31.10.89
5 Max Mustermann 31.10.89
4 Max Mustermann 31.10.89
2 Max Mustermann 31.10.89
Why does the order change just by using lower() (same result when used without trim())? I can get a stable ordering by adding the id column to the ORDER BY, but shouldn't lower() have no effect on the ordering?
Workaround, also using the id column in the ORDER BY:
SELECT p.id,p.firstname,p.lastname,p.birthday
FROM persons p
WHERE p.firstname = 'Max'
AND p.lastname = 'Mustermann'
AND p.birthday = to_date('31.10.1989','dd.mm.yyyy')
ORDER BY p.firstname,p.lastname,p.id;
SELECT p.id,p.firstname,p.lastname,p.birthday
FROM persons p
WHERE lower(trim(p.firstname)) = lower(trim('mAx '))
AND lower(trim(p.lastname)) = lower(trim(' musteRmann '))
AND p.birthday = to_date('31.10.1989','dd.mm.yyyy')
ORDER BY p.lastname,p.firstname,p.id;
If the values to be ordered by are identical, then the DBMS is free to choose any order it feels correct (the same way it is free to choose any order if no order by is specified altogether).
Because all values of the columns in the order by are identical, the resulting order is not stable. The only way to get a stable order is to include a unique column as an additional order criterion for ties, which is exactly what you did when you added the id column.
Why does the order change, just by using lower()
From a technical point of view, I'd guess that applying lower() changed the execution plan and therefore the access path to the data.
But again (just to make sure): ordering on identical values never guarantees a stable order!
There is no ordering without an order by clause. Sometimes it looks like there might be (group by fooled a lot of people in older releases), but it's only coincidental and must not be relied upon. In your case you're ordering by some columns, but you expect duplicates within that ordering to be further ordered implicitly, which won't happen, or at least cannot be relied on.
In this case Oracle probably happens to be retrieving the rows for your first query in the order you inserted them purely as a side effect of how it's reading data from the blocks, and the order by sorts them within that set without actually changing them (or quite likely it's skipping the order by step internally if it realises it's pointless; the explain plan would tell you that).
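If you want to check that yourself, a minimal sketch using DBMS_XPLAN (the exact plan you see will depend on your version, statistics and data):
EXPLAIN PLAN FOR
SELECT p.id, p.firstname, p.lastname, p.birthday
FROM persons p
WHERE p.firstname = 'Max'
AND p.lastname = 'Mustermann'
AND p.birthday = to_date('31.10.1989','dd.mm.yyyy')
ORDER BY p.firstname, p.lastname;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);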
If you change the order in which the records are created:
...
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('5','Max','Mustermann',to_date('31.10.89','DD.MM.RR'));
Insert into PERSONS (ID,FIRSTNAME,LASTNAME,BIRTHDAY) values ('4','Max','Mustermann',to_date('31.10.89','DD.MM.RR'));
...
then the result 'order' changes too:
SELECT p.id,p.firstname,p.lastname,p.birthday
FROM persons p
WHERE p.firstname = 'Max'
AND p.lastname = 'Mustermann'
AND p.birthday = to_date('31.10.1989','dd.mm.yyyy')
ORDER BY p.firstname,p.lastname;
ID FIRSTNAME LASTNAME BIRTHDAY
---------- -------------------- -------------------- ---------
1 Max Mustermann 31-OCT-89
2 Max Mustermann 31-OCT-89
5 Max Mustermann 31-OCT-89
4 Max Mustermann 31-OCT-89
Once you add the function, things change enough for that happy accident to go out of the window, even if the records are inserted in id order (which has no relevance to the DB internally). lower() isn't changing the ordering; you just aren't getting lucky any more.
You cannot expect or rely on an order unless you fully specify it in the order by clause.