PostgreSQL jsonb column to row extracted values

I have the following table:
BEGIN;
CREATE TABLE IF NOT EXISTS "public".appevents (
id uuid DEFAULT uuid_generate_v4() NOT NULL,
"eventId" uuid NOT NULL,
name text NOT NULL,
"creationTime" timestamp without time zone NOT NULL,
"creationTimeInMilliseconds" bigint NOT NULL,
metadata jsonb NOT NULL,
PRIMARY KEY(id)
);
COMMIT;
I would like to extract the contents of the metadata jsonb column as a row, and tried the query below.
SELECT
userId
FROM
appevents, jsonb_to_record(appevents.metadata) as x(userId text)
Unfortunately, every row returned has NULL for userid, even though the stored JSON does contain a userId. The only weird thing noticed is that it is converting camelcase to lowercase but doesn't seem like the issue.
Here are the 2 records I currently have in the database where userId exists.

"The only weird thing noticed is that it is converting camelcase to lowercase but doesn't seem like the issue."
Actually, that is the culprit: unquoted identifiers are folded to lowercase, so userId is normalised to userid, and the JSON doesn't contain a property with that name. Quoting the identifier (… as x("userId" text)) should work.
However, there's a much simpler solution for accessing json object properties as text: the ->> operator. You can use
SELECT metadata->>'userId' AS userid FROM appevents
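As a sketch of both fixes side by side, assuming the appevents table from the question and a top-level userId key in metadata (the WHERE filter in the second query is an optional addition, not part of the original answer):

```sql
-- Option 1: quote the identifier so it keeps its case and matches the JSON key.
-- The quoted name must also be used when selecting the column.
SELECT x."userId"
FROM appevents
CROSS JOIN LATERAL jsonb_to_record(appevents.metadata) AS x("userId" text);

-- Option 2: read the key directly as text with the ->> operator.
SELECT metadata->>'userId' AS "userId"
FROM appevents
WHERE metadata ? 'userId';  -- optional: keep only rows where the key exists
```

The first form is mostly useful when several keys need to be expanded into columns at once; for a single key, the operator form is shorter.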

Related

PostgreSQL Not Recognizing NULL Values

A table in my database (PostgreSQL 9.6) has a mixture of NULL and not null values, which I need to COALESCE() as a part of the creation of another attribute during insert into a resulting dimension table. However, Postgres seems unable to recognize the NULL values as NULL.
SELECT DISTINCT name, description
FROM my_table
WHERE name IN('STUDIO', 'ONE BEDROOM')
AND description IS NOT NULL;
returns
name        | description
------------+------------
STUDIO      | NULL
ONE BEDROOM | NULL
Whereas
SELECT DISTINCT name, description
FROM my_table
WHERE name IN('STUDIO', 'ONE BEDROOM')
AND description IS NULL;
returns
name | description
-----+------------
As such, something like
SELECT DISTINCT name, COALESCE(description, 'N/A')
FROM my_table
WHERE name IN('STUDIO', 'ONE BEDROOM');
will return
name        | coalesce
------------+---------
STUDIO      | NULL
ONE BEDROOM | NULL
instead of the expected
name        | coalesce
------------+---------
STUDIO      | N/A
ONE BEDROOM | N/A
The DDL for these attributes is fairly straightforward:
...
name text COLLATE pg_catalog."default",
description text COLLATE pg_catalog."default",
...
I've already checked whether the attribute was filled with 'NULL' rather than being an actual NULL value, and that's not the case. I've also tried quoting the attribute in question as "description" and that hasn't made a difference. Casting to VARCHAR hasn't helped (I thought it might be the fact that it's a TEXT attribute). If I nullify some values in the other text column (name) I'm able to coalesce with a test value, so that one is seemingly behaving as expected leading me to think it's not a data type issue. This table exists in multiple databases on multiple servers and exhibits the same behavior in all of them.
I've tried inserting into a new table that has different attribute definitions:
...
floorplan_name "character varying(128)" COLLATE pg_catalog."default" NOT NULL DEFAULT 'Unknown'::character varying,
floorplan_desc "character varying(256)" COLLATE pg_catalog."default" NOT NULL DEFAULT 'Not Provided'::character varying,
...
resulting in
name        | coalesce
------------+---------
STUDIO      | NULL
ONE BEDROOM | NULL
So not only does the default value fail to populate, leaving the values NULL in an attribute that is defined as NOT NULL, but the example SELECT statements above all behave in exactly the same way when run against the new table.
Does anyone have any idea what might be causing this?
It turns out that the source database is writing empty strings instead of proper NULLs. Adding NULLIF(description, '') before trying to COALESCE() solves the problem.
Thanks to everyone!
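For reference, the fix described above might look like this against the my_table query from the question (a sketch; the 'N/A' placeholder is the one used earlier in the post):

```sql
-- NULLIF turns empty strings into real NULLs, which COALESCE can then replace.
SELECT DISTINCT
       name,
       COALESCE(NULLIF(description, ''), 'N/A') AS description
FROM my_table
WHERE name IN ('STUDIO', 'ONE BEDROOM');
```

If the source might also write whitespace-only strings (an assumption, not something reported here), wrapping the column as NULLIF(btrim(description), '') would cover that case too.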

Generated UUIDs behavior in postgres INSERT rule compared to the UPDATE rule

I have a postgres database with a single table. The primary key of this table is a generated UUID. I am trying to add a logging table to this database such that whenever a row is added or deleted, the logging table gets an entry. My table has the following structure
CREATE TABLE configuration (
id uuid NOT NULL DEFAULT uuid_generate_v4(),
name text,
data json
);
My logging table has the following structure
CREATE TABLE configuration_log (
configuration_id uuid,
new_configuration_data json,
old_configuration_data json,
"user" text,
time timestamp
);
I have added the following rules:
CREATE OR REPLACE RULE log_configuration_insert AS ON INSERT TO "configuration"
DO INSERT INTO configuration_log VALUES (
NEW.id,
NEW.data,
'{}',
current_user,
current_timestamp
);
CREATE OR REPLACE RULE log_configuration_update AS ON UPDATE TO "configuration"
WHERE NEW.data::json::text != OLD.data::json::text
DO INSERT INTO configuration_log VALUES (
NEW.id,
NEW.data,
OLD.data,
current_user,
current_timestamp
);
Now, if I insert a value in the configuration table, the UUID in the configuration table and the configuration_log table are different. For example, the insert query
INSERT INTO configuration (name, data)
VALUES ('test', '{"property1":"value1"}')
After this insert, the UUID in the configuration table is c2b6ca9b-1771-404d-baae-ae2ec69785ac, whereas the UUID in the configuration_log table is 16109caa-dddc-4959-8054-0b9df6417406.
However, the update rule works as expected. So if I write an update query as
UPDATE "configuration"
SET "data" = '{"property1":"abcd"}'
WHERE "id" = 'c2b6ca9b-1771-404d-baae-ae2ec69785ac';
The configuration_log table gets the correct UUID, i.e. c2b6ca9b-1771-404d-baae-ae2ec69785ac.
I am using NEW.id in both the rules so I was expecting the same behavior. Can anyone point out what I might be doing wrong here?
Thanks
This is another good example of why rules should be avoided.
Quote from the manual:
For any reference to NEW, the target list of the original query is searched for a corresponding entry. If found, that entry's expression replaces the reference.
So NEW.id is replaced with uuid_generate_v4() which explains why you are seeing a different value.
You should rewrite this to a trigger.
Btw: using jsonb is preferred over json, then you can also get rid of the (essentially incorrect) cast of the json column to text to compare the content.
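A trigger-based rewrite could look roughly like the sketch below. It assumes the configuration and configuration_log tables from the question; the function name log_configuration_change and the trigger name configuration_log_trg are made up for illustration. Because a row-level AFTER trigger sees the row as it was actually stored, NEW.id is the generated UUID rather than a second call to uuid_generate_v4().

```sql
-- Remove the rules first so entries are not logged twice.
DROP RULE IF EXISTS log_configuration_insert ON configuration;
DROP RULE IF EXISTS log_configuration_update ON configuration;

CREATE OR REPLACE FUNCTION log_configuration_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO configuration_log
        VALUES (NEW.id, NEW.data, '{}', current_user, current_timestamp);
    -- Only INSERT and UPDATE fire this trigger, so this branch is the UPDATE case.
    -- The text cast is kept because the column is json; with jsonb, the values
    -- could be compared directly.
    ELSIF NEW.data::text IS DISTINCT FROM OLD.data::text THEN
        INSERT INTO configuration_log
        VALUES (NEW.id, NEW.data, OLD.data, current_user, current_timestamp);
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER configuration_log_trg
AFTER INSERT OR UPDATE ON configuration
FOR EACH ROW EXECUTE PROCEDURE log_configuration_change();
```

On PostgreSQL 11 and later, EXECUTE FUNCTION is accepted as an alternative spelling of EXECUTE PROCEDURE.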

References to multiple tables in PostgreSQL

I have many time series stored in a PostgreSQL database across multiple tables. I would like to create a table 'anomalies' that references time series with particular behaviour, for instance a value that is exceptionally high.
My question is the following: what is the best way to link the entries of 'anomalies' with other tables?
I could create a foreign key in each table referencing an entry in anomalies, but then it would not be obvious how to go from the anomaly to the entry that references it.
The other possibility I see is to store the name of the corresponding table in the entries of anomalies, but it does not seem like a good idea, as the table name might change, or the table might get deleted.
Is there a more elegant solution to do this?
CREATE TABLE type_1(
type_1_id SERIAL PRIMARY KEY,
type_1_name TEXT NOT NULL,
unique(type_1_name)
);
CREATE TABLE type_1_ts(
date DATE NOT NULL,
value REAL NOT NULL,
type_1_id INTEGER REFERENCES type_1(type_1_id) NOT NULL,
PRIMARY KEY(type_1_id, date)
);
CREATE TABLE type_2(
type_2_id SERIAL PRIMARY KEY,
type_2_name TEXT NOT NULL,
unique(type_2_name)
);
CREATE TABLE type_2_ts(
date DATE NOT NULL,
value REAL NOT NULL,
state INTEGER NOT NULL,
type_2_id INTEGER REFERENCES type_2(type_2_id) NOT NULL,
PRIMARY KEY(type_2_id, date)
);
CREATE TABLE anomalies(
anomaly_id SERIAL PRIMARY KEY,
date DATE NOT NULL,
property TEXT NOT NULL,
value REAL NOT NULL,
-- reference to a table_name and an entry id?
table_name TEXT,
data_id INTEGER
);
In the end, I'd like to be able to run:
SELECT * FROM anomalies WHERE table_name = 'type_1';
or simply list the data type corresponding to each entry.
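For illustration only, the first option mentioned above (a foreign key in each time-series table pointing at anomalies) could be sketched as follows. The anomaly_id columns, the anomaly_entries view, and the source_table label are hypothetical names, and the sketch assumes anomalies has anomaly_id as its primary key. A view is one way to make the step from an anomaly back to its entries less awkward.

```sql
-- Hypothetical nullable link from each time-series row to an anomaly.
ALTER TABLE type_1_ts ADD COLUMN anomaly_id INTEGER REFERENCES anomalies(anomaly_id);
ALTER TABLE type_2_ts ADD COLUMN anomaly_id INTEGER REFERENCES anomalies(anomaly_id);

-- One place to look up which entries belong to which anomaly.
CREATE VIEW anomaly_entries AS
SELECT anomaly_id, 'type_1' AS source_table, type_1_id AS data_id, date, value
FROM type_1_ts
WHERE anomaly_id IS NOT NULL
UNION ALL
SELECT anomaly_id, 'type_2' AS source_table, type_2_id AS data_id, date, value
FROM type_2_ts
WHERE anomaly_id IS NOT NULL;

-- All anomalous entries that come from type_1 series.
SELECT * FROM anomaly_entries WHERE source_table = 'type_1';
```

The label in the view is a plain string literal, so renaming a table means updating the view definition rather than stored data.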

How do I represent an array of tuples in postgresql?

Here's the easiest way I can think of to explain this. Imagine a user wants to bookmark a bunch of webpages. There's a url table with a UrlID and the actual url. I'd like each user to have a list of UrlIDs, which are unique (but I don't need the constraint), each paired with a 32-bit int value such as an epoch date. The only two things I care about are 1) being able to check whether a UrlID is in this list and 2) getting the entire list sorted by date (the second value).
If it helps, I'm expecting no more than 8K bookmarks, but most likely it will be <128.
If you really want to avoid the extra table to express the relationship, you can do something like this:
CREATE TABLE "user" (
id integer primary key,
name text not null,
bookmarks integer[] not null
);
CREATE TABLE url (
id integer primary key,
time timestamp with time zone not null,
val text not null
);
Then finding all bookmarks for a particular user (say with id 66) would involve something like this:
SELECT url.val, url.time
FROM (SELECT bookmarks FROM "user" WHERE id=66) u
JOIN url ON url.id=ANY(u.bookmarks)
ORDER BY url.time;
Now here's why I don't like this schema. First, adding a new bookmark requires rewriting the bookmarks array and hence the entire user row (so adding n bookmarks, one after the other, would take Θ(n^2) time). Secondly, you cannot use foreign keys on the elements of the array. Thirdly, many queries become more complicated to write, e.g. in order to retrieve all bookmarks for all users, you have to do something like this:
SELECT "user".id,"user".name,url.val,url.time
FROM "user",
LATERAL unnest((SELECT bookmarks)) b
LEFT JOIN url ON b = url.id;
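With the array-based schema, the membership check from the question and the "add a bookmark" step might look like the sketch below. User id 66 is the example used earlier; URL id 42 is an arbitrary value chosen for illustration.

```sql
-- Is URL 42 among the bookmarks of user 66?
SELECT 42 = ANY(bookmarks) AS is_bookmarked
FROM "user"
WHERE id = 66;

-- Add URL 42 for user 66 unless it is already there.
-- This writes a new version of the whole row, which is the cost described above.
UPDATE "user"
SET bookmarks = array_append(bookmarks, 42)
WHERE id = 66
  AND NOT (42 = ANY(bookmarks));
```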
Edit: So here's the schema I would use and which I think fits best with the relational paradigm
CREATE TABLE "user" (
id integer primary key,
name text not null
);
CREATE TABLE url (
id integer primary key,
val text not null
);
CREATE TABLE bookmark (
user_id integer not null REFERENCES "user",
url_id integer REFERENCES url,
time timestamp with time zone not null,
UNIQUE (user_id,url_id)
);
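Against this three-table schema, the two operations from the question could be written roughly as follows (a sketch; user id 66 is the example used earlier and URL id 42 is an arbitrary illustration value):

```sql
-- 1) Has user 66 bookmarked URL 42?
SELECT EXISTS (
    SELECT 1
    FROM bookmark
    WHERE user_id = 66 AND url_id = 42
) AS is_bookmarked;

-- 2) All bookmarks of user 66, oldest first.
SELECT url.val, bookmark.time
FROM bookmark
JOIN url ON url.id = bookmark.url_id
WHERE bookmark.user_id = 66
ORDER BY bookmark.time;
```

The UNIQUE (user_id, url_id) constraint is backed by an index whose leading column is user_id, so both lookups can use it without any additional index.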

Implicit Index for table

I am learning PostgreSQL and databases in general. I have a simple statement like this and I want to understand what it does:
CREATE TABLE adempiere.c_mom(
c_mom_id NUMERIC(10,0) NOT NULL,
isactive character(1) DEFAULT 'Y'::bpchar NOT NULL,
start_date date NOT NULL,
start_time timestamp without time zone NOT NULL,
end_time timestamp without time zone NOT NULL,
CONSTRAINT c_mom_pkey PRIMARY KEY (c_mom_id)
);
After I execute this, I get
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "c_mom_pkey" for table "c_mom"
Now I know that my PK is c_mom_id, but what is the purpose of creating an implicit index under the name c_mom_pkey?
What does DEFAULT 'Y'::bpchar mean, or in general, what does :: do in psql?
Thank you
The :: notation is a PostgreSQL-specific type cast notation, in this case to type bpchar (blank-padded char).
An index is created to back primary keys to make them efficient. If there wasn't an index to back it, each insert statement would have to scan the whole table just to figure out if that insertion would create a duplicate key or not. Using an index speeds that up (dramatically if the table is large).
This is not PostgreSQL specific. A lot of relational databases will create unique indexes to back primary keys.
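Both points can be checked directly. The sketch below assumes the adempiere.c_mom table from the question has been created; pg_indexes is the standard PostgreSQL system view that lists indexes.

```sql
-- The :: operator is shorthand for a standard CAST expression.
SELECT 'Y'::bpchar AS with_operator,
       CAST('Y' AS bpchar) AS with_cast;

-- List the indexes on the table; the one backing the primary key
-- should appear under the constraint's name, c_mom_pkey.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'adempiere'
  AND tablename = 'c_mom';
```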