Column is ambiguous in multiple joins - PostgreSQL

If I remove the second join, the query works, but otherwise it fails with:
ERROR: column reference "category" is ambiguous
LINE 1: ...tion", "prefix_product"."full_desc" as "description", "category"...
^
SELECT "prefix_product"."id" as "product_id",
"prefix_product"."title" as "name",
"prefix_product"."short_desc" as "briefDescription",
"prefix_product"."full_desc" as "description",
"category" as "productCategory"
FROM "prefix_product"
JOIN "prefix_category"
ON "prefix_product"."category"="prefix_category"."id"
JOIN "prefix_category_attribs"
ON "prefix_product"."category"="prefix_category"."parent"
WHERE "vendor" = '8'
I am using CodeIgniter 3 with PostgreSQL, and in CodeIgniter I have:
$this->db->select(['prefix_product.id as product_id', 'prefix_product.title as name', 'prefix_product.short_desc as briefDescription', 'prefix_product.full_desc as description','category as productCategory']);
$this->db->where('vendor',$vendorId);
$this->db->from($this->tblName);
$this->db->join('prefix_category','prefix_product.category=prefix_category.id');
$this->db->join('prefix_category_attribs','prefix_product.category=prefix_category.parent');
$queryResult = $this->db->get()->result();
thanks

It seems that more than one of the joined tables has a column named category, so you need to qualify category in the SELECT clause to specify explicitly which table the column belongs to:
"prefix_product"."category" as "productCategory"

Related

postgresql groupby count where date is between

So I have a table where I want the count of rows where the customer is McDonald's and the Date is after 2021-06-30.
I am trying:
select "Customer",
Count("Customer")
FROM
public.master_environmental_data
WHERE "Customer" = 'McDonald''s' AND "Date" > '2021-06-30';
However I am getting this error:
column "master_environmental_data.Customer" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select "Customer",
^
SQL state: 42803
Character: 8
What is the correct query?
You should add a GROUP BY at the end of the query:
select "Customer",
Count("Customer")
FROM
public.master_environmental_data
WHERE "Customer" = 'McDonald''s' AND "Date" > '2021-06-30'
GROUP BY "Customer";

Postgres CTE exponentially saving time?

I would love a clear explanation of the following. I would have thought PG would optimize the first query to be just as fast as the second one, which uses a CTE, since it is basically using a simple index to filter and join on two columns. Everything in the joins and filtering, except "l"."type", has an index. This is on PG 10.
The query below takes 20+ minutes.
SELECT
transactions.id::text AS id,
transactions.amount,
transactions.currency::text AS currency,
transactions.external_id::text AS external_id,
transactions.check_sender_balance,
transactions.created,
transactions.type::text AS type,
transactions.sequence,
transactions.legacy_id::text AS legacy_id,
transactions.reference_transaction::text AS reference_transaction,
a.user_id as user_id
FROM transactions
JOIN lines l ON transactions.id = l.transaction
JOIN accounts a ON l.account = a.id
WHERE l.type='DEBIT'
AND "sequence" > 357550718
AND user_id IN ('5bf4ceb45d27fd2985a000000')
But the following, which I suppose explicitly optimizes the accounts lookup via a CTE, finishes in roughly 2-4 minutes. I would have thought PG would optimize the first query to match this performance?
WITH "accts" AS (
SELECT "id", "user_id"
FROM "accounts" WHERE "user_id" IN ('5bf4ceb45d27fd2985a000000')
)
SELECT "transactions"."id"::TEXT AS "id",
"transactions"."amount",
"transactions"."currency"::TEXT AS "currency",
"transactions"."external_id"::TEXT AS "external_id",
"transactions"."check_sender_balance",
"transactions"."created",
"transactions"."type"::TEXT AS "type",
"transactions"."sequence",
"transactions"."legacy_id"::TEXT AS "legacy_id",
"transactions"."reference_transaction"::TEXT AS "reference_transaction",
a."user_id" AS "user_id"
FROM "transactions"
JOIN "lines" "l" ON "transactions"."id" = "l"."transaction"
JOIN "accts" "a" ON "a"."id" = "l"."account"
WHERE "l"."type" = 'DEBIT'
AND "sequence" > 357550718
Both queries actually contain the same user_id predicate: in the first query the unqualified user_id resolves to a.user_id, just like the filter inside the CTE. The difference is that in PostgreSQL 10 a CTE is an optimization fence: the accts CTE is evaluated first, so only the rows for that user_id (presumably found via an index on user_id) are joined, while in the first query the planner is free to choose a different, and here much worse, join order. You can run an explain plan on both queries separately by adding EXPLAIN to the beginning of them and see how the plans differ. This will help you figure out why there is a difference.
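For example, a minimal sketch against the first query (note that ANALYZE actually executes the statement, so expect it to run as long as the query itself):
EXPLAIN (ANALYZE, BUFFERS)
SELECT transactions.id
FROM transactions
JOIN lines l ON transactions.id = l.transaction
JOIN accounts a ON l.account = a.id
WHERE l.type = 'DEBIT'
AND "sequence" > 357550718
AND user_id IN ('5bf4ceb45d27fd2985a000000');
Comparing the join order and row estimates in the two plans should show where the 20 minutes go.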

Selecting values inside array in jsonb

I have the following table
CREATE TABLE country (
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(50),
extra_info JSONB
);
INSERT INTO country(id,extra_info)
VALUES (1, '{ "name" : "France", "population" : "65000000", "flag_colours": ["red", "blue","white"]}');
INSERT INTO country(id,extra_info)
VALUES (2, '{ "name": "Spain", "population" : "47000000", "borders": ["Portugal", "France"] }');
SELECT extra_info->>'name' as Name, extra_info->>'population' as Population
FROM country
I would like to select the id and the extra info:
SELECT id,extra_info->>'population' as Population,extra_info->'flag_colours'->>1 as colors
FROM country
This query shows only id and population, but the colors column is null.
I would also like to use flag_colours in a condition:
SELECT extra_info->>'population' as Population FROM country where extra_info->'flag_colours'->>0
I get this error:
ERROR: argument of WHERE must be type boolean, not type text
LINE 1: ...o->>'population' as Population FROM country where extra_info...
^
SQL state: 42804
Character: 67
How can I fix the two queries?
I ended up writing my query this way:
SELECT *
FROM country
WHERE (extra_info -> 'flag_colours') ? 'red' and (extra_info -> 'flag_colours') ? 'white'
Many thanks to alt-f4; updated answer: https://stackoverflow.com/a/62858683/492293
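For reference, the two original queries can also be fixed directly; a minimal sketch (JSON arrays are zero-indexed, and ->> extracts text, which can then be compared to produce the boolean that WHERE requires):
-- only rows that actually have a flag_colours key
SELECT id,
extra_info->>'population' as Population,
extra_info->'flag_colours'->>1 as colors
FROM country
WHERE extra_info ? 'flag_colours';
-- use a colour in a condition by comparing the extracted text
SELECT extra_info->>'population' as Population
FROM country
WHERE extra_info->'flag_colours'->>0 = 'red';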

PostgreSQL: Find and delete duplicated jsonb data, excluding a key/value pair when comparing

I have been searching all over to find a way to do this.
I am trying to clean up a table with a lot of duplicated jsonb fields.
There are some examples out there, but as a little twist, I need to exclude one key/value pair in the jsonb field, to get the result I need.
An example jsonb value (the two records differ only in the updated_at timestamp):
{
  "main": {
    "orders": {
      "order_id": "1",
      "customer_id": "1",
      "updated_at": "11/23/2017 17:47:13"
    }
  }
}
Compared to:
{
  "main": {
    "orders": {
      "order_id": "1",
      "customer_id": "1",
      "updated_at": "11/23/2017 17:49:53"
    }
  }
}
If I can exclude the "updated_at" key when comparing, the query should recognize these as duplicates, and the duplicated entries should be deleted, keeping only the first, "original" one.
I have found the following query to find the duplicates, but it doesn't take my situation into account. Maybe someone can help restructure it to meet the requirements.
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT jsonb_field, COUNT(*) AS CountOf
FROM customers
GROUP BY jsonb_field
HAVING COUNT(*)>1
) t2 ON t1.jsonb_field=t2.jsonb_field
WHERE
t1.customer_id = 1
Thanks in advance :-)
If updated_at is always at the same path, then you can remove it with the #- operator before comparing. Note that the inner query has to strip the key as well, otherwise rows differing only in updated_at are never grouped together:
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT jsonb_field #- '{main,orders,updated_at}' AS stripped, COUNT(*) AS CountOf
FROM customers
GROUP BY jsonb_field #- '{main,orders,updated_at}'
HAVING COUNT(*)>1
) t2 ON
t1.jsonb_field #- '{main,orders,updated_at}'
=
t2.stripped
WHERE
t1.customer_id = 1
See the additional operators at https://www.postgresql.org/docs/9.5/static/functions-json.html
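To actually delete the duplicates rather than just list them, here is a hedged sketch, assuming customers has a unique id column to use as a tiebreaker (adapt it to your real primary key):
DELETE FROM customers t1
USING customers t2
WHERE t1.id > t2.id -- keep the row with the smallest id as the "original"
AND t1.jsonb_field #- '{main,orders,updated_at}'
= t2.jsonb_field #- '{main,orders,updated_at}';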
EDIT
If you don't have #-, you can cast to text and do a regex replace:
regexp_replace(t1.jsonb_field::text, '"updated_at": "[^"]*?"', '')::jsonb
=
regexp_replace(t2.jsonb_field::text, '"updated_at": "[^"]*?"', '')::jsonb
You may not even need to cast it back to jsonb, but it is safer to do so.
Mind that the regex matches ANY "updated_at" field (by key) in the JSON. It should not match data, because in data it would not match an escaped closing quote \", nor find the colon after it.
Note the regex actually should be '"updated_at": "[^"]*?",?'
But on SQL Fiddle that fails (this may depend on the Postgres build; check with your version, because as far as the regex goes, this is correct). If the comma is not removed, the cast back to jsonb fails.
You can try '"updated_at": "[^"]*?",'
Without the ?, that will remove the comma, but fail if updated_at was the last key in the object. Worst case, nest the two:
regexp_replace(
regexp_replace(t1.jsonb_field::text, '"updated_at": "[^"]*?",', ''),
'"updated_at": "[^"]*?"', '')::jsonb
For PostgreSQL 9.4:
(SQL Fiddle only has 9.3 and 9.6; 9.3 is missing json_object_agg, but the Postgres docs say it is in 9.4, so this should work.)
It will only work if all records have objects under the important keys (main->orders). If main->orders is a JSON array or a scalar, this may give an error; the same goes for {"main": [1,2]}.
json_each returns a table with a row for each key in the JSON, and json_object_agg aggregates the rows back into a JSON object. The CASE expression picks out the one key on each level that needs special handling; at the deepest nesting level, it filters out the updated_at row.
On SQL Fiddle, set the query separator to '//'; if you use the psql client, replace the // with ;.
create or replace function foo(x json)
returns jsonb
language sql
as $$
select json_object_agg(key,
case key when 'main' then
(select json_object_agg(t2.key,
case t2.key when 'orders' then
(select json_object_agg(t3.key, t3.value)
from json_each(t2.value) as t3
WHERE t3.key <> 'updated_at'
)
else t2.value
end)
from json_each(t1.value) as t2
)
else t1.value
end)::jsonb
from json_each(x) as t1
$$ //
select foo(x)
from
(select '{ "main":{"orders":{"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:49:53" }}}'::json as x) as t1
The argument x may need to be of type jsonb, if that is your column's datatype.
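To plug the function into the earlier duplicate search, a hedged sketch (assuming jsonb_field is of type jsonb, hence the ::json casts):
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT foo(jsonb_field::json) AS normalized, COUNT(*) AS CountOf
FROM customers
GROUP BY foo(jsonb_field::json)
HAVING COUNT(*) > 1
) t2 ON foo(t1.jsonb_field::json) = t2.normalized;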

Upsert error (On Conflict Do Update) pointing to duplicate constrained values

I have a problem with ON CONFLICT DO UPDATE in Postgres 9.5 when I try to use more than one source table in the FROM clause.
Example of working code:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
Faulty code:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference", old."ReferenceAuthor"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
AND old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID"
--Year, Title and Author must be present in the data, otherwise the entry is deemed useless, hence won't be included
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
I added an additional source in the FROM clause and one more WHERE condition to make sure only entries that have a title, year, and author are inserted into the new database. (If old."Reference"."ID" exists in old."ReferenceAuthor" as "ReferenceID", then an author exists.) Even without the additional WHERE condition the query is faulty. The columns I specified in SELECT are only present in old."Reference", not in old."ReferenceAuthor".
Currently old."ReferenceAuthor" and old."Reference" don't have a UNIQUE constraint; the constraints on bookmonographs are:
CONSTRAINT bookmonographs_pk PRIMARY KEY (bookmonographsid),
CONSTRAINT bookmonographs_bookseries FOREIGN KEY (bookseriesid)
REFERENCES new.bookseries (bookseriesid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT bookmonographs_citaviid_unique UNIQUE (citavi_id)
The error psql throws:
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
********** Error **********
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
SQL state: 21000
Hint: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
I don't know what's wrong, or why the hint points to a duplicated constrained value.
The problem is caused by the fact that some entries apparently have multiple authors. The implicit inner join in your select query then returns multiple rows for the same Reference, and INSERT ... ON CONFLICT refuses to update the same row twice within one statement. Since you only use the ReferenceAuthor table for filtering, you can simply rewrite the query so that it filters out entries without any author, using EXISTS with a correlated subquery. Here's how:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
AND exists(SELECT FROM old."ReferenceAuthor" WHERE old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID")
--Year, Title and Author must be present in the data, otherwise the entry is deemed useless, hence won't be included
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
Alternatively, use an explicit INNER JOIN to join the two source tables together, and select DISTINCT rows so that a reference with several authors is proposed for insertion only once:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT DISTINCT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
INNER JOIN old."ReferenceAuthor" -- explicit join
ON old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID" -- ON condition
WHERE old."Reference"."ReferenceType" = 'Book' AND
old."Reference"."Year" IS NOT NULL AND
old."Reference"."Title" IS NOT NULL
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) =
(excluded.abstract, excluded.createdon, excluded.edition, excluded.title,
excluded.year)
;
There's a great explanation of the issue in the Postgres docs (Ctrl+F for "Cardinality violation" errors in detail, as there's no direct link).
To quote from the docs:
The idea of raising "cardinality violation" errors is to ensure that any one row is affected no more than once per statement executed. In the lexicon of the SQL standard's discussion of SQL MERGE, the SQL statement is "deterministic". The user ought to be confident that a row will not be affected more than once - if that isn't the case, then it isn't predictable what the final value of a row affected multiple times will be.
To replay their simpler example: on a table upsert, the query below cannot work, as we couldn't reliably know whether select val from upsert where key = 1 should equal 'Foo' or 'Bar':
INSERT INTO upsert(key, val)
VALUES(1, 'Foo'), (1, 'Bar')
ON CONFLICT (key) DO UPDATE SET val = EXCLUDED.val;
ERROR: 21000: ON CONFLICT UPDATE command could not lock/update self-inserted tuple
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
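If the source rows genuinely contain duplicates, a common workaround is to de-duplicate them in the SELECT before the upsert. A minimal sketch using DISTINCT ON, where the ORDER BY decides deterministically which duplicate wins:
INSERT INTO upsert(key, val)
SELECT DISTINCT ON (key) key, val
FROM (VALUES (1, 'Foo'), (1, 'Bar')) AS v(key, val)
ORDER BY key, val -- keeps the first row per key in this ordering, i.e. 'Bar'
ON CONFLICT (key) DO UPDATE SET val = EXCLUDED.val;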