Perform search based on separate table data in PostgreSQL

I have a table which looks like this:
deals
id  vendorid  name
1   52        '25% Off'
2   34        '10% Off'

vendors
id  name
52  'Walmart'
34  'Home Depot'
I'm trying to do a search of the database that also searches based on the vendor id.
I was able to do this to get the vendor data in the query:
SELECT *,
(SELECT json_agg(vendors.*) FROM vendors WHERE deals.vendorid = vendors.id) as vendor
FROM deals
Now I want to add a search to it; here is something I tried:
SELECT *,
(SELECT json_agg(vendors.*) FROM vendors WHERE deals.vendorid = vendors.id) as vendor
FROM deals
WHERE vendor.name ILIKE '%wal%'
In the above example the user would be starting a search for Walmart; however, Postgres says that the vendor.name column does not exist. What would the correct way to do this be?
The output I'm expecting from the above query is:
[
{
id: 1,
vendorid: 52,
name: '25% Off',
vendor: [
{
id: 52,
name: 'Walmart'
}
]
}
]

I was able to get it done by using this.
SELECT
    *,
    (SELECT json_agg(vendors.*) FROM vendors WHERE deals.vendorid = vendors.id) AS vendor
FROM
    deals
-- json_agg returns a JSON array, so extract the first element's name field
WHERE (SELECT json_agg(vendors.*) FROM vendors WHERE deals.vendorid = vendors.id) -> 0 ->> 'name' ILIKE '%wal%'
It's pretty ugly, but it gets the work done. I'm not really familiar with Postgres, but perhaps you can store the subquery result in a variable somehow.
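For what it's worth, a cleaner pattern (a sketch against the tables above, not the answerer's original) is to push the search into an EXISTS subquery, so the aggregation subquery isn't duplicated in the WHERE clause:

SELECT
    *,
    (SELECT json_agg(vendors.*) FROM vendors WHERE deals.vendorid = vendors.id) AS vendor
FROM deals
WHERE EXISTS (
    SELECT 1
    FROM vendors
    WHERE vendors.id = deals.vendorid
      AND vendors.name ILIKE '%wal%'
);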

The query you need
drop table if exists deals;
drop table if exists vendors;
create table if not exists deals(
id serial,
vendorid int not null,
name varchar not null
);
create table if not exists vendors(
id serial,
name varchar not null
);
insert into deals values
(1,52,'25% off'),
(2,34,'10% off');
insert into vendors values
(52,'Walmart'),
(34,'Home Depot');
select
d.id,
d.vendorid,
v.name,
json_agg(v.*)
from
vendors v
left join
deals d on d.vendorid = v.id
where
--v.id=52 and
v.name ILIKE '%wal%'
group by
d.id,
d.vendorid,
v.name;
If you are running PostgreSQL 9.3 or later, you can use a great feature it provides: full-text search.
Check how it works here.
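As a minimal sketch against the vendors table above (an assumption, not part of the original answer); note that to_tsquery matches whole lexemes, so a :* suffix is needed for a prefix search like 'wal':

-- prefix-match vendor names with full-text search
SELECT *
FROM vendors
WHERE to_tsvector('english', name) @@ to_tsquery('english', 'wal:*');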
Enjoy postgres power.

Related

Is it possible to bulk update specific values in postgresql efficiently?

I have created a pipeline which is required to update a high number of rows in Postgres, where each row should be updated differently.
After looking it up, I found that this could be done using the Postgres UPDATE .. FROM .. syntax (https://www.postgresql.org/docs/current/sql-update.html), and I came up with the following query that works perfectly fine:
update grades
set course_id = data_table.course_id,
student_id = data_table.student_id,
grade = data_table.grade
from
(select unnest(array[1,2]) as id,
        unnest(array['Math', 'Math']) as course_id,
        unnest(array[1000, 1001]) as student_id,
        unnest(array[95, 100]) as grade) as data_table
where grades.id = data_table.id;
There's also another way to do it with WITH syntax like this:
update grades
set course_id = data_table.course_id,
student_id = data_table.student_id,
grade = data_table.grade
from
(WITH vals (id, course_id, student_id, grade) as (
    VALUES (1, 'Math', 1000, 95), (2, 'Math', 1001, 100)
) SELECT * from vals) as data_table
where grades.id = data_table.id;
My problem is that in some rows I sometimes want to update a field and sometimes not. When I don't want to update it, I just want to keep the value that is currently in the table. In this case, I would want to do something like:
update grades g
set course_id = data_table.course_id,
student_id = data_table.student_id,
grade = data_table.grade
from
(select unnest(array[1,2]) as id,
        unnest(array[g.course_id, 'Math2']) as course_id,
        unnest(array[1000, 1001]) as student_id,
        unnest(array[95, g.grade]) as grade) as data_table
where grades.id = data_table.id;
However, this is not possible, and I get back the error HINT: There is an entry for table "g", but it cannot be referenced from this part of the query.
The PostgreSQL documentation also says as much in its description of FROM:
Note that the target table must not appear in the from_list,
unless you intend a self-join (in which case it must appear with an alias in the from_list).
Does anyone know if there's a way to perform such bulk update ?
I've tried to use JOINs in the inner query, but with no luck.
Choose a value that cannot be a valid value, e.g. '-1' for a course name and -1 for a grade, use that for your generated values, and then use a CASE in the UPDATE to decide whether to keep the current value:
update grades g
set course_id = case when data_table.course_id = '-1' then g.course_id else data_table.course_id end,
student_id = data_table.student_id,
grade = case when data_table.grade = -1 then g.grade else data_table.grade end
from (
select
unnest(array[1,2]) as id,
unnest(array['-1', 'Math2']) as course_id, -- use '-1' instead of g.course_id
unnest(array[1000, 1001]) as student_id,
unnest(array[95, -1]) as grade -- use -1 instead of g.grade
) as data_table
where grades.id = data_table.id;
Pick whatever values you like for the impossible value.
If NULLs were not allowed in these columns, it would have been more straightforward and less code: use NULL for the impossible value and coalesce() for the update value.
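A sketch of that NULL-based variant (assuming NULL never occurs as real data in course_id or grade):

update grades g
set course_id = coalesce(data_table.course_id, g.course_id),
    grade     = coalesce(data_table.grade, g.grade)
from (
    select unnest(array[1,2]) as id,
           unnest(array[null, 'Math2']) as course_id,  -- null means "keep the current value"
           unnest(array[95, null]) as grade
) as data_table
where g.id = data_table.id;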

Using ANY with raw data works, but not with a subquery

I just can't figure out why this query works:
SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY(
'{abc-xyz-123,678-ght-nmp}'
)
But this query won't work, failing with the error operator does not exist: uuid = uuid[]:
SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY(
SELECT organization_ids
FROM admins
WHERE id = 'jkl-iop-345'
)
When the subquery
SELECT organization_ids
FROM admins
WHERE id = 'jkl-iop-345'
gives the exact result {abc-xyz-123,678-ght-nmp}.
I'm using postgres (PostgreSQL) 13.3
The subquery produces one row that contains an array. With = ANY (SELECT ...), organization_id is compared to each row the subquery returns, and here each row is itself a uuid[], so Postgres looks for an operator uuid = uuid[] and finds none.
You probably want
SELECT id, name, organization_id
FROM facilities
WHERE EXISTS (SELECT 1 FROM admins
WHERE admins.id = 'jkl-iop-345'
AND facilities.organization_id = ANY (admins.organization_ids)
);
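An alternative fix (a sketch, not from the original answer) is to unnest the stored array so that the subquery returns one uuid per row, which is what = ANY (SELECT ...) expects:

SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY (
    SELECT unnest(organization_ids)
    FROM admins
    WHERE id = 'jkl-iop-345'
);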
Let me remark that storing references to other tables in an array, JSON or other composite data type is an exceptionally bad idea. A normalized schema with a junction table would serve you better.
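A sketch of that normalized shape (the admin_organizations table and its columns are hypothetical names):

-- one row per admin/organization pair, instead of a uuid[] column on admins
CREATE TABLE admin_organizations (
    admin_id        uuid NOT NULL REFERENCES admins (id),
    organization_id uuid NOT NULL,
    PRIMARY KEY (admin_id, organization_id)
);

SELECT f.id, f.name, f.organization_id
FROM facilities f
JOIN admin_organizations ao ON ao.organization_id = f.organization_id
WHERE ao.admin_id = 'jkl-iop-345';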

Cast a PostgreSQL column to stored type

I am creating a viewer for PostgreSQL. My SQL needs to sort on the type that is normal for that column. Take for example:
Table:
CREATE TABLE contacts (id serial primary key, name varchar)
SQL:
SELECT id::text FROM contacts ORDER BY id;
Gives:
1
10
100
2
Ok, so I change the SQL to:
SELECT id::text FROM contacts ORDER BY id::regtype;
Which results in:
1
2
10
100
Nice! But now I try:
SELECT name::text FROM contacts ORDER BY name::regtype;
Which results in:
invalid type name "my first string"
Google is no help. Any ideas? Thanks
Repeat: the error is not my problem. My problem is that I need to convert each column to text, but order by the normal type for that column.
regtype is an object identifier type, and there is no reason to use it when you are not referring to system objects (types, in this case).
You should cast the column to integer in the first query:
SELECT id::text
FROM contacts
ORDER BY id::integer;
You can use qualified column names in the order by clause. This will work with any sortable type of column.
SELECT id::text
FROM contacts
ORDER BY contacts.id;
So, I found two ways to accomplish this. The first is the solution #klin provided: query the table once, then construct the real query based on the column metadata. An untested psycopg2 example:
c = conn.cursor()
c.execute("SELECT * FROM contacts LIMIT 1")
sort_by_sql = ""
for col in c.description:  # cursor.description holds one entry per result column
    if col.name == "my_sort_column":
        if col.type_code == 23:  # 23 is the type OID of int4
            sort_by_sql = "ORDER BY " + col.name + "::integer"
        else:
            sort_by_sql = "ORDER BY " + col.name + "::text"
c.execute("SELECT * FROM contacts " + sort_by_sql)
A more elegant way would be like this:
SELECT id::text AS _id, name::text AS _name FROM contacts ORDER BY id
This uses aliases so that ORDER BY still picks up the original data. The last option is more readable if nothing else.

How to find primary key on a table in DB2?

I am unable to fetch the primary key in DB2. I used the following code, but it is not working for me.
SELECT TBCREATOR, TBNAME, NAME, KEYSEQ
FROM SYSIBM.SYSCOLUMNS
WHERE TBCREATOR = 'DSN8710'
AND TBNAME = 'EMPLOYEE'
AND KEYSEQ > 0
ORDER BY KEYSEQ;
Also, what is the meaning of TBCREATOR in this code, and how should I modify the TBCREATOR value for my case?
I'll answer your last question first. creator is sometimes referred to as schema. If you're familiar with Oracle, this is roughly analogous to a database user (though not exactly).
As far as getting the "primary key" information, you probably want to know which index is the "clustering" index (which is what usually, but not always, determines the physical ordering of the rows on disk).
How you find the clustering index depends on the platform you're running:
Mainframe (z/OS):
SELECT
RTRIM(name) AS index_name
,RTRIM(creator) AS index_schema
,uniquerule
,clustering
FROM sysibm.sysindexes
WHERE tbname = #table
AND tbcreator = #schema
AND clustering = 'Y'
Then, to see the actual columns in that index, you perform this query:
SELECT colname AS name
FROM sysibm.sysindexes a
JOIN sysibm.syskeys b
ON a.name = b.ixname
AND a.tbcreator = b.ixcreator
WHERE a.name = #index_name
AND a.tbcreator = #index_schema
ORDER BY COLSEQ
Linux/Unix/Windows:
SELECT
RTRIM(indname) AS index_name
,RTRIM(indschema) AS index_schema
,uniquerule
,indextype
FROM syscat.indexes
WHERE tabname = #table
AND tabschema = #schema
AND indextype = 'CLUS'
Then, to see the actual columns in that index, you perform this query:
SELECT colnames AS name
FROM syscat.indexes
WHERE indname = #index_name
AND indschema = #index_schema
LUW returns the list of columns as one string delimited by +, which is kind of weird; for example, an index on (LASTNAME, FIRSTNAME) comes back as '+LASTNAME+FIRSTNAME'.
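If what you literally need is the primary key rather than the clustering index, the constraint catalog can be queried instead; a sketch for LUW (on z/OS, SYSIBM.SYSINDEXES with UNIQUERULE = 'P' plays a similar role):

-- columns of the primary key constraint, in key order
SELECT k.colname, k.colseq
FROM syscat.tabconst c
JOIN syscat.keycoluse k
ON k.constname = c.constname
AND k.tabschema = c.tabschema
AND k.tabname = c.tabname
WHERE c.type = 'P'
AND c.tabschema = #schema
AND c.tabname = #table
ORDER BY k.colseq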

Insert multiple rows where not exists in PostgreSQL

I'd like to generate a single SQL query to mass-insert a series of rows that don't exist in a table. My current setup makes a new query for each record insertion, similar to the solution detailed in WHERE NOT EXISTS in PostgreSQL gives syntax error, but I'd like to move this to a single query to optimize performance, since my current setup could generate several hundred queries at a time. Right now I'm trying something like the example I've added below:
INSERT INTO users (first_name, last_name, uid)
SELECT ( 'John', 'Doe', '3sldkjfksjd'), ( 'Jane', 'Doe', 'adslkejkdsjfds')
WHERE NOT EXISTS (
SELECT * FROM users WHERE uid IN ('3sldkjfksjd', 'adslkejkdsjfds')
)
Postgres returns the following error:
PG::Error: ERROR: INSERT has more target columns than expressions
The problem is that PostgreSQL doesn't seem to want to take a series of values when using SELECT. Conversely, I can make the insertions using VALUES, but then I can't prevent duplicates from being generated using WHERE NOT EXISTS.
http://www.techonthenet.com/postgresql/insert.php suggests in the section EXAMPLE - USING SUB-SELECT that multiple records should be insertable from another referenced table using SELECT, so I'm wondering why I can't seem to pass in a series of values to insert. The values I'm passing are coming from an external API, so I need to generate the values to insert by hand.
Your select is not doing what you think it does.
The most compact version in PostgreSQL would be something like this:
with data(first_name, last_name, uid) as (
values
( 'John', 'Doe', '3sldkjfksjd'),
( 'Jane', 'Doe', 'adslkejkdsjfds')
)
insert into users (first_name, last_name, uid)
select d.first_name, d.last_name, d.uid
from data d
where not exists (select 1
from users u2
where u2.uid = d.uid);
Which is pretty much equivalent to:
insert into users (first_name, last_name, uid)
select d.first_name, d.last_name, d.uid
from (
select 'John' as first_name, 'Doe' as last_name, '3sldkjfksjd' as uid
union all
select 'Jane', 'Doe', 'adslkejkdsjfds'
) as d
where not exists (select 1
from users u2
where u2.uid = d.uid);
a_horse_with_no_name's answer originally had a syntax error (a missing final closing paren), but other than that it is the correct way to do this.
Update:
For anyone coming to this with a situation like mine: if you have columns that need to be type cast (for instance timestamps, uuids, or jsonb in PG 9.5), you must declare the casts in the values you pass to the query:
-- insert multiple if not exists
-- where another_column_name is of type uuid, with strings cast as uuids
-- where created_at and updated_at is of type timestamp, with strings cast as timestamps
WITH data (id, some_column_name, another_column_name, created_at, updated_at) AS (
VALUES
(<id value>, <some_column_name_value>, 'a5fa7660-8273-4ffd-b832-d94f081a4661'::uuid, '2016-06-13T12:15:27.552-07:00'::timestamp, '2016-06-13T12:15:27.879-07:00'::timestamp),
(<id value>, <some_column_name_value>, 'b9b17117-1e90-45c5-8f62-d03412d407dd'::uuid, '2016-06-13T12:08:17.683-07:00'::timestamp, '2016-06-13T12:08:17.801-07:00'::timestamp)
)
INSERT INTO table_name (id, some_column_name, another_column_name, created_at, updated_at)
SELECT d.id, d.some_column_name, d.another_column_name, d.created_at, d.updated_at
FROM data d
WHERE NOT EXISTS (SELECT 1 FROM table_name t WHERE t.id = d.id);
a_horse_with_no_name's answer saved me today on a project, but I had to make these tweaks to make it perfect.
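As a footnote: on Postgres 9.5 and later, a unique constraint plus ON CONFLICT DO NOTHING achieves the same thing more directly (a sketch, assuming users.uid is declared UNIQUE, which ON CONFLICT requires):

INSERT INTO users (first_name, last_name, uid)
VALUES
    ('John', 'Doe', '3sldkjfksjd'),
    ('Jane', 'Doe', 'adslkejkdsjfds')
ON CONFLICT (uid) DO NOTHING;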