If two PostgreSQL tables have GiST indexes, can their union be considered an indexed table? - postgresql

I have three tables: table_a, table_b, and table_c. All of them have a GiST index.
I would like to perform a left join between table_c and the UNION of table_a and table_b.
Can the UNION be considered "indexed"? I assume it would be better to create a new table as the UNION, but these tables are huge, so I am trying to avoid this kind of redundancy.
In terms of SQL, my question:
Is this
SELECT * FROM myschema.table_c AS a
LEFT JOIN
(SELECT col_1,col_2,the_geom FROM myschema.table_a
UNION
SELECT col_1,col_2,the_geom FROM myschema.table_b) AS b
ON ST_Intersects(a.the_geom,b.the_geom);
equivalent to this?
CREATE TABLE myschema.table_d AS
SELECT col_1,col_2,the_geom FROM myschema.table_a
UNION
SELECT col_1,col_2,the_geom FROM myschema.table_b;
CREATE INDEX idx_table_d_the_geom
ON myschema.table_d USING gist
(the_geom)
TABLESPACE mydb;
SELECT * FROM myschema.table_c AS a
LEFT JOIN myschema.table_d AS b
ON ST_Intersects(a.the_geom,b.the_geom);

You can look at the execution plan with EXPLAIN, but I doubt that it will use the indexes.
Rather than performing a left join between one table and the union of the two other tables, perform the union of the left joins between the one table and each of the two tables in turn. That will be a longer statement, but PostgreSQL will be sure to use the indexes if that can speed up the left joins.
Be sure to use UNION ALL rather than UNION unless you really have to remove duplicates.
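A sketch of that rewrite with UNION ALL, using the tables and columns from the question:
SELECT a.*, b.col_1, b.col_2, b.the_geom
FROM myschema.table_c AS a
LEFT JOIN myschema.table_a AS b
   ON ST_Intersects(a.the_geom, b.the_geom)
UNION ALL
SELECT a.*, b.col_1, b.col_2, b.the_geom
FROM myschema.table_c AS a
LEFT JOIN myschema.table_b AS b
   ON ST_Intersects(a.the_geom, b.the_geom);
Note that this is not row-for-row identical to the original query: a table_c row that matches in only one branch (or in neither) still produces unmatched NULL rows from the other branch, so those may need to be filtered out afterwards if that matters.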

Related

Indexes to support OR condition over a JOIN

I'm wondering if Postgres has any support for optimizing the following fundamental problem.
I want to do a search against two columns on different tables joined via a foreign key. I have created an index for each column. If I do my join query with a WHERE condition on either one column or the other, the respective index is used to filter the result and the query performance is great. If I use two WHERE clauses combined by an OR, one for a field on each table, the query gets very slow and no indexes are used. Presumably this is because the optimizer sees no other way than doing a full table join and scan to resolve it. The query looks something like this:
select table1.id
from table1
left join table2 on table1.fk = table2.id
where table1.haystack ilike '%needle%' or table2.haystack ilike '%needle%'
The operation (ILIKE) isn't the issue and is interchangeable; I have a working trigram index setup. I just want to find out if there is any other way to make this type of query performant besides denormalizing all searched fields into one table.
I would be very grateful for any ideas.
No, there is no special support in the database to optimize this. Do it yourself:
SELECT table1.id
FROM table1
JOIN table2 ON table1.fk = table2.id
WHERE table1.haystack ILIKE '%needle%'
UNION
SELECT table1.id
FROM table1
JOIN table2 ON table1.fk = table2.id
WHERE table2.haystack ILIKE '%needle%'
Provided both conditions are selective and indexed with a trigram index, and you have indexes on the join condition, that will be faster.
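For reference, the supporting indexes might look something like this (index names are illustrative):
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX table1_haystack_trgm_idx ON table1 USING gin (haystack gin_trgm_ops);
CREATE INDEX table2_haystack_trgm_idx ON table2 USING gin (haystack gin_trgm_ops);
-- supports the join condition table1.fk = table2.id (table2.id is assumed to be the primary key)
CREATE INDEX table1_fk_idx ON table1 (fk);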

Should I do ORDER BY twice when selecting from subquery?

I have a SQL query (code below) which selects some rows from a subquery. In the subquery I perform an ORDER BY.
The question is: will the order of the subquery be preserved in the parent query?
Is there some spec/document or something which proves that?
SELECT sub.id, sub.name, ot.field
FROM (SELECT t.id, t.name
FROM table t
WHERE t.something > 10
ORDER BY t.id
LIMIT 25
) sub
LEFT JOIN other_table ot ON ot.table_id = sub.id
/** order by id? **/
will the order of the subquery be preserved in the parent query
It might happen, but you cannot rely on it.
For example, if the optimizer decides to use a hash join between your derived table and other_table then the order of the derived table will not be preserved.
If you want a guaranteed sort order, you have to use an ORDER BY in the outer query as well.
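Applied to the query from the question, that would be:
SELECT sub.id, sub.name, ot.field
FROM (SELECT t.id, t.name
      FROM table t
      WHERE t.something > 10
      ORDER BY t.id
      LIMIT 25
     ) sub
LEFT JOIN other_table ot ON ot.table_id = sub.id
ORDER BY sub.id;  -- repeat the sort here to guarantee the final order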

PostgreSQL 9.4.5: Limit number of results on INNER JOIN

I'm trying to implement a many-to-many relationship using PostgreSQL's array type, because it scales better for my use case than a join table would. I have two tables: table1 and table2. table1 is the parent in the relationship, having the column child_ids bigint[] default array[]::bigint[]. A single row in table1 can have upwards of tens of thousands of references to table2 in the table1.child_ids column, so I want to limit the number of rows returned by my query to a maximum of 10. How would I structure this query?
My query to dereference the child ids is SELECT *, json_agg(table2.*) AS children FROM table1 INNER JOIN table2 ON table2.id = ANY(table1.child_ids). I don't see a way I could set a limit without limiting the entire response as a whole. Is there a way to either limit this INNER JOIN, or at least use a subquery so that I can use LIMIT to restrict the number of results from table2?
This would have been dead simple with properly normalized tables, but here goes with arrays:
SELECT *
FROM table1 t1, LATERAL (
   SELECT json_agg(c) AS children
   FROM (
      SELECT *
      FROM table2
      WHERE id = ANY (t1.child_ids)
      LIMIT 10  -- limit before aggregating, so at most 10 children per parent row
   ) c
) t2;
Of course, as written you have no influence over which 10 rows of table2 are selected for each table1 row.
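If a deterministic set is needed, an ORDER BY inside the innermost subquery (before the LIMIT) would control the choice; a sketch, sorting by id:
SELECT *
FROM table1 t1, LATERAL (
   SELECT json_agg(c) AS children
   FROM (
      SELECT *
      FROM table2
      WHERE id = ANY (t1.child_ids)
      ORDER BY id  -- e.g. pick the 10 lowest ids
      LIMIT 10
   ) c
) t2;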

Select from view and join from one table or another if no record available in first

I have a view and two tables. Tables one and two have the same columns, but table one has a small number of records, and table two has old data and a huge number of records.
I have to join a view with these two tables to get the latest data from table one; if a record from the view is not available in table one then I have to select the record from table two.
How can I achieve this with MySQL?
From some research on the internet, I learned that we can't use a FULL JOIN or a subquery in the FROM clause here.
Just do a simple UNION of the results excluding the records in table2 that are already mentioned in table1:
SELECT * FROM table1
UNION
SELECT * FROM table2
WHERE NOT EXISTS (SELECT * FROM table1 WHERE table2.id = table1.id)
Something like this.
SELECT *
FROM view1 V
INNER JOIN (SELECT COALESCE(a.commoncol, b.commoncol) AS commoncol
FROM table1 A
FULL OUTER JOIN table2 B
ON A.commoncol = B.commoncol) C
ON v.viewcol = c.commoncol
If you are using MySQL, then check here for how to simulate a FULL OUTER JOIN in MySQL.
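A rough sketch of that emulation, replacing just the FULL OUTER JOIN derived table above with a LEFT JOIN and a RIGHT JOIN combined by UNION:
SELECT A.commoncol AS commoncol
FROM table1 A
LEFT JOIN table2 B ON A.commoncol = B.commoncol
UNION
SELECT B.commoncol
FROM table1 A
RIGHT JOIN table2 B ON A.commoncol = B.commoncol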
Are you trying to update the view from two tables, where an old record in the view needs to be overwritten by the latest/updated record from table1 and records not existing in table1 need to be appended from table2? Or are you creating a view from two tables?

Intersect of Select Statements based on a particular column

I have a question about the INTERSECT clause between two SELECT statements in SQL Server 2008.
Select 1 a,b,c ..... INTERSECT Select 2 a,b,c....
Here, the datasets of the two queries should exactly match to return the common elements.
But, I want only column a of both select statements to match.
If the values of column a in both the queries have same values, the entire row should appear in the result set.
Can I do that, and how?
Thanks,
Marcus..
The best thing to do is to look at the queries themselves. Do they need an INTERSECT, or is it possible to use a join instead?
For example, an INTERSECT looks like this:
select columnA
from tableA
INTERSECT
select columnAreference
from tableB
Your result would have all values that are in BOTH tables, so a join would be more useful:
select columnA
from tableA a
inner join tableB b
on b.columnAReference = a.columnA
If you look at the execution plan, you'll see that the INTERSECT is executed as a left semi join, while the inner join is executed, as expected, as an inner join. A left semi join isn't something you can tell the query optimizer to use directly, but it can be faster: it returns each row from the left table at most once, whereas a normal join returns a row for every match. In this particular case it will be faster.
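For what it's worth, a semi join is typically what you get when you write the condition with EXISTS; a sketch that returns whole rows of tableA having a match on columnA only:
SELECT a.*
FROM tableA a
WHERE EXISTS (SELECT 1
              FROM tableB b
              WHERE b.columnAreference = a.columnA)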
So an INTERSECT isn't a bad thing that should always be replaced with an INNER JOIN construction; sometimes it will perform even better.
However, to give you the best answer, I will need some more details about your query :)
SELECT * FROM table1 t1 INNER JOIN table2 t2
ON t1.col1 = t2.col1