How to index two joined tables in Postgres? - postgresql

I have two very large tables in Postgres, for example tab1 and tab2. I need to run a query that joins them and applies a GROUP BY, and I want to use indexes to make it fast. Is there a way to use an index for queries with joins and GROUP BY? (I have heard that indexes cannot be used for joins and GROUP BY.)

Related

Merge join in PostgreSQL performs sort on indexed column

I am trying to optimize the following query in PostgreSQL:
SELECT ci.country_id, ci.ci_id, ci.name
FROM customer c
INNER JOIN address a ON c.a_id = a.a_id
INNER JOIN city ci ON ci.ci_id = a.ci_id
The columns customer.a_id, address.a_id, city.ci_id and address.ci_id all have a btree index.
I wanted to use a merge join instead of a hash join, as I read that a hash join does not really use indexes, so I turned off hash joins with SET enable_hashjoin = off.
According to the query plan, my query now uses a merge join, but it always performs a quicksort before the merge join. I know that a merge join needs its inputs sorted on the join columns, but they should already be sorted through the index. Is there a way to force Postgres to use the index and not perform the sort?
You are joining three tables. It is using two merge joins to do that, with the output of one merge join being one input of the other. The intermediate table is joined using two different columns, but it can't be ordered on two different columns simultaneously, so if you are only going to use merge joins, you need at least one sort.
This whole thing seems pointless, as the query is already very fast, and why do you care if it uses a hash join or not?
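To see the sort described above, one can discourage hash joins for the session and look at the plan. This is a minimal sketch using the query from the question; the plan itself depends on the data and statistics, so no output is shown here.

```sql
-- Discourage hash joins for this session only.
-- This is a planner cost setting, not a hard prohibition.
SET enable_hashjoin = off;

-- Show the chosen plan without executing the query.
EXPLAIN
SELECT ci.country_id, ci.ci_id, ci.name
FROM customer c
INNER JOIN address a ON c.a_id = a.a_id
INNER JOIN city ci ON ci.ci_id = a.ci_id;

-- Restore the default planner behaviour afterwards.
RESET enable_hashjoin;
```

Whichever merge join runs first produces output ordered on its own join column, while the second merge join needs its input ordered on the other column, so a Sort node is expected somewhere in the plan.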

Quicksight query two tables has problem with left join

I am getting wrong results from a query in QuickSight.
I have a campaign table. If I query only this table, the result is fine: all campaigns appear in my list. But when I use a left join, I get only the rows that have a match in the joined table.
Is this normal?
I have already tried all the join types, and the result is the same.

Nested Join vs Merge Join vs Hash Join in PostgreSQL

I know how the nested loop join, merge join and hash join work and what they do.
I wanted to know in which situations each of these joins is used in Postgres.
The following are a few rules of thumb:
Nested loop joins are preferred if one of the sides of the join has few rows. Nested loop joins are also used as the only option if the join condition does not use the equality operator.
Hash Joins are preferred if the join condition uses an equality operator and both sides of the join are large and the hash fits into work_mem.
Merge Joins are preferred if the join condition uses an equality operator and both sides of the join are large, but can be sorted on the join condition efficiently (for example, if there is an index on the expressions used in the join column).
A typical OLTP query that chooses only one row from one table and the associated rows from another table will always use a nested loop join as the only efficient method.
Queries that join tables with many rows (which cannot be filtered out before the join) would be very inefficient with a nested loop join and will always use a hash or merge join if the join condition allows it.
The optimizer considers each of these join strategies and uses the one that promises the lowest costs. The most important factor on which this decision is based is the estimated row count from both sides of the join. Consequently, wrong optimizer choices are usually caused by misestimates in the row counts.
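Since the answer attributes wrong join choices to row-count misestimates, a first check is to compare estimated and actual row counts in the plan. The following is a sketch only: tab1 and tab2 stand in for any two large tables, and the columns id and tab1_id are hypothetical.

```sql
-- Execute the query and report estimated vs. actual row counts per plan node.
-- Column names id and tab1_id are assumptions for illustration.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t1.id, count(*)
FROM tab1 t1
JOIN tab2 t2 ON t2.tab1_id = t1.id
GROUP BY t1.id;

-- If the estimates are far from the actual counts, refresh the statistics
-- and check the plan again.
ANALYZE tab1;
ANALYZE tab2;

-- For experimentation only: discourage one strategy to compare alternatives.
SET enable_nestloop = off;
-- ... rerun the EXPLAIN above ...
RESET enable_nestloop;

-- The hash join rule of thumb depends on this setting.
SHOW work_mem;
```

The enable_* settings are meant for diagnosis in a single session; they are not a substitute for accurate statistics.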

Joining a large table on two conditions without timing out

I'm trying to join two tables, and one of them has two identifier columns that I want to join on.
select stations.id, stations.name, count(starts.startstation_id) as station_starts, count(ends.endstation_id) as station_ends
from ny.station as stations
left join ny.rental as starts on stations.id = starts.startstation_id
left join ny.rental as ends on stations.id = ends.endstation_id
group by 1, 2
However the ny.rental table is very large and when I run this query my SQL Workbench/J crashes. The stations table is rather small.
What is the optimal way to construct this query?
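One possible approach, not taken from the thread, is to aggregate the large table before joining it. Joining ny.rental twice and grouping afterwards pairs every start row with every end row for a station, which inflates both the intermediate result and the counts. The sketch below assumes the goal is the number of rentals that start and end at each station.

```sql
-- Aggregate the large table once per role, then join the small results
-- to the station table. Assumes one output row per station is wanted.
SELECT s.id,
       s.name,
       COALESCE(st.station_starts, 0) AS station_starts,
       COALESCE(en.station_ends, 0)   AS station_ends
FROM ny.station AS s
LEFT JOIN (
    SELECT startstation_id, count(*) AS station_starts
    FROM ny.rental
    GROUP BY startstation_id
) AS st ON st.startstation_id = s.id
LEFT JOIN (
    SELECT endstation_id, count(*) AS station_ends
    FROM ny.rental
    GROUP BY endstation_id
) AS en ON en.endstation_id = s.id;
```

Each subquery returns at most one row per station, so the outer joins stay small and no outer GROUP BY is needed.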

sql query to retrieve DISTINCT rows on left join

I am developing a T-SQL query that left joins two tables. When I select records from Table A alone, I get only 2 records. The problem is that when I left join it to Table B, I get 4 records. How can I reduce this to just 2 records?
One complication is that I am only aware of one PK/FK pair linking these two tables.
The value you are joining on must occur more than once in Table B, which is why the join returns multiple rows. To reduce the row count you will have to either add further fields to the join condition, or add a WHERE clause to filter out the rows you do not need.
Alternatively, you could use a GROUP BY clause to collapse the rows, but this may not be what you need.
Remember that a left join returns NULLs for the joined table's columns when there is no match.
You can also use SELECT DISTINCT, but I can't see your issue clearly. Can you give us more details?
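The suggestions above can be sketched as follows. The schema is hypothetical, since the question does not show one: TableA(id, name) and TableB(a_id, detail), linked by a_id.

```sql
-- Hypothetical schema: TableA(id, name), TableB(a_id, detail).

-- Find the keys that occur more than once in TableB.
SELECT a_id, COUNT(*) AS row_count
FROM TableB
GROUP BY a_id
HAVING COUNT(*) > 1;

-- Option 1: collapse TableB to one row per key before joining.
SELECT a.id, a.name, b.detail_count
FROM TableA AS a
LEFT JOIN (
    SELECT a_id, COUNT(*) AS detail_count
    FROM TableB
    GROUP BY a_id
) AS b ON b.a_id = a.id;

-- Option 2: select only TableA columns and remove the duplicates.
SELECT DISTINCT a.id, a.name
FROM TableA AS a
LEFT JOIN TableB AS b ON b.a_id = a.id;
```

Option 2 only helps when no TableB columns are needed in the output; otherwise the rows differ and DISTINCT keeps them all.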