How to optimize a PostgreSQL query that contains an INNER JOIN

I am trying to create a view that contains suspicious orders from an orders table.
An order is suspicious if it is a new order (placed within an interval) that has the "New Customer" tag, used a discount code (matched on _sdc_sequence against the table orders__discount_codes), and shares either its zip code or its phone number with an order placed before that interval.
My attempt:
First, I created a view containing all old orders (older than the 2-day interval) with the "New Customer" tag:
CREATE OR REPLACE VIEW schema.old_orders_view AS
SELECT odr.id, odr.customer__id,odr.name, odr.billing_address__phone, odr.shipping_address__zip,odr.order_number, odr.updated_at
FROM schema.orders odr, schema.orders__discount_codes odc
WHERE odr._sdc_sequence=odc._sdc_sequence
AND
odr.updated_at<now() - interval '2 day'
AND
odr.tags LIKE'%New Customer%'
AND
odr.cancelled_at is null
AND
odr.confirmed ='t';
Then I created a view containing the new orders (within the last 2 days):
CREATE OR REPLACE VIEW schema.new_orders_view AS
SELECT odr.id, odr.customer__id,odr.name, odr.billing_address__phone, odr.shipping_address__zip,odr.order_number, odr.updated_at
FROM schema.orders odr, schema.orders__discount_codes odc
WHERE odr._sdc_sequence=odc._sdc_sequence
AND
odr.updated_at>=now() - interval '2 day'
AND
odr.tags LIKE'%New Customer%'
AND
odr.cancelled_at is null
AND
odr.confirmed ='t';
Finally, I inner joined the two views:
CREATE OR REPLACE VIEW schema.suspicious_orders_view AS
SELECT n_odr.customer__id new_customer__id,n_odr.name new_name,o_odr.customer__id old_customer__id,o_odr.name old_name,o_odr.updated_at old_updated_at,n_odr.updated_at new_updated_at, o_odr.id old_id, n_odr.id new_id
FROM
schema.new_orders_view n_odr, schema.old_orders_view o_odr
WHERE
o_odr.billing_address__phone=n_odr.billing_address__phone
OR
o_odr.shipping_address__zip=n_odr.shipping_address__zip;
What I need is the third view (suspicious_orders_view).
Is there any way to optimize these queries? The orders table contains more than 100K records, and every day there are 50-100 new records in new_orders_view.
A single query without the two helper views would be better, but optimizing the existing approach would also be fine.
I use this view in my application, and when I try to connect it to Google Data Studio I get this error:
ERROR:
Unable to Connect Host: An I/O error occurred while sending to the backend.
So I think optimizing the query is the appropriate fix.
I am using PostgreSQL 10.
Any help would be appreciated. Thank you in advance.
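
One possible direction, sketched here rather than measured: an OR across two columns in a join condition typically prevents a hash or merge join, so the planner falls back to comparing every new order with every old order. Splitting the OR into two equality joins combined with UNION gives each branch a plain equi-join, and a CTE removes the need for the two helper views. The sketch below runs against an in-memory SQLite database purely so that it is self-contained; the sample rows, the fixed cutoff value, and the use of EXISTS for the discount-code check are assumptions made for illustration. In PostgreSQL the cutoff would be now() - interval '2 day' as in the question.

```python
import sqlite3

# In-memory stand-in for the real schema; rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer__id INTEGER,
    name TEXT,
    billing_address__phone TEXT,
    shipping_address__zip TEXT,
    updated_at TEXT,
    tags TEXT,
    cancelled_at TEXT,
    confirmed INTEGER,
    _sdc_sequence INTEGER
);
CREATE TABLE orders__discount_codes (_sdc_sequence INTEGER);

INSERT INTO orders VALUES
  (1, 10, '#1001', '555-0100', '11111', '2024-01-01', 'New Customer', NULL, 1, 1),
  (2, 11, '#1002', '555-0101', '22222', '2024-01-02', 'New Customer', NULL, 1, 2),
  (3, 12, '#1003', '555-0100', '33333', '2024-01-10', 'New Customer', NULL, 1, 3),
  (4, 13, '#1004', '555-0199', '22222', '2024-01-10', 'New Customer', NULL, 1, 4),
  (5, 14, '#1005', '555-0188', '44444', '2024-01-10', 'New Customer', NULL, 1, 5);
INSERT INTO orders__discount_codes VALUES (1), (2), (3), (4), (5);
""")

# Assumption: a fixed cutoff replaces now() - interval '2 day' so the demo is repeatable.
CUTOFF = "2024-01-08"

QUERY = """
WITH candidates AS (
    -- Filters shared by the old and new sets are applied once.
    SELECT o.id, o.billing_address__phone, o.shipping_address__zip, o.updated_at
    FROM orders o
    WHERE o.tags LIKE '%New Customer%'
      AND o.cancelled_at IS NULL
      AND o.confirmed = 1
      AND EXISTS (SELECT 1 FROM orders__discount_codes d
                  WHERE d._sdc_sequence = o._sdc_sequence)
),
new_orders AS (SELECT * FROM candidates WHERE updated_at >= :cutoff),
old_orders AS (SELECT * FROM candidates WHERE updated_at <  :cutoff)
-- Branch 1: same phone number.
SELECT n.id AS new_id, o.id AS old_id
FROM new_orders n
JOIN old_orders o ON o.billing_address__phone = n.billing_address__phone
UNION
-- Branch 2: same zip code. UNION drops pairs that match on both.
SELECT n.id, o.id
FROM new_orders n
JOIN old_orders o ON o.shipping_address__zip = n.shipping_address__zip
ORDER BY new_id, old_id
"""

suspicious = conn.execute(QUERY, {"cutoff": CUTOFF}).fetchall()
print(suspicious)
```

Order 3 shares a phone number with order 1 and order 4 shares a zip code with order 2, so those two pairs are reported. On the real table, indexes on billing_address__phone and shipping_address__zip may help each branch further, but that should be checked with EXPLAIN ANALYZE.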

Related

Creating a column that returns date based on various conditions

Context: I'm fairly new to coding as a whole and am learning SQL. This is one of my practice/training sessions.
I'm trying to create a dimension table called "Employee Info" using the public AdventureWorks2019 database. Below is the query I wrote to fetch all the data needed for this table.
SELECT
e.BusinessEntityID AS EmployeeID,
EEKey = ROW_NUMBER() OVER(ORDER BY(SELECT NULL)),
p.FirstName,
p.MiddleName,
p.LastName,
p.PersonType,
e.Gender,
e.JobTitle,
ep.Rate,
ep.PayFrequency,
e.BirthDate,
e.HireDate,
ep.RateChangeDate AS PayFrom,
e.MaritalStatus
From HumanResources.Employee AS e FULL JOIN
Person.Person AS p ON p.BusinessEntityID = e.BusinessEntityID FULL JOIN
Person.BusinessEntityAddress AS bea ON bea.BusinessEntityID = e.BusinessEntityID FULL JOIN
HumanResources.EmployeePayHistory AS ep ON ep.BusinessEntityID = e.BusinessEntityID
Where
PersonType='SP'
OR PersonType='EM'
ORDER BY EmployeeID;
Query result
Each employee (EE for short) will have a unique [EmployeeID]. The [EEKey] is simply used to mark ordinal numbers of each record.
EEs are paid different rates shown in the [Rate] column. There will be duplicate records if any EE receives a change in his/her pay rate.
There is currently a [PayFrom] column indicating the first date a pay rate is being applied to each record.
Current requirements: Create a [PayTo] column on the right of [PayFrom] to return the last date each EE is getting paid their corresponding pay rate. There should be 2 scenarios:
If the EE being checked has multiple records, meaning his/her pay rate was adjusted at some point, [PayTo] will return the [PayFrom] date of the next record minus 1 day.
If the EE being checked does not have any additional record indicating a pay rate change, [PayTo] will return a fixed date that was specified (say 31/12/2070).
Example:
[EmployeeID] no. 4 - Rob Walters with 3 consecutive records in Line 4,5,6. In Line 4, the [PayTo] column is expected to return the [PayFrom] date of Line 5 minus 1 day (2010-05-30). The same rule should be applied for Line 5, returning (2011-12-14).
As for Line 6, since there is no additional similar record to fetch data from, it will return the specified date (2070-12-31), using the same rule as every single-record EE.
As I have mentioned, I am a fresher and completely new to coding, so my interpretation and method might be off. If you can kindly point out what I'm doing wrong or show me what should I do to solve this issue, it will be much appreciated.
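
The two scenarios map fairly directly onto the LEAD window function: it returns the next row's value within the same employee, or NULL when there is no next row, and COALESCE can then supply the fixed date. The sketch below is only an illustration run against in-memory SQLite (3.25 or later for window functions); the pay_history table and the rates are made-up stand-ins, and the dates for employee 4 are chosen to match the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-in for the joined employee/pay-history rows.
CREATE TABLE pay_history (EmployeeID INTEGER, Rate REAL, PayFrom TEXT);
INSERT INTO pay_history VALUES
  (3, 43.27, '2007-12-12'),
  (4,  8.62, '2007-12-05'),
  (4, 23.72, '2010-05-31'),
  (4, 29.85, '2011-12-15');
""")

rows = conn.execute("""
SELECT EmployeeID,
       PayFrom,
       COALESCE(
           -- Next PayFrom for the same employee, minus one day (NULL if none).
           date(LEAD(PayFrom) OVER (PARTITION BY EmployeeID ORDER BY PayFrom),
                '-1 day'),
           -- Fixed end date for the latest (or only) record.
           '2070-12-31'
       ) AS PayTo
FROM pay_history
ORDER BY EmployeeID, PayFrom
""").fetchall()

for row in rows:
    print(row)
```

In SQL Server the same idea would be written with DATEADD(day, -1, LEAD(ep.RateChangeDate) OVER (PARTITION BY e.BusinessEntityID ORDER BY ep.RateChangeDate)) wrapped in ISNULL or COALESCE; LEAD is available from SQL Server 2012 onward.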

How to filter data based on a time parameter in Access?

I have a query from another thread which goes through a list of different events and pulls out the most recent event and puts it into a list. The code I'm using is:
SELECT Cleaning1, Max(Date1) AS most_recent
FROM CleaningLog
GROUP BY Cleaning1;
Cleaning1 is the column that holds the different cleanings, Date1 is the column that holds the date the cleaning occurred, and CleaningLog is the name of the table. I currently have a macro in Access with an OpenQuery action. It opens the above query in datasheet view, in edit mode.
What I am stuck on is getting a subsequent macro, query, or VBA code to take the datasheet the query produces, go through each item, and determine whether it is overdue to be cleaned. I tried a Make Table query, but there is no user-friendly way to refresh that table without deleting it (unskilled workers will be using this Access database).
I am wondering if there's a way to look at the most recent cleaning date that the query produces and keep only the cleanings that are overdue, as specified by a parameter. I have been looking at this webpage to start playing with the notation, but I haven't been able to come up with much that is useful.
https://support.office.com/en-us/article/Examples-of-query-criteria-3197228C-8684-4552-AC03-ABA746FB29D8
Another problem I am encountering is that each cleaning doesn't have the same time frame in which it needs to be done.
Thank you in advance for any help!!

You should just be able to modify the query above to show entries with a max date lower than they should be. Below shows entries that haven't been cleaned in 30 days, for instance.
SELECT Cleaning1, Max(Date1) AS most_recent
FROM CleaningLog
GROUP BY Cleaning1
HAVING Max(Date1) < Now() - 30;
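
Since the question also mentions that each cleaning has its own time frame, the fixed 30 could come from a small lookup table instead. The following is a rough sketch of that idea using in-memory SQLite so that it can be run as-is; the CleaningSchedule table, its IntervalDays column, the sample rows, and the fixed "today" value are all assumptions and not part of the original database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CleaningLog (Cleaning1 TEXT, Date1 TEXT);
-- Hypothetical lookup table: how often each cleaning is due, in days.
CREATE TABLE CleaningSchedule (Cleaning1 TEXT PRIMARY KEY, IntervalDays INTEGER);

INSERT INTO CleaningLog VALUES
  ('Floors',  '2024-03-01'), ('Floors', '2024-03-20'),
  ('Vents',   '2024-01-15'),
  ('Windows', '2024-02-01');
INSERT INTO CleaningSchedule VALUES ('Floors', 7), ('Vents', 90), ('Windows', 30);
""")

# Assumption: a fixed date stands in for Now() so the output is repeatable.
TODAY = "2024-03-31"

overdue = conn.execute("""
SELECT l.Cleaning1, MAX(l.Date1) AS most_recent
FROM CleaningLog l
JOIN CleaningSchedule s ON s.Cleaning1 = l.Cleaning1
GROUP BY l.Cleaning1, s.IntervalDays
-- Overdue when the days since the last cleaning exceed that cleaning's interval.
HAVING julianday(:today) - julianday(MAX(l.Date1)) > s.IntervalDays
ORDER BY l.Cleaning1
""", {"today": TODAY}).fetchall()

print(overdue)
```

In Access itself the HAVING clause would presumably look like HAVING Max(Date1) < Now() - s.IntervalDays, following the pattern of the query above; because it is a saved select query rather than a Make Table query, it is recalculated each time it is opened.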

Checking for rows relating to previous days which might not exist

I am having some trouble checking whether or not I received prices yesterday for, let's say, my apples.
The tricky part is that in the table where prices are stored, there won't be any row relating to yesterday if I did not get prices yesterday. So how can I run my check every day if I want to be sure that I got some prices the day before?

If you have a Calendar table (see here for example) with a field called Date and making some assumptions about your data structure:
SELECT c.[Date],
ISNULL(p.Prices,'No Prices')
FROM Calendar c
LEFT JOIN Prices p ON c.[Date] = p.[Date]
Your question is not very clear, but it actually might even be as simple as just checking for the presence of a row for the previous day, rather than reporting across all dates (in this case I consider there are multiple products):
SELECT DISTINCT
prod.Product,
CASE WHEN prev.Product IS NULL
THEN 'No Prices for yesterday'
ELSE 'Prices recorded for yesterday'
END AS PricesYesterday
FROM Prices prod
LEFT JOIN Prices prev ON prev.Product = prod.Product
AND prev.[Date] = dateadd(day,datediff(day,0,GETDATE()),0) - 1
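
To see the second query's behaviour on concrete data, here is a small runnable sketch of the same LEFT JOIN idea against in-memory SQLite. The table contents and the fixed "today" value are assumptions; SQLite's date(:today, '-1 day') plays the role of the GETDATE() arithmetic above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Prices (Product TEXT, Date TEXT, Price REAL);
INSERT INTO Prices VALUES
  ('Apples', '2024-03-29', 1.20),
  ('Apples', '2024-03-30', 1.25),
  ('Pears',  '2024-03-29', 2.10);
""")

# Assumption: fixed "today" so that "yesterday" is 2024-03-30.
TODAY = "2024-03-31"

status = conn.execute("""
SELECT prod.Product,
       CASE WHEN prev.Product IS NULL
            THEN 'No Prices for yesterday'
            ELSE 'Prices recorded for yesterday'
       END AS PricesYesterday
FROM (SELECT DISTINCT Product FROM Prices) AS prod
-- The date filter sits in the ON clause, so products without a row for
-- yesterday are kept with NULLs instead of being filtered out.
LEFT JOIN Prices AS prev
       ON prev.Product = prod.Product
      AND prev.Date = date(:today, '-1 day')
ORDER BY prod.Product
""", {"today": TODAY}).fetchall()

print(status)
```

One limitation of this shape: a product that has never had any price row will not appear at all, so a separate product list would be needed to catch that case.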

Crystal Reports Sum per date from 2 different reports

I have spent several hours trying to google my issue with no luck. I was wondering if anyone here would know how to do this.
I have 2 separate unrelated tables:
The first table has bank deposits(all deposits for a specific day) with the amounts example:
1/4/16 $10
1/4/16 $20
1/5/16 $15
1/5/16 $25
The second table has transactions from my billing software example:
1/4/16 $5
1/4/16 $12
1/4/16 $17
1/5/16 $22
1/5/16 $2
1/5/16 $4
I need to create a report, so that I can pull the sum for each day-
1/4/16 - first table sum: $30 - second table sum: $34
1/5/16 - first table sum: $40 - second table sum: $28
Is this possible? If so, how can I do this? I can get the sums for each table separately by grouping on the specific date field, but I cannot figure out how to do both at the same time.

If you don't want to do much on the database side, create two sub-reports.
Group the data on date in both sub-reports (suppress the details and header if you don't need them).
In each sub-report, place the group name, then the first/second table sum, then a grand total summary for your amount field.
Place the two sub-reports side by side and align them on the baseline.
This will hopefully work for you if the minimum date in both tables is the same.
If the minimum dates differ, you can create a view or stored procedure that combines the data and just group it on date.
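
A rough sketch of what such a view's query could look like, run here against in-memory SQLite with the sample figures from the question. The table and column names are made up, and the dates are read as January 4 and 5, 2016, which is an assumption about the format. Stacking the two tables with UNION ALL before grouping avoids multiplying rows, which a direct join on date would do when a day has several rows in each table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table names; amounts are the sample values from the question.
CREATE TABLE deposits (dt TEXT, amount INTEGER);
CREATE TABLE billing  (dt TEXT, amount INTEGER);
INSERT INTO deposits VALUES
  ('2016-01-04', 10), ('2016-01-04', 20),
  ('2016-01-05', 15), ('2016-01-05', 25);
INSERT INTO billing VALUES
  ('2016-01-04', 5), ('2016-01-04', 12), ('2016-01-04', 17),
  ('2016-01-05', 22), ('2016-01-05', 2), ('2016-01-05', 4);
""")

totals = conn.execute("""
SELECT dt,
       SUM(deposit_amount) AS deposit_sum,
       SUM(billing_amount) AS billing_sum
FROM (
    -- Each source fills its own column and contributes 0 to the other.
    SELECT dt, amount AS deposit_amount, 0 AS billing_amount FROM deposits
    UNION ALL
    SELECT dt, 0, amount FROM billing
) AS combined
GROUP BY dt
ORDER BY dt
""").fetchall()

print(totals)
```

A date that appears in only one of the tables still shows up, with 0 for the other sum, so the two tables do not need to cover the same date range.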

You can link both tables with a join on the date column, group by date, and put the summary in the group footer.
Edit:
I assume your last record will be duplicated, so use a running total.
Create a new running total:
First, select the field to summarize.
Second, under Evaluate, select the "for each record" radio button.
Third, under Reset, select "on change of" your calculated field.
For example, if you are summarizing the amount field:
Field to summarize: select amount
Evaluate: select for every record
Reset: on change of field amount

Reporting on multiple tables independently in Crystal Reports 11

I am using Crystal Reports Developer Studio to create a report that reports on two different tables, let them be "ATable" and "BTable". For my simplest task, I would like to report the count of each table by using Total Running Fields. I created one for ATable (Called ATableTRF) and when I post it on my report this is what happens:
1) The SQL Query (Show SQL Query) shows:
SELECT "ATABLE"."ATABLE_KEY"
FROM "DB"."ATABLE" "ATABLE"
2) The total records read is the number of records in ATable.
3) The number I get is correct (total records in ATable).
Same goes for BTableTRF, if I remove ATableTRF I get:
1) The SQL Query (Show SQL Query) shows:
SELECT "BTABLE"."BTABLE_KEY"
FROM "DB"."BTABLE" "BTABLE"
2) The total records read is the number of records in BTable.
3) The number I get is correct (total records in BTable).
The problem starts when I put both fields on the report. I then get the two queries one after another (since the tables are not linked in Crystal Reports):
SELECT "ATABLE"."ATABLE_KEY"
FROM "DB"."ATABLE" "ATABLE"
SELECT "BTABLE"."BTABLE_KEY"
FROM "DB"."BTABLE" "BTABLE"
And the number of records read is far larger than either table, and it doesn't stop. I would verify that it is count(ATable) x count(BTable), but that would probably exceed my computer's limits (one table has around 300k rows, the other around 900k).
I would just like to report the counts of the two tables. No interaction is needed, but Crystal somehow enforces one.
Can anyone help with that?
Thanks!

Unless there is some join describing the two tables' relationship, then the result will be a Cartesian product. Try just using two subqueries, either via a SQL Command or as individual SQL expressions, to get the row counts. Ex:
select count(distinct ATABLE_KEY) from ATABLE
If you're not interested in anything else in these tables aside from the row counts, then there's no reason to bring all those rows into Crystal - better to do the heavy lifting on the RDBMS.
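
The difference can be seen on a toy example. The sketch below uses in-memory SQLite with a few made-up rows standing in for the two tables: selecting from both without a join condition yields the product of the row counts, while two scalar subqueries return each count independently in a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Tiny stand-ins for the real tables (3 and 5 rows).
CREATE TABLE ATABLE (ATABLE_KEY INTEGER PRIMARY KEY);
CREATE TABLE BTABLE (BTABLE_KEY INTEGER PRIMARY KEY);
INSERT INTO ATABLE VALUES (1), (2), (3);
INSERT INTO BTABLE VALUES (1), (2), (3), (4), (5);
""")

# Unlinked tables: every ATABLE row is paired with every BTABLE row.
cross_rows = conn.execute("SELECT COUNT(*) FROM ATABLE, BTABLE").fetchone()[0]

# Two independent scalar subqueries: one result row, no interaction.
counts = conn.execute("""
SELECT (SELECT COUNT(*) FROM ATABLE) AS a_count,
       (SELECT COUNT(*) FROM BTABLE) AS b_count
""").fetchone()

print(cross_rows)  # 3 * 5 rows read
print(counts)
```

With roughly 300k and 900k rows, the unlinked version would have to read on the order of 270 billion combinations, which matches the report never finishing.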

You could UNION the two queries. This would give you one record set containing rows from each query once.
