Npgsql - FULL OUTER JOIN on two unrelated tables - postgresql

I have two tables: point_transactions, which records how users earned and spent their in-app points, and wallet_transactions, which records how users earned and spent their wallet money (real money). The two tables have no direct relation to each other. Both have a created_on column that records when each row was created. I need to display a table that shows the history of a user's transactions (both point and wallet). This table is sorted by the creation time of the transaction and is paged, so it is better to fetch a paged result from the database than to load all the data into memory.
The following query gives me what I want:
select *,
case
when pt.id is null then wt.created_on
else pt.created_on
end as tx_created_on
from point_transactions as pt
full outer join wallet_transactions as wt on false
order by tx_created_on desc
Is there any way I can get this with EF Core?
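One way to read the query above: because the join condition is always false, every row of either table appears exactly once, padded with NULLs for the other table's columns. A minimal sketch of the same history expressed as a UNION ALL with LIMIT/OFFSET paging follows. It uses Python's sqlite3 module as an in-memory stand-in for PostgreSQL; the user_id, points and amount columns, the sample rows and the history_page helper are assumptions for illustration, not part of the original schema. In EF Core, concatenating two same-shaped projections with LINQ Concat is, as far as I know, the usual route to a UNION ALL, but that mapping is not shown or verified here.

```python
import sqlite3

# In-memory stand-in for the two tables; columns other than id and
# created_on are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE point_transactions  (id INTEGER PRIMARY KEY, user_id INTEGER,
                                      points INTEGER, created_on TEXT);
    CREATE TABLE wallet_transactions (id INTEGER PRIMARY KEY, user_id INTEGER,
                                      amount REAL, created_on TEXT);
    INSERT INTO point_transactions  VALUES (1, 7, 50, '2023-01-01'),
                                           (2, 7, -20, '2023-01-03');
    INSERT INTO wallet_transactions VALUES (1, 7, 9.99, '2023-01-02'),
                                           (2, 7, -4.50, '2023-01-04');
""")


def history_page(user_id, page, page_size):
    """Return one page of a user's combined history, newest first."""
    sql = """
        SELECT 'point' AS kind, id, points AS value, created_on AS tx_created_on
        FROM point_transactions
        WHERE user_id = ?
        UNION ALL
        SELECT 'wallet', id, amount, created_on
        FROM wallet_transactions
        WHERE user_id = ?
        ORDER BY tx_created_on DESC
        LIMIT ? OFFSET ?
    """
    params = (user_id, user_id, page_size, page * page_size)
    return conn.execute(sql, params).fetchall()


first_page = history_page(7, 0, 3)
second_page = history_page(7, 1, 3)
```

The sorting and paging happen inside the database, so only one page of rows is ever loaded into memory.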

Related

PostgreSQL how to GROUP BY single field from returned table

So I have complicated query, to simplify let it be like
SELECT
t.*,
SUM(a.hours) AS spent_hours
FROM (
SELECT
person.id,
person.name,
person.age,
SUM(contacts.id) AS contact_count
FROM
person
JOIN contacts ON contacts.person_id = person.id
) AS t
JOIN activities AS a ON a.person_id = t.id
GROUP BY t.id
Such a query works fine in MySQL, but Postgres needs to know that the GROUP BY field is unique. Even though it actually is, in this case I would need to GROUP BY every field returned from the t table.
I can do that, but I don't believe it will perform well on big data.
I can't JOIN with activities directly in the first query, because a person can have several contacts, which would make the query count the activity hours once for every joined contact.
Is there a Postgres way to make this query work? Maybe by forcing Postgres to treat t.id as unique, or some other solution that achieves the same thing the Postgres way?
This query will not work on either database system as written: the inner query contains an aggregate function but no GROUP BY (unless you use window functions). MySQL has a special case: it accepts this if the ONLY_FULL_GROUP_BY SQL mode is disabled. So MySQL allows this usage because of a server setting, but you cannot do that in PostgreSQL.
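A minimal sketch of one way around both problems (the ungrouped inner aggregate and the double counting): aggregate contacts and activities separately, each grouped by person_id, and join the two one-row-per-person results back to person. It runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for Postgres. The sample rows are made up, and COUNT(*) is used for contact_count on the assumption that a count was intended rather than SUM(contacts.id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person     (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE contacts   (id INTEGER PRIMARY KEY, person_id INTEGER);
    CREATE TABLE activities (id INTEGER PRIMARY KEY, person_id INTEGER,
                             hours INTEGER);
    INSERT INTO person VALUES (1, 'Smith', 40), (2, 'Jones', 35);
    INSERT INTO contacts (person_id) VALUES (1), (1), (2);
    INSERT INTO activities (person_id, hours) VALUES (1, 20), (1, 100), (2, 30);
""")

# Each subquery yields at most one row per person, so joining them cannot
# multiply activity hours by the number of contacts.
rows = conn.execute("""
    SELECT p.id, p.name, p.age, c.contact_count, a.spent_hours
    FROM person AS p
    JOIN (SELECT person_id, COUNT(*) AS contact_count
          FROM contacts
          GROUP BY person_id) AS c ON c.person_id = p.id
    JOIN (SELECT person_id, SUM(hours) AS spent_hours
          FROM activities
          GROUP BY person_id) AS a ON a.person_id = p.id
    ORDER BY p.id
""").fetchall()
```

Because the outer query has no aggregate of its own, it needs no GROUP BY over all of the person columns.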
I knew MySQL allowed indeterminate grouping, but I honestly never knew how it implemented it... it always seemed imprecise to me, conceptually.
So depending on what that means (I'm too lazy to look it up), you might need one of two possible solutions, or maybe a third.
If your intent is to see all rows (perform the aggregate function but not consolidate/group rows), then you want a window function, invoked with partition by. Here is a really dumbed-down version in your query:
SELECT
t.*,
SUM (a.hours) over (partition by t.id) AS spent_hours
FROM t
JOIN activities AS a ON a.person_id = t.id
This means you want all records in table t, not one record per t.id, but each row will also contain the sum of the hours over all rows that share its value of id.
For example the sum column would look like this:
Name   Hours  Sum Hours
-----  -----  ---------
Smith     20        120
Jones     30         30
Smith    100        120
Whereas a group by would have had Smith once and could not have displayed the hours column in detail.
If you really did only want one row per t.id, then Postgres will require you to tell it how to determine which row. In the example above for Smith, do you want to see the 20 or the 100?
There is another possibility, but I think I'll let you reply first. My gut tells me option 1 is what you're after and you want the analytic function.
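To make the difference concrete, here is a small sketch that reproduces the Smith/Jones table above with a window function and then shows what a plain GROUP BY returns for the same data. It uses Python's sqlite3 module with an in-memory database as a stand-in for Postgres (window functions need SQLite 3.25 or later); the two-column layout of t and the sample rows are assumptions chosen to match the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE activities (person_id INTEGER, hours INTEGER);
    INSERT INTO t VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO activities VALUES (1, 20), (2, 30), (1, 100);
""")

# Window function: every joined row is kept, and each row carries the
# total for its own t.id.
windowed = conn.execute("""
    SELECT t.name, a.hours,
           SUM(a.hours) OVER (PARTITION BY t.id) AS sum_hours
    FROM t
    JOIN activities AS a ON a.person_id = t.id
    ORDER BY t.name, a.hours
""").fetchall()

# GROUP BY: one row per t.id, so the per-activity hours are no longer visible.
grouped = conn.execute("""
    SELECT t.name, SUM(a.hours) AS sum_hours
    FROM t
    JOIN activities AS a ON a.person_id = t.id
    GROUP BY t.id, t.name
    ORDER BY t.name
""").fetchall()
```

The windowed result has three rows (Smith appears twice with 120 on both), while the grouped result has two.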

What could be the reason for "heavy IO" activity in the database when SQL Server Change Tracking is enabled?

I am running some tests to quantify the performance and reliability of the Change Tracking feature of SQL Server. I have a single table, t1, into which I insert 1 million rows, once with Change Tracking off and once with Change Tracking on. I am monitoring the size of syscommittab, the size of the change tracking table, and the I/Os recorded against the database. As expected, the change tracking table and syscommittab only get populated when Change Tracking is on, and I expected the I/Os recorded against the database to be roughly proportional to the sizes of these tables. To my surprise, they are way off: the I/Os recorded against the database are many orders of magnitude larger than the sizes of these two tables. Does anyone know why, or can anyone give me pointers for figuring it out? I am using sys.dm_io_virtual_file_stats() to determine the I/O activity on the database, and the following query to determine the sizes of the tracked table and syscommittab.
SELECT sct1.name as CT_schema,
sot1.name as CT_table,
ps1.row_count as CT_rows,
ps1.reserved_page_count*8./1024. as CT_reserved_MB,
sct2.name as tracked_schema,
sot2.name as tracked_name,
ps2.row_count as tracked_rows,
ps2.reserved_page_count*8./1024. as tracked_base_table_MB,
change_tracking_min_valid_version(sot2.object_id) as min_valid_version
FROM sys.internal_tables it
JOIN sys.objects sot1 on it.object_id=sot1.object_id
JOIN sys.schemas AS sct1 on
sot1.schema_id=sct1.schema_id
JOIN sys.dm_db_partition_stats ps1 on
it.object_id = ps1.object_id
and ps1.index_id in (0,1)
LEFT JOIN sys.objects sot2 on it.parent_object_id=sot2.object_id
LEFT JOIN sys.schemas AS sct2 on
sot2.schema_id=sct2.schema_id
LEFT JOIN sys.dm_db_partition_stats ps2 on
sot2.object_id = ps2.object_id
and ps2.index_id in (0,1)
WHERE it.internal_type IN (209, 210)
and (sot2.name='t1' or sot1.name='syscommittab')
I am checkpointing before running the queries.
Any tip or pointer appreciated.
Acknowledgements to https://www.brentozar.com/archive/2014/06/performance-tuning-sql-server-change-tracking/ for the SQL above.

SQL Natural Join

Okay. So the question that I got asked by the teacher was this:
(5 marks) Construct a SQL query on the dvdrental database that uses a natural join of two or more tables and an additional where condition. (E.g. find the titles of films rented by a particular customer.) Note the hints on the course news page if your query returns nothing.
Here is the layout of the database I'm working with:
http://www.postgresqltutorial.com/wp-content/uploads/2013/05/PostgreSQL-Sample-Database.png
The hint to us was this:
PostgreSQL hint:
If a natural join doesn't produce any results in the dvdrental DB, it is because many tables have a last_update timestamp field, and thus the natural join tries to join on that field as well as the intended field.
e.g.
select *
from film natural join inventory;
does not work because of this - it produces an empty table (no results).
Instead, use
select *
from film, inventory
where film.film_id = inventory.film_id;
This is what I did:
select *
from film, customer
where film.film_id = customer.customer_id;
The problem is that I cannot get a particular customer.
I tried adding customer_id = 2; but it returns an error.
I really need help!
Well, it seems that you are trying to join two tables that have no direct relation to each other. This is your issue:
where film.film_id = customer.customer_id
To find which films were rented by which customer, you would have to join the customer table with rental, then with inventory, and finally with film.
The task description states
Construct a SQL query on the dvdrental database that uses a natural join of two or more tables and an additional where condition.
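The join path from the answer (customer, then rental, then inventory, then film) can still be written with natural joins if each table is first projected down to the columns that are supposed to match, which keeps last_update out of the comparison. The sketch below is only an illustration: it builds a tiny in-memory SQLite imitation of four dvdrental tables through Python's sqlite3 module, with made-up rows and only the columns needed here, so it is not the real sample database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (customer_id INTEGER PRIMARY KEY, first_name TEXT,
                            last_update TEXT);
    CREATE TABLE film      (film_id INTEGER PRIMARY KEY, title TEXT,
                            last_update TEXT);
    CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER,
                            last_update TEXT);
    CREATE TABLE rental    (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER,
                            customer_id INTEGER, last_update TEXT);
    INSERT INTO customer  VALUES (1, 'Alice', '2020-01-01'),
                                 (2, 'Bob',   '2020-01-01');
    INSERT INTO film      VALUES (10, 'Film A', '2020-02-01'),
                                 (11, 'Film B', '2020-02-01');
    INSERT INTO inventory VALUES (100, 10, '2020-03-01'),
                                 (101, 11, '2020-03-01');
    INSERT INTO rental    VALUES (1000, 100, 2, '2020-04-01'),
                                 (1001, 101, 1, '2020-04-01');
""")

# NATURAL JOIN matches on every shared column name, so film_id AND
# last_update must both be equal; with these rows nothing matches.
natural = conn.execute(
    "SELECT * FROM film NATURAL JOIN inventory"
).fetchall()

# Projecting away last_update leaves only the intended key shared at each step.
titles = conn.execute("""
    SELECT f.title
    FROM (SELECT customer_id, first_name FROM customer) AS c
    NATURAL JOIN (SELECT customer_id, inventory_id FROM rental) AS r
    NATURAL JOIN (SELECT inventory_id, film_id FROM inventory) AS i
    NATURAL JOIN (SELECT film_id, title FROM film) AS f
    WHERE c.customer_id = 2
""").fetchall()
```

The where condition filters on customer_id after the joins, which is the "particular customer" part of the task.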

How do I prevent removing duplicate records from my Access query results?

The Data
I'm working in MS Access 2013. I have two tables ('Import' and 'Import-Pay'). I have a query that combines data from the two.
Import-Pay contains transaction data from a client, which includes the occasional duplicate transaction record (example: customer buys something for +$50, customer returns it for -$50, customer changes their mind and buys it again for +$50). It's rare, but it happens. My issue is that, when creating my client's billing report (the query), I end up with TWO +$50 records in the Import-Pay table, because the client records only the date of sale in the transaction.
The Query
I am querying the transaction data and marrying it with secondary table information on the customers via the query below:
SELECT DISTINCTROW Import.[ACCOUNT#] AS [ACCOUNT#], [Import-Pay].[Account Number], [Import-Pay].[Name], [Import-Pay].[P TRANS DT], [Import-Pay].[P Trans Amt], [Import-Pay].[Total Account Balance]
FROM Import RIGHT JOIN [Import-Pay] ON Import.[CD#] = [Import-Pay].[Account Number]
GROUP BY Import.[ACCOUNT#], [Import-Pay].[Account Number], [Import-Pay].[Name], [Import-Pay].[P TRANS DT], [Import-Pay].[P Trans Amt], [Import-Pay].[Total Account Balance];
My Issue
The tables are RIGHT joined, so ALL records from my 'Import-Pay' table should be displayed... but for some reason the duplicate records of Import-Pay are lost after the query runs, giving me a different total [Trans Amt].
Troubleshooting
I've double-checked my table join to make sure that's not the issue.
I've tried removing the GROUP BY clause.
I've tried removing the DISTINCTROW predicate.
I've messed with this for two days now and I'm out of ideas. A fresh set of eyes on the problem would be greatly appreciated!
Thanks!
You don't have any Aggregate functions, so get rid of the GROUP BY clause. Also remove DISTINCTROW.
Your Import-Pay table should have a Primary Key. Include this column (if it is a composite key, all columns) in the SELECT list.
If it doesn't have a Primary Key, create one (an AutoNumber column works fine).
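A small sketch of why the key matters. It uses an in-memory SQLite database through Python's sqlite3 module rather than Access, with a cut-down import_pay table; the pay_id column, the other column names and the sample rows are hypothetical. Grouping on the visible columns merges the two identical +50 rows, while selecting the key without GROUP BY or DISTINCT keeps every stored row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE import_pay (
        pay_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- hypothetical surrogate key
        account_number TEXT,
        trans_dt TEXT,
        trans_amt REAL
    );
    INSERT INTO import_pay (account_number, trans_dt, trans_amt) VALUES
        ('A1', '2015-03-01', 50),
        ('A1', '2015-03-01', -50),
        ('A1', '2015-03-01', 50);
""")

# GROUP BY on all selected columns collapses rows that are identical in
# those columns, so one of the +50 rows disappears.
collapsed = conn.execute("""
    SELECT account_number, trans_dt, trans_amt
    FROM import_pay
    GROUP BY account_number, trans_dt, trans_amt
""").fetchall()

# Including the key makes every row distinct and no grouping is applied.
kept = conn.execute("""
    SELECT pay_id, account_number, trans_dt, trans_amt
    FROM import_pay
    ORDER BY pay_id
""").fetchall()

collapsed_total = sum(row[2] for row in collapsed)
kept_total = sum(row[3] for row in kept)
```

With these rows the collapsed result totals 0 while the full result totals 50, which is the kind of mismatch described in the question.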

Aggregate functions : the correct syntax in SQL query

I have a table that contains Customer No_ and Amount on each row. I want to write a report that shows total amount by customer, by salesperson. I can inner join another table to get the salesperson. Just can't figure out the Customer No_ and Amount query.
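I think you can just add sum(i.Amount) to your select and group by the remaining non-aggregated columns (the customer and the salesperson); the sum itself does not go in the group by. (Sorry, I don't have access to SQL Server at the moment to test this.)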
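A minimal sketch of that shape of query, run against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server. Only Customer No_ and Amount come from the question; the sales_lines and customers table names, the Salesperson Code column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_lines ("Customer No_" TEXT, Amount REAL);
    CREATE TABLE customers   ("No_" TEXT PRIMARY KEY, "Salesperson Code" TEXT);
    INSERT INTO customers VALUES ('C1', 'SP1'), ('C2', 'SP1'), ('C3', 'SP2');
    INSERT INTO sales_lines VALUES ('C1', 10), ('C1', 15), ('C2', 5), ('C3', 7);
""")

# SUM goes in the select list; every non-aggregated column goes in GROUP BY.
totals = conn.execute("""
    SELECT c."Salesperson Code", i."Customer No_", SUM(i.Amount) AS total_amount
    FROM sales_lines AS i
    INNER JOIN customers AS c ON c."No_" = i."Customer No_"
    GROUP BY c."Salesperson Code", i."Customer No_"
    ORDER BY c."Salesperson Code", i."Customer No_"
""").fetchall()
```

Each output row is one customer under one salesperson, with that customer's amounts added up.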