How to count specific column in psql? - postgresql

Sorry in advance for the basic question. I want to sum the child rows and count the number of transactions for each user, producing the following fields:
amount (total amount)
level (total level)
number of transactions (transaction count)
I have two related tables: one user has many transactions.
User Table
id
name
Transaction Table
id
user_id
amount
level
Here is the query I have tested, but it does not work as expected:
const query = `
SELECT
u.*,
't.amount',
't.level'
COUNT('t.amountDiffCents') as "numberOfTransaction",
SUM('t.level') as "Total Level",
COUNT(u.*) OVER () as "totalCount"
FROM "LoyaltyUser" as u
INNER JOIN "Transaction" as t ON u.id = 't.userId'
GROUP BY u.id
LIMIT $limit
OFFSET $offset;
`;
Thanks in advance.

Aggregate the Transactions table separately, then JOIN Users only afterwards:
SELECT
    u.*,
    agg.txn_count,
    agg.sum_level
FROM
    (
        SELECT
            t.user_id,
            COUNT(*) AS txn_count,
            SUM( t.level ) AS sum_level
        FROM
            transaction AS t
        GROUP BY
            t.user_id
    ) AS agg
    INNER JOIN loyalty_user AS u ON
        u.id = agg.user_id
Strictly speaking (ISO SQL) it is not possible to meaningfully include the total number of all transaction rows in this single result-set (at least, not without having a repeating value in every row, ew). Instead that can be trivially performed by application code - or use a second query in the same batch.
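As a rough illustration of the aggregate-then-join shape, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. The table names `loyalty_user` and `txn`, the extra `sum_amount` column, and the sample rows are assumptions made for the demo, not the asker's real schema; the overall total is fetched with a second query, as suggested above.

```python
import sqlite3

# Hypothetical in-memory schema mirroring the question's two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loyalty_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES loyalty_user (id),
        amount  INTEGER NOT NULL,
        level   INTEGER NOT NULL
    );
    INSERT INTO loyalty_user (id, name) VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO txn (user_id, amount, level) VALUES
        (1, 100, 1), (1, 250, 2), (2, 40, 5);
""")

# Aggregate per user first, then join the one-row-per-user result to the users.
per_user = conn.execute("""
    SELECT u.id, u.name, agg.txn_count, agg.sum_amount, agg.sum_level
    FROM (
        SELECT t.user_id,
               COUNT(*)      AS txn_count,
               SUM(t.amount) AS sum_amount,
               SUM(t.level)  AS sum_level
        FROM txn AS t
        GROUP BY t.user_id
    ) AS agg
    INNER JOIN loyalty_user AS u ON u.id = agg.user_id
    ORDER BY u.id
""").fetchall()

# Second query: the overall number of transaction rows, fetched separately.
total_transactions = conn.execute("SELECT COUNT(*) FROM txn").fetchone()[0]

print(per_user)
print(total_transactions)
```

Because the inner query already yields one row per user, the outer join cannot multiply rows, and no GROUP BY is needed on the user columns.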

Related

Using two COUNT in SELECT returns the same values

SELECT user_posts.id,
COUNT(user_post_comments.post_id) as number_of_comments,
COUNT(user_post_reactions.post_id) as number_of_reactions
FROM user_posts
LEFT JOIN user_post_comments
ON (user_posts.id = user_post_comments.post_id)
LEFT JOIN user_post_reactions
ON (user_posts.id = user_post_reactions.post_id)
WHERE user_posts.user_id = '850e6511-2f30-472d-95a1-59a02308b46a'
group by user_posts.id
I have this query to get the number of comments and reactions for each post from two other tables, matched by post_id, but both counts come back with the same value.
To calculate the number of comments and reactions, just use correlated subqueries. There is no need to join and group by.
SELECT user_posts.id,
( select COUNT(*) from user_post_comments
where user_posts.id = user_post_comments.post_id
) as number_of_comments,
( select COUNT(*) from user_post_reactions
where user_posts.id = user_post_reactions.post_id
) as number_of_reactions
FROM user_posts
WHERE user_posts.user_id = '850e6511-2f30-472d-95a1-59a02308b46a'
If both joins return non-null rows, each count is the product of the number of matching rows from the two joined tables for a given user_posts.id. One way to fix that is to count distinct identifiers from each table, e.g.
COUNT(DISTINCT user_post_comments.id) as number_of_comments
(assuming "id" exists as a primary key on that table). This may not be spectacularly efficient, but it is relatively simple.
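The row multiplication is easy to reproduce on a toy data set. The sketch below uses Python's sqlite3 module with made-up rows (post 1 has 2 comments and 3 reactions, post 2 has no comments and 1 reaction) and compares the original double join with the two fixes mentioned in the answers; the `id` columns on the child tables are assumed to exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_posts (id INTEGER PRIMARY KEY);
    CREATE TABLE user_post_comments  (id INTEGER PRIMARY KEY, post_id INTEGER);
    CREATE TABLE user_post_reactions (id INTEGER PRIMARY KEY, post_id INTEGER);
    INSERT INTO user_posts (id) VALUES (1), (2);
    -- Post 1: 2 comments, 3 reactions. Post 2: 0 comments, 1 reaction.
    INSERT INTO user_post_comments  (post_id) VALUES (1), (1);
    INSERT INTO user_post_reactions (post_id) VALUES (1), (1), (1), (2);
""")

# Shared FROM/JOIN/GROUP BY part of the two join-based variants.
JOINS = """
    FROM user_posts p
    LEFT JOIN user_post_comments  c ON c.post_id = p.id
    LEFT JOIN user_post_reactions r ON r.post_id = p.id
    GROUP BY p.id
    ORDER BY p.id
"""

# Original shape: each comment row is paired with each reaction row.
naive = conn.execute(
    "SELECT p.id, COUNT(c.post_id), COUNT(r.post_id)" + JOINS
).fetchall()

# Fix 1: count distinct child ids so the duplicated pairs collapse.
distinct = conn.execute(
    "SELECT p.id, COUNT(DISTINCT c.id), COUNT(DISTINCT r.id)" + JOINS
).fetchall()

# Fix 2: correlated subqueries, no join and no GROUP BY.
subqueries = conn.execute("""
    SELECT p.id,
           (SELECT COUNT(*) FROM user_post_comments  c WHERE c.post_id = p.id),
           (SELECT COUNT(*) FROM user_post_reactions r WHERE r.post_id = p.id)
    FROM user_posts p
    ORDER BY p.id
""").fetchall()

print(naive)       # post 1 shows 6 and 6 (2 x 3), not 2 and 3
print(distinct)
print(subqueries)
```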

GROUP BY one column, then by another column

SELECT lkey, max(votecount) FROM VOTES
WHERE ekey = (SELECT ekey FROM Elections where electionid='NR2019')
GROUP BY lkey
ORDER BY lkey ASC
Is there an easy way to also get the pkey in this statement?
Use DISTINCT ON:
SELECT DISTINCT ON (v.lkey) v.*
FROM VOTES v
INNER JOIN Elections e ON e.ekey = v.ekey
WHERE e.electionid = 'NR2019'
ORDER BY v.lkey, v.votecount DESC;
In plain English, the above query says: for each lkey value, return the single row with the highest vote count.
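DISTINCT ON is specific to PostgreSQL. For comparison, a portable way to get the same "top row per group" result is ROW_NUMBER() in a subquery. The sketch below is that swapped-in technique, not the DISTINCT ON query itself, run through Python's sqlite3 module (window functions need SQLite 3.25 or newer). The grouping column is lkey as in the question's query; the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE elections (ekey INTEGER PRIMARY KEY, electionid TEXT);
    CREATE TABLE votes (pkey INTEGER, lkey INTEGER, ekey INTEGER, votecount INTEGER);
    INSERT INTO elections VALUES (1, 'NR2019'), (2, 'NR2017');
    INSERT INTO votes VALUES
        (1, 10, 1, 500), (2, 10, 1, 700),
        (3, 20, 1, 300), (4, 20, 2, 900);
""")

# Rank the rows inside each lkey group by votecount, then keep rank 1.
top_rows = conn.execute("""
    SELECT pkey, lkey, votecount
    FROM (
        SELECT v.pkey, v.lkey, v.votecount,
               ROW_NUMBER() OVER (
                   PARTITION BY v.lkey
                   ORDER BY v.votecount DESC
               ) AS rn
        FROM votes v
        INNER JOIN elections e ON e.ekey = v.ekey
        WHERE e.electionid = ?
    ) AS ranked
    WHERE rn = 1
    ORDER BY lkey
""", ("NR2019",)).fetchall()

print(top_rows)
```

The row with votecount 900 belongs to a different election, so it is filtered out before ranking and pkey 3 wins for lkey 20.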

How to make postgres (cursor?) start at particular row

I have created the following query:
select t.id, t.row_id, t.content, t.location, t.retweet_count, t.favorite_count, t.happened_at,
a.id, a.screen_name, a.name, a.description, a.followers_count, a.friends_count, a.statuses_count,
c.id, c.code, c.name,
t.parent_id
from tweets t
join accounts a on a.id = t.author_id
left outer join countries c on c.id = t.country_id
where t.row_id > %s
-- order by t.row_id
limit 100
Where %s is a number that starts at 0 and is incremented by 100 after each query. I want to fetch all records from the database this way, increasing %s in the WHERE condition each time. I found this approach at https://ivopereira.net/efficient-pagination-dont-use-offset-limit. I also added a column to my table that corresponds to the row number (I named it row_id). The problem is that when I run this query the first time, it returns rows with a row_id of around 3 million. I would like the cursor (not sure if my terminology is correct) to start with the rows having row_id 1 through 100, and so on. The table contains 7 million rows. Am I missing something obvious that would let me achieve this?
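A likely cause is the commented-out ORDER BY: without it, LIMIT 100 may return any 100 rows that satisfy the WHERE condition, in whatever order the database finds convenient. A minimal sketch of keyset pagination follows, using Python's sqlite3 module and a cut-down, hypothetical `tweets` table. It orders by row_id and continues from the last row_id actually seen rather than adding 100, which also copes with gaps in the numbering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (row_id INTEGER PRIMARY KEY, content TEXT)")
# 250 rows with gaps in row_id (3, 6, 9, ...) to mimic missing numbers.
conn.executemany(
    "INSERT INTO tweets (row_id, content) VALUES (?, ?)",
    [(i * 3, f"tweet {i}") for i in range(1, 251)],
)

PAGE_SIZE = 100


def fetch_all_pages(connection):
    """Yield pages of row_ids in ascending order using keyset pagination."""
    last_seen = 0
    while True:
        page = connection.execute(
            "SELECT row_id FROM tweets WHERE row_id > ? ORDER BY row_id LIMIT ?",
            (last_seen, PAGE_SIZE),
        ).fetchall()
        if not page:
            return
        yield [row_id for (row_id,) in page]
        # Continue after the largest row_id of this page, not last_seen + 100.
        last_seen = page[-1][0]


pages = list(fetch_all_pages(conn))
print([len(p) for p in pages])
print(pages[0][0], pages[0][-1], pages[1][0])
```

With an index on row_id (here the primary key), each page is a short range scan starting where the previous page ended.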

Table doesn't exist in nested query with aggregations

My database name is test, I have a table named HaveGoal.
I am querying this:
SELECT Rel.total
FROM (SELECT H.number, H.teamname, SUM(H.numberofgoals) AS total
FROM HaveGoal H GROUP BY H.number,H.teamname) AS Rel
WHERE Rel.total = (SELECT MAX(Rel.total) FROM Rel)
It gives: ERROR 1146 (42S02): Table 'test.Rel' doesn't exist
It seems that the last subselect cannot reference a nested query defined in the FROM clause of the outer query.
In this case you have several options:
Duplicate the first subselect inside the second (and hope that performance will not be too poor)
Define a view to make the nested query available everywhere
As you are looking for the maximum, you could sort the data and take only the first row
If you were not on MySQL (or are on MySQL 8.0 or later, which added support for it), you could use the WITH clause
Duplicating will work in any situation:
SELECT Rel.total
FROM (
    SELECT H.number, H.teamname, SUM(H.numberofgoals) AS total
    FROM HaveGoal H
    GROUP BY H.number, H.teamname
) AS Rel
WHERE Rel.total = (
    SELECT MAX(Rel2.total)
    FROM (
        SELECT H.number, H.teamname, SUM(H.numberofgoals) AS total
        FROM HaveGoal H
        GROUP BY H.number, H.teamname
    ) AS Rel2
)
Taking the first row after sorting is much shorter, with the MAX implied by the ordering (note that it returns a single row even when several groups tie for the maximum):
SELECT Rel.total
FROM (
    SELECT H.number, H.teamname, SUM(H.numberofgoals) AS total
    FROM HaveGoal H
    GROUP BY H.number, H.teamname
) AS Rel
ORDER BY total DESC
LIMIT 1
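For the WITH option, here is a minimal sketch using Python's sqlite3 module, since SQLite accepts the same common table expression syntax. The sample rows are made up and deliberately contain a tie, to show how the CTE form and the ORDER BY ... LIMIT 1 form differ; the CTE query also selects number and teamname so the tied rows can be told apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HaveGoal (number INTEGER, teamname TEXT, numberofgoals INTEGER);
    INSERT INTO HaveGoal VALUES
        (9, 'A', 2), (9, 'A', 1), (10, 'A', 3), (7, 'B', 1);
""")

# The grouped query, written once and reused in both variants.
GROUPED = """
    SELECT H.number, H.teamname, SUM(H.numberofgoals) AS total
    FROM HaveGoal H
    GROUP BY H.number, H.teamname
"""

# CTE: Rel is visible both in the outer FROM and in the scalar subquery.
with_cte = conn.execute(f"""
    WITH Rel AS ({GROUPED})
    SELECT Rel.number, Rel.teamname, Rel.total
    FROM Rel
    WHERE Rel.total = (SELECT MAX(total) FROM Rel)
    ORDER BY Rel.number
""").fetchall()

# Sort-and-limit: always a single row, even when two groups tie.
first_after_sort = conn.execute(f"""
    SELECT Rel.total
    FROM ({GROUPED}) AS Rel
    ORDER BY Rel.total DESC
    LIMIT 1
""").fetchall()

print(with_cte)
print(first_after_sort)
```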

T-SQL query one table, get presence or absence of other table value

I'm not sure what this type of query is called, so I've been unable to search for it properly. I've got two tables: Table A has about 10,000 rows, and Table B has a variable number of rows.
I want to write a query that returns all of Table A's rows with an added column whose value is a boolean indicating whether the row also appears in Table B.
I've written this query, which works but is slow. It doesn't use a boolean but rather a count that will be either zero or one. Any suggested improvements are gratefully accepted:
SELECT u.number,u.name,u.deliveryaddress,
(SELECT COUNT(productUserid)
FROM ProductUser
WHERE number = u.number and productid = #ProductId)
AS IsInPromo
FROM Users u
UPDATE
I've run the query with the actual execution plan enabled. I'm not sure how to show the results, but the various costs are:
Nested Loops (left semi join): 29%
Clustered Index scan (User Table): 41%
Clustered Index Scan (ProductUser table): 29%
NUMBERS
There are 7366 users in the users table and currently 18 rows in the productUser table (although this will change and could be in the thousands)
You can use EXISTS to short-circuit after the first matching row is found rather than COUNT-ing all matching rows.
SQL Server does not have a boolean datatype. The closest equivalent is BIT:
SELECT u.number,
u.name,
u.deliveryaddress,
CASE
WHEN EXISTS (SELECT *
FROM ProductUser
WHERE number = u.number
AND productid = #ProductId) THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END AS IsInPromo
FROM Users u
RE: "I'm not sure what this type of query is called". This will give a plan with a semi join. See Subqueries in CASE Expressions for more about this.
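The EXISTS pattern can be tried outside SQL Server as well. The following is a minimal sketch with Python's sqlite3 module; SQLite has no BIT type, so the flag is a plain 0/1 integer, the product id is passed as a bound parameter, and the sample rows (including a duplicated ProductUser row) are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ProductUser (
        productUserid INTEGER PRIMARY KEY,
        number        INTEGER,
        productid     INTEGER
    );
    INSERT INTO Users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    -- User 1 appears twice for product 7; user 3 only has product 8.
    INSERT INTO ProductUser (number, productid) VALUES (1, 7), (1, 7), (3, 8);
""")


def promo_flags(connection, product_id):
    """Return (number, name, flag) rows; flag is 1 if the user has the product."""
    return connection.execute("""
        SELECT u.number,
               u.name,
               CASE WHEN EXISTS (SELECT 1
                                 FROM ProductUser p
                                 WHERE p.number = u.number
                                   AND p.productid = ?)
                    THEN 1 ELSE 0
               END AS IsInPromo
        FROM Users u
        ORDER BY u.number
    """, (product_id,)).fetchall()


flags = promo_flags(conn, 7)
print(flags)
```

Even with the duplicated row for user 1, the flag stays 1 rather than becoming a count of 2, and every user appears exactly once.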
Which management system are you using?
Try this:
SELECT u.number,u.name,u.deliveryaddress,
case when COUNT(p.productUserid) > 0 then 1 else 0 end
FROM Users u
left join ProductUser p on p.number = u.number and productid = #ProductId
group by u.number,u.name,u.deliveryaddress
UPD: this could be faster on SQL Server:
;with fff as
(
select distinct p.number from ProductUser p where p.productid = #ProductId
)
select u.number,u.name,u.deliveryaddress,
case when isnull(f.number, 0) = 0 then 0 else 1 end
from Users u left join fff f on f.number = u.number
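Since you seem concerned about performance, a LEFT JOIN may perform better, provided suitable indexes exist so that both tables can be read with an index seek rather than an index scan. The productid filter belongs in the ON clause (in WHERE it would discard every user without a matching ProductUser row), and a CASE expression turns the match into a 0/1 flag. This assumes at most one ProductUser row per user and product:
SELECT u.number,
u.name,
u.deliveryaddress,
CASE WHEN p.number IS NULL THEN 0 ELSE 1 END AS IsInPromo
FROM Users u
LEFT JOIN ProductUser p ON p.number = u.number
AND p.productid = #ProductId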
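One point worth checking with any LEFT JOIN variant is where the filter on the right-hand table goes. The sketch below, again using Python's sqlite3 module with made-up rows, compares the filter in WHERE with the same filter in ON; the 0/1 flag is computed with a CASE expression in both queries so that only the filter position differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ProductUser (number INTEGER, productid INTEGER);
    INSERT INTO Users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO ProductUser VALUES (1, 7);
""")

# Filter in WHERE: rows where p.productid is NULL fail the test and vanish.
filter_in_where = conn.execute("""
    SELECT u.number, CASE WHEN p.number IS NULL THEN 0 ELSE 1 END
    FROM Users u
    LEFT JOIN ProductUser p ON p.number = u.number
    WHERE p.productid = ?
    ORDER BY u.number
""", (7,)).fetchall()

# Filter in ON: unmatched users are kept, with NULLs on the right-hand side.
filter_in_on = conn.execute("""
    SELECT u.number, CASE WHEN p.number IS NULL THEN 0 ELSE 1 END
    FROM Users u
    LEFT JOIN ProductUser p ON p.number = u.number AND p.productid = ?
    ORDER BY u.number
""", (7,)).fetchall()

print(filter_in_where)
print(filter_in_on)
```

With the filter in WHERE, user 2 disappears from the result, which defeats the goal of listing every user with a presence flag.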