SQL Server SUM GROUP BY Help - tsql

I have a table that contains an orderid, an inventoryid, and a quantity -- its a line items table. The database is SQL Server 2008.
What I need to know is how to write a SQL statement that returns the sums of quantities for that itemid at that order, not counting orders that have larger orderids than it. It must return the orderid, itemid, and total.
Any help? Thanks!

Guess:
SELECT
SUM(quantity) AS total, --"sums of quantities for that itemid at that order"
orderid, inventoryid --"It must return the orderid, itemid"
WHERE
orderid < (some larger order id value) --"not counting orders that have larger orderids"
GROUP BY
orderid, inventoryid

Related

How to update duplicate rows in a table n postgresql

I have created synthetic data for a typical call center.
Below is the screenshot of the table I have created.
Table 1:
Problem statement: Since this is completely random data, I noticed that there are some customers who are being assigned to the same agents whenever they call again.
So using this query I was able to test such a case and count the number of times agents are being repeated for each customer.
select agentid, customerid, count(customerid) from aa_dev.calls group by agentid, customerid having count(customerid) > 1 ;
Table 2
I have a separate agents table to called aa_dev.agents in which the agent's ids are stored
Now I want to replace the agentid for such cases, such that if agentid is repeated 6 times for a single customer then 5 of the times the agent id should be updated with any other agentid from the table but call time shouldn't be overlapping That means the agent we are replacing with should not be busy on the time the call is going one.
I have assigned row numbers to each repeated ones.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY agentid, customerid ORDER BY random()) rn,
COUNT(*) OVER (PARTITION BY agentid, customerid) cnt
FROM aa_dev.calls
)
SELECT agentid, customerid, rn
FROM cte
WHERE cnt > 1;
This way I could visualize the repetition clearly.
So I don't want to update row 1 but the rest.
Is there any way I can acheive this? Can I use the row number and write a query according to the row number to update rownum 2 onwards row one by one with each row having a unique agent?
If you don't want duplicates in your artificial data, it's probably better to not generate them.
But if you already have a table with duplicates and want to work on the duplicates, either updating them or deleting, here is the easy way:
You need a unique ID for each updated row. If you don't have it,
add it temporarily. Then you can use this pattern to update all duplicates
except the first one:
To add artificial id column to preexisting table, use:
ALTER TABLE calls ADD id serial;
In my case I generated a test table with 100 random rows:
CREATE TEMP TABLE calls (id serial, agentid int, customerid int);
INSERT INTO calls (agentid, customerid)
SELECT (random()*10)::int, (random()*10)::int
FROM generate_series(1, 100) n;
Define what constitutes a duplicate and find duplicates in data:
SELECT agentid, customerid, count(*), array_agg(id) id
FROM calls
GROUP BY 1,2 HAVING count(*)>1
ORDER BY 1,2;
Update all the duplicate rows except first one with NULLs:
UPDATE calls SET agentid = whatever_needed
FROM (
SELECT array_agg(id) id, min(id) idmin FROM calls
GROUP BY agentid, customerid HAVING count(*)>1
) AS dup
WHERE calls.id = ANY(dup.id) AND calls.id <> dup.idmin;
Alternatively, remove all duplicates except first one:
DELETE FROM calls
USING (
SELECT array_agg(id) id, min(id) idmin FROM calls
GROUP BY agentid, customerid HAVING count(*)>1
) AS dup
WHERE calls.id = ANY(dup.id) AND calls.id <> dup.idmin;

How do I make my RANK () OVER query work in select?

table image
I have this table that I need to sort in the following way:
need to rank Departments by Salary;
need to show if Salary = NULL - 'No data to be shown' message
need to add total salary paid to the department
need to count people in the department
SELECT RANK() OVER (
ORDER BY Salary DESC
)
,CASE
WHEN Salary IS NULL
THEN 'NO DATA TO BE SHOWN'
ELSE Salary
,Count(Fname)
,Total(Salary) FROM dbo.Employees
I get an error saying:
Column 'dbo.Employees.Salary' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Why so?
Column 'dbo.Employees.Salary' is invalid in the select list because it
is not contained in either an aggregate function or the GROUP BY
clause.
Why so?
The aggregate functions are returning a single value for the whole table, you can't SELECT a field alongside them it doesn't makes sense. Like say, you have a students table you apply Sum(marks) for the whole students table, and you are then also selecting student's name Select studentname in your query. Which student's name will the database engine select? Confusing
Column "invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"
I tried this-
using inner query
SELECT RANK() OVER (ORDER BY SAL DESC) RANK,FNAME,DEPARTMENT
CASE
WHEN SAL IS NULL THEN 'NO DATA TO BE SHOWN'
ELSE SAL
END
FROM
(SELECT COUNT(FNAME) FNAME, SUM(SALARY) SAL, DEPARTMENT
FROM TESTEMPLOYEE
GROUP BY DEPARTMENT) t

most efficient query to get the first and last record id in a large dataset

I need to write a query against a large dataset to get the first and last record id, plus the first record created time. The sample of the data is as following:
In the above case, if the category "Blue" is passed into the query as parameter, I will expect to return "A12, 13:00, E66" as the result of the query.
I can using aggregate function to get the max and min time from the dataset, and join to get the first and last record. But just wondering whehter there is a more effecient way to achieve the same output?
My advice would be to try to reduce the number of scan/seek operations by comparing execution plans and to place indexes on the categoryID (for lookup) and time (for sorting) columns.
If you have SQL Server 2008 or later, you could use the following, which requires two scans/seeks:
Declare #CategoryID As Varchar(16)
Set #CategoryID = 'Blue'
Select
First_Record.RecordID,
First_Record.CreatedTime,
Last_Record.RecordID
From
(
Select Top 1
RecordID,
CreatedTime
From
<Table>
Where
CategoryID = #CategoryID
Order By
CreatedTime Asc
) First_Record
Cross Apply
(
Select Top 1
RecordID
From
<Table>
Where
CategoryID = #CategoryID
Order By
CreatedTime Desc
) Last_Record
If you have SQL Server 2012 or later, you could write the following, which requires only one scan/seek:
Select Top 1
First_Value(RecordID) Over (Partition By CategoryID Order By CreatedTime Asc),
First_Value(CreatedTime) Over (Partition By CategoryID Order By CreatedTime Asc),
First_Value(RecordID) Over (Partition By CategoryID Order By CreatedTime Desc)
From
<Table>
Where
CategoryID = #CategoryID

t-sql return multiple rows depending on field value

i am trying to run an export on a system that only allows t-sql. i know enough of php to make a foreach loop, but i don't know enough of t-sql to generate multiple rows for a given quantity.
i need a result to make a list of items with "1 of 4" like data included in the result
given a table like
orderid, product, quantity
1000,ball,3
1001,bike,4
1002,hat,2
how do i get a select query result like:
orderid, item_num, total_items,
product
1000,1,3,ball
1000,2,3,ball
1000,3,3,ball
1001,1,4,bike
1001,2,4,bike
1001,3,4,bike
1001,4,4,bike
1002,1,2,hat
1002,2,2,hat
You can do this with the aid of an auxiliary numbers table.
;WITH T(orderid, product, quantity) AS
(
select 1000,'ball',3 union all
select 1001,'bike',4 union all
select 1002,'hat',2
)
SELECT orderid, number as item_num, quantity as total_items, product
FROM T
JOIN master..spt_values on number> 0 and number <= quantity
where type='P'
NB: The code above uses the master..spt_values table - this is just for demo purposes I suggest you create your own tally table using one of the techniques here.
If you are on SQL Server 2005 or later version, then you can try a recursive CTE instead of a tally table.
;WITH CTE AS
(
SELECT orderid, 1 item_num, product, quantity
FROM YourTable
UNION ALL
SELECT orderid, item_num+1, product, quantity
FROM CTE
WHERE item_num < quantity
)
SELECT *
FROM CTE
OPTION (MAXRECURSION 0)
I'm not on a computer with a database engine where I can test this, so let me know how it goes.
Well, IF you know the maximum value for the # of products for any product (and it's not too big, say 4), you can:
Create a helper table called Nums containing 1 integer column n, with rows containing 1,2,3,4
Run
SELECT * from Your_table, Nums
WHERE Nums.n <= Your_table.quantity

Show data from quarterly records in a single row

Each quarter's sales data is contained in a row in the data source.
Account 1's 4 quarters of sales data would be in 4 separate records, each containing the account name, quarter number, and count of items purchased.
The report should show, in each detail row: account name, q1 count, q2 count, q3 count, q4 count, total year count.
I'm new to Crystal, but it seems like this should be easy; how would I do this?
I'd probably create the result list using some slightly complex sql and they just display it on the Crystal report...but if you're wanting to accomplish this entirely inside Crystal, take a look at http://aspalliance.com/1041_Creating_a_Crosstab_Report_in_Visual_Studio_2005_Using_Crystal_Reports.all.
Here's a stab at the SQL that would be required...
select
accountName,
(select sum(itemCount) from myTable where quarterName = 'q1') as q1Count,
(select sum(itemCount) from myTable where quarterName = 'q2') as q2Count,
(select sum(itemCount) from myTable where quarterName = 'q3') as q3Count,
(select sum(itemCount) from myTable where quarterName = 'q4') as q4Count,
(select sum(itemCount) from myTable) as yearCount
from myTable
group by accountName ;
If your data source has the sales date in it (and I assume it would), you can create a formula called #SalesQuarter:
if month({TableName.SalesQuarter}) in [1,2,3] then '1' else
if month({TableName.SalesQuarter}) in [4,5,6] then '2' else
if month({TableName.SalesQuarter}) in [7,8,9] then '3'
else '4'
You can then add a cross-tab to your report, and use the new #SalesQuarter field as the column header of your cross-tab.
This assumes your sales are all within the same year.
Add a group on {account}
In the group footer add a Running total for each quarter.
For each quarter, create a running total with following settings:
Running Total Name: create a unique name for each formula, for example Q1,Q2,Q3,Q4
Field to summarize: {items purchased}
Type of summary: sum
Evaluate: Use a formula - {quarter number}= --should be 1,2,3, or 4, depending on which quarter you are summing
Reset: On Change of Group {account}