Table Aggregation strategy (SSIS / Stored procedure / SSRS) - tsql

I have a table with roughly 7,000,000 records.
It's very flat and, for the sake of argument, it has 3 columns I wish to aggregate on. The aggregation should simply produce a count (pivot) of each instance of each value.
E.g.
Company Status Year
Hatstand Open 2011
Hatstand Closed 2011
Moonbase Open 2011
Would produce
Count of Hatstand **2**
Count of Hatstand Open **1**
Count of Hatstand Open 2011 **1**
So it's a very simple count of each "branch" of data.
My first choice was to use an SSRS Matrix control, which worked really well when testing with a small dataset. However, with the full dataset the report would not run.
What is the "correct" way to approach this problem?
Should I pre-aggregate via a stored procedure or SSIS job?
Or should I continue with the SSRS route and try to refine my query?
Thanks

T-SQL is the preferred way to go.
If you are using SQL Server 2005 or 2008, you can try:
SELECT COMPANY, STATUS, YEAR, COUNT(*) AS CNT FROM TBL
GROUP BY COMPANY, STATUS, YEAR
WITH ROLLUP -- or WITH CUBE
If you are using SQL Server 2008, you can also try GROUPING SETS:
SELECT COMPANY, STATUS, YEAR, COUNT(*) AS CNT FROM TBL
GROUP BY
GROUPING SETS (
(COMPANY),
(COMPANY, STATUS),
(COMPANY, STATUS, YEAR)
)
For details about GROUP BY WITH ROLLUP/CUBE and GROUPING SETS, take a look at the GROUP BY documentation.
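To make the shape of the GROUPING SETS result concrete, here is a minimal Python sketch that computes the same per-branch counts for the three sample rows from the question. It only illustrates what the query returns and is not a substitute for doing the aggregation in the database; the function name branch_counts is made up for this example.

```python
from collections import Counter

# Sample rows from the question: (Company, Status, Year).
rows = [
    ("Hatstand", "Open", 2011),
    ("Hatstand", "Closed", 2011),
    ("Moonbase", "Open", 2011),
]


def branch_counts(records):
    """Count every prefix of each row: (Company), (Company, Status),
    (Company, Status, Year). These are the same groupings as the
    GROUPING SETS query above."""
    counts = Counter()
    for record in records:
        for depth in range(1, len(record) + 1):
            counts[record[:depth]] += 1
    return counts


counts = branch_counts(rows)
# Print the branches in a stable order (sorted on their string form).
for key in sorted(counts, key=lambda k: tuple(map(str, k))):
    print(" ".join(map(str, key)), counts[key])
```

On 7,000,000 rows you would still want the database to do this work; the sketch is only meant to show which rows to expect back.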

Related

PostgreSQL how do I COUNT with a condition?

Can someone please assist with a query I am working on for school using a sample database from PostgreSQL tutorial? Here is my query in PostgreSQL that gets me the raw data that I can export to excel and then put in a pivot table to get the needed counts. The goal is to make a query that counts so I don't have to do the manual extraction to excel and subsequent pivot table:
SELECT
i.film_id,
r.rental_id
FROM
rental as r
INNER JOIN inventory as i ON i.inventory_id = r.inventory_id
ORDER BY film_id, rental_id
;
From the database this gives me a list of films (by film_id) showing each time the film was rented (by rental_id). That query works fine if just exporting to excel. Since we don't want to do that manual process what I need is to add into my query how to count how many times a given film (by film_id) was rented. The results should be something like this (just showing the first five here, the query need not do that):
film_id | COUNT of rental_id
1 | 23
2 | 7
3 | 12
4 | 23
5 | 12
Database setup instructions can be found here: LINK
I have tried using COUNTIF and CASE (following other posts here) and I can't get either to work, please help.
Did you try this?:
SELECT
i.film_id,
COUNT(1)
FROM
rental as r
INNER JOIN inventory as i ON i.inventory_id = r.inventory_id
GROUP BY i.film_id
ORDER BY film_id;
If the same rental_id can appear more than once in your data, you may want to use COUNT(DISTINCT r.rental_id) instead.
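The GROUP BY approach can be tried without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module with two cut-down tables and a handful of made-up rows (the real sample database has more columns and data), so the numbers are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, made-up versions of the tutorial tables.
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);
INSERT INTO inventory VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO rental VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

# One output row per film, with the number of rentals joined to it.
rental_counts = conn.execute("""
SELECT i.film_id, COUNT(r.rental_id) AS rentals
FROM rental AS r
INNER JOIN inventory AS i ON i.inventory_id = r.inventory_id
GROUP BY i.film_id
ORDER BY i.film_id
""").fetchall()

print(rental_counts)
```

Film 1 has two inventory copies with three rentals between them, and film 2 has one copy rented once, so the query returns one row per film rather than one row per rental.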

How to get a list of dates in Pervasive SQL

Our time & attendance database is a Pervasive/Actian Zen database. What I'm trying to do is create a query that just lists the next 14 days from today. I'll then cross apply this list of dates with employee records so that in effect I have a list of people/dates for the next 14 days.
I've done it quite easily with a recursive CTE on SQL Server, and I could also do it with a loop there, but I can't figure it out with Pervasive SQL, where loops can only exist within stored procedures and triggers.
Looking around, I thought that this code, which I found and adapted, might work, but it doesn't (and further research suggests that there isn't a recursive option within Pervasive at all).
WITH RECURSIVE cte_numbers(n, xDate)
AS (
SELECT
0, CURDATE() + 1
UNION ALL
SELECT
n+1,
dateAdd(day,n,xDate)
FROM
cte_numbers
WHERE n < 14
)
SELECT
xDate
FROM
cte_numbers;
I just wondered whether anyone could help me write an SQL query that gives me this list of dates, outside of a stored procedure.
When you create a table like this:
CREATE TABLE dates(d DATE PRIMARY KEY, x INTEGER);
And create a first record like this:
INSERT INTO dates VALUES ('2021-01-01',0);
Then you can use the statement below, which doubles the number of records in the table dates every time it is executed (so you need to run it a couple of times).
When you run it 10 times, the table dates will have 21 October 2023 as its last date.
When you run it 12 times, the last date will be 19 March 2032.
INSERT INTO dates
SELECT
DATEADD(DAY,m.m+1,d),
x+m.m+1
from dates
cross join (select max(x) m from dates) m
order by d;
Of course, the column x can optionally be dropped with the next statement, but after that you can no longer add records using the previous statement:
ALTER TABLE dates DROP COLUMN x;
Finally, to return the next 14 days from today:
SELECT d
FROM DATES
WHERE d BETWEEN CURDATE( ) AND DATEADD(DAY,13,CURDATE());
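To check the doubling arithmetic without touching a database, the INSERT ... SELECT can be simulated in a few lines of Python. This is only a sketch of the logic; the helper names double_dates, build_dates and next_days are invented for the example.

```python
from datetime import date, timedelta


def double_dates(rows):
    """Mimic one run of the INSERT ... SELECT: every existing (d, x) row
    yields a new row shifted by max(x) + 1 days."""
    shift = max(x for _, x in rows) + 1
    return rows + [(d + timedelta(days=shift), x + shift) for d, x in rows]


def build_dates(runs, start=date(2021, 1, 1)):
    """Start from the single seed row and apply the doubling step `runs` times."""
    rows = [(start, 0)]
    for _ in range(runs):
        rows = double_dates(rows)
    return rows


def next_days(rows, today, days=14):
    """Equivalent of WHERE d BETWEEN today AND today + (days - 1)."""
    last = today + timedelta(days=days - 1)
    return sorted(d for d, _ in rows if today <= d <= last)


table = build_dates(10)
print(len(table), max(d for d, _ in table))
```

After 10 runs the simulated table holds 1024 consecutive dates, which matches the end date quoted above.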

In TSQL, How do I add a count column that counts the number of rows in my query?

This can be done a number of ways, which I will explain at the end. For now, I have been given a work assignment that includes the following (simplified):
"Create a record each week to track the current status that has the following: account numbers (unique within each report), a random number (provided), their status (Green, Orange, or Blue), and make sure the record also has a column which tells me how many records there are this week."
I do not need code to generate a random number.
Columns: Account, RanNum, Status, NumberOfRowsThisWeek
How do I handle adding a column that determines the number of rows in my query and produces that number, static, within each row of that column?
I may try to tweak the request and apply a rising number. How would I go about doing it in this case?
Edit: SQL Server 2014
You are not telling us which database you are using.
In SQL Server, at least in the newer versions, window (analytic) functions are available, and they are also available in most other popular RDBMSs.
You could do what you want in SQL Server by adding this to your select:
,count(*) over (partition by 1) as [NrOfRows]
An analytic function runs the "standard" query first and then applies the window function to the result set.
The count above counts the rows in the result set, partitioned by the constant 1, which is of course the same for all rows, so it gives the full row count.
Not every database may allow a constant there; the following variant might work better in some of them, and I know it works in SQL Server:
,count(*) over (partition by (select 1 n)) as [NrOfRows]
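As a quick way to see both the static count and the "rising number" from the question, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table name weekly_status and its rows are made up; COUNT(*) OVER () with an empty window is used here instead of a constant partition, and ROW_NUMBER() supplies the rising number. Both functions also exist in SQL Server 2014.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical weekly table with the columns named in the question.
conn.executescript("""
CREATE TABLE weekly_status (Account INTEGER PRIMARY KEY, RanNum REAL, Status TEXT);
INSERT INTO weekly_status VALUES
    (101, 0.42, 'Green'),
    (102, 0.17, 'Orange'),
    (103, 0.88, 'Blue');
""")

report = conn.execute("""
SELECT Account, RanNum, Status,
       COUNT(*) OVER () AS NumberOfRowsThisWeek,        -- same total on every row
       ROW_NUMBER() OVER (ORDER BY Account) AS RowNo    -- rising number per row
FROM weekly_status
ORDER BY Account
""").fetchall()

for row in report:
    print(row)
```

Every row carries the same total in NumberOfRowsThisWeek, while RowNo increases by one per row in Account order.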
It sounds like you want to do some kind of simple count() / group by query:
select Account, RanNum, Status, count(*) as NumberOfRowsThisWeek
from tablename
group by Account, RanNum, Status
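You may need to leave the random number out of the grouping, because it makes every row unique and so confuses the group by, and then join the counts back to the rows:
select t.Account, t.RanNum, t.Status, c.NumberOfRowsThisWeek
from tablename as t
join (
select Account, Status, count(*) as NumberOfRowsThisWeek
from tablename
group by Account, Status
) as c on c.Account = t.Account and c.Status = t.Status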

Filter a value relevant to the maximum field

Here is my detail field with Order number and Amount.
Order Number Amount
2 3450
4 2300
8 4500
3 5100
Here the latest order is the one with the maximum order number. In the report I need to show only that order number and its amount, as follows, and none of the other records. Help please.
Order Number Amount
8 4500
There are many ways to solve this; one of them is to use SQL Expression Fields.
Create a new SQL expression field and write the formula below.
DB2 syntax:
SELECT "Order Number", Amount FROM orders ORDER BY "Order Number" DESC FETCH FIRST ROW ONLY
Oracle syntax:
SELECT "Order Number", Amount FROM (
SELECT "Order Number", Amount, ROW_NUMBER() OVER (ORDER BY "Order Number" DESC) RowNo FROM orders)
WHERE RowNo < 2
Now drag this to the detail section.
Note: the syntax differs between databases; let me know if you are using something other than DB2 or Oracle.
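A variant that avoids vendor-specific row limiting is to filter on a MAX() subquery. This is a different technique from the two statements above, shown here as a sketch with Python's built-in sqlite3 module; the column name order_number is an assumption made for the example, and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question; order_number is an assumed column name.
conn.executescript("""
CREATE TABLE orders (order_number INTEGER PRIMARY KEY, amount INTEGER);
INSERT INTO orders VALUES (2, 3450), (4, 2300), (8, 4500), (3, 5100);
""")

# Keep only the row whose order number equals the overall maximum.
latest = conn.execute("""
SELECT order_number, amount
FROM orders
WHERE order_number = (SELECT MAX(order_number) FROM orders)
""").fetchall()

print(latest)
```

Note that this picks the row with the highest order number, not the highest amount, which is what the question asks for.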

Feasibility of recreating complex SQL query in Crystal Reports XI

I have about 10 fairly complex SQL queries on SQL Server 2008 - but the client wants to be able to run them from their internal network (as opposed to from the non-local web app) through Crystal Reports XI.
The client's internal network does not allow us to (a) have write access to their proprietary db, nor (b) allow us to set up an intermediary SQL server (meaning we can not set up stored procedures or other data cleaning).
The SQL contains multiple instances of row_number() over (partition by col1, col2), group by col1, col2 with cube|rollup, and/or (multiple) pivots.
Can this even be done? Everything I've read seems to indicate that this is only feasible via stored procedure and I would still need to pull the data from the proprietary db first.
Following is a stripped back version of one of the queries (eg, JOINs not directly related to functionality, WHERE clauses, and half a dozen columns have been removed)...
select sum(programID)
, sum([a.Asian]) as [Episodes - Asian], sum([b.Asian]) as [Eps w/ Next Svc - Asian], sum([c.Asian])/sum([b.Asian]) as [Avg Days to Next Svc - Asian]
, etc... (repeats for each ethnicity)
from (
select programID, 'a.' + ethnicity as ethnicityA, 'b.' + ethnicity as ethnicityB, 'c.' + ethnicity as ethnicityC
, count(*) as episodes, count(daysToNextService) as episodesWithNextService, sum(daysToNextService) as daysToNextService
from (
select programID, ethnicity, datediff(d, dateOfDischarge, nextDateOfService) as daysToNextService from (
select t1.userID, t1.programID, t1.ethnicity, t1.dateOfDischarge, t1.dateOfService, min(t2.dateOfService) as nextDateOfService
from TABLE1 as t1 left join TABLE1 as t2
on datediff(d, t1.dateOfService, t2.dateOfService) between 1 and 31 and t1.userID = t2.userID
group by t1.userID, t1.programID, t1.ethnicity, t1.dateOfDischarge, t1.dateOfService
) as a
) as a
group by programID
) as a
pivot (
max(episodes) for ethnicityA in ([A.Asian],[A.Black],[A.Hispanic],[A.Native American],[A.Native Hawaiian/ Pacific Isl.],[A.White],[A.Unknown])
) as pA
pivot (
max(episodesWithNextService) for ethnicityB in ([B.Asian],[B.Black],[B.Hispanic],[B.Native American],[B.Native Hawaiian/ Pacific Isl.],[B.White],[B.Unknown])
) as pB
pivot (
max(daysToNextService) for ethnicityC in ([C.Asian],[C.Black],[C.Hispanic],[C.Native American],[C.Native Hawaiian/ Pacific Isl.],[C.White],[C.Unknown])
) as pC
group by programID with rollup
Sooooooo.... can something like this even be translated into Crystal Reports XI?
Thanks!
When you create your report, choose Add Command instead of selecting a table or stored procedure.
This lets you enter whatever valid T-SQL statement you want. Using common table expressions (CTEs) and inline views, I've managed to create some rather large, complex statements (in excess of 400 lines) against Oracle and SQL Server, so it is indeed feasible. However, if you use parameters you should consider using sp_executesql, and you'll have to figure out how to avoid SQL injection.
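The general idea behind avoiding injection, which sp_executesql supports through its parameter list, is to pass values separately from the SQL text instead of concatenating them into it. The sketch below shows that principle with Python's built-in sqlite3 module and its ? placeholders; it does not show Crystal Reports or sp_executesql themselves, and the table episodes and the helper count_for are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the real data.
conn.executescript("""
CREATE TABLE episodes (programID INTEGER, ethnicity TEXT);
INSERT INTO episodes VALUES (1, 'Asian'), (1, 'White'), (2, 'Asian');
""")


def count_for(ethnicity):
    """The value is bound as a parameter, so it is always treated as data
    and never parsed as part of the SQL statement."""
    cursor = conn.execute(
        "SELECT COUNT(*) FROM episodes WHERE ethnicity = ?", (ethnicity,)
    )
    return cursor.fetchone()[0]


normal = count_for("Asian")
# A classic injection attempt simply matches no rows when it is bound.
hostile = count_for("Asian' OR '1'='1")
print(normal, hostile)
```

Had the hostile string been concatenated into the statement, the OR '1'='1' part would have become SQL and matched every row; bound as a parameter, it is just an ethnicity value that does not exist.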