Grouping in a report - tsql

I am using T-SQL.
I have 4 work trays and I would like a report that gives me the name of each work tray, plus the oldest item of post in it, plus a couple more fields. It needs to be limited to 4 rows - one for each work tray.
So at the moment I have this:
SELECT WorkTray, MIN(Date) AS [OldestDate], RefNo, NameofItem
FROM ...
GROUP BY WorkTray,RefNo, NameofItem
ORDER BY WorkTray,RefNo, NameofItem
However, when I run this it gives me every item in each work tray, e.g. a report hundreds of items long. I just want it to be limited to 4 rows of data, one for each work tray:
Work Tray Date RefNo NameofItem
A 1/2/15 25 Outstanding Bill
B 5/5/18 1000 Lost post
C 2/2/12 17 Misc
D 6/12/17 876 Misc
So I'm sure I'm going wrong somewhere with my GROUP BY - but I can't see where.

There is a trick for doing this that has been answered on Stack Overflow before. Here it is, adapted to your query:
SELECT *
FROM
    (SELECT WorkTray, [Date] AS [OldestDate], RefNo, NameofItem,
            ROW_NUMBER() OVER (PARTITION BY WorkTray ORDER BY [Date]) AS rn
     FROM MyTable
    ) AS GroupedByTray
WHERE rn = 1
The PARTITION BY tells ROW_NUMBER() to number the rows separately within each work tray, and the ORDER BY inside the OVER clause works like a normal ORDER BY, so the row with the oldest date in each tray gets rn = 1. The "WHERE rn = 1" part then returns only that first row per tray; with your 4 work trays (A - D) that gives exactly 4 rows.
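If you'd rather stay close to your original GROUP BY, a rough equivalent (just a sketch, assuming the same MyTable name as above, and accepting that a tie on the oldest date would return an extra row for that tray) is to aggregate per tray first and then join back for the remaining columns:
SELECT t.WorkTray, t.[Date] AS [OldestDate], t.RefNo, t.NameofItem
FROM MyTable t
INNER JOIN
    (SELECT WorkTray, MIN([Date]) AS OldestDate   -- one row per tray: its oldest date
     FROM MyTable
     GROUP BY WorkTray
    ) oldest
    ON oldest.WorkTray = t.WorkTray
   AND oldest.OldestDate = t.[Date]
ORDER BY t.WorkTray
Either way, the key point is that only WorkTray should be grouped on; adding RefNo and NameofItem to the GROUP BY created one group per item, which is why you got hundreds of rows.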

Related

Compare 2 Tables When 1 Is Null in PostgreSQL

I am fairly new to PostgreSQL and I am having difficulty getting the result that I want.
To get the appropriate result I need to make multiple joins, and I am also having difficulty counting and grouping them in one query.
The table names are as follows: pers_person, pers_position, and acc_transaction.
What I want to accomplish is:
To see who was absent on which date by comparing pers_person with acc_transaction: if there is any record, that's fine, but if the record is null the person was definitely absent.
I want to count absences per pers_person: how many times in a month each person was absent.
The person's hire_date should also be considered; a person hired in November should be filtered out of the October report.
The pers_position table provides the position information for each person.
SELECT tr.create_time::date AS Date, pers.pin, tr.dept_name, tr.name, tr.last_name, pos.name, Count(*)
FROM acc_transaction AS tr
RIGHT JOIN pers_person as pers
ON tr.pin = pers.pin
LEFT JOIN pers_position as pos
ON pers.position_id=pos.id
WHERE tr.event_no = 0 AND DATE_PART('month', DATE)=10 AND DATE_PART('month', pr.hire_date::date)<=10 AND pr.pin IS DISTINCT FROM tr.pin
GROUP BY DATE
ORDER BY DATE
* This is a report for October.
* Pin is the ID number.
I'd start by:
changing the RIGHT JOIN to a LEFT JOIN, since they work the same way in reverse but it's confusing to hold both in mind;
removing the pers_position table for now, as it only adds extra information rather than changing any returned result;
renaming the unknown alias pr, which I assume is meant to be pers;
which leaves some strange WHERE conditions, so I'm removing them:
"pers.pin IS DISTINCT FROM tr.pin" (the join already makes these equal, so it never holds for a matched row)
"AND DATE_PART('month', DATE)=10" (always true when run in October, always false otherwise)
This gives the resulting query:
SELECT tr.create_time::date AS Date, pers.pin, tr.dept_name, tr.name, tr.last_name, Count(*)
FROM pers_person AS pers
LEFT JOIN acc_transaction AS tr ON tr.pin = pers.pin
WHERE tr.event_no = 0
AND DATE_PART('month', pers.hire_date::date) <= 10
GROUP BY tr.create_time::date, pers.pin, tr.dept_name, tr.name, tr.last_name
ORDER BY Date
In the end, I don't know if that answers the question, since the title says "Compare 2 Tables When 1 Is Null in PostgreSQL" and the body of the question says little about it.
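For the "compare two tables when one is null" part specifically, the usual PostgreSQL pattern is an anti-join: LEFT JOIN the transactions onto the people and keep only the rows where no match was found. A rough sketch of counting absences that way, assuming October 2019 as the report month, that pers_person carries name, last_name and hire_date columns, and that every calendar day counts as a working day (all of which you'd need to adjust):
SELECT pers.pin,
       pers.name,
       pers.last_name,
       Count(*) AS absences
FROM generate_series('2019-10-01'::date, '2019-10-31'::date, interval '1 day') AS d(day)
CROSS JOIN pers_person AS pers
LEFT JOIN acc_transaction AS tr
       ON tr.pin = pers.pin
      AND tr.create_time::date = d.day::date
WHERE tr.pin IS NULL                          -- no transaction that day => absent
  AND pers.hire_date::date <= d.day::date     -- ignore days before the person was hired
GROUP BY pers.pin, pers.name, pers.last_name
ORDER BY pers.pin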

How to dynamically pivot based on rows data and parameter value?

I am trying to pivot using the crosstab function and am unable to achieve the requirement. Is there a way to perform crosstab dynamically, with a dynamic result set as well?
I have tried using the built-in crosstab function but was unable to meet my requirement.
select * from crosstab ('select item,cd, type, parts, part, cnt
from item
order by 1,2')
AS results (item text,cd text, SUM NUMERIC, AVG NUMERIC);
Sample Data:
ITEM CD TYPE PARTS PART CNT
Item 1 A AVG 4 1 10
Item 1 B AVG 4 2 20
Item 1 C AVG 4 3 30
Item 1 D AVG 4 4 40
Item 1 A SUM 4 1 10
Item 1 B SUM 4 2 20
Item 1 C SUM 4 3 30
Item 1 D SUM 4 4 40
Expected Results:
ITEM CD PARTS TYPE_1 CNT_1 TYPE_1 CNT_1 TYPE_2 CNT_2 TYPE_2 CNT_2 TYPE_3 CNT_3 TYPE_3 CNT_3 TYPE_4 CNT_4 TYPE_4 CNT_4
Item 1 A 4 AVG 10 SUM 10 AVG 20 SUM 20 AVG 30 SUM 30 AVG 40 SUM 40
The PARTS value is based on a parameter passed by the user. If the user passes 2, for example, there will be 4 rows in the result set (2 parts for AVG and 2 parts for SUM).
Can I achieve this requirement using the CROSSTAB function, or is there a custom SQL statement that needs to be developed?
I'm not following your data, so I can't offer examples based on it. But I have been looking at pivot/cross-tab features over the past few days, and I was looking at dynamic cross tabs just before seeing your post. I'm hoping that your question gets some good answers; I'll start off with a bit of background.
You can use the crosstab extension for standard cross tabs. What went wrong when you tried it? Here's an example I wrote for myself the other day with a bunch of comments and aliases for clarity. The pivot is looking at item scans to see where the scans were "to", like the warehouse or the floor.
/* Basic cross-tab example for crosstab (text) format of pivot command.
Notice that the embedded query has to return three columns, see the aliases.
#1 is the row label, it shows up in the output.
#2 is the category, which determines how many columns there are. *You have to work this out in advance to declare them in the return.*
#3 is the cell data, which goes into the cross-tab cells. Note that this form of the crosstab command may return NULL, and coalesce does not work.
To get rid of the null count/sums/whatever, you need crosstab (text, text).
*/
select *
from crosstab ('select
specialty_name as row_label,
scanned_to as column_splitter,
count(num_inst)::numeric as cell_data
from scan_table
group by 1,2
order by 1,2')
as scan_pivot (
row_label citext,
"Assembly" numeric,
"Warehouse" numeric,
"Floor" numeric,
"QA" numeric);
As a manual alternative, you can use a series of FILTER clauses. Here's an example that summarizes error_log records by day of the week. The "down" is the error name, the "across" (columns) are the days of the week.
select "error_name",
count(*) as "Overall",
count(*) filter (where extract(dow from "updated_dts") = 0) as "Sun",
count(*) filter (where extract(dow from "updated_dts") = 1) as "Mon",
count(*) filter (where extract(dow from "updated_dts") = 2) as "Tue",
count(*) filter (where extract(dow from "updated_dts") = 3) as "Wed",
count(*) filter (where extract(dow from "updated_dts") = 4) as "Thu",
count(*) filter (where extract(dow from "updated_dts") = 5) as "Fri",
count(*) filter (where extract(dow from "updated_dts") = 6) as "Sat"
from error_log
where "error_name" is not null
group by "error_name"
order by 1;
You can do the same thing with CASE, but FILTER is easier to write.
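For comparison, here is what a couple of those columns look like with CASE instead of FILTER, against the same error_log table; the pattern just repeats for the remaining days:
select "error_name",
       count(*) as "Overall",
       sum(case when extract(dow from "updated_dts") = 0 then 1 else 0 end) as "Sun",
       sum(case when extract(dow from "updated_dts") = 1 then 1 else 0 end) as "Mon"
       -- ...and so on for Tue through Sat
from error_log
where "error_name" is not null
group by "error_name"
order by 1;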
It looks like you want something fairly basic, so maybe the FILTER solution appeals? It's easier to read than calls to crosstab(), especially since crosstab was giving you trouble.
FILTER may be slower than crosstab. Probably. (The crosstab extension is written in C, and I'm not sure how smart FILTER is about reading off indexes.) I haven't tested it yet; it's on my to-do list, but I haven't had time. I'd be super interested if anyone can offer results. We're on 11.4.
I wrote a client-side tool to build FILTER-based pivots over the past few days. You supply the down and across fields and an aggregate formula, and the tool spits out the SQL, with support for coalesce for folks who don't want NULL, ROLLUP, TABLESAMPLE, view creation, and some other stuff. It was a fun project. Why go to that effort (apart from the fun part)? Because I haven't found a way to do dynamic pivots that I actually understand. I love this quote:
"Dynamic crosstab queries in Postgres has been asked many times on SO all involving advanced level functions/types. Consider building your needed query in application layer (Java, Python, PHP, etc.) and pass it in a Postgres connected query call. Recall SQL is a special-purpose, declarative type while app layers are general-purpose, imperative types." – Parfait
So, I wrote a tool to pre-calculate and declare the output columns. But I'm still curious about dynamic options in SQL. If that's of interest to you, have a look at these two items:
https://postgresql.verite.pro/blog/2018/06/19/crosstab-pivot.html
Flatten aggregated key/value pairs from a JSONB field?
Deep magic in both.

Get latest record to show for 1 week after it's been added; after that it will show randomly upon page refresh

My requirement is simple: I have a repeater control webpart, and I want to apply a condition in the WHERE clause.
Condition: the latest record will show for 1 week after it's been added. After that it will show randomly upon page refresh.
That means if the record is more than 1 week old, which record shows is picked at random upon page refresh.
I made this query but it doesn't work:
(DocumentCreatedWhen >= dateadd(day, -7, convert(date, getdate())))
I'm a little confused about the "on page refresh" portion of your request. You said in the first part that "after that it will show randomly upon page refresh", then in the second part said "if the record is greater than 1 week, it will show upon page refresh".
Which do you want?
To filter to events that are at least 1 week old, you would use
DATEDIFF(day, DocumentCreatedWhen, GETDATE()) >= 7
From there you can add an ORDER BY DocumentCreatedWhen ASC and a Top N of 1.
If you want to apply different logic on postback, you can use macros and the webpart's visibility to make the "random" repeater visible on postback and the other visible when it's not a postback, or use macros to provide different WHERE conditions based on the postback status.
I could not find a default "IsPostback" macro available, so you will have to create a custom macro that returns the current postback status.
Try these settings on your data source:
ORDER BY expression: age DESC, NEWID()
WHERE condition: DateDiff(day,DocumentCreatedWhen, GetDate()) >= 7
Columns: CASE WHEN DateDiff(day,DocumentCreatedWhen,GetDate()) = 7 THEN 1 ELSE 0 END AS age, *
This should mean that any document that is exactly 7 days old appears at the top of your list, ready for you to set Select top N to 1. All other documents more than 7 days old will just be ordered randomly by the NEWID() function.
Obviously, where the * is in the columns, you should specify the columns that you need rather than leaving in a wildcard for performance reasons.
I just ran this out on the Dancing Goat sample and it does what you need (assuming I've understood correctly).
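If it helps to see those three settings stitched together, the equivalent standalone SQL would be roughly the following (using the View_CONTENT_MenuItem_Joined view from the answer below purely as an example source, and listing only a few columns):
SELECT TOP 1
       CASE WHEN DATEDIFF(day, DocumentCreatedWhen, GETDATE()) = 7 THEN 1 ELSE 0 END AS age,
       DocumentName, DocumentUrlPath, DocumentCreatedWhen
FROM View_CONTENT_MenuItem_Joined
WHERE DATEDIFF(day, DocumentCreatedWhen, GETDATE()) >= 7
ORDER BY age DESC, NEWID()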
Edit:
Worth noting: anything that is exactly 7 days old will stay at the top until it is... well... no longer exactly 7 days old. To make that work properly, you would somehow need to track that the record has been shown so that you can then exclude it from the result set. I.e. your COLUMNS setting becomes something like this:
CASE
    WHEN (DateDiff(day, DocumentCreatedWhen, GetDate()) = 7 AND DocumentHasBeenShown = 0) THEN 1
    ELSE 0
END AS age
, *
What you need to use is a UNION:
SELECT TOP 1 * FROM
(
-- Get the latest record
Union
-- Get random record
) as Result
For example, if you are getting menu items:
SELECT TOP 1 * FROM
(
    -- latest for this week
    SELECT DocumentUrlPath, DocumentName, DocumentCreatedWhen FROM (
        SELECT TOP 1 DocumentName, DocumentUrlPath, DocumentCreatedWhen FROM View_CONTENT_MenuItem_Joined
        WHERE DATEDIFF(day, DocumentCreatedWhen, GETDATE()) <= 7 ORDER BY DocumentCreatedWhen DESC) as LatestForThisWeek
    UNION
    -- random
    SELECT DocumentUrlPath, DocumentName, DocumentCreatedWhen FROM (
        SELECT TOP 1 DocumentName, DocumentUrlPath, DocumentCreatedWhen FROM View_CONTENT_MenuItem_Joined
        ORDER BY NEWID()) as RandomizedRecords
) as Result
ORDER BY DocumentCreatedWhen DESC  -- prefer the latest-this-week record when one exists
There are lots of subqueries, but this should give you the idea :)

SQL: Get first entry in aggregation function?

I have a simple table:
ID - JID - AMOUNT
1 - 1 - 100
2 - 2 - 50
3 - 2 - -25
4 - 3 - 100
5 - 3 - -50
I want to end up with:
JID - FIRSTBALANCE
1 - 100
2 - 50
3 - 100
Because Firebird is so insanely difficult when it comes to aggregation, this doesn't work:
SELECT jid, amount as firstBalance
FROM table
GROUP BY jid
How can I get it so it groups by JID and automatically sets the value of firstBalance to the first value in the table?
It depends on what you mean by "automatically set the value of firstBalance to the first value in the table". From the example of the desired result, I take it you consider the row with the lowest ID value in each JID group to be the "first", so
SELECT DISTINCT JID,
(SELECT amount FROM table s WHERE s.JID = o.JID ORDER BY s.ID ROWS 1)
FROM table o
should work.
Firebird does not have a first() or a last() aggregate function. This has been requested and rejected by the team because of the ambiguity over which item would be chosen: you'd need to specify an ORDER BY clause for the items that get aggregated.
The answer you selected gets you the max(amount) not the first(amount). This is not what you asked for (though possibly it is what you wanted).
For future Googlers/Bingers here's how you get the first item. It's not a terrific solution, and it can be slow.
select distinct a.jid,
(select first 1 b.amount
from table b
where b.jid = a.jid
order by b.id) as amount
from table a
order by a.jid
It will retrieve the three JID values and, for each, the first amount found as determined by ID order.
Don't hold your breath for this to get built into Firebird. When asked about a positional aggregate in the past, the response was:
"I have a great deal of trouble with that concept because position isn't a relational concept and the introduction of positional operators will signficantly inhibit efforts to improve performance by performing operations in parallel."
This is what I was looking for:
SELECT jid, max(amount) as firstBalance
FROM table
GROUP BY jid

SQL Sum and Group By for a running Tally?

I'm completely rewriting my question to simplify it. Sorry if you read the prior version. (The previous version of this question included a very complex query example that created a distraction from what I really need.) I'm using SQL Express.
I have a table of lessons.
LessonID StudentID StudentName LengthInMinutes
1 1 Chuck 120
2 2 George 60
3 2 George 30
4 1 Chuck 60
5 1 Chuck 10
These would be ordered by date. (Of course the actual table is thousands of records with dates and other lesson-related data but this is a simplification.)
I need to query this table such that I get all rows (or a subset of rows by a date range or by student), but I need my query to add a new column we might call PriorLessonMinutes. That is, the sum of all minutes of all lessons for the same student in lessons of PRIOR dates only.
So the query would return:
LessonID StudentID StudentName LengthInMinutes PriorLessonMinutes
1 1 Chuck 120 0
2 2 George 60 0
3 2 George 30 60 (the sum of Length from row 2 only)
4 1 Chuck 60 120 (the sum of Length from row 1 only)
5 1 Chuck 10 180 (The sum of Length from rows 1 and 4)
In essence, I need a running tally of the sum of prior lesson minutes for each student. Ideally the tally shouldn't include the current row, but if it does, no big deal as I can do subtraction in the code that receives the query.
Further, (and this is important) if I retrieve only a subset of records, (for example by a date range) PriorLessonMinutes must be a sum that considers rows that are NOT returned.
My first idea was to use SUM() and GROUP BY the student, but that isn't right because, unless I'm mistaken, it would include a sum of minutes for all rows for each student, including rows that come after the current row, which aren't relevant to the sum I need.
OPTIONS I'M REJECTING: I could scan through all rows in the code that receives the results (although this would force me to retrieve all rows unnecessarily), but that's obviously inefficient. I could also put a real data field in there and populate it, but this too presents problems when other records are deleted or altered.
I have no idea how to write such a query together. Any guidance?
This is a great opportunity to use windowed aggregates. The catch is that you need SQL Server 2012 Express (or later). If you can get it, then this is the query you are looking for:
select *,
sum(LengthInMinutes)
over (partition by StudentId order by LessonId
rows between unbounded preceding and 1 preceding)
as PriorLessonMinutes
from Lessons
Note that it returns NULLs instead of 0s (zeroes). If you insist on zeroes, use the COALESCE function to turn the NULLs into zeroes.
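A minimal version of that, just wrapping the same window expression (nothing else changes):
select *,
       coalesce(
           sum(LengthInMinutes)
               over (partition by StudentId order by LessonId
                     rows between unbounded preceding and 1 preceding),
           0) as PriorLessonMinutes
from Lessons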
I suggest using a nested query to limit the number of rows returned:
select * from
(
select *,
sum(LengthInMinutes)
over (partition by StudentId order by LessonId
rows between unbounded preceding and 1 preceding)
as PriorLessonMinutes
from Lessons
) as NestedLessons
where LessonId > 3 -- this is an example of a filter
This way the filter is applied after the aggregation is complete.
Now, if you want to apply a filter that doesn't affect the aggregation (like only querying data for a certain student), you should apply the filter to the inner query, as pruning the rows that don't affect the computation early (like data for other students) will improve the performance.
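For instance (StudentId = 1 here is just a placeholder value), a per-student filter can safely go inside, because rows for other students never contribute to that student's running total, while the row-limiting filter stays outside:
select * from
(
    select *,
        sum(LengthInMinutes)
            over (partition by StudentId order by LessonId
                  rows between unbounded preceding and 1 preceding)
            as PriorLessonMinutes
    from Lessons
    where StudentId = 1      -- filter that cannot change the aggregation
) as NestedLessons
where LessonId > 3           -- filter that could, so it stays out here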
I feel the following code will serve your purpose. Check it:
select Students.StudentID, Students.First, Students.Last,
       sum(Lessons.LengthInMinutes) as TotalPriorMinutes
from Lessons
join Students on Lessons.StudentID = Students.StudentID
where Lessons.StartDateTime < getdate()
  and Lessons.StartDateTime >= '20090130 00:00:00'
  and Lessons.StartDateTime < '20790101 00:00:00'
group by Students.StudentID, Students.First, Students.Last