SQL: Get first entry in aggregation function? - firebird

I have a simple table:
ID - JID - AMOUNT
1 - 1 - 100
2 - 2 - 50
3 - 2 - -25
4 - 3 - 100
5 - 3 - -50
I want to end up with:
JID - FIRSTBALANCE
1 - 100
2 - 50
3 - 100
Because Firebird is so insanely difficult when it comes to aggregation, this doesn't work:
SELECT jid, amount as firstBalance
FROM table
GROUP BY jid
How can I get it so it groups by JID, and automatically set the value of firstbalance to the first value in the table?

It depends on what you mean by "automatically set the value of firstbalance to the first value in the table". From the example of the desired result, I assume you consider the row with the lowest ID value for a given JID group to be the "first", so
SELECT DISTINCT JID,
(SELECT amount FROM table s WHERE s.JID = o.JID ORDER BY s.ID ROWS 1)
FROM table o
should work.

Firebird does not have a first() or a last() aggregate function. This has been requested and turned down by the team because it is ambiguous which item would be chosen: you'd need to specify an order by clause for the items that get aggregated.
The answer you selected gets you the max(amount) not the first(amount). This is not what you asked for (though possibly it is what you wanted).
For future Googlers/Bingers here's how you get the first item. It's not a terrific solution, and it can be slow.
select distinct a.jid,
(select first 1 b.amount
from table b
where b.jid = a.jid
order by b.id) as amount
from table a
order by a.jid
It will retrieve the three distinct JID values and, for each, the first amount as determined by ID order.
Don't hold your breath for this to get built into Firebird. When asked about a positional aggregate in the past, the response was:
"I have a great deal of trouble with that concept because position isn't a relational concept and the introduction of positional operators will significantly inhibit efforts to improve performance by performing operations in parallel."
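To illustrate the difference between "first amount by ID" and max(amount), here is a small sketch. It uses Python's sqlite3 module as a stand-in for Firebird (so LIMIT 1 plays the role of FIRST 1 / ROWS 1), the table name entries is made up, and row 6 is a hypothetical extra row that is not in the question. On the question's original five rows the two queries happen to return the same result; the extra row makes them diverge.

```python
import sqlite3

# SQLite stands in for Firebird here; LIMIT 1 replaces FIRST 1 / ROWS 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, jid INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        (1, 1, 100), (2, 2, 50), (3, 2, -25), (4, 3, 100), (5, 3, -50),
        (6, 2, 75),  # hypothetical extra row, not part of the question
    ],
)

# "First" amount per jid, where first means lowest id within the group.
first_sql = """
    SELECT DISTINCT a.jid,
           (SELECT b.amount
              FROM entries b
             WHERE b.jid = a.jid
             ORDER BY b.id
             LIMIT 1) AS first_balance
      FROM entries a
     ORDER BY a.jid
"""
# Largest amount per jid, which is what the accepted answer computes.
max_sql = "SELECT jid, MAX(amount) FROM entries GROUP BY jid ORDER BY jid"

first_rows = conn.execute(first_sql).fetchall()
max_rows = conn.execute(max_sql).fetchall()
print(first_rows)  # [(1, 100), (2, 50), (3, 100)]
print(max_rows)    # [(1, 100), (2, 75), (3, 100)]
```

For JID 2 the first amount is still 50, while the maximum becomes 75.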

This is what I was looking for:
SELECT jid, max(amount) as firstBalance
FROM table
GROUP BY jid

Related

Tableau: Distinct count of a field which occurs more than once

I have a field customer_id and I need to track the number of unique users and repeat users. For example the table is as below:
customer_id
11
22
33
11
44
22
Here, the no. of unique users is 4 (11,22,33,44) and number of repeat users are 2 (11,22).
I am calculating unique users as COUNTD([customer_id]).
How can I calculate repeat users? It is basically the distinct count of the values which appear more than once. I tried with the following expression:
COUNTD(IF COUNT([customer_id]) > 1
THEN [customer_id]
END)
but I'm getting an error: Cannot mix aggregate and non-aggregate arguments comparisons or results in IF expressions
How else can I calculate the repeat users?
Thanks in advance.
According to your filter needs, you can rely on LOD using FIXED/INCLUDE:
{ FIXED [Customer Id] : if sum({ FIXED [Customer Id] : COUNT([Customer Id])}) > 1 then 1 end }
Basically, in the inner LOD you count the occurrences, and then you only take into consideration the records having 2+ (>1) of them.
A simple alternative to Fabio's answer can also do the job. Just create a calculated field
COUNT([customer id]) >1
and add this to filter shelf.
You can then filter out the False values to remove one-time users and keep returning customers only.
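The logic behind both answers (count the occurrences per customer first, then keep the customers with more than one) can be sketched outside Tableau. This is plain Python on the question's sample data, not a Tableau calculation; the variable names are illustrative only.

```python
from collections import Counter

# Sample data from the question.
customer_ids = [11, 22, 33, 11, 44, 22]

# Step 1: occurrences per customer (the role of the inner FIXED LOD).
occurrences = Counter(customer_ids)

# Step 2: distinct customers, and distinct customers seen more than once.
unique_users = len(occurrences)
repeat_users = sum(1 for count in occurrences.values() if count > 1)

print(unique_users, repeat_users)  # 4 2
```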

Applying the same row id based on partition of ROW_NUMBER

I'm trying to number rows sequentially based on the DOC value. The only condition I have is that if the document is the same (like the last three rows), it should get the same row number.
SELECT [DOC], ROW_NUMBER() OVER(PARTITION BY [DOC] ORDER BY [DOC])
FROM [rowset_TST]
What I get is exactly the opposite: the last three docs, which are the same, are numbered and the rest are not. ZZB, which is not unique, should get the same row number.
ABC 1
DBS 1
DDS 1
SBC 1
SSC 1
ZZB 1
ZZB 2
ZZB 3
Any advice highly appreciated - please tell me if this would be doable with ROW_NUMBER()
Regards, Luke
I think you intend to use a rank function here, possibly DENSE_RANK:
SELECT [DOC], DENSE_RANK() OVER (ORDER BY [DOC])
FROM [rowset_TST]
ORDER BY [DOC];
This would generate the following output, assuming your same sample data:
ABC 1
DBS 2
DDS 3
SBC 4
SSC 5
ZZB 6
ZZB 6
ZZB 6
This is the only interpretation of your question/requirement which came to mind and makes any sense. If you really want every doc, whether occurring one or more times, to have a "row number" value of 1, then I don't see the point in using ROW_NUMBER.
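As a rough way to compare the two functions side by side, the sketch below runs both on the question's sample values. It uses Python's sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or later), and the column aliases rn and dr are made up for the example.

```python
import sqlite3

# SQLite stands in for SQL Server; window functions need SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rowset_TST (DOC TEXT)")
conn.executemany(
    "INSERT INTO rowset_TST VALUES (?)",
    [("ABC",), ("DBS",), ("DDS",), ("SBC",), ("SSC",), ("ZZB",), ("ZZB",), ("ZZB",)],
)

rows = conn.execute(
    """
    SELECT DOC,
           ROW_NUMBER() OVER (PARTITION BY DOC ORDER BY DOC) AS rn,  -- restarts per DOC
           DENSE_RANK() OVER (ORDER BY DOC) AS dr                    -- same value for ties
      FROM rowset_TST
     ORDER BY DOC, rn
    """
).fetchall()

for row in rows:
    print(row)
```

ROW_NUMBER with PARTITION BY restarts at 1 for every distinct DOC, which is why only the duplicates get numbered; DENSE_RANK without a partition gives each distinct DOC the next number and repeats it for duplicates.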

SQL - using the Min field to achieve desired result

I'm wondering what the best SQL is to handle the situation below. The client only wants to see invoices that have been declined. I started by showing only rows where STATUS_ID = 2, but then realized that an invoice can be paid after it was resubmitted and accepted, so that didn't work. What is the best way to handle two records like the ones below, where I don't want the SQL to return any records if the manifest + order code combination has a STATUS_ID of 1? Would you do a MIN on STATUS_ID or something of that nature?
VENDOR NAME manifest ORDER_CODE STATUS_ID
VENDOR 12345 BHGSDKJF1234 RU07 2 (invoice decline)
VENDOR 12345 BHGSDKJF1234 RU07 1 (paid)
This trick can work for you in this case, but it doesn't solve the general case (what happens if the STATUS_ID for paid is 3, and the possible values are 0-5?).
More generally, you can use a CASE expression that gives 1 (true) if the row has STATUS_ID = 1 and 0 otherwise, then take the MAX() for each invoice. An invoice whose maximum is 0 was never paid.
You can also consider another design that might work for you:
Add a time or timestamp column (for your purposes, you could perhaps store SYSDATE as the insertion time of the record).
Once you have a time column, you can pick the latest STATUS_ID for each invoice (the STATUS_ID in the row with the max time).
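The CASE + MAX idea could look roughly like the following. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than the asker's database, the table name invoices is made up, and the second invoice (ZZZZ0000 / RU08) is a hypothetical declined-only invoice added so the query has something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (manifest TEXT, order_code TEXT, status_id INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        ("BHGSDKJF1234", "RU07", 2),  # declined ...
        ("BHGSDKJF1234", "RU07", 1),  # ... then resubmitted and paid
        ("ZZZZ0000", "RU08", 2),      # hypothetical: declined and never paid
    ],
)

# Flag paid rows with 1, everything else with 0, and keep only the
# invoices whose flag never reaches 1.
declined_only = conn.execute(
    """
    SELECT manifest, order_code
      FROM invoices
     GROUP BY manifest, order_code
    HAVING MAX(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) = 0
    """
).fetchall()

print(declined_only)  # [('ZZZZ0000', 'RU08')]
```

Unlike MIN(STATUS_ID), this does not depend on the paid status having the smallest numeric value.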

SQL Sum and Group By for a running Tally?

I'm completely rewriting my question to simplify it. Sorry if you read the prior version. (The previous version of this question included a very complex query example that created a distraction from what I really need.) I'm using SQL Express.
I have a table of lessons.
LessonID StudentID StudentName LengthInMinutes
1 1 Chuck 120
2 2 George 60
3 2 George 30
4 1 Chuck 60
5 1 Chuck 10
These would be ordered by date. (Of course the actual table is thousands of records with dates and other lesson-related data but this is a simplification.)
I need to query this table such that I get all rows (or a subset of rows by a date range or by student), but I need my query to add a new column we might call PriorLessonMinutes. That is, the sum of all minutes of all lessons for the same student in lessons of PRIOR dates only.
So the query would return:
LessonID StudentID StudentName LengthInMinutes PriorLessonMinutes
1 1 Chuck 120 0
2 2 George 60 0
3 2 George 30 60 (The sum Length from row 2 only)
4 1 Chuck 60 120 (The sum Length from row 1 only)
5 1 Chuck 10 180 (The sum of Length from rows 1 and 4)
In essence, I need a running tally of the sum of prior lesson minutes for each student. Ideally the tally shouldn't include the current row, but if it does, no big deal as I can do subtraction in the code that receives the query.
Further, (and this is important) if I retrieve only a subset of records, (for example by a date range) PriorLessonMinutes must be a sum that considers rows that are NOT returned.
My first idea was to use SUM() and to GROUP BY Student, but that isn't right because unless I'm mistaken it would include a sum of minutes for all rows for each student, including rows that come after the row which aren't relevant to the sum I need.
OPTIONS I'M REJECTING: I could scan through all rows in my code that receives it, (although this would force me to retrieve all rows unnecessarily) but that's obviously inefficient. I could also put a real data field in there and populate it, but this too presents problems when other records are deleted or altered.
I have no idea how to write such a query together. Any guidance?
This is a great opportunity to use windowed aggregates. The catch is that you need SQL Server 2012 Express or later. If you can get it, then this is the query you are looking for:
select *,
sum(LengthInMinutes)
over (partition by StudentId order by LessonId
rows between unbounded preceding and 1 preceding)
as PriorLessonMinutes
from Lessons
Note that it returns NULLs instead of 0s (zeroes) for each student's first lesson. If you insist on zeroes, use the COALESCE function to turn NULLs into zeroes.
I suggest using a nested query to limit the number of rows returned:
select * from
(
select *,
sum(LengthInMinutes)
over (partition by StudentId order by LessonId
rows between unbounded preceding and 1 preceding)
as PriorLessonMinutes
from Lessons
) as NestedLessons
where LessonId > 3 -- this is an example of a filter
This way the filter is applied after the aggregation is complete.
Now, if you want to apply a filter that doesn't affect the aggregation (like only querying data for a certain student), you should apply the filter to the inner query, as pruning the rows that don't affect the computation early (like data for other students) will improve the performance.
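To check the behaviour on the question's sample rows, here is a minimal sketch that runs the same windowed query on SQLite via Python's sqlite3 module (an assumption made for convenience; SQLite 3.25 or later accepts the same frame clause). COALESCE is added to turn the NULLs into zeroes, and the helper name prior_minutes is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Lessons (LessonID INTEGER PRIMARY KEY, StudentID INTEGER, "
    "StudentName TEXT, LengthInMinutes INTEGER)"
)
conn.executemany(
    "INSERT INTO Lessons VALUES (?, ?, ?, ?)",
    [(1, 1, "Chuck", 120), (2, 2, "George", 60), (3, 2, "George", 30),
     (4, 1, "Chuck", 60), (5, 1, "Chuck", 10)],
)

def prior_minutes(min_lesson_id):
    """Return (LessonID, PriorLessonMinutes) for lessons above min_lesson_id."""
    return conn.execute(
        """
        SELECT LessonID, PriorLessonMinutes
          FROM (SELECT LessonID,
                       COALESCE(SUM(LengthInMinutes) OVER (
                           PARTITION BY StudentID ORDER BY LessonID
                           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0)
                       AS PriorLessonMinutes
                  FROM Lessons) AS NestedLessons
         WHERE LessonID > ?   -- filter applied after the window is computed
         ORDER BY LessonID
        """,
        (min_lesson_id,),
    ).fetchall()

all_rows = prior_minutes(0)
filtered_rows = prior_minutes(3)
print(all_rows)       # [(1, 0), (2, 0), (3, 60), (4, 120), (5, 180)]
print(filtered_rows)  # [(4, 120), (5, 180)]
```

The filtered result still counts lessons 1 to 3, because the outer WHERE runs only after the inner query has computed the running sum.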
I think the following code will serve your purpose. Check it:
select Students.StudentID ,Students.First, Students.Last,sum(Lessons.LengthInMinutes)
as TotalPriorMinutes from lessons,students
where Lessons.StartDateTime < getdate()
and Lessons.StudentID = Students.StudentID
and StartDateTime >= '20090130 00:00:00' and StartDateTime < '20790101 00:00:00'
group by Students.StudentID ,Students.First, Students.Last

SQL Server 2008: Pivot column with no aggregate function workaround

Yes, I know this question has been asked MANY times, but after reading all the posts I found that there wasn't an answer that fits my need. So, here's my question. I would like to take a column of values and pivot them into rows of 6 columns.
I want to take this:
G
081278
12
00123535
John Doe
123456
And turn it into this:
Letter Date Code Ammount Name Account
G 081278 12 00123535 John Doe 123456
I have 110000 values in this one column in one table called TempTable. I need all the values displayed because each row is an entity to itself. For instance, There is one unique entry for all of the Letter, Date, Code, Ammount, Name, and Account columns. I understand that the aggregate function is required but is there a workaround that will allow me to get this desired result?
Just use a MAX aggregate
If one row = one column (per group of 6 rows) then MAX of a single value = that row value.
However, the data you've posted is insufficient. I don't see anything to:
associate the 6 rows per group
distinguish whether a row is "Letter" or "Name"
There is no implicit row order or number to rely upon to generate the groups
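If the table did have something to order by, the MAX approach could look like the sketch below. Everything beyond the six sample values is an assumption: the seq column (a strictly increasing row number from 1) does not exist in the question, the second group of six values is invented, and SQLite via Python's sqlite3 module stands in for SQL Server 2008.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical ordering column; the question's table has none.
conn.execute("CREATE TABLE TempTable (seq INTEGER PRIMARY KEY, val TEXT)")
values = [
    "G", "081278", "12", "00123535", "John Doe", "123456",
    "H", "091278", "13", "00123536", "Jane Roe", "654321",  # invented group
]
conn.executemany("INSERT INTO TempTable VALUES (?, ?)", enumerate(values, start=1))

# (seq - 1) / 6 identifies the group, (seq - 1) % 6 the position inside it.
# MAX over a single non-NULL value simply returns that value.
pivoted = conn.execute(
    """
    SELECT MAX(CASE WHEN (seq - 1) % 6 = 0 THEN val END) AS Letter,
           MAX(CASE WHEN (seq - 1) % 6 = 1 THEN val END) AS Date,
           MAX(CASE WHEN (seq - 1) % 6 = 2 THEN val END) AS Code,
           MAX(CASE WHEN (seq - 1) % 6 = 3 THEN val END) AS Ammount,
           MAX(CASE WHEN (seq - 1) % 6 = 4 THEN val END) AS Name,
           MAX(CASE WHEN (seq - 1) % 6 = 5 THEN val END) AS Account
      FROM TempTable
     GROUP BY (seq - 1) / 6
     ORDER BY MIN(seq)
    """
).fetchall()

for row in pivoted:
    print(row)
```

Without such an ordering column there is no reliable way to decide which six rows belong together, which is the point made above.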
Unfortunately, the max columns in a SQL 2008 select statement is 4,096 as per MSDN Max Capacity.
Instead of using a pivot, you might consider dynamic SQL to get what you want to do.
Declare @SQLColumns nvarchar(max), @SQL nvarchar(max)
-- Build a comma-separated list of quoted values: 'v1','v2',...
select @SQLColumns=(select ''''+ColName+''',' from TableName for XML Path(''))
-- Strip the trailing comma
set @SQLColumns=left(@SQLColumns,len(@SQLColumns)-1)
set @SQL='Select '+@SQLColumns
exec sp_ExecuteSQL @SQL