Calculating MAX(DATE) for Value Groups Where Values Go Back and Forth - db2

I have another challenge that I have not been able to solve yet. Here is the scenario. Pardon the formatting if it gets messed up when posting.
ACCT_NUM CERT_ID Code Date Desired Output
A 1 10 1/1/2007 1/1/2008
A 1 10 1/1/2008 1/1/2008
A 1 20 1/1/2009 1/1/2010
A 1 20 1/1/2010 1/1/2010
A 1 10 1/1/2011 1/1/2012
A 1 10 1/1/2012 1/1/2012
A 2 20 1/1/2007 1/1/2008
A 2 20 1/1/2008 1/1/2008
A 2 10 1/1/2009 1/1/2010
A 2 10 1/1/2010 1/1/2010
A 2 30 1/1/2011 1/1/2011
A 2 10 1/1/2012 1/1/2013
A 2 10 1/1/2013 1/1/2013
As you can see, I need to take the MAX of the date for each group of code values (within ACCT_NUM and CERT_ID) before the value changes. If the same value repeats later, I need to take the MAX of the date again for that group separately. For example, for CERT_ID '1', I cannot group all four rows of Code 10 to get a MAX date of 1/1/2012. I need the MAX for the first two rows and then another MAX for the last two rows separately, since there is another code in between. I am trying to accomplish this in Cognos Framework Manager.
Gurus, please advise.

The syntax for getting the max value for CERT_ID is:
maximum(Date for CERT_ID)
If you want additional levels for the max, you can use the following syntax:
maximum(Date for ACCT_NUM,CERT_ID,Code)
In general, it is best practice to group and summarize values in the report, not in Framework Manager.
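Note that grouping by ACCT_NUM, CERT_ID, and Code alone would merge the two separate runs of Code 10 for CERT_ID 1. One way to picture the logic the question asks for is to start a new group every time the code changes within an account and certificate. The following is a minimal Python sketch of that idea, outside Cognos; the function name `max_date_per_run` and the tuple layout are illustrative assumptions, not Cognos or DB2 features.

```python
from datetime import date
from itertools import groupby

# Sample rows from the question: (ACCT_NUM, CERT_ID, Code, Date),
# assumed to be sorted by account, certificate, and date.
rows = [
    ("A", 1, 10, date(2007, 1, 1)),
    ("A", 1, 10, date(2008, 1, 1)),
    ("A", 1, 20, date(2009, 1, 1)),
    ("A", 1, 20, date(2010, 1, 1)),
    ("A", 1, 10, date(2011, 1, 1)),
    ("A", 1, 10, date(2012, 1, 1)),
    ("A", 2, 20, date(2007, 1, 1)),
    ("A", 2, 20, date(2008, 1, 1)),
    ("A", 2, 10, date(2009, 1, 1)),
    ("A", 2, 10, date(2010, 1, 1)),
    ("A", 2, 30, date(2011, 1, 1)),
    ("A", 2, 10, date(2012, 1, 1)),
    ("A", 2, 10, date(2013, 1, 1)),
]


def max_date_per_run(rows):
    """Attach to each row the max date of its consecutive run of equal codes."""
    result = []
    # groupby only merges *adjacent* rows with the same key, so a code that
    # reappears after a different code starts a new group.
    for _, run in groupby(rows, key=lambda r: (r[0], r[1], r[2])):
        run = list(run)
        run_max = max(r[3] for r in run)
        result.extend(r + (run_max,) for r in run)
    return result


for row in max_date_per_run(rows):
    print(row)
```

In SQL the same idea is usually described as a "gaps and islands" problem; the sketch above only shows the grouping rule, not a Cognos expression.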

Related

Tableau display 3 measures in 1 column

I am building a report in Tableau with 4 fields. 3 of these fields are measure fields. This is what it looks like.
Name Sum Profit Loss
Emmy 15 10 5
Sara 23 18 2
Dave 10 1 2
But, I want it to look like this. I want a new column called metrics that pulls in those 3 values.
Metrics
Emmy 15
10
5
Sara 23
18
2
Dave 10
1
2
May I have some guidance on how I can approach this?
The easiest way to solve this is to bring "Measure Names" to Rows.
You can hide the header names of the measures as well. Hope that helps.

How to average monthly data using a specific method in Matlab?

I have the following vector of monthly values (vectorA). I put the date-related info next to it to help illustrate the task, but I work with just the vector itself.
dates month_in_q vectorA
31/01/2020 1 10
29/02/2020 2 15
31/03/2020 3 6
30/04/2020 1 8
31/05/2020 2 4
30/06/2020 3 3
How can I create a new vectorNEW according to this algorithm?
In each quarter, the first month is the original first month.
In each quarter, the second month is the average of the first and second months.
In each quarter, the third month is the average of all three months.
So that I get the following vectorNEW by manipulating the original vectorA in a loop, given the recurring pattern above:
dates month_in_q vectorA vectorNEW
31/01/2020 1 10 10
29/02/2020 2 15 AVG(10+15)
31/03/2020 3 6 AVG(10+15+6)
30/04/2020 1 8 8
31/05/2020 2 4 AVG(8+4)
30/06/2020 3 3 AVG(8+4+3)
... ... ... ...
An elegant solution was provided by the user dpb on the MathWorks website.
vectorNEW=reshape(cumsum(reshape(vectorA,3,[]))./[1:3].',[],1);
Further info below
https://uk.mathworks.com/matlabcentral/answers/823055-how-to-average-monthly-data-using-a-specific-method-in-matlab
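For readers who want to see the same calculation spelled out as a loop, here is a minimal Python sketch. It assumes, as in the question, that the vector starts at the first month of a quarter; the function name `cumulative_quarter_average` is illustrative.

```python
def cumulative_quarter_average(values):
    """Running mean that restarts at the first month of every quarter."""
    result = []
    running_sum = 0.0
    for i, v in enumerate(values):
        position = i % 3          # 0, 1, 2 = first, second, third month
        if position == 0:
            running_sum = 0.0     # new quarter: reset the running total
        running_sum += v
        result.append(running_sum / (position + 1))
    return result


vector_a = [10, 15, 6, 8, 4, 3]
vector_new = cumulative_quarter_average(vector_a)
print(vector_new)
```

The MATLAB one-liner does the same thing without a loop: it reshapes the vector into one column per quarter, takes a cumulative sum down each column, and divides row k by k.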

Rolling N monthly average in Redshift with multiple entries per month

I would like to use Redshift's window aggregation functions to create an 'N' month rolling average of some data. The data will have multiple unique entries per any given month. If possible, I'd like to avoid first grouping by and averaging within months before performing rolling average as this is taking an average of an average and not ideal (as this post does: 3 Month Moving Average - Redshift SQL).
This is a sample dataset of just one account (there will be more than 1).
Quote Date Account. Value
3/24/2015 acme. 3
3/25/2015 acme. 7
4/1/2015 acme. 12
4/3/2015 acme. 17
5/15/2015 acme. 1
6/30/2015 acme. 3
7/30/2015 acme. 9
And this is what I would like the result to look like for a 3 month rolling average (for an example).
Quote_Date Account. Value Month 3M_Rolling_Average
3/24/2015 acme. 3 1 3
3/25/2015 acme. 7 1 5
4/1/2015 acme. 12 2 7.33
4/3/2015 acme. 17 2 9.75
5/15/2015 acme. 1 3 8
6/30/2015 acme. 3 4 8.25
7/30/2015 acme. 9 5 4.33
The code I have tried looks like this:
avg(Value) over (partition by Account order by Quote Date rows between 2 preceding and current row)
But this only operates over the previous 2 rows plus the current row, which would work if I had one unique value for each month, but, as stated, that is not the case. I am open to any kind of ranking solution or nested partitioning. Any help is greatly appreciated.
Since an average is just sum() / count(), you just need to group by month but keep the sum() and the count(). Then use a window function such as lag to add up 3 months of sums, and divide by the total of 3 months of counts. You are correct that an average of averages is not correct, but if you carry the sums and counts, things work.
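The arithmetic behind that suggestion can be checked outside the database. Below is a minimal Python sketch, not Redshift SQL, that carries per-month sums and counts and then divides the windowed totals; the function name `rolling_monthly_average` and the tuple layout are assumptions made for illustration. It produces one value per account and month (the value the question shows on the last row of each month).

```python
from collections import defaultdict

# Sample data from the question: ((year, month, day), account, value).
quotes = [
    ((2015, 3, 24), "acme", 3),
    ((2015, 3, 25), "acme", 7),
    ((2015, 4, 1), "acme", 12),
    ((2015, 4, 3), "acme", 17),
    ((2015, 5, 15), "acme", 1),
    ((2015, 6, 30), "acme", 3),
    ((2015, 7, 30), "acme", 9),
]


def rolling_monthly_average(quotes, n_months=3):
    """Return {(account, year, month): average over the last n_months}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    # Step 1: per account and month, carry the sum and the count.
    for (year, month, _day), account, value in quotes:
        key = (account, year * 12 + (month - 1))  # months as a running index
        sums[key] += value
        counts[key] += 1
    # Step 2: for each month, add up the sums and counts in the window,
    # then divide once, so no average of averages is taken.
    result = {}
    for account, idx in sorted(sums):
        window = [(account, idx - k) for k in range(n_months)]
        total = sum(sums[w] for w in window if w in sums)
        n = sum(counts[w] for w in window if w in counts)
        result[(account, idx // 12, idx % 12 + 1)] = total / n
    return result


for key, avg in rolling_monthly_average(quotes).items():
    print(key, round(avg, 2))
```

Because the window is defined on a running month index rather than on row positions, a month with no quotes simply contributes nothing instead of shifting the window.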

remove a lesser duplicate

In KDB, I have the following table:
q)tab:flip `items`sales`prices!(`nut`bolt`cam`cog`bolt`screw;6 8 0 3 0n 0n;10 20 15 20 0n 0n)
q)tab
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
bolt
screw
In this table, there are 2 duplicate items (bolt). However, since the first 'bolt' contains more information, I would like to remove the 'lesser' bolt.
FINAL RESULT:
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
screw
As far as I understand, if I use the 'distinct' function, it's not deterministic?
One way to do it is to fill forward by item, then bolt will inherit the previous values.
q)update fills sales,fills prices by items from tab
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
bolt 8 20
screw
This can also be done in functional form where you can pass the table and by columns:
{![x;();(!). 2#enlist(),y;{x!fills,/:x}cols[x]except y]}[tab;`items]
If "more information" means "fewest nulls", then you could count the number of nulls in each row and only return those rows by item that contain the fewest:
q)select from @[tab;`n;:;sum each null tab] where n=(min;n)fby items
items sales prices n
--------------------
nut 6 10 0
bolt 8 20 0
cam 0 15 0
cog 3 20 0
screw 2
However, I would not recommend this approach, as it requires working with rows rather than columns.
Because those two rows contain different data, they are considered distinct.
It depends on how you define "more information". You would probably need to provide more examples, but some possibilities:
Delete rows with null sales value
q)delete from tab where null sales
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
Retrieve rows with the max sales*prices value for each item
q)select from tab where (sales*prices) = (max;sales*prices) fby items
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
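To make the "fewest nulls per item" rule from the first answer concrete, here is a minimal Python sketch of the same selection. It is not q code; `None` stands in for a kdb null, and the function name `keep_fewest_nulls` is illustrative.

```python
# Table from the question as a list of rows; None stands in for a kdb null.
tab = [
    {"items": "nut", "sales": 6, "prices": 10},
    {"items": "bolt", "sales": 8, "prices": 20},
    {"items": "cam", "sales": 0, "prices": 15},
    {"items": "cog", "sales": 3, "prices": 20},
    {"items": "bolt", "sales": None, "prices": None},
    {"items": "screw", "sales": None, "prices": None},
]


def keep_fewest_nulls(rows, key="items"):
    """Keep, for each key value, only the rows with the fewest missing fields."""
    def null_count(row):
        return sum(value is None for value in row.values())

    # First pass: smallest null count seen for each item.
    best = {}
    for row in rows:
        k = row[key]
        best[k] = min(best.get(k, null_count(row)), null_count(row))
    # Second pass: keep rows that match their item's smallest null count.
    return [row for row in rows if null_count(row) == best[row[key]]]


for row in keep_fewest_nulls(tab):
    print(row)
```

On the sample table this keeps the complete bolt row, drops the empty one, and still keeps screw, since its only row is the one with the fewest nulls for that item.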

How to convert hourly rainfall data into daily rainfall

I have hourly rainfall and other data for a long period. I would like to get daily values from these hourly data. My daily values should run from hour 1 to hour 24.
Year Month Day Hour Rain RH Temp
1976 1 1 1 3.4 60 16
1976 1 1 2 0 80 18
1976 1 1 3 NaN 50 18
First, get rid of the NaNs, because e.g. 1+NaN=NaN, so a single NaN destroys any summation, average, etc.
Then you can do for example simply this:
% M_hourly is a matrix with rows [Year Month Day Hour Rain RH Temp]
[H,W]=size(M_hourly);
nDays=floor(H/24);            % number of complete days
Rain_daily=zeros(nDays,1);    % preallocate instead of growing in the loop
for d=1:nDays
    rows=(d-1)*24+(1:24);     % hours 1 to 24 of day d
    Rain_daily(d)=sum(M_hourly(rows,5));
end
Rain_daily is preallocated because it would otherwise grow inside the loop.
I could be more helpful if you asked a more specific question; I am just guessing what the problem is.
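The same block-by-block summation can be sketched in Python. This is a minimal illustration that assumes the data start at hour 1 and contain exactly 24 rows per day; it skips NaN entries when summing instead of deleting rows, so the 24-hour alignment is preserved. The function name `daily_totals` and the second day of sample values are made up for the example.

```python
import math


def daily_totals(hourly, hours_per_day=24):
    """Sum consecutive blocks of hourly values, skipping NaN entries."""
    n_days = len(hourly) // hours_per_day      # ignore a trailing partial day
    totals = []
    for d in range(n_days):
        block = hourly[d * hours_per_day:(d + 1) * hours_per_day]
        totals.append(sum(x for x in block if not math.isnan(x)))
    return totals


# Day 1 starts with the three rain values from the question, padded with
# zeros; day 2 is an illustrative constant 1.0 per hour.
rain = [3.4, 0.0, float("nan")] + [0.0] * 21 + [1.0] * 24
print(daily_totals(rain))
```

For columns such as RH or Temp, a daily mean over the non-NaN values would usually make more sense than a sum.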