PowerBI GROUPBY date AND identifier

I am new to PowerBI and I am trying to group by the max date within the quarter, and then by unique identifier. Is this possible? My dataset looks like:
Date       CompanyID  Sales
3/31/2018  1          100
3/31/2018  2          200
3/31/2018  3          100
6/30/2018  2          300
3/31/2018  4          100
2/28/2018  4          75
1/31/2018  4          50
6/30/2018  4          200
I'm hoping to get:
Date       CompanyID  Sales
3/31/2018  1          100
3/31/2018  2          200
3/31/2018  3          100
3/31/2018  4          100
6/30/2018  2          300
6/30/2018  4          200
Appreciate any help here! Thank you!

In Power Query (Home => Transform):
1. Add extra columns for Quarter and Year.
2. Group by ID / Quarter / Year.
3. Extract the maximum date from each subtable, along with the associated Sales number.
let
    //Change code in next line to reflect however you are reading in your data,
    // or refer to the table you already have
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"CompanyID", Int64.Type}, {"Sales", Currency.Type}}),
    //add custom columns for Quarter and Year
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Year", each Date.Year([Date])),
    //Group by ID / Quarter / Year
    //Then return the last date for each subtable, and the corresponding Sales figure
    #"Grouped Rows" = Table.Group(#"Added Custom1", {"CompanyID", "Quarter", "Year"}, {
        {"Date", each List.Max([Date]), type date},
        {"Sales", (t)=>Table.SelectRows(t, each [Date] = List.Max(t[Date]))[Sales]{0}, Currency.Type}
    }),
    //Remove the Quarter and Year Column
    #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Quarter", "Year"}),
    //Set the desired order
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Date", "CompanyID", "Sales"}),
    //Sort by Date and Company ID Ascending
    #"Sorted Rows" = Table.Sort(#"Reordered Columns",{
        {"Date", Order.Ascending},
        {"CompanyID", Order.Ascending}
    })
in
    #"Sorted Rows"
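To sanity-check the grouping logic outside Power Query, here is a minimal Python sketch of the same idea: bucket rows by CompanyID, year and quarter, and keep the row with the latest date in each bucket. The function name `latest_per_quarter` and the tuple layout are assumptions made for illustration; they are not part of the M code above.

```python
from datetime import date

# Sample rows from the question: (Date, CompanyID, Sales)
rows = [
    (date(2018, 3, 31), 1, 100),
    (date(2018, 3, 31), 2, 200),
    (date(2018, 3, 31), 3, 100),
    (date(2018, 6, 30), 2, 300),
    (date(2018, 3, 31), 4, 100),
    (date(2018, 2, 28), 4, 75),
    (date(2018, 1, 31), 4, 50),
    (date(2018, 6, 30), 4, 200),
]


def latest_per_quarter(rows):
    """Keep, for each (CompanyID, year, quarter), the row with the latest date."""
    best = {}
    for day, company, sales in rows:
        quarter = (day.month - 1) // 3 + 1
        key = (company, day.year, quarter)
        if key not in best or day > best[key][0]:
            best[key] = (day, company, sales)
    # Sort by date, then company, mirroring the final Table.Sort step.
    return sorted(best.values())


result = latest_per_quarter(rows)
for day, company, sales in result:
    print(day.isoformat(), company, sales)
```

On the sample data this keeps six rows, matching the output the question asks for.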

Related

MySQL SELECT MIN and MAX RIGHT JOIN numeric value of the last 30 days

I need a query to return the initial and final numeric value of the number of listeners of some artists of the last 30 days ordered from the highest increase of listeners to the lowest.
To better understand what I mean, here are the tables involved.
The artist table stores the information of a Spotify artist.
id  name      Spotify_id
1   Shakira   0EmeFodog0BfCgMzAIvKQp
2   Bizarrap  716NhGYqD1jl2wI1Qkgq36
The platform_information table stores which information I want to get about the artists and on which platform.
id  platform  information
1   spotify   monthly_listeners
2   spotify   followers
The platform_information_artist table stores the value of a given piece of platform information for each artist on a specific date.
id  platform_information_id  artist_id  date        value
1   1                        1          2022-11-01  100000
2   1                        1          2022-11-15  101000
3   1                        1          2022-11-30  102000
4   1                        2          2022-11-02  85000
5   1                        2          2022-11-06  90000
6   1                        2          2022-11-26  100000
Right now I have this query:
SELECT (SELECT value
        FROM platform_information_artist
        WHERE artist_id = 1
          AND platform_information_id =
              (SELECT id FROM platform_information WHERE platform = 'spotify' AND information = 'monthly_listeners')
          AND DATE(date) >= DATE(NOW()) - INTERVAL 30 DAY
        ORDER BY date ASC
        LIMIT 1) AS month_start,
       (SELECT value
        FROM platform_information_artist
        WHERE artist_id = 1
          AND platform_information_id =
              (SELECT id FROM platform_information WHERE platform = 'spotify' AND information = 'monthly_listeners')
          AND DATE(date) >= DATE(NOW()) - INTERVAL 30 DAY
        ORDER BY date DESC
        LIMIT 1) AS month_end,
       (SELECT month_end - month_start) AS difference
ORDER BY month_start;
Which returns the following:
month_start  month_end  difference
100000       102000     2000
The problem is that this query only returns the artist I specify.
And I need the information like this:
artist_id  name      platform_information_id  month_start_value  month_end_value  difference
2          Bizarrap  1                        85000              100000           15000
1          Shakira   1                        100000             102000           2000
The query should return the 5 artists that have grown the most in number of monthly listeners over the last 30 days, along with the starting value 30 days ago, and the current value.
Thanks for the help.
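One possible approach, offered as an untested-against-MySQL sketch rather than a definitive answer: find each artist's first and last reading date inside the window, join back to fetch the two values, and order by the difference. The example below runs the query through Python's built-in sqlite3 module so it is self-contained. The fixed reference date `2022-11-30`, the hard-coded `platform_information_id = 1` (Spotify monthly listeners in the sample data), and the SQLite `date(:today, '-30 days')` expression are assumptions; in MySQL the cutoff would be written along the lines of `CURDATE() - INTERVAL 30 DAY`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE platform_information_artist (
        id INTEGER PRIMARY KEY,
        platform_information_id INTEGER,
        artist_id INTEGER,
        date TEXT,
        value INTEGER
    );
    INSERT INTO artist VALUES (1, 'Shakira'), (2, 'Bizarrap');
    INSERT INTO platform_information_artist VALUES
        (1, 1, 1, '2022-11-01', 100000),
        (2, 1, 1, '2022-11-15', 101000),
        (3, 1, 1, '2022-11-30', 102000),
        (4, 1, 2, '2022-11-02', 85000),
        (5, 1, 2, '2022-11-06', 90000),
        (6, 1, 2, '2022-11-26', 100000);
""")

# Inner query: first and last reading date per artist within the window.
# The two outer joins fetch the values recorded on those dates.
TOP_GROWTH_SQL = """
    SELECT a.id, a.name, b.platform_information_id,
           s.value AS month_start_value,
           e.value AS month_end_value,
           e.value - s.value AS difference
    FROM (SELECT pia.artist_id, pia.platform_information_id,
                 MIN(pia.date) AS first_date, MAX(pia.date) AS last_date
          FROM platform_information_artist AS pia
          WHERE pia.platform_information_id = 1
            AND pia.date >= date(:today, '-30 days')
          GROUP BY pia.artist_id, pia.platform_information_id) AS b
    JOIN platform_information_artist AS s
      ON s.artist_id = b.artist_id
     AND s.platform_information_id = b.platform_information_id
     AND s.date = b.first_date
    JOIN platform_information_artist AS e
      ON e.artist_id = b.artist_id
     AND e.platform_information_id = b.platform_information_id
     AND e.date = b.last_date
    JOIN artist AS a ON a.id = b.artist_id
    ORDER BY difference DESC
    LIMIT 5
"""

top_growth = conn.execute(TOP_GROWTH_SQL, {"today": "2022-11-30"}).fetchall()
for row in top_growth:
    print(row)
```

The sketch assumes at most one reading per artist, platform information and date; with duplicates on the first or last date the joins would return extra rows.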

In Power Query/Pivot: merging two tables based on multiple columns with unique values in both relative to the other

I have two tables: one for Sale and one for Target as shown below:
Target table:
Date        mat  Tar
01/01/2020  A    10
01/01/2020  B    12
01/01/2020  C    5
01/02/2020  A    10
01/02/2020  B    12
01/02/2020  C    5
01/03/2020  A    10
01/03/2020  B    12
01/03/2020  C    5
Sale table:
Date        mat  S
01/01/2020  A    5
01/01/2020  B    6
01/01/2020  C    8
01/01/2020  D    1
01/02/2020  A    1
01/02/2020  B    2
01/02/2020  D    12
01/03/2020  B    1
01/03/2020  C    4
01/03/2020  A    5
01/03/2020  F    2
As you can see, there are certain material/date combinations in the Target table that are not in the Sale table, and vice versa.
I want to combine them in such a way that any missing material date combo not in Target will be added as a row with the new material and the sales will be added as a new column. Below is the ideal output:
Date        mat  Tar  S
01/01/2020  A    10   5
01/01/2020  B    12   6
01/01/2020  C    5    8
01/02/2020  A    10   1
01/02/2020  B    12   2
01/02/2020  C    5    0
01/03/2020  A    10   5
01/03/2020  B    12   1
01/03/2020  C    5    4
01/01/2020  D    0    1
01/02/2020  D    0    12
01/03/2020  F    0    2
However, I am not getting this in Power Query when I merge on the Date and mat columns with a full outer join to keep rows from both tables. My output contains two Date columns and two mat columns instead of one consolidated pair as shown above.
One way to do this would be to append both tables together and group by the Date and mat columns to get all of the combinations of those two that you need and then join Target and Sale to that new table.
let
    Source = Table.Combine({Target, Sale}),
    #"Grouped Rows" = Table.Group(Source, {"Date", "mat"}, {}),
    #"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Date", Order.Ascending}, {"mat", Order.Ascending}}),
    #"Merged Queries" = Table.NestedJoin(#"Sorted Rows", {"Date", "mat"}, Target, {"Date", "mat"}, "Target", JoinKind.LeftOuter),
    #"Expanded Target" = Table.ExpandTableColumn(#"Merged Queries", "Target", {"Tar"}, {"Tar"}),
    #"Merged Queries1" = Table.NestedJoin(#"Expanded Target", {"Date", "mat"}, Sale, {"Date", "mat"}, "Sale", JoinKind.LeftOuter),
    #"Expanded Sale" = Table.ExpandTableColumn(#"Merged Queries1", "Sale", {"S"}, {"S"})
in
    #"Expanded Sale"
Replace null with 0 if desired.
Edit: I was overcomplicating this. You can take care of Tar and S in the Group By itself by taking the max over those columns. Combinations missing from one table still come out as null, so replace null with 0 here too if desired.
let
    Source = Table.Combine({Target, Sale}),
    #"Grouped Rows" = Table.Group(Source, {"Date", "mat"}, {{"Tar", each List.Max([Tar])}, {"S", each List.Max([S])}})
in
    #"Grouped Rows"
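The same append-then-group idea can be sketched in plain Python, which makes the effect of the merge easy to check against the ideal output. This is only an illustration: the function name `merge_target_sale` and the tuple layout are assumptions, and missing values are filled with 0 directly instead of null.

```python
# Rows as (Date, mat, value); dates are kept as the strings used in the question.
target = [
    ("01/01/2020", "A", 10), ("01/01/2020", "B", 12), ("01/01/2020", "C", 5),
    ("01/02/2020", "A", 10), ("01/02/2020", "B", 12), ("01/02/2020", "C", 5),
    ("01/03/2020", "A", 10), ("01/03/2020", "B", 12), ("01/03/2020", "C", 5),
]
sale = [
    ("01/01/2020", "A", 5), ("01/01/2020", "B", 6), ("01/01/2020", "C", 8),
    ("01/01/2020", "D", 1), ("01/02/2020", "A", 1), ("01/02/2020", "B", 2),
    ("01/02/2020", "D", 12), ("01/03/2020", "B", 1), ("01/03/2020", "C", 4),
    ("01/03/2020", "A", 5), ("01/03/2020", "F", 2),
]


def merge_target_sale(target, sale):
    """Full outer merge on (Date, mat); a missing Tar or S becomes 0."""
    combined = {}
    for day, mat, tar in target:
        combined.setdefault((day, mat), {"Tar": 0, "S": 0})["Tar"] = tar
    for day, mat, s in sale:
        combined.setdefault((day, mat), {"Tar": 0, "S": 0})["S"] = s
    # String sort is good enough here because every date shares one format and year.
    return [(day, mat, v["Tar"], v["S"]) for (day, mat), v in sorted(combined.items())]


merged = merge_target_sale(target, sale)
for row in merged:
    print(row)
```

The result has one Date column and one mat column, with twelve rows for the sample data (the row order differs from the ideal output, which lists the Sale-only materials last).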

find max value for specific column while still seeing other columns

For tables patient and labh:
patient
id  lastname
19  patientone
20  patienttwo
labh
patientid  lastname    loinc   datetime             numerical
19         patientone  4548-4  2014-05-15 00:00:00  6.5
19         patientone  4548-4  2015-05-15 00:00:00  7.5
19         patientone  4548-4  2016-05-15 00:00:00  3.5
19         patientone  4548-4  2017-05-15 00:00:00  5.5
19         patientone  5000-3  2018-05-15 00:00:00  123
20         patienttwo  4548-4  2013-05-15 00:00:00  2.5
20         patienttwo  4548-4  2012-05-15 00:00:00  1.5
20         patienttwo  4548-4  2011-05-15 00:00:00  9.5
20         patienttwo  4548-4  2010-05-15 00:00:00  3.5
Desired output:
patientid  lastname    datetime             numerical
19         patientone  2017-05-15 00:00:00  5.5
20         patienttwo  2013-05-15 00:00:00  2.5
The labh table holds lab values (numerical), the type of lab (loinc), and when they were done (datetime). I'd like to query for the most recent value of loinc = 4548-4, and I'd like the output to show both the date and the value.
I've tried the query below. It shows the most recent dates, but I can't see the values (numerical) at the same time. When I add the numerical column, it shows all the values, not just the most recent.
Select Distinct patient.id, patient.lastname, Max(Date_Trunc('day', labh.datetime)) As "Date"
From patient
Inner Join labh On patient.id = labh.patientid
Where labh.loinc = '4548-4'
Group By patient.id, patient.lastname, patient.firstname
Order By patient.id
You haven't selected the numerical column in your query. You can use a CTE that ranks each patient's rows with ROW_NUMBER(), partitioning by patient id and ordering each partition by date descending, and then keep only the top-ranked row.
So, according to this, you can try:
WITH summary AS (
    SELECT p.id AS "Patient ID",
           p.lastname AS "Patient Name",
           l.datetime AS "Date",
           l.numerical AS "Numerical",
           ROW_NUMBER() OVER (PARTITION BY p.id
                              ORDER BY l.datetime DESC) AS rank
    FROM patient p
    INNER JOIN labh l
        ON p.id = l.patientid)
SELECT "Patient ID",
       "Patient Name",
       "Date",
       "Numerical"
FROM summary
WHERE rank = 1;
And this will give you:
Patient ID  Patient Name  Date                      Numerical
19          patientone    2017-05-15T00:00:00.000Z  5.5
20          patienttwo    2013-05-15T00:00:00.000Z  2.5
UPDATE
Since you've updated the question and changed the expected output, the only modification needed is a WHERE condition inside the CTE:
WITH summary AS (
    SELECT p.id AS "Patient ID",
           p.lastname AS "Patient Name",
           l.datetime AS "Date",
           l.numerical AS "Numerical",
           ROW_NUMBER() OVER (PARTITION BY p.id
                              ORDER BY l.datetime DESC) AS rank
    FROM patient p
    INNER JOIN labh l
        ON p.id = l.patientid
    WHERE l.loinc = '4548-4') -- Added this line
SELECT "Patient ID",
       "Patient Name",
       "Date",
       "Numerical"
FROM summary
WHERE rank = 1;
This will give you the same result:
Patient ID  Patient Name  Date                      Numerical
19          patientone    2017-05-15T00:00:00.000Z  5.5
20          patienttwo    2013-05-15T00:00:00.000Z  2.5
In order to achieve what you're looking for in Postgres (and other SQL RDBMSes), you need to essentially identify the max value and its corresponding primary key, then join it with the rest of the data set you are looking to retrieve:
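SELECT patient.*, labh.*
FROM patient
JOIN labh
  ON patient.id = labh.patientid
JOIN (SELECT patientid, max(datetime) AS datetime  -- alias so the join below can reference it
      FROM labh
      WHERE loinc = '4548-4'                       -- only consider the lab of interest
      GROUP BY patientid) maxvals
  ON maxvals.patientid = labh.patientid AND
     maxvals.datetime = labh.datetime
WHERE labh.loinc = '4548-4'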
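To check the join-on-max approach against the sample data, here is a small self-contained sketch using Python's built-in sqlite3 module instead of Postgres. Two things in it are assumptions added for the example: the subquery's `MAX(datetime)` is given the alias `datetime` so the join can refer to it, and both the subquery and the outer query filter on `loinc = '4548-4'`, since otherwise patient 19's latest row would be the 5000-3 lab. The redundant lastname column of labh is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, lastname TEXT);
    CREATE TABLE labh (patientid INTEGER, loinc TEXT, datetime TEXT, numerical REAL);
    INSERT INTO patient VALUES (19, 'patientone'), (20, 'patienttwo');
    INSERT INTO labh VALUES
        (19, '4548-4', '2014-05-15 00:00:00', 6.5),
        (19, '4548-4', '2015-05-15 00:00:00', 7.5),
        (19, '4548-4', '2016-05-15 00:00:00', 3.5),
        (19, '4548-4', '2017-05-15 00:00:00', 5.5),
        (19, '5000-3', '2018-05-15 00:00:00', 123),
        (20, '4548-4', '2013-05-15 00:00:00', 2.5),
        (20, '4548-4', '2012-05-15 00:00:00', 1.5),
        (20, '4548-4', '2011-05-15 00:00:00', 9.5),
        (20, '4548-4', '2010-05-15 00:00:00', 3.5);
""")

# Find each patient's latest 4548-4 timestamp, then join back to get the value.
LATEST_SQL = """
    SELECT p.id, p.lastname, l.datetime, l.numerical
    FROM patient AS p
    JOIN labh AS l ON l.patientid = p.id
    JOIN (SELECT patientid, MAX(datetime) AS datetime
          FROM labh
          WHERE loinc = '4548-4'
          GROUP BY patientid) AS maxvals
      ON maxvals.patientid = l.patientid
     AND maxvals.datetime = l.datetime
    WHERE l.loinc = '4548-4'
    ORDER BY p.id
"""

latest = conn.execute(LATEST_SQL).fetchall()
for row in latest:
    print(row)
```

If two rows for the same patient and loinc share the latest timestamp, this approach returns both, whereas the ROW_NUMBER() version picks one.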

Disaggregating data from months to weeks in SAS

I am stuck with a problem where I have two tables, one at the month level and one at the week level. Here's the format of the tables:
Table1
Customer Date1 Sales
1 Jan2018 1110
1 Feb2018 1245
1 Mar2018 1320
1 Apr2018 1100
...
Table2
Customer Date2
1 01Jan2018
1 08Jan2018
1 15Jan2018
1 22Jan2018
1 29Jan2018
1 05Feb2018
1 12Feb2018
1 19Feb2018
1 26Feb2018
1 05Mar2018
...
I want to create a new column for sales in Table2 that will hold the disaggregated values of sales from Table1. I want to divide the sales by the number of days in that month and then assign the values to the weeks accordingly. Thus the sales in week 01Jan2018 is (1110/31)*7. The weeks that are in transition will get values from both the months. For example 29Jan2018 has 3 days in Jan2018 and 4 days in Feb2018. The sales of one day in Jan2018 is 1110/31 and the sales of one day in Feb2018 is 1245/28.
So the sales in week 29Jan2018 will be 3*(1110/31) + 4*(1245/28)
I want to do this for each distinct customer.
The resulting table should be
Result Table
Customer Date Sales
1 01Jan2018 250.6 i.e (1110/31)*7
1 08Jan2018 250.6
1 15Jan2018 250.6
1 22Jan2018 250.6
1 29Jan2018 285.28
1 05Feb2018 311.25
1 12Feb2018 311.25
1 19Feb2018 311.25
1 26Feb2018 133.39 + 170.32
Thanks!
In DATA Step programming you will be needing some 'FORWARD' data instead of some 'LAG' data. A forward value can be emulated by creating a view to the same data starting one observation forward (obs=2). After understanding the renaming semantics, it is only a matter of some easy 'bookkeeping'.
data customer_months;
  attrib Customer length=8 Date1 informat=monyy. format=monyy7.;
  input Customer Date1 Sales;
  datalines;
1 Jan2018 1110
1 Feb2018 1245
1 Mar2018 1320
1 Apr2018 1100
run;
* week data, also with computation for month the week is in;
data customer_weeks;
  attrib Customer length=8 Date2 informat=date9. format=date9.;
  input Customer Date2;
  Date1 = intnx('month', Date2, 0);
  datalines;
1 01Jan2018
1 08Jan2018
1 15Jan2018
1 22Jan2018
1 29Jan2018
1 05Feb2018
1 12Feb2018
1 19Feb2018
1 26Feb2018
1 05Mar2018
run;
* next months sales keyed on prior month value;
data customer_next_months_view / view=customer_next_months_view;
  set customer_months;
  Date1 = intnx('month',Date1,-1); * the month this record will be a forward for;
  rename Sales=Sales_next_month;
  if _n_ > 1;
run;
* merge original and forward data, rename for making clear the variable roles;
data combined;
  length disag_sales 8;
  merge
    customer_months (rename=Sales=Sales_this_month)
    customer_next_months_view
    customer_weeks
  ;
  by Date1;
  days_in_this_month = intck('day',intnx('month',Date1,0),intnx('month',Date1,1));
  days_in_next_month = intck('day',intnx('month',Date1,1),intnx('month',Date1,2));
  day_rate_this_month = Sales_this_month / days_in_this_month;
  day_rate_next_month = Sales_next_month / days_in_next_month;
  if Date2 then
    if month(Date2) = month(Date2+6) then
      week_days_this_month = 7;
    else
      week_days_this_month = intck('day', Date2, intnx('month', Date2, 1));
  week_days_next_month = 7 - week_days_this_month;
  dollars_this_week_this_month = week_days_this_month * day_rate_this_month;
  dollars_this_week_next_month = week_days_next_month * day_rate_next_month;
  * desired estimated disaggregated sales;
  disag_sales = sum (dollars_this_week_this_month,dollars_this_week_next_month);
run;
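As a cross-check on the arithmetic, the disaggregation rule from the question can be sketched in a few lines of Python: each day of a week contributes its own month's daily rate (monthly sales divided by days in that month). This is a different formulation from the DATA step above, not a translation of it; the function name `weekly_sales` and the dictionary keyed by (year, month) are assumptions for illustration, and it covers a single customer.

```python
import calendar
from datetime import date, timedelta

# Monthly sales for customer 1 from the question, keyed by (year, month).
monthly_sales = {(2018, 1): 1110, (2018, 2): 1245, (2018, 3): 1320, (2018, 4): 1100}


def weekly_sales(week_start, monthly_sales):
    """Sum the daily rate of each of the 7 days starting at week_start."""
    total = 0.0
    for offset in range(7):
        day = week_start + timedelta(days=offset)
        days_in_month = calendar.monthrange(day.year, day.month)[1]
        total += monthly_sales[(day.year, day.month)] / days_in_month
    return total


for week_start in (date(2018, 1, 1), date(2018, 1, 29), date(2018, 2, 5), date(2018, 2, 26)):
    print(week_start.isoformat(), round(weekly_sales(week_start, monthly_sales), 2))
```

For the week of 29Jan2018 this gives 3*(1110/31) + 4*(1245/28), roughly 285.28, and for 26Feb2018 roughly 133.39 + 170.32.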

Group by subtraction in Power BI

I have data like this and I want to group the rows by customer, subtracting the prices between dates:
Customer Date price
Jane 01/01/2018 10
Jane 01/02/2018 14
Joe 01/01/2018 10
Joe 01/02/2018 15
I need to obtain:
Customer price
Jane 4
Joe 5
How can I do this in Power BI?
Try adding this into a calculated column:
Difference =
VAR LatestDate = Table[Date]
VAR LatestValue = Table[Price]
VAR PreviousDate = DATEADD(Table[Date], -1, DAY)
VAR PreviousValue =
    CALCULATE(
        FIRSTNONBLANK(Table[Price], 1),
        FILTER(Table, Table[Date] = PreviousDate)
    )
RETURN
    IF(CONTAINS(Table, Table[Date], PreviousDate), LatestValue - PreviousValue, 0)
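For reference, the result the question asks for (one row per customer, latest price minus earliest price) can be sketched in plain Python. This is not a translation of the DAX column above, which works row by row; it only shows the expected numbers. The function name `price_change_per_customer` is hypothetical, and the dates are assumed to be 1 and 2 January 2018.

```python
from datetime import date

# Sample rows from the question: (Customer, Date, price)
rows = [
    ("Jane", date(2018, 1, 1), 10),
    ("Jane", date(2018, 1, 2), 14),
    ("Joe", date(2018, 1, 1), 10),
    ("Joe", date(2018, 1, 2), 15),
]


def price_change_per_customer(rows):
    """Return {customer: latest price - earliest price}."""
    first_last = {}
    # Sorting by customer and date means the last row seen per customer is the latest.
    for customer, day, price in sorted(rows, key=lambda r: (r[0], r[1])):
        if customer not in first_last:
            first_last[customer] = [price, price]
        else:
            first_last[customer][1] = price
    return {c: last - first for c, (first, last) in first_last.items()}


changes = price_change_per_customer(rows)
print(changes)
```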