I have a column that holds 3-4 different values, with data like this:
companyname servicestatus employee
abc inservice abc_e1
abc inservice abc_ew1
abc resigned abc_2
abc retired abc_3
abc retired abc_23
def inservice def_12
def inservice def_13
def inservice def_14
def resigned def_3
def resigned def_4
def resigned def_5
def retired def_3
Currently the data is shown like this:
companyname servicestatus count
abc inservice 2
abc resigned 1
abc retired 2
def inservice 3
def resigned 3
def retired 1
In Power BI I want to show it like this:
companyname joiners leavers
abc 2 3
def 3 4
That is, joiners should count only inservice, while leavers should count both resigned and retired.
How do I do this in Power BI?
Starting from the top table, add a Custom Column (under the Add Column tab).
= if [servicestatus] = "inservice" then "joiners" else "leavers"
Then group by [companyname] and the new [Custom] column.
Finally, select [Custom] column and choose Pivot Column under the Transform tab. Choose Count as the Values Column on the drop down.
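To make the logic of these steps concrete, here is a minimal sketch in plain Python (not Power Query), using the sample rows from the question. The function name joiners_leavers is hypothetical; it only mirrors the classify, group, and pivot steps.

```python
from collections import Counter

# Sample rows from the question: (companyname, servicestatus, employee)
rows = [
    ("abc", "inservice", "abc_e1"),
    ("abc", "inservice", "abc_ew1"),
    ("abc", "resigned", "abc_2"),
    ("abc", "retired", "abc_3"),
    ("abc", "retired", "abc_23"),
    ("def", "inservice", "def_12"),
    ("def", "inservice", "def_13"),
    ("def", "inservice", "def_14"),
    ("def", "resigned", "def_3"),
    ("def", "resigned", "def_4"),
    ("def", "resigned", "def_5"),
    ("def", "retired", "def_3"),
]

def joiners_leavers(rows):
    """Classify each row, count per (company, bucket), then pivot to columns."""
    counts = Counter()
    for company, status, _employee in rows:
        # Same rule as the custom column: inservice -> joiners, anything else -> leavers
        bucket = "joiners" if status == "inservice" else "leavers"
        counts[(company, bucket)] += 1
    companies = sorted({company for company, _, _ in rows})
    # Pivot: one entry per company with a joiners and a leavers count
    return {
        c: {"joiners": counts[(c, "joiners")], "leavers": counts[(c, "leavers")]}
        for c in companies
    }

summary = joiners_leavers(rows)
for company, buckets in summary.items():
    print(company, buckets["joiners"], buckets["leavers"])
```

For the sample data this gives 2 joiners and 3 leavers for abc, and 3 joiners and 4 leavers for def, matching the desired table.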
One method using Power BI Desktop is as follows:
Add the necessary tables to the data model using "Get Data".
Set up the relationships from the designer, if needed.
To create the grouping, click on Edit Queries and choose Group By.
Add the grouping by selecting the appropriate columns, and then click the Close & Apply button.
[Screenshots omitted: the table before grouping and after grouping.]
= Table.Group(#"Renamed Columns", {"type Num of active days"}, {{"Fitness Distinct users", each List.Sum([Distict Users]), type number}, {"Number of events", each List.Sum([Number of Events]), type number}})
This question already has answers here:
Multiple rows values to one cell in MySQL
(2 answers)
Closed 4 months ago.
Can anyone please advise on a Spark SQL query that combines multiple rows into one row per unique column1 value, with the values ordered by date (ascending)? The data is below.
This is the enrollment table, which has data like this:
column1 column2 timeStamp
abc enrolled 2022/09/01
abc changed 2022/09/02
abc registered 2022/09/04
abc blocked 2022/09/05
abc left 2022/09/06
def enrolled 2022/09/20
def changed 2022/09/21
def changed 2022/09/21
def changed 2022/09/24
def left 2022/09/25
ghi registered 2022/09/01
ghi changed 2022/09/02
ghi left 2022/09/03
ghi returned 2022/10/03
The output of the query should look like this:
out_column1 out_column2
abc enrolled-changed-registered-blocked-left
def enrolled-changed-changed-left
ghi registered-changed-left-returned
The group_concat function is not available in PySpark.
select column1 as out_column1
,group_concat(column2 order by timeStamp separator '-') as out_column2
from t
group by column1
out_column1 out_column2
abc enrolled-changed-registered-blocked-left
def enrolled-changed-changed-changed-left
ghi registered-changed-left-returned
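For reference, the ordered concatenation itself can be sketched in plain Python, independent of any SQL dialect. This is only an illustration of the logic (sort by key and timestamp, then join per key); the function name concat_by_key is hypothetical. In Spark SQL the same idea is usually built from collect_list and concat_ws, but that variant is not shown here.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (column1, column2, timeStamp)
events = [
    ("abc", "enrolled", "2022/09/01"),
    ("abc", "changed", "2022/09/02"),
    ("abc", "registered", "2022/09/04"),
    ("abc", "blocked", "2022/09/05"),
    ("abc", "left", "2022/09/06"),
    ("def", "enrolled", "2022/09/20"),
    ("def", "changed", "2022/09/21"),
    ("def", "changed", "2022/09/21"),
    ("def", "changed", "2022/09/24"),
    ("def", "left", "2022/09/25"),
    ("ghi", "registered", "2022/09/01"),
    ("ghi", "changed", "2022/09/02"),
    ("ghi", "left", "2022/09/03"),
    ("ghi", "returned", "2022/10/03"),
]

def concat_by_key(events, sep="-"):
    """Join column2 values per column1, ordered by timestamp ascending."""
    # YYYY/MM/DD strings sort chronologically; the sort is stable for equal timestamps
    ordered = sorted(events, key=itemgetter(0, 2))
    return {
        key: sep.join(status for _, status, _ in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }

result = concat_by_key(events)
for key, joined in result.items():
    print(key, joined)
```

Note that def has three "changed" rows in the sample data, so its joined value contains "changed" three times.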
I have a filter that includes cities, and I have another field, region; each city belongs to a region. For example:
New York -> RegionID = 1
Seattle -> RegionID = 2
Boston -> RegionID = 2
Nashville -> RegionID = 3
Now I want to filter my sheets and dashboard by city, but as a hierarchy: the user selects a RegionID, and the cities belonging to that region appear to choose from.
How can I do this?
Tableau can do this with "Values from Hierarchy" in the filter options. In fact, this is the default behavior now. https://help.tableau.com/current/pro/desktop/en-us/filtering.htm#set-options-for-filter-card-interaction-and-appearance
There is also an Extension that can display this in a tree view. https://github.com/tableau/extension-hierarchy-navigator-sandboxed
Say I have a table called list, where there are items like these (the ids are random uuids):
id rank text
--- ----- -----
x 0 Hello
x 1 World
x 2 Foo
x 3 Bar
x 4 Baz
I want to maintain the property that rank column always goes from 0 to n-1 (n being the number of rows)---if a client asks to insert an item with rank = 3, then the pg server should push the current 3 and 4 to 4 and 5, respectively:
id rank text
--- ----- -----
x 0 Hello
x 1 World
x 2 Foo
x 3 New Item!
x 4 Bar
x 5 Baz
My current strategy is to have a dedicated insertion function add_item(item) that scans through the table, finds the items whose rank is equal to or greater than that of the item being inserted, and increments those ranks by one. However, I think this approach will run into all sorts of problems---like race conditions.
Is there a more standard practice or more robust approach?
Note: The rank column is completely independent of rest of the columns, and insertion is not the only operation I need to support. Think of it as the back-end of a sortable to-do list, and the user can add/delete/reorder the items on the fly.
Doing verbatim what you suggest might be difficult or not possible at all, but I can suggest a workaround. Maintain a new column ts which stores the time a record is inserted. Then, insert the current time along with rest of the record, i.e.
id rank text ts
--- ----- ----- --------------------
x 0 Hello 2017-12-01 12:34:23
x 1 World 2017-12-03 04:20:01
x 2 Foo ...
x 3 New Item! 2017-12-12 11:26:32
x 3 Bar 2017-12-10 14:05:43
x 4 Baz ...
Now we can easily generate the ordering you want via a query:
SELECT id, rank, text,
ROW_NUMBER() OVER (ORDER BY rank, ts DESC) new_rank
FROM yourTable;
This would generate ranks 1 to 6 for the sample table above, because ROW_NUMBER() starts at 1; subtract 1 if you need the zero-based ranks 0 to 5. The basic idea is to keep using the existing rank column, but to let the timestamp break the tie in ordering whenever the same rank appears more than once.
You can wrap this in a function if you think it is worth it:
t=# with u as (
update r set rank = rank + 1 where rank >= 3
)
insert into r values('x',3,'New val!')
;
INSERT 0 1
the result:
t=# select * from r;
id | rank | text
----+------+----------
x | 0 | Hello
x | 1 | World
x | 2 | Foo
x | 3 | New val!
x | 4 | Bar
x | 5 | Baz
(6 rows)
Also worth mentioning: you might run into concurrency (race condition) problems on highly loaded systems. The code above is just a sample.
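As a rough sketch of the "wrap it in a function" idea, the shift and the insert can be run inside one transaction. The example below assumes SQLite through Python's sqlite3 module purely as a stand-in for PostgreSQL, and the function name insert_at_rank is hypothetical. It does not by itself solve the concurrency issue on a real multi-user server.

```python
import sqlite3

def insert_at_rank(conn, item_id, rank, text):
    """Shift ranks >= rank up by one, then insert the new row, in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute("UPDATE list SET rank = rank + 1 WHERE rank >= ?", (rank,))
        conn.execute(
            "INSERT INTO list (id, rank, text) VALUES (?, ?, ?)",
            (item_id, rank, text),
        )

# In-memory stand-in for the table from the question (ids are placeholders)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (id TEXT, rank INTEGER, text TEXT)")
conn.executemany(
    "INSERT INTO list VALUES (?, ?, ?)",
    [(f"id{i}", i, t) for i, t in enumerate(["Hello", "World", "Foo", "Bar", "Baz"])],
)
conn.commit()

insert_at_rank(conn, "id-new", 3, "New Item!")

rows = conn.execute("SELECT rank, text FROM list ORDER BY rank").fetchall()
for rank, text in rows:
    print(rank, text)
```

After the call the ranks run from 0 to 5 with "New Item!" at rank 3, as in the psql output above.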
You can have a “computed rank” which is a double precision and a “displayed rank” which is an integer that is computed using the row_number window function on output.
When a row is inserted that should rank between two rows, compute the new rank as the arithmetic mean of the two ranks.
The advantage is that you don't have to update existing rows.
The down side is that you have to calculate the displayed ranks before you can insert a new row so that you know where to insert it.
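A minimal sketch of this idea in plain Python, assuming the stored rank is a float and the displayed rank is derived by sorting. The names rank_between and displayed_ranks are hypothetical, and the sort stands in for the row_number window function.

```python
def rank_between(lower, upper):
    """Computed rank for a row inserted between two neighbours: their arithmetic mean."""
    return (lower + upper) / 2.0

def displayed_ranks(items):
    """Derive 0-based integer ranks from the stored float ranks."""
    ordered = sorted(items, key=lambda item: item[1])
    return [(display, text) for display, (text, _rank) in enumerate(ordered)]

# Stored rows: (text, computed_rank)
items = [("Hello", 0.0), ("World", 1.0), ("Foo", 2.0), ("Bar", 3.0), ("Baz", 4.0)]

# Insert between "Foo" (2.0) and "Bar" (3.0); no existing row is updated
new_rank = rank_between(2.0, 3.0)
items.append(("New Item!", new_rank))

shown = displayed_ranks(items)
for display, text in shown:
    print(display, text)
```

Repeated insertions between the same two neighbours keep halving the gap, so with double precision the stored ranks may eventually need to be renumbered.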
This solution (like all the others) is subject to race conditions.
To deal with these, you can either use table locks or serializable transactions.
The only way to prevent a race condition would be to lock the table
https://www.postgresql.org/docs/current/sql-lock.html
Of course this would slow you down if there are lots of updates and inserts.
If you can somehow limit the scope of your updates, then you can do a SELECT ... FOR UPDATE on that scope. For example, if the records have a parent_id, you can do a SELECT ... FOR UPDATE on the parent record first, and any other insert that does the same SELECT ... FOR UPDATE would have to wait until your transaction is done.
https://www.postgresql.org/docs/current/explicit-locking.html#:~:text=5.-,Advisory%20Locks,application%20to%20use%20them%20correctly.
Read the section on advisory locks to see if you can use those in your application. They are not enforced by the system, so you'll need to be careful about how you write your application.
I am trying to create a Crystal Report. One of its parameters is a Boolean flag that changes the groupings of the report.
What I'm trying to do is add 3 levels of grouping from a formula.
The code that I wrote is:
if {?summarized_detailed} = true then
'Store Code : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.WhsCode}
else
if {?season_supplier} = true then
'Store Code : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.WhsCode} + ChrW(10) + 'Season : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.SEASON} + ChrW(10) + 'Brand : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.BRAND}
else
'Store Code : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.WhsCode} + ChrW(10) + 'Supplier : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.CardCode} + ChrW(10) + 'Brand : ' + {SNV_SP_ProfitabilityAndBreakEvenPerStore;1.BRAND}
{?summarized_detailed} is the parameter (the flag for grouping).
If the user chooses true, the report must have the columns Store Code, Season or Supplier (depending on another flag parameter), and Brand.
Output layout that I have:
Name of Col1 Name of Col2 Name of Col3
Store: Value for Store Code, Season or Supplier: Value for Season or Supplier, Brand: Value for Brand | total Brand Col1 total Brand Col2 total Brand Col3 ....
With this code, I get the result that I want: all database rows are separated according to those groups.
The problem is that I want a total sum for every column in my report. I can't do this, because the formula above creates only ONE combined group, so I can only have a summary for the details within each brand.
Output layout that I want:
Name of Col1 Name of Col2 Name of Col3
Store: Value for Store Code |
Seas or Sup: Value for Season or Supplier |
Brand: Value for Brand | total Brand Col1 total Brand Col2 total Brand Col3
//when supplier or season within all groups changes (not included in report)
--------------------------------------------------------------------------------------------
Total Supplier or Season: total Sup/Season Col1 total Sup/Season Col2 total Sup/Seas Col3
//when store within all groups changes (not included in report)
Total Store: total Store Col1 total Store Col2 total Store Col3
---------------------------------------------------------------------------------------------
With total Brand Col1 I want to sum each field over all detail rows of the same brand.
With total Supplier Col1 I want to sum all the brand totals for each field of the same supplier.
With total Store Col1 I want to sum all the supplier totals for each field of every store.
How can I compute those totals and display them only when Season or Supplier changes, and when Store changes, regardless of changes in the combined group?
Or is there an easier and better way to build those 3 groups from a formula according to the parameter, while keeping them separate so that I can control them and add my summaries?
This is how I handle this type of requirement:
If you want a maximum of 3 groups, but potentially fewer, then you need to create two different formula fields. In the report design, you set grouping to group on Store_Code first, then on the other two formula fields.
If the refresh/runtime parameter will be FALSE for when you want the three-level grouping, then the formulas should look something like this:
Group1_Formula: if {?param} = TRUE then '' else {Season}
Then the next:
Group2_Formula: if {?param} = TRUE then '' else {Brand}
If you want totals/subtotals, then just place them as summary fields in the group headers, with conditional display to suppress them as necessary.
The result will be that, even though grouping on all three levels is always taking place, for the subgroups below the always-active "Store_Code" each subgroup will have only a single group of all records if the {?Param} = TRUE, effectively meaning there is no grouping happening.
One caveat: This presumes you aren't using print-time evaluation, e.g. before/after printing, etc., in any of the related formulas necessary to perform your grouping.
I have 3 tables
items,
item_units,
order_items
The first table, items, has the list of items that can be ordered.
The second table, item_units, has the units for the items as well as the amount of those items in each unit.
The third table, order_items, has the items that were ordered, i.e. item_code, unit, qty.
Here are the columns for items
[item_code]
,[desc]
Here are the columns for item_units
,[item_code]
,[unit]
,[amount]
,[price]
,[default_sprd_sht]
Here are the columns for order_items
,[order_id]
,[item_code]
,[item_desc]
,[unit]
,[qty]
,[price]
Note the [default_sprd_sht]. This field is a boolean. If it's set to true, this unit is never put into the order_items table. This field will be used as a calculation field.
For example:
If one customer orders 2 six-packs of bread and another orders 3 dozen, the baker needs to know how many pans of bread to make.
Now a 6 pack unit has 6 breads as its amount, meaning 2 * 6 = 12. And a dozen unit has 12 breads, so 12 * 3 = 36. A pan bread unit has 20 breads. So I need to add up all the bread unit amounts and divide them by the pan amount, like so:
((2*6) + (12 * 3)) / 20 = 2.4
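The same arithmetic as a small Python sketch, just to pin down the calculation being asked for. The unit names and the function pans_needed are hypothetical; the amounts come from the example above.

```python
def pans_needed(orders, unit_amounts, pan_unit):
    """Total pieces ordered across all units, divided by the pieces in one pan."""
    total_pieces = sum(qty * unit_amounts[unit] for unit, qty in orders)
    return total_pieces / unit_amounts[pan_unit]

# Amount of breads per unit (the "pan" unit is the one flagged default_sprd_sht)
unit_amounts = {"6 pack": 6, "dozen": 12, "pan": 20}

# Ordered lines: (unit, qty)
orders = [("6 pack", 2), ("dozen", 3)]

pans = pans_needed(orders, unit_amounts, "pan")
print(pans)  # (2*6 + 3*12) / 20
```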
So the first thing I did to create a report for the baker was:
Create a group on order_items.item_code and then on order_items.unit.
This needs to be done since the same item and unit combination will be repeated in different orders. The baker needs to see how many bagels or breads he needs to bake in total.
In the order_items.unit group header I created a formula field that multiplies the total quantity for the unit by the unit's amount:
Sum ({order_items.qty}, {order_items.unit}) * {items_units.amount}
That was easy.
But I also need to group all order items if there exists a record in items_units with the same item_code and with [default_sprd_sht] set to true.
This would look like so:
(Sum ({order_items.qty}, {order_items.unit}) * {items_units.amount}) / (get amount for unit with the same item_code and [default_sprd_sht] = 1)
I have two problems accomplishing this.
How to check if this order item has a unit with the same item_code and [default_sprd_sht] = 1?
How to further group order items only if there is a unit with the same item_code and [default_sprd_sht] = 1?