I have a dataframe like this:
user_id items
1 item1
1 item2
1 item3
2 item1
2 item5
3 item4
3 item2
If I put user_id on rows and items on columns, I get this:
user_id number_of_items
1 3
2 2
3 2
Now I would like to group this result again, like this :
number_of_user_id number_of_items
1 3
2 2
How can I do this, either as a calculated field or in a graph (maybe a histogram)?
First, create the following calculated field, called users_per_item:
{ fixed items : countd(user_id) }
Then select the new measure, users_per_item, in the data pane and right-click it to create bins. Set the bin size to 1, or whatever value you like. That creates a dimension called users_per_item (bin).
Finally, use the bin field to build the view you want: for example, place users_per_item (bin) on the Columns shelf and CNTD(items) on the Rows shelf.
This is a natural use of LOD calculations: a two-stage analysis.
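Outside Tableau, the same two-stage idea can be sketched in plain Python. This is only an illustration on the question's sample rows. The variable names are made up here, and the sketch follows the question's tables: it counts distinct items per user first, then counts users per item count.

```python
from collections import Counter

# Sample rows from the question: (user_id, item)
rows = [
    (1, "item1"), (1, "item2"), (1, "item3"),
    (2, "item1"), (2, "item5"),
    (3, "item4"), (3, "item2"),
]

# Stage 1: number of distinct items per user
items_by_user = {}
for user_id, item in rows:
    items_by_user.setdefault(user_id, set()).add(item)
number_of_items = {user: len(items) for user, items in items_by_user.items()}

# Stage 2: number of users that share each item count
users_per_count = dict(Counter(number_of_items.values()))

for count, users in sorted(users_per_count.items(), reverse=True):
    print(users, count)
```

Each printed line is a (number_of_user_id, number_of_items) pair, the shape of the final table in the question.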
I am trying to get a result in my report which, I believe, requires a where clause; it did not work for me with the Select Expert.
I have 2 tables. Let's call them table 1 and table 2.
Table 1 contains unique records.
Table 2 contains multiple records for the same uniqueKey as table 1.
There are 3 fields in table 2 that play a role for each uniqueKey from table 1:
QTY_ORD
QTY_SHIPPED
ITEM_CANCEL
Let's assume that for item #1 from table 1 there are 5 records in table 2. Each record has a value for each of the 3 fields mentioned above. I need to display the SUM of QTY_SHIPPED - QTY_ORD over all the records where ITEM_CANCEL = 0.
It could be that 3 of the records have ITEM_CANCEL = 1 (we can ignore these records), but for the other 2 records where ITEM_CANCEL = 0, I need the SUM of QTY_SHIPPED - SUM of QTY_ORD.
The current code I have is as follows:
if {current_order1.ITEM_CANCEL} = 0 then
sum({current_order1.QTY_ORD})-sum({current_order1.QTY_SHIPPED}) else
0
but this result gives me the sum of ALL the records, including the ones where ITEM_CANCEL = 1.
If I use ITEM_CANCEL = 0 in the select expert, then it removes ALL the results that have no value in table 2. I even tried the code without using the SUM function, but this provided the result of only 1 of the records in table 2 where ITEM_CANCEL = 0, and not the total difference of the 2 records in table 2 that I require.
Any suggestions on this?
Start with a detail-level formula (no SUM):
if {current_order1.ITEM_CANCEL} = 0 then {current_order1.QTY_ORD} - {current_order1.QTY_SHIPPED} ELSE 0
Then, SUM that formula at whatever Group or Report levels you require.
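To see why the row-level formula has to come before the SUM, here is a small Python sketch. The five records are invented for illustration, and the subtraction follows the order used in the formula above (QTY_ORD - QTY_SHIPPED).

```python
# Hypothetical detail rows for one uniqueKey: (QTY_ORD, QTY_SHIPPED, ITEM_CANCEL)
records = [
    (10, 8, 0),
    (5, 5, 1),
    (4, 1, 0),
    (7, 0, 1),
    (3, 3, 1),
]

def detail_value(qty_ord, qty_shipped, item_cancel):
    """Row-level formula: the difference for active rows, 0 for cancelled rows."""
    return qty_ord - qty_shipped if item_cancel == 0 else 0

# Summing the row-level values skips cancelled rows.
total = sum(detail_value(*record) for record in records)

# Summing first and subtracting afterwards includes cancelled rows as well.
naive_total = sum(r[0] for r in records) - sum(r[1] for r in records)

print(total, naive_total)
```

The two totals differ because the second one applies no per-row condition, which is the behaviour the question describes.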
I am trying to create a report that lists addresses as a house number and a street name. I want to group all addresses by street name and then order them by house number, which is a string but should sort like a number. Ideally I would like the odd numbers ascending and then the even numbers descending, so that my list would look like
1,3,5,7,9 .... 8,6,4,2
How would I go about this? I created a first group on street name and then a second group on house number, with a formula for sorting the numbers.
I created a formula field OddEven with
ToNumber({tbl_FarmMaster.sano}) MOD 2
but I am having a hard time applying that to my group.
Create two formulas like below. Let's call them oddFirst and negativeEven.
oddFirst formula:
if ToNumber({tbl_FarmMaster.sano}) MOD 2 = 1 then
1 //it is odd
else
2 //it is even
negativeEven formula:
if ToNumber({tbl_FarmMaster.sano}) MOD 2 = 1 then
ToNumber({tbl_FarmMaster.sano}) //it is odd
else
-ToNumber({tbl_FarmMaster.sano}) //it is even, note the negative sign
Then create two groups to sort:
first by the formula oddFirst
second by the formula negativeEven
Show the {tbl_FarmMaster.sano} field.
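The effect of the two sort keys can be checked with a short Python sketch. The function names mirror the two formulas, and the house numbers are made-up sample strings.

```python
def odd_first(house_number: str) -> int:
    """First sort key: 1 for odd numbers, 2 for even numbers."""
    return 1 if int(house_number) % 2 == 1 else 2

def negative_even(house_number: str) -> int:
    """Second sort key: the number itself if odd, its negative if even."""
    n = int(house_number)
    return n if n % 2 == 1 else -n

house_numbers = ["4", "9", "1", "6", "3", "8", "2", "7", "5"]

# Sort ascending on both keys, as the two groups do.
ordered = sorted(house_numbers, key=lambda s: (odd_first(s), negative_even(s)))
print(",".join(ordered))
```

Negating the even numbers means that an ascending sort on the second key lists them from largest to smallest.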
Say I have a table called list, where there are items like these (the ids are random uuids):
id rank text
--- ----- -----
x 0 Hello
x 1 World
x 2 Foo
x 3 Bar
x 4 Baz
I want to maintain the property that the rank column always goes from 0 to n-1 (n being the number of rows). If a client asks to insert an item with rank = 3, then the pg server should push the current 3 and 4 to 4 and 5, respectively:
id rank text
--- ----- -----
x 0 Hello
x 1 World
x 2 Foo
x 3 New Item!
x 4 Bar
x 5 Baz
My current strategy is to have a dedicated insertion function add_item(item) that scans through the table, finds the items with rank greater than or equal to that of the item being inserted, and increments those ranks by one. However, I think this approach will run into all sorts of problems, like race conditions.
Is there a more standard practice or more robust approach?
Note: The rank column is completely independent of the rest of the columns, and insertion is not the only operation I need to support. Think of it as the back-end of a sortable to-do list, where the user can add/delete/reorder the items on the fly.
Doing verbatim what you suggest might be difficult or not possible at all, but I can suggest a workaround. Maintain a new column ts which stores the time each record is inserted. Then insert the current time along with the rest of the record, i.e.
id rank text ts
--- ----- ----- --------------------
x 0 Hello 2017-12-01 12:34:23
x 1 World 2017-12-03 04:20:01
x 2 Foo ...
x 3 New Item! 2017-12-12 11:26:32
x 3 Bar 2017-12-10 14:05:43
x 4 Baz ...
Now we can easily generate the ordering you want via a query:
SELECT id, rank, text,
ROW_NUMBER() OVER (ORDER BY rank, ts DESC) - 1 AS new_rank
FROM yourTable;
This would generate 0 to 5 ranks in the above sample table. The basic idea is to just use the already existing rank column, but to let the timestamp break the tie in ordering should the same rank appear more than once.
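The tie-breaking idea can be tried out with Python's built-in sqlite3 module. This is a sketch under several assumptions: SQLite stands in for PostgreSQL, the table name items and the ids are made up, the two timestamps elided in the sample table are replaced by hypothetical values, and the numbering is done with enumerate in Python rather than with ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE items (id TEXT, "rank" INTEGER, "text" TEXT, ts TEXT)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("a", 0, "Hello", "2017-12-01 12:34:23"),
        ("b", 1, "World", "2017-12-03 04:20:01"),
        ("c", 2, "Foo", "2017-12-05 00:00:00"),  # hypothetical timestamp
        ("d", 3, "Bar", "2017-12-10 14:05:43"),
        ("e", 4, "Baz", "2017-12-11 00:00:00"),  # hypothetical timestamp
        ("f", 3, "New Item!", "2017-12-12 11:26:32"),  # reuses rank 3
    ],
)

# Order by the stored rank; among equal ranks the newest row comes first.
ordered = conn.execute('SELECT "text" FROM items ORDER BY "rank", ts DESC').fetchall()

# Number the ordered rows from 0, like the new_rank column in the query above.
new_ranks = [(new_rank, text) for new_rank, (text,) in enumerate(ordered)]
for new_rank, text in new_ranks:
    print(new_rank, text)
```

The stored rank column keeps its duplicate value 3; only the derived numbering is gap-free.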
You can wrap this up in a function if you think it is worth it:
t=# with u as (
update r set rank = rank + 1 where rank >= 3
)
insert into r values('x',3,'New val!')
;
INSERT 0 1
the result:
t=# select * from r;
id | rank | text
----+------+----------
x | 0 | Hello
x | 1 | World
x | 2 | Foo
x | 3 | New val!
x | 4 | Bar
x | 5 | Baz
(6 rows)
It is also worth mentioning that you might run into concurrency problems (race conditions) on highly loaded systems. The code above is just a sample.
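As a rough illustration of wrapping the shift-and-insert in a function, here is a sketch using Python's sqlite3 module instead of PostgreSQL. The function name add_item is borrowed from the question and its signature is an assumption. The two statements run in one transaction rather than in a single CTE, and the sketch does nothing about concurrent writers.

```python
import sqlite3

def add_item(conn, item_id, rank, text):
    """Shift every row at or below the target rank down by one, then insert."""
    # "with conn" commits both statements together, or rolls both back on error.
    with conn:
        conn.execute('UPDATE r SET "rank" = "rank" + 1 WHERE "rank" >= ?', (rank,))
        conn.execute("INSERT INTO r VALUES (?, ?, ?)", (item_id, rank, text))

conn = sqlite3.connect(":memory:")
with conn:
    conn.execute('CREATE TABLE r (id TEXT, "rank" INTEGER, "text" TEXT)')
    conn.executemany(
        "INSERT INTO r VALUES (?, ?, ?)",
        [("x", 0, "Hello"), ("x", 1, "World"), ("x", 2, "Foo"),
         ("x", 3, "Bar"), ("x", 4, "Baz")],
    )

add_item(conn, "x", 3, "New val!")
result = conn.execute('SELECT "rank", "text" FROM r ORDER BY "rank"').fetchall()
for row in result:
    print(row)
```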
You can have a “computed rank” which is a double precision and a “displayed rank” which is an integer that is computed using the row_number window function on output.
When a row is inserted that should rank between two rows, compute the new rank as the arithmetic mean of the two ranks.
The advantage is that you don't have to update existing rows.
The downside is that you have to calculate the displayed ranks before you can insert a new row, so that you know where to insert it.
This solution (like all the others) is subject to race conditions.
To deal with these, you can either use table locks or serializable transactions.
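A minimal Python sketch of the computed-rank idea follows. The helper name rank_between and its handling of the two ends of the list (one less than the first rank, one more than the last) are assumptions added for illustration; the answer only specifies the mean of the two neighbouring ranks.

```python
def rank_between(before, after):
    """Computed rank for a row placed between two neighbours (None = no neighbour)."""
    if before is None and after is None:
        return 0.0
    if before is None:
        return after - 1.0   # assumption: new first row
    if after is None:
        return before + 1.0  # assumption: new last row
    return (before + after) / 2.0  # arithmetic mean, as described above

# (computed rank, text) pairs; existing rows are never updated
items = [(0.0, "Hello"), (1.0, "World"), (2.0, "Foo"), (3.0, "Bar"), (4.0, "Baz")]

# Insert between "Foo" (2.0) and "Bar" (3.0)
new_rank = rank_between(2.0, 3.0)
items.append((new_rank, "New Item!"))

# Displayed rank: position after sorting by computed rank
displayed = [(i, text) for i, (_, text) in enumerate(sorted(items))]
for row in displayed:
    print(row)
```

Repeated insertions into the same gap keep halving it, so a long-lived list may eventually need its computed ranks renumbered.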
The only way to prevent a race condition would be to lock the table
https://www.postgresql.org/docs/current/sql-lock.html
Of course this would slow you down if there are lots of updates and inserts.
If you can somehow limit the scope of your updates, then you can do a SELECT .... FOR UPDATE on that scope. For example, if the records have a parent_id, you can do a select for update on the parent record first, and any other insert that does the same select for update will have to wait until your transaction is done.
https://www.postgresql.org/docs/current/explicit-locking.html#:~:text=5.-,Advisory%20Locks,application%20to%20use%20them%20correctly.
Read the section on advisory locks to see if you can use those in your application. They are not enforced by the system so you'll need to be careful of how you write your application.
I have 3 tables:
items,
item_units,
order_items
The first table, items, has the list of items that can be ordered.
The second table, item_units, has the units for the items as well as the amount of the item in each unit.
The third table, order_items, has the items that were ordered, i.e. item_code, unit, qty.
Here are the columns for items:
[item_code]
[desc]
Here are the columns for item_units:
[item_code]
[unit]
[amount]
[price]
[default_sprd_sht]
Here are the columns for order_items:
[order_id]
[item_code]
[item_desc]
[unit]
[qty]
[price]
Note the [default_sprd_sht] field. It is a boolean. If it is set to true, this unit is never put into the order_items table; the field is used for calculation instead.
For example:
If one customer orders 2 six-packs of bread and another orders 3 dozen, the baker needs to know how many pans of bread to make.
A 6 pack unit has an amount of 6 breads, so 2 * 6 = 12. A dozen unit has 12 breads, so 12 * 3 = 36. A pan unit has 20 breads. So I need to add up the amounts for all the bread units and divide the total by the pan amount, like so:
((2*6) + (12 * 3)) / 20 = 2.4
So the first thing I did to create a report for the baker was
Create a group for order_items.item_code and then order_items.unit.
This needs to be done since the same item and unit combination will be repeated in different orders. The baker needs to see how many bagels or breads he needs to bake in total.
In the order_items.unit group header I created a formula field that multiplies the total order_items.qty for the unit by the unit's amount:
Sum ({order_items.qty}, {order_items.unit}) * {items_units.amount}
That was easy.
But I also need to group all order items if there is a record in items_units with the same item_code and with [default_sprd_sht] set to true.
This would look like so:
(Sum ({order_items.qty}, {order_items.unit}) * {items_units.amount}) / (get amount for unit with the same item_code and [default_sprd_sht] = 1)
I have two problems accomplishing this:
How to check if this order item has a unit with the same item_code and [default_sprd_sht] = 1?
How to further group order items only if there is a unit with the same item_code and [default_sprd_sht] = 1?
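The calculation described above can be written out in plain Python to make the two lookups explicit. This is only a sketch of the logic, not a Crystal Reports solution. The rows are invented (including a bagel item with no default unit), and falling back to the raw total when an item has no default unit is an assumption.

```python
from collections import defaultdict

# Hypothetical item_units rows: (item_code, unit, amount, default_sprd_sht)
item_units = [
    ("bread", "6 pack", 6, False),
    ("bread", "dozen", 12, False),
    ("bread", "pan", 20, True),
    ("bagel", "dozen", 12, False),
]
# Hypothetical order_items rows: (order_id, item_code, unit, qty)
order_items = [
    (1, "bread", "6 pack", 2),
    (2, "bread", "dozen", 3),
    (2, "bagel", "dozen", 1),
]

amount = {(code, unit): amt for code, unit, amt, _ in item_units}
# Problem 1: which items have a unit with default_sprd_sht set?
default_amount = {code: amt for code, _, amt, is_default in item_units if is_default}

# Total pieces ordered per item across all orders and units
pieces = defaultdict(int)
for _, code, unit, qty in order_items:
    pieces[code] += qty * amount[(code, unit)]

# Problem 2: convert to the default unit only where one exists
report = {
    code: total / default_amount[code] if code in default_amount else total
    for code, total in pieces.items()
}
print(report)
```

For the bread rows this reproduces the ((2*6) + (12 * 3)) / 20 example from the question.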
Need some Stack Overflow love. I read the other questions related to group by and sort by, and those answers don't seem to work for me. I am not really strong in the finer details of db statements, and I think there is some dependency or conflict between the group by and order by statements that I am missing.
I have a table (itemOptions) that holds all the options of an item and all the possible values for those options. Think multiple select/drop down boxes for an item and the list of values for each drop down. And unfortunately I can't change the db structure, as a web service is providing the sqlite file.
Here's the structure:
absItemId optionName optionPosition valueName valuePosition
item1 size 1 small 1
item1 size 1 medium 2
item1 size 1 large 3
item1 color 2 white 1
item1 color 2 red 2
item1 color 2 yellow 3
item2 name 1 willis 1
item2 name 1 smith 2
item2 name 1 bowman 3
The query needs to return the optionNames for a given item ordered by optionPosition, and then a list of valueNames for each option ordered by valuePosition, like this:
option valueNames
size small, medium, large
color white, red, yellow
I am able to get the grouping of valueNames by option to work, but when I try to add sorting anywhere, sqlite throws errors.
Here's my current sql statement that returns these results, however option and valueNames are never in order.
SELECT optionName, group_concat(valueName)
FROM itemOptions
WHERE absItemId = 'item1'
GROUP BY optionName
option valueNames
size medium, small, large
color yellow, red, white
Here are some of my failed attempts at adding sorting for valueName and optionName:
SELECT optionName, group_concat(valueName ORDER BY valuePosition DESC)
FROM itemOptions
WHERE absItemId = 'item1'
GROUP BY optionName
EDIT: sorting for the optionName grouping is working now with this. Only the valueName sorting within the group_concat is not working.
SELECT optionName, group_concat(valueName)
FROM itemOptions
WHERE absItemId = 'item1'
GROUP BY optionName
ORDER BY optionPosition
Try:
SELECT a.optionName, group_concat(a.valueName)
FROM (SELECT * FROM itemOptions ORDER BY valuePosition ASC) AS a
WHERE a.absItemId = 'item1'
GROUP BY a.optionName
ORDER BY a.optionPosition
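As far as I recall, SQLite's documentation has described the order of elements inside group_concat as arbitrary, so the subquery trick relies on behaviour that is not promised. If the caller can do the concatenation itself, one alternative (not the method above) is to let SQL do only the sorting and to join the values in the client. A sketch in Python with the built-in sqlite3 module, using the question's item1 rows inserted out of order:

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE itemOptions (absItemId TEXT, optionName TEXT, "
    "optionPosition INTEGER, valueName TEXT, valuePosition INTEGER)"
)
conn.executemany(
    "INSERT INTO itemOptions VALUES (?, ?, ?, ?, ?)",
    [  # deliberately inserted out of order
        ("item1", "color", 2, "yellow", 3),
        ("item1", "size", 1, "medium", 2),
        ("item1", "color", 2, "red", 2),
        ("item1", "size", 1, "small", 1),
        ("item2", "name", 1, "willis", 1),
        ("item1", "size", 1, "large", 3),
        ("item1", "color", 2, "white", 1),
    ],
)

rows = conn.execute(
    "SELECT optionName, valueName FROM itemOptions "
    "WHERE absItemId = ? ORDER BY optionPosition, valuePosition",
    ("item1",),
).fetchall()

# groupby merges adjacent rows only, which is enough because the query sorted them.
options = [
    (name, ", ".join(value for _, value in group))
    for name, group in groupby(rows, key=lambda row: row[0])
]
for option, value_names in options:
    print(option, value_names)
```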