Cross tab with a list of values instead of summation - crystal-reports

I want a cross-tab that lists the field values and counts each of them, instead of just showing a single count as the summary. I know I could build this with groups, but then I can't list the values vertically. From my research, I believe I have to use a Display String formula.
SQL Field Data
-------------------------------------------------
| Play # | Formation |Back Set | R/P | PLAY |
-------------------------------------------------
| 1 | TREY | FG | R | TRUCK |
-------------------------------------------------
| 2 | T | FG | R | RHINO |
-------------------------------------------------
| 3 | D | FG | P | 5 STEP |
-------------------------------------------------
| 4 | D | FG | P | 5 STEP |
-------------------------------------------------
| 5 | K JET | NG | R | DOG |
-------------------------------------------------
Desired report structure:
-----------------------------------------------------------
| Back Set & Formation | Run | Pass |
-----------------------------------------------------------
| NG K JET | BULLA 1 | |
| | HELL 3 | |
-----------------------------------------------------------
| FG D | | 5 STEP 2 |
-----------------------------------------------------------
| NG K JET | DOG | |
-----------------------------------------------------------
| FG T | RHINO | |
-----------------------------------------------------------

I don't see why a Crosstab is necessary for this, especially if the entire body of the report is just that table.
Group your records by Back Set and Formation. If that combination isn't already a field in your table, make a new Formula field and group on that.
Drop the 3 relevant fields into whichever section you need to display. (It might be a Footer, depending on whether or not you want repeats.)
Write a formula to determine whether Run or Pass is displayed, and place it in each field's suppression formula. (Good luck getting a Crosstab to do that for you! It tends to prefer 0s over blanks.)
If there's more to the report than just this table, you can cheat the system by placing your "table" into a subreport. And of course you can stretch Line objects across the sections, and they will stretch to form the table outlines.
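The grouping steps above can be sketched outside Crystal Reports to check the expected layout. This is only an illustration in Python using the sample rows from the question; the function names `build_report` and `render` are made up for the sketch and are not part of Crystal Reports.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (play #, formation, back set, R/P, play)
plays = [
    (1, "TREY", "FG", "R", "TRUCK"),
    (2, "T", "FG", "R", "RHINO"),
    (3, "D", "FG", "P", "5 STEP"),
    (4, "D", "FG", "P", "5 STEP"),
    (5, "K JET", "NG", "R", "DOG"),
]

def build_report(rows):
    """Group by 'back set + formation', then count each play under R or P."""
    groups = defaultdict(lambda: {"R": Counter(), "P": Counter()})
    for _num, formation, back_set, run_pass, play in rows:
        groups[f"{back_set} {formation}"][run_pass][play] += 1
    return groups

def render(groups):
    """Return (label, run cell, pass cell) tuples; the label appears once per group."""
    lines = []
    for label in sorted(groups):
        cells = groups[label]
        run = [f"{p} {n}" for p, n in sorted(cells["R"].items())]
        pas = [f"{p} {n}" for p, n in sorted(cells["P"].items())]
        for i in range(max(len(run), len(pas))):
            lines.append((
                label if i == 0 else "",
                run[i] if i < len(run) else "",   # blank instead of 0
                pas[i] if i < len(pas) else "",
            ))
    return lines

report = render(build_report(plays))
for label, run, pas in report:
    print(f"| {label:<10} | {run:<10} | {pas:<10} |")
```

Each group produces as many lines as its longer column needs, which is the "list the values vertically" behaviour the question asks for, and empty cells stay blank rather than showing 0.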

SQL parameter table

I suspect this question is already well-answered but perhaps due to limited SQL vocabulary I have not managed to find what I need. I have a database with many code:description mappings in a single 'parameter' table. I would like to define a query or procedure to return the descriptions for all (or an arbitrary list of) coded values in a given 'content' table with their descriptions from the parameter table. I don't want to alter the original data, I just want to display friendly results.
Is there a standard way to do this?
Can it be accomplished with SELECT or are other statements required?
Here is a sample query for a single coded field:
SELECT TOP (5)
    newid() as id,
    B.BRIDGE_STATUS,
    P.SHORTDESC
FROM
    BRIDGE B
    LEFT JOIN PARAMTRS P ON P.TABLE_NAME = 'BRIDGE'
        AND P.FIELD_NAME = 'BRIDGE_STATUS'
        AND P.PARMVALUE = B.BRIDGE_STATUS
ORDER BY
    id
I want to produce 'decoded' results like:
| id | BRIDGE_STATUS |
|--------------------------------------|---------------|
| BABCEC1E-5FE2-46FA-9763-000131F2F688 | Active |
| 758F5201-4742-43C6-8550-000571875265 | Active |
| 5E51634C-4DD9-4B0A-BBF5-00087DF71C8B | Active |
| 0A4EA521-DE70-4D04-93B8-000CD12B7F55 | Inactive |
| 815C6C66-8995-4893-9A1B-000F00F839A4 | Proposed |
Rather than original, coded data like:
| id | BRIDGE_STATUS |
|--------------------------------------|---------------|
| F50214D7-F726-4996-9C0C-00021BD681A4 | 3 |
| 4F173E40-54DC-495E-9B84-000B446F09C3 | 3 |
| F9C216CD-0453-434B-AFA0-000C39EFA0FB | 3 |
| 5D09554E-201D-4208-A786-000C537759A1 | 1 |
| F0BDB9A4-E796-4786-8781-000FC60E200C | 4 |
but for an arbitrary number of columns.
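One possible way to extend the single-column query to an arbitrary list of coded columns is to generate one LEFT JOIN against PARAMTRS per column. The sketch below uses Python with an in-memory SQLite database purely so that it runs anywhere; SQLite has no TOP or newid(), so those are left out. The BRIDGE_ID and DECK_TYPE columns, the code values, and the helper name `decode_query` are assumptions made for the example, not part of the original schema.

```python
import sqlite3

def decode_query(table, key, coded_columns):
    """Build a SELECT that LEFT JOINs PARAMTRS once per coded column.

    Identifiers are interpolated into the SQL text, so they must come
    from a trusted list (for example, the catalog), never from user input.
    """
    selects, joins = [f"t.{key}"], []
    for i, col in enumerate(coded_columns):
        alias = f"p{i}"
        # Fall back to the raw code when no description exists.
        selects.append(f"COALESCE({alias}.SHORTDESC, t.{col}) AS {col}")
        joins.append(
            f"LEFT JOIN PARAMTRS {alias} ON {alias}.TABLE_NAME = '{table}' "
            f"AND {alias}.FIELD_NAME = '{col}' AND {alias}.PARMVALUE = t.{col}"
        )
    return (
        f"SELECT {', '.join(selects)} FROM {table} t "
        f"{' '.join(joins)} ORDER BY t.{key}"
    )

# Hypothetical data: the real code-to-description mapping is not given.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BRIDGE (BRIDGE_ID INTEGER, BRIDGE_STATUS TEXT, DECK_TYPE TEXT);
    CREATE TABLE PARAMTRS (TABLE_NAME TEXT, FIELD_NAME TEXT,
                           PARMVALUE TEXT, SHORTDESC TEXT);
    INSERT INTO BRIDGE VALUES (1, '3', 'C'), (2, '1', 'S'), (3, '9', 'C');
    INSERT INTO PARAMTRS VALUES
        ('BRIDGE', 'BRIDGE_STATUS', '3', 'Active'),
        ('BRIDGE', 'BRIDGE_STATUS', '1', 'Inactive'),
        ('BRIDGE', 'DECK_TYPE', 'C', 'Concrete'),
        ('BRIDGE', 'DECK_TYPE', 'S', 'Steel');
""")

sql = decode_query("BRIDGE", "BRIDGE_ID", ["BRIDGE_STATUS", "DECK_TYPE"])
decoded = conn.execute(sql).fetchall()
for row in decoded:
    print(row)
```

So a plain SELECT is enough; the only procedural part is generating the join list. The original data is never modified, and codes without a matching parameter row (such as '9' here) are shown unchanged.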

How to handle redistribution/allocation algorithm using Spark in Scala

Let's say I have a bunch of penguins around the country and I need to allocate food provisioning (which are distributed around the country as well) to the penguins.
I have tried to simplify the problem as follows:
Input
The distribution of the penguins by area, grouped by proximity and prioritized as:
+------------+------+-------+--------------------------------------+----------+
| PENGUIN ID | AREA | GROUP | PRIORITY (lower are allocated first) | QUANTITY |
+------------+------+-------+--------------------------------------+----------+
| P1 | A | A1 | 1 | 5 |
| P2 | A | A1 | 2 | 5 |
| P3 | A | A2 | 1 | 5 |
| P4 | B | B1 | 1 | 5 |
| P5 | B | B2 | 1 | 5 |
+------------+------+-------+--------------------------------------+----------+
The distribution of the food by area, also grouped by proximity and prioritized as:
+---------+------+-------+--------------------------------------+----------+
| FOOD ID | AREA | GROUP | PRIORITY (lower are allocated first) | QUANTITY |
+---------+------+-------+--------------------------------------+----------+
| F1 | A | A1 | 2 | 5 |
| F2 | A | A1 | 1 | 2 |
| F3 | A | A2 | 1 | 7 |
| F4 | B | B1 | 1 | 7 |
+---------+------+-------+--------------------------------------+----------+
Expected output
The challenge is to allocate the food to the penguins in the same group first, respecting the priority order of both food and penguins, and then to carry any leftover food over to the other groups and areas.
So, based on the data above, we would first allocate within the same area and group:
Stage 1: A1 (same area and group)
+------+-------+---------+------------+--------------------+
| AREA | GROUP | FOOD ID | PENGUIN ID | ALLOCATED_QUANTITY |
+------+-------+---------+------------+--------------------+
| A | A1 | F2 | P1 | 2 |
| A | A1 | F1 | P1 | 3 |
| A | A1 | F1 | P2 | 2 |
| A | A1 | X | P2 | 3 |
+------+-------+---------+------------+--------------------+
Stage 1: A2 (same area and group)
+------+-------+---------+------------+--------------------+
| AREA | GROUP | FOOD ID | PENGUIN ID | ALLOCATED_QUANTITY |
+------+-------+---------+------------+--------------------+
| A | A2 | F3 | P3 | 5 |
| A | A2 | F3 | X | 2 |
+------+-------+---------+------------+--------------------+
Stage 2: A (same area, food left from Stage 1:A2 can now be delivered to Stage 1:A1 penguin)
+------+---------+------------+--------------------+
| AREA | FOOD ID | PENGUIN ID | ALLOCATED_QUANTITY |
+------+---------+------------+--------------------+
| A | F2 | P1 | 2 |
| A | F1 | P1 | 3 |
| A | F1 | P2 | 2 |
| A | F3 | P3 | 5 |
| A | F3 | P2 | 2 |
| A | X | P2 | 1 |
+------+---------+------------+--------------------+
We then continue in the same way for Stage 3 (across AREA), Stage 4 (across AREA2, a by-train geography that cuts the country differently from the by-truck AREA, so we can't just re-aggregate), Stage 5, and so on.
What I tried
I know how to do this efficiently in R with a few for loops and array pointers, building the output row by row for each allocation. With Spark/Scala, however, I could only come up with large, inefficient code for what is a simple problem, so I would like to ask the community, because I have probably just missed some Spark functionality.
I can do it with a long chain of Spark transformations (withColumn, groupBy, agg(sum), join, union, filter), but the DAG becomes so big that building it starts to slow down after 5 or 6 stages. I can work around that by saving the output to a file after each stage, but then I run into I/O problems because there are millions of records to save per stage.
I can also run a UDAF (with a .split() buffer) for each stage, explode the result, and join it back to the original table to update the quantities for that stage. This makes the DAG much simpler and faster to build, but, probably because of the string manipulation inside the UDAF, it is too slow on a few partitions.
In the end, both methods feel wrong: they are more like hacks, and there must be a simpler way to solve this. Ideally I would prefer to use transformations so as not to lose lazy evaluation, since this is just one step among many other transformations.
Thanks a lot for your time. I'm happy to discuss any suggested approach.
This is pseudocode/description rather than working code, but here is my solution to Stage 1. The problem is pretty interesting, and I thought you described it quite well.
My thought is to use Spark's window functions, struct, collect_list (and maybe sortWithinPartitions), cumulative sums, and lag to get to something like this:
C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8
P1 | A | A1 | 5 | 0 | [(F1,2), (F2,7)] | [F2] | 2
P2 | A | A1 | 10 | 5 | [(F1,2), (F2,7)] | [] | -3
C4 = cumulative sum of penguin quantity, partitioned by area/group and ordered by priority
C5 = C4 lagged by one row, with null replaced by 0
C6 = array of (food id, cumulative food quantity) structs for the group
C7/C8 = ids of the food that still has something left / remaining food quantity
Now you can use a plain UDF to return the array of foods that belong to a penguin: find the first entry of C6 whose cumulative quantity is greater than C5 and the first entry whose cumulative quantity is at least C4, and return everything from the former to the latter. If no entry's cumulative quantity reaches C4, the penguin is short of food and you append X. Exploding the resulting array gives you one row per penguin and food, including an X row for any penguin that does not have enough food.
To determine whether there is extra food, you can have a UDF that calculates the amount of "remaining food" for each row, and use a window with row_number to get the last area that is fed. If the remaining food is > 0, those food ids have leftover food; this will be reflected in the array, and you can also make it a struct that maps each food id to the amount left over.
I think in the end I'm still doing a fair number of aggregations, but hopefully grouping some things together into arrays makes it faster to do comparisons across each individual item.
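The cumulative-sum idea can be checked on a single group with a small, non-Spark sketch. The Python below is an illustration only, not the answerer's code: it treats each penguin and each food as a range on a cumulative quantity axis and allocates the overlap of the ranges. The function name `allocate` and the tuple layout are assumptions made for the example.

```python
from itertools import accumulate

def allocate(penguins, foods):
    """Allocate food within one group.

    penguins, foods: lists of (id, priority, quantity).
    Returns (food_id, penguin_id, allocated_quantity) rows. "X" marks
    unmet demand (as a food id) or leftover food (as a penguin id).
    """
    penguins = sorted(penguins, key=lambda r: r[1])
    foods = sorted(foods, key=lambda r: r[1])

    def intervals(rows):
        # Turn quantities into (start, end) ranges on a cumulative axis,
        # i.e. the lagged and the running cumulative sum.
        ends = list(accumulate(q for _, _, q in rows))
        starts = [0] + ends[:-1]
        return [(rid, s, e) for (rid, _, _), s, e in zip(rows, starts, ends)]

    p_iv, f_iv = intervals(penguins), intervals(foods)
    demand = p_iv[-1][2] if p_iv else 0
    supply = f_iv[-1][2] if f_iv else 0
    # Pad the shorter side with a placeholder "X" range.
    if demand > supply:
        f_iv.append(("X", supply, demand))
    elif supply > demand:
        p_iv.append(("X", demand, supply))

    rows = []
    for pid, ps, pe in p_iv:
        for fid, fs, fe in f_iv:
            overlap = min(pe, fe) - max(ps, fs)
            if overlap > 0:
                rows.append((fid, pid, overlap))
    return rows

# Group A1 from the question.
stage1_a1 = allocate(
    penguins=[("P1", 1, 5), ("P2", 2, 5)],
    foods=[("F1", 2, 5), ("F2", 1, 2)],
)
# Group A2 from the question.
stage1_a2 = allocate(penguins=[("P3", 1, 5)], foods=[("F3", 1, 7)])
print(stage1_a1)
print(stage1_a2)
```

The nested loop is quadratic and only meant to show the logic. In Spark the same comparison would presumably be done per group, either inside a UDF over collected arrays as described above or as a join on overlapping ranges.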

How to make a Tableau graph with multiple dimensions in the same line?

I have the following table, with names in column 1 and various questions that are answered 'Y' or 'N', and I want to create a graph like the one in the link below. I want the Ys to show up in the graph.
I tried an IF-ELSE calculation, but it returns only the first condition that passes and ignores the rest, so my viz now has just one mark per line item.
http://imgur.com/a/2G52b
*I've replaced the 'N' with blanks in this table here
+--------+----+----+----+----+----+----+
| Name | Q1 | Q2 | Q4 | Q5 | Q6 | Q7 |
+--------+----+----+----+----+----+----+
| Bhansa | | Y | | | | |
| Chaga | Y | Y | | | | Y |
| Chang | | | | Y | Y | |
| Cooke | | Y | | Y | | |
+--------+----+----+----+----+----+----+
As user Ben mentioned, the trick here is to do a pivot. You can do that by selecting the Question columns in the Data Source tab, right-clicking any of their headers, and choosing Pivot.
Once you have pivoted the data, you can create the chart. Please note that we are using a filter on 'Pivot Field Values' to filter out the 'N' values.
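To make the effect of the pivot concrete, here is a rough Python sketch of the same reshaping on the table from the question. It is not Tableau code; the function name `pivot_to_long` is made up, and the blanks are treated as 'N' as the question describes.

```python
# Wide table from the question; blanks stand for 'N'.
columns = ["Q1", "Q2", "Q4", "Q5", "Q6", "Q7"]
wide = [
    ("Bhansa", ["", "Y", "", "", "", ""]),
    ("Chaga", ["Y", "Y", "", "", "", "Y"]),
    ("Chang", ["", "", "", "Y", "Y", ""]),
    ("Cooke", ["", "Y", "", "Y", "", ""]),
]

def pivot_to_long(rows, question_names):
    """Emit one (name, question, value) row per cell, like a pivot of the Q columns."""
    return [
        (name, question, value or "N")
        for name, answers in rows
        for question, value in zip(question_names, answers)
    ]

long_rows = pivot_to_long(wide, columns)

# Equivalent of filtering 'Pivot Field Values' to keep only 'Y'.
marks = [(name, question) for name, question, value in long_rows if value == "Y"]
for name, question in marks:
    print(name, question)
```

After the reshape every 'Y' is its own row, so each one can become its own mark, which is what the IF-ELSE calculation could not do because it returns a single value per original row.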

Tableau - Show multiple discrete string (dropdown) dimensions side-by-side in a single table

I have a list of survey results that looks similar to the following:
| Email | Question 1 | Question 2 |
| ----------------- | ---------- | ---------- |
| test@example.com | Always | Sometimes |
| test2@example.com | Always | Always |
| test3@example.com | Sometimes | Never |
Question 1 and Question 2 (and a few others) have the same discrete set of values (from a dropdown list on the survey).
I want to show the data in the following format in Tableau (a table is fine, but a heatmap or highlight table would be best):
| | Always | Sometimes | Never |
| ---------- | ------ | --------- | ----- |
| Question 1 | 2 | 1 | 0 |
| Question 2 | 1 | 1 | 1 |
How can I achieve this? I've tried various combinations of rows and columns and I just can't seem to get close to this layout. Do I need to use a calculated value?
As far as I know, this is not natively possible in Tableau, because what you have is essentially a pivot table.
What you can do is unpivot the whole table as explained here: https://stackoverflow.com/a/20543651/5130012. Then you can load the data into Tableau and create the table you want.
I created some dummy data and tried it.
That's my "unpivoted" table:
Row,Column,Value
test,q1,always
test,q2,sometimes
test1,q1,sometimes
test1,q2,never
test10,q1,always
test10,q2,always
test11,q1,sometimes
test11,q2,never
And that's how it looks in Tableau:
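As a cross-check outside Tableau, the unpivot-then-count idea can be sketched in a few lines of Python. The three survey rows come from the question, with the respondent ids shortened; the function names `unpivot` and `count_matrix` are hypothetical.

```python
from collections import Counter

# Survey rows from the question (respondent ids shortened).
survey = [
    {"Email": "test", "Question 1": "Always", "Question 2": "Sometimes"},
    {"Email": "test2", "Question 1": "Always", "Question 2": "Always"},
    {"Email": "test3", "Question 1": "Sometimes", "Question 2": "Never"},
]
answer_order = ["Always", "Sometimes", "Never"]

def unpivot(rows, id_field):
    """Turn each wide row into (id, question, answer) rows."""
    return [
        (row[id_field], question, answer)
        for row in rows
        for question, answer in row.items()
        if question != id_field
    ]

def count_matrix(long_rows, answers):
    """Count respondents per (question, answer); missing pairs count as 0."""
    counts = Counter((question, answer) for _id, question, answer in long_rows)
    questions = sorted({question for _id, question, _answer in long_rows})
    return {q: [counts[(q, a)] for a in answers] for q in questions}

matrix = count_matrix(unpivot(survey, "Email"), answer_order)
for question, row in matrix.items():
    print(question, row)
```

Once the data is in Row, Column, Value form, the desired table is just a count of rows per question and answer, which is why the unpivoted source works in Tableau without a calculated field.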

Tableau to create single chart from multiple parameters

I have a Tableau workbook online.
Before, I had a filter for a single Principal, applied to all CUSIPs, and I was able to plot all the inflation-adjusted principals based on the Index ratios for a particular date (refer to tab Inflation-Adjusted Trend).
Now I have multiple filters based on multiple Principals, i.e. buy one CUSIP for $1500, buy another for $900, etc. (refer to tab Infl-Adjusted Trend 2).
These were the columns and rows.
But I do not like the format of this graph.
I would like to have all the lines together in one graph, just like the single-principal tab. How can I fix this and bring all the values into one chart?
You currently have six calculated fields calculating your inflation-adjusted principals, one for each CUSIP. Here's what that table might end up looking like:
+-----------+-------------+-------------+-------------+-------------+-----+
| CUSIP | 912828H45 P | 912828NM8 P | 912828PP9 P | 912828QV5 P | ... |
+-----------+-------------+-------------+-------------+-------------+-----+
| 912828H45 | $100 | NULL | NULL | NULL | ... |
| 912828NM8 | NULL | $455 | NULL | NULL | ... |
| 912828PP9 | NULL | NULL | $132 | NULL | ... |
| 912828QV5 | NULL | NULL | NULL | $553 | ... |
| ... | ... | ... | ... | ... | ... |
+-----------+-------------+-------------+-------------+-------------+-----+
There's definitely a better way. Your fields are set up like this:
IF [Cusip] = "912828H45"
THEN
[912828H45 Principal] * [Index Ratio]
END
Instead of setting up one field per CUSIP, make a single field that calculates that value for each CUSIP.
IF [Cusip] = "912828H45"
THEN
[912828H45 Principal] * [Index Ratio]
ELSEIF [Cusip] = "912828NM8"
THEN
[912828NM8 Principal] * [Index Ratio]
...
END
Now your table looks like this.
+-----------+------------------------------+-----+
| CUSIP | Inflation-Adjusted Principal | ... |
+-----------+------------------------------+-----+
| 912828H45 | $100 | ... |
| 912828NM8 | $455 | ... |
| 912828PP9 | $132 | ... |
| 912828QV5 | $553 | ... |
| ... | ... | ... |
+-----------+------------------------------+-----+
That's a LOT easier to work with. Drag that single field into Rows and color by [Cusip].
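The same consolidation can be pictured as a lookup: one function that picks the principal for whichever CUSIP the row belongs to. The Python sketch below only illustrates that logic and is not Tableau syntax; the principal amounts, dates, and index ratios are invented for the example, and `inflation_adjusted_principal` is a hypothetical name.

```python
# Hypothetical principal per CUSIP (stand-ins for the per-CUSIP parameters).
principals = {"912828H45": 1500.0, "912828NM8": 900.0}

def inflation_adjusted_principal(cusip, index_ratio):
    """Single 'calculated field': one value per row, whatever the CUSIP."""
    principal = principals.get(cusip)
    if principal is None:
        return None  # like Tableau's NULL when no IF/ELSEIF branch matches
    return principal * index_ratio

# Made-up rows: (cusip, date, index ratio).
rows = [
    ("912828H45", "d1", 1.0),
    ("912828H45", "d2", 1.25),
    ("912828NM8", "d1", 1.5),
]

# One series per CUSIP, all taken from the same field.
series = {}
for cusip, date, ratio in rows:
    value = inflation_adjusted_principal(cusip, ratio)
    series.setdefault(cusip, []).append((date, value))

for cusip, points in series.items():
    print(cusip, points)
```

Because every row carries its value in the same field, a single measure can be plotted with one line per CUSIP, instead of one measure (and one chart pane) per CUSIP.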