Creating Calculated Fields in Google Data Studio - categories

I would like to create categories based on the count of a variable.
CASE
WHEN COUNT(variable) = 1 THEN "1"
WHEN COUNT(variable) = 2 THEN "2"
WHEN COUNT(variable) = 3 THEN "3"
WHEN COUNT(variable) = 4 THEN "4"
WHEN COUNT(variable) >= 5 THEN ">5"
END
I get an error that says that my formula is not valid. However, I cannot see where the mistake is and Google does not offer help in this regard.

This takes a little getting used to in Data Studio, but not every function can be used inside a CASE statement (as noted in the documentation), and COUNT in your formula is one of those that can't.
Here's how you can work around this limitation:
Create a new calculated field with the value of COUNT(variable)
Set the new field's aggregation type to Sum in the field list
Then create your CASE statement formula referencing that new field
If you don't want this extra field showing up in reports, you can disable it in the data source (it can still be used by your other formula).
Also note that the input of COUNT itself cannot be an aggregate value (e.g. result of SUM or a metric with the aggregation type set).
This is an incredibly frustrating bit of Data Studio, as you end up with a lot of these helper fields floating around and every such formula takes an extra step. The vague error message doesn't help either.
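For readers who find the two-step workaround easier to follow in code, here is a minimal Python sketch of the same logic: compute the count first, then bucket the resulting number. It is only an illustration and not Data Studio syntax; the function names `bucket` and `categorize` and the sample values are made up for the example.

```python
from collections import Counter


def bucket(count: int) -> str:
    """Map a count to a category label, mirroring the CASE formula."""
    if count >= 5:
        return ">5"  # same label the question uses for counts of 5 or more
    return str(count)


def categorize(values):
    """Step 1: count occurrences per value. Step 2: bucket each count."""
    counts = Counter(values)  # plays the role of the helper COUNT(variable) field
    return {value: bucket(n) for value, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample data: "a" occurs 5 times, "b" twice, "c" once.
    print(categorize(["a", "b", "a", "c", "a", "a", "a", "b"]))
```

The point of the split is the same as in the workaround: the bucketing step only ever sees a plain number, never the aggregation itself.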

No range function with step in azure data factory

I have a Set Variable activity which uses the logic:
@range(int(pipeline().parameters.start),int(pipeline().parameters.end))
It is weird that I can't find anything in the documentation that lets me specify a step, so that I can generate numbers such as the ones shown below
1,3,5,7,9,...
Is there a workaround, other than introducing a new parameter equal to the step and generating the next number using the logic last = last+step?
It is possible to do this using the Filter activity and the range function. Use the range function to generate all numbers, and then a Filter condition with mod to keep the odd numbers, i.e.

Property    Value
Items       @range(1,10)
Condition   @equals(mod(item(),2),1)
A screenprint of the results:
The other way to do it would be to use a Lookup activity and query a numbers table.
I agree with you that it's a shame range does not have a step argument, and that generally the ADF expression language isn't a bit more fully featured.
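As a rough illustration of the generate-then-filter idea, here is a small Python sketch. It assumes that the second argument of the ADF range function is a count of items rather than an end value, so range(1,10) is taken to produce 1 through 10; the helper name `stepped_range` is hypothetical.

```python
def stepped_range(start: int, count: int, step: int) -> list[int]:
    """Generate `count` consecutive integers from `start`, then keep every `step`-th one.

    Mirrors a range(start, count) expression followed by a Filter on mod(item(), step).
    """
    numbers = range(start, start + count)  # all consecutive candidates
    remainder = start % step               # keep items aligned with the first one
    return [n for n in numbers if n % step == remainder]


if __name__ == "__main__":
    print(stepped_range(1, 10, 2))  # [1, 3, 5, 7, 9]
```

Comparing against `start % step` instead of a fixed 1 keeps the filter correct when the sequence starts on an even number or uses a step other than 2.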

GROUP BY CLAUSE using SYNCSORT

I have some content in a file on which I must generate statistics, such as how many records are of type 1, type 2, etc. The number of types can change and is unknown to the code until the file arrives. In a SQL system, I can do this using COUNT and a GROUP BY clause, but I am not sure whether I can do this using SYNCSORT or a COBOL program. Would anyone here have an idea of how I can implement a 'GROUP BY' type query on a file using SYNCSORT?
Sample Data:
TYPE001 SUBTYPE001 TYPE01-DESC
TYPE001 SUBTYPE002 TYPE01-DESC
TYPE001 SUBTYPE003 TYPE01-DESC
TYPE002 SUBTYPE001 TYPE02-DESC
TYPE002 SUBTYPE004 TYPE02-DESC
TYPE002 SUBTYPE008 TYPE02-DESC
I want to get the information such as TYPE001 ==> 3 Records, TYPE002 ==> 3 Records. What the code doesn't know until runtime is the TYPENNN value
You show data already in sequence, so there is no need to sort the data itself, which makes SUM FIELDS= with SORT a poor solution if anyone suggests it (plus code for the formatting).
MERGE with a single input file and SUM FIELDS= would be better, but still require the code for formatting.
The simplest way to produce output which may suit you is to use OUTFIL reporting functions:
  OPTION COPY
  OUTFIL NODETAIL,
         REMOVECC,
         SECTIONS=(1,7,
           TRAILER3=(1,7,
                     ' ==> ',
                     COUNT=(M10,LENGTH=3),
                     ' Records'))
The NODETAIL says "remove all the data lines". The REMOVECC says "although it is a report, don't use printer-control characters on position one of the output records". The SECTIONS says "we're going to use control-breaks, and here they (it in this case) are". In this case, your control-field is 1,7. The TRAILER3 defines the output which will be produced at each control-break: COUNT here is the number of records in that particular break. M10 is an editing mask which will change leading zeros to blanks. The LENGTH gives a length to the output of COUNT, three is chosen from your sample data with sub-types being unique and having three digits as the unique part of the data. Change to whatever suits your actual data.
You've not been clear, and perhaps you want the output "floating" (3bb instead of bb3, where b represents a blank)? That would require more code...
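To make the control-break logic concrete, here is a minimal Python sketch of what the SECTIONS/TRAILER3 report does conceptually: it walks input that is already in key sequence, and emits one count line each time the first seven characters change. This is an illustration only, not SYNCSORT syntax; the function name `section_counts` and its parameters are made up for the example.

```python
from itertools import groupby


def section_counts(lines, key_len=7, width=3):
    """Yield one trailer line per control break on the first `key_len` characters.

    Like the sort step, this relies on the input already being in key sequence.
    """
    for key, group in groupby(lines, key=lambda line: line[:key_len]):
        count = sum(1 for _ in group)
        # Right-align the count in `width` columns, blanks instead of leading zeros.
        yield f"{key} ==> {count:>{width}} Records"


if __name__ == "__main__":
    sample = [
        "TYPE001 SUBTYPE001 TYPE01-DESC",
        "TYPE001 SUBTYPE002 TYPE01-DESC",
        "TYPE001 SUBTYPE003 TYPE01-DESC",
        "TYPE002 SUBTYPE001 TYPE02-DESC",
        "TYPE002 SUBTYPE004 TYPE02-DESC",
        "TYPE002 SUBTYPE008 TYPE02-DESC",
    ]
    for line in section_counts(sample):
        print(line)
```

As with the control cards, the type values are never hard-coded; each new key is discovered when it first appears in the data.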

Tableau: Create a table calculation that sums distinct string values (names) when condition is met

I am getting my data from denormalized table, where I keep names and actions (apart from other things). I want to create a calculated field that will return sum of workgroup names but only when there are more than five actions present in DB for given workgroup.
Here's how I have done it when I wanted to check if certain action has been registered for workgroup:
WINDOW_SUM(COUNTD(IF [action] = "ADD" THEN [workgroup_name] END))
When I try to do similar thing with count, I am getting "Cannot mix aggregate and non-aggregate arguments":
WINDOW_SUM(COUNTD(IF COUNT([Number of Records]) > 5 THEN [workgroup_name] END))
I know that there's problem with the IF clause, but don't know how to fix it.
How to change the IF to be valid? Maybe there's an easier way to do it, that I am missing?
EDIT:
(after Inox's response)
I know that my problem is mixing aggregate with non-aggregate fields. I can't use filter to do it, because I want to use it later as a part of more complicated view - filtering would destroy the whole idea.
No, the problem is mixing aggregated arguments (e.g., sum, count) with non-aggregated ones (e.g., any field used directly). That is what you're doing when you mix COUNT([Number of Records]) with [workgroup_name].
If your goal is to know how many unique workgroup_name values have more than 5 records (which is what your code seems to aim for), I think it's easier to filter first and then count.
So first drag workgroup_name to Filters, go to the Condition tab, and select By field, Number of Records, Count, >, 5
This way you'll filter only the workgroup_name that has more than 5 records.
Now you can go with a simple COUNTD(workgroup_name)
EDIT: After clarification
Okay, then you need to add a marker that is fixed in your database. So table calculations won't help you.
By definition table calculation depends on the fields that are on the worksheet (and how you decide to use those fields to partition or address), and it's only calculated AFTER being called in a sheet. That way, each time you call the function it will recalculate, and for some analysis you may want to do, the fields you need to make the table calculation correct won't be there.
Same thing applies to aggregations (counts, sums,...), the aggregation depends, well, on the level of aggregation you have.
In this case it's better that you manipulate your data prior to connecting it to Tableau. I don't see a direct way (a single calculated field that would solve your problem). What can be done is to generate a db from Tableau (with the aggregation of number of records for each workgroup_name) then export it to csv or mdb and then reconnect it to Tableau. But if you can manipulate your database outside Tableau, it's usually a better solution
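If you do pre-process the data outside Tableau, the aggregation itself is small. Below is a minimal Python sketch, assuming the denormalized rows are available as dictionaries with a `workgroup_name` key; the function name `workgroups_over` and the threshold parameter are hypothetical.

```python
from collections import Counter


def workgroups_over(rows, threshold=5):
    """Return the set of workgroup names that have more than `threshold` records."""
    counts = Counter(row["workgroup_name"] for row in rows)
    return {name for name, n in counts.items() if n > threshold}


if __name__ == "__main__":
    # Hypothetical sample: 6 rows for alpha, 5 for beta, 7 for gamma.
    rows = (
        [{"workgroup_name": "alpha", "action": "ADD"}] * 6
        + [{"workgroup_name": "beta", "action": "ADD"}] * 5
        + [{"workgroup_name": "gamma", "action": "EDIT"}] * 7
    )
    flagged = workgroups_over(rows)
    print(sorted(flagged), len(flagged))
```

The resulting set can be written back as a fixed marker column per workgroup, which is the kind of pre-computed field the answer suggests bringing into Tableau.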

Calculate sum in script in ABBYY Flexicapture

I would like to perform the function of a Calculate Sum rule with a Script rule in ABBYY Flexicapture, because I want to only perform the calculation based on evaluation of an if statement.
I have tried the following in a C# script rule:
IFields AllTaxes = Context.Field("cu_location_taxes").Rows;
which gives me the error "Field is not a table."
I have also tried
IFields AllTaxes = Context.Field("cu_location_taxes").Children;
which gives me the error "Cannot access fields not specified in rule settings." Even though I have added the repeating group cu_location_taxes to the c# script rule.
Once I am able to get them in some kind of array or list or IFields variable, I would like to sum the children values in some way. I am open to doing this with JScript or C#.
The reasons for the errors you are facing can be found in the ABBYY FlexiCapture Help.
In the description of IField class you can find the following descriptions of properties:
Rows - A set of table rows. Unavailable for non-table fields.
So it seems that "cu_location_taxes" is not a table. You said it is a repeating group.
Children - Child items of the field (cells for tables). Unavailable in script rules.
But as I understand it, a script rule is exactly what you are using.
To get correct results, use the Items property of the very fields that you are summing.
For example, you have a repeating group that contains number fields field_1 and field_2. And you want to calculate the sum of all field_1 instances.
Then you can use the following code (JScript):
// Add up the value of every field_1 instance in the repeating group
var sum = 0;
for (var i = 0; i < this.Field("field_1").Items.Count; ++i)
{
    sum += this.Field("field_1").Items.Item(i).Value;
}
Also, do not forget to add field_1 to the available fields in your rule settings.
Hope this will help.

Between... And... To work without values

I've tried to do this in a million different ways. At first I couldn't get it to work at all, but now I've managed to get it to work if I put in values.
What I need to happen is for my query to filter my records based on what I put into my form.
I've used this code in the 'Criteria' section of my MovieYear column, and when I put in numbers into my MovieYear1 and MovieYear2 text boxes in my form, it filters correctly.
Between [Forms]![SearchForm]![MovieYear1] And [Forms]![SearchForm]![MovieYear2]
But if I don't put in any values, it doesn't come up with any records at all. Any help?
I've tried pretty much everything (well, at least I think I have). I've tried using wildcards "*" but then I found out you can't actually use them with Between functions...
I've also tried doing Me.Filter in VBA, but it didn't seem to work. Maybe I just missed something?
This is my form.
Thanks in advance! :)
You can add a check for a Null in the form to the query, for example
SELECT *
FROM table
WHERE MovieYear Between [Forms]![SearchForm]![MovieYear1]
And [Forms]![SearchForm]![MovieYear2]
OR [Forms]![SearchForm]![MovieYear1] Is Null
This will return all records if the first year is null. The second year will be ignored.
You could use a conditional query builder where after checking the value of the boxes you could build the query as per the following cases :
if only MovieYear1 is given then data from all years after MovieYear1 that is date>MovieYear1.
if only MovieYear2 is given then data from all years before MovieYear2 that is date<MovieYear2.
if both are given then use the between clause to get the data.
This can be implemented using CASE WHEN along the lines of the following
CASE
WHEN MovieYear2 IS NULL THEN date > MovieYear1
WHEN MovieYear1 IS NULL THEN date < MovieYear2
ELSE date BETWEEN MovieYear1 AND MovieYear2
END
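The "conditional query builder" idea can be sketched in a few lines of Python. This is a minimal illustration using an in-memory SQLite database rather than the asker's form and query; the table name `Movies`, the `Title` column, and the function `find_movies` are assumptions made for the example. It also treats both bounds as inclusive, which matches Between rather than the strict comparisons listed above.

```python
import sqlite3


def find_movies(conn, year_from=None, year_to=None):
    """Return titles filtered by an optional, inclusive year range.

    A bound that is None (an empty text box) adds no condition at all,
    so leaving both empty returns every record.
    """
    clauses, params = [], []
    if year_from is not None:
        clauses.append("MovieYear >= ?")
        params.append(year_from)
    if year_to is not None:
        clauses.append("MovieYear <= ?")
        params.append(year_to)
    sql = "SELECT Title FROM Movies"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY MovieYear"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Movies (Title TEXT, MovieYear INTEGER)")
    conn.executemany(
        "INSERT INTO Movies VALUES (?, ?)",
        [("A", 1990), ("B", 2000), ("C", 2010)],
    )
    print(find_movies(conn))              # no bounds given
    print(find_movies(conn, 1995))        # lower bound only
    print(find_movies(conn, None, 2005))  # upper bound only
    print(find_movies(conn, 1995, 2005))  # both bounds
```

Building the WHERE clause from only the bounds that are present avoids comparing a column against Null, which is why the original Between criterion returns nothing when the boxes are empty.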