Can I create a parameter in a Local? - azure-data-factory

I created a Derived Column with an expression
(dummy sample)
iif(columnX=='true',1,0)
This expression will be useful in other Derived Columns, so I'd like to create a Local with this expression, but with a parameter in place of columnX so that I can pass in another column.
Is it possible? How?

I tried creating a data flow parameter (param2) and added your expression to it, using a different parameter (param1) instead of columnX.
However, I was not able to change the parameter value in the expression later; it only took the default value. I also did not find any documentation on assigning a column value to a parameter in a data flow.
The only way I could think of is to use the expression multiple times in different derived columns, with a different column in place of columnX each time.
Derived column1: Added expression to the new column (col1)
Preview of Derived column1: Evaluating expression against Sample1 column.
Derived column2: Reuse the same column name (col1) to evaluate the expression against the Sample2 column. If the previous value is needed, you can assign the previous value of col1 to a new column (previous_col1) in derived column2.
Preview of derived column2
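The reuse the question is after can be pictured outside ADF as one function applied to several columns. The following is only a Python sketch of that idea, not data flow expression syntax; the names flag and add_flags and the sample rows are hypothetical.

```python
# Illustrative sketch only: plain Python, not ADF data flow expression syntax.
# `flag`, `add_flags`, and the sample rows are hypothetical.

def flag(value):
    """Mirror iif(columnX == 'true', 1, 0) for a single value."""
    return 1 if value == 'true' else 0


def add_flags(row, columns):
    """Return a copy of `row` with a <column>_flag entry for each listed column."""
    result = dict(row)
    for name in columns:
        result[name + '_flag'] = flag(row[name])
    return result


rows = [
    {'Sample1': 'true', 'Sample2': 'false'},
    {'Sample1': 'false', 'Sample2': 'true'},
]
flagged = [add_flags(r, ['Sample1', 'Sample2']) for r in rows]
print(flagged)
```

The workaround described above corresponds to writing the body of flag once per derived column, each time against a different source column.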

Related

reduce function not working in derived column in adf mapping data flow

I am trying to create a derived column that dynamically sums the values of multiple columns matching a condition. I am using the reduce function in a derived column in an ADF mapping data flow, but the column is not created even though the transformation is valid.
Columns from source
Derived column logic
Derived column data preview without the new columns as per logic
I can see only the fields from the source, not the derived column fields. If I use only array($$), the fields are created.
Derived column data preview with logic only array($$)
How to get the derived column with the summation of all the fields matching the condition?
We receive 48 weeks of forecast data, and the data has to be prepared on a monthly basis.
eg: Input data
Output data:
JAN
----
506 -- This is for first record i.e. (94 + 105 + 109 + 103 + 95)
The problem is that array($$) in the reduce function has only one element, so the reduce function cannot accumulate the values of the matching columns.
You can solve this by using two derived columns and a data flow parameter as follows:
1. Create derived columns with pattern matching for each month-week as you did before, but put the reference $$ into the value field instead of the reduce(...) function.
This will create derived columns like 0jan, 1jan, etc. containing a copy of the original values. For example, Week 0 (1 Jan - 7 Jan) => 0jan with value 95.
This step gives you a predefined set of column names for each week, which you can use to sum the values by specific column name.
2. Define data flow parameters for each month containing the month-week column names in a string array, like this:
ColNamesJan=['0jan', '1jan', etc.] ColNamesFeb=['0feb', '1feb', etc.] and so on.
You will use these column names in a reduce function in the next step to sum the month-week columns into a monthly column.
3. Create a derived column for each month, which will contain the monthly totals, and use the following reduce function to sum the weekly values:
reduce(array(byNames($ColNamesJan)), 0, #acc + toInteger(toString(#item)), #result)
Replace the parameter name accordingly.
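To make the accumulation easier to follow, here is a rough Python counterpart of the reduce step. It is a sketch rather than data flow syntax: the row is hypothetical, only 0jan = 95 comes from the example above, and the assignment of the other January values (and the 0feb value) to weeks is assumed for illustration.

```python
from functools import reduce

# Hypothetical row after the pattern-matching step. Only 0jan = 95 is taken
# from the text; the mapping of the other values to weeks is assumed.
row = {'0jan': 95, '1jan': 94, '2jan': 105, '3jan': 109, '4jan': 103, '0feb': 88}

# Counterpart of the data flow parameter $ColNamesJan.
col_names_jan = ['0jan', '1jan', '2jan', '3jan', '4jan']


def add_item(acc, item):
    """Counterpart of #acc + toInteger(toString(#item))."""
    return acc + int(str(item))


# Counterpart of byNames($ColNamesJan): look the values up by column name.
jan_values = [row[name] for name in col_names_jan]
jan_total = reduce(add_item, jan_values, 0)

# With a one-element array (the array($$) case) there is nothing to
# accumulate: the result is just that single column's value.
single_total = reduce(add_item, [row['0jan']], 0)

print(jan_total)     # 506
print(single_total)  # 95
```

The second call mirrors the original problem: a reduce over a single matched column returns that column's own value, not a monthly total.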
I was able to summarize the columns dynamically with the above solution.
Please let me know if you need more information (e.g. screenshots) to reproduce the solution.
Update -- Here are the screenshots from my test environment.
Data source (data preview):
Derived columns with pattern matching (settings)
Derived columns with pattern matching (data preview)
Data flow parameter:
Derived column for monthly sum (settings):
Derived column for monthly sum (data preview):

how to replace null values in dynamic table with 'mean' or 'unknown' as per the column data type in azure data factory?

I have data from two data sources, i.e. SQL and PostgreSQL. For every table, I want to replace null values with the MEAN if the column type is integer and with 'Unknown' if the column type is string.
I have tried using a derived column, but I am not sure how to pass in dynamic column values.
I created a pipeline with a 'Lookup' activity and a 'ForEach' activity that calls a data flow.
The migration is from SQL to Postgres, so I need to validate the tables as well as the null values.
You have two cases here: the first is replacing null values in a string column with 'unknown', and the second is replacing null values in an integer column with the mean of the values in the same column.
Main idea:
Add a derived column and replace the null values in the string column with 'unknown'.
In the integer column, replace the null values with zeros; these zeros are then replaced with the mean value, which is calculated using a window transformation.
Here is a quick demo that I built in ADF.
First, I created a dataset with 3 columns (name, height, address); height is an integer and address is a string:
ADF:
Derived Column transformation:
Modify the address and height columns as described above.
Window transformation:
The idea in the window transformation is to replace the zeros with the mean value. I added a new column named 'newHeight' so that the difference is visible, but you can overwrite the original height column instead.
In Window settings -> Window columns, add a new column newHeight with the value:
case(height == 0 ,divide(sum(height),count(height)),toLong(height))
Output:
You can read more about the window transformation here:
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-window
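One thing that may be worth checking with the zero-fill approach: count(height) also counts the rows that were set to 0, so the computed value is lower than the mean of the known heights, and any genuine zeros would be overwritten as well. The Python sketch below (not ADF syntax; the sample rows and the function name fill_nulls are hypothetical) fills nulls with the mean of the non-null values and shows the diluted mean for comparison.

```python
# Illustrative sketch only: plain Python, not ADF syntax.
# The sample rows and the name `fill_nulls` are hypothetical.
rows = [
    {'name': 'a', 'height': 180, 'address': 'x'},
    {'name': 'b', 'height': None, 'address': None},
    {'name': 'c', 'height': 160, 'address': 'y'},
]


def fill_nulls(rows, int_col, str_col):
    """Fill null integers with the mean of the known values, null strings with 'unknown'."""
    known = [r[int_col] for r in rows if r[int_col] is not None]
    mean = sum(known) / len(known) if known else None
    filled = []
    for r in rows:
        out = dict(r)
        if out[int_col] is None:
            out[int_col] = mean
        if out[str_col] is None:
            out[str_col] = 'unknown'
        filled.append(out)
    return filled


filled = fill_nulls(rows, 'height', 'address')

# For comparison: zero-filling first and dividing by the total row count.
zero_filled = [0 if r['height'] is None else r['height'] for r in rows]
diluted_mean = sum(zero_filled) / len(zero_filled)

print(filled[1])     # height 170.0, address 'unknown'
print(diluted_mean)  # about 113.33
```

Whether the diluted value is acceptable depends on the data; if it is not, the divisor would need to count only the rows that originally had a value.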

Is there a way of creating a Serial Number based on other inputs on a MS access form?

I have some samples I need to take.
In order to create a good identifier/serial number for the samples, I want it to be a product of its characteristics.
For example, if the sample was taken from India and the temperature was 40 degrees then I would click dropdowns in the form to create those two entries and then a serial number would be spat out in the form "Ind40".
Assuming that your form is bound to a table, you can create a calculated column in the table that concatenates the values from other columns into a single value.
For instance, create a new column and give it a name (for example, SerialNbr). Then for Data Type select "Calculated". An expression builder window will appear:
Enter the columns you'd like to concatenate and separate them with &. Here is an example of how the expression could look:
Left([Country],3) & [Temperature]
This expression takes the first 3 chars from the Country column and combines it with the value from Temperature column to create the value in column SerialNbr. The calculated column will automatically update when values are entered into the other fields. I'd also suggest adding another value to the calculated expression to help avoid duplicates, such as date/time of submission.
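As a quick check of what the concatenation produces, here is a small Python sketch of the same logic (not Access syntax). The function names, the '-' separator and the timestamp format used for the duplicate-avoidance variant are assumptions made for illustration.

```python
from datetime import datetime


def serial_number(country, temperature):
    """Counterpart of Left([Country],3) & [Temperature]."""
    return country[:3] + str(temperature)


def serial_number_with_stamp(country, temperature, taken_at):
    """Same value with a submission timestamp appended to reduce duplicates.

    The '-' separator and the timestamp format are assumptions.
    """
    return serial_number(country, temperature) + '-' + taken_at.strftime('%Y%m%d%H%M%S')


print(serial_number('India', 40))  # Ind40
print(serial_number_with_stamp('India', 40, datetime(2024, 1, 2, 3, 4, 5)))
```

Two samples from the same country at the same temperature get the same base value, which is why the extra component is suggested.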

Display date from Date Dimension in SSIS Derived Column

I created a derived column to include a fiscal year in an SSIS package. The package includes a DateDimension with a FiscalYear column. The data in the column is displayed as "SFY2018Q1". The column name is displayed as "[[$DATE_DIM].[FQUARTER]]".
The expression I created should return only the year "2018" from the DateDimension. However, it does not resolve (it is shown in red) in the derived column. Below is the expression I created.
LEFT(RIGHT([$Date_DimFQuarter],3),2)
I also attempted the expression by excluding the “$”, and by adding the Table name DateDim. Neither of those modifications work.
Any assistance on what I am doing wrong is greatly appreciated.
Just double-click the Derived Column transformation, then drag and drop your column from the Columns tree; your expression should then be LEFT(RIGHT([YourColumn],3),2).
Try not to type the column name yourself; just drag and drop it.
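Separately from the column reference, it may be worth checking what the expression returns for a value such as SFY2018Q1. The Python sketch below mimics LEFT, RIGHT and SUBSTRING (with a 1-based position) using hypothetical helper functions. Under that assumption the original expression yields 8Q, whereas something like SUBSTRING([YourColumn],4,4) would yield 2018; this has not been tested in SSIS.

```python
# Hypothetical helpers that mimic the string functions for illustration;
# this is plain Python, not the SSIS expression language.

def left(s, n):
    """First n characters."""
    return s[:n]


def right(s, n):
    """Last n characters."""
    return s[-n:] if n > 0 else ''


def substring(s, position, length):
    """`length` characters starting at 1-based `position`."""
    return s[position - 1:position - 1 + length]


value = 'SFY2018Q1'

attempt = left(right(value, 3), 2)  # RIGHT gives '8Q1', LEFT of that gives '8Q'
year = substring(value, 4, 4)       # characters 4..7

print(attempt)  # 8Q
print(year)     # 2018
```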

SSRS - Expression using different dataset fields

I have a report with multiple data-sets. Different fields from different data-sets are used in different locations of the report.
In one part of the report, I need to do a calculation using fields from two different data-sets. Is this possible within an expression?
Can I somehow reference the data-set the field is in, in the expression?
For example, I'd like to do something like this:
=Fields.Dataset1.Field / Fields.Dataset2.Field
You can achieve that by specifying the scope of your fields like this:
=First(Fields!fieldName_A.Value, "Dataset1") / First(Fields!fieldName_B.Value, "Dataset2")
Assuming A is 10 and B is 2 and they are of type numeric then you will have the result of 5 when the report renders.
When you are in the expression builder you can choose the Category: Datasets, your desired dataset highlighted under Item: and then double click the desired field under Value: and it will appear in your expression string with the scope added.
Using same logic you can concatenate two fields like so:
=First(Fields!fieldName_A.Value, "Dataset1") & " " & First(Fields!fieldName_B.Value, "Dataset2")
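The scoped lookups above can be pictured with a short Python sketch (not SSRS expression syntax). The datasets dictionary and the first helper are hypothetical stand-ins for the report datasets and the First function; the second row of Dataset1 is made up to show that only the first row is used.

```python
# Hypothetical stand-ins for two report datasets.
datasets = {
    'Dataset1': [{'fieldName_A': 10}, {'fieldName_A': 30}],
    'Dataset2': [{'fieldName_B': 2}],
}


def first(field, scope):
    """Counterpart of First(Fields!<field>.Value, "<scope>"): value from the first row."""
    return datasets[scope][0][field]


ratio = first('fieldName_A', 'Dataset1') / first('fieldName_B', 'Dataset2')
label = str(first('fieldName_A', 'Dataset1')) + ' ' + str(first('fieldName_B', 'Dataset2'))

print(ratio)  # 5.0
print(label)  # 10 2
```

Only the first row of each dataset takes part in the calculation; the later rows of Dataset1 are ignored.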
As PerPlexSystem writes, assuming you only want to compare the first value from one dataset with values from another dataset, you can use the First function.
However, if you want to compare the values of each row from one dataset with the values of each row from another dataset, then you will need to use a subreport.
Another option is to use a parameter as a variable. This is helpful if you want to create a calculated field in one of the datasets. This is best applied when the parameter value comes from a dataset with a single record.