How to format the negative values in dataflow? - azure-data-factory

I have the column below in my table.
I need the output shown below.
I am using a data flow in Azure Data Factory and cannot produce the output above. I tried a derived column with the replace function, but the result is not correct. Can anyone advise how to format this in a data flow?

The source is added to the data flow with the data shown in the image below.
A Derived Column transformation is added after the source.
A new column is added with the following expression:
iif(left(id,1)=='-', replace(replace(id,"USD",""),"-","-$"), concat("$", replace(id,"USD","")))
Output of Derived Column activity
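To check the logic outside ADF, the expression above can be mirrored in plain Python. This is only a sketch: the function name `format_amount` and the sample inputs such as `-100USD` are assumptions, because the original sample data was posted as an image.

```python
def format_amount(value: str) -> str:
    """Mirror the derived-column expression: drop 'USD' and add a '$' sign."""
    stripped = value.replace("USD", "")
    if value[:1] == "-":
        # Negative value: turn '-' into '-$'. str.replace swaps every '-',
        # so this assumes the sign is the only hyphen in the value.
        return stripped.replace("-", "-$")
    # Non-negative value: prefix with '$'.
    return "$" + stripped


print(format_amount("-100USD"))  # assumed sample input
print(format_amount("250USD"))   # assumed sample input
```

The two branches correspond to the two arguments of `iif`: the first handles values that start with a minus sign, the second everything else.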

Related

How to map Data Flow parameters to Sink SQL Table

I need to store/map one or more data flow parameters to my Sink (Azure SQL Table).
I can fetch other data from a REST API and am able to map it to my Sink columns (see below). I also need to generate some UUIDs as key fields and add these to the same table.
I would like my EmployeeId column to contain my Data Flow input parameter, e.g. one named param_test. In addition, I need to insert UUIDs into other columns that are not part of my REST input fields.
How do I accomplish that?
You need to use a derived column transformation, and there edit the expression to include the parameters.
derived column transformation
expression builder
Adding to Chen Hirsh's answer: use the same derived column to generate UUID values for the extra columns after the REST API source.
The new columns then appear in the sink mapping:
Output:
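As far as I know, a data flow expression references a parameter with a `$` prefix (for example `$param_test`) and generates an identifier with `uuid()`. The Python sketch below only illustrates the resulting row shape; the helper `add_key_columns`, the `RowId` column, and the sample rows are hypothetical.

```python
import uuid


def add_key_columns(rows, param_test):
    """Return copies of the rows with the parameter value and a fresh UUID added."""
    return [
        # EmployeeId carries the parameter; RowId is a hypothetical key column.
        {**row, "EmployeeId": param_test, "RowId": str(uuid.uuid4())}
        for row in rows
    ]


rest_rows = [{"name": "Ann"}, {"name": "Bob"}]  # made-up REST output
enriched = add_key_columns(rest_rows, "E-001")
for row in enriched:
    print(row)
```

Every row receives the same parameter value but its own UUID, which is the behaviour the derived column is meant to provide.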

Azure Data Factory schema mapping not working with SQL sink

I have a simple pipeline that loads data from a csv file to an Azure SQL db.
I have added a data flow where I have ensured all schema matches the SQL table. I have a specific field which contains numbers with leading zeros. The data type in the source - projection is set to string. The field is mapped to the SQL sink showing as string data-type. The field in SQL has nvarchar(50) data-type.
Once the pipeline is run, all the leading zeros are lost and the field appears to be treated as decimal:
Original data: 0012345
Inserted data: 12345.0
The CSV data shown in the data preview is showing correctly, however for some reason it loses its formatting during insert.
Any ideas how I can get it to insert correctly?
I reproduced this in my lab and was able to load the data as expected. Please see the repro details below.
Source file (CSV file):
Sink table (SQL table):
ADF:
Connect the data flow source to the CSV source file. As my file is in text format, all the source columns in the projection are strings.
Source data preview:
Connect sink to Azure SQL database to load the data to the destination table.
Data in Azure SQL database table.
Note: You can also add a derived column before the sink to convert the value to a string, since the sink data type is a string.
Thank you very much for your response.
As your post shows, the ADF data flow is working correctly. I finally found the problem in an earlier transformation: an Azure Batch service runs a Python script that does a basic transformation and saves the output to a CSV file.
Interestingly, when I preview the data in the dataflow, it looks as expected. However, the values stored in SQL are not.
For the sake of others having a similar issue: my existing Python script converted a float column to string. The conversion kept one decimal place, and because all of my numbers are whole numbers, they ended up with a trailing .0.
The solution was to convert values to integer and then to string:
df['col_name'] = df['col_name'].astype('Int64').astype('str')
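The same effect can be seen without pandas. The sketch below uses only built-in conversions; the helper name `float_to_clean_str` is made up for illustration.

```python
def float_to_clean_str(value: float) -> str:
    """Convert a whole-number float to a string without the trailing '.0'."""
    return str(int(value))


print(str(12345.0))                 # 12345.0
print(float_to_clean_str(12345.0))  # 12345
```

Neither conversion restores leading zeros. A value such as 0012345 keeps them only if it is never parsed as a number, for example by reading the column as a string in the first place.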

Need recommendation in adf pipeline source properties while loading delimited text files from azure blob to snowflake

We are trying to load a delimited file from Azure Blob Storage in which a few columns contain blank values. Whenever a blank value occurs in the source CSV file, we would like a value such as NA in the target Snowflake table. We tried setting NA in the Null value option, but it does not work. Any suggestions?
Here is a screenshot of what I described above.
I have used data flow activity in Azure data factory to resolve this issue.
Source file with NULL value in “Name” column.
Next, add a Derived Column transformation. In its settings, select the column name and use the expression iifNull({Name}, 'NA').
In the data preview, the NULL value in the Name column is replaced with NA.
You can follow the above steps to replace Null values and Sink data from blob storage to Snowflake.
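For comparison, here is a minimal Python sketch of the same replacement on CSV text. It is not the ADF implementation: the helper `replace_blank` and the sample data are assumptions, and unlike `iifNull` it deliberately treats empty strings as missing too, because a CSV reader returns blanks as empty strings rather than nulls.

```python
import csv
import io


def replace_blank(value, default="NA"):
    """Return the default for None or empty strings, otherwise the value."""
    return default if value is None or value == "" else value


sample = "Id,Name\n1,Alice\n2,\n"  # made-up file with a blank Name
rows = [
    {key: replace_blank(val) for key, val in row.items()}
    for row in csv.DictReader(io.StringIO(sample))
]
print(rows)
```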

How to Validate Data issue for fixed length file in Azure Data Factory

I am reading a fixed-width file in a mapping Data Flow and loading it into a table. I want to validate the data types and lengths of the fields that I extract in the Derived Column using substring.
How can I achieve this in ADF?
Use a Conditional Split and add a condition for each property of the field that you wish to test for. For data type checking, we literally just landed new isInteger(), isString() ... functions today. The docs are still in the printing press, but you'll find them in the expression builder. For length use length().
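The checks described above (one condition per property, a type test, and a length test) can be sketched in Python as follows. The field layout, the names, and the `validate_line` helper are all hypothetical; in ADF the equivalent conditions would live in the Conditional Split.

```python
from typing import List, NamedTuple


class Field(NamedTuple):
    name: str
    start: int     # 0-based offset into the line
    length: int
    numeric: bool  # True if the field must contain only digits


# Hypothetical layout: a 5-character id followed by a 10-character name.
LAYOUT = [Field("id", 0, 5, True), Field("name", 5, 10, False)]


def validate_line(line: str) -> List[str]:
    """Return a list of validation errors for one fixed-width line."""
    errors = []
    expected = sum(f.length for f in LAYOUT)
    if len(line) != expected:
        errors.append(f"line length {len(line)} != {expected}")
    for f in LAYOUT:
        raw = line[f.start:f.start + f.length]
        if f.numeric and not raw.strip().isdigit():
            errors.append(f"{f.name}: not an integer: {raw!r}")
    return errors


good = "00123" + "John".ljust(10)
bad = "00A23" + "John".ljust(10)
print(validate_line(good))
print(validate_line(bad))
```

Rows with an empty error list would go to the normal sink, and the rest to a reject output.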

Column defined in source Dataset could not be found in the actual source

I have an ADF Copy Data flow and I'm getting the following error at runtime:
My source is defined as follows:
In my data set, the column is defined as shown below:
As you can see from the second image, the column IsLiftStation is defined in the source. Any idea why ADF cannot find the column?
I've had the same error. You can solve this by either selecting all columns (*) in the source and then mapping those you want to the sink schema, or by 'clearing' the mapping in which case the ADF Copy component will auto map to columns in the sink schema (best if columns have the same names in source and sink). Either of these approaches works.
Unfortunately, clicking the import schema button in the mapping tab doesn't work. It does produce the correct column mappings based on the columns in the source query but I still get the original error 'the column could not be located in the actual source' after doing this mapping.
Could you check whether there is a column named 'ae_type_id' in your schema? If so, remove that column and try again. The columns in the schema must match the columns in the query.
The issue is caused by an incomplete schema in one of the data sources. My solution is:
Step through the data flow: select the first source and use Import projection.
Go to the flow and open Data Preview.
Repeat for each step.
In my case, there were trailing commas in one of the CSV files. This caused automated column names to be created in the import allowing me to fix the data file.
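A quick way to spot this kind of problem before the file reaches ADF is to scan it for records whose last field is empty. This is a minimal sketch with made-up sample data and a hypothetical helper name. It also flags rows whose last column is legitimately blank, so treat the result as a hint rather than proof.

```python
import csv
import io


def rows_with_trailing_commas(text: str):
    """Return 1-based record numbers whose last field is empty."""
    flagged = []
    for number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) > 1 and row[-1] == "":
            flagged.append(number)
    return flagged


sample = "id,name,\n1,Ann,\n2,Bob\n"  # first two records end with a comma
print(rows_with_trailing_commas(sample))  # [1, 2]
```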