I need to store/map one or more data flow parameters to my Sink (Azure SQL Table).
I can fetch other data from a REST API and am able to map it to my Sink columns (see below). I also need to generate some UUIDs as key fields and add these to the same table.
I would like my EmployeeId column to contain my Data Flow input parameter, e.g. named param_test. In addition to this I need to insert UUIDs into other columns which are not part of my REST input fields.
How do I accomplish that?
You need to use a derived column transformation and edit its expression to reference the parameter.
derived column transformation
expression builder
Adding to @Chen Hirsh's answer: use the same derived column, placed after the REST API source, to generate UUID values for the extra columns.
The new columns will then show up in the sink mapping:
Output:
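To make the effect of the derived column concrete, here is a minimal Python sketch of the same row-level logic. It is not Data Factory code: in the expression builder the parameter would be referenced as $param_test and a UUID generated with uuid(). The sample rows and the key column names RowKey and BatchKey are made up for illustration.

```python
import uuid

def derive_columns(rows, param_test, uuid_columns):
    """Mimic a derived column step: add the parameter and fresh UUIDs to each row."""
    derived = []
    for row in rows:
        new_row = dict(row)                      # keep the REST fields untouched
        new_row["EmployeeId"] = param_test       # same parameter value on every row
        for column in uuid_columns:
            new_row[column] = str(uuid.uuid4())  # one new UUID per row and column
        derived.append(new_row)
    return derived

# Hypothetical REST rows and key column names, for illustration only.
rest_rows = [{"name": "Ann"}, {"name": "Bob"}]
result = derive_columns(rest_rows, param_test="E-1001",
                        uuid_columns=["RowKey", "BatchKey"])
```

Every output row carries the same EmployeeId, while each UUID column gets a different value per row, which is what you want for key fields.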
A question concerning Azure Data Factory.
I need to persist the iterator value from a lookup activity (an Id column from a sql table) to my sink together with other values.
How to do that?
I thought that I could just reference the iterator value as @{item().id} as the source and a column name from my SQL table sink as the destination. That doesn't seem to work. The resulting value in the destination column is NULL.
I used two Lookup activities, one for the id values and the other for the remaining values. To combine these values and insert them into the sink table, I did the following.
The output of the ids Lookup activity is as follows:
I have one more column to combine with the id values above. The Lookup output for that column is as follows:
I set the Items value of the ForEach activity to the following dynamic content:
@range(0,length(activity('ids').output.value))
Inside the ForEach activity, I used a Script activity with the following query to insert the data into the sink table:
insert into t1 values(@{activity('ids').output.value[item()].id},'@{activity('remaining rows').output.value[item()].gname}')
The data is inserted successfully, as the following reference image shows:
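The index-based pairing used above can be sketched outside Data Factory as follows. This is only an illustration in Python of what the ForEach over @range(...) does; the sample lookup outputs are hypothetical, and the quote escaping is an addition that the original query does not perform.

```python
def build_inserts(ids_output, remaining_output):
    """Pair the two lookup outputs by index and build one INSERT per pair."""
    statements = []
    for i in range(len(ids_output)):   # same role as @range(0, length(...))
        row_id = int(ids_output[i]["id"])
        # Escape single quotes so the value stays inside the SQL string literal.
        gname = str(remaining_output[i]["gname"]).replace("'", "''")
        statements.append(f"insert into t1 values({row_id},'{gname}')")
    return statements

# Hypothetical lookup outputs; the real ones come from the
# 'ids' and 'remaining rows' Lookup activities.
ids = [{"id": 1}, {"id": 2}]
remaining = [{"gname": "alpha"}, {"gname": "O'Neil"}]
statements = build_inserts(ids, remaining)
```

Note that this approach relies on both lookups returning rows in the same order and in the same number.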
My source is a SQL database.
My sink is Blob storage.
The SQL table has these columns:
The target file that I am creating in Blob storage initially has no header, so the customer has given some predefined names, and the data from the SQL columns should be mapped to those fields.
In the copy activity mapping I need to map the columns with the proper data types and the names the customer gave.
By default the source names come through, but I need to map them as described above.
How can I resolve this? Can someone help me?
You can simply edit the sink header names, since it's a TSV anyway.
For data type mapping, see Data type mapping:
Currently such data type conversion is supported when copying between
tabular data. Hierarchical sources/sinks are not supported, which
means there is no system-defined data type conversion between source
and sink interim types.
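As a rough picture of the header renaming suggested above, the following Python sketch writes tab-separated text whose header line uses customer-defined names instead of the source column names. The names CustomerId and CustomerName and the sample rows are hypothetical, and this only illustrates the resulting file layout, not the copy activity itself.

```python
import csv
import io

def write_tsv(rows, column_names):
    """Write rows as tab-separated text with a customer-defined header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(column_names)   # header names chosen by the customer
    writer.writerows(rows)          # data rows in the same column order
    return buffer.getvalue()

# Hypothetical header names and rows, for illustration only.
tsv_text = write_tsv([(1, "Ann"), (2, "Bob")], ["CustomerId", "CustomerName"])
```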
We are using Azure Data Factory Mapping data flow to read from Common Data Model (model.json).
We use dynamic pattern – where Entity is parameterised and we do not project any columns and we have selected Allow schema drift.
Problem: We are having an issue with the source in the mapping data flow (source type is Common Data Model). All the datetime/timestamp columns are read as null in the source transformation.
We also tried Infer drifted column types on the Projection tab, where we provide a format for timestamp columns. However, that satisfies only some of the timestamp columns, since each datetime column in the source has a different timestamp format:
11/20/2020 12:45:01 PM
2020-11-20T03:18:45Z
2018-01-03T07:24:20.0000000+00:00
Question: How to prevent datetime columns becoming null? Ideally, we do not want Mapping Data Flow to typecast any columns - is there a way to just read all columns as string?
Some screenshots
In the Projection tab we do not specify a schema, to allow schema drift and to load more than one entity dynamically.
In Data Preview tab
ModifiedOn, SinkCreatedOn, SinkModifiedOn - all these are system columns and will definitely have values in it.
This is now resolved, following a separate conversation with the Azure Data Factory team.
First, there is no way to 'stringify' all the columns in the source, because the CDM connector gets its metadata from model.json (this file can be manipulated if needed, but that is not ideal for my scenario).
To stop datetime/timestamp columns from becoming null, select Infer drifted column types under the Projection tab and then add the multiple time formats that you expect to come from CDM. You can select formats from the dropdown. If your particular datetime format is not listed there (which was my case), you can edit the code behind the data flow (i.e. the data flow script) to add your format (see second screenshot).
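The idea of supplying several formats can be illustrated with a small Python sketch: each value is tried against a list of known formats, and it only ends up empty when none of them matches. This is not how the CDM connector is implemented; the format strings below are Python strptime patterns chosen to fit the three sample values from the question.

```python
import re
from datetime import datetime

# Formats matching the three sample values from the question.
FORMATS = [
    "%m/%d/%Y %I:%M:%S %p",      # 11/20/2020 12:45:01 PM
    "%Y-%m-%dT%H:%M:%SZ",        # 2020-11-20T03:18:45Z
    "%Y-%m-%dT%H:%M:%S.%f%z",    # 2018-01-03T07:24:20.0000000+00:00
]

def parse_timestamp(text):
    """Try each known format in turn; return None when none matches."""
    # %f accepts at most 6 fractional digits, so trim longer fractions.
    text = re.sub(r"(\.\d{6})\d+", r"\1", text)
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None
```

With a single format, two of the three sample values would fall through to None, which mirrors the nulls seen in the data preview.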
The Copy Data activity in Azure Data Factory appears to be limited to copying to only a single destination table. I have a spreadsheet containing rows that should be expanded out to multiple tables which reference each other - what would be the most appropriate way to achieve that in Data Factory?
Would multiple copy tasks running sequentially be able to perform this task, or does it require calling a custom stored procedure that would perform the inserts? Are there other options in Data Factory for transforming the data as described above?
If the columnMappings for your source and sink datasets do not hit any of the error conditions mentioned in this link:
1. Source data store query result does not have a column name that is specified in the input dataset "structure" section.
2. Sink data store (if with pre-defined schema) does not have a column name that is specified in the output dataset "structure" section.
3. Either fewer columns or more columns in the "structure" of sink dataset than specified in the mapping.
4. Duplicate mapping.
then you can connect the copy activities in series and execute them sequentially.
Another solution is a stored procedure, which can meet your custom requirements. For the configuration, please refer to my previous detailed case: Azure Data Factory mapping 2 columns in one column
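The four error conditions listed above can be checked up front with a small helper. The following Python sketch is a hypothetical pre-check, not something Data Factory provides; the function name and the representation of a mapping as (source, sink) pairs are assumptions made for illustration.

```python
def check_mapping(source_columns, sink_columns, mapping):
    """Return the problems found in a list of (source, sink) column pairs."""
    problems = []
    mapped_sinks = [sink for _, sink in mapping]
    for source, sink in mapping:
        if source not in source_columns:      # condition 1
            problems.append(f"source column missing: {source}")
        if sink not in sink_columns:          # condition 2
            problems.append(f"sink column missing: {sink}")
    if len(sink_columns) != len(set(mapped_sinks)):   # condition 3
        problems.append("sink structure and mapping differ in column count")
    if len(set(mapped_sinks)) != len(mapped_sinks):   # condition 4
        problems.append("duplicate mapping")
    return problems

# Hypothetical column names, for illustration only.
ok = check_mapping(["a", "b"], ["x", "y"], [("a", "x"), ("b", "y")])
bad = check_mapping(["a"], ["x", "y"], [("a", "x"), ("c", "x")])
```

An empty result for each copy activity suggests the mappings are safe to chain sequentially.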
I have received a hierarchical XML file from a client and I need to store it in an HBase database. As I am new to HBase, I do not understand how to approach this. Can you please guide me on how to proceed with storing this hierarchical data in HBase?
Thanks in advance
HBase stores data in a column-oriented format. Each record must have a unique row key. The sub columns (column qualifiers) can be created on the fly, but the main columns (column families) cannot.
For example, consider this XML.
<X1>
  <X2 name = "uniqueid">1</X2>
  <X3>
    <X4>value1</X4>
    <X5>value2</X5>
    <X6>
      <X7>value3</X7>
      <X8>value4</X8>
    </X6>
  </X3>
  <X7>value5</X7>
</X1>
In this case, the main column families would be X3 and X7. The row id can be taken from X2.
You can construct an HBase entry equivalent to this using the Java API, like this:
Put p = new Put("/*put the unique row id */ ".getBytes() );
p.add("X3".getBytes(), "X4".getBytes(), value1.getBytes());
where the first argument is the column family and the second one is the column qualifier (sub column).
You can also use the two-argument form, with the family and the nested qualifier joined by colons:
p.add("X3:X6:X7".getBytes(),value3);
Then call table.put(p). That's it!
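To see how the whole XML record maps onto this layout, here is a minimal Python sketch that flattens the sample document into a row key plus (family, qualifier, value) cells. It does not talk to HBase; it only follows the convention described above, with top-level elements as column families and nested element names joined by colons as qualifiers. The function names and the choice of an empty qualifier for a family that holds a value directly are assumptions.

```python
import xml.etree.ElementTree as ET

XML = """<X1>
  <X2 name="uniqueid">1</X2>
  <X3>
    <X4>value1</X4>
    <X5>value2</X5>
    <X6>
      <X7>value3</X7>
      <X8>value4</X8>
    </X6>
  </X3>
  <X7>value5</X7>
</X1>"""

def leaves(element, path=()):
    """Yield (path, text) for every leaf element at or below the given element."""
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from leaves(child, path + (child.tag,))

def to_cells(xml_text, id_tag="X2"):
    """Return (row_key, [(family, qualifier, value), ...]) for one record."""
    root = ET.fromstring(xml_text)
    row_key = root.findtext(id_tag).strip()
    cells = []
    for child in root:
        if child.tag == id_tag:
            continue   # the id becomes the row key, not a cell
        for path, value in leaves(child):
            cells.append((child.tag, ":".join(path), value))
    return row_key, cells

row_key, cells = to_cells(XML)
```

Each resulting cell corresponds to one add call on the Put for that row key.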