Substring of column name in Copy Activity in ADF v2 - azure-data-factory

Is there a way in the V2 Copy Activity to operate upon one of the input columns (of type string) with an expression? Before I load rows to the destination, I need to limit the number of characters in the column.
My hope was to simply switch from something like this:
"ColumnMappings": "inColumn: outColumn"
to something like this:
"ColumnMappings": "#substring(inColumn, 1, 300): outColumn"
If anyone can point me to where I can read-up on where & when string expressions can be used, I could use the guidance.

This is the official documentation on expressions and functions: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions
And this is the documentation on mappings: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping
Also remember that if you are using a defined query in the copy activity, you can use SQL functions like CAST([fieldName] AS varchar(300)) to limit the number of characters in a particular field.
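For example (just a sketch; the table and column names are placeholders), the source query of the copy activity could look like this:
SELECT
    CAST(inColumn AS varchar(300)) AS inColumn,  -- silently truncates values longer than 300 characters
    otherColumn
FROM dbo.SourceTable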
Hope this helped!

When you don't have a SQL source but your destination is a SQL sink, you can use a stored procedure to insert your data into the final table. That way, you can define these kinds of transformations in the stored procedure. I don't think Data Factory can handle these kinds of transformations itself; it is more intended as an orchestrator.
Have a look here:
https://learn.microsoft.com/en-us/azure/data-factory/connector-sql-server#invoke-stored-procedure-from-sql-sink
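As a rough sketch of that pattern (the type, procedure, table and column names below are only assumptions), the stored procedure receives the copied rows through a table type parameter and can truncate the column on insert:
-- Hypothetical table type matching the source columns
CREATE TYPE dbo.SourceRowType AS TABLE
(
    inColumn nvarchar(max),
    otherColumn int
);
GO
-- Hypothetical sink stored procedure referenced by the copy activity
CREATE PROCEDURE dbo.spInsertTruncatedRows
    @sourceRows dbo.SourceRowType READONLY
AS
BEGIN
    INSERT INTO dbo.FinalTable (outColumn, otherColumn)
    SELECT LEFT(inColumn, 300), otherColumn  -- limit to 300 characters before loading
    FROM @sourceRows;
END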

Related

How can I alias labels (using a query) in Grafana?

I'm using Grafana v9.3.2.2 on Azure Grafana
I have a line chart with labels of an ID. I also have a SQL table in which the IDs are mapped to simple strings. I want to alias the IDs in the labels to the strings from the SQL table.
I am trying to find a transformation to do the conversion.
There is a transformation called “rename by regex”, but that would require me to hardcode each case. Is there something similar with which I don't have to hardcode each case?
There is something similar for variables - https://grafana.com/blog/2019/07/17/ask-us-anything-how-to-alias-dashboard-variables-in-grafana-in-sql/. But I don't see anything for transformations.
Use 2 queries in the panel - one for the data with IDs and a second one for mapping IDs to strings. Then add an Outer join transformation and use the ID field to join the query results into one result.
You may also need an Organize fields transformation to rename or hide unwanted fields, so that only the right fields end up being used in the label.
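As a minimal sketch (the table and column names are just assumptions), the second query would only return the mapping, and the Outer join transformation would then join on the id field:
-- Query B: ID-to-name mapping used only for the join
SELECT id, display_name
FROM id_mapping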

Creating spectrum table in matillion for csv file with comma inside quotes

I have a scenario for creating a Spectrum table in Redshift using Matillion.
My CSV file data is like this:
column1,column2,column3
abc,"qwety,pqr",xyz
but in the Spectrum table I am seeing the data as
column1 column2 column3
abc qwerty pqr
Matillion is not treating the quoted value as a single field.
Can you please suggest how to achieve this using Matillion's EXTERNAL TABLE component?
Basically you would like to specify a quote parameter for your CSV data.
Redshift has 2 ways of specifying external tables (see Redshift Docs for reference):
using the default built-in SerDes and properties like ROW FORMAT DELIMITED, FIELDS TERMINATED BY
explicitly specifying a SerDe with ROW FORMAT SERDE, WITH SERDEPROPERTIES
I don't think it's possible to specify a quote parameter using the built-in SerDes.
It is possible to specify them using org.apache.hadoop.hive.serde2.OpenCSVSerde (look here for details on its properties), but beware that there are known problems with it, such as the one described in this SO question.
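For reference, a sketch of what that could look like in plain Redshift SQL outside of Matillion (the schema, table and S3 location are placeholders):
CREATE EXTERNAL TABLE spectrum_schema.my_table (
    column1 varchar(100),
    column2 varchar(100),
    column3 varchar(100)
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
    'separatorChar' = ',',
    'quoteChar' = '"'     -- keeps "qwety,pqr" as a single value
)
STORED AS TEXTFILE
LOCATION 's3://my-bucket/my-prefix/';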
Now for Matillion:
I have never used Matillion, but looking at their Redshift External Table documentation page, it looks like it's only possible to specify the FORMAT and the FIELD TERMINATOR, but not to specify a SerDe and its properties, hence it's not possible to specify the quote parameters for the external table - unless there are some undocumented means to specify a custom SerDe.
Personal note:
We have experienced many problems with ingesting data stored as CSV, and we basically try to avoid it. There's no standard for CSV, each tool implements its own version of support for it, and it's very difficult to convince all your tools to see data the same way.

Updating the table through tOracleOutput in Talend using an additional SQL query

I have a job where I am getting a flow into tOracleOutput where I am updating the table. Now, I have to update that table using an SQL statement, which I guess is what the Advanced settings of tOracleOutput are for, but I don't know how to use them, or you could say I don't really understand the settings. I referred to the official documentation but could not understand it. Can anyone explain the fields like Name, SQL expression, Position, and Reference Column in a better way?
The SQL query which I am using is:
update set COL1=SOMETHING1
where COL2=SOMETHING2
Now, value for COL1 is coming from the flow but COL2 is some column in the table which is not coming from the flow.
Have a look at tOracleRow for such a case.
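For illustration (the table name is hypothetical and SOMETHING2 is treated as a constant here), the statement you would run from tOracleRow is simply the full update:
UPDATE MY_TABLE              -- hypothetical table name
SET COL1 = 'SOMETHING1'      -- value coming from the flow
WHERE COL2 = 'SOMETHING2';   -- column already in the table, not coming from the flow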
Hope this helps.
TRF
Using tOracleOutput is helpful when you have a ready data source (a table or file with the same columns as the destination). The more elaborate your query is, the more you should do as TRF said (and use tOracleRow), but here's an example for your question:
the file contains 3 columns,
the destination DB table contains 4 columns, where the 4th is the date of update (the first 3 are identical to the input),
so you add the destination column's name in Name, put the SQL function for the date (e.g. SYSDATE) in SQL expression, and specify where to put it (Position) relative to a column of your choice (Reference Column).
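In effect (a sketch with hypothetical table and column names), the generated insert then looks something like this:
INSERT INTO DEST_TABLE (COL_A, COL_B, COL_C, LAST_UPDATE)
VALUES (?, ?, ?, SYSDATE);   -- the first three values come from the flow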
In my view this helps avoid using tMap just for one additional column when you want to Insert; but you want to Update, in which case the component doesn't offer the additional columns section, and I don't think you can add the WHERE clause here either.
Hope it helps

ormlite select count(*) as typeCount group by type

I want to do something like this in OrmLite
SELECT *, COUNT(title) as titleCount from table1 group by title;
Is there any way to do this via QueryBuilder without the need for queryRaw?
The documentation states that the use of COUNT() and the like necessitates the use of selectRaw(). I hoped for a way around this - not having to write my SQL as strings is the main reason I chose to use ORMLite.
http://ormlite.com/docs/query-builder
selectRaw(String... columns): Add raw columns or aggregate functions (COUNT, MAX, ...) to the query. This will turn the query into something only suitable for using as a raw query. This can be called multiple times to add more columns to select. See section Issuing Raw Queries.
Further information on the use of selectRaw() as I was attempting much the same thing:
The documentation states that if you use selectRaw() it will "turn the query into" one that is supposed to be executed with queryRaw().
What it does not explain is that, while multiple calls to selectColumns() or selectRaw() are normally valid (if you exclusively use one or the other),
using selectRaw() after selectColumns() has a 'hidden' side effect of wiping out any selectColumns() you called previously.
I believe that the ORMLite documentation for selectRaw() would be improved by a note that its use is not intended to be mixed with selectColumns().
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectColumns("emailAddress"); // This column is not selected due to later use of selectRaw()!
qb.selectRaw("COUNT (emailAddress)");
ORMLite examples are not as plentiful as I'd like, so here is a complete example of something that works:
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectRaw("emailAddress"); // This can also be done with a single call to selectRaw()
qb.selectRaw("COUNT (emailAddress)");
qb.groupBy("emailAddress");
GenericRawResults<String[]> rawResults = qb.queryRaw(); // Returns results with two columns
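For reference, the raw query built above corresponds roughly to this SQL (assuming the EmailMessage entity maps to a table named emailmessage):
SELECT emailAddress, COUNT(emailAddress)
FROM emailmessage
GROUP BY emailAddress;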
Is there any way to do this via QueryBuilder without the need for queryRaw(...)?
The short answer is no because ORMLite wouldn't know what to do with the extra count value. If you had a Table1 entity with a DAO definition, what field would the COUNT(title) go into? Raw queries give you the power to select various fields but then you need to process the results.
With the code right now (v5.1), you can define a custom RawRowMapper and then use the dao.getRawRowMapper() method to process the results for Table1 and tack on the titleCount field by hand.
I've got an idea how to accomplish this in a better way in ORMLite. I'll look into it.

Using table names as parameters in t-sql (eg from #tblname)

Is it possible to use the name of a table as a parameter in t-sql?
I want to insert data into a table, but I want one method in C# which has a parameter for the table.
Is this a good approach? I think if I have one form and I am choosing the table and fields to insert data into, I am essentially looking at writing my own dynamic SQL query built on the fly. This is another thing altogether, which I am sure has its catches.
Thanks
Not directly. The only way to do this is through dynamic SQL - either EXEC or sp_ExecuteSQL. The latter has the advantage of query cache/re-use, and avoiding injection via parameters for the values - but you will have to concatenate the table-name itself into the query (you can't parameterise it), so be sure to white-list it against a list of known-good table names.
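A minimal sketch of that approach (table and column names are placeholders; in practice the values would come from your C# parameters):
DECLARE @table sysname = N'Orders';               -- value supplied by the C# method
DECLARE @value nvarchar(100) = N'example';        -- value supplied by the C# method

IF @table NOT IN (N'Customers', N'Orders')        -- white-list of known-good table names
    THROW 50000, N'Unknown table name.', 1;

DECLARE @sql nvarchar(max) =
    N'INSERT INTO ' + QUOTENAME(@table) + N' (SomeColumn) VALUES (@val);';

EXEC sp_executesql @sql, N'@val nvarchar(100)', @val = @value;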