Using data (not METAdata) from a SQL Server DB to drive Biml in SSIS dtsx creation - biml

I need to generate the tasks in the data flow of an SSIS dtsx based on data stored in a SQL Server database table.
Basically, I have N (unknown, variable) sources and M (unknown, variable) destinations, and a table T0 with M×N rows. Each row uses a bit column B to specify whether I really need to send data from that specific source to that specific destination.
T0 is stored in a simple database, DB0, that is used for configuration.
Ideally, using Biml I want to script a dtsx that has a simple data flow (OLE DB Source -> OLE DB Destination) for each row of T0 with B = 1.
The dtsx generation must be driven by the data I find in T0.
Online I can find plenty of material on reading METADATA from a DB, but I can't find anyone talking about the much more basic task of reading data from a DB and using that data in the Biml.
Do I need to handle the DB connection in the C# code of the BimlScript? How do I use/address the connection to DB0 in the C# code? (Usually I would keep this connection in a config file; in this case I need to refer to the connection manager from the C# script.)
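For reference, a minimal BimlScript sketch of this pattern (the connection strings, the SrcConn/DstConn names, and the T0 columns SourceTable/DestTable are placeholder assumptions): ExternalDataAccess.GetDataTable runs a query against DB0 at Biml expansion time, and the C# loop then emits one data flow per row of T0 with B = 1. The DB0 connection is handled directly in the C# nugget with an ordinary connection string; it does not need to be one of the package's connection managers.

    <#@ import namespace="System.Data" #>
    <#@ import namespace="Varigence.Biml.CoreLowerer.SchemaManagement" #>
    <Biml xmlns="http://schemas.varigence.com/biml.xsd">
      <Connections>
        <!-- placeholder connection strings; point these at your real source/destination -->
        <OleDbConnection Name="SrcConn" ConnectionString="Provider=SQLNCLI11;Data Source=.;Initial Catalog=SourceDB;Integrated Security=SSPI;" />
        <OleDbConnection Name="DstConn" ConnectionString="Provider=SQLNCLI11;Data Source=.;Initial Catalog=DestDB;Integrated Security=SSPI;" />
      </Connections>
      <Packages>
        <Package Name="GeneratedLoads" ConstraintMode="Parallel">
          <Tasks>
            <#
            // DB0 is queried here, while the Biml is being expanded,
            // using a plain connection string rather than a connection manager
            string db0 = "Provider=SQLNCLI11;Data Source=.;Initial Catalog=DB0;Integrated Security=SSPI;";
            DataTable t0 = ExternalDataAccess.GetDataTable(db0,
                "SELECT SourceTable, DestTable FROM dbo.T0 WHERE B = 1");
            foreach (DataRow row in t0.Rows) { #>
            <Dataflow Name="Load <#=row["SourceTable"]#> to <#=row["DestTable"]#>">
              <Transformations>
                <OleDbSource Name="Source" ConnectionName="SrcConn">
                  <DirectInput>SELECT * FROM <#=row["SourceTable"]#></DirectInput>
                </OleDbSource>
                <OleDbDestination Name="Destination" ConnectionName="DstConn">
                  <ExternalTableOutput Table="<#=row["DestTable"]#>" />
                </OleDbDestination>
              </Transformations>
            </Dataflow>
            <# } #>
          </Tasks>
        </Package>
      </Packages>
    </Biml>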

Related

Need to join Oracle and SQL Server tables in an OLE DB source without using a linked server

My SSIS package has an OLE DB source which joins Oracle and SQL Server tables to get the source data and loads it into a SQL Server OLE DB destination. Earlier we were using a linked server for this purpose, but we cannot use a linked server anymore.
So I am taking the data from SQL Server and want to feed it into the IN clause of the Oracle query that I am keeping as the SQL command of the OLE DB source.
I tried passing an Object-type variable from SQL Server into the IN clause of the Oracle query in the OLE DB source, but I get an error because Oracle cannot have more than 1,000 literals in an IN list. So basically I think I have to do something like this:
select * from oracle.db where id in (select id from sqlserver.db)
Since I cannot use a linked server, I was thinking I could have a temp table which can be used throughout the package.
I also tried using a Merge Join in SSIS, but my source data set is really large and the merge join returns fewer rows than expected. I am badly stuck at this point; I have tried a number of things and nothing seems to be working.
Can someone please help? Any help will be greatly appreciated.
A couple of options to try.
Lookup:
My first instinct was a Lookup transformation, but that might not be a great solution depending on the size of your data sets, since all of the records in the lookup table have to be pulled over the wire and cached in memory on the SSIS server. But if you were able to pull off a Merge Join, then a Lookup should also work, though it might be slow.
- Set an OLE DB Source to pull the Oracle data, without the WHERE clause.
- Set a Lookup to pull the id column from your SQL Server table.
- On the General tab of the Lookup, under "Specify how to handle rows with no matching entries", select "Redirect rows to no match output".
- The match output of the Lookup will then contain just the Oracle rows that found a matching row in your SQL Server query.
Working Table on the Oracle server
If you have the option of creating a table in the Oracle database, you could create a Data Flow Task to pipe the results of your SQL Server query into a working table on the Oracle box. Then, in a subsequent Data Flow, just construct your Oracle query to use that working table as a filter, as sketched below.
You would probably follow that up with an Execute SQL Task to truncate the working table.
Although this requires write access to Oracle, it has the advantage of off-loading the heavy lifting of the query to the database machine, and only pulling the rows you care about over the wire.
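For example, the Oracle-side query in the second Data Flow might take this shape (all schema, table, and column names here are placeholders):

    -- OLE DB Source query: let Oracle do the filtering against the working table
    SELECT t.*
    FROM   some_schema.big_table t
    WHERE  t.id IN (SELECT w.id FROM some_schema.ssis_work_ids w);

    -- follow-up Execute SQL Task: clear the working table for the next run
    TRUNCATE TABLE some_schema.ssis_work_ids;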

how to update tabular data from source tables

I have a simple test setup:
- a SQL Server (2017) instance with one database containing one table
- a SQL Server Analysis Services instance (2017, compatibility level 1400)
I have created a simple tabular model in Visual Studio with one data source (the database with one table) and one table.
This is my Power Query:
let
    Source = #"SQL/MYCOMPUTER\SQLDEV;SampleDatabase",
    dbo_testTable = Source{[Schema="dbo",Item="testTable"]}[Data]
in
    dbo_testTable
I have deployed this tabular model to my SSAS instance...
Now my question: if the table in my SQL Server is updated (records added), how can I get these updates reflected in the tabular model? Do I have to reprocess the tabular model somehow?
I have tried "Process Table" in SSMS on the tabular model table, but it does not pick up the new records...
Processing a table processes whichever dimension or fact table you selected, and it only reads data from the database objects used by that table. What processing is actually performed depends on the processing type you chose.
As for the question in the answer you posted: Process Full on an entire tabular model removes all data from the deployed model, then reloads everything and processes the hierarchies and measures as well. So yes, after you process with this option, the new data from the underlying tables will be in the model for all tables within it.
There are multiple processing types that can be applied at the database, table, or partition level. You can view additional details on these via the Microsoft reference.
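If you would rather script this than click through SSMS, a minimal sketch using the Tabular Object Model (the Microsoft.AnalysisServices.Tabular library; the instance and database names below are placeholders) that requests the same Process Full:

    using Microsoft.AnalysisServices.Tabular;

    class RefreshTabularModel
    {
        static void Main()
        {
            var server = new Server();
            server.Connect(@"Data Source=MYCOMPUTER\SSASTABULAR"); // placeholder SSAS instance

            // placeholder name of the deployed tabular database
            Database db = server.Databases.GetByName("SampleTabularModel");

            // queue a full refresh of every table in the model,
            // then send the request to the server
            db.Model.RequestRefresh(RefreshType.Full);
            db.Model.SaveChanges();

            server.Disconnect();
        }
    }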
I have found that at the level of the database in the SSAS instance there is a "Process Database" option with a "Process Full" setting, which does update all the underlying tables.
But maybe there is a better way to do this?

Talend Open Studio Big Data - Iterate and load multiple files in DB

I am new to Talend and need guidance on the scenario below:
We have a set of 10 JSON files with different structures/schemas that need to be loaded into 10 different tables in a Redshift DB.
Is there a way we can write a generic script/job which can iterate through each file and load it into the database?
For example:
File Name: abc_<date>.json
Table Name: t_abc
File Name: xyz<date>.json
Table Name: t_xyz
and so on...
Thanks in advance
With the Talend Enterprise version you can benefit from dynamic schemas. However, in my experience JSON files are usually somewhat nested structures, so you would have to figure out how to flatten them first; once that is done it becomes a 1:1 load. With Open Studio this will not work, though, because of the missing dynamic schema feature.
Basically, what you could do is write some Java code that transforms your JSON into CSV, then use either psql from the command line or, if your Talend ships with a new enough PostgreSQL JDBC driver, invoke the client-side \COPY from it to load the data. If the file and the database table column order match, it should work without needing to specify how many columns you have, so it is dynamic, but the data never "flows" through Talend.
A really-not-cool but theoretically possible solution: if Redshift supports JSON (Postgres does), you could create a staging table with 2 columns, filename and content. Once the whole content is in this staging table, an INSERT-SELECT SQL statement could transform the JSON into a tabular format that can be inserted into the final table, as sketched below.
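A sketch of what that INSERT-SELECT could look like on Redshift, using its json_extract_path_text function (the staging table stg_raw_json and the target columns are assumptions for illustration):

    -- stg_raw_json(filename, content) is the hypothetical 2-column staging table
    INSERT INTO t_abc (id, name)
    SELECT json_extract_path_text(content, 'id'),
           json_extract_path_text(content, 'name')
    FROM   stg_raw_json
    WHERE  filename LIKE 'abc_%';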
However, with your toolset you probably have no other choice than to load these files with one job per file. And I'd suggest one dedicated job for each file: they would each look for their own files and be triggered/scheduled individually, or be part of a bigger job that scans the folders and triggers the right job for the right file.

Import from one DB to another DB

I need to import data from the inner-joined tables of one DB to a table of another DB using the Talend ETL tool. How can I do that?
I am just new to Talend.
How can I inner join the tables using a condition in Talend?
Based on your requirement there would be multiple ways to achieve this.
One approach:
Use tMSSqlInput (for SQL Server; this would change based on your source database) and fill in the required attributes to make the connection. In the "Query" section, write your complete query involving the different tables, as sketched below.
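For illustration, the query in the tMSSqlInput "Query" field could look like this (the table and column names are made up; substitute your own tables and join condition):

    SELECT o.order_id,
           o.order_date,
           c.customer_name
    FROM   dbo.orders o
    INNER JOIN dbo.customers c
           ON c.customer_id = o.customer_id;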
Once done, use a tMap (to transform your data as per the destination table) if required, and then a tMSSqlOutput (for SQL Server; this would change based on your destination database) to write the data to the table that resides in the other database. In the connection properties, make sure you configure the database correctly.
For tMSSqlOutput, also check the properties Use Batch/Batch Size and Commit every.
(Sample job flow screenshot not included.)
Now, another approach could be to use the bulk feature: tMSSqlOutputBulk writes the data retrieved from your source database to a file, and tMSSqlBulkExec then bulk-loads the data from that file into the destination table in your destination database.
(Sample flow screenshot not included.)
Note: always evaluate which solution is the best fit by comparing the performance of all the available options.

how to load data (Excel file) into SQL Server using Talend

I am very new to using Talend. Could somebody tell me how to load an Excel sheet into SQL Server 2012?
I tried searching on Google but found no help.
I can easily do it using SSIS tasks, but I need to do it with Talend. I took an Excel input and tried making a DB connection, but I don't know the connection string, and I don't understand what to put as Port, Database, Schema, and Additional parameters.
To solve your problem you need 3 components:
- your Excel input
- a tMap
- a DB output (tMSSqlOutput for SQL Server)
Join the Excel input to the tMap, and then the tMap to the DB output.
Port, Database, Schema, and Additional params refer to your database information; see the example below.
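For example, against a default local SQL Server instance the connection fields might look like this (every value below is a placeholder for your own environment):

    Host:                  localhost
    Port:                  1433            (SQL Server default)
    Database:              MyDatabase
    Schema:                dbo
    Additional parameters: (usually left empty)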
What is Talend Open Studio?
Talend Open Studio for Data Integration is an open-source graphical development environment for creating and deploying custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files, and connect individual components in order to define complex integration processes.
And to answer the question:
Put the raw file (Excel, CSV, XML, etc.) into a tFileInputDelimited, then connect it to a tMSSqlOutput. (The screenshot showing the connection is not included here.)
Then, to configure SQL Server, go to Metadata > Db Connections, right-click and choose Create connection, and set the DB type to Microsoft SQL Server. It works with SQL Server authentication, so you need to create a user, say user1, with some password. Then fill in the login, password, server name, and port (1433 by default for SQL Server); the other fields can be left empty.
The job then executes successfully and the file is loaded.
This is something to be understood when trying Talend for the first time :D