Just wanted to know which will be faster for extracting data from multiple tables (same database). Is it better to create a view with all the tables joined and use it as the input source, or to read each table separately and join them with Talend components?
Regards,
Yin
Related
I have a core layer with some tables, and I want to find out which source-layer tables each of them is built from. The core-layer tables are created by joining some of the source-layer tables. I want to generate an Excel sheet with code that shows which source tables each core table is made from.
I am using PySpark on Databricks, and the code that creates the tables is written in notebooks.
Any help on how to approach this will be beneficial.
This is possible if you use Databricks Unity Catalog. It has a feature called Data Lineage that tracks which tables and columns were used to create a specific table, and who its consumers are. It also includes a Lineage API that can be used to export the lineage data.
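Once lineage has been exported, turning it into the requested sheet is mostly a grouping step. The following is a minimal sketch, assuming the lineage is already available as (source table, target table) pairs. On Databricks these pairs might come from the Lineage API or from the lineage system table (I believe `system.access.table_lineage` exposes `source_table_full_name` and `target_table_full_name`, but verify this in your workspace). The table names and the helpers `group_lineage` and `lineage_to_csv` are hypothetical, and the output is CSV, which Excel can open.

```python
import csv
import io
from collections import defaultdict


def group_lineage(edges):
    """Map each target table to the sorted list of source tables that feed it."""
    sources_by_target = defaultdict(set)
    for source, target in edges:
        # Skip incomplete rows and self-references.
        if source and target and source != target:
            sources_by_target[target].add(source)
    return {t: sorted(s) for t, s in sorted(sources_by_target.items())}


def lineage_to_csv(mapping, out):
    """Write one row per core table, listing its source tables."""
    writer = csv.writer(out)
    writer.writerow(["core_table", "source_tables"])
    for target, sources in mapping.items():
        writer.writerow([target, ", ".join(sources)])


# Hypothetical (source, target) pairs standing in for exported lineage rows.
edges = [
    ("source.orders", "core.sales"),
    ("source.customers", "core.sales"),
    ("source.customers", "core.customer_dim"),
    ("source.orders", "core.sales"),  # duplicates are collapsed
]

mapping = group_lineage(edges)
buffer = io.StringIO()
lineage_to_csv(mapping, buffer)
print(buffer.getvalue())
```

To produce a file instead of a string, pass an open file handle (opened with `newline=""`) in place of `buffer`.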
I created an entity-relationship diagram (logical model) and engineered it to a relational model.
The tables were generated. Now I need to use them from the connection XE as you see in the picture.
The tables I made can only be seen in the Data Modeler design view in the "Browser". How do I get them onto the connection "XE" so that I can generate a data dictionary, etc.?
There are three possibilities:
1. You just need to expand the Tables item in the tree to see your tables.
2. You are looking in the wrong schema/user. Go down to Other Users and find the schema those tables belong to.
3. The tables do not exist in the current database.
If #3 is the issue, you would need to create them, possibly using the information in your Data Model - that is, you can generate the DDL/SQL scripts for those tables.
Then taking those scripts, run them while connected to the appropriate database/schema.
Disclaimer: I'm an Oracle employee and the product manager for these tools.
I need to import data from inner-joined tables in one database into a table in another database using the Talend ETL tool. How can I do that?
I am new to Talend.
How can I inner join the tables with a condition in Talend?
Based on your requirement there would be multiple ways to achieve this.
One approach -
Use tMSSqlInput (for SQL Server; this would change based on your source database) and fill in the attributes required to make the connection. In the "Query" section, write your complete query involving the three different tables.
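The original query is not shown, so the following is only an illustrative sketch of the kind of three-table inner join with a condition that could go in the "Query" field. The tables `customers`, `orders`, and `products`, their columns, and the filter are all hypothetical, and SQLite is used here only so the query can be run without a SQL Server instance.

```python
import sqlite3

# In-memory database with three hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, product_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO products VALUES (10, 'Lamp'), (11, 'Desk');
INSERT INTO orders VALUES (100, 1, 10), (101, 1, 11), (102, 3, 10);
""")

# The join itself: this SELECT is the part that would be pasted into the input component.
JOIN_QUERY = """
SELECT o.order_id, c.name, p.title
FROM orders AS o
INNER JOIN customers AS c ON c.customer_id = o.customer_id
INNER JOIN products AS p ON p.product_id = o.product_id
WHERE p.title <> 'Desk'
ORDER BY o.order_id
"""

rows = conn.execute(JOIN_QUERY).fetchall()
print(rows)
```

Order 102 is dropped because its customer has no match (inner join), and order 101 is dropped by the `WHERE` condition.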
Once done, use tMap (to transform your data to match the destination table) if required, and then tMSSqlOutput (for SQL Server; this would change based on your destination database) to write the data to your table in the other database. In the connection properties, make sure you configure the database correctly.
For tMSSqlOutput, do check the properties Use Batch/Batch Size and Commit every.
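To show what batch size and commit interval control in general, here is a minimal sketch outside Talend. It is not how tMSSqlOutput is implemented; the `load_in_batches` helper, the `target` table, and the chosen sizes are assumptions for illustration. Rows are sent in groups of `batch_size`, and the transaction is committed once at least `commit_every` rows are pending.

```python
import sqlite3


def load_in_batches(conn, rows, batch_size=2, commit_every=4):
    """Insert rows in batches and commit after every `commit_every` rows.

    Returns the number of commits performed.
    """
    pending = 0
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO target (id, name) VALUES (?, ?)", batch)
        pending += len(batch)
        if pending >= commit_every:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit whatever is left after the last full interval.
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
rows = [(i, f"row-{i}") for i in range(1, 11)]

commits = load_in_batches(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(commits, count)
```

Larger batches and less frequent commits usually reduce round trips and transaction overhead, at the cost of more work being rolled back if a load fails partway through.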
Sample job flow: tMSSqlInput -> tMap -> tMSSqlOutput
Another approach is to use the bulk feature. You can use tMSSqlOutputBulk to write the data retrieved from your source database to a file, and then use tMSSqlBulkExec to bulk load the data from that file into the destination table in your destination database.
Sample flow: tMSSqlInput -> tMSSqlOutputBulk, then tMSSqlBulkExec
Note: Always compare the performance of all the available solutions to decide which one is the best fit.
I have around 25 tables to load to the target. They all have the same structure and use the same loading logic. I have built one job that does this, but designing a job for every table is a long process.
Is there any way to pass the table name to a single small job and have it load the target?
I am using Talend Open Studio.
Check my answer to a similar question where I proposed a generic solution for loading a MySQL table to another MySQL table.
You just need to modify the queries that retrieve the tables' metadata (columns) depending on your database type.
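The general idea of one loader driven by a table name can be sketched as follows. This is not the linked answer's Talend job. It is a minimal illustration, assuming SQLite on both sides, identical table structures in source and target, and a hypothetical `copy_table` helper. The metadata query (`PRAGMA table_info` here) is the part that changes with the database type.

```python
import sqlite3


def copy_table(src, dst, table):
    """Copy all rows of `table` from src to dst, discovering columns at run time."""
    # Table names cannot be bound as parameters, so check the name against the catalog first.
    known = {r[0] for r in src.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")

    # Read the column list from metadata instead of hard-coding it per table.
    columns = [r[1] for r in src.execute(f'PRAGMA table_info("{table}")')]
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)

    rows = src.execute(f'SELECT {col_list} FROM "{table}"').fetchall()
    dst.executemany(f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})', rows)
    dst.commit()
    return len(rows)


# Two separate in-memory databases standing in for source and target.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for name in ("customers", "suppliers"):
    ddl = f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, label TEXT)"
    src.execute(ddl)
    dst.execute(ddl)
    src.executemany(f"INSERT INTO {name} VALUES (?, ?)", [(1, f"{name}-a"), (2, f"{name}-b")])
src.commit()

# The same function handles every table; only the name changes.
copied = {name: copy_table(src, dst, name) for name in ("customers", "suppliers")}
print(copied)
```

With 25 tables, the list of names could come from a file or a catalog query, so adding a table means adding a name rather than designing another job.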
I'm trying to integrate multiple databases using Talend, and I want each destination table to have an SOR_id for auditing purposes. Is it possible to map multiple source tables simultaneously to a destination table whose SOR_id is auto-incremented? Would the rows from each source table get incremental values?
I have approached this another way, as shown in the image, so that my SOR_id is accounted for.