Dynamic loading not working in talend - talend

Not able to load multiple tables, getting error:
Exception in component tMysqlInput_1 (MYSQL_DynamicLoading)
java.sql.SQLException: Bad format for Timestamp 'GUINESS' in column 3
One table works fine. Basically after first iteration the second table trying to use the schema
of the first table. Please help, how to edit the component to make it
correct. Trying to load actor & country table from sakila DB mysql to
a another DB on the same server. Above image is for successful one table
dynamic loading.

you should not use tMysqlInput if output schemas differ. For this case there is no way around tJavaRow and custom code. I however cannot guess what happens in tMap, so you should provide some more details about what you want to achieve.

If all you need is to load data from one table to another without any transformations, you can do one of the following:
If your tables reside in 2 different databases on the same server, you can use a tMysqlRow and execute a query "INSERT INTO catalog.table SELECT * from catalog2.table2..". You can do some simple transformations in SQL if needed.
If your tables live in different servers, check the generic solution I suggested for a similar question here. It may need some tweaking depending on your use case, but the general idea is to replicate the functionality of INSERT INTO SELECT when the tables are not on the same server.

Related

Referring to an Access Database table from a TSQL Query

I am rebuilding a system so instead of using multiple Access and Excel files the business I am working for, uses SSRS for reporting requirements. Most things are coming along great but I have one sticking point.
One of the Access databases has a table within itself rather than replying on the server data, which is there to keep the grade level of staff up to date (as it is a very complex method on how staff go up a grade).
Now I could easily build a new table in the SQL Server, but I do not want management relying on me for updating this particular table. I could also rebuild the Access database to upload the data to the server which is probably what I will do, but what I wanted to ask first is there a way to join to the table in the Access Database from the T-SQL query, as if it was another table in the main database?
Yes. Attach the database file under Server Objects as a Linked Server.
To ease referencing the table in this, create a view in your database that "hides" the needed weird triple-dot syntax, like:
SELECT FIELD1, FIELD2, FIELDN
FROM THELINKEDSERVERNAME...YourTable AS LinkedYourTable
Then use this view to read the table.

Loadind multiple table from source into multiple table into target in Talend

I have around 25 tables to load to target with same structure and which use the same logic for loading. I have prepared one job which does that, but it's a long process to design all the tables.
Is there any way to pass the table name and load to target, basically a small job (in size).
I am using Talend open studio.
Check my answer to a similar question where I proposed a generic solution for loading a MySQL table to another MySQL table.
You just need to modify the queries that retrieve the tables' metadata (columns) depending on your database type.

SSIS or TSQL for SQL/MySQL table comparrison

I am new to SSIS and am after some assistance in creating an SSIS package to do a specific task. My data is stored remotely within a MySQL Database and this is downloaded to a SQL Server 2014 Database. What I want to do is the following, create a package where I can enter 2 dates that can be compared against the create date/date modified per record on a number of tables to give me a snap shot and compare the MySQL Data to the SQL Data so that I can see if there are any rows that are missing from my local SQL Database or if any need to be updated. Some tables have no dates so I just want to see a record count on what is missing if anything between the 2. If this is better achieved through TSQL I am happy to hear about other suggestions or sites to look at where things have been done similar.
In relation to your query Tab :
"Hi Tab, What happens at the moment is our master data is stored in a MySQL Database, the data was then downloaded to a SQL Server Database as a one off. What happens at the moment is I have a SSIS package that uses the MAX ID which can be found on most of the tables to work out which records are new and just downloads them or updates them. What I want to do is run separate checks on the tables to make sure that during the download nothing has been missed and everything is within sync. In an ideal world I would like to pass in to a SSIS package or tsql stored procedure a date range, shall we say calender week, this would then check for any differences between the remote MySQL database tables and the local SQL tables. It does not currently have to do anything but identify issues, correcting them may come later or changes would need to be made to the existing sync package. Hope his makes more sense."
Thanks P
To do this, you need to implement a Type 1 Slowly Changing Dimension type data flow in SSIS. There are a number of ways to do this, including a built in transformation aptly called the Slowly Changing Dimension transformation. Whilst this is easy to set up, it is a pain to maintain and it runs horrendously slowly.
There are numerous ways to set this up using other transformations or even SQL merge statements which are detailed here: https://bennyaustin.wordpress.com/2010/05/29/alternatives-to-ssis-scd-wizard-component/
I would recommend that you use Lookup transformations as they perform better than the Slowly Changing Dimension transformation but offer better diagnostics and error handling than the better performing SQL merge statement.
Before you do this you will need to add a Checksum or Hashbytes column to your SQL data for ease of comparison with the incoming MySQL data.
In short, calculate some sort of repeatable checksum as the data is downloaded into your SQL Server, then use this in an SSIS Lookup, matching on the row key, to check for changes. Where the checksum value is different for the same row it needs updating and where there is no matching row key in your SQL Data you need to insert the new row.

Is it possible to query data from tables by object_id?

I was wondering whether it is possible to query tables by specifying their object_id instead of table names in SELECT statements.
The reason for this is that some tables are created dynamically, and their structure (and names) are not known before, and yet I would like to be able to write sprocs that are capable of querying these tables and working on their content.
I know I can create dynamic statements and execute it, but maybe there are some better ways, and I would be grateful if someone could share how to approach it.
Thanks.
You have to query sys.columns and build a dynamic query based on that.
There are no better ways: SQL isn't designed for adhoc or unknown sturctures.
I've never worked on an application in 20 years where I don't know what my data looks like. Either your data is persisted or it should be in XML or JSON or such if it's transient-

Q's on pgsql

I'm a newbie to pgsql. I have few questionss on it:
1) I know it is possible to access columns by <schema>.<table_name>, but when I try to access columns like <db_name>.<schema>.<table_name> it throwing error like
Cross-database references are not implemented
How do I implement it?
2) We have 10+ tables and 6 of have 2000+ rows. Is it fine to maintain all of them in one database? Or should I create dbs to maintain them?
3) From above questions tables which have over 2000+ rows, for a particular process I need a few rows of data. I have created views to get those rows.
For example: a table contains details of employees, they divide into 3 types; manager, architect, and engineer. Very obvious thing this table not getting each every process... process use to read data from it...
I think there are two ways to get data SELECT * FROM emp WHERE type='manager', or I can create views for manager, architect n engineer and get data SELECT * FROM view_manager
Can you suggest any better way to do this?
4) Do views also require storage space, like tables do?
Thanx in advance.
Cross Database exists in PostGreSQL for years now. You must prefix the name of the database by the database name (and, of course, have the right to query on it). You'll come with something like this:
SELECT alias_1.col1, alias_2.col3 FROM table_1 as alias_1, database_b.table_2 as alias_2 WHERE ...
If your database is on another instance, then you'll need to use the dblink contrib.
This question doe not make sens. Please refine.
Generally, views are use to simplify the writing of other queries that reuse them. In your case, as you describe it, maybe that stored proceudre would better fits you needs.
No, expect the view definition.
1: A workaround is to open a connection to the other database, and (if using psql(1)) set that as your current connection. However, this will work only if you don't try to join tables in both databases.
1) That means it's not a feature Postgres supports. I do not know any way to create a query that runs on more than one database.
2) That's fine for one database. Single databases can contains billions of rows.
3) Don't bother creating views, the queries are simple enough anyway.
4) Views don't require space in the database except their query definition.