Filling table from several input files - amazon-redshift

I have the following scenario: several CSV files contain different columns of the same table. Can I somehow fill the Redshift table from them, ideally with the help of Data Pipeline? I couldn't find a way to achieve this. Can anyone help with a solution, or maybe a simple example, if it's possible?

You can do it by converting your CSV files into JSON format prior to loading them. If a particular JSON key is not found in a file, COPY will simply skip it instead of failing.
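A minimal sketch of that conversion, assuming each CSV has a header row whose names match the (lowercase) Redshift column names; all file, bucket, table, and role names here are placeholders:

```python
import csv
import json

def csv_to_json_lines(csv_path, json_path):
    """Rewrite one CSV as newline-delimited JSON so COPY ... JSON 'auto' can load it."""
    with open(csv_path, newline="") as src, open(json_path, "w") as dst:
        for row in csv.DictReader(src):        # keys come from the CSV header row
            dst.write(json.dumps(row) + "\n")  # one JSON object per line

csv_to_json_lines("columns_subset_1.csv", "columns_subset_1.json")

# After uploading the JSON files to S3, each one could be loaded with something like:
#   COPY my_table
#   FROM 's3://my-bucket/columns_subset_1.json'
#   IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
#   FORMAT AS JSON 'auto';
# Columns whose keys are absent from a given file should end up as NULL (or the
# column default) rather than causing a load error.
```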

Related

Load COPY (Cobol) file in Talend tool

I would like to load a file in Talend which is supposed to have compressed data inside. I don't know how to do that; I don't know how to load either a COPY file or a COPY file with compressed data. Can someone help me, please?
These are sample files (one of them is the schema): https://www.dropbox.com/sh/bqvcw0dk56hqhh2/AABbs1GRKjo7rycQrcUM_dgta?dl=0
P.S.: I know how to load CSV, Excel, and data from SQL databases, among others. However, I don't know how to handle this kind of file.
Thanks in advance.

Is it possible to copy just highlighted numbers from Tableau?

I have a Tableau workbook that connects to a database and then has several sheets that reorganize the data into different tables and graphs that I need.
If I make a sheet that has 2 rows and 1 field for example, I can't highlight the numbers and just copy them without also copying the row names for each item.
Is there a way I can copy just the numbers, nothing else?
It does not appear to be possible :(
As can be seen from the following Tableau threads:
Copy data from Text tables to clipboard
Copy single cell from view data
various incarnations of your request have already been put to the development team but have yet to make it into Tableau. I also couldn't find anything in the user documentation that describes a workaround.
There's a way to do this using Python, and probably AutoHotkey, if that's of interest; both options are hackish.
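For what it's worth, a minimal sketch of the Python route, assuming you've done a normal copy in Tableau (which puts tab-separated text with the row labels on the clipboard) and that the third-party pyperclip package is installed:

```python
import pyperclip  # third-party: pip install pyperclip

def keep_numbers_only(text):
    """Drop row names/headers from copied Tableau text, keeping only numeric cells."""
    numbers = []
    for line in text.splitlines():
        for cell in line.split("\t"):
            cell = cell.replace(",", "").strip()  # tolerate thousands separators
            try:
                float(cell)
                numbers.append(cell)
            except ValueError:
                pass  # not a number: a row name, header, etc.
    return "\n".join(numbers)

# Copy in Tableau first, run this, then paste to get just the numbers.
pyperclip.copy(keep_numbers_only(pyperclip.paste()))
```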

Writing log files to a database using Talend

This is the actual Job. I have a job that produces log entries as output. I am doing this using tLogRow, but I want to write them into a database. Is that possible with Talend? I have already used tMysqlRow and tMysqlOutput, but those are throwing some errors. I have searched on Google, but there isn't any clear answer for this.
This is the actual output when using tLogRow
You can save the error information directly from tLogCatcher to a database table. Please make sure that the table name and column names match the tMysqlOutput schema. If you don't want to save the complete information, you can use a tXMLMap to filter down to the columns required.

Dynamically create table from csv

I am faced with a situation where we get a lot of CSV files from different clients, but there is always some mismatch with the column count and column length that our target table is expecting.
What is the best way to handle frequently changing CSV files? My goal is to load these CSV files into a Postgres database.
I checked the \COPY command in Postgres, but it does not have an option to create a table.
You could try creating a pg_dump-compatible file which has the appropriate "create table" section and use that to load your data instead.
I recommend using an external ETL tool like CloverETL, Talend Studio, or Pentaho Kettle for data loading when you're having to massage different kinds of data.
\copy is really intended for importing well-formed data in a known structure.
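If you do want to stay with plain Postgres, here is a minimal sketch of creating the table on the fly from the CSV header, assuming psycopg2 is installed and that loading every column as TEXT first (and tightening types later) is acceptable; the table, file, and connection details are placeholders:

```python
import csv
import psycopg2  # third-party Postgres driver

def load_csv(conn, csv_path, table_name):
    """Create a TEXT-columned table from the CSV header, then bulk-load the rows."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))  # column names come from the first row
    columns = ", ".join('"{}" TEXT'.format(name.strip()) for name in header)
    with conn.cursor() as cur, open(csv_path, newline="") as f:
        cur.execute('CREATE TABLE "{}" ({})'.format(table_name, columns))
        cur.copy_expert(
            'COPY "{}" FROM STDIN WITH (FORMAT csv, HEADER true)'.format(table_name),
            f,
        )
    conn.commit()

conn = psycopg2.connect("dbname=target user=loader")  # connection details assumed
load_csv(conn, "client_export.csv", "client_export_raw")
```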

Write to different Sheets in a single excel file from multiple tables

Can someone please give me a technical design overview of how I should implement this scenario:
I am using Spring Batch to import data from CSV files into different tables. Once they are imported, I run some validations on these tables, and now I need to write the data from 3 different tables into three different sheets of a single Excel file. Can someone please explain how I should use ItemReaders and ItemWriters to solve this problem?
If I were asked, I would implement it as follows: create the xls file from your code, or in a first step acting as a method invoker, which would create the file and pass it in the job parameters.
Step 1 would do a chunk read from table 1, and in the ItemWriter I would use a custom ItemWriter that uses POI to write to the first sheet.
Step 2 would do a chunk read from table 2, and its ItemWriter would write to the second sheet, and so on.
Since you have a single file, you can never get the performance advantages of Spring Batch such as multithreading, partitioning, etc. It's better to write to different files with independent tasks instead.