I have two tables in PostgreSQL and I want to join them with a WHERE condition. After joining them, I want to export the result to a CSV file using the copy function. Is it possible to join and generate the CSV file using the COPY function, or is there another method?
Yes, it is possible and very easy.
Suppose we have two tables, merchant_position and merchant_timeline. The columns mp_sc_id, mp_merchant_id, mp_rank, mp_tier, and mp_updated_at all come from merchant_position, while mt_name comes from merchant_timeline; the tables are joined on the foreign-key pair mt_id and mp_merchant_id.
\copy (select mp_sc_id, mp_merchant_id, mp_rank, mp_tier, mp_updated_at, mt_name from merchant_position INNER JOIN merchant_timeline ON mt_id = mp_merchant_id) TO '/Users/Desktop/merchant_rank.csv' DELIMITER ',' CSV HEADER
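Outside psql, the same join-then-export idea can be sketched in plain Python. This is a minimal sketch using sqlite3 as a stand-in for PostgreSQL, with hypothetical sample data; the column and table names follow the example above:

```python
import csv
import sqlite3

# In-memory stand-in for the two PostgreSQL tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE merchant_position (mp_merchant_id INTEGER, mp_rank INTEGER);
    CREATE TABLE merchant_timeline (mt_id INTEGER, mt_name TEXT);
    INSERT INTO merchant_position VALUES (1, 10), (2, 20);
    INSERT INTO merchant_timeline VALUES (1, 'Alice'), (2, 'Bob');
""")

# Run the join, then write the result set to CSV with a header row.
rows = conn.execute("""
    SELECT mp_merchant_id, mp_rank, mt_name
    FROM merchant_position
    INNER JOIN merchant_timeline ON mt_id = mp_merchant_id
""")
with open("merchant_rank.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([d[0] for d in rows.description])  # header row
    writer.writerows(rows)
```

The point is the same as with \copy: the database does the join, and the client just streams the joined rows into a CSV file.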
I'm trying to execute the query below to export the results to CSV. I'm able to export the data to CSV, but the headers are missing in the file. Is there any way we can achieve this? I'm executing the file in the form "db2 -tvmf D:\Db.sql"
connect to ****** user ***** using ******
export to "D:\Vikas.csv" OF DEL MESSAGES
select
'ROW_NUM',
'DETAIL_TYPE_CD',
'ADMIN_FEES_TICKET',
'ADMINISTRATIVE_FEES',
'BASE_RENT',
'CITATIONS',
'COLLECTION_REPO_FEES',
'DESC',
'EFFECTIVE_DATE',
'LATE_CHARGE',
'MISC_FEE',
'STATUS_CD',
'ROW_ID',
'ROW_ID',
'BUILD',
'REVERSE_FLG',
'NSF_FLG',
'PR_CON_ID',
'PROC_DATE',
'PROPERTY_TAX',
'REGISTRATION_FEES',
'REPAIR_FEES',
'SALES_TAX',
'TERMINATION_FEES',
'TOTAL_TRANS',
'TRANSACTION_TYPE'
from sysibm.sysdummy1
UNION ALL (select
T1.ROW_NUM,
T5.DETAIL_TYPE_CD,
T1.ADMIN_FEES_TICKET,
T1.ADMINISTRATIVE_FEES,
T1.BASE_RENT,
T1.CITATIONS,
T1.COLLECTION_REPO_FEES,
T1.DESC,
T1.EFFECTIVE_DATE,
T1.LATE_CHARGE,
T1.MISC_FEE,
T2.STATUS_CD,
T4.ROW_ID,
T3.ROW_ID,
T2.BUILD,
T1.REVERSE_FLG,
T1.NSF_FLG,
T2.PR_CON_ID,
T1.PROC_DATE,
T1.PROPERTY_TAX,
T1.REGISTRATION_FEES,
T1.REPAIR_FEES,
T1.SALES_TAX,
T1.TERMINATION_FEES,
T1.TOTAL_TRANS,
T1.TRANSACTION_TYPE
FROM
SIEBEL.LSE_INPHIST_VIEW T1
LEFT OUTER JOIN SIEBEL.S_ASSET T2 ON T1.ACCOUNT_NUM = T2.ASSET_NUM
LEFT OUTER JOIN SIEBEL.S_ASSET_CON T3 ON T2.ROW_ID = T3.ASSET_ID AND
T3.RELATION_TYPE_CD = 'Obligor'
LEFT OUTER JOIN SIEBEL.S_ASSETCON_ADDR T4 ON T3.ROW_ID =
T4.ASSET_CON_ID AND T4.USE_TYPE_CD = 'Bill To'
LEFT OUTER JOIN SIEBEL.S_PROD_INT T5 ON T2.PROD_ID = T5.ROW_ID
WHERE
(T1.ACNT_ID = '01003501435'))
ORDER BY
T1.ACNT_ID DESC,T1.PROC_DATE DESC WITH UR
I have included the updated query now in the post.
The Db2-LUW export command lacks the ability to add columns headers to the output file. It only exports whatever is in the SELECT statement.
So when you want column headers in the CSV file, you have a few options.
One way (when there is no ORDER BY) is to turn the SELECT statement into a UNION of two queries: the first query returns a single row containing the list of column names, and you union that with your real query. You must hand-craft the column names of the first query to match the real second query. In your case, for example, it might look like:
SELECT 'row_num', 'detail_type_cd', ....
from sysibm.sysdummy1
UNION
SELECT t1.ROW_NUM, T5.DETAIL_TYPE_CD, ...
(You have to write out the column names manually, put them in single quotes, and so on. If you want Db2 to work out the column names for you, you can generate the list with a query against the system catalog, e.g. SYSCAT.COLUMNS.)
If you have an ORDER BY, you can run two separate export commands (i.e. no union) writing to two separate output files, and then use operating-system commands to concatenate the output files, like this:
export to headers.csv select 'colname1','colname2'...from sysibm.sysdummy1;
export to data.csv select ...
-- for MS-windows
!copy /a headers.csv + data.csv data_with_headers.csv ;
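The copy /a step above is Windows-only; the same concatenation can be sketched portably in Python. The file names below are the ones from the export example, and the sample contents are hypothetical stand-ins for the two export outputs:

```python
# Stand-in header and data files (in real use these come from the two exports).
with open("headers.csv", "w") as f:
    f.write("COLNAME1,COLNAME2\n")
with open("data.csv", "w") as f:
    f.write("1,foo\n2,bar\n")

# Concatenate the header file first, then the data file, into one CSV.
with open("data_with_headers.csv", "w") as out:
    for part in ("headers.csv", "data.csv"):
        with open(part) as f:
            out.write(f.read())
```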
Another (possibly simpler) way, with v11.5 and higher versions of Db2-LUW, is to skip the export command and instead create an external table, which lets you specify the option includeheader on, among many other options for CSV files. You can search this site for examples, and reference the documentation.
The goal of my code is to try to drop a column each time it shows up. I know there is a way to drop columns without using a for loop. The reason that method does not work is that the columns are dynamic. The problem is that the .drop command is not dropping the column indicated. So here is some pseudocode.
for column_name in column_name_list:
    # create data_frame1 with the column name
    # join data_frame1 with the other data_frame2
    # here I drop column_name from data_frame1
    data_frame = data_frame.drop(column_name)
The problem is after the drop, the column_name is re-appearing during the second iteration. My guess is that I am dropping the column on a copy and it's not "saving" the data_frame with the dropped column. Thank you for all the help.
You may have two columns with the same name after the join. If the join column has the same name in both DataFrames, you can join on the column name directly, which avoids the duplicate:
dataframe1.join(dataframe2, col_name)  # no need for dataframe1.col_name == dataframe2.col_name
If you already join that way and the code above still is not working (I have used this code and it worked), you can use:
data_frame.select(*(set(data_frame.columns) - set(column_name_list)))
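One caveat with the set-difference form: Python sets do not preserve order, so the selected columns may come back in an arbitrary order. A list comprehension keeps the DataFrame's original column order. Here is a minimal sketch of just the column-selection logic in plain Python (hypothetical column names; no Spark needed to see the idea):

```python
# Columns of the joined DataFrame (hypothetical names).
columns = ["id", "name", "dup_col", "value", "other_dup"]
# Columns we want to drop, as in the question's column_name_list.
column_name_list = ["dup_col", "other_dup"]

# Set difference contains the right columns but loses their ordering:
unordered = set(columns) - set(column_name_list)

# A list comprehension keeps the original column order; this is the
# list you would pass to data_frame.select(...):
keep = [c for c in columns if c not in column_name_list]
print(keep)  # ['id', 'name', 'value']
```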
I'm currently writing a script which will let me input a file (generally .sql) and generate a list of every table used in that file. The process is simple: it opens the input file, checks each line for a substring, and if that substring exists, outputs the line to the screen.
The substrings being checked are T-SQL keywords that indicate a referenced table, such as INTO, FROM, and JOIN. Not being a T-SQL wizard, those three keywords are the only ones I know of that are used to reference a table in a query.
So my question is: in T-SQL, are INTO, FROM, and JOIN the only ways to reference a table, or are there others?
There are many ways to reference a table; here are some of them:
DELETE
FROM
INTO
JOIN
MERGE
OBJECT_ID (N'dbo.mytable', N'U'), where 'U' is the object type for a user table.
TABLE, e.g. ALTER TABLE, TRUNCATE TABLE, DROP TABLE
UPDATE
However, your script will match not only real tables but possibly also views and common table expressions. Here are two examples:
-- Example 1
SELECT *
FROM dbo.myview
-- Example 2
WITH tmptable AS
(
SELECT *
FROM mytable
)
SELECT *
FROM tmptable
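A minimal sketch of such a scanner in Python, with a hypothetical keyword list based on the answer above (DELETE is omitted because DELETE FROM is already caught by FROM). As noted, it will also match views and CTE names, and a naive regex like this can be fooled by comments and string literals:

```python
import re

# Keywords that typically precede a table name in T-SQL (assumed list).
KEYWORDS = ("FROM", "JOIN", "INTO", "UPDATE", "MERGE")

# Capture the identifier that follows one of the keywords.
pattern = re.compile(
    r"\b(?:" + "|".join(KEYWORDS) + r")\s+([\w.\[\]#]+)",
    re.IGNORECASE,
)

def tables_in_sql(sql_text):
    """Return the distinct identifiers found after table keywords."""
    return sorted({m.group(1) for m in pattern.finditer(sql_text)})

sample = """
SELECT * FROM dbo.Orders o
JOIN dbo.Customers c ON c.Id = o.CustomerId
"""
print(tables_in_sql(sample))  # ['dbo.Customers', 'dbo.Orders']
```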
I have a table that has a PKey, a FKey, a LineNum, and a TextLine.
In my table, I have multiple results from the FKey. It's a 1 to many relationship.
What I want to do is have the TextLines that match the FKey be concatenated into a single row. (The reason for this is that we're converting from an old COBOL database to T-SQL, and transferring the information to a new database with a different structure, where these "Comments" will all be handled by a single field)
My end query will look something like this:
SELECT FKey, Line1 + Line2 + ...
FROM Table1
The issue is that there is a non-consistent number of lines. In addition, I'm trying to avoid any dynamic queries, because I want un-trained/basic users to be able to modify and customize this query. Is there any way to do this?
You could do something like this to get all the data in a single row:
select
t.FKey,
STUFF((SELECT ',' + TextLine
from Table1 where FKey = t.FKey
order by LineNum
FOR XML PATH('')), 1, 1, '') as ConcatTextLines
from
Table1 t
group by t.FKey
There will be some size limitations on the ConcatTextLines column, so this may not be applicable if you have thousands of lines for some foreign keys.
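Since this is part of a one-off migration, it can help to validate the result outside SQL Server. The grouping-and-joining that the STUFF/FOR XML PATH trick performs can be sketched in plain Python (hypothetical data matching the question's PKey/FKey/LineNum/TextLine layout):

```python
from itertools import groupby
from operator import itemgetter

# (PKey, FKey, LineNum, TextLine) rows, hypothetical sample data.
rows = [
    (1, 100, 1, "first line"),
    (2, 100, 2, "second line"),
    (3, 200, 1, "only line"),
]

# Sort by FKey then LineNum so lines concatenate in their original order.
rows.sort(key=itemgetter(1, 2))
concatenated = {
    fkey: ",".join(r[3] for r in group)
    for fkey, group in groupby(rows, key=itemgetter(1))
}
print(concatenated)  # {100: 'first line,second line', 200: 'only line'}
```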
Is there a way to make a Redshift Copy while at the same time generating the row_number() within the destination table?
I am basically looking for the equivalent of the below except that the group of rows does not come from a select but from a copy command for a file on S3
insert into aTable
(id,x,y,z)
select
#{startIncrement}+row_number() over(order by x) as id,
x, y, z
from anotherTable;
Thanks!
I understand from your question that you need to insert an additional column, id, into the table, and that id is not present in the CSV file. If my understanding is right, please follow the approach below.
Copy the data from the CSV file into a temp table, say "aTableTemp", which has the same schema minus the "id" column. Then insert the data from "aTableTemp" into "aTable" as follows:
Insert into aTable
Select #{startIncrement}+row_number() over(order by x) as id, * from aTableTemp
I hope this solves your problem.
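The staging step above can be sketched in plain Python, where enumerate plays the role of row_number() with a starting offset (hypothetical data and startIncrement value):

```python
# Rows as loaded from S3 into the temp table (hypothetical x, y, z values).
a_table_temp = [("x1", "y1", "z1"), ("x2", "y2", "z2"), ("x3", "y3", "z3")]
start_increment = 1000  # plays the role of #{startIncrement}

# Sort by x (the ORDER BY of the window function), then assign sequential ids.
a_table = [
    (start_increment + n, *row)
    for n, row in enumerate(sorted(a_table_temp), start=1)
]
print(a_table[0])  # (1001, 'x1', 'y1', 'z1')
```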
Maybe copy into a table with an identity column and just don't copy into that field?