Hello Stack Overflowers!
I'm currently exporting a Postgres table as a .csv using a C# application I developed. I can export it without a problem with the following commands...
set PGPASSWORD=password
psql -U USERNAME Database_Name
\copy (SELECT * FROM table1) TO C:\xyz\exportfile.csv CSV DELIMITER ',' HEADER;
The problem I am running into is that the .csv is meant to be used with Tableau; however, I hit the same issue when importing it into Excel: text fields are turned into integers in both Tableau and Excel. This causes problems specifically when joining on serial numbers on the Tableau side.
I know I can change these fields manually in Tableau/Excel, but I am trying to find a way to make sure the end user never needs to. I'd like them to just drop in the updated .csv PostgreSQL extracts and open Tableau with no extra steps, since they don't seem very tech-savvy. I know you can connect Tableau directly to Postgres, but in this particular case I am not allowed to due to limitations beyond my control.
I'm using PostgreSQL 12 and Tableau v2019.4.0
EDIT: As requested, here is example data! Both of the fields are TEXT inside PostgreSQL, but the export doesn't carry that type information.
Excel Formatting
ASSETNUM,ITEMNUM
1834,8.11234E+12
1835,8.11234E+12
Notepad Formatting
ASSETNUM,ITEMNUM
1834,8112345673294
1835,8112345673295
Note: If you select the specific cell in Excel it shows the full number.
CSV files don't have any type information, so programs like Excel/Tableau are free to interpret the data how they like.
However, @JorgeCampos's link provides useful information. For example
"=""123""","=""123"""
gets interpreted differently than
123,123
when you load it into Excel.
If you want to add quotes to your data, the easiest way is to use PostgreSQL's string functions, e.g.
SELECT '"=""' || my_column || '"""' FROM my_table
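Applied to the original question's export, a minimal sketch (assuming the ASSETNUM and ITEMNUM columns from the example data) would build the ="..." wrapper in the query and let the CSV writer double the embedded quotes:

\copy (SELECT '="' || assetnum || '"' AS assetnum, '="' || itemnum || '"' AS itemnum FROM table1) TO 'C:\xyz\exportfile.csv' CSV DELIMITER ',' HEADER

Each ITEMNUM then lands in the file as "=""8112345673294""", which Excel reads as the text 8112345673294 rather than converting it to a number.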
Related
I have a PostgreSQL database. I had to extend an existing, big table with a few more columns.
Now I need to fill those columns. I thought I could create a .csv file (out of Excel/Calc) which contains the IDs / primary keys of the existing rows and the data for the new, empty fields. Is it possible to do so? If so, how?
I remember doing exactly this pretty easily using Microsoft SQL Server Management Studio, but for PostgreSQL I am using pgAdmin (though I am of course willing to switch tools if it would be helpful). I tried pgAdmin's import function, which uses PostgreSQL's COPY, but COPY doesn't seem suitable as it can only create whole new rows.
Edit: I guess I could write a script which loads the csv and iterates over the rows, using UPDATE. But I don't want to reinvent the wheel.
Edit2: I've found this question here on SO which provides an answer by using a temp table. I guess I will use it - although it's more of a workaround than an actual solution.
PostgreSQL can import data directly from CSV files with COPY statements; however, as you stated, this only works for new rows.
Instead of creating a CSV file you could just generate the necessary SQL UPDATE statements.
Suppose this would be the CSV file
PK;ExtraCol1;ExtraCol2
1;"foo";42
4;"bar";21
Then just produce the following
UPDATE my_table SET ExtraCol1 = 'foo', ExtraCol2 = 42 WHERE PK = 1;
UPDATE my_table SET ExtraCol1 = 'bar', ExtraCol2 = 21 WHERE PK = 4;
You seem to be working on Windows, so I don't really know how to accomplish this there (probably with PowerShell), but on Unix you could easily generate the SQL from a CSV with tools like awk or sed. An editor with regular-expression support would probably suffice too.
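If you would rather stay inside PostgreSQL, the temp-table workaround mentioned in the question's second edit could look roughly like this (table, column, and file names are hypothetical):

CREATE TEMP TABLE staging (pk integer, extracol1 text, extracol2 integer);
-- from psql: \copy staging FROM 'new_values.csv' WITH (FORMAT csv, HEADER, DELIMITER ';')
UPDATE my_table AS t
SET extracol1 = s.extracol1,
    extracol2 = s.extracol2
FROM staging AS s
WHERE t.pk = s.pk;

COPY only fills the staging table; the UPDATE ... FROM join then writes the new values onto the existing rows.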
I'm writing scripts to export some tables to flat files every day. I'm looking at the BCP utility, but I'm not sure it has the kind of features I really need.
For example, I need to output the fields out of order. That is, the 15th field in the MSSQL database should be the 2nd field in the flat file, etc.
More importantly, some of the fields need to be altered. For example, if a certain field is null or contains some special values, I need to replace them with codes.
Is BCP the right tool for this? My gut tells me to do this in Perl instead.
You can write a stored procedure and do all data transformations there.
Then feed this stored procedure to bcp.
It will surely be faster than Perl.
SSIS is fast too; it could be an option in case the transformations are very complex.
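As a rough sketch of the stored procedure approach (procedure, table, and column names are hypothetical), the procedure does the reordering and substitutions and bcp simply executes it via queryout:

CREATE PROCEDURE dbo.ExportDaily
AS
BEGIN
    SET NOCOUNT ON;  -- keep row-count messages out of the bcp output
    SELECT Field1,
           Field15,                                  -- e.g. the 15th source field emitted as the 2nd column
           ISNULL(Field2, 'CODE_MISSING') AS Field2  -- replace NULLs with a code
    FROM dbo.SourceTable;
END

bcp "EXEC MyDb.dbo.ExportDaily" queryout daily_extract.dat -T -c -t,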
You can use a query to order and format the columns directly with BCP
bcp Utility
"query"
Is a Transact-SQL query that returns a result set.
example:
bcp "SELECT Name FROM AdventureWorks.Sales.Currency" queryout Currency.Name.dat -T -c
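The same idea can cover the reordering and NULL/special-value handling from the question in an inline query; a hedged variant (database, table, column, and code names are hypothetical):

bcp "SELECT t.Field1, t.Field15, CASE WHEN t.Field2 IS NULL OR t.Field2 = 'SPECIAL' THEN 'CODE_X' ELSE t.Field2 END FROM MyDb.dbo.SourceTable t" queryout daily_extract.dat -T -c -t,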
I have a pretty basic database. I need to drop a good-sized list of users into the db. I have the dump file and need to convert it to a .pg file, then somehow load this data into it.
The data I need to add are in CSV format.
I assume you already have a .pg file, which I assume is a database dump in the "custom" format.
PostgreSQL can load data in CSV format using the COPY statement. So the absolute simplest thing to do is just add your data to the database this way.
If you really must edit your dump, and the file is in the "custom" format, there is unfortunately no way to edit the file manually. However, you can use pg_restore to create a plain SQL backup from the custom format and edit that instead. pg_restore with no -d argument will generate an SQL script for insertion.
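A minimal sketch of the COPY route (table name, column list, and file path are assumptions):

COPY users (username, email, created_at) FROM '/tmp/users.csv' WITH (FORMAT csv, HEADER);

From psql, the client-side \copy variant takes the same options and does not require superuser access to the server's file system:

\copy users (username, email, created_at) FROM 'users.csv' WITH (FORMAT csv, HEADER)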
As suggested by Daniel, the simplest solution is to keep your data in CSV format and just import it into Postgres as is.
If you're trying to merge this CSV data into a 3rd-party Postgres dump file, then you'll need to first convert the data into SQL INSERT statements.
One possible unix solution:
awk -F, '{printf "INSERT INTO my_tab VALUES (\047%s\047, \047%s\047, \047%s\047);\n", $1, $2, $3}' data.csv
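Redirect the output to a file (say inserts.sql) and it can be replayed with psql -f inserts.sql my_database, where my_database is a placeholder for the target database.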
I have a complicated dynamic query in TSQL that I want to export to Excel.
[The result table contains fields with text longer than 255 chars, if it matters]
I know I can export result using the Management Studio menus but I want to do it automatically by code. Do you know how?
Thanks in advance.
You could have a look at sp_send_dbmail. This allows you to send an email from your query after it's run, containing an attached CSV of the resultset. Obviously the viability of this method would be dependent on how big your resultset is.
Example from the linked document:
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = 'AdventureWorks2008R2 Administrator',
    @recipients = 'danw@Adventure-Works.com',
    @query = 'SELECT COUNT(*) FROM AdventureWorks2008R2.Production.WorkOrder
              WHERE DueDate > ''2006-04-30''
              AND DATEDIFF(dd, ''2006-04-30'', DueDate) < 2',
    @subject = 'Work Order Count',
    @attach_query_result_as_file = 1;
One way is to use bcp which you can call from the command line - check out the examples in that reference, and in particular see the info on the -t argument which you can use to set the field terminator (for CSV). There's this linked reference on Specifying Field and Row Terminators.
Or, directly using TSQL you could use OPENROWSET as explained here by Pinal Dave.
Update:
Re: 2008 64-bit & OPENROWSET: I wasn't aware of that. A quick dig turns up this on the MSDN forums, with a link given. Any help?
Aside from that, other options include writing an SSIS package or using SQL CLR to write an export procedure in .NET to call directly from SQL. Or, you could call bcp from TSQL via xp_cmdshell; you have to enable it first, though, which widens the potential "attack surface" of SQL Server. I suggest checking out this discussion.
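A sketch of that xp_cmdshell route, reusing the table from the sp_send_dbmail example above (the output path is an assumption, and xp_cmdshell must already be enabled):

DECLARE @cmd varchar(4000);
SET @cmd = 'bcp "SELECT * FROM AdventureWorks2008R2.Production.WorkOrder" '
         + 'queryout "C:\Exports\WorkOrder.csv" -c -t, -T -S ' + @@SERVERNAME;
EXEC master.dbo.xp_cmdshell @cmd;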
Some approaches here: SQL Server Excel Workbench
I needed to accept a dynamic query and save the results to disk so I could download them through the web application.
Inserting into a data source didn't work out for me; I couldn't get it working despite continued effort.
Eventually I went with sending the query to PowerShell from SSMS.
Read my post here
How do I create a document on the server by running an existing storedprocedure or the sql statement of that procedure on a R2008 sql server
Single quotes were a problem, however. At first I didn't trim my query and write it on one line, so it had line breaks in SQL Studio, which actually matters.
Been investigating for a while now and keep hitting a brick wall. I am importing from .xls files into temp tables via the OPENROWSET command. The problem is a certain column that holds a range of values, the most common being long numbers such as 15598 and strings such as 15598-E.
Now OPENROWSET is reading the string version with no problem but is returning the number version as NULL. I read (http://www.sqldts.com/254.aspx) that OPENROWSET has this issue, and the author speaks of adding "HDR=YES;IMEX=1" to the connection string, but that's not working for me at all.
Have any of you guys every encountered this?
Just some more info as well: I may not use the JET engine (Microsoft.Jet.OLEDB.4.0), so this is what my query looks like:
SELECT *
FROM
OPENROWSET('MSDASQL'
, 'Driver=Microsoft Excel Driver (*.xls);HDR=YES;IMEX=1;DBQ=C:\ImportFile.xls;'
, 'SELECT * FROM [Sheet1$]')
I notice you are using the Excel ODBC driver. Have you tried the JET OLEDB Provider with the equivalent connection string?
select * from openrowset(
'Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\ImportFile.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1"',
'SELECT * FROM [Sheet1$]')
EDIT: Sorry, just noticed your last paragraph. Surely the Excel ODBC driver still goes via the JET engine, so what difference would it make?
EDIT: I have looked at the KB194124 link, and the registry values it recommends are the default values on my machine, which I have never changed. I have used the above method several times myself without problems. Maybe it's an environmental issue?
If you don't mind opening the file in Excel, take the columns that have the problem, select the column, and do
Data -> Text to Columns -> Next -> Next -> Text
Save the spreadsheet and they should all come in as Text in OPENROWSET
I've found it more practical for imports like this to use .CSV files instead of Excel, opened via a Linked Server, with the format of the files defined in schema.ini; with that method you can explicitly choose each column's format.
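A rough sketch of that setup (linked server name, folder, file, and column names are all hypothetical); schema.ini sits in the same folder as the .csv and pins every column to Text:

-- schema.ini in C:\ImportFolder:
--   [ImportFile.csv]
--   ColNameHeader=True
--   Format=CSVDelimited
--   Col1=ID Text
--   Col2=PartNumber Text

EXEC sp_addlinkedserver
    @server = 'CsvImport',
    @srvproduct = 'Jet 4.0',
    @provider = 'Microsoft.Jet.OLEDB.4.0',
    @datasrc = 'C:\ImportFolder',
    @provstr = 'Text';

SELECT * FROM CsvImport...[ImportFile#csv];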
We've come across the same issue. Unfortunately we've not found a solution either. There's more information here which indicates that there might be a registry fix.
I had the same problem. I fixed it by cutting and pasting a row that contains a column with the string/numeric value (for example 123ABC) into the first row position of the sheet. For some reason T-SQL reads the first row and assumes that all the values are numeric.