Export tables to Flat File with some logic - tsql

I'm writing scripts to export some tables to flat files every day. I'm looking at the BCP utility, but I'm not sure it has the kind of features I really need.
For example, I need to output the fields out of order. That is, the 15th field in the MSSQL table should be the 2nd field in the flat file, etc.
More importantly, some of the fields need to be altered. For example, if a certain field is null or contains some special values, I need to replace them with codes.
Is BCP the right tool for this? My gut tells me to do this in Perl instead.

You can write a stored procedure and do all the data transformations there.
Then feed that stored procedure to bcp.
It will almost certainly be faster than Perl.
SSIS is fast too; it could be an option if the transformations are very complex.
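A minimal sketch of that approach (the procedure, table, and column names here are made up): the procedure reorders the columns and maps NULLs and special values to codes, and bcp runs it with queryout (this works as long as the procedure returns a single result set).
CREATE PROCEDURE dbo.ExportDaily
AS
BEGIN
    SET NOCOUNT ON;
    SELECT
        t.Field1,
        t.Field15,                                     -- the 15th column moved to position 2
        CASE
            WHEN t.Field2 IS NULL         THEN 'X01'   -- code for NULL
            WHEN t.Field2 IN ('N/A', '-') THEN 'X02'   -- code for special values
            ELSE t.Field2
        END AS Field2
    FROM dbo.SourceTable AS t;
END
Then something like:
bcp "EXEC MyDb.dbo.ExportDaily" queryout daily.dat -T -c -S myserver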

You can use a query to order and format the columns directly with BCP.
From the bcp Utility documentation:
"query" is a Transact-SQL query that returns a result set.
Example:
bcp "SELECT Name FROM AdventureWorks.Sales.Currency" queryout Currency.Name.dat -T -c

Related

Db2 for i: CPYF *NOCHK emulation

On IBM i there's a way to copy from a structured file to one without structure using CPYF *NOCHK.
How can it be done with SQL?
The answer may be "you can't", at least not if you are using DDL-defined tables. The problem is that *NOCHK just dumps data into the file as if it were a flat file. Files defined with CRTPF, whether they have DDS source or are program-described, don't care about bad data until read time, so they can contain bad data. In fact, you can even read bad data out of a file if you use a program definition for that file.
But, an SQL Table (one defined using DDL) cannot contain bad data. No matter how you write it, the database validates the data at write time. Even the *NOCHK option of the CPYF command cannot coerce bad data into an SQL table.
There really isn't an easy way.
The closest would be to just build a big character string using CONCAT...
insert into flatfile
select mycharfld1
concat cast(myvchar as char(20))
concat digits(zonedFld3)
from mytable
That works for fixed-length character, varchar (if cast to char), and zoned decimal fields...
Packed decimal would be problematic.
I've seen user-defined functions that return the binary character string that makes up a packed decimal... but it's very ugly.
I question why you think you need to do this.
You can use the QSYS2.QCMDEXC stored procedure to execute OS commands.
Example:
call qsys2.qcmdexc ( 'CPYF FROMFILE(QTEMP/FILE1) TOFILE(QTEMP/FILE2) MBROPT(*replace) FMTOPT(*NOCHK)' )

How can I ensure Redshift Unload Copy columns are in correct order?

I am trying to use the UnloadCopyUtility to migrate an instance to an encrypted instance, but some of the tables fail because it is trying to insert the values into the wrong columns. Is there a way I can ensure the columns are mapped to the values correctly? I can adjust the Python script locally if need be.
I feel this should be possible in the UnloadCopy utility as well.
But here I'm trying to give a more generic solution without the UnloadCopy utility, so that it may be helpful to others as an alternative.
In the UNLOAD command you can specify the columns explicitly, like C1,C2,C3,...
Then use the same column sequence in the COPY command while loading the data into Redshift.
Unload command example.
unload ('select C1,C2,C3,... from venue') to 's3://mybucket/tickit/unload/venue_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' parallel off;
COPY command example, using the same column sequence for the files unloaded above.
copy table (C1,C2,C3,...) from 's3://<your-bucket-name>/load/key_prefix' credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' options;
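If you don't want to type out the column list by hand, one way to get it in the table's defined order (a sketch, assuming the venue table from the example lives in the public schema) is to query the catalog and paste the result into both commands:
select column_name
from information_schema.columns
where table_schema = 'public'
  and table_name = 'venue'
order by ordinal_position;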

PostgreSQL: Import columns into table, matching key/ID

I have a PostgreSQL database. I had to extend an existing, big table with a few more columns.
Now I need to fill those columns. I thought I could create a .csv file (out of Excel/Calc) which contains the IDs / primary keys of existing rows, plus the data for the new, empty fields. Is it possible to do so? If so, how?
I remember doing exactly this pretty easily using Microsoft SQL Server Management Studio, but for PostgreSQL I am using pgAdmin (though I am of course willing to switch tools if it would be helpful). I tried pgAdmin's import function, which uses PostgreSQL's COPY, but it seems COPY isn't suitable, as it can only create whole new rows.
Edit: I guess I could write a script which loads the csv and iterates over the rows, using UPDATE. But I don't want to reinvent the wheel.
Edit2: I've found this question here on SO which provides an answer by using a temp table. I guess I will use it - although it's more of a workaround than an actual solution.
PostgreSQL can import data directly from CSV files with COPY statements; as you stated, however, this only works for new rows.
Instead of creating a CSV file you could just generate the necessary SQL UPDATE statements.
Suppose this is the CSV file:
PK;ExtraCol1;ExtraCol2
1;"foo";42
4;"bar";21
Then just produce the following
UPDATE my_table SET ExtraCol1 = 'foo', ExtraCol2 = 42 WHERE PK = 1;
UPDATE my_table SET ExtraCol1 = 'bar', ExtraCol2 = 21 WHERE PK = 4;
You seem to work under Windows, so I don't really know how to accomplish this there (probably with PowerShell), but under Unix you could generate the SQL from a CSV easily with tools like awk or sed. An editor with regular expression support would probably suffice too.
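For example, with awk (a rough sketch, assuming the semicolon-delimited file above is saved as data.csv and keeps its header row):
awk -F';' -v q="'" 'NR > 1 {
    gsub(/"/, "", $2)   # strip the CSV double quotes from the text column
    printf "UPDATE my_table SET ExtraCol1 = %s%s%s, ExtraCol2 = %s WHERE PK = %s;\n", q, $2, q, $3, $1
}' data.csv > updates.sql
The generated updates.sql can then be run with psql -f updates.sql.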

Dump subset of records in an OpenEdge database table in the ".d" file format

I am looking for the easiest way to manually dump a subset of records in an OpenEdge database table in the Progress ".d" file format.
The best way I can imagine is creating an extra test database with a schema identical to the source database, then copying the subset of records over to the test database using FOR EACH and BUFFER-COPY statements, and finally exporting the data from the test database using the Dump Data and Definitions > Table Contents (.d file)... menu option.
That seems like a lot of trouble. If you can identify the subset of records in order to do the BUFFER-COPY, then you should also be able to:
OUTPUT TO VALUE( "table.d" ).
FOR EACH table NO-LOCK WHERE someCondition:
    EXPORT table.
END.
OUTPUT CLOSE.
Which is, essentially, what the dictionary "dump data" .d file is, less a few lines of administrivia at the bottom that can be safely omitted for most purposes.

How should I open a PostgreSQL dump file and add actual data to it?

I have a pretty basic database. I need to drop a good-sized users list into the db. I have the dump file, which I need to convert to a .pg file, and then somehow load this data into it.
The data I need to add are in CSV format.
I assume you already have a .pg file, which I take to be a database dump in the "custom" format.
PostgreSQL can load data in CSV format using the COPY statement. So the absolute simplest thing to do is just add your data to the database this way.
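For example, from psql (a sketch; the users table and its columns are just placeholders for whatever your CSV actually contains):
\copy users (id, name, email) from 'users.csv' with (format csv, header)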
If you really must edit your dump, and the file is in the "custom" format, there is unfortunately no way to edit it manually. However, you can use pg_restore to create a plain SQL backup from the custom format and edit that instead: pg_restore with no -d argument generates an SQL script rather than restoring into a database.
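Roughly (assuming the custom-format dump is called dump.pg and the target database is mydb):
pg_restore dump.pg > dump.sql    # writes a plain SQL script instead of restoring
# edit dump.sql as needed, then load it:
psql -d mydb -f dump.sql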
As suggested by Daniel, the simplest solution is to keep your data in CSV format and just import it into Postgres as is.
If you're trying to merge this CSV data into a third-party Postgres dump file, then you'll need to first convert the data into SQL INSERT statements.
One possible Unix solution:
awk -F, '{printf "INSERT INTO my_tab VALUES ('\''%s'\'', '\''%s'\'', '\''%s'\'');\n", $1, $2, $3}' data.csv
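The resulting INSERT statements can then be appended to a plain-text dump, or piped straight into psql (for example, awk ... data.csv | psql -d mydb, where mydb is just a placeholder database name).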