Store .fmt file in SQL Server - tsql

Is it possible to store a .fmt file directly in the database, like a stored procedure, rather than in a separate file?
The imported files vary, but the format file is constant for the procedure. No BLOBs or FILESTREAM are used.
...
FROM OPENROWSET (
BULK 'd:\path\some_variable_file.txt',
FIRSTROW = 2,
FORMATFILE = 'd:\path\importformat.fmt'
) AS import

OPENROWSET does not support any source besides the file system for FORMATFILE. One option is to store the format file data (in either the non-XML or XML format) in a table and extract it to disk with a PowerShell script before running the import.
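For illustration, here is a minimal Python sketch of that extract-to-disk idea (the answer mentions PowerShell, but any scripting language works). It assumes a hypothetical dbo.FormatFiles table with name and contents columns; the table, column, server, and database names are placeholders, not anything from the original setup.

import pyodbc

# Hypothetical storage table:
#   CREATE TABLE dbo.FormatFiles (name sysname PRIMARY KEY, contents nvarchar(max));
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
)

# Pull the stored format file text out of the database...
row = conn.cursor().execute(
    "SELECT contents FROM dbo.FormatFiles WHERE name = ?", "importformat.fmt"
).fetchone()

# ...and materialize it on disk where OPENROWSET expects a real file path.
with open(r"d:\path\importformat.fmt", "w", encoding="utf-8") as f:
    f.write(row.contents)

# The OPENROWSET ... FORMATFILE = 'd:\path\importformat.fmt' query can now run unchanged.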

Related

aws_s3.query_export_to_s3 PostgreSQL RDS extension exporting all multi-part CSV files to S3 with a header

I'm using the aws_s3.query_export_to_s3 function to export data from an Amazon Aurora PostgreSQL database to S3 in CSV format with a header row.
This works.
However, when the export is large and outputs to multiple part files, the first part file has the CSV header row, and subsequent part files do not.
SELECT * FROM aws_s3.query_export_to_s3(
'SELECT ...',
aws_commons.create_s3_uri(...),
options:='format csv, HEADER true'
);
How can I make this export add the header row to all CSV file parts?
I'm using Apache Spark to load this CSV data and it expects a header row in each individual part file.
It's not possible, unfortunately.
The aws_s3.query_export_to_s3 function uses the PostgreSQL COPY command under the hood and then chunks the files appropriately depending on size.
Unless the extension picks up on the HEADER true option, caches the header, and then provides an option to apply it to every CSV file generated, you're out of luck.
The expectation is that the files are combined at the destination when downloaded, that the file processor has some mechanism for reading files in parts, or that the processor only needs the header once.
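If combining at the destination is acceptable, a rough Python/boto3 sketch of that option might look like the following. The bucket name, prefix, and output path are placeholders; it simply relies on the fact that only the first part file carries the header.

import boto3

# Placeholders - substitute your own bucket, export prefix, and local output path.
BUCKET = "my-export-bucket"
PREFIX = "exports/my_query"   # the S3 prefix the export wrote its part files under
OUTPUT = "combined.csv"

s3 = boto3.client("s3")

# List the part files the export produced, in key order
# (the first part carries the header, the rest are data only).
keys = sorted(
    obj["Key"]
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX)
    for obj in page.get("Contents", [])
)

# Concatenating the parts therefore yields one CSV with a single header row.
with open(OUTPUT, "wb") as out:
    for key in keys:
        out.write(s3.get_object(Bucket=BUCKET, Key=key)["Body"].read())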

How to convert a list of dictionaries into a byte stream and load it into the database

My standard way to bulk-upload CSV files into a Postgres database is to use the copy_expert() method:
cursor.copy_expert("copy %s from STDIN CSV HEADER NULL 'NULL' QUOTE '\"';" % (table_name), file=f)
Very often, before loading a file into the database, I run some pre-processing of the CSV data, and the results are always kept in a list of dictionaries.
Following my standard way of loading, I have to offload the list of dictionaries to a temporary CSV file and hand only that file to copy_expert() as the source of data.
What I would like to do is replace the file source with some kind of in-memory stream built from the list of dictionaries, and then pass that to copy_expert() as the source.
In other words, I would like to skip the step of writing a temporary CSV file and load the data straight from memory.
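For what it's worth, a minimal sketch of that idea, assuming psycopg2 and the standard csv and io modules (the copy_dicts helper name and the data are illustrative, not an established API), is to serialize the dictionaries into an in-memory buffer and hand that buffer to copy_expert() in place of the file:

import csv
import io

def copy_dicts(cursor, table_name, rows):
    """Load a list of dictionaries into table_name via COPY, without a temporary CSV file."""
    # Assumes rows is non-empty and every dict has the same keys.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    writer.writerows(rows)
    buf.seek(0)  # rewind so copy_expert reads from the start

    # Same COPY statement as before, but the source is the in-memory buffer.
    # Note: None values are written as empty strings; adjust the NULL option if you need real NULLs.
    cursor.copy_expert(
        "copy %s from STDIN CSV HEADER NULL 'NULL' QUOTE '\"';" % table_name,
        file=buf,
    )

# Usage (hypothetical connection and data):
# rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
# with conn.cursor() as cur:
#     copy_dicts(cur, "my_table", rows)
# conn.commit()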

Lift load date format issue from CSV file

We are migrating Db2 data to Db2 on Cloud. We are using the Lift CLI operations below for the migration:
Extracting a database table to a CSV file using lift extract from the source database.
Then loading the extracted CSV file into Db2 on Cloud using lift load.
ISSUE:
We have created some tables via DDL on the target Db2 on Cloud which have some columns with data type TIMESTAMP.
During the load operation (lift load), we get the following error:
"MESSAGE": "The field in row \"2\", column \"8\" which begins with
\"\"2018-08-08-04.35.58.597660\"\" does not match the user specified
DATEFORMAT, TIMEFORMAT, or TIMESTAMPFORMAT. The row will be
rejected.", "SQLCODE": "SQL3191W"
If you use Db2 as the source database, then use either:
the following property during export (to export dates, times, and timestamps as usual for Db2 utilities, i.e. without double quotes):
source-database-type=db2
or the following property during load, if you have already exported timestamps surrounded by double quotes:
timestamp-format="YYYY-MM-DD-HH24.MI.SS.FFFFFF"
If the data was extracted using lift extract, then you should definitely load it with source-database-type=db2. Using this parameter preconfigures all the necessary load details automatically.

SAS macro to read multiple rawdata files and create multiple SAS dataset for each raw data file

Hi there
My name is Chandra. I am not very good with SAS macros, especially the looping part and resolving && etc. Here is my problem statement.
Problem statement:
I have a large number of raw data files (.dat files) stored in a folder on a SAS server. I need a macro that can read each of these raw data files, create a SAS data set for each one, and store them in a separate target folder on the SAS server. All of the raw data files have the same file layout structure. I need to automate this operation so that every week the macro reads the raw data files from the source folder and creates the corresponding SAS data sets in the target folder. For example, if there are 200 raw data files in the source folder, I want to read them and create 200 SAS data sets, one for each raw data file, and save them in the target folder. I am not very good at constructing looping statements or at resolving && or &&& etc. How do I do it?
I would highly appreciate your kind assistance in this regard.
Respectfully
Chandra
You don't necessarily need a macro or a loop if the files have the same fields. You can try the PIPE option on the FILENAME statement. Here is the link
You do not need a macro for this type of processing.
The INFILE statement will accept a file specification that includes operating-system wildcards.
This example creates 200 text files in the work folder and then reads them all back in with a single step.
I highly recommend not creating 200 separate data sets. Instead, keep the filename, or a unique portion thereof, as a categorical variable that can be used later in a CLASS or BY statement, or as part of the criteria of a sub-setting WHERE clause.
%let workpath = %sysfunc(pathname(WORK));
* create something to input;
data _null_;
do i = 0 to 1999;
if mod(i,10) = 0 then filename = cats("&workpath./",'sample',i/10,'.txt');
file sample filevar=filename;
x = i; y = x**2;
put i x y;
end;
run;
* input data from 200 different files that have the same layout;
data samples;
length filename $250;
infile "&workpath.\*.txt" filename=filename; %* <-- Here be the wildcards;
input i x y;
source = filename;
run;

How should I open a PostgreSQL dump file and add actual data to it?

I have a pretty basic database. I need to drop a good-sized users list into the db. I have the dump file; I need to convert it to a .pg file and then somehow load this data into it.
The data I need to add are in CSV format.
I assume you already have a .pg file, i.e. a database dump in the "custom" format.
PostgreSQL can load data in CSV format using the COPY statement. So the absolute simplest thing to do is just add your data to the database this way (a short sketch follows below).
If you really must edit your dump, and the file is in the "custom" format, there is unfortunately no way to edit the file manually. However, you can use pg_restore to create a plain SQL backup from the custom format and edit that instead. pg_restore with no -d argument will generate an SQL script for insertion.
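As a rough sketch of the COPY suggestion above, assuming psycopg2 and entirely hypothetical table, column, file, and connection names (a plain COPY statement or psql \copy works just as well):

import psycopg2

# Hypothetical connection string, table, columns, and file name.
conn = psycopg2.connect("dbname=mydb user=me")
with conn, conn.cursor() as cur:
    with open("users.csv") as f:
        # Stream the CSV through STDIN, treating the first line as a header row.
        cur.copy_expert(
            "COPY users (first_name, last_name, email) FROM STDIN WITH (FORMAT csv, HEADER true)",
            f,
        )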
As suggested by Daniel, the simplest solution is to keep your data in CSV format and just import it into Postgres as is.
If you're trying to merge this CSV data into a 3rd-party Postgres dump file, then you'll need to first convert the data into SQL INSERT statements.
One possible unix solution:
awk -F, -v q="'" '{printf "INSERT INTO my_tab VALUES (%s, %s, %s);\n", q $1 q, q $2 q, q $3 q}' data.csv
(Values containing single quotes or embedded commas would still need extra escaping before being loaded this way.)