I am trying to export / import a set of tables from a PostgreSQL database.
I am using psql's \copy ... from stdin in a script. I have read that data (previously produced using \copy ... to stdout) can be read in-line and is terminated by the escape sequence \..
What the documentation doesn't make clear to me is what happens if \. appears in the previously exported data.
Specifically, this section of the documentation (emphasis mine) isn't very clear about that.
For \copy ... from stdin, data rows are read from the same source that
issued the command, continuing until \. is read or the stream reaches
EOF. This option is useful for populating tables in-line within a SQL
script file. For \copy ... to stdout, output is sent to the same place
as psql command output, and the COPY count command status is not
printed (since it might be confused with a data row). To read/write
psql's standard input or output regardless of the current command
source or \o option, write from pstdin or to pstdout.
Can / must a \. appearing in the data be escaped somehow?
I am currently using UTF-8 encoded text format for the export / import.
I think I found the relevant information in the documentation of the SQL COPY command (TEXT Format section, again emphasis mine):
End of data can be represented by a single line containing just backslash-period (\.). An end-of-data marker is not necessary when reading from a file, since the end of file serves perfectly well; it is needed only when copying data to or from client applications using pre-3.0 client protocol.
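For reference, the in-line usage that paragraph describes looks roughly like this (a minimal sketch with a made-up table; the data columns are separated by a tab character, as in the default text format):

CREATE TABLE demo (id integer, note text);

\copy demo FROM stdin
1	first row
2	second row
\.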
Related
I'd like to ask whether my syntax for loading a CSV file into a DB2 database is correct. I cannot verify it myself, as I'm having problems configuring DB2 locally. I'd also like to confirm that the placement of the double quotes is correct for both dateformat and timeformat.
Below is my code snippet.
LOGFILE=/mnt/bin/log/myLog.txt
db2 "load from /mnt/bin/test.csv of del modified by coldel noeofchar noheader dateformat=\"YYYY-MM-DD\" timeformat=\"HH:MM:SS\" usedefaults METHOD P(1,2,3,4,5) messages $LOGFILE insert_update into myuser.desctb(DESC_ID,START_DATE,START_TIME,END_DATE,END_TIME)"
If you use modified by coldel then you should also specify the delimiter character. If the delimiter really is a comma, then omit the coldel option.
Additionally, insert_update is an option of the IMPORT command (not of LOAD), and IMPORT is a logged operation, which reduces insert throughput. With the LOAD command you can use ... replace into ... instead. Study the docs for the details.
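As a rough, untested sketch (assuming the file really is comma-delimited so coldel can be dropped, and keeping the other modifiers from your command unchanged), the LOAD command might look like this:

db2 "load from /mnt/bin/test.csv of del modified by noeofchar noheader dateformat=\"YYYY-MM-DD\" timeformat=\"HH:MM:SS\" usedefaults METHOD P(1,2,3,4,5) messages $LOGFILE replace into myuser.desctb(DESC_ID,START_DATE,START_TIME,END_DATE,END_TIME)"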
The quoting seems OK, but correctness of the formats depends on data file values.
Refer to the LOAD documentation for details; you should study that page and the related pages.
An alternative to LOAD is the INGEST command (available in current Db2 clients), which supports insert, replace, merge and other options and offers high throughput (compared to IMPORT).
I'm trying to import a .txt file into PostgreSQL. The txt file has 6 columns:
Laboratory_Name Laboratory_ID Facility ZIP_Code City State
And 213 rows.
I'm trying to use \copy to put the contents of this file into a table called doe2 in PostgreSQL using this command:
\copy DOE2 FROM '/users/nathangroom/desktop/DOE_inventory5.txt' (DELIMITER(' '))
It gives me this error:
missing data for column "facility"
I've looked all around for what to do when encountering this error and nothing has helped. Has anyone else encountered this?
Three possible causes:
One or more lines of your file have only 4 or fewer space characters (your delimiter); see the check sketched after this list.
One or more space characters have been escaped (inadvertently). Maybe with a backslash at the end of an unquoted value. For the (default) text format you are using, the manual explains:
Backslash characters (\) can be used in the COPY data to quote data
characters that might otherwise be taken as row or column delimiters.
Output from COPY TO or pg_dump would not exhibit any of these faults when reading from a table with matching layout. But maybe your file has been edited or is from a different, faulty source?
You are not using the file you think you are using. The \copy meta-command of the psql command-line interface is a wrapper for COPY and reads files local to the client. If your file lives on the server, use the SQL command COPY instead.
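If it helps, here is a quick, untested check (assuming a Unix-like shell; the path is the one from your question) that prints every line which does not split into exactly six space-separated fields, which would point at causes 1 and 2 above:

awk -F'[ ]' 'NF != 6 {print NR": "$0}' /users/nathangroom/desktop/DOE_inventory5.txt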
Check the file carefully. In my case, a blank line at the end of the file caused the ERROR: missing data for column. Deleting it made the command work fine.
Printing the blank lines might reveal something interesting:
cat -e $filename
I had a similar error. Check the version of pg_dump that was used to export the data and the version of the database you want to insert it into, and make sure they are the same. Also, if the COPY export fails, export the data as INSERT statements instead.
I hope you are all well.
So my question is about the procedure to open multiple raw data files that are compressed.
My file names are ordered, so I have for example: o_equities_20080528.tas.zip o_equities_20080529.tas.zip o_equities_20080530.tas.zip ...
Thank you all in advance.
How much work this will be depends on whether:
You have enough space to extract all the files simultaneously into one folder
You need to be able to keep track of which file each record has come from (i.e. you can't tell just from looking at a particular record).
If you have enough space to extract everything and you don't need to track which records came from which file, then the simplest option is to use a wildcard infile statement, allowing you to import the records from all of your files in one data step:
infile "c:\yourdir\o_equities_*.tas" <other infile options as per individual files>;
This syntax works regardless of OS - it's a SAS feature, not shell expansion.
If you have enough space to extract everything in advance but you need to keep track of which records came from each file, then please refer to this page for an example of how to do this using the filevar option on the infile statement:
http://www.ats.ucla.edu/stat/sas/faq/multi_file_read.htm
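That page describes the filevar technique; as an alternative, untested sketch (the dataset, variable names and input statement below are placeholders), the filename= option on a wildcard infile also lets you capture which physical file each record came from:

data all_equities;
    length source_file $260;
    /* FILENAME= puts the physical name of the file currently being read into fname;
       copy it to a regular variable so it is kept on the output observation */
    infile "c:\yourdir\o_equities_*.tas" filename=fname <other infile options as per individual files>;
    source_file = fname;
    input my_var1 my_var2 $ my_var3; /* replace with your actual input statement */
run;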
If you don't have enough space to extract everything in advance, but you have access to 7-zip or another archive utility, and you don't need to keep track of which records came from each file, you can use a pipe filename and extract to standard output. If you're on a Linux platform then this is very simple, as you can take advantage of shell expansion:
filename cmd pipe "nice -n 19 gunzip -c /yourdir/o_equities_*.tas.zip";
infile cmd <other infile options as per individual files>;
On Windows it's the same sort of idea, but as you can't use shell expansion, you have to construct a separate filename for each zip file, or use some of 7-Zip's more arcane command-line options, e.g.:
filename cmd pipe "7z.exe e -an -ai!C:\yourdir\o_equities_*.tas.zip -so -y";
This will extract all files from all of the matching archives to standard output. You can narrow this down further via the 7-zip command if necessary. You will have multiple header lines mixed in with the data - you can use findstr to filter these out in the pipe before SAS sees them, or you can just choose to tolerate the odd error message here and there.
Here, the -an tells 7-zip not to read the zip file name from the command line, and the -ai tells it to expand the wildcard.
If you need to keep track of what came from where and you can't extract everything at once, your best bet (as far as I know) is to write a macro that processes one file at a time using the above techniques, adding this information as you import each file.
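A very rough, untested sketch of such a macro (it assumes 7z.exe is on the PATH and that the paths contain no spaces; the macro name, dataset names and input statement are placeholders):

%macro read_one(zippath);
    /* extract this one archive to standard output and read it directly */
    filename cmd pipe "7z.exe e -so -y &zippath";
    data one_file;
        length source_file $260;
        infile cmd <other infile options as per individual files>;
        source_file = "&zippath"; /* remember which archive the records came from */
        input my_var1 my_var2 $ my_var3; /* replace with your actual input statement */
    run;
    proc append base=all_equities data=one_file force;
    run;
%mend read_one;

%read_one(C:\yourdir\o_equities_20080528.tas.zip)
%read_one(C:\yourdir\o_equities_20080529.tas.zip)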
I'm trying to use BCP to dump data from a CDC function into a .dat file. I'm using the following query (which works in SQL Server 2008 R2):
USE LEESWIJZER
DECLARE @begin_time datetime
      , @end_time datetime
      , @from_lsn binary(10)
      , @to_lsn binary(10)
SET @end_time = '2013-07-05 12:00:00.000';
SELECT @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
SELECT @from_lsn = sys.fn_cdc_get_min_lsn('dbo_LWR_CONTRIBUTIES')
SELECT sys.fn_cdc_map_lsn_to_time(__$start_lsn) AS ChangeDTS
     , *
FROM cdc.fn_cdc_get_net_changes_dbo_LWR_CONTRIBUTIES (@from_lsn, @to_lsn, 'all')
(edited for readability, used in BCP as single string)
my BCP string is:
BCP "Query above" queryout "C:\temp\LWRCONTRIBUTIES.dat" -w -t ";|" -r \n -T -S {server\\instance} -o "C:\temp\LWRCONTRIBUTIES.log"
As you can see, I want the resulting .dat file in Unicode, plus a log file. I'm guessing the "ChangeDTS" column added to the function output is causing my problem. The error message reads: "[Microsoft][SQL Native Client]Host-file columns may be skipped only when copying into the Server".
It may be resolved using a format file, but since this code needs to run daily, likely more than once a day, and the tables are subject to change, I'm reluctant to constantly adjust my format files (there are 100's of tables needing the same procedure).
Furthermore, this is run on a client's database, and they won't like me creating views in it.
Anybody got any idea how I can create a text file (.dat) with a selected number of columns from a cdc function?
Found the answer: regardless of which version of bcp is used, bcp can't handle declarations, it seems. If I edit those out, it works like a charm.
However, according to someone on a different forum, BCP should be able to handle declarations of variables. So I'm happy it works for me now, but I'm still confused about why it does now and didn't before.
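For reference, this is roughly what the query looks like with the declarations edited out and the variables inlined (a sketch only; I'm assuming the CDC functions can be called directly as arguments here, so verify it against your own data before relying on it):

USE LEESWIJZER
SELECT sys.fn_cdc_map_lsn_to_time(__$start_lsn) AS ChangeDTS
     , *
FROM cdc.fn_cdc_get_net_changes_dbo_LWR_CONTRIBUTIES
     ( sys.fn_cdc_get_min_lsn('dbo_LWR_CONTRIBUTIES')
     , sys.fn_cdc_map_time_to_lsn('largest less than or equal', '2013-07-05 12:00:00.000')
     , 'all')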
I am trying to query a Sybase ASA database using the dbisqlc.exe command-line on a Windows system and would like to collect the column headers along with the associated table data.
Example:
dbisqlc.exe -nogui -c "ENG=myDB;DBN=dbName;UID=dba;PWD=mypwd;CommLinks=tcpip{PORT=12345}" select * from myTable; OUTPUT TO C:\OutputFile.txt
I would prefer it if this command wrote to stdout; however, that does not appear to be an option aside from using dbisql.exe, which is not available in my environment.
When I run it in this format, the header and data are generated, but in a format that is hard to parse.
Any assistance would be greatly appreciated.
Try adding the 'FORMAT SQL' clause to the OUTPUT statement. It will give you the select statement containing the column names as well as the data.
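Untested, but based on the command in your question it would look something like this:

dbisqlc.exe -nogui -c "ENG=myDB;DBN=dbName;UID=dba;PWD=mypwd;CommLinks=tcpip{PORT=12345}" select * from myTable; OUTPUT TO C:\OutputFile.txt FORMAT SQL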
Reviewing the output from the following dbisqlc.exe command, it appears that I can parse the output using Perl.
Command:
dbisqlc.exe -nogui -c "ENG=myDB;DBN=dbName;UID=dba;PWD=mypwd;CommLinks=tcpip{PORT=12345}" select * from myTable; OUTPUT TO C:\OutputFile.txt
The output appears to break in odd places when viewed in text editors such as vi or TextPad; however, the output from this command is actually returned with fixed column widths.
The second line of the output is a row of = signs spanning the width of each column. What I did was build a "template" string based on those = runs, which can be passed to Perl's unpack function. I then use this template to build an array of column names and to parse the result set with unpack.
This may not be the most efficient method, but I think it should give me the results I am looking for.
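Here is a rough sketch of the parsing script (untested against a real output file; the file name, the assumption of a single space between columns, and the final print are illustrative only):

#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', 'C:\OutputFile.txt' or die "cannot open output file: $!";

my $header = <$fh>;    # first line: column names
my $ruler  = <$fh>;    # second line: runs of '=' marking each column's width
chomp($header, $ruler);

# Build an unpack template such as "A13 A9 A31 A*" from the widths of the '='
# runs (each run is assumed to be followed by a single separating space).
my @widths = map { length } split ' ', $ruler;
pop @widths;    # the last column is handled by 'A*' below
my $template = join(' ', map { 'A' . ($_ + 1) } @widths) . ' A*';

my @columns = unpack $template, $header;
s/\s+$// for @columns;              # trim the padding from the column names
print join('|', @columns), "\n";    # header row

while (my $line = <$fh>) {
    chomp $line;
    next if $line =~ /^\s*$/;       # skip blank lines
    my @fields = unpack $template, $line;
    s/\s+$// for @fields;
    print join('|', @fields), "\n"; # do something useful with the parsed row
}
close $fh;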