How to export a file in Db2 without a character delimiter or a column delimiter?

I would like to export a table with the same fixed-width layout that a normal select produces. I mean:
db2 -x "select varchar(SCHEMANAME, 16) SCHEMANAME,
varchar(OWNER, 10) OWNER,
varchar(OWNERTYPE, 10) OWNERTYPE
from syscat.schemata where SCHEMANAME like 'SYS%'"
And the output is:
SYSCAT           SYSIBM     S
SYSFUN           SYSIBM     S
SYSIBM           SYSIBM     S
SYSIBMADM        SYSIBM     S
SYSIBMINTERNAL   SYSIBM     S
SYSIBMTS         SYSIBM     S
SYSPROC          SYSIBM     S
SYSPUBLIC        SYSIBM     S
SYSSTAT          SYSIBM     S
SYSTOOLS         SYSIBM     S
I would like to generate the same via an export (fixed-length columns). I have tried:
db2 "export to myfile.csv of del
modified by coldelX20
SELECT *
from syscat.schemata"
db2 "export to myfile.csv of del
modified by nochardel coldelX20
SELECT *
from syscat.schemata"
db2 "export to myfile.csv of del
modified by chardelX21 coldelX20
SELECT *
from syscat.schemata"
And I got:
SQL3017N A delimiter is not valid or is used more than once.
(Redirecting the output of a normal select is not an option).

From the Db2 documentation's delimiter considerations for moving data:
Delimiter restrictions
There are a number of restrictions in place that help prevent the
chosen delimiter character from being treated as a part of the data
being moved. First, delimiters are mutually exclusive. Second, a
delimiter cannot be binary zero, a line-feed character, a
carriage-return, or a blank space. As well, the default decimal point
(.) cannot be a string delimiter. Finally, in a DBCS environment, the
pipe (|) character delimiter is not supported.
The Db2 export command doesn't support a space as a delimiter, which is why the coldelX20 attempts above fail with SQL3017N. Moreover, it doesn't right-pad values with spaces to the maximum column length the way the CLP output does.
You can construct a single padded column expression yourself, like below:
select
char(schemaname, 20)
-- ...
|| ' ' || char(create_time)
-- ...
|| ' ' || char(coalesce(char(auditpolicyid), '-'), 11)
-- ...
from syscat.schemata;
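Wrapped in an export with nochardel, a minimal sketch for the three columns from the question could look like this (the output file name is illustrative; char() pads with blanks to the given length, so no separate padding function is needed):
db2 "export to myfile.txt of del
modified by nochardel
select char(schemaname, 16) || ' ' || char(owner, 10) || ' ' || char(ownertype, 10)
from syscat.schemata
where schemaname like 'SYS%'"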

Try this:
export to myfile.csv of del
modified by nochardel
select rpad(left(SCHEMANAME, 16), 16) ||
       rpad(left(OWNER, 10), 10) ||
       rpad(left(OWNERTYPE, 10), 10)
from syscat.schemata
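Since every value here is shorter than its padded width, the exported file should contain fixed-width lines much like the CLP output above, e.g.:
SYSCAT          SYSIBM    S
SYSFUN          SYSIBM    S
Note that this variant concatenates the padded fields directly, without the single-space separator the CLP inserts between columns.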

Related

PSQL (postgres or redshift) stored variable to prompt and write query to dynamic filename

My ad hoc workflow has me in the psql client often, so often that I define useful queries and settings changes in my .psqlrc file. I'm sharing the solution to this because there are few examples online, and since you can't use newlines in a meta-command, the syntax gets ugly and debugging took a long time.
Define a psql meta-command in a variable that prompts for a sql file path and writes the output to a local file with a dynamic filename:
prompt for sql file to execute
prompt for output filename prefix
generate a dynamic output filename based on ISO reporting week
Here is a manual example of the steps I want to wrap into a .psqlrc-defined variable:
-- the following at psql prompt =>>
select 'file_prefix' || '_week_'
|| to_char(next_day(current_date - 1 - 7 * 1, 'sat') + 1,'iyyy-iw')
|| '.txt' report_filename;
┌──────────────────────────────┐
│ report_filename │
├──────────────────────────────┤
│ file_prefix_week_2019-07.txt │
└──────────────────────────────┘
\out file_prefix_week_2019-07.txt
\a \pset footer off -- no border or row count to output file
\i 'path/to/sql_file.sql'
-- now I have a text file of the output locally on my machine
\out \a \pset footer on
=>>
-- back to normal terminal output
Here is the working solution that can be issued on the psql command line or appended to a .psqlrc and invoked from the psql prompt with the variable name:
-- newlines are included for readability but:
-- **remove all newlines when pasting to .psqlrc**
\set report '\\echo enter filename prefix:\\\\ \\prompt file_prefix \\\\
\\echo enter sql file path:\\\\ \\prompt sql_file \\\\
select :''file_prefix'' || ''_week_''
|| to_char(next_day(current_date - 1 - 7 * 1, ''sat'') + 1,''iyyy-iw'')
|| ''.txt'' report_filename \\gset \\\\
\\pset footer off \\pset border 0 \\pset expanded off \\pset format unaligned \\\\
\\out :report_filename \\\\
\\i :sql_file \\\\
\\out \\\\
\\pset footer on \\pset border 2 \\pset expanded auto
\\pset format aligned \\pset linestyle unicode'
key points:
Postgres psql documentation (current version): https://www.postgresql.org/docs/current/app-psql.html
when copying/pasting, remove all the newlines
the variable string cannot contain newlines; it must be one long string
the string is contained in single quotes '
the symbol \\ separates commands
the symbol \ is escaped by doubling
therefore use \\ for \, \\\\ for \\
single quotes within the string are escaped by doubling
therefore use '' for ' within the string
this applies to :variables that are strings as well (see the short sketch after this list)
\gset assigns the output of a query to a variable of the column name
the :sql_file value can include spaces; the variable is stored as a text string that \i can parse without wrapping it as :''sql_file''
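As a minimal sketch of these escaping rules, here is a hypothetical two-command variable (the name greet is made up):
\set greet '\\echo ''hello'' \\\\ \\echo ''world'''
psql stores the value \echo 'hello' \\ \echo 'world', so typing :greet at the prompt runs the two \echo commands in sequence.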
relevant components in .psqlrc
-- .psqlrc
-- remove newlines from \set variable strings before pasting
\pset fieldsep '\t'
\set prompt_1 '%R%#> '
\set PROMPT1 :prompt_1
\set prompt_copyout 'copyout : : : copyout\n%R%#> '
-- helper variables
-- usage:
-- :copyout desired_output_filename
\set copyout '\\pset footer off \\pset border 0 \\pset expanded off
\\pset format unaligned \\pset title \\pset null ''''
\\set PROMPT1 :prompt_copyout \\out '
-- usage to return to normal terminal output
-- :copyoff
\set copyoff '\\pset footer on \\pset border 2 \\pset expanded auto
\\pset format aligned \\pset title \\pset null ''[null]''
\\set PROMPT1 :prompt_1 \\pset linestyle unicode \\out \\\\'
reformulated with helper variables
I switch between copying out to local text files and viewing sql output in the terminal screen, so my actual usage includes the following helpers:
-- again remove all newlines before pasting into .psqlrc
\set report '\\echo enter filename prefix:\\\\ \\prompt file_prefix \\\\
\\echo enter sql file path:\\\\ \\prompt sql_file \\\\
select :''file_prefix'' || ''_week_''
|| to_char(next_day(current_date - 1 - 7 * 1, ''sat'') + 1,''iyyy-iw'')
|| ''.txt'' report_filename \\gset \\\\
:copyout :report_filename \\\\
\\i :sql_file \\\\
:copyoff'
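Usage is then a single invocation at the prompt; roughly (the responses are illustrative):
=>> :report
enter filename prefix:
myfile
enter sql file path:
path/to/sql_file.sql
The output of the sql file lands in a file like myfile_week_2019-07.txt, and the :copyoff at the end of the variable restores normal terminal output.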

Oracle PL/SQL: How do I filter out whitespaces in SELECT?

I have a table mytable with a column ngram which is a VARCHAR2. I want to SELECT only those rows where ngram does not contain any whitespace (tabs, spaces, EOLs, etc.). What should I replace <COND> below with?
SELECT ngram FROM mytable WHERE <COND>;
Thanks!
You could use regexp_instr (or regexp_like, or other regexp functions), for example:
where regexp_instr(ngram, '[ '|| CHR(10) || CHR(13) || CHR(9) ||']') = 0
the literal space inside '[ ' handles the space character
chr(10) = line feed
chr(13) = carriage return
chr(9) = tab
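If you prefer the regexp_like form mentioned above, an equivalent condition (a sketch using the POSIX whitespace class) is:
where not regexp_like(ngram, '[[:space:]]')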
You can use the CHR and INSTR functions with the ASCII code of the characters you want to filter out. For example, your where clause can look like this for a single special character:
INSTR(ngram, CHR(the ASCII code of the special char)) = 0
or the condition can be a chain of NOT LIKE tests:
where ngram not like '%'||CHR(0)||'%'   -- null
...
and ngram not like '%'||CHR(31)||'%'  -- unit separator
and ngram not like '%'||CHR(127)||'%' -- delete
You can find all the codes here: http://www.theasciicode.com.ar/extended-ascii-code/non-breaking-space-no-break-space-ascii-code-255.html
This should match ngram values that contain no whitespace characters, using the \s shorthand for all whitespace. I only tested it by inserting a TAB into a string in a VARCHAR2 column, which was then excluded:
where regexp_instr(ngram, '\s') = 0;

How to deal with missing values when importing csv to postgres?

I would like to import a csv file which has multiple occurrences of missing values. I recoded them to NULL and tried to import the file as shown below. I suppose that the attributes which include the NULLs are character values, and transforming them to numeric is a bit complicated. Therefore I would like to import the whole table with:
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' DELIMITER ';' CSV WITH NULL AS 'NULL' ';' HEADER
There must be a syntax error. But I tried different combinations and always get:
ERROR: syntax error at or near "WITH NULL"
LINE 1: COPY player_allstar FROM STDIN DELIMITER ';' CSV WITH NULL ...
I also tried:
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' WITH(FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
and get:
ERROR: invalid input syntax for integer: "NULL"
CONTEXT: COPY player_allstar, line 2, column dreb: "NULL"
I suppose it is caused by preprocessing with R. The table came with NAs, so I changed them to:
data[data==NA] <- "NULL"
I'm not aware of a different way of changing them to NULL. I think this produces strings. Is there a different way to preprocess and keep the NAs (as NULLs in postgres, of course)?
Sample:
pts dreb oreb reb asts stl
11 NULL NULL 8 3 NULL
4 5 3 8 2 1
3 NULL NULL 1 1 NULL
data type is integer
Given /tmp/sample.csv:
pts;dreb;oreb;reb;asts;stl
11;NULL;NULL;8;3;NULL
4;5;3;8;2;1
3;NULL;NULL;1;1;NULL
then with a table like:
CREATE TABLE player_allstar (pts integer, dreb integer, oreb integer, reb integer, asts integer, stl integer);
it works for me:
\copy player_allstar FROM '/tmp/sample.csv' WITH (FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
Your syntax is fine; the problem seems to be in the formatting of your data. Using your syntax I was able to load data with NULLs successfully:
mydb=# create table test(a int, b text);
CREATE TABLE
mydb=# \copy test from stdin WITH(FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> col a header;col b header
>> 1;one
>> NULL;NULL
>> 3;NULL
>> NULL;four
>> \.
mydb=# select * from test;
a | b
---+------
1 | one
|
3 |
| four
(4 rows)
mydb=# select * from test where a is null;
a | b
---+------
|
| four
(2 rows)
In your case you can substitute NULL 'NA' in the copy command, if the original value is 'NA'.
You should also make sure that there are no spaces around your data values. For example, if NULL is represented as NA in your data and fields are delimited with semicolons:
1;NA <-- good
1 ; NA <-- bad
1<tab>NA <-- bad
etc.
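On the R side of the question: comparisons with NA yield NA rather than TRUE, so is.na() is the usual way to locate missing values, and the simplest fix is to let the CSV writer emit the NULL marker itself. A sketch, assuming the data frame is still called data and the same semicolon-delimited file:
write.table(data, '/Users/Desktop/Rdaten/Data/player_allstar.csv',
            sep = ';', na = 'NULL', row.names = FALSE, quote = FALSE)
The na argument controls how missing values are written, so the integer columns never have to be converted to strings inside R.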

Truncating leading zeros from strings in postgresql

I'm trying to truncate leading zeros from street addresses. Example:
input
1 06TH ST
12 02ND AVE
123 001St CT
expected output
1 6TH ST
12 2ND AVE
123 1St CT
Here is what I have:
update table
set address = regexp_replace(address,'(0\d+(ST|ND|TH))','?????? need help here')
where address ~ '\s0\d+(ST|ND|TH)\s';
many thanks in advance
Assuming that the address always has some house number (1234, 1a, 33B), followed by one or more spaces, followed by the part where you want to strip leading zeroes...
select substr(address, 1, strpos(address, ' ')) || ltrim(substr(address, strpos(address, ' ')), ' 0') from table;
or, to update the table:
update table set address = substr(address, 1, strpos(address, ' ')) || ltrim(substr(address, strpos(address, ' ')), ' 0');
What you are looking for is a back reference in the regular expression:
UPDATE table
SET address = regexp_replace(address, '\m0+(\d+\w+)', '\1', 'g')
WHERE address ~ '\m0+(\d+\w+)'
Also:
\m matches the beginning of a word (to avoid replacing inside words, e.g. in 101Th)
0+ consumes all the leading zeros (not included in the capturing parentheses)
\d+ captures the remaining digits
\w+ captures the remaining word characters
a word character can be any alphanumeric character, or the underscore _.
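A quick sanity check of the pattern against the sample addresses (a sketch using a values list instead of the real table):
select regexp_replace(addr, '\m0+(\d+\w+)', '\1', 'g')
from (values ('1 06TH ST'), ('12 02ND AVE'), ('123 001St CT')) as t(addr);
This returns the three expected rows: 1 6TH ST, 12 2ND AVE and 123 1St CT.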

How to use Chinese characters in pymysql to create a table?

1. sqlite3
import sqlite3
con=sqlite3.connect("g:\\mytest1.db")
cur=con.cursor()
cur.execute('create table test (上市 TEXT)')
con.commit()
cur.close()
con.close()
I successfully create a test table in mytest1.db, with the Chinese characters "上市" as a field name.
2. In the mysql command console
C:\Users\root>mysql -uroot -p
Welcome to the MySQL monitor. Commands end with ; or \g.
mysql> create database mytest2;
Query OK, 1 row affected (0.00 sec)
mysql> use mytest2;
Database changed
mysql> set names "gb2312";
Query OK, 0 rows affected (0.00 sec)
mysql> create table stock(上市 TEXT) ;
Query OK, 0 rows affected (0.07 sec)
The conclusion: Chinese characters can be used in the mysql console.
3. pymysql
code31
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='******')
cur=con.cursor()
cur.execute("create database if not exists mytest31")
cur.execute("use mytest31")
cur.execute('set names "gb2312" ')
cur.execute('create table stock(上市 TEXT) ')
con.commit()
code32
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='******')
cur=con.cursor()
cur.execute("create database if not exists mytest32")
cur.execute("use mytest32")
cur.execute('set names "gb2312" ')
cur.execute('create table stock(上市 TEXT) ')
con.commit()
The same problem occurs
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 21-22: ordinal not in range(256)
4. mysql-connector-python
code 41
import mysql.connector
config={'host':'127.0.0.1',
'user':'root',
'password':'123456',
'port':3306 ,
'charset':'utf8'
}
con=mysql.connector.connect(**config)
cur=con.cursor()
cur.execute("create database if not exists mytest41")
cur.execute("use mytest41")
cur.execute('set names "gb2312" ')
str='create table stock(上市 TEXT)'
cur.execute(str)
code 42
import mysql.connector
config={'host':'127.0.0.1',
'user':'root',
'password':'******',
'port':3306 ,
'charset':'utf8'
}
con=mysql.connector.connect(**config)
cur=con.cursor()
cur.execute("create database if not exists mytest42")
cur.execute("use mytest42")
cur.execute('set names "gb2312" ')
str='create table stock(上市.encode("utf-8") TEXT)'
cur.execute(str)
The same error as in pymysql:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 22-23: ordinal not in range(256)
It is surely a bug in the Python MySQL modules that Chinese characters cannot be used as field names.
1. Chinese characters can be used as field names in the Python sqlite3 module.
2. Chinese characters can be used as field names in the mysql console, but only after 'set names "gb2312"'.
pymysql.connect() accepts a charset argument. I have tested charset="utf8" and charset="gb2312" and both work (Python 3, PyMySQL 0.6.2). You don't need a "SET NAMES" query in this case.
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306,
user='root', passwd='******',
charset="utf8")
cur = con.cursor()
cur.execute("create database if not exists mytest31")
cur.execute("use mytest31")
cur.execute("create table stock(上市 TEXT)")
con.commit()
You're encoding when you should decode. To convert a Chinese character string to unicode, use:
"上市".decode("GB18030")
This is an encoding generally used for Chinese characters; latin-1 will not work, as most Chinese characters are not within its scope. GB18030 should work, but if not, there are a host of other encodings you can try, like gbk or big5_hkscs (encodings generally used in HK/China).
Unicode errors are easy to spot, they show up as u'\ufffd' (which when encoded will be a diamond with a question mark in the middle).
I hope this was helpful!
Edit: I'm somewhat confused by your comment.
>>> print type("上市")
<type 'str'>
>>> print type("上市".decode("GB18030"))
<type 'unicode'>
str.decode() returns unicode.
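For reference, the same check in Python 3, where str is already Unicode and only bytes has a decode() method:
>>> type("上市")
<class 'str'>
>>> type("上市".encode("GB18030"))
<class 'bytes'>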