DB2 CLI result output - db2

When running command-line queries in MySQL you can optionally use '\G' as a statement terminator; instead of listing the result set columns horizontally across the screen, it lists each column vertically, with the corresponding data to the right. Is there a way to do the same or a similar thing with the DB2 command line utility?
Example regular MySQL result:
mysql> select * from tagmap limit 2;
+----+---------+--------+
| id | blog_id | tag_id |
+----+---------+--------+
| 16 |       8 |      1 |
| 17 |       8 |      4 |
+----+---------+--------+
Example Alternate MySQL result:
mysql> select * from tagmap limit 2\G
*************************** 1. row ***************************
id: 16
blog_id: 8
tag_id: 1
*************************** 2. row ***************************
id: 17
blog_id: 8
tag_id: 4
2 rows in set (0.00 sec)
Obviously, this is much more useful when the columns are large strings, or when there are many columns in a result set, but this demonstrates the formatting better than I can probably explain it.

I don't think such an option is available with the DB2 command line client. See http://www.dbforums.com/showthread.php?t=708079 for some suggestions. For a more general set of information about the DB2 command line client you might check out the IBM DeveloperWorks article DB2's Command Line Processor and Scripting.
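If the rows are fetched in a script rather than through the CLP, the vertical layout can be produced client-side. Below is a minimal sketch, assuming the column names and rows are already available as plain Python lists (for example from any DB-API driver). The function name format_vertical is hypothetical, and the layout is only loosely modeled on the MySQL output shown in the question; it is not a DB2 feature.

```python
def format_vertical(columns, rows):
    """Render each row as one 'column: value' line per column.

    Assumes at least one column; names are right-aligned to the longest one.
    """
    width = max(len(name) for name in columns)
    lines = []
    for number, row in enumerate(rows, start=1):
        # Separator line announcing the row number
        lines.append(f"{'*' * 27} {number}. row {'*' * 27}")
        for name, value in zip(columns, row):
            lines.append(f"{name.rjust(width)}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample data taken from the tagmap example in the question
    print(format_vertical(["id", "blog_id", "tag_id"], [(16, 8, 1), (17, 8, 4)]))
```

This is mostly useful for wide rows or long string columns, where the tabular CLP output wraps badly.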

A little late, but I found this post when I searched for an option to retrieve only the selected data.
db2 -x <query> returns only the result, without the column headings. More options can be found here: https://www.ibm.com/docs/en/db2/11.1?topic=clp-options
Example:
[db2inst1@a21c-db2 db2]$ db2 -n select postschemaver from files.product
POSTSCHEMAVER
--------------------------------
147.3
1 record(s) selected.
[db2inst1@a21c-db2 db2]$ db2 -x select postschemaver from files.product
147.3

The DB2 command line utility always displays data in tabular format, i.e. rows horizontally and columns vertically. It does not support any other format, the way the \G statement terminator does for MySQL. You can, however, store column-organized data in DB2 tables when DB2_WORKLOAD=ANALYTICS is set. Note that this changes how the data is stored, not how it is displayed:
db2 => connect to coldb
Database Connection Information
Database server = DB2/LINUXX8664 10.5.5
SQL authorization ID = BIMALJHA
Local database alias = COLDB
db2 => create table testtable (c1 int, c2 varchar(10)) organize by column
DB20000I The SQL command completed successfully.
db2 => insert into testtable values (2, 'bimal'),(3, 'kumar')
DB20000I The SQL command completed successfully.
db2 => select * from testtable
C1 C2
----------- ----------
2 bimal
3 kumar
2 record(s) selected.
db2 => terminate
DB20000I The TERMINATE command completed successfully.

Related

DB2 Scheduled Trigger

I'm new to triggers and I want to ask the proper procedure to create a trigger (or any better methods) to duplicate the contents of T4 table to T5 table on a specified datetime.
For example, on the 1st day of every month at 23:00, I want to duplicate the contents of T4 table to T5 table.
Can anyone please advise what's the best method?
Thank you.
CREATE TRIGGER TRIG1
AFTER INSERT ON T4
REFERENCING NEW AS NEW
FOR EACH ROW
BEGIN
INSERT INTO T5 VALUES (:NEW.B, :NEW.A);
END TRIG1;
This can be done with the Administrative Task Scheduler (ATS) feature instead of cron. Here is a sample script:
#!/bin/sh
db2set DB2_ATS_ENABLE=YES
db2stop
db2start
db2 -v "drop db db1"
db2 -v "create db db1"
db2 -v "connect to db1"
db2 -v "CREATE TABLESPACE SYSTOOLSPACE IN IBMCATGROUP MANAGED BY AUTOMATIC STORAGE EXTENTSIZE 4"
db2 -v "create table s1.t4 (c1 int)"
db2 -v "create table s1.t5 (c1 int)"
db2 -v "insert into s1.t4 values (1)"
db2 -v "create procedure s1.copy_t4_t5() language SQL begin insert into s1.t5 select * from s1.t4; end"
db2 -v "CALL SYSPROC.ADMIN_TASK_ADD ('ATS1', CURRENT_TIMESTAMP, NULL, NULL, '0,10,20,30,40,50 * * * *', 'S1', 'COPY_T4_T5',NULL , NULL, NULL )"
date
This creates a task called 'ATS1' that calls the procedure s1.copy_t4_t5 every 10 minutes, such as at 01:00, 01:10, 01:20. You may need to run the following after executing the script:
db2 -v "connect to db1"
Then, after some time, run the following to see if the t5 table has rows as expected:
db2 -v "select * from s1.t5"
For your case, the 5th parameter would be replaced with '0 23 1 * *'.
The fields represent 'minute hour day_of_month month weekday', so the procedure will be called on the 1st day of every month at 23:00.
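To sanity-check a schedule string before passing it to ADMIN_TASK_ADD, the five fields can be matched against a timestamp by hand. This is a minimal sketch, not how DB2 evaluates schedules: the function names are hypothetical, and only '*' and comma-separated integers are handled (no ranges or steps, and Sunday is only recognized as 0).

```python
from datetime import datetime


def field_matches(field, value):
    """Return True if one cron field ('*' or comma-separated integers) matches value."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(expression, moment):
    """Check a 'minute hour day_of_month month weekday' expression against a datetime."""
    minute, hour, day, month, weekday = expression.split()
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        # cron counts Sunday as 0, while isoweekday() counts it as 7
        and field_matches(weekday, moment.isoweekday() % 7)
    )


if __name__ == "__main__":
    # 1st of the month at 23:00 matches; the 2nd at the same time does not
    print(cron_matches("0 23 1 * *", datetime(2024, 3, 1, 23, 0)))
    print(cron_matches("0 23 1 * *", datetime(2024, 3, 2, 23, 0)))
```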
For more information on how to modify an existing task, delete a task, or review its status, see:
Administrative Task Scheduler routines and views
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.sql.rtn.doc/doc/c0061223.html
Also, here is a good article about it:
[DB2 LUW] Sample administrative task scheduler ADMIN_TASK_ADD and ADMIN_TASK_REMOVE usage
https://www.ibm.com/support/pages/node/1140388?lang=en
Hope this helps.

How to upload data into a Redshift Table with a Date Format 'MMDDYYYY'

I need to upload data with dates in the format 'MMDDYYYY'.
This is the code I am currently using to send it via psql:
SET BaseFolder=C:\
psql -h hostname -d database -c "\copy test_table(id_test,
colum_test,columndate DATEFORMAT 'MMDDYYYY')
from '%BaseFolder%\test_table.csv' with delimiter ',' CSV HEADER;"
here test_table is the table in the postgres DB
Id_test: float8
Column_test: float8
columndate: timestamp
id_test colum_test colum_date
94 0.3306 12312017
16 0.3039 12312017
25 0.5377 12312017
88 0.6461 12312017
I am getting the following error when I run the above command in CMD on Windows 10:
ERROR: date/time field value out of range: "12312017"
HINT: Perhaps you need a different "datestyle" setting.
CONTEXT: COPY test_table, line 1, column columndate : "12312017"
The DATEFORMAT applies to the whole COPY command, not a single field.
I got it to work as follows...
Your COPY command suggests that the data is comma-separated, so I used this input data and stored it in an Amazon S3 bucket:
id_test,colum_test,colum_date
94,0.3306,12312017
16,0.3039,12312017
25,0.5377,12312017
88,0.6461,12312017
I created a table:
CREATE TABLE foo (
foo_id BIGINT,
foo_value DECIMAL(4,4),
foo_date DATE
)
Then loaded the data:
COPY foo (foo_id, foo_value, foo_date)
FROM 's3://my-bucket/foo.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/Redshift-Role'
CSV
IGNOREHEADER 1
DATEFORMAT 'MMDDYYYY'
Please note that the recommended way to load data into Amazon Redshift is from files stored in Amazon S3. (I haven't tried using the native psql copy command with Redshift, and would recommend against it, particularly for large data files. You certainly can't mix options from the Redshift COPY command into the psql \copy command.)
Then, I ran SELECT * FROM foo and it returned:
16 0.3039 2017-12-31
88 0.6461 2017-12-31
94 0.3306 2017-12-31
25 0.5377 2017-12-31
That is a horrible format for dates. Don't break your date type, convert your data to a saner format.
=> select to_date('12312017', 'MMDDYYYY');
to_date
------------
2017-12-31
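If you would rather fix the file before loading it, the date column can be rewritten to ISO format in a small preprocessing step. This is a minimal sketch, assuming a comma-separated file with a header row; the function names are hypothetical, and the column name colum_date is taken from the sample data in the question.

```python
import csv
import io
from datetime import datetime


def mmddyyyy_to_iso(text):
    """Convert an 'MMDDYYYY' string such as '12312017' to ISO 'YYYY-MM-DD'."""
    return datetime.strptime(text, "%m%d%Y").date().isoformat()


def convert_csv(text, date_column):
    """Return the CSV text with date_column rewritten from MMDDYYYY to ISO dates."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[date_column] = mmddyyyy_to_iso(row[date_column])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "id_test,colum_test,colum_date\n94,0.3306,12312017\n16,0.3039,12312017\n"
    print(convert_csv(sample, "colum_date"))
```

With ISO dates in the file, the load no longer depends on a date format option or on the datestyle setting.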

PSQL - Copy from csv file if column item does not exist

I am trying to import a csv file into the postgres table where I can successfully do so using COPY FROM:
import.sql
\copy myTable FROM '..\CSV_OUTPUT.csv' DELIMITER ',' CSV HEADER;
But that query only succeeds if none of the rows is already in the database; otherwise it exits with the error: Key (id)=(#) already exists.
myTable
id | alias | address
------+-------------+---------------
11 | red_foo | 10.1.1.11
12 | blue_foo | 10.1.1.12
CSV_OUTPUT.csv
id | alias | address
------+-------------+---------------
10 | black_foo | 10.1.1.11
12 | blue_foo | 10.1.1.12
13 | grey_foo | 10.1.1.13
14 | pink_foo | 10.1.1.14
My desired output is to insert the rows from the csv file into postgresql only if the address does not exist yet. After the import, myTable should contain grey_foo and pink_foo, but not black_foo, since its address already exists.
What should be the right queries to use in order to achieve this? Your suggestions and ideas are highly appreciated.
Copy the data into a staging table first, and then update your main table (myTable) with only the rows with the keys that don't already exist. For example, assuming you have imported the data into a table named staging:
with nw as (
select s.id, s.alias, s.address
from staging as s
left join mytable as m on m.address=s.address
where m.address is null
)
insert into mytable
(id, alias, address)
select id, alias, address
from nw;
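If you can upgrade to Postgres 9.5 or later, you could instead use an INSERT command with the ON CONFLICT DO NOTHING clause. Note that this only skips rows that violate a unique constraint or index, so skipping on address would require a unique index on that column.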
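The staging-table approach can be tried end to end without a Postgres server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in (an assumption made for the demo, not what the question uses), loads the sample rows from the question, and runs the same anti-join insert.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, alias TEXT, address TEXT);
    CREATE TABLE staging (id INTEGER, alias TEXT, address TEXT);
    INSERT INTO mytable VALUES
        (11, 'red_foo', '10.1.1.11'),
        (12, 'blue_foo', '10.1.1.12');
    INSERT INTO staging VALUES
        (10, 'black_foo', '10.1.1.11'),
        (12, 'blue_foo', '10.1.1.12'),
        (13, 'grey_foo', '10.1.1.13'),
        (14, 'pink_foo', '10.1.1.14');
""")

# Anti-join: keep only staging rows whose address has no match in mytable
conn.execute("""
    INSERT INTO mytable (id, alias, address)
    SELECT s.id, s.alias, s.address
    FROM staging AS s
    LEFT JOIN mytable AS m ON m.address = s.address
    WHERE m.address IS NULL
""")
conn.commit()

aliases = [row[0] for row in conn.execute("SELECT alias FROM mytable ORDER BY id")]
print(aliases)
```

black_foo is skipped because its address 10.1.1.11 already belongs to red_foo, and blue_foo is skipped because it is already present.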

Invalid column name in DB2

I'm having trouble with the column name of one of my tables.
My version of DB2 is DB2/LINUXX8664 11.1.0. I'm running it on a CentOS Linux Release 7.2.1511. My version of IBM Data Studio is 4.1.2.
The column is named "NRO_AÑO" in the table "PERIODO" in the schema "COMPRAS".
When I execute the simple query
SELECT NRO_AÑO
FROM COMPRAS.PERIODO
it yields the following error:
"NRO_AÑO" is not valid in the context where it is used.. SQLCODE=-206, SQLSTATE=42703, DRIVER=3.68.61
If I execute the query
SELECT *
FROM COMPRAS.PERIODO
it yields data with the following columns
I'm guessing it has something to do with the charsets involved, but I'm not sure where to look.
Thanks in advance.
It worked for me:
[db2inst1@server ~]$ db2 "create table compras.periodo (nro_año int)"
DB20000I The SQL command completed successfully.
[db2inst1@server ~]$ db2 "insert into compras.periodo values (1)"
DB20000I The SQL command completed successfully.
[db2inst1@server ~]$ db2 "insert into compras.periodo (nro_año) values (2)"
DB20000I The SQL command completed successfully.
[db2inst1@server ~]$ db2 "select nro_año from compras.periodo"
NRO_AÑO
-----------
1
2
2 record(s) selected.
Probably, you are having a console encoding problem (putty), and you should review how the name of the column in the database is stored:
db2 "select colname from syscat.columns where tabname = 'PERIODO'"
COLNAME
--------------------------------------------------------------------------------------------------------------------------------
NRO_AÑO
1 record(s) selected.
If you create the table from PuTTY (an SSH client) and then select from Data Studio, characters with codes of 128 and above may have different representations. Java (Data Studio) uses UTF-8, but the script used to create the table probably used another encoding (PuTTY, Windows, Notepad, etc.), and this is causing the problem in the database.
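To see why the encoding matters for this particular column name, compare the bytes the two encodings produce. This is a small illustration only; Latin-1 is an assumed example of "another encoding", since the actual encoding of the console or script is not known.

```python
name = "NRO_AÑO"

# The same name yields different byte sequences depending on the encoding
utf8_bytes = name.encode("utf-8")      # Ñ becomes two bytes
latin1_bytes = name.encode("latin-1")  # Ñ becomes one byte
print(utf8_bytes, len(utf8_bytes))
print(latin1_bytes, len(latin1_bytes))

# Bytes written as Latin-1 but interpreted as UTF-8 no longer give the same name
misread = latin1_bytes.decode("utf-8", errors="replace")
print(misread == name)
```

If the catalog holds one byte sequence and the client sends the other, the identifiers do not compare equal, which would be consistent with the SQLCODE=-206 error.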
It worked for me when I ran the script from the DB2 command line processor on DB2 9.7.
db2 => CREATE TABLE TEMP_TABLE(NRO_AÑO INTEGER)
DB20000I The SQL command completed successfully.
db2 => INSERT INTO TEMP_TABLE(NRO_AÑO) VALUES(1)
DB20000I The SQL command completed successfully.
db2 => SELECT * FROM TEMP_TABLE
NRO_AÑO
-----------
1
1 record(s) selected.
db2 => select colname from syscat.columns where tabname = 'TEMP_TABLE'
COLNAME
------------
NRO_AÑO
1 record(s) selected.
Your issue may also be that column names need to be enclosed in double quotes, as I found in IBM Data Studio version 4. Example:
INSERT INTO DB2ADMIN.FB_WEB_POSTS ("UserName","FaceID", "FaceURL","FaceStory","FaceMessage","FaceDate","FaceStamp")
VALUES ('SocialMate','233555900032117_912837012103999', 'http://localhost/doculogs.nsf/index.html', 'Some Message or Story','Random Files Project for Lotus Notes, Google, Oracle App samples', '2017-09-09', '2017-09.23');

PostgreSQL write amplification

I'm trying to find out how much stress PostgreSQL puts on disks, and the results are kind of discouraging so far. Please take a look at my methodology; apparently I'm missing something or calculating the numbers in a wrong way.
Environment
PostgreSQL 9.6.0-1.pgdg16.04+1 is running inside a separate LXC container with Ubuntu 16.04.1 LTS (kernel version 4.4.0-38-generic, ext4 filesystem on top of SSD), has only one client connection from which I run tests.
I disabled autovacuum to prevent unnecessary writes.
The written bytes are calculated with the following command; I want to find the total number of bytes written by all PostgreSQL processes (including the WAL writer):
pgrep postgres | xargs -I {} cat /proc/{}/io | grep ^write_bytes | cut -d' ' -f2 | python -c "import sys; print(sum(int(l) for l in sys.stdin))"
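The same measurement can be written as a short script, which makes it easier to take a reading before and after each command. This is a minimal sketch for Linux, assuming the caller supplies the PIDs (for example from pgrep postgres) and has permission to read /proc/<pid>/io; the function names are hypothetical.

```python
from pathlib import Path


def parse_write_bytes(io_text):
    """Extract the write_bytes counter from the contents of a /proc/<pid>/io file."""
    for line in io_text.splitlines():
        key, _, value = line.partition(":")
        # Exact key match, so cancelled_write_bytes is not picked up
        if key == "write_bytes":
            return int(value)
    return 0


def total_write_bytes(pids, proc_root="/proc"):
    """Sum write_bytes over the given PIDs, skipping processes that cannot be read."""
    total = 0
    for pid in pids:
        try:
            total += parse_write_bytes(Path(proc_root, str(pid), "io").read_text())
        except OSError:
            # The process exited or its io file is not readable
            continue
    return total
```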
Tests
Lines starting with # are database commands; lines starting with → show the write_bytes sum after that command. The test case is simple: a table with just one int4 column filled with 10000000 values.
Before every test I run a set of commands to free disk space and prevent additional writes:
# DELETE FROM test_inserts;
# VACUUM FULL test_inserts;
# DROP TABLE test_inserts;
Test #1: Unlogged table
As the documentation states, changes in an UNLOGGED table are not written to the WAL, so it's a good point to start:
# CREATE UNLOGGED TABLE test_inserts (f1 INT);
→ 1526276096
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 1902977024
The difference is 376700928 bytes (~359MB), which sort of makes sense (ten million 4-byte integers plus row, page and other overhead), but still looks like a bit too much, almost 10x the actual data size (~40MB).
Test #2: Unlogged table with primary key
# CREATE UNLOGGED TABLE test_inserts (f1 INT PRIMARY KEY);
→ 2379882496
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 2967339008
The difference is 587456512 bytes (~560MB).
Test #3: regular table
# CREATE TABLE test_inserts (f1 INT);
→ 6460669952
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 7603630080
There the difference is already 1142960128 bytes (~1090MB).
Test #4: regular table with primary key
# CREATE TABLE test_inserts (f1 INT PRIMARY KEY);
→ 12740534272
# INSERT INTO test_inserts SELECT generate_series(1, 10000000);
→ 14895218688
Now the difference is 2154684416 bytes (~2054MB), and after about 30 seconds an additional 100MB was written.
For this test case I made a breakdown by processes:
Process | Bytes written
/usr/lib/postgresql/9.6/bin/postgres | 0
\_ postgres: 9.6/main: checkpointer process | 99270656
\_ postgres: 9.6/main: writer process | 39133184
\_ postgres: 9.6/main: wal writer process | 186474496
\_ postgres: 9.6/main: stats collector process | 0
\_ postgres: 9.6/main: postgres testdb [local] idle | 1844658176
Any ideas, suggestions on how to measure values I'm looking for correctly? Maybe it's a kernel bug? Or PostgreSQL really does so many writes?
Edit: To double check what write_bytes means I wrote a simple python script that proved, that this value is the actual written bytes value.
Edit 2: For PostgreSQL 9.5 Test case #1 showed 362577920 bytes, test #4 showed 2141343744 bytes, so it's not about PG version.
Edit 3: Richard Huxton mentioned the Database Page Layout article and I'd like to elaborate. I agree with the storage cost: 24 bytes of row header, 4 bytes of data itself and 4 more bytes for data alignment (usually to 8 bytes) give 32 bytes per row. With that number of rows it's about 320MB per table, which is roughly what I got in test #1.
I could assume that the primary key in that case should be about the same size as the data, and that in test #4 both the data and the PK would also be written to WAL. That gives something like 360MB x 4 = 1.4GB, which is less than the result I got.
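The arithmetic behind these estimates can be laid out next to the measured numbers. This is only a sketch of the calculation: the per-row sizes are the ones assumed in Edit 3, and the before/after counters are copied from the four tests above.

```python
ROWS = 10_000_000
ROW_HEADER = 24  # bytes, as assumed in Edit 3
INT4 = 4         # bytes of actual data per row
PADDING = 4      # alignment bytes, as assumed in Edit 3

estimated_table_bytes = ROWS * (ROW_HEADER + INT4 + PADDING)

# (before, after) write_bytes totals reported for each test
measurements = {
    "1: unlogged": (1526276096, 1902977024),
    "2: unlogged + PK": (2379882496, 2967339008),
    "3: regular": (6460669952, 7603630080),
    "4: regular + PK": (12740534272, 14895218688),
}
deltas = {name: after - before for name, (before, after) in measurements.items()}

for name, delta in deltas.items():
    ratio = delta / estimated_table_bytes
    print(f"test {name}: {delta} bytes written, {ratio:.2f}x the estimated table size")
```

Expressing each test as a multiple of the estimated table size makes it easier to see how much of the total is left unexplained once the data, index and WAL copies are accounted for.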