Outputting results from multiple SQL queries in PostgreSQL

I have postgresql-9.2 installed on my local machine (running windows 7) and I am also the administrator. I am using the Query Tool of pgAdmin III to query my database. My problem is as follows:
Say I have two tables Table_A and Table_B with different numbers of columns. Also, say I have the following two very simple queries:
select * from Table_A;
select * from Table_B;
I want to run both these queries and see the output from both of them together. I don't mind whether I see the output in the GUI or in a file.
I also tried the COPY command, outputting to a CSV. But instead of appending to the file it overwrites it, so I always end up with the results from query 2 only. The same thing happens with the GUI.
It is really annoying to comment out one query, run the other, output to two different files and then merge those two files together.

This is not currently supported by PostgreSQL - from the docs
(http://www.postgresql.org/docs/9.4/interactive/libpq-exec.html):
The command string can include multiple SQL commands (separated by semicolons). Multiple queries sent in a single PQexec call are processed in a single transaction, unless there are explicit BEGIN/COMMIT commands included in the query string to divide it into multiple transactions. Note however that the returned PGresult structure describes only the result of the last command executed from the string. Should one of the commands fail, processing of the string stops with it and the returned PGresult describes the error condition.

Your problem does not depend on the client.
Assuming all columns to be of type text, try this query:
SELECT col_a AS col_ac, col_b AS col_bd
,NULL::text AS col_e, NULL::text AS col_f
FROM table_a
UNION ALL
SELECT col_c, col_d, col_e, col_f
FROM table_b;
Column names and data types are defined by the first branch of a UNION query. The rest has to fall in line.
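If the columns are not all of type text, cast them so each branch lines up; a minimal sketch of the same idea:
SELECT col_a::text AS col_ac, col_b::text AS col_bd
      ,NULL::text AS col_e, NULL::text AS col_f
FROM   table_a
UNION ALL
SELECT col_c::text, col_d::text, col_e::text, col_f::text
FROM   table_b;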

The PSQL tool, in the top menu under Tools (pgAdmin 4), gives the results of multiple queries, unlike the Query Tool. In the psql command line tool, you can enter two or more queries separated by semicolons and you'll get the results of each query displayed. The downside is that this is a command line tool, so the results are not ideal if you have a lot of data. I use this when I have a lot of updates to string together and I want to see the number of rows updated by each; it would also work well for select queries with small results.
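For example, in psql you can redirect all query output to a single file with the \o meta-command, so both result sets land together (a sketch, run inside psql; a bare \o closes the file again):
\o both_results.txt
SELECT * FROM table_a;
SELECT * FROM table_b;
\o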

You can use UNION ALL, but you need to make sure each subquery has the same number of columns.
SELECT 'a', 'b'
UNION ALL
SELECT 'c' ;
won't work.
SELECT 'a', 'b'
UNION ALL
SELECT 'c', 'd'
will work.
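If one branch naturally has fewer columns, pad it with NULLs instead (the same trick as the UNION ALL answer above):
SELECT 'a', 'b'
UNION ALL
SELECT 'c', NULL;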

Related

Header not coming while exporting Db2 select query results to CSV

I'm trying to execute the below query to export the results to CSV. I'm able to export the data to CSV, but the headers are missing in the file. Is there any way we can achieve this? I'm executing the file in the form "db2 -tvmf D:\Db.sql".
connect to ****** user ***** using ******
export to "D:\Vikas.csv" OF DEL MESSAGES
select
'ROW_NUM',
'DETAIL_TYPE_CD',
'ADMIN_FEES_TICKET',
'ADMINISTRATIVE_FEES',
'BASE_RENT',
'CITATIONS',
'COLLECTION_REPO_FEES',
'DESC',
'EFFECTIVE_DATE',
'LATE_CHARGE',
'MISC_FEE',
'STATUS_CD',
'ROW_ID',
'ROW_ID',
'BUILD',
'REVERSE_FLG',
'NSF_FLG',
'PR_CON_ID',
'PROC_DATE',
'PROPERTY_TAX',
'REGISTRATION_FEES',
'REPAIR_FEES',
'SALES_TAX',
'TERMINATION_FEES',
'TOTAL_TRANS',
'TRANSACTION_TYPE'
from sysibm.sysdummy1
UNION ALL (select
T1.ROW_NUM,
T5.DETAIL_TYPE_CD,
T1.ADMIN_FEES_TICKET,
T1.ADMINISTRATIVE_FEES,
T1.BASE_RENT,
T1.CITATIONS,
T1.COLLECTION_REPO_FEES,
T1.DESC,
T1.EFFECTIVE_DATE,
T1.LATE_CHARGE,
T1.MISC_FEE,
T2.STATUS_CD,
T4.ROW_ID,
T3.ROW_ID,
T2.BUILD,
T1.REVERSE_FLG,
T1.NSF_FLG,
T2.PR_CON_ID,
T1.PROC_DATE,
T1.PROPERTY_TAX,
T1.REGISTRATION_FEES,
T1.REPAIR_FEES,
T1.SALES_TAX,
T1.TERMINATION_FEES,
T1.TOTAL_TRANS,
T1.TRANSACTION_TYPE
FROM
SIEBEL.LSE_INPHIST_VIEW T1
LEFT OUTER JOIN SIEBEL.S_ASSET T2 ON T1.ACCOUNT_NUM = T2.ASSET_NUM
LEFT OUTER JOIN SIEBEL.S_ASSET_CON T3 ON T2.ROW_ID = T3.ASSET_ID AND
T3.RELATION_TYPE_CD = 'Obligor'
LEFT OUTER JOIN SIEBEL.S_ASSETCON_ADDR T4 ON T3.ROW_ID =
T4.ASSET_CON_ID AND T4.USE_TYPE_CD = 'Bill To'
LEFT OUTER JOIN SIEBEL.S_PROD_INT T5 ON T2.PROD_ID = T5.ROW_ID
WHERE
(T1.ACNT_ID = '01003501435'))
ORDER BY
T1.ACNT_ID DESC,T1.PROC_DATE DESC WITH UR
I have included the updated query now in the post.
The Db2-LUW export command lacks the ability to add column headers to the output file. It only exports whatever is in the SELECT statement.
So when you want to have column-headers in the CSV file you have different options.
One way to do it (when there is no ORDER BY) is to make the SELECT statement into a UNION of two queries: the first query returns one row which is the list of column names; then union this with your real query. It means you must hand-craft the column names of the first query to match the real second query. In your case, for example, it might look like:
SELECT 'row_num', 'detail_type_cd', ....
from sysibm.sysdummy1
UNION
SELECT t1.ROW_NUM, T5.DETAIL_TYPE_CD, ...
(You have to make the column names manually, put them in single quotes, etc. But if you want Db2 to work out the column names for you, you can use a catalog-based technique like the sketch below.)
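A minimal sketch of that catalog-based approach, assuming Db2-LUW (SYSCAT.COLUMNS holds the column metadata; adjust the schema and table names to yours):
-- builds the quoted, comma-separated header list for the first branch
SELECT LISTAGG('''' || colname || '''', ', ') WITHIN GROUP (ORDER BY colno)
FROM syscat.columns
WHERE tabschema = 'SIEBEL' AND tabname = 'LSE_INPHIST_VIEW';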
If you have an order by you can run two separate export commands (i.e. no union) outputting to two separate output files, and then use operating system functions to concatenate the output files like this:
export to headers.csv select 'colname1','colname2'...from sysibm.sysdummy1;
export to data.csv select ...
-- for MS-windows
!copy /a headers.csv + data.csv data_with_headers.csv ;
Another (possibly simpler) way to do it, with v11.5 (and higher) versions of Db2-LUW, is to not use the export command, but instead to create an external table, which lets you specify the option INCLUDEHEADER ON among many other options for CSV files. You can search this site for examples, and reference the documentation.
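A minimal sketch of that external-table unload, assuming the v11.5 CREATE EXTERNAL TABLE syntax (check the full option list in the documentation):
CREATE EXTERNAL TABLE '/tmp/vikas.csv'
USING (DELIMITER ',' INCLUDEHEADER ON)
AS SELECT row_num, detail_type_cd, proc_date
FROM siebel.lse_inphist_view;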

Convert T-SQL Cross Apply to Redshift

I am converting the following T-SQL statement to Redshift. The purpose of the query is to convert a column in the table with a value containing a comma delimited string with up to 60 values into multiple rows with 1 value per row.
SELECT
id_1
, id_2
, value
into dbo.myResultsTable
FROM myTable
CROSS APPLY STRING_SPLIT([comma_delimited_string], ',')
WHERE [comma_delimited_string] is not null;
In SQL Server this processes 10 million records in just under 1 hour, which is fine for my purposes. Obviously a direct conversion to Redshift isn't possible, since Redshift has no CROSS APPLY or STRING_SPLIT functionality, so I built a solution using the process detailed here (Redshift. Convert comma delimited values into rows), which uses split_part() to split the comma delimited string into multiple columns, plus another query that unions everything to get the final output into a single column. But a typical run takes over 6 hours to process the same amount of data.
I wasn't expecting to run into this issue, given the power difference between the machines. The SQL Server I was using for the comparison test was a simple server with 12 processors and 32 GB of RAM, while the Redshift cluster is based on dc1.8xlarge nodes (I don't know the total count). The instance is shared with other teams, but when I look at the performance information there are plenty of available resources.
I'm relatively new to Redshift, so I'm still assuming I'm not understanding something, but I have no idea what I'm missing. Are there things I need to check to make sure the data is loaded in an optimal way (I'm not an admin, so my ability to check this is limited)? Are there other Redshift query options better than the example I found? I've searched for other methods and optimizations, but short of looking into cross joins, something I'd like to avoid (plus, when I tried to talk to the DBAs running the Redshift cluster about that option, their response was a flat "No, can't do that."), I'm not even sure where to go at this point, so any help would be much appreciated!
Thanks!
I've found a solution that works for me.
You need to do a JOIN on a numbers table, for which you can take any table as long as it has more rows than the number of fields you need. Make sure the numbers are int by forcing the type. Using the function regexp_count on the column to be split in the ON condition, to count the number of fields (delimiters + 1), generates a row per repetition.
Then you use the split_part function on the column, with numbers.num as the field index, to extract a different part of the text for each of the rows.
SELECT comma_delimited_string,
       numbers.num,
       REGEXP_COUNT(comma_delimited_string, ',') + 1 AS nfields,
       SPLIT_PART(comma_delimited_string, ',', numbers.num) AS field
FROM mytable
JOIN (
      SELECT (ROW_NUMBER() OVER (ORDER BY 1))::int AS num
      FROM mytable
      LIMIT 15 -- max number of fields
     ) AS numbers
  ON numbers.num <= REGEXP_COUNT(comma_delimited_string, ',') + 1
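To mirror the original SELECT ... INTO and materialize the result, the same pattern can feed a CREATE TABLE AS; a sketch, with myresultstable as a name chosen here and the limit raised to the 60 fields from the question:
CREATE TABLE myresultstable AS
SELECT id_1, id_2,
       SPLIT_PART(comma_delimited_string, ',', numbers.num) AS value
FROM mytable
JOIN (
      SELECT (ROW_NUMBER() OVER (ORDER BY 1))::int AS num
      FROM mytable
      LIMIT 60 -- max number of fields
     ) AS numbers
  ON numbers.num <= REGEXP_COUNT(comma_delimited_string, ',') + 1
WHERE comma_delimited_string IS NOT NULL;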

Are INTO, FROM and JOIN the only ways to get a table?

I'm currently writing a script which will allow me to input a file (generally .sql) and generate a list of every table that's used in that file. The process is simple: it opens the input file, checks each line for a substring, and if that substring exists, outputs the line to the screen.
The substrings being checked are T-SQL keywords that indicate a table is being referenced, such as INTO, FROM and JOIN. Not being a T-SQL wizard, those three keywords are the only ones I know of that are used to select a table in a query.
So my question is: in T-SQL, are INTO, FROM and JOIN the only ways to get a table? Or are there others?
There are many ways to get a table; here are some of them (a sketch follows the list):
DELETE
FROM
INTO
JOIN
MERGE
OBJECT_ID (N'dbo.mytable', N'U') where U is the object type for table.
TABLE, e.g. ALTER TABLE, TRUNCATE TABLE, DROP TABLE
UPDATE
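For instance, MERGE and OBJECT_ID both reference a table with none of those three keywords nearby (a hedged T-SQL sketch; dbo.staging is a hypothetical source table):
-- MERGE names its target table right after the keyword
MERGE dbo.mytable AS target
USING dbo.staging AS source ON target.id = source.id
WHEN MATCHED THEN UPDATE SET target.val = source.val;
-- OBJECT_ID references a table inside a function call
IF OBJECT_ID(N'dbo.mytable', N'U') IS NOT NULL
    DROP TABLE dbo.mytable;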
However, your script will match not only real tables, but possibly also views and common table expressions. Here are two examples:
-- Example 1
SELECT *
FROM dbo.myview
-- Example 2
WITH tmptable AS
(
SELECT *
FROM mytable
)
SELECT *
FROM tmptable

DB2 to Netezza Migration

I have one query in DB2, shown below. What would be the syntax for the same in Netezza?
select distinct acct_num from GTD_demo_dim where ACCT_NUM fetch first 1 rows only);
First, I don't think your statement is valid.
select distinct acct_num from GTD_demo_dim where ACCT_NUM fetch first 1 rows only);
The where clause needs to be finished and you've used a closing parenthesis without an opening one.
fetch first is common (indeed standard SQL) syntax, so it's very likely that this will work. However, the usual way to do this in Netezza is using LIMIT. All that said, this is how I'd write the query and expect the intended result (omitting your WHERE since I can't infer the intent):
select distinct acct_num from gtd_demo_dim limit 1;
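Both forms side by side (per the above, fetch first will very likely also work in Netezza, but limit is the idiomatic choice):
select distinct acct_num from gtd_demo_dim fetch first 1 rows only;
select distinct acct_num from gtd_demo_dim limit 1;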

Cannot sort a row of size 8130, which is greater than the allowable maximum of 8094

SELECT DISTINCT tblJobReq.JobReqId
, tblJobReq.JobStatusId
, tblJobClass.JobClassId
, tblJobClass.Title
, tblJobReq.JobClassSubTitle
, tblJobAnnouncement.JobClassDesc
, tblJobAnnouncement.EndDate
, tblJobAnnouncement.AgencyMktgVerbage
, tblJobAnnouncement.SpecInfo
, tblJobAnnouncement.Benefits
, tblSalary.MinRateSal
, tblSalary.MaxRateSal
, tblSalary.MinRateHour
, tblSalary.MaxRateHour
, tblJobClass.StatementEval
, tblJobReq.ApprovalDate
, tblJobReq.RecruiterId
, tblJobReq.AgencyId
FROM ((tblJobReq
LEFT JOIN tblJobAnnouncement ON tblJobReq.JobReqId = tblJobAnnouncement.JobReqId)
INNER JOIN tblJobClass ON tblJobReq.JobClassId = tblJobClass.JobClassId)
LEFT JOIN tblSalary ON tblJobClass.SalaryCode = tblSalary.SalaryCode
WHERE (tblJobReq.JobClassId in (SELECT JobClassId
from tblJobClass
WHERE tblJobClass.Title like '%Family Therapist%'))
When I try to execute the query, it results in the following error:
Cannot sort a row of size 8130, which is greater than the allowable maximum of 8094
I checked and didn't find any solution. The only way I found is to truncate (substring()) the "tblJobAnnouncement.JobClassDesc" column in the query, which has a column size of around 8000.
Is there any workaround so that I need not truncate the values? Or can this query be optimised? Any setting in SQL Server 2000?
The [non obvious] reason why SQL needs to SORT is the DISTINCT keyword.
Depending on the data and underlying table structures, you may be able to do away with this DISTINCT, and hence not trigger this error.
You readily found the alternative solution which is to truncate some of the fields in the SELECT list.
Edit: Answering "Can you please explain how DISTINCT would be the reason here?"
Generally, the fashion in which the DISTINCT requirement is satisfied varies with
the data context (expected number of rows, presence/absence of index, size of row...)
the version/make of the SQL implementation (the query optimizer in particular receives new or modified heuristics with each new version, sometimes resulting in alternate query plans for various constructs in various contexts)
Yet, all the possible plans associated with a "DISTINCT query" involve *some form* of sorting of the qualifying records. In its simplest form, the plan first produces the list of qualifying rows (the records which satisfy the WHERE/JOINs/etc. parts of the query) and then sorts this list (which possibly includes some duplicates), only retaining the very first occurrence of each distinct row. In other cases, for example when only a few columns are selected and when some index(es) covering these columns is(are) available, no explicit sorting step is used in the query plan, but the reliance on an index implicitly implies the "sortability" of the underlying columns. In other cases yet, steps involving various forms of merging or hashing are selected by the query optimizer, and these too, eventually, imply the ability of comparing two rows.
Bottom line: DISTINCT implies some sorting.
In the specific case of the question, the error reported by SQL Server and preventing the completion of the query is that "Sorting is not possible on rows bigger than..." AND, the DISTINCT keyword is the only apparent reason for the query to require any sorting (BTW many other SQL constructs imply sorting: for example UNION) hence the idea of removing the DISTINCT (if it is logically possible).
In fact you should remove it, for test purposes, to assert that, without DISTINCT, the query completes OK (if only including some duplicates). Once this fact is confirmed, and if effectively the query could produce duplicate rows, look into ways of producing a duplicate-free query without the DISTINCT keyword; constructs involving subqueries can sometimes be used for this purpose.
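For example, when the duplicates come only from a one-to-many join or an IN subquery, an EXISTS predicate can keep the result duplicate-free without DISTINCT; a sketch of the technique on a fragment of the query above, not a drop-in replacement:
SELECT JR.JobReqId, JR.JobStatusId
FROM tblJobReq AS JR
WHERE EXISTS (SELECT 1
              FROM tblJobClass AS JC
              WHERE JC.JobClassId = JR.JobClassId
                AND JC.Title LIKE '%Family Therapist%')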
An unrelated hint is to use table aliases, with short strings to avoid repeating these long table names. For example (only did a few tables, but you get the idea...)
SELECT DISTINCT JR.JobReqId, JR.JobStatusId,
tblJobClass.JobClassId, tblJobClass.Title,
JR.JobClassSubTitle, JA.JobClassDesc, JA.EndDate, JA.AgencyMktgVerbage,
JA.SpecInfo, JA.Benefits,
S.MinRateSal, S.MaxRateSal, S.MinRateHour, S.MaxRateHour,
tblJobClass.StatementEval,
JR.ApprovalDate, JR.RecruiterId, JR.AgencyId
FROM (
(tblJobReq AS JR
LEFT JOIN tblJobAnnouncement AS JA ON JR.JobReqId = JA.JobReqId)
INNER JOIN tblJobClass ON JR.JobClassId = tblJobClass.JobClassId)
LEFT JOIN tblSalary AS S ON tblJobClass.SalaryCode = S.SalaryCode
WHERE (JR.JobClassId in
(SELECT JobClassId from tblJobClass
WHERE tblJobClass.Title like '%Family Therapist%'))
FYI, running this SQL command on your DB can fix the problem if it is caused by space that needs to be reclaimed after dropping variable length columns:
DBCC CLEANTABLE (0,[dbo.TableName])
See: http://msdn.microsoft.com/en-us/library/ms174418.aspx
This is a limitation of SQL Server 2000. You can:
Split it into two queries and combine elsewhere
SELECT ID, ColumnA, ColumnB FROM TableA JOIN TableB
SELECT ID, ColumnC, ColumnD FROM TableA JOIN TableB
Truncate the columns appropriately
SELECT LEFT(LongColumn,2000)...
Remove any redundant columns from the SELECT
SELECT ColumnA, ColumnB --, IDColumnNotUsedInOutput
FROM TableA
Migrate off of SQL Server 2000