I am new to OWB (Oracle Warehouse Builder) and want to know about the SET operator: does its union function perform UNION or UNION ALL?
A plain UNION removes fully duplicate rows, while UNION ALL returns all rows.
You can set the operation in the properties editor, if the SET operator is selected.
Possible operations are:
UNION
UNIONALL
MINUS
INTERSECT
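To see how the four operations differ, here is a minimal sketch run against an in-memory SQLite database rather than OWB itself. The tables a and b and their contents are made up for illustration, and SQLite spells Oracle's MINUS as EXCEPT.

```python
import sqlite3

# Hypothetical sample data: a = {1, 2, 2}, b = {2, 3}
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (v INTEGER);
    CREATE TABLE b (v INTEGER);
    INSERT INTO a VALUES (1), (2), (2);
    INSERT INTO b VALUES (2), (3);
""")

def run(op):
    """Combine a and b with the given set operator and return sorted values."""
    sql = f"SELECT v FROM a {op} SELECT v FROM b"
    return sorted(row[0] for row in conn.execute(sql))

# EXCEPT is the SQLite/standard spelling of Oracle's MINUS
results = {op: run(op) for op in ("UNION", "UNION ALL", "EXCEPT", "INTERSECT")}
for op, rows in results.items():
    print(op, rows)
```

Only UNION ALL keeps the repeated 2; the other three operations return distinct rows.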
I would like to put a list of names in my Aginity Netezza macro. For instance, I would like to be able to repeatedly use the list ("Adam", "Bill", "Cynthia", "Dick", "Ella", "Fanny") in my future queries, e.g. in WHERE clauses.
My questions are:
(1) Is there a limit to how many characters I can put inside the "Value" window of the Query Parameters Editor?
(2) Is there a way to make this work without using a macro? For instance, predefining this list somewhere?
I would put the list into a (temporary) table and simply join to it when necessary:
Create temp table names as
Select 'Adam'::varchar(50)
Union all Select 'Bill'::varchar(50)
Union all Select 'Cynthia'::varchar(50)
Union all Select 'Dick'::varchar(50)
Union all Select 'Ella'::varchar(50)
Union all Select 'Fanny'
;
Select x.a,x.b
from x
where x.name in (select * from Names)
;
Select
case
when x.name in (select * from Names)
then 'Special'
Else 'Other'
End as NameGrp,
Count(*) as size,
Sum(income) as TotalIncome
From x
Group by NameGrp
Order by size desc
;
Alternatively, Netezza has an extension toolkit that enables ARRAY data types, but the first query in particular will not perform well if you use arrays for that purpose. If you are interested, see https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.sqltk.doc/c_sqlext_array.html or search for examples.
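The temporary-table approach can be tried out quickly with a sketch like the following. It runs on SQLite instead of Netezza, so the ::varchar(50) casts are dropped, and the contents of table x (names and incomes) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical fact table standing in for x
    CREATE TABLE x (name TEXT, income INTEGER);
    INSERT INTO x VALUES ('Adam', 10), ('Zed', 20), ('Ella', 30), ('Yuri', 40);

    -- The reusable list of names, stored once as a temp table
    CREATE TEMP TABLE names AS
    SELECT 'Adam' AS name
    UNION ALL SELECT 'Bill'
    UNION ALL SELECT 'Cynthia'
    UNION ALL SELECT 'Dick'
    UNION ALL SELECT 'Ella'
    UNION ALL SELECT 'Fanny';
""")

# Filter: rows of x whose name is in the list
special = [r[0] for r in conn.execute(
    "SELECT name FROM x WHERE name IN (SELECT name FROM names) ORDER BY name"
)]

# Grouping: label each row as Special or Other and aggregate
groups = conn.execute("""
    SELECT CASE WHEN name IN (SELECT name FROM names)
                THEN 'Special' ELSE 'Other' END AS name_grp,
           COUNT(*) AS size,
           SUM(income) AS total_income
    FROM x
    GROUP BY name_grp
    ORDER BY name_grp
""").fetchall()

print(special)
print(groups)
```

The list lives in one place, so every later query can refer to it without a macro or a long literal.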
I have one large table with 100m+ rows and two smaller ones with 2m rows each. All three tables have a column of company names that need to be sent out to an API for matching. I want to select the strings from each column and then combine them into a single column of unique strings.
I'm using a version of the answer from "Combined 2 columns into one column SQL", but unsurprisingly the performance is very slow.
SELECT DISTINCT
unnest(string_to_array(upper(t.buyer) || '#' || upper(a.aw_supplier_name) || '#' || upper(b.supplier_source_string), '#'))
FROM
tenders t,
awards a,
banking b
;
Any ideas on a more performant way to achieve this?
Update: the banking table is the largest table with 100m rows.
Assuming PostgreSQL 9.6 and borrowing the select from rd_nielsen's answer, the following should give you a comma delimited string of the distinct names.
WITH cte
AS (
SELECT UPPER(T.buyer) NAMES
FROM tenders T
UNION
SELECT UPPER(A.aw_supplier_name) NAMES
FROM awards A
UNION
SELECT UPPER(b.supplier_source_string) NAMES
FROM banking b
)
SELECT array_to_string(ARRAY_AGG(cte.names), ',')
FROM cte
To get just a list of the combined names from all three tables, you could instead union together the selections from each table, like so:
select
upper(t.buyer)
from
tenders t
union
select
upper(a.aw_supplier_name)
from
awards a
union
select
upper(b.supplier_source_string)
from
banking b
;
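As a rough illustration of why the union is cheaper than the original query, here is a sketch on SQLite with tiny made-up tables (the company names are invented). The comma-separated FROM list in the question is a cross join, so it builds one row per combination of rows before DISTINCT runs, while the union reads each table once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tenders (buyer TEXT);
    CREATE TABLE awards (aw_supplier_name TEXT);
    CREATE TABLE banking (supplier_source_string TEXT);
    -- Hypothetical sample rows
    INSERT INTO tenders VALUES ('Acme'), ('Globex');
    INSERT INTO awards VALUES ('ACME'), ('Initech');
    INSERT INTO banking VALUES ('globex'), ('Umbrella'), ('Umbrella');
""")

# Rows produced by the implicit cross join: 2 * 2 * 3 combinations
cross_join_rows = conn.execute(
    "SELECT COUNT(*) FROM tenders, awards, banking"
).fetchone()[0]

# One pass per table, duplicates removed by UNION
names = sorted(r[0] for r in conn.execute("""
    SELECT upper(buyer) FROM tenders
    UNION
    SELECT upper(aw_supplier_name) FROM awards
    UNION
    SELECT upper(supplier_source_string) FROM banking
"""))

print(cross_join_rows)
print(names)
```

With real table sizes the cross join grows as the product of the three row counts, which is why the original query is so slow.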
I have a view that uses UNION between two SELECT statements. Each SELECT reads from a million-row table joined to 4-5 small tables.
There is a procedure that queries the view with a WHERE clause.
I thought using UNION ALL in my view would be faster than UNION (the result is guaranteed to be distinct either way), but when I use UNION ALL instead of UNION my procedure takes significantly longer.
I checked the query plans, and the major differences I noticed are:
With UNION in the view, the plan uses seeks on all tables.
With UNION ALL in the view, the plan uses scans on all tables (60% cost).
The view's SELECT statement has the condition MainTable.Column <> 10. With UNION, the plan shows it as MainTable.Column > 10 OR MainTable.Column < 10, whereas the UNION ALL plan shows the <> condition. Even if I explicitly write the condition as > OR <, the UNION ALL plan still shows <>.
How can I find out why changing UNION to UNION ALL made the plan use scans instead of seeks?
I have a series of queries joined by union. Example:
SELECT
SUM(WHOS) [CRITERIA]
FROM ONFIRST
UNION
SELECT
COUNT(WHATS) [CRITERIA]
FROM ONSECOND
UNION
SELECT
IDONTKNOW [CRITERIA]
FROM ONTHIRD
etc.
The query results don't always come back in the same order and I want the results to be in the same order I have the queries written.
Example: Sometimes I get the SUM of WHOS first, sometimes I get the COUNT of WHATS first.
What's the best way to accomplish this?
You can control this easily by using a dummy order column, and ordering by that value:
;With Cte As
(
Select Sum(WHOS) CRITERIA
, 1 As Ord
From ONFIRST
Union
Select Count(WHATS) CRITERIA
, 2 As Ord
From ONSECOND
Union
Select IDONTKNOW CRITERIA
, 3 As Ord
From ONTHIRD
)
Select CRITERIA
From Cte
Order By Ord Asc;
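Here is a small runnable sketch of the same idea on SQLite, with made-up table contents. The values are chosen so that sorting by CRITERIA alone would give a different order than the order the queries are written in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE onfirst (whos INTEGER);
    CREATE TABLE onsecond (whats INTEGER);
    CREATE TABLE onthird (idontknow INTEGER);
    -- Hypothetical sample rows
    INSERT INTO onfirst VALUES (40), (50);
    INSERT INTO onsecond VALUES (7), (8), (9);
    INSERT INTO onthird VALUES (5);
""")

rows = conn.execute("""
    WITH cte AS (
        SELECT SUM(whos) AS criteria, 1 AS ord FROM onfirst
        UNION
        SELECT COUNT(whats), 2 FROM onsecond
        UNION
        SELECT idontknow, 3 FROM onthird
    )
    SELECT criteria FROM cte ORDER BY ord
""").fetchall()

# Results come back in the order the branches were written
criteria = [r[0] for r in rows]
print(criteria)
```

Without the ORDER BY the database is free to return the rows in any order, which is why the results appeared to move around.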
When I execute this query in SQL Server which calls to IBM,
Select * from openquery(ibm,'
Select COST_AMT,''Query1'' as Query
from table
where clause
with ur;
')
union
Select * from openquery(ibm,'
Select COST_AMT,''Query2'' as Query
from table
different where clause
with ur;
')
I get different results in the union query than when I execute them separately and bring the results in together. I have tried the union query inside the openquery so I believe this is an IBM thing. The results appear to be a distinct selection of COST_AMT sorted by lowest to highest.
ie:
1,Query1
2,Query1
3,Query1
1,Query2
2,Query2
3,Query2
but the data is actually like this:
1,Query1
1,Query1
1,Query1
2,Query1
2,Query1
3,Query1
1,Query2
1,Query2
1,Query2
2,Query2
2,Query2
3,Query2
Am I missing something about the IBM union query? I realize I could sum and get the answer (which is what I plan on doing), but I want to know more about why this is happening.
This has nothing to do with "ibm" or "db2" -- the SQL UNION operator removes duplicates. To retain duplicates use UNION ALL.
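The behaviour is easy to reproduce on any engine. The sketch below uses SQLite with a made-up costs table (the grp column is a hypothetical stand-in for the two different where clauses) holding the same values as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same COST_AMT values as in the question; grp picks the "where clause"
    CREATE TABLE costs (cost_amt INTEGER, grp INTEGER);
    INSERT INTO costs VALUES (1,1),(1,1),(1,1),(2,1),(2,1),(3,1),
                             (1,2),(1,2),(1,2),(2,2),(2,2),(3,2);
""")

def run(op):
    """Combine the two labelled queries with UNION or UNION ALL."""
    sql = f"""
        SELECT cost_amt, 'Query1' AS query FROM costs WHERE grp = 1
        {op}
        SELECT cost_amt, 'Query2' AS query FROM costs WHERE grp = 2
    """
    return conn.execute(sql).fetchall()

deduped = run("UNION")      # duplicates removed
kept = run("UNION ALL")     # every row retained

print(len(deduped), sum(r[0] for r in deduped))
print(len(kept), sum(r[0] for r in kept))
```

UNION returns 6 rows and UNION ALL returns all 12, so any sum taken over the UNION result will come out too low.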