way to ignore intermediate column in an INSERT SELECT statement? - tsql

For an INSERT ... SELECT statement, is there a way to ignore a column that is required for the SELECT statement but should not be inserted?
I'm using a rank function to select which data to insert. Unfortunately the rank function can't be called in the where clause.
insert into target_table(v1
,v2
,v3
)
select v1
,v2
,v3
,RANK() over (partition by group_col
order by order_col desc
) as my_rank
from source_table
where my_rank > 1
The result is the following error:
Msg 121, Level 15, State 1, Line 1 The select list for the INSERT
statement contains more items than the insert list. The number of
SELECT values must match the number of INSERT columns.
I know I can do this using a temporary table but would like to keep it in a single statement if possible.

Wrap your main query inside a subquery.
INSERT INTO target_table
(v1, v2, v3)
SELECT q.v1, q.v2, q.v3
FROM (SELECT v1, v2, v3,
RANK() OVER (PARTITION BY group_col ORDER BY order_col DESC) AS my_rank
FROM source_table) q
WHERE q.my_rank > 1;
For SQL Server 2005+, you could also use a CTE:
WITH cteRank AS (
SELECT v1, v2, v3,
RANK() OVER (PARTITION BY group_col ORDER BY order_col DESC) AS my_rank
FROM source_table
)
INSERT INTO target_table
(v1, v2, v3)
SELECT v1, v2, v3
FROM cteRank
WHERE my_rank > 1;
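The derived-table pattern can be checked end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only to keep the example self-contained; window functions need SQLite 3.25 or later), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (v1, v2, v3, group_col, order_col);
    CREATE TABLE target_table (v1, v2, v3);
    -- Hypothetical sample data: three rows in group g1, one row in group g2.
    INSERT INTO source_table VALUES
        ('a1', 'x', 'y', 'g1', 30),
        ('a2', 'x', 'y', 'g1', 20),
        ('a3', 'x', 'y', 'g1', 10),
        ('b1', 'x', 'y', 'g2', 5);
""")

# The rank is computed in the inner query and filtered in the outer one,
# so it never appears in the select list that feeds the INSERT.
conn.execute("""
    INSERT INTO target_table (v1, v2, v3)
    SELECT q.v1, q.v2, q.v3
    FROM (SELECT v1, v2, v3,
                 RANK() OVER (PARTITION BY group_col ORDER BY order_col DESC) AS my_rank
          FROM source_table) q
    WHERE q.my_rank > 1
""")

inserted = sorted(row[0] for row in conn.execute("SELECT v1 FROM target_table"))
print(inserted)
```

Only the rows that are not top-ranked within their group should end up in target_table, and the target keeps its three columns.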

Related

T-SQL - Pivot/Crosstab - variable number of values

I have a simple data set that looks like this:
Name Code
A A-One
A A-Two
B B-One
C C-One
C C-Two
C C-Three
I want to output it so it looks like this:
Name Code1 Code2 Code3 Code4 Code...n ...
A A-One A-Two
B B-One
C C-One C-Two C-Three
For each of the 'Name' values, there can be an undetermined number of 'Code' values.
I have been looking at various examples of Pivot SQL [including simple Pivot sql and sql using the XML function?] but I have not been able to figure this out - or to understand if it is even possible.
I would appreciate any help or pointers.
Thanks!
Try it like this:
DECLARE @tbl TABLE([Name] VARCHAR(100),Code VARCHAR(100));
INSERT INTO @tbl VALUES
('A','A-One')
,('A','A-Two')
,('B','B-One')
,('C','C-One')
,('C','C-Two')
,('C','C-Three');
SELECT p.*
FROM
(
SELECT *
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM @tbl
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (Code1,Code2,Code3,Code4,Code5 /*add as many as you need*/)
)p;
This line
,CONCAT('Code',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
will use a partitioned ROW_NUMBER in order to create numbered column names per code. The rest is simple PIVOT...
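To see what the numbered column names look like before the pivot, here is a minimal sketch of just that intermediate step. It runs on Python's built-in sqlite3 module rather than SQL Server (an assumption for the sake of a self-contained example; SQLite 3.25+ is needed), so string concatenation is written with || instead of CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (Name TEXT, Code TEXT);
    INSERT INTO tbl VALUES
        ('A','A-One'),('A','A-Two'),('B','B-One'),
        ('C','C-One'),('C','C-Two'),('C','C-Three');
""")

# Number the codes within each Name and turn the number into a column label.
rows = conn.execute("""
    SELECT Name, Code,
           'Code' || ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Code) AS ColumnName
    FROM tbl
    ORDER BY Name, Code
""").fetchall()

for row in rows:
    print(row)
```

Note that ORDER BY Code sorts alphabetically, so within group C the value C-Three is numbered before C-Two. If a different order is wanted, some other ordering column would be needed.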
UPDATE: A dynamic approach to reflect the maximum number of codes per group
CREATE TABLE TblTest([Name] VARCHAR(100),Code VARCHAR(100));
INSERT INTO TblTest VALUES
('A','A-One')
,('A','A-Two')
,('B','B-One')
,('C','C-One')
,('C','C-Two')
,('C','C-Three');
DECLARE @cols VARCHAR(MAX);
WITH GetMaxCount(mc) AS(SELECT TOP 1 COUNT([Code]) FROM TblTest GROUP BY [Name] ORDER BY COUNT([Code]) DESC)
SELECT @cols=STUFF(
(
SELECT CONCAT(',Code',Nmbr)
FROM
(SELECT TOP((SELECT mc FROM GetMaxCount)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) t(Nmbr)
FOR XML PATH('')
),1,1,'');
DECLARE @sql VARCHAR(MAX)=
'SELECT p.*
FROM
(
SELECT *
,CONCAT(''Code'',ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Code)) AS ColumnName
FROM TblTest
)t
PIVOT
(
MAX(Code) FOR ColumnName IN (' + @cols + ')
)p;';
EXEC(@sql);
GO
DROP TABLE TblTest;
As you can see, the only part that changes to reflect the actual number of columns is the list in PIVOT's IN() clause.
You can create a string that looks like Code1,Code2,Code3,...CodeN and build the statement dynamically. The resulting statement can then be run with EXEC().
I'd prefer the first approach. Dynamically created SQL is very powerful, but it can be a pain in the neck too...
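The idea of generating the Code1,...,CodeN list from the data can be sketched outside T-SQL as well. The example below is only an illustration: it uses Python's built-in sqlite3 module (SQLite 3.25+), and because SQLite has no PIVOT operator, one MAX(CASE ...) expression per generated column stands in for it. The variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TblTest (Name TEXT, Code TEXT);
    INSERT INTO TblTest VALUES
        ('A','A-One'),('A','A-Two'),('B','B-One'),
        ('C','C-One'),('C','C-Two'),('C','C-Three');
""")

# Largest number of codes attached to any single name.
(max_count,) = conn.execute(
    "SELECT MAX(cnt) FROM (SELECT COUNT(Code) AS cnt FROM TblTest GROUP BY Name)"
).fetchone()

# Same shape as the generated column list in the T-SQL version.
col_names = [f"Code{i}" for i in range(1, max_count + 1)]
cols = ",".join(col_names)

# The names are built from integers only, so nothing user-supplied
# is spliced into the statement text.
select_list = ", ".join(
    f"MAX(CASE WHEN ColumnName = '{c}' THEN Code END) AS {c}" for c in col_names
)
sql = f"""
    SELECT Name, {select_list}
    FROM (SELECT Name, Code,
                 'Code' || ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Code) AS ColumnName
          FROM TblTest)
    GROUP BY Name
    ORDER BY Name
"""
pivoted = conn.execute(sql).fetchall()
print(cols)
for row in pivoted:
    print(row)
```

Names with fewer codes than the maximum get NULL (None in Python) in the trailing columns, which matches the blank cells in the desired output.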

Sum of a column within the subquery in Postgresql

I have a PostgreSQL table with two fields, ID and Name (column1 and column2 in the SQLFiddle). I assign a default record_count of 1 to each row. I want to get the record_count for column1 and sum that record_count by column1.
I tried this query, but it fails with an error.
select sum(column_record) group by column_record ,
* from (select column1,1::int4 as column_record from test) a
Also find the Input/Output screenshot in the form of excel below :
SQL Fiddle for the same :
http://sqlfiddle.com/#!15/12fe9/1
If you're using a window function (you may want to use normal grouping instead, which is usually a lot faster), this is the way to do it:
-- create temp table test as (select * from (values ('a', 'b'), ('c', 'd')) a(column1, column2));
select sum(column_record) over (partition by column_record),
* from (select column1, 1::int4 as column_record from test) a;
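For comparison, here is a minimal sketch of the window-function form next to the plain GROUP BY form. It assumes the goal is a per-column1 total, so it partitions and groups by column1. It runs on Python's built-in sqlite3 module (SQLite 3.25+) instead of PostgreSQL, so the ::int4 cast is dropped, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (column1 TEXT, column2 TEXT);
    INSERT INTO test VALUES ('a', 'b'), ('a', 'x'), ('c', 'd');
""")

# Window function: every input row is kept and carries its group's total.
windowed = conn.execute("""
    SELECT column1, SUM(column_record) OVER (PARTITION BY column1) AS total
    FROM (SELECT column1, 1 AS column_record FROM test) a
    ORDER BY column1
""").fetchall()

# Plain grouping: one output row per column1 value.
grouped = conn.execute("""
    SELECT column1, SUM(column_record) AS total
    FROM (SELECT column1, 1 AS column_record FROM test) a
    GROUP BY column1
    ORDER BY column1
""").fetchall()

print(windowed)
print(grouped)
```

The totals are the same in both results; the difference is whether the detail rows are kept or collapsed into one row per group.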

How can I add values to a column that show the ranking by a random value?

I can't see what is going wrong here:
DECLARE @cData TABLE(cID NVARCHAR(1), cSeed DECIMAL(8,8), cRank INT)
INSERT INTO @cData (cID, cSeed) SELECT 'W', RAND()
INSERT INTO @cData (cID, cSeed) SELECT 'X', RAND()
INSERT INTO @cData (cID, cSeed) SELECT 'Y', RAND()
INSERT INTO @cData (cID, cSeed) SELECT 'Z', RAND()
SELECT cID, cSeed, (RANK() OVER (ORDER BY cSeed)) AS cRank FROM @cData
UPDATE @cData
SET cRank = (SELECT (RANK() OVER (ORDER BY cSeed)))
SELECT * FROM @cData
Why am I getting different results from my first select statement than I am from my second--why didn't my update statement put the same data into the table that my first select statement displayed?
SELECT (RANK() OVER (ORDER BY cSeed))
This is a standalone scalar subquery, correlated with the outer row only through the column used in its OVER (ORDER BY ...) clause.
It operates over an implied rowset of exactly one record (the current record from @cData) and hence always returns 1, as the rank of the only record in a set is 1 by definition.
I believe you want to run this instead:
WITH t AS
(
SELECT *,
RANK() OVER (ORDER BY cSeed) rnk
FROM @cData
)
UPDATE t
SET cRank = rnk

Multi-INSERT with unchangeable param

Is there any way to INSERT multiple rows that all share one value taken from the DB, so that the value stays the same for every row?
I thought about WITH but without success:
WITH t as (SELECT date_trunc('hour', NOW()))
INSERT INTO my_table(ID, TIME) VALUES (1,t),(2,t);
No need for the CTE, just use a plain SELECT as the source for the insert:
insert into my_table (id, time)
select i, date_trunc('hour', NOW())
from generate_series(1,2) i;
If you really want the CTE, you need to select from it in the values clause:
WITH t as (
SELECT date_trunc('hour', NOW()) hour_t
)
INSERT INTO my_table(ID, TIME)
VALUES
(1, (select hour_t from t)),
(2, (select hour_t from t));

SQL Server SUM() for DISTINCT records

I have a field called "Users", and I want to run SUM() on that field that returns the sum of all DISTINCT records. I thought that this would work:
SELECT SUM(DISTINCT table_name.users)
FROM table_name
But it's not selecting DISTINCT records, it's just running as if I had run SUM(table_name.users).
What would I have to do to add only the distinct records from this field?
Use count()
SELECT count(DISTINCT table_name.users)
FROM table_name
This code seems to indicate sum(distinct ) and sum() return different values.
with t as (
select 1 as a
union all
select '1'
union all
select '2'
union all
select '4'
)
select sum(distinct a) as DistinctSum, sum(a) as allSum, count(distinct a) as distinctCount, count(a) as allCount from t
Do you actually have non-distinct values?
select count(1), users
from table_name
group by users
having count(1) > 1
If not, the sums will be identical.
You can see for yourself that distinct works with the following example. Here I create a subquery with duplicate values, then I do a sum distinct on those values.
select DistinctSum=sum(distinct x), RegularSum=Sum(x)
from
(
select x=1
union All
select 1
union All
select 2
union All
select 2
) x
You can see that the distinct sum column returns 3 and the regular sum returns 6 in this example.
You can use a sub-query:
select sum(users)
from (select distinct users from table_name) as d;
SUM(DISTINCTROW table_name.something)
It worked for me (innodb).
Description - "DISTINCTROW omits data based on entire duplicate records, not just duplicate fields." http://office.microsoft.com/en-001/access-help/all-distinct-distinctrow-top-predicates-HA001231351.aspx
;WITH cte
as
(
SELECT table_name.users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM table_name
)
SELECT SUM(users)
FROM cte
WHERE rn = 1
TEST
DECLARE @table_name Table (Users INT );
INSERT INTO @table_name Values (1),(1),(1),(3),(3),(5),(5);
;WITH cte
as
(
SELECT users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM @table_name
)
)
SELECT SUM(users) DisSum
FROM cte
WHERE rn = 1
Result
DisSum
9
If circumstances make it difficult to weave a "distinct" into the sum clause, it will usually be possible to add an extra "where" clause to the entire query - something like:
select sum(t.ColToSum)
from SomeTable t
where (select count(*) from SomeTable t1 where t1.ColToSum = t.ColToSum and t1.ID < t.ID) = 0
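The approaches suggested so far can be compared side by side on the sample values used earlier in this thread (1, 1, 1, 3, 3, 5, 5). This is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25+) as a stand-in for SQL Server; the id column is an assumption added so the correlated WHERE variant has something to compare on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (id INTEGER PRIMARY KEY, users INTEGER);
    INSERT INTO table_name (users) VALUES (1),(1),(1),(3),(3),(5),(5);
""")

queries = {
    "plain_sum": "SELECT SUM(users) FROM table_name",
    "sum_distinct": "SELECT SUM(DISTINCT users) FROM table_name",
    "distinct_subquery":
        "SELECT SUM(users) FROM (SELECT DISTINCT users FROM table_name) AS d",
    "row_number":
        """SELECT SUM(users)
           FROM (SELECT users,
                        ROW_NUMBER() OVER (PARTITION BY users ORDER BY users) AS rn
                 FROM table_name) AS c
           WHERE rn = 1""",
    # Keeps only the row with the lowest id for each distinct value.
    "correlated_where":
        """SELECT SUM(t.users)
           FROM table_name t
           WHERE (SELECT COUNT(*) FROM table_name t1
                  WHERE t1.users = t.users AND t1.id < t.id) = 0""",
}

results = {name: conn.execute(sql).fetchone()[0] for name, sql in queries.items()}
for name, value in results.items():
    print(name, value)
```

On this data every de-duplicating variant gives the same total, and only the plain SUM differs, which suggests that a matching result in the original query would point to a column without duplicate values.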
May be a duplicate of
Trying to sum distinct values SQL
As per Declan_K's answer:
Get the distinct list first...
SELECT SUM(SQ.COST)
FROM
(SELECT DISTINCT [Tracking #] as TRACK,[Ship Cost] as COST FROM YourTable) SQ