How to transpose all rows into columns with particular column as header in sqlserver - sql-server-2008-r2

I want to transpose each row of the table into different columns making one particular row as a header for the transposed table. Please help me out with this asap. Any simple example would be fine

Use PIVOT:
select
--userID, Stage1, Stage2, Stage3, Stage4
*
from (
select '1' as UserID,'1' as Stage,'1-1-2013' as [Date] union all
select '1','2','2-1-2013' union all
select '2','1','1-1-2013' union all
select '1','3','5-1-2013' union all
select '2','2','3-1-2013' union all
select '3','1','6-1-2013' union all
select '3','2','8-1-2013' union all
select '1','4','10-1-2013' union all
select '3','3','12-1-2013'
) x
pivot (
max([Date]) for Stage in ([1],[2],[3],[4],[5],[6])
) p
More about this topic:
http://blog.sqlauthority.com/2008/06/07/sql-server-pivot-and-unpivot-table-examples/

Related

How to query two tables with same schema but different time ranges in a date column

I have two tables:
main_products
old_products
They have the same info and schema with only one difference:
main_products has min(date) = 2022-01 and max(date) = 2022-05
and
old_products has min(date) = 2020-01 and max(date) = 2020-12
How can I query to get all records from old_products + all records from main_products to get products from 2020-01 to 2022-05 ?
The product on both tables has and product_id field.
I tried to join both tables on product_id but the output is a table with twice number of columns.
select t1.*, t2.* from t1
inner join t2
one t1.product_id = t2.product_id
I think you are looking for a UNION or UNION ALL:
SELECT *
FROM t1
WHERE ...
UNION ALL
SELECT *
FROM t2
WHERE ...
If the columns in t1 and t2 are the same (same number of columns and same types), this will pull the data from both of them. Use UNION if you want duplicates removed or UNION ALL to include duplicates. (In your case it won't make a functional difference since the tables don't overlap by date, but UNION ALL will be faster.)
In the above example, you can put your condition (to only get 2022-01 to 2022-05) in both WHERE conditions. If you don't like repeating the condition, you can use the UNION ALL query in a subquery with the condition outside:
SELECT *
FROM
(
SELECT *
FROM t1
UNION ALL
SELECT *
FROM t2
) sq
WHERE ...

Find the next oldest row in Redshift

I have a table called user_activity in Redshift that has department, user_id, activity_type, activity_id, activity_date.
I'd like to query a daily report of how many days since the last event (of any type). Using CROSS APPLY (SQL Server) or LATERAL JOIN (Postgres 9+), I'd do something like...
SELECT d.date, a.last_activity_date
FROM date_table d
CROSS JOIN (
SELECT DISTINCT user_id FROM activity_table
) u
CROSS APPLY (
SELECT TOP 1 activity_date as last_activity_date
FROM activity_table
WHERE user_id = u.user_id AND activity_date <= d.date
ORDER BY activity_date DESC
) a
For now, I write it similar to the below, but it is a bit slow and I am afraid it'll only get slower.
with user_activity as (
select distinct activity_date, user_id from activity_table
)
select
d.date, u.user_id,
max(u.activity_date) as last_activity_date
from date_table d
inner join user_activity u on u.activity_date <= d.date
where d.date between '2020-01-01' and current_date
group by 1, 2
Can someone suggest a good alternative for my needs or for CROSS APPLY / LATERAL JOIN.
As you are seeing cross joining and inequality joining will slow down as you data grows and are generally not the approach you want in Redshift. This is due to the data size increase that comes with this type of action when applied to large data tables that are typical in Redshift.
You want to use window functions to perform this type of analysis. But you will need to step back and rethink how you will structure the SQL. A MAX(activity_date) window function, partitioned by user_id and ordered by date and with a frame clause of all preceding rows, will find the most recent activity to any activity.
Now this will produce only rows for user_ids and dates that exist in the data table and it looks like you want 1 row for each date for each user_id, right? To do this you need to UNION in a frame of data that has 1 row for each date for each user_id ahead of the window function. You will need NULLs in for the other columns so that the data widths match. You will also want the dates in a separate column from activity_date. Now all dates for all user ids will be in the source and the window function will give you the result you want.
You also ask ‘how is this better than the joins?’ Well in the joins you are replicating all the data records by the number of dates which can get really big. In this approach you just have the original data records plus one row per user_id per date (which is the size of your output) and as the number of records per user_id grows this approach doesn’t.
——— Request to modify asker’s code per comments made to their approach ———
Your code is definitely on the right track as you have removed the massive inequality join of your original. I made 2 comments about it. The first is that I believe you need GROUP BY user_id, date to prevent multiple rows per user_id per date that would result if there are records for the same user_id on a single date with differing activity_types. This is a simple oversight.
The second is to state that I intended for you to use UNION ALL, not LEFT JOIN, in combining the actual data and the user_id/date framework. Your approach works fine but I have found that unioning with very large amounts of data is generally faster than joining but you do need to make sure the columns match up. Either way we end up with a data segment with 3 columns - 2 date columns, one with NULLs for framework rows, and 1 user_id. Your approach is fine and the difference in performance is likely very small unless you have huge tables.
Since you asked for a rewrite, here it is with both changes. (NOTE: my laptop is in the shop so I don’t have ready access to Redshift at the moment and this SQL is untested. If the intent is not clear from this and you need me to debug it will be delayed by a few days. I’m keeping your setup methods and SQL structure.)
with date_table as (
select '2000-01-01'::date as date
union all
select '2000-01-02'::date
union all
select '2000-01-03'::date
union all
select '2000-01-04'::date
union all
select '2000-01-05'::date
union all
select '2000-01-06'::date
),
users as (
select 1 as user_id
union all
select 2
union all
select 3
),
user_activity as (
select 1 as user_id, '2000-01-01'::date as activity_date
union all
select 1 as user_id, '2000-01-04'::date as activity_date
union all
select 3 as user_id, '2000-01-03'::date as activity_date
union all
select 1 as user_id, '2000-01-05'::date as activity_date
union all
select 1 as user_id, '2000-01-06'::date as activity_date
),
user_dates as (
select d.date, u.user_id
from date_table d
cross join users u
),
user_date_activity as (
select cal_date, user_id,
lag(max(activity_date), 1) ignore nulls over (partition by user_id order by date) as last_activity_date
from (
Select user_id, date as cal_date, NULL as activity_date from user_dates
Union all
Select user_id, activity_date as cal_date, activity_date from user_activity
)
Group by user_id, cal_date
)
select * from user_date_activity
order by user_id, cal_date```
This was my query based on Bill's answer.
with date_table as (
select '2000-01-01'::date as date
union all
select '2000-01-02'::date
union all
select '2000-01-03'::date
union all
select '2000-01-04'::date
union all
select '2000-01-05'::date
union all
select '2000-01-06'::date
),
users as (
select 1 as user_id
union all
select 2
union all
select 3
),
user_activity as (
select 1 as user_id, '2000-01-01'::date as activity_date
union all
select 1 as user_id, '2000-01-04'::date as activity_date
union all
select 3 as user_id, '2000-01-03'::date as activity_date
union all
select 1 as user_id, '2000-01-05'::date as activity_date
union all
select 1 as user_id, '2000-01-06'::date as activity_date
),
user_dates as (
select d.date, u.user_id
from date_table d
cross join users u
),
user_date_activity as (
select ud.date, ud.user_id,
lag(ua.activity_date, 1) ignore nulls over (partition by ud.user_id order by ud.date) as last_activity_date
from user_dates ud
left join user_activity ua on ud.date = ua.activity_date and ud.user_id = ua.user_id
)
select * from user_date_activity
order by user_id, date

Pivot Table By One Column And Multiple Aggregates

I prepared the following report. It displays three columns (AccruedInterest, Tip & CardRevenue) by month for a given year. Now I want to "Rotate" the result so that StartDate value turn into 12 columns.
I have this:
I need this:
I have tried pivoting the table but I need multiple columns to be aggregated as you see.
You have to unpivot your data and the pivot.
Note: I put in my own values in your table as to make each unique so you know the data is correct.
--Create YourTable
SELECT * INTO YourTable
FROM
(
SELECT CAST('2015-01-01' AS DATE) StartDate,
607.834 AS AccruedInterest,
1 AS Tip,
3 AS CardRevenue
UNION ALL
SELECT CAST('2015-02-01' AS DATE) StartDate,
643.298 AS AccruedInterest,
16.8325 AS Tip,
5 AS CardRevenue
) A;
GO
--This pivots your data
SELECT *
FROM
(
--This unpivots your data using cross apply
SELECT col,val,StartDate
FROM YourTable
CROSS APPLY
(
SELECT 'AccruedInterest', CAST(AccruedInterest AS VARCHAR(100))
UNION ALL
SELECT 'Tip', CAST(Tip AS VARCHAR(100))
UNION ALL
SELECT 'CardRevenue', CAST(CardRevenue AS VARCHAR(100))
) A(col,val)
) B
PIVOT
(
MAX(val) FOR startdate IN([2015-01-01],[2015-02-01])
) pvt
Results:
col 2015-01-01 2015-02-01
AccruedInterest 607.834 643.298
CardRevenue 3 5
Tip 1.0000 16.8325

select the inverse of sql result as a string list

having a sql e.g. something like the following resulting in some rows with one value.
I search a different sql than SELECT * FROM some_sql which results in one row with comma separated values.
WITH some_sql AS (
SELECT 1 FROM DUAL
UNION
SELECT 2 FROM DUAL
)
SELECT * FROM some_sql
this SQL results in the two rows with value 1 and 2.
I seach a SQl resulting in 1,2 without changing the code of 'some_sql'.
Consider http://halisway.blogspot.com/2006/08/oracle-groupconcat-updated-again.html
Sice you are on 11G you can use LISTAGG
WITH some_sql AS (
SELECT 1 x FROM DUAL
UNION
SELECT 2 x FROM DUAL
)
SELECT LISTAGG(x, ',') WITHIN GROUP(ORDER BY x) FROM some_sql

Query to get row from one table, else random row from another

tblUserProfile - I have a table which holds all the Profile Info (too many fields)
tblMonthlyProfiles - Another table which has just the ProfileID in it (the idea is that this table holds 2 profileids which sometimes become monthly profiles (on selection))
Now when I need to show monthly profiles, I simply do a select from this tblMonthlyProfiles and Join with tblUserProfile to get all valid info.
If there are no rows in tblMonthlyProfile, then monthly profile section is not displayed.
Now the requirement is to ALWAYS show Monthly Profiles. If there are no rows in monthlyProfiles, it should pick up 2 random profiles from tblUserProfile. If there is only one row in monthlyProfiles, it should pick up only one random row from tblUserProfile.
What is the best way to do all this in one single query ?
I thought something like this
select top 2 * from tblUserProfile P
LEFT OUTER JOIN tblMonthlyProfiles M
on M.profileid = P.profileid
ORder by NEWID()
But this always gives me 2 random rows from tblProfile. How can I solve this ?
Try something like this:
SELECT TOP 2 Field1, Field2, Field3, FinalOrder FROM
(
select top 2 Field1, Field2, Field3, FinalOrder, '1' As FinalOrder from tblUserProfile P JOIN tblMonthlyProfiles M on M.profileid = P.profileid
UNION
select top 2 Field1, Field2, Field3, FinalOrder, '2' AS FinalOrder from tblUserProfile P LEFT OUTER JOIN tblMonthlyProfiles M on M.profileid = P.profileid ORDER BY NEWID()
)
ORDER BY FinalOrder
The idea being to pick two monthly profiles (if that many exist) and then 2 random profiles (as you correctly did) and then UNION them. You'll have between 2 and 4 records at that point. Grab the top two. FinalOrder column is an easy way to make sure that you try and get the monthly's first.
If you have control of the table structure, you might save yourself some trouble by simply adding a boolean field IsMonthlyProfile to the UserProfile table. Then it's a single table query, order by IsBoolean, NewID()
In SQL 2000+ compliant syntax you could do something like:
Select ...
From (
Select TOP 2 ...
From tblUserProfile As UP
Where Not Exists( Select 1 From tblMonthlyProfile As MP1 )
Order By NewId()
) As RandomProfile
Union All
Select MP....
From tblUserProfile As UP
Join tblMonthlyProfile As MP
On MP.ProfileId = UP.ProfileId
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) >= 1
Union All
Select ...
From (
Select TOP 1 ...
From tblUserProfile As UP
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) = 1
Order By NewId()
) As RandomProfile
Using SQL 2005+ CTE you can do:
With
TwoRandomProfiles As
(
Select TOP 2 ..., ROW_NUMBER() OVER ( ORDER BY UP.ProfileID ) As Num
From tblUserProfile As UP
Order By NewId()
)
Select MP.Col1, ...
From tblUserProfile As UP
Join tblMonthlyProfile As MP
On MP.ProfileId = UP.ProfileId
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) >= 1
Union All
Select ...
From TwoRandomProfiles
Where Not Exists( Select 1 From tblMonthlyProfile As MP1 )
Union All
Select ...
From TwoRandomProfiles
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) = 1
And Num = 1
The CTE has the advantage of only querying for the random profiles once and the use of the ROW_NUMBER() column.
Obviously, in all the UNION statements the number and type of the columns must match.