T-SQL stored procedure to get data from any table on the server for CSV export (SQL Server 2016) - tsql

Answered / Solved.
Long story short, I need a stored procedure that would get the data from a few different views and put it into a .CSV file. Easy enough, but me being me, I decided to write something that could get the data from any table one could potentially desire. I decided to go with 2 procedures in the end:
Loop through a table with all the parameters (catalog, schema, table name, export path/file name etc. etc.) and feed it to 2nd stored procedure (in theory it should make it easier to manage in future, if/when different data needs to be exported). This one is fairly straightforward and doesn't cause any issues.
Pick up the column names (which was surprisingly easy - below in case it helps anyone)
select #SQL = 'insert into Temp_Export_Headers ' +
'select COLUMN_NAME ' +
'from [' + #loc_Source_Database + '].information_schema.columns ' +
'where table_name = ''' + #loc_Source_Table + ''''
and
select #Headers = coalesce(#Headers + ',', '') + convert(varchar, Column_Name)
from Temp_Export_Headers
After that, I want to dump all the data from "actual" table into temp one, which in itself is easy enough, but that's where things start to go downhill for me.
select #SQL =
'drop table if exists TempData ' +
'select * ' +
'into TempData ' +
'from [' + #loc_Source_Database + '].' + #loc_Source_Schema + '.' + #loc_Source_Table + ' with (nolock) '
Select * is just temporary, will probably replace it with a variable later on, for now it can live in this state on dev.
Now I want to loop through TempData and insert things I want (everything at the moment, will add some finesse and where clauses in near future) and put it into yet another temp table that holds all the stuff for actual CSV export.
Is there any way to add a self incrementing column to my TempData without having to look for and get rid of the original PK / Identity? (Different tables will have different values / names for those, making it a bit of a nightmare for someone with my knowledge / experience to loop through in a sensible manner, so I'd just like a simple column starting with 1 and ending with whatever last row number is)
#ShubhamPandey 's answer was exactly what I was after, code below is a product of my tired mind on the verge of madness (It does, however, work)
select #SQL =
'alter table TempData ' +
'add Uni_Count int'
select #SQL2 =
'declare #UniCount int ' +
'select #UniCount = 0 ' +
'update tempdata with (rowlock) ' +
'set #UniCount = Uni_Count = #UniCount + 1'
Both versions execute quicker than select * into without any other manipulation. Something I cannot yet comprehend.
Is there a better / more sensible way of doing this? (My reasoning with the loop - there will potentially be a lot of data for some of the tables / views, with most of them executed daily, plan was to export everything on Sat/Sun when system isn't that busy, and have daily "updates" going from last highest unique id to current.)
Looping was a horrible idea. To illustrate just how bad it was:
Looping through 10k rows meant execution time of 1m 21s.
Not looping through 500k rows resulted in execution time of 56s.

Since you are doing a table creation while insertion, you can always go forward with a statement like:
select #SQL =
'drop table if exists TempData ' +
'select ROW_NUMBER() OVER (<some column name>) AS [Id], * ' +
'into TempData ' +
'from [' + #loc_Source_Database + '].' + #loc_Source_Schema + '.' + #loc_Source_Table + ' with (nolock) '
This would create an auto-incrementing index for you in the TempData table

Related

Dynamic SQL Error with REPLACE Statement

I am trying to create a script that builds a SQL statement dynamically and then executes it.
Here is the string that is built & stored in #strSQL (verified by using PRINT)
UPDATE DNN_RSM_Exams
SET ScoringDates = REPLACE(ScoringDates, '/2015', '/2016')
WHERE SchoolYear = 2015
It executes the variable as follows:
EXECUTE (#strSQL)
However I get the following error:
Conversion failed when converting the nvarchar value 'UPDATE DNN_RSM_Exams SET ScoringDates = REPLACE(ScoringDates, '/' to data type int.
It seems to terminate the EXECUTE when it hits the first forward-slash (/). I tried escaping it using a double slash (//), an open bracket ([), and a backslash (\). None if it worked.
Can anyone help me please?
UPDATE #01: 8/17/15
Sorry if I didn't give enough information initially...
ScoringDates is an NVARCHAR(MAX) field.
The Code that builds the SQL Statement is:
SET #strSQL = 'UPDATE ' + #strTableName +
' SET ScoringDates = REPLACE(ScoringDates, ''/' + LTRIM(RTRIM(STR(#intSchoolYear_CopyFrom + 1))) + ''', ''/' + LTRIM(RTRIM(STR(#intSchoolYear_CopyFrom + 2))) + ''')' +
' WHERE SchoolYear = ' + LTRIM(RTRIM(STR(#intSchoolYear_CopyTo)))
ScoringDates is a string based field that holds data in an INI-like formatted string. I want to change the year portion of ANY date found in the string, but I want to avoid accidental changes of any other numbers that may match. So I am specifically looking to replace "/YYYY" with a different "YYYY" value. The "/" preceding the year value is to ensure that what is being preplaced is the YEAR and not another numeric value within the string.
UPDATE #02: 8/18/15
So I am completely flabbergasted...after banging my head on this script for hours yesterday, I went home defeated. Come in today, start up my PC and run the script again so I can see the error message again...and it worked!
I've never come across this with SQL Management Studio, but it is possible that SQL Management Studio somehow lost it's marbles yesterday and needed a reboot? I thought the SQL was process by the server directly. Could it be that some is processed by the studio first before handing it off to the server and if the studio had "issues" then it would cause strange errors?
In any case, thank you so much guys for your input, I am sorry that it was a wheel spinner. It never occurred to me that a reboot would fix my issue, I just assumed my code was wrong.
You want to replace '/ with ''
So your set statement will be something like...
SET #strSQL = 'UPDATE DNN_RSM_Exams
SET ScoringDates = REPLACE(ScoringDates, ''2015'', ''2016'')
WHERE SchoolYear = 2015'
EDIT:
How is your code different than this below? (I will edit this again and clean it up after. Code just won't fit into comments)
DECLARE #TableName TABLE (ScoringDates VARCHAR(100), SchoolYear INT)
DECLARE #strTableName VARCHAR(MAX)
DECLARE #intSchoolYear_CopyFrom VARCHAR(MAX)
DECLARE #intSchoolYear_CopyTo VARCHAR(MAX)
SET #strTableName = '#TableName'
SET #intSchoolYear_CopyFrom = '2009'
SET #intSchoolYear_CopyTo = '2010'
DECLARE #strSQL VARCHAR(MAX)
SET #strSQL = 'DECLARE #TableName TABLE (ScoringDates VARCHAR(100), SchoolYear INT); UPDATE ' + #strTableName +
' SET ScoringDates = REPLACE(ScoringDates, ''/' + LTRIM(RTRIM(STR(#intSchoolYear_CopyFrom + 1))) + ''', ''/' + LTRIM(RTRIM(STR(#intSchoolYear_CopyFrom + 2))) + ''')' +
' WHERE SchoolYear = ' + LTRIM(RTRIM(STR(#intSchoolYear_CopyTo)))
PRINT #strSQL
EXECUTE (#strSQL)

Correct statement for UNION ALL with three or more selects

I have the following situation:
I have a script consisting of 6 selects joined by "UNION ALL".
From the CLP DB2 console, this script fails. Curiously, each query independently work, and even come to work if grouped in pairs. However, when I try with three or more, it fails.
So, my question is: is there is a limit for more that one UNION ALL?
My environment is:
Client. DB2 Connect server 10.1
zOS 390 (no idea what is the DB2 version on that side)
AIX 7.1
The query is like this (but three times )
SELECT
,'GG'
,varchar(right( '000000000000000' || rtrim(ltrim(eeee.zzzz)), 15), 15)
,substr(char(right('**********'||char(left(replace(eeee.yyyy,' ','*')||'**********',10),10),10),10),1,7)
,eeee.kkkkk
,eeee.hhhhhh
,CASE WHEN hhhhhh='A5 ' THEN 'ARS' WHEN hhhhhh='A6 ' THEN 'AUD' WHEN hhhhhh='B5 ' THEN 'BRL' WHEN hhhhhh='U1 ' THEN 'GBP' WHEN hhhhhh='B9 ' THEN 'BND' WHEN hhhhhh='B6 ' THEN 'BNG' WHEN hhhhhh='C1 ' THEN 'CAD' WHEN hhhhhh='C3 ' THEN 'CLP' WHEN hhhhhh='C4 ' THEN 'CNY' WHEN hhhhhh='C5 ' THEN 'COP' WHEN hhhhhh='C7 ' THEN 'CRC' WHEN hhhhhh='L5 ' THEN 'HRK' WHEN hhhhhh='C9 ' THEN 'CYP' WHEN hhhhhh='X0 ' THEN 'CZK' WHEN hhhhhh='D0 ' THEN 'DKK' WHEN hhhhhh='D1 ' THEN 'DOP' WHEN hhhhhh='U0 ' THEN 'EGP' WHEN hhhhhh='E3 ' THEN 'EUR' WHEN hhhhhh='G5 ' THEN 'GTQ' WHEN hhhhhh='H0 ' THEN 'HTG' WHEN hhhhhh='H3 ' THEN 'HUF' WHEN hhhhhh='I1 ' THEN 'INR' WHEN hhhhhh='I2 ' THEN 'IDR' WHEN hhhhhh='K2 ' THEN 'WON' WHEN hhhhhh='L6 ' THEN 'LVL' WHEN hhhhhh='L7 ' THEN 'LTL' WHEN hhhhhh='M2 ' THEN 'MYR' WHEN hhhhhh='M6 ' THEN 'MXN' WHEN hhhhhh='I8 ' THEN 'ILS' WHEN hhhhhh='N2 ' THEN 'NZD' WHEN hhhhhh='N4 ' THEN 'NIO' WHEN hhhhhh='N6 ' THEN 'NOK' WHEN hhhhhh='T4 ' THEN 'XPF' WHEN hhhhhh='P0 ' THEN 'PKR' WHEN hhhhhh='P1 ' THEN 'PAB' WHEN hhhhhh='P3 ' THEN 'PEN' WHEN hhhhhh='P4 ' THEN 'PHP' WHEN hhhhhh='P5 ' THEN 'PLN' WHEN hhhhhh='R2 ' THEN 'RON' WHEN hhhhhh='U3 ' THEN 'RUB' WHEN hhhhhh='S0 ' THEN 'SAR' WHEN hhhhhh='R6 ' THEN 'RSD' WHEN hhhhhh='S2 ' THEN 'SGD' WHEN hhhhhh='K5 ' THEN 'SKK' WHEN hhhhhh='S4 ' THEN 'ZAR' WHEN hhhhhh='C2 ' THEN 'LKR' WHEN hhhhhh='S8 ' THEN 'SEK' WHEN hhhhhh='S9 ' THEN 'CHF' WHEN hhhhhh='T2 ' THEN 'THB' WHEN hhhhhh='T6 ' THEN 'TRL' WHEN hhhhhh='U4 ' THEN 'USD' WHEN hhhhhh='U6 ' THEN 'UAH' WHEN hhhhhh='U5 ' THEN 'AED' WHEN hhhhhh='U2 ' THEN 'UYU' WHEN hhhhhh='V0 ' THEN 'VEB' WHEN hhhhhh='V1 ' THEN 'VND' WHEN hhhhhh='J1 ' THEN 'JPY' ELSE '###' END
, case when eeee.FCRCIDF='Y' then 1 else 0 end
,
CASE
WHEN SUBSTR(eeee.yyyy,7,1) = 'X' THEN 'X'
WHEN SUBSTR(eeee.yyyy,4,2) = 'O' THEN 'O'
WHEN SUBSTR(eeee.yyyy,4,2) = 'C' THEN 'C'
WHEN SUBSTR(eeee.yyyy,4,2) = 'R' THEN 'R'
WHEN eeee.lll = 'F' THEN 'F'
WHEN eeee.ppp <> '' THEN 'D'
WHEN eeee.rrr = 0 THEN '0'
WHEN eeee.rrr <> eeee.ACINTOT THEN 'P'
WHEN eeee.rrr = eeee.ACINTOT THEN '1'
ELSE '*'
END
,1
,eeee.DCINISS
,0
from (SELECT ori.*,oric.tttt FROM www.SK1V01_CUSTOMER ori left OUTER JOIN www.SK1V01A_CUSTCUF oric
ON ori.bbbb=oric.bbbb and ori.ICUSCNO=oric.ICUSCNO ) as aaaa
,www.SK1V02_OPENBILL eeee,www.SK1V41_OPENBILL kkkk
where aaaa.bbbb=eeee.bbbb
and aaaa.cagllic=eeee.cagllic
and aaaa.icuscno=eeee.icuscno
Without the entire statement its pretty hard to determine exact reasons. An given that just one portion is so long & poorly formatted, I'm not sure we'd want to dig through it all. But I can suggest a few approaches that may help resolve your problem.
Simplest part first. In practically any computer language, well formatted code helps you see the structure of what's going on. It may also help you spot the differences between your queries. (Perhaps you know this, & your code merely lost its formatting when you tried to post it.)
When trying to UNION multiple complex queries, it's not uncommon to have column inconsistencies among the queries. You might have missing or extra columns, or columns out of order. But it's possible some of your column expressions are evaluating to different types. You might want to cast() those expressions, or use type conversion functions, just to be sure.
There's so much going on here. Try testing with a version where you comment out large chunks of code, same on each major subquery, until you find which part is causing the problem.
You have a ridiculously long CASE expression on hhhhhh. Why don't you put these value pairs into a lookup table that you can join to.
Try using a module approach, just as you should when writing a large program. You could create a view for each of the major queries, then UNION them together. (Some developers use layers of views like layers of modular code).
Metadata about your views is available in the database catalog views. This means you could write a query to compare the attributes of the columns in your set of union views.

UPDATE table via join in SQL

I am trying to normalize my tables to make the db more efficient.
To do this I have removed several columns from a table that I was updating several columns on.
Here is the original query when all the columns were in the table:
UPDATE myActDataBaselDataTable
set [Correct Arrears 2]=(case when [Maturity Date]='' then 0 else datediff(d,convert(datetime,#DataDate, 102),convert(datetime,[Maturity Date],102)) end)
from myActDataBaselDataTable
Now I have removed [Maturity Date] from the table myActDataBaselDataTable and it's necessary to retrieve that column from the base reference table ACTData, where it is called Mat.
In my table myActDataBaselDataTable the Account number field is a concatenation of 3 fields in ACTData, thus
myActDataBaselDataTable.[Account No]=ac.[Unit] + ' ' + ac.[ACNo] + ' ' + ac.[Suffix]
(where ac is the alias for ACTData)
So, having looked at the answers given elsewhere on SO (such as 1604091: update-a-table-using-join-in-sql-server), I tried to modify this particular update statement as below, but I cannot get it right:
UPDATE myActDataBaselDataTable
set dt.[Correct Arrears 2]=(
case when ac.[Mat]=''
then 0
else datediff(d,convert(datetime,'2014-04-30', 102),convert(datetime,ac.[Mat],102))
end)
from ACTData ac
inner join myActDataBaselDataTable dt
ON dt.[Account No]=ac.[Unit] + ' ' + ac.[ACNo] + ' ' + ac.[Suffix]
I either get an Incorrect syntax near 'From' error, or The multi-part identifier "dt.Correct Arrears 2" could not be bound.
I'd be grateful for any guidance on how to get this right, or suugestiopns about how to do it better.
thanks
EDIT:
BTW, when I run the below as a SELECT it returns data with no errors:
select case when [ac].[Mat]=''
then 0
else datediff(d,convert(datetime,'2014-04-30', 102),convert(datetime,[ac].[Mat],102))
end
from ACTData ac
inner join myActDataBaselDataTable dt
ON dt.[Account No]=ac.[Unit] + ' ' + ac.[ACNo] + ' ' + ac.[Suffix]
In a join update, update the alias
update dt
What is confusing is that in later versions of SQL you don't need to use the alias in the update line

Count previous occurences of a value split by date ranges

Here's a simple query we do for ad hoc requests from our Marketing department on the leads we received in the last 90 days.
SELECT ID
,FIRST_NAME
,LAST_NAME
,ADDRESS_1
,ADDRESS_2
,CITY
,STATE
,ZIP
,HOME_PHONE
,MOBILE_PHONE
,EMAIL_ADDRESS
,ROW_ADDED_DTM
FROM WEB_LEADS
WHERE ROW_ADDED_DTM BETWEEN #START AND #END
They are asking for more derived columns to be added that show the number of previous occurences of ADDRESS_1 where the EMAIL_ADDRESS matches. But they want is for different date ranges.
So the derived columns would look like this:
,COUNT_ADDRESS_1_LAST_1_DAYS,
,COUNT_ADDRESS_1_LAST_7_DAYS
,COUNT_ADDRESS_1_LAST_14_DAYS
etc.
I've manually filled these derived columns using update statements when there was just a few. The above query is really just a sample of a much larger query with many more columns. The actual request has blossomed into 6 date ranges for 13 columns. I'm asking if there's a better way then using 78 additional update statements.
I think you will have a hard time writing a query that includes all of these 78 metrics per e-mail address without actually creating a query that hard-codes the different choices. However you can generate such a pivot query with dynamic SQL, which will save you some keystrokes and will adjust dynamically as you add more columns to the table.
The result you want to end up with will look something like this (but of course you won't want to type it):
;WITH y AS
(
SELECT
EMAIL_ADDRESS,
/* aggregation portion */
[ADDRESS_1] = COUNT(DISTINCT [ADDRESS_1]),
[ADDRESS_2] = COUNT(DISTINCT [ADDRESS_2]),
... other columns
/* end agg portion */
FROM dbo.WEB_LEADS AS wl
WHERE ROW_ADDED_DTM >= /* one of 6 past dates */
GROUP BY wl.EMAIL_ADDRESS
)
SELECT EMAIL_ADDRESS,
/* pivot portion */
COUNT_ADDRESS_1_LAST_1_DAYS = *count address 1 from 1 day ago*,
COUNT_ADDRESS_1_LAST_7_DAYS = *count address 1 from 7 days ago*,
... other date ranges ...
COUNT_ADDRESS_2_LAST_1_DAYS = *count address 2 from 1 day ago*,
COUNT_ADDRESS_2_LAST_7_DAYS = *count address 2 from 7 days ago*,
... other date ranges ...
... repeat for 11 more columns ...
/* end pivot portion */
FROM y
GROUP BY EMAIL_ADDRESS
ORDER BY EMAIL_ADDRESS;
This is a little involved, and it should all be run as one script, but I'm going to break it up into chunks to intersperse comments on how the above portions are populated without typing them. (And before long #Bluefeet will probably come along with a much better PIVOT alternative.) I'll enclose my interspersed comments in /* */ so that you can still copy the bulk of this answer into Management Studio and run it with the comments intact.
Code/comments to copy follows:
/*
First, let's build a table of dates that can be used both to derive labels for pivoting and to assist with aggregation. I've added the three ranges you've mentioned and guessed at a fourth, but hopefully it is clear how to add more:
*/
DECLARE #d DATE = SYSDATETIME();
CREATE TABLE #L(label NVARCHAR(15), d DATE);
INSERT #L(label, d) VALUES
(N'LAST_1_DAYS', DATEADD(DAY, -1, #d)),
(N'LAST_7_DAYS', DATEADD(DAY, -8, #d)),
(N'LAST_14_DAYS', DATEADD(DAY, -15, #d)),
(N'LAST_MONTH', DATEADD(MONTH, -1, #d));
/*
Next, let's build the portions of the query that are repeated per column name. First, the aggregation portion is just in the format col = COUNT(DISTINCT col). We're going to go to the catalog views to dynamically derive the list of column names (except ID, EMAIL_ADDRESS and ROW_ADDED_DTM) and stuff them into a #temp table for re-use.
*/
SELECT name INTO #N FROM sys.columns
WHERE [object_id] = OBJECT_ID(N'dbo.WEB_LEADS')
AND name NOT IN (N'ID', N'EMAIL_ADDRESS', N'ROW_ADDED_DTM');
DECLARE #agg NVARCHAR(MAX) = N'', #piv NVARCHAR(MAX) = N'';
SELECT #agg += ',
' + QUOTENAME(name) + ' = COUNT(DISTINCT '
+ QUOTENAME(name) + ')' FROM #N;
PRINT #agg;
/*
Next we'll build the "pivot" portion (even though I am angling for the poor man's pivot - a bunch of CASE expressions). For each column name we need a conditional against each range, so we can accomplish this by cross joining the list of column names against our labels table. (And we'll use this exact technique again in the query later to make the /* one of past 6 dates */ portion work.
*/
SELECT #piv += ',
COUNT_' + n.name + '_' + l.label
+ ' = MAX(CASE WHEN label = N''' + l.label
+ ''' THEN ' + QUOTENAME(n.name) + ' END)'
FROM #N as n CROSS JOIN #L AS l;
PRINT #piv;
/*
Now, with those two portions populated as we'd like them, we can build a dynamic SQL statement that fills out the rest:
*/
DECLARE #sql NVARCHAR(MAX) = N';WITH y AS
(
SELECT
EMAIL_ADDRESS, l.label' + #agg + '
FROM dbo.WEB_LEADS AS wl
CROSS JOIN #L AS l
WHERE wl.ROW_ADDED_DTM >= l.d
GROUP BY wl.EMAIL_ADDRESS, l.label
)
SELECT EMAIL_ADDRESS' + #piv + '
FROM y
GROUP BY EMAIL_ADDRESS
ORDER BY EMAIL_ADDRESS;';
PRINT #sql;
EXEC sp_executesql #sql;
GO
DROP TABLE #N, #L;
/*
Now again, this is a pretty complex piece of code, and perhaps it can be made easier with PIVOT. But I think even #Bluefeet will write a version of PIVOT that uses dynamic SQL because there is just way too much to hard-code here IMHO.
*/

DESCENDING/ASCENDING Parameter to a stored procedure

I have the following SP
CREATE PROCEDURE GetAllHouses
set #webRegionID = 2
set #sortBy = 'case_no'
set #sortDirection = 'ASC'
AS
BEGIN
Select
tbl_houses.*
from tbl_houses
where
postal in (select zipcode from crm_zipcodes where web_region_id = #webRegionID)
ORDER BY
CASE UPPER(#sortBy)
when 'CASE_NO' then case_no
when 'AREA' then area
when 'FURNISHED' then furnished
when 'TYPE' then [type]
when 'SQUAREFEETS' then squarefeets
when 'BEDROOMS' then bedrooms
when 'LIVINGROOMS' then livingrooms
when 'BATHROOMS' then bathrooms
when 'LEASE_FROM' then lease_from
when 'RENT' then rent
else case_no
END
END
GO
Now everything in that SP works but I want to be able to choose whether I want to sort ASCENDING or DESCENDING.
I really can't fint no solution for that using SQL and can't find anything in google.
As you can see I have the parameter sortDirection and I have tried using it in multiple ways but always with errors... Tried Case Statements, IF statements and so on but it is complicated by the fact that I want to insert a keyword.
Help will be very much appriciated, I have tried must of the things that comes into mind but haven't been able to get it right.
You could use two order by fields:
CASE #sortDir WHEN 'ASC' THEN
CASE UPPER(#sortBy)
...
END
END ASC,
CASE #sortDir WHEN 'DESC' THEN
CASE UPPER(#sortBy)
...
END
END DESC
A CASE will evaluate as NULL if none of the WHEN clauses match, so that causes one of the two fields to evaluate to NULL for every row (not affecting the sort order) and the other has the appropriate direction.
One drawback, though, is that you'd need to duplicate your #sortBy CASE statement. You could achieve the same thing using dynamic SQL with sp_executesql and writing a 'ASC' or 'DESC' literal depending on the parameter.
That code is going to get very unmanageable very quickly as you'll need to double nest your CASE WHEN's... one set for the Column to order by, and nested set for whethers it's ASC or DESC
Might be better to consider using Dynamic SQL here...
DECLARE #sql nvarchar(max)
SET #sql = '
Select
tbl_houses.*
from tbl_houses
where
postal in (select zipcode from crm_zipcodes where web_region_id = ' + #webRegionID + ') ORDER BY '
SET #sql = #sql + ' ' + #sortBy + ' ' + #sortDirection
EXEC (#sql)
You could do it with some dynamic SQL and calling it with an EXEC. Beware SQL injection though if the user has any control over the parameters.
CREATE PROCEDURE GetAllHouses
set #webRegionID = 2
set #sortBy = 'case_no'
set #sortDirection = 'ASC'
AS
BEGIN
DECLARE #dynamicSQL NVARCHAR(MAX)
SET #dynamicSQL =
'
SELECT
tbl_houses.*
FROM
tbl_houses
WHERE
postal
IN
(
SELECT
zipcode
FROM
crm_zipcodes
WHERE
web_region_id = ' + CONVERT(nvarchar(10), #webRegionID) + '
)
ORDER BY
' + #sortBy + ' ' + #sortDirection
EXEC(#dynamicSQL)
END
GO