I need to output a different number of columns depending on the values of some variables in other columns. At its most basic, I have to output either X columns or X+1 columns.
I have created a stored proc, made column X+1 a variable, and built the final query as a string variable, as shown in the code below for illustration purposes.
CREATE PROC MyProc.dynamicProc
AS
BEGIN
DECLARE
@myVar VARCHAR(75)
SELECT @myVar = myVar FROM myTaBLE
.
.
.
.
DECLARE @RESULT VARCHAR(MAX)
SET @RESULT = 'SELECT
COL1,
COL2,
.
.
.
COLN'+
@myVar+
'COLN+1,
.
.
COLN+X
FROM #SOMEtABLE
ORDER BY COL1,
COL2'
EXECUTE (@finalSelect)
END;
GO
When I print @finalSelect, I see that the string is truncated, for example at 'ORDER BY CO'. The length of @finalSelect varies between 4800 and 4820 depending on the value of @myVar. Is this something that anyone has had to deal with before? When I reduce the number of columns, the stored proc works as expected and I get the expected result set without any errors. I know I can shorten the aliases and so on, but then I'm just working around the problem without understanding it. I would appreciate any pointers.
I guess the problem comes from the concatenation with the @myVar variable.
Even if @RESULT is a VARCHAR(MAX), the strings you concatenate are not. So the result of the concatenation is a "small", truncated VARCHAR, and storing it in a "big" VARCHAR(MAX) afterwards no longer helps.
Thus, you should define a VARCHAR(MAX) variable for each part you concatenate and then assign the concatenation to @RESULT.
I have an array parameter that needs to represent the column names for a report. I have a command that queries the database for data and pivots it, and I want to pass in the columns. The part I am having trouble with is adding the parameter values to the temp table that holds the column names.
This example does not work, but I hope it gives an idea of what I am after.
So, the ?Fields parameter is my array.
Thanks!!
-- test parameters
declare @tab table(field varchar(100))
-- insert into @tab values('John'),
-- ('Sarah'),('George')
insert into @tab
VALUES ({?Fields})
SELECT * FROM @tab
I have a dynamic SQL statement I'm building up that is something like
my_interval_var := interval '60 days';
sql_query := format($f$ AND NOT EXISTS (SELECT 1 FROM some_table st WHERE st.%s = %s.id AND st.expired_date >= %L )$f$, var1, var2, now() - my_interval_var);
Regarding my first question, the timestamp seems to be inserted correctly after the now() - my_interval_var computation. However, I want to make sure I don't need to cast anything, because the only way I could get it to work was with %L, the string literal placeholder. Or does Postgres allow direct comparisons with strings that represent a time without a cast? For example:
some_column <= '2021-12-31 00:00:00'; -- is a ::timestamp cast needed?
Secondly, regarding the sql_query variable that I built with format above: I actually wanted to skip that format call and inject the sql_query variable directly into an EXECUTE ... format ... USING statement.
I couldn't get it to work, but something like this:
EXECUTE format($f$ SELECT *
FROM %I tbl_alias
WHERE tbl_alias.%s = %L
%s $f$) USING var1, var2, var3, sql_query;
Is it possible to leave the format placeholders %I, %L and %s inside the variable and format it at the EXECUTE level? Something tells me this isn't possible, but it would be really cool.
One last question, which I hesitated to add, but I feel someone might have a quick answer.
I was using a FOR loop over a dynamic query:
FOR temprecord IN
EXECUTE format('SELECT myCol1, myCol2, myCol3
FROM %I tbl', var1)
LOOP
EXECUTE temprecord.someColumnOnMyTbl;
END LOOP;
...but I could not for the life of me get the EXECUTE temprecord.someColumnOnMyTbl statement to work once I made the query dynamic. I tried everything: identifiers, format, USING...
I thought columns were strings like %s, because I do that for columns all the time when they are aliased, like alias.%s = 'some string literal'.
Anyway, I couldn't get it to work. I wanted to make the column name dynamic and tried all of these:
EXECUTE format($f$ %I.%s $f$, var1, var2);
EXECUTE format($f$ %$1.%$2 $f$) USING var1, var2;
EXECUTE format($f$ %I.someColumnOn%s $f$, var1, var2);
EXECUTE format($f$ $1.someColumnOn$2 $f$) USING var1, var2;
I tried more than that. I actually got some data from the DB when I made the temprecord variable an %I, but I am selecting 3 columns and it looked like something went wrong with the second identifier: I got a syntax error, and it looked like it was trying to concatenate all 3 columns of the query result...
I did try hardcoding it and that worked fine. Any help appreciated!
A string literal is a value of unknown type. Postgres always casts it to some target binary format, and the type is deduced from context. When you use the format function with the %L placeholder, any binary value is converted to a string and escaped as a Postgres string literal (protection against syntax errors and SQL injection). When you use the USING clause, the binary value is passed directly to the executor. That is a little faster, and no information can be lost in a cast to string. Apart from these points, the real effect of %L and the USING clause is almost the same.
Your variable is of type timestamp. The expired_date column is probably of type date, so some timestamp -> date conversion is necessary.
The format function is just a string function: it only builds a string. For better readability it supports placeholders that ensure correct escaping and a correct resulting SQL string. %L is the same as calling quote_literal (except that a NULL is rendered as NULL, as quote_nullable does), and %I is the same as quote_ident (for column and table names). %s inserts a string without escaping or quoting. The result of format (when you use it in an EXECUTE command) should be a valid SQL statement. You can also pass it to RAISE NOTICE to print the result to the debug output, which is usually a good idea:
DECLARE
query text;
x date DEFAULT current_date;
y int;
BEGIN
query := format('.... WHERE inserted = $1', ...);
RAISE NOTICE 'dynamic query will be: %', query;
EXECUTE query USING x INTO y;
...
The USING clause allows parameters in dynamic SQL (the EXECUTE command). Usually, format's placeholders should be used for table or column names, and USING for everything else.
For the date and timestamp types (basic scalar types), the following two statements behave the same in 99.99% of cases:
EXECUTE format('select count(*) from foo where inserted = %L', current_date) INTO ..
EXECUTE 'select count(*) from foo where inserted = $1' USING current_date INTO ..
You cannot use query parameters in column name or table name positions. This is a limit of the USING clause. But in all other cases, this clause should be preferred.
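The same split between identifiers and values exists in most client libraries, so it can be tried outside Postgres. The sketch below is only an analogy in Python with the standard sqlite3 module, not PL/pgSQL. quote_ident and count_rows are hypothetical helpers written for this illustration, the table foo and its inserted column are borrowed from the example above, and the ? placeholder plays the role of $1 with USING.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Hypothetical helper: wrap an identifier in double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def count_rows(conn: sqlite3.Connection, table: str, column: str, value: str) -> int:
    # Identifiers cannot be bound as parameters, so they are quoted into the string.
    sql = f"SELECT count(*) FROM {quote_ident(table)} WHERE {quote_ident(column)} = ?"
    # The value is bound as a parameter and never becomes part of the SQL text.
    return conn.execute(sql, (value,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (inserted TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?)",
    [("2021-12-31",), ("2021-12-31",), ("2022-01-01",)],
)

n = count_rows(conn, "foo", "inserted", "2021-12-31")
print(n)
```

Because the value travels separately from the statement text, a hostile value such as x' OR '1'='1 is compared as plain data and simply matches no rows.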
I have two queries that split a comma-separated list into rows and insert them into a table variable.
For the first query I used a custom function:
User-defined function for split:
CREATE FUNCTION [dbo].[Split_S]
(
@sInputList VARCHAR(MAX)
,@sDelimiter VARCHAR(8)
)
RETURNS @List TABLE ([item] VARCHAR(8000))
AS
BEGIN
DECLARE @sItem VARCHAR(MAX)
WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0
BEGIN
SELECT
@sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1)))
,@sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList))))
IF LEN(@sItem) > 0
INSERT INTO @List SELECT @sItem
END
IF LEN(@sInputList) > 0
INSERT INTO @List SELECT @sInputList -- Put the last item in
RETURN
END
Query 1 :
DECLARE @F TABLE(F BIGINT)
INSERT INTO @F
SELECT [item] FROM [dbo].[Split_S]
(N'82,13,51,68,6',',')
Query 2 :
DECLARE @F2 TABLE(F BIGINT)
INSERT INTO @F2
SELECT Value
from
STRING_SPLIT(N'82,13,51,68,6',',')
Query plans of both queries:
Why is the cost of the first query 37%, while the one using STRING_SPLIT is 63%?
But if I compare only the SELECT statements, the query cost of STRING_SPLIT is 1%.
Which query has better performance, and why?
If you check only the SELECT part of the query, you will find that STRING_SPLIT performs much better according to the execution plan (EP): the result will be 99% vs 1%.
But when the data returned by STRING_SPLIT is consumed (for example by "SELECT ... INTO" or, as in your case, "INSERT ... SELECT"), you might notice that the server uses a Table Spool (Eager Spool), which makes the difference. This operator takes the rows and stores them in a hidden temporary object in the tempdb database (the idea is that the spooled data can be reused later in the execution plan). The eager spool takes ALL rows from the previous operator at once, which means it is a blocking operator.
I've got a PostGIS database of points in Postgres, and I would like to extract the points in several geographically distinct areas to CSV files, one file per area.
I have set up an area table with area polygons, and area titles and I would like to effectively loop through that table, using something like Postgis' st_intersects() to select the data to go in each CSV file, and get the filename for the CSV file from the title in the area table.
I'm comfortable with the details of doing the intersection code, and setting up the CSV output - what I don't know is how to do it for each area. Is it possible to do something like this with some sort of join? Or do I need to do it with a stored procedure, and use a loop construct in plpgsql?
You can loop over rows in your area table in plpgsql. But be careful to get quoting of identifiers and values right:
Assuming this setup:
CREATE TABLE area (
title text PRIMARY KEY
, area_polygon geometry
);
CREATE TABLE points(
point_id serial PRIMARY KEY
, the_geom geometry);
You can use this plpgsql block:
DO
$do$
DECLARE
_title text;
BEGIN
FOR _title IN
SELECT title FROM area
LOOP
EXECUTE format('COPY (SELECT p.*
FROM area a
JOIN points p ON ST_INTERSECTS(p.the_geom, a.area_polygon)
WHERE a.title = %L) TO %L (FORMAT csv)'
, _title
, '/path/to/' || _title || '.csv');
END LOOP;
END
$do$;
Use format with %L (for string literals) to get properly quoted strings and to avoid syntax errors and possible SQL injection. (You still need to use strings in area.title that work for file names.)
Also be careful to quote the filename as a whole, not just the title part of it.
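To see what quoting the filename as a whole means in practice, here is a rough Python model of the string that format builds. quote_literal below is a simplified stand-in for %L written for this illustration (it only doubles single quotes, while the real function also deals with backslashes and NULL), and build_copy_command is a hypothetical helper. The title "St. John's" is made-up sample data.

```python
def quote_literal(value: str) -> str:
    """Simplified model of %L: double embedded single quotes, then wrap in quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_copy_command(title: str, directory: str = "/path/to/") -> str:
    # The path is concatenated first and quoted once, as a whole.
    path = directory + title + ".csv"
    return (
        "COPY (SELECT p.* FROM area a "
        "JOIN points p ON ST_Intersects(p.the_geom, a.area_polygon) "
        "WHERE a.title = " + quote_literal(title) + ") "
        "TO " + quote_literal(path) + " (FORMAT csv)"
    )


command = build_copy_command("St. John's")
print(command)
```

The apostrophe in the title is doubled in both places, so it stays inside its literal instead of ending the string early.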
You must concatenate the whole command as string. The "utility command" COPY does not allow variable substitution. That's only possible with the core DML commands SELECT, INSERT, UPDATE, and DELETE. See:
Error when setting n_distinct using a plpgsql variable
So don't read out area.area_polygon in the loop. It would have to be cast to text to concatenate it into the query string, where the text representation would be cast back to geometry (or whatever your actual undisclosed data type is). That's prone to errors.
Instead I only read area.title to uniquely identify the row and handle the rest in the query internally.
You can use a plpgsql function or an inline do (if you only need to do it once and you don't want to store a function.)
DO $body$
DECLARE i text;  -- city names are text, not int
BEGIN
FOR i IN SELECT DISTINCT city FROM foo
LOOP
RAISE NOTICE 'exporting %', i;
-- %L quotes both the value and the complete file name
EXECUTE format($$COPY (SELECT * FROM foo WHERE city = %L) TO %L$$, i, '/tmp/' || i);
END LOOP;
END;
$body$ LANGUAGE plpgsql;
I have a TSQL sproc that does three loops in order to find relevant data. If the first loop renders no results, then the second one normally does. I append another table that has multiple values that I can use later on.
So at most I should only have two tables returned in the dataset from the sproc.
The issue is that if the first loop is blank, I then end up with three data tables in my data set.
In my C# code, I can remove this empty table, but would rather not have it returned at all from the sproc.
Is there a way to remove the empty table from within the sproc, given the following:
EXEC (@sqlTop + @sqlBody + @sqlBottom)
SET @NumberOfResultsReturned = @@ROWCOUNT;
.
.
.
IF @NumberOfResultsReturned = 0
BEGIN
SET @searchLoopCount = @searchLoopCount + 1
END
ELSE
BEGIN
-- we have data, so no need to run again
BREAK
END
The process goes as follows: On the first loop there could be no results. Thus the rowcount will be zero because the EXEC executes a dynamically created SQL query. That's one table.
In the next iteration, results are returned, making that two data tables in the dataset output, plus my third one added on the end.
I didn't want to run a COUNT(*) first and then run the query only if the count is > 0, as I want to minimize the number of queries.
Thanks.
You can put the result of your query in a table variable and then check whether the table variable has any data in it.
Something like this, with a dynamic query that returns one integer column:
declare @T table(ID int)
declare @SQL varchar(25)
-- Create dynamic SQL
set @SQL = 'select 1'
-- Insert result from @SQL to @T
insert into @T
exec (@SQL)
-- Check for data
if not exists(select * from @T)
begin
-- No data continue loop
set @searchLoopCount = @searchLoopCount + 1
end
else
begin
-- Have data so we need to query the data
select *
from @T
-- Terminate loop
break
end