What's the most efficient way of querying the database based on a list of values?


I have a list of record IDs that I want to retrieve from SQL Server. I'm trying to figure out the most performant way of doing this. For example, in code I have this:
var recordsToFind = new List<long>{ 12345, 12346, 45756, 42423 ... }
I want to create a stored proc that does this:
Select * From Puzzles where ID = {any of the integers passed in}
I know there are several options, like table-valued parameters, converting the list into a comma-separated string and using CHARINDEX, creating a temp table and splitting the string, etc.
What would be the best approach, keeping in mind this will be used A LOT?
Thanks!

Several must-read articles about this can be found here: http://www.sommarskog.se/arrays-in-sql.html Performance considerations are addressed in those articles.

Do you have the ability to change the code and the stored procedure or are you bound to some limit? If you have the ability to change both, I've seen this done before:
-- Convert the array/list to a comma-delimited string using your own function (we'll call the string strFilter)
-- Pass strFilter into the stored procedure (varchar(1000) maybe; we'll call the parameter @SQLFILTER)
DECLARE @SQL AS VARCHAR(2000) -- Or VARCHAR(MAX)
SET @SQL = 'SELECT * FROM PUZZLES WHERE ID IN (' + @SQLFILTER + ')'
EXEC (@SQL)
I apologize if my syntax is off. My job unfortunately uses Oracle and I haven't used SQL Server a lot in 8 months. Hope this helps.
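As a hedged alternative to concatenating the ID list into the SQL text, here is a minimal sketch that generates one placeholder per value and passes the IDs as parameters. It uses Python's built-in sqlite3 module as a stand-in for SQL Server; the helper name fetch_by_ids and the Name column are assumptions made up for the illustration.

```python
import sqlite3

def fetch_by_ids(conn, ids):
    """Return rows from Puzzles whose ID is in ids, one placeholder per value."""
    if not ids:
        return []  # "IN ()" is not valid SQL, so short-circuit on an empty list
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT ID, Name FROM Puzzles WHERE ID IN ({placeholders}) ORDER BY ID"
    return conn.execute(sql, list(ids)).fetchall()

# Hypothetical sample data; the Name column is not part of the original question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Puzzles (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Puzzles VALUES (?, ?)",
    [(12345, "a"), (12346, "b"), (45756, "c"), (99999, "d")],
)
rows = fetch_by_ids(conn, [12345, 45756, 42423])
```

Only the placeholders are spliced into the SQL string; the values themselves travel as parameters, so the list contents cannot change the structure of the query.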

Micah,
is it out of the question to just do the simple sql 'in()' function? (with obviously your array being used as a parameter, rather than the hardcoded values)
select * From Puzzles where ID in (12345, 12346, 45756, 42423)
maybe i've misunderstood the question mind you :)
[edit] - i notice that this is being used in .net. would it be possible to use linq against the database at all?? if so, then a few well crafted linq (over EF or subsonic for example) methods may do the trick. for example:
var idlist = new int[] { 12345, 12346, 45756, 42423 };
var puzzleids = Puzzles.Where(x => idlist.Contains(x.ID));

Related

extract text from string data using SQL (with handling null values)

I have an issue with extracting text from string data using T-SQL. I have the following data and the text that I would like to extract (the value between the second and third underscore):
sss_ss_d_a -> d
aaa_dd_b -> b
aaa_aa -> NULL
I know that there are a lot of similar topics, but I have a problem especially with handling NULLs and the situation where there is no delimiter after the desired value.
Thank you very much for your help, regards

Try it like this:
--Some sample data in a declared table
DECLARE @tbl TABLE(YourString VARCHAR(100));
INSERT INTO @tbl VALUES('sss_ss_d_a'),('aaa_dd_b'),('aaa_aa');
--The query
SELECT t.YourString
,CAST(CONCAT('<x>',REPLACE(t.YourString,'_','</x><x>'),'</x>') AS XML).value('/x[3]/text()[1]','nvarchar(max)')
FROM @tbl t;
The idea in short:
We replace the underscores with XML tags thus transforming the string to castable XML.
We use XQuery within .value to pick the third element.
Starting with v2016 the recommended approach uses JSON instead
(hint: we use "2" instead of "3" as JSON is zero-based).
,JSON_VALUE(CONCAT('["',REPLACE(t.YourString,'_','","'),'"]'),'$[2]')
The idea is roughly the same.
General hint: Both approaches might need escaping forbidden characters, such as < or & in XML or " in JSON. But this is easy...
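To make the expected results concrete outside the database, here is a small Python sketch of the same "pick the third element" idea. The function name third_token is made up for the illustration, and it is not claimed to match the XML or JSON queries on every edge case (for example trailing delimiters).

```python
def third_token(value, delimiter="_"):
    """Return the token after the second delimiter, or None if there is none."""
    if value is None:
        return None
    parts = value.split(delimiter)
    # parts[2] is the third element; a string with fewer tokens maps to None (NULL)
    return parts[2] if len(parts) > 2 else None

results = [third_token(s) for s in ("sss_ss_d_a", "aaa_dd_b", "aaa_aa")]
```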

FsSql Not working when Parameterizing Columns

Using F#, FsSql and Postgres
So I'm using this function
let getSqlParameter value =
    let uniqueKey = Guid.NewGuid().ToString("N")
    let key = sprintf "@%s" uniqueKey
    (key, Sql.Parameter.make(key, value))
to get me a parameter of anything I pass in dynamically
Which I then append to a query and I get something like this
select * from (select * from mytable) as innerQuery where @a29c575b69bb4629a9971dac2808b445 LIKE '%@9e3485fdf99249e5ad6adb6405f5f5ca%'
Then I take a collection of these and pass them off
Sql.asyncExecReader connectionManager query parameters
The problem that I'm having is that when I don't run this through my parameterization engine, it works fine. When I do, it doesn't work. It just returns empty sets.
The only thing I can think of is that the column names can't be parameterized. This is a problem because they're coming from the client. Is there a way to do this?

Okay so the answer here is that you can't parameterize column names as far as I can tell.
What I ended up doing was creating a whitelist of acceptable column names and then comparing what was coming in against it. If a name isn't on the whitelist, I drop it.
By far a sub-optimal solution. I really wish there was a way to do this.
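A minimal sketch of that whitelist approach, written in Python with the built-in sqlite3 module rather than F# and FsSql. The names ALLOWED_COLUMNS and search and the table layout are assumptions for the illustration. The sketch also puts the % wildcards into the parameter value, since a placeholder written inside a quoted SQL literal is just text and is not substituted.

```python
import sqlite3

ALLOWED_COLUMNS = {"name", "city"}  # hypothetical whitelist of searchable columns

def search(conn, column, pattern):
    """Splice in the column name only if it is whitelisted; bind the value."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"column not allowed: {column!r}")
    sql = f"SELECT name, city FROM mytable WHERE {column} LIKE ?"
    # The wildcards are part of the bound value, not of the SQL text
    return conn.execute(sql, (f"%{pattern}%",)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, city TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [("Ann", "Oslo"), ("Bob", "Rome")])
matches = search(conn, "city", "Osl")
```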

Does PostgreSQL have the equivalent of an Oracle ArrayBind?

Oracle has the ability to do bulk inserts by passing arrays as bind variables. The database then does a separate row insert for each member of the array:
http://www.oracle.com/technetwork/issue-archive/2009/09-sep/o59odpnet-085168.html
Thus if I have an array:
int[] arr = { 1, 2, 3 };
And I pass this as a bind to my SQL:
insert into my_table(my_col) values (:arr)
I end up with 3 rows in the table.
Is there a way to do this in PostgreSQL w/o modifying the SQL? (i.e. I don't want to use the copy command, an explicit multirow insert, etc)

The nearest thing you can use is:
insert into my_table(my_col) SELECT unnest(:arr)

PgJDBC supports COPY, and that's about your best option. I know it's not what you want, and it's frustrating that you have to use a different row representation, but it's about the best you'll get.
That said, you will find that if you prepare a statement then addBatch and executeBatch, you'll get pretty solid performance. Sufficiently so that it's not usually worth caring about using COPY. See Statement.executeBatch. You can create "array bind" on top of that with a trivial function that's a few lines long. It's not as good as server-side array binding, but it'll do pretty well.
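The "array bind on top of a batch" idea can be sketched as follows. This is an analogy in Python using sqlite3's executemany, not PgJDBC code; the helper name insert_array is hypothetical, and the table and column names are taken from the question.

```python
import sqlite3

def insert_array(conn, values):
    """Insert one row per element by running a single statement as a batch."""
    conn.executemany(
        "INSERT INTO my_table (my_col) VALUES (?)",
        [(v,) for v in values],  # one parameter tuple per array element
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_col INTEGER)")
insert_array(conn, [1, 2, 3])
count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
```

The SQL text stays a plain single-row INSERT; only the calling code changes, which is the trade-off described above.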

No, you cannot do that in PostgreSQL.
You'll either have to use a multi-row INSERT or a COPY statement.

I'm not sure which language you're targeting, but in Java, for example, this is possible using Connection.createArrayOf().
Related question / answer:
error setting java String[] to postgres prepared statement

SQL INJECTION and two queries [closed]

Closed 13 years ago.
So, I read an article about SQL injection and there was an example:
SELECT * FROM table_name WHERE smth = 'x';
UPDATE table_name SET smth ='smth@email.addr' WHERE user = 'admin';
Why doesn't it work? Or is it an old article, and this approach is nonsense nowadays? So how do hackers update MySQL then?
Thanks.

Most sites nowadays use parameterized SQL, not inline SQL. The situation above would occur if, for instance, there was inline SQL built by concatenation, similar to the following:
Non-parameterized pseudocode:
string sql = "SELECT * FROM table_name WHERE smth='" + UserInput + "'";
ExecuteSql(sql);
...where UserInput defines an element on the website.
If, instead of entering valid data in the UserInput field, you enter
UserInput = '; DROP TABLE table_name; --
...you would actually be adding new logic to the end of the query, resulting in malicious use of the system.
Parameterized statements eliminate the possibility of SQL injection, since you can't modify the structure of the query by inserting logic into the signature.
If you attempted to set the UserInput field to a malicious query, but the site used parameters in the statement, then you would be out of luck.
Parameterized pseudocode:
Adapter proc;
proc.StoredProcedure = "GetUserNames";
proc.AddParameter("@USER", UserInput);
proc.Execute();
...as @USER is now equal to the literal "'; DROP TABLE table_name; --", which the SQL will treat as a regular ol' parameter.
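A runnable sketch of the parameterized case, using Python's built-in sqlite3 module as an assumed stand-in for the pseudocode adapter above. The injection attempt is bound as an ordinary value, so it is only compared against the smth column and the table survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (smth TEXT)")
conn.execute("INSERT INTO table_name VALUES ('x')")

# The attempted injection is passed as a parameter, never spliced into the SQL
user_input = "'; DROP TABLE table_name; --"
rows = conn.execute(
    "SELECT * FROM table_name WHERE smth = ?", (user_input,)
).fetchall()

# The table still exists and still holds its one row
remaining = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]
```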

It depends on how you execute the code above.
Many programming languages have dedicated database communication classes which you can supply with SQL parameters instead of concatenated strings.
The risk in SQL injection is forgetting to escape some of the user input in your query thus allowing malformed queries to be executed.

The idea behind SQL injection attacks is this:
I have a website that lets users search for information about animals. The user types in the name of the animal, then it runs:
select * from animals where name = '$[what the user typed in]';
so if they type in sheep, the query becomes:
select * from animals where name = 'sheep';
However, what if they type in: sheep'; drop table animals; -- ? If I simply copy what they typed into the query and run it, I'll run:
select * from animals where name = 'sheep'; drop table animals; --';
which would be bad.
These kinds of attacks can still happen if the person setting up the website and database isn't careful to look for and clean up any SQL in what the user enters.

DB 101 warns ardently about SQL injection, so most developers these days are aware of it and prevent it. This is most often done by using some sort of prepared statement where your parameters are injected via a mechanism that prevents arbitrary SQL from being executed. Lazy programming can still lead to vulnerabilities and they're out there, for sure, but blind dynamic SQL building is rarer and rarer.

SQL injection attacks are possible when you have query "templates" and you require user input to fill in some of the query. For example, you might have a PHP script that does something like this:
<?php
$smth_value = $_POST["smth"]; // some form field
$smth_user = $_POST["user"]; // some other form field
$smth_email = $_POST["email"]; // yet another form field
$sql1 = "SELECT * FROM table_name WHERE smth = '".$smth_value."'";
$sql2 = "UPDATE table_name SET smth ='".$smth_email."' WHERE user = '".$smth_user."'";
mysql_query($sql1);
mysql_query($sql2);
?>
If an individual knew the structure of my table (or figured it out somehow), they could "inject" SQL into my queries by putting SQL text into the form fields that results in my SQL variable strings looking like two valid queries separated by a semicolon. For example, someone could type into the "smth" form field something like:
';DELETE FROM table_name WHERE 1=1 OR smth='
and then $sql1 would end up looking like:
SELECT * FROM table_name WHERE smth = '';DELETE FROM table_name WHERE 1=1 OR smth=''
... and there goes all the data from table_name.
That's why there are functions in PHP like mysql_escape_string to help guard strings from these kinds of attacks. If you know a variable is supposed to be a number, cast it to a number. If you have text, wrap it in a string escaping function. That's the basic idea of how to defend.
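The string building described above can be reproduced without a database. The following Python sketch mirrors the $sql1 concatenation from the PHP example and shows the "cast it to a number" defence; build_select and parse_number are hypothetical helper names.

```python
def build_select(smth_value):
    """Naive concatenation, as in the PHP example; shown only to expose the flaw."""
    return "SELECT * FROM table_name WHERE smth = '" + smth_value + "'"

def parse_number(raw):
    """Cast input that should be numeric; anything else raises ValueError."""
    return int(raw)

# The form-field text from the example turns one query into two
injected = build_select("';DELETE FROM table_name WHERE 1=1 OR smth='")
```

parse_number("42") returns 42, while parse_number("1 OR 1=1") raises ValueError before any SQL is built.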

Parameterized SQL Columns?

I have some code which uses parameterized queries to protect against injection, but I also need to be able to dynamically construct the query regardless of the structure of the table. What is the proper way to do this?
Here's an example, say I have a table with columns Name, Address, Telephone. I have a web page where I run Show Columns and populate a select drop-down with them as options.
Next, I have a textbox called Search. This textbox is used as the parameter.
Currently my code looks something like this:
result = pquery('SELECT * FROM contacts WHERE `' + escape(column) + '`=?', search);
I get an icky feeling from it though. The reason I'm using parameterized queries is to avoid using escape. Also, escape is likely not designed for escaping column names.
How can I make sure this works the way I intend?
Edit:
The reason I require dynamic queries is that the schema is user-configurable, and I will not be around to fix anything hard-coded.

Instead of passing the column names, just pass an identifier that your code will translate to a column name using a hardcoded table. This means you don't need to worry about malicious data being passed, since all the data is either translated legally, or is known to be invalid. Pseudo-ish code:
my @columns = qw/Name Address Telephone/;
my $query;
if ($columns[$param]) {
    $query = "select * from contacts where $columns[$param] = ?";
} else {
    die "Invalid column!";
}
run_sql($query, $search);

The trick is to be confident in your escaping and validating routines. I use my own SQL escape function that is overloaded for literals of different types. Nowhere do I insert expressions (as opposed to quoted literal values) directly from user input.
Still, it can be done. I recommend a separate, strict function for validating the column name. Allow it to accept only a single identifier, something like
/^\w[\w\d_]*$/
You'll have to rely on assumptions you can make about your own column names.

I use ADO.NET with SqlCommand objects and SqlParameter values, which take care of the escaping problem. So if you are in a Microsoft tool environment as well, I can say that I use this very successfully to build dynamic SQL and still protect my parameters.
best of luck

Validate the column against the results of another query to a table that enumerates the possible schema values. In that second query you can hardcode the select to the column name that is used to define the schema. If no rows are returned, then the entered column is invalid.
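One way to sketch this lookup, assuming Python's built-in sqlite3 module and its PRAGMA table_info statement as the schema source (other databases expose the same information differently, for example through information_schema). The helper names table_columns and search_contacts are made up for the illustration.

```python
import sqlite3

def table_columns(conn):
    """Read the real column names of contacts from the database itself."""
    # PRAGMA table_info yields one row per column; index 1 is the column name
    return {row[1] for row in conn.execute("PRAGMA table_info(contacts)")}

def search_contacts(conn, column, value):
    if column not in table_columns(conn):
        raise ValueError(f"unknown column: {column!r}")
    # The name is now known to exist; quote it so spaces in names still work
    quoted = '"' + column.replace('"', '""') + '"'
    sql = f"SELECT * FROM contacts WHERE {quoted} = ?"
    return conn.execute(sql, (value,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (Name TEXT, Address TEXT, Telephone TEXT)")
conn.execute("INSERT INTO contacts VALUES ('Ann', '1 Main St', '555-0100')")
found = search_contacts(conn, "Name", "Ann")
```

Because the list of valid names comes from the live schema, nothing has to be hard-coded when a user reconfigures the table.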

In standard SQL, you enclose delimited identifiers in double quotes. This means that:
SELECT * FROM "SomeTable" WHERE "SomeColumn" = ?
will select from a table called SomeTable with the shown capitalization (not a case-converted version of the name), and will apply a condition to a column called SomeColumn with the shown capitalization.
Of itself, that's not very helpful, but...if you can apply the escape() technique with double quotes to the names entered via your web form, then you can build up your query reasonably confidently.
Of course, you said you wanted to avoid using escape - and indeed you don't have to use it on the parameters where you provide the ? place-holders. But where you are putting user-provided data into the query, you need to protect yourself from malicious people.
Different DBMS have different ways of providing delimited identifiers. MS SQL Server, for instance, seems to use square brackets [SomeTable] instead of double quotes.
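A minimal sketch of the double-quote technique, assuming the standard SQL rule that a double quote inside a delimited identifier is written as two double quotes. It uses Python's built-in sqlite3 module for the demonstration; quote_identifier is a hypothetical helper, and a DBMS that uses a different delimiter would need its own variant.

```python
import sqlite3

def quote_identifier(name):
    """Wrap name in double quotes, doubling any embedded double quote."""
    if not name or "\x00" in name:
        raise ValueError("invalid identifier")
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE t ({quote_identifier('Some Column')} TEXT)")
conn.execute("INSERT INTO t VALUES ('v')")

# The identifier is quoted and spliced in; the value is still a bound parameter
sql = f"SELECT * FROM t WHERE {quote_identifier('Some Column')} = ?"
rows = conn.execute(sql, ("v",)).fetchall()
```

Quoting only stops the name from breaking out of the identifier position; it does not check that the column exists, so it works best combined with one of the validation approaches in the other answers.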

Column names in some databases can contain spaces, which means you'd have to quote the column name, but if your database contains no such columns, just run the column name through a regular expression or some sort of check before splicing it into the SQL:
if ( $column !~ /^\w+$/ ) {
    die "Bad column name [$column]";
}
