I'm trying to use select pg_catalog.hashtext(?) via JDBC PreparedStatement, and running into a weird behavior.
For most strings it works fine, e.g. the following randomly generated string:
"Fm_:VW:<jBGOl$K "
and I get the correct hash back: 641495800
But for some strings, it returns a hash that doesn't match the value I get when I query the DB directly via psql or some other tool such as DataGrip.
For instance, this works fine:
"}F:d(2 dS8xt9KP0$~tYw;R(V"!2[7&Xs2Wj#5 k|F[}%.ZQ^93~
Cuk&93d!t8b|{4F&{1j{.;C},1s/b&wYZ Ckc5vqy|e+5&5EW%RQ6F0>R4#h.6$iU>{=kl!{e(CTH^DvN/<eG9 bjHx#9=&& G$W_Y =! j\q3T;[H.ve-~>S5j8eI.gWQmg. C!WpWK0z>f?^^LLMO:3R';!4eVxU2)~1F6Zs!p0 F'1b*G:xBO5cN{O'1P~
fj5g%IcT}]w ;;DlD Q~D=wT qN7zON]/J9Heh3qwJ #n qMTG\M7#h,8JUP3Sl}L:wb7#bRc&eIWp\z>HuwZI2Ej5;v7M _8DU.d?mvD| !rS!XS;8QQYh6D=BMJ5m2$>cR ob#'{dCOr#NzDk c!JtQbzCg&#dG:qtHy)O4 ohWQ`ed
2 O'HmHt\<SO
gHKAo`WIb"HF\LrpKKDsW -e##v%RS+,-61lze bd|tyl);A0h":O40O71b(0cDM57gTFL~[7ksp
_Nx:"
But this doesn't:
".4X$!S"s
3E&fJZP*yC#6 ii7^D%Nj3Qn(]:&ykP3(%9 Ww}| ZOmcZ:(w<d= On/m\)vfAEu)s:Yy<17:l9GImT!BgH,FG(:DanwL|3'#XS
a_+nwbqPYBu[DWW`VbBKzF%CnaYpH "
Now, I tried using Statement instead of PreparedStatement along with a String concatenated query, and that works fine, as long as I escape single-quote characters (') with two-single quotes ('') before executing the query. So it appears that somehow PreparedStatement.setString is doing something weird with the String that I pass to it.
Note: The reason I'm testing this with random strings is because my code needs to be able to work with any UTF-8 string that's thrown at it. This test only uses ASCII, and it's already failing in some cases. I don't want to use Statement as that opens up a whole different discussion.
I have a table test(id,name).
I need to insert values like: user's log, 'my user', customer's.
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
I am getting an error if I run any of the above statements.
If there is any method to do this correctly please share. I don't want any prepared statements.
Is it possible using sql escaping mechanism?
String literals
Escaping single quotes ' by doubling them up → '' is the standard way and works of course:
'user's log' -- incorrect syntax (unbalanced quote)
'user''s log'
Plain single quotes (ASCII / UTF-8 code 39), mind you, not backticks `, which have no special purpose in Postgres (unlike certain other RDBMS) and not double-quotes ", used for identifiers.
In old versions or if you still run with standard_conforming_strings = off or, generally, if you prepend your string with E to declare Posix escape string syntax, you can also escape with the backslash \:
E'user\'s log'
Backslash itself is escaped with another backslash. But that's generally not preferable.
If you have to deal with many single quotes or multiple layers of escaping, you can avoid quoting hell in PostgreSQL with dollar-quoted strings:
'escape '' with '''''
$$escape ' with ''$$
To further avoid confusion among dollar-quotes, add a unique token to each pair:
$token$escape ' with ''$token$
Which can be nested any number of levels:
$token2$Inner string: $token1$escape ' with ''$token1$ is nested$token2$
Pay attention if the $ character should have special meaning in your client software. You may have to escape it in addition. This is not the case with standard PostgreSQL clients like psql or pgAdmin.
That is all very useful for writing PL/pgSQL functions or ad-hoc SQL commands. It cannot alleviate the need to use prepared statements or some other method to safeguard against SQL injection in your application when user input is possible, though. @Craig's answer has more on that. More details:
SQL injection in Postgres functions vs prepared queries
Values inside Postgres
When dealing with values inside the database, there are a couple of useful functions to quote strings properly:
quote_literal() or quote_nullable() - the latter outputs the unquoted string NULL for null input.
There is also quote_ident() to double-quote strings where needed to get valid SQL identifiers.
format() with the format specifier %L is equivalent to quote_nullable().
Like: format('%L', string_var)
concat() or concat_ws() are typically no good for this purpose as those do not escape nested single quotes and backslashes.
According to PostgreSQL documentation (4.1.2.1. String Constants):
To include a single-quote character within a string constant, write
two adjacent single quotes, e.g. 'Dianne''s horse'.
See also the standard_conforming_strings parameter, which controls whether escaping with backslashes works.
This is so many worlds of bad, because your question implies that you probably have gaping SQL injection holes in your application.
You should be using parameterized statements. For Java, use PreparedStatement with placeholders. You say you don't want to use parameterised statements, but you don't explain why, and frankly it has to be a very good reason not to use them because they're the simplest, safest way to fix the problem you are trying to solve.
See Preventing SQL Injection in Java. Don't be Bobby's next victim.
There is no public function in PgJDBC for string quoting and escaping. That's partly because it might make it seem like a good idea.
There are built-in quoting functions quote_literal and quote_ident in PostgreSQL, but they are for PL/PgSQL functions that use EXECUTE. These days quote_literal is mostly obsoleted by EXECUTE ... USING, which is the parameterised version, because it's safer and easier. You cannot use them for the purpose you explain here, because they're server-side functions.
Imagine what happens if you get the value ');DROP SCHEMA public;-- from a malicious user. You'd produce:
insert into test values (1,'');DROP SCHEMA public;--');
which breaks down to two statements and a comment that gets ignored:
insert into test values (1,'');
DROP SCHEMA public;
--');
Whoops, there goes your database.
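To make the failure concrete, here is a minimal Python sketch of the naive string concatenation that produces the statement above. The helper name build_insert_unsafe is hypothetical and exists only for illustration; it shows what not to do.

```python
def build_insert_unsafe(row_id: int, name: str) -> str:
    """Build an INSERT by pasting the value into the SQL text (unsafe)."""
    # The value is copied verbatim, so a quote inside it closes the literal.
    return f"insert into test values ({row_id},'{name}');"


malicious = "');DROP SCHEMA public;--"
print(build_insert_unsafe(1, malicious))
```

The printed text is the two-statements-plus-comment string shown above, which is why the value has to travel separately from the SQL text.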
In PostgreSQL, if you want to insert a value containing ', you have to double it, i.e. write an extra ':
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');
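If the statement text really has to be built by hand, the doubling rule is mechanical. The following is only a sketch: it assumes standard_conforming_strings = on (so backslashes need no extra handling), and sql_literal is a hypothetical helper name, not part of any driver. A driver's placeholders remain the safer route.

```python
def sql_literal(value: str) -> str:
    """Wrap a string in single quotes, doubling any quote inside it."""
    return "'" + value.replace("'", "''") + "'"


# Rebuild the three statements from the question.
for row_id, name in enumerate(["user's log", "'my users'", "customer's"], start=1):
    print(f"insert into test values ({row_id},{sql_literal(name)});")
```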
You can use the PostgreSQL chr(int) function:
insert into test values (2, chr(39) || 'my users' || chr(39));
When I used Python to insert values into PostgreSQL, I also ran into this error: column "xxx" does not exist.
Then I found the reason on wiki.postgresql:
PostgreSQL uses only single quotes for this (i.e. WHERE name = 'John'). Double quotes are used to quote system identifiers; field names, table names, etc. (i.e. WHERE "last name" = 'Smith').
MySQL uses ` (accent mark or backtick) to quote system identifiers, which is decidedly non-standard.
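It means PostgreSQL accepts only single quotes around string values; double quotes mark field names, table names, etc. So a value wrapped in double quotes is read as a column name, and a bare single quote inside a value ends the string early.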
My situation is: I want to insert values "the difference of it’s adj for sb and it's adj of sb" into PostgreSQL.
How I worked around this problem:
I replaced ' with ’ and " with ', because a PostgreSQL string value cannot be delimited by double quotes. Note that this changes the stored text; doubling the quote ('') keeps it intact.
So I think you can use the following code to insert the values:
insert into test values (1,'user’s log');
insert into test values (2,'my users');
insert into test values (3,'customer’s');
If you need to get the work done inside Pg:
to_json(value)
https://www.postgresql.org/docs/9.3/static/functions-json.html#FUNCTIONS-JSON-TABLE
You have to add an extra single quote, i.e. double each single quote (' becomes ''). This is the standard way and works, of course:
Wrong way: 'user's log'
Right way: 'user''s log'
problem:
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
Solutions:
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');
I'm working on a prototype that uses Postgres as its backend. I don't do a lot of SQL, so I'm feeling my way through it. I made a .pgsql file I run with psql that executes each of many files that set up my database, and I use a variable to define the schema that will be used so I can test features without mucking up my "good" instance:
\set schema_name 'example_schema'
\echo 'The Schema name is' :schema_name
\ir sql/file1.pgsql
\ir sql/file2.pgsql
This has been working well. I've defined several functions that expand :schema_name properly:
CREATE OR REPLACE FUNCTION :schema_name.get_things_by_category(...
For reasons I can't figure out, this isn't working in my newest function:
CREATE OR REPLACE FUNCTION :schema_name.update_thing_details(_id uuid, _details text)
RETURNS text
LANGUAGE 'plpgsql'
AS $BODY$
BEGIN
UPDATE :schema_name.things
...
The syntax error indicates it's interpreting :schema_name literally after UPDATE instead of expanding it. How do I get it to use the variable value instead of the literal value here? I get that maybe within the BEGIN..END is a different context, but surely there's a way to script this schema name in all places?
I can think of three approaches, since psql cannot do this directly.
Shell script
Use a bash script to perform the variable substitution and pipe the results into psql, like:
#!/bin/bash
# The schema name is the first command-line argument
schemaName=$1
# Replace every #SCHEMA_NAME# placeholder and feed the result to psql
sed -e "s/#SCHEMA_NAME#/$schemaName/g" script.sql | psql
This would probably be a lot of boiler plate if you have a lot of .sql scripts.
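The same placeholder substitution can be sketched in Python, which makes it easy to reject odd schema names before they reach psql. Everything here is an assumption for illustration: the render_sql name, the #SCHEMA_NAME# placeholder convention borrowed from the script above, and the conservative lowercase identifier pattern.

```python
import re

# Conservative pattern for an unquoted identifier (an assumption of this sketch).
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def render_sql(template: str, schema_name: str) -> str:
    """Replace every #SCHEMA_NAME# placeholder with a validated schema name."""
    if not _IDENTIFIER.fullmatch(schema_name):
        raise ValueError(f"unsafe schema name: {schema_name!r}")
    return template.replace("#SCHEMA_NAME#", schema_name)


template = "CREATE OR REPLACE FUNCTION #SCHEMA_NAME#.update_thing_details(_id uuid, _details text)"
print(render_sql(template, "example_schema"))
```

Because the replacement happens on the raw text before psql ever sees it, it also applies inside a dollar-quoted function body.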
Staging Schema
Keep the approach you have now with a hard-coded schema of something like staging and then have a bash script go and rename staging to whatever you want the actual schema to be.
Customize the search path
Your entry point could be an inline script within bash that is piped into psql, does an up-front update of the default connection schema, then uses \ir to include all of your .sql files, which should not specify a schema.
#!/bin/bash
schemaName=$1
psql <<SCRIPT
SET search_path TO $schemaName;
\ir sql/file1.pgsql
\ir sql/file2.pgsql
SCRIPT
Some details: How to select a schema in postgres when using psql?
Personally I am leaning towards the latter approach as it seems the simplest and most scalable.
The documentation says:
Variable interpolation will not be performed within quoted SQL literals and identifiers. Therefore, a construction such as ':foo' doesn't work to produce a quoted literal from a variable's value (and it would be unsafe if it did work, since it wouldn't correctly handle quotes embedded in the value).
Now the function body is a “dollar-quoted” string literal ($BODY$...$BODY$), so the variable will not be replaced there.
I can't think of a way to do this with psql variables.
I asked a similar question here about an hstore value with a space, and it was solved by user Clodoaldo Neto. Now I have come across the next case: a string containing a single quote.
SELECT 'k=>"name", v=>"St. Xavier's Academy"'::hstore;
I tried it by using dollar-quoted string constant by reading http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
SELECT 'k=>"name", v=>$$St. Xavier's Academy$$'::hstore;
But I couldn't get it right.
How can I create a PostgreSQL hstore from strings containing a single quote?
It seems like more such exceptions are possible for this query. How can I address them all at once?
You can escape the embedded single quote the same way you'd escape any other single quote inside a string literal: double it.
SELECT 'k=>"name", v=>"St. Xavier''s Academy"'::hstore;
-- ------------------------------^^
Alternatively, you could dollar quote the whole string:
SELECT $$k=>"name", v=>"St. Xavier's Academy"$$::hstore;
Whatever interface you're using to talk to PostgreSQL should be taking care of these quoting and escaping issues. If you're using manual string wrangling to build your SQL then you should be using your driver's quoting and placeholder methods.
hstore's internal parsing understands double quotes around keys:
Double-quote keys and values that include whitespace, commas, =s or >s.
Dollar quoting is, as you noted, for SQL string literals; hstore's parser doesn't know what it means.
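To handle "all such exceptions at once", the two layers can be kept apart: build the hstore text with hstore's own rules, then hand the whole string to the driver as a bound parameter so SQL-level quoting never comes up. A rough Python sketch of the first layer follows; hstore_text is a hypothetical helper, and it relies on hstore's rule that a double quote or backslash inside a quoted key or value is escaped with a backslash.

```python
def hstore_text(pairs: dict) -> str:
    """Render a dict as hstore text: "key"=>"value", ..."""

    def quote(item: str) -> str:
        # Inside hstore's double quotes, escape backslash and double quote.
        return '"' + item.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return ", ".join(
        # A Python None becomes an unquoted NULL, i.e. a SQL NULL value.
        f"{quote(key)}=>{'NULL' if value is None else quote(value)}"
        for key, value in pairs.items()
    )


print(hstore_text({"k": "name", "v": "St. Xavier's Academy"}))
```

The single quote passes through untouched here because it has no meaning to hstore's parser; it only matters when the text is embedded in a SQL string literal.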
I have very complex data that I'm inserting into PostgreSQL, and I am using double dollar signs ($$) to quote it. However, I have one row which ends with a dollar sign, and it is causing an error.
The original row is like 'dd^d\w=dd$' and when escaped '$$dd^d\w=dd$$$'.
How can I escape this specific row?
Use any string inside the double dollar to differentiate it:
select $anything$abc$$anything$;
?column?
----------
abc$
The insert is similar:
insert into t (a, b) values
($anything$abc$$anything$, $xyz$abc$$xyz$);
INSERT 0 1
select * from t;
a | b
------+------
abc$ | abc$
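When the literals are generated by a program, the tag can be picked automatically. Below is a minimal Python sketch under stated assumptions: dollar_quote and the base tag q are hypothetical names, and the function simply tries q, q1, q2, ... until "$" plus the tag does not occur in the text, which also covers text that ends in a dollar sign.

```python
import itertools


def dollar_quote(text: str, base: str = "q") -> str:
    """Wrap text in a dollar-quote tag that cannot collide with the text."""
    for n in itertools.count():
        tag = base if n == 0 else f"{base}{n}"
        # If "$tag" never appears in the text, the closing "$tag$" cannot
        # match early, not even when the text itself ends with "$".
        if "$" + tag not in text:
            return f"${tag}${text}${tag}$"


print(dollar_quote("dd^d\\w=dd$"))
print(dollar_quote("already has $q$ inside"))
```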
I found this question while troubleshooting a problem with executing a query with dollar signs in a literal from within a Linux shell. For example, select '$abc$' in psql gives the correct result $abc$, while psql -U me -c "select '$abc$'" called from the Linux shell produces the incorrect result $ (provided there's no shell variable abc).
In that case, wrapping in another delimiter ($wrapper$$abc$$wrapper$) won't help, since the primary problem is the shell interpreting the dollar signs. A possible solution is escaping the dollars (psql -U me -c "select '\$abc\$'"); however, this produces literal backslashes when run inside psql. To get the same query usable in both psql and the Linux shell, psql -U me -c "select concat(chr(36),'abc',chr(36))" is a universal solution.
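If the command line is assembled by a program rather than typed, another option (not the chr(36) approach above, but plain POSIX shell quoting) is to let Python's shlex module quote the query so the shell leaves the dollar signs alone. This is a sketch only; psql_command is a hypothetical helper and the user name me is taken from the example above.

```python
import shlex


def psql_command(sql: str, user: str = "me") -> str:
    """Return a shell command line that passes sql to psql -c unchanged."""
    # shlex.join quotes each argument for a POSIX shell, so $abc is not expanded.
    return shlex.join(["psql", "-U", user, "-c", sql])


command = psql_command("select '$abc$'")
print(command)
# Splitting the line the way a POSIX shell would gives back the original query.
assert shlex.split(command)[-1] == "select '$abc$'"
```

The query text itself stays exactly as it would be typed inside psql, so one string serves both contexts.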
While Clodoaldo is quite right I think there's another aspect of this you need to look at:
Why are you doing the quoting yourself? You should generally be using parameterised ("prepared") statements via your programming language's client driver. See bobby tables for some examples in common languages. Using parameterised statements means you don't have to care about doing any quoting at all anymore, it's taken care of for you by the database client driver.
I'd give you an example but you haven't mentioned your client language/tools.
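For a rough idea of what that looks like, here is a small self-contained Python sketch. It is an assumption-laden stand-in: it uses the standard library's sqlite3 module with an in-memory database instead of a Postgres connection so that it runs anywhere, and insert_names is a hypothetical helper. With a Postgres driver such as psycopg the placeholder is written %s instead of ?, but the idea is the same.

```python
import sqlite3


def insert_names(names):
    """Insert each name via placeholders and return the stored rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real Postgres connection
    conn.execute("create table test (id integer, name text)")
    # Values are bound to the placeholders; no manual quoting or escaping.
    conn.executemany(
        "insert into test values (?, ?)",
        list(enumerate(names, start=1)),
    )
    rows = conn.execute("select id, name from test order by id").fetchall()
    conn.close()
    return rows


for row in insert_names(["user's log", "'my users'", "');DROP SCHEMA public;--"]):
    print(row)
```

Every value, including the hostile one, comes back exactly as it went in, because it was never parsed as SQL.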