I'm running this query on PostgreSQL 9.2:
SELECT ar.nome_defensor, count(*)
FROM sirdp.atividade_realizadas ar
INNER JOIN sirdp.naturezas n on n.id = ar.natureza_id
INNER JOIN sirdp.atividades at on at.id = n.atividade_id
WHERE ar.data_atividade between '01/08/2011' and '31/08/2014'
and ar.local_atuacao_defensor in ('1ª Vara de Acara\303\272')
group by ar.nome_defensor
order by ar.nome_defensor
It doesn't work on 9.2, but it works on 9.0.
I think it has something to do with the parameter 1ª Vara de Acara\303\272, because the problem only occurs with accented words.
Both databases have this configuration:
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'pt_BR.UTF-8'
LC_CTYPE = 'pt_BR.UTF-8'
CONNECTION LIMIT = -1;
I think it has something to do with the parameter 1ª Vara de Acara\303\272,
because the problem only occurs with accented words.
Yes. PostgreSQL 9.0 and earlier had the configuration parameter standard_conforming_strings set to OFF by default, which means that this string literal:
'1ª Vara de Acara\303\272'
was interpreted in the context of UTF-8 encoding as: 1ª Vara de Acaraú
Since PostgreSQL 9.1, standard_conforming_strings has been ON by default, so the backslash is now interpreted as just a backslash. This is explained in the documentation:
standard_conforming_strings (boolean)
This controls whether ordinary string literals ('...') treat backslashes literally, as specified in the SQL standard. Beginning in
PostgreSQL 9.1, the default is on (prior releases defaulted to off).
Applications can check this parameter to determine how string literals
will be processed. The presence of this parameter can also be taken as
an indication that the escape string syntax (E'...') is supported.
Escape string syntax (Section 4.1.2.2) should be used if an
application desires backslashes to be treated as escape characters.
You can work around this in any of these ways:
Using 1ª Vara de Acaraú directly in the query. After all, the ª character is already outside the US-ASCII range, so the way you submit the query already supports accented characters.
Using E'1ª Vara de Acara\303\272' in the query.
Not changing the query, but setting standard_conforming_strings to OFF to switch back to the behavior of PostgreSQL 9.0 and earlier versions.
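As a rough illustration of why the two settings produce different strings, here is a small Python sketch. Python is used only for illustration and is not part of the original setup. It assumes, as described above, that the escape-string parser turns each \ooo octal escape into one byte and that the resulting bytes are then read as UTF-8.

```python
# Illustrative model only: how the literal body is seen under each setting.
typed = r"1ª Vara de Acara\303\272"   # the characters as typed between the quotes

# standard_conforming_strings = on: backslashes are ordinary characters,
# so the value compared against the column is exactly what was typed.
as_plain_literal = typed

# standard_conforming_strings = off (or E'...'): each \ooo octal escape
# becomes one byte, and the byte sequence is then read as UTF-8.
prefix, *octal_tokens = typed.split("\\")
escaped_bytes = bytes(int(tok, 8) for tok in octal_tokens)
as_escape_literal = prefix + escaped_bytes.decode("utf-8")

print(repr(as_plain_literal))
print(repr(as_escape_literal))
```

The first value still contains the eight characters \303\272, so it cannot match a row that stores Acaraú; the second one ends in ú.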
I inserted a bunch of rows with a text field like content='...\n...\n...'.
I didn't use e in front, like content=e'...\n...\n...', so now \n is not actually displayed as a newline - it's printed as text.
How do I fix this, i.e. how to change every row's content field from '...' to e'...'?
The syntax variant E'string' makes Postgres interpret the given string as Posix escape string. \n encoding a newline is only one of many interpreted escape sequences (even if the most common one). See:
Insert text with single quotes in PostgreSQL
To "re-evaluate" your Posix escape string, you could use a simple function with dynamic SQL like this:
CREATE OR REPLACE FUNCTION f_eval_posix_escapes(INOUT _string text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT E''' || _string || '''' INTO _string;
END
$func$;
WARNING 1: This is inherently unsafe! We have to evaluate input strings dynamically without quoting and escaping, which allows SQL injection. Only use this in a safe environment.
WARNING 2: Don't apply repeatedly. Or it will misinterpret your actual string with genuine \ characters, etc.
WARNING 3: This simple function is imperfect as it cannot cope with nested single quotes properly. If you have some of those, consider instead:
Unescape a string with escaped newlines and carriage returns
Apply:
UPDATE tbl
SET content = f_eval_posix_escapes(content)
WHERE content IS DISTINCT FROM f_eval_posix_escapes(content);
Note the added WHERE clause to skip updates that would not change anything. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
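To make WARNING 2 concrete, here is a minimal Python sketch of what goes wrong when escapes are interpreted twice. The helper unescape_once is hypothetical and models only \n, \t, \r and \\, not the full set that PostgreSQL's E'' syntax understands.

```python
import re

# Only a handful of escapes are modeled; PostgreSQL's E'' syntax knows more.
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}

def unescape_once(s: str) -> str:
    """Interpret backslash escapes in s exactly one time."""
    return re.sub(r"\\([ntr\\])", lambda m: _ESCAPES[m.group(1)], s)

stored = r"first\nsecond"        # backslash + 'n' stored as two characters
fixed = unescape_once(stored)    # now contains a real newline

genuine = r"C:\\new"             # escaped backslash followed by "new"
once = unescape_once(genuine)    # C:\new (one genuine backslash)
twice = unescape_once(once)      # the backslash + 'n' is eaten as a newline

# Analogue of the WHERE ... IS DISTINCT FROM filter: only touch rows that change.
rows = ["plain text", stored]
to_update = [r for r in rows if unescape_once(r) != r]
```

After one pass the value with a genuine backslash is correct; a second pass silently turns it into "C:", a newline, and "ew".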
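Use REPLACE in an UPDATE query. Something like this (I'm on mobile, so please ignore any typos or syntax errors):
UPDATE tbl
SET
content = REPLACE(content, '\n', E'\n');
This relies on standard_conforming_strings being on, so that '\n' is a literal backslash followed by n.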
Assume a value contains all of the Windows special characters, including "-", and cannot be inserted directly into MongoDB.
Where can I find MongoDB's restrictions on special characters?
How do I escape the value when saving or retrieving it?
For example, do we need to pass the value within brackets ([...])? How do I insert and retrieve a record that contains special characters?
Thanks.
Values in MongoDB can be any UTF-8 string. Escaping would be the responsibility of your client program. Depending on which client program / language driver you are using, you would need to use the necessary escape character(s) for that language.
Since you mentioned that you're using Jongo, that means that you're using Java and the mongo-java-driver. In Java, you can just use the unicode escape sequence for the character you're trying to use. For example, \u2014 is the em-dash character.
For example:
DB db = new MongoClient().getDB("test");
Jongo jongo = new Jongo(db);
MongoCollection collection = jongo.getCollection("mycollection");
collection.insert("{fieldWithDash: 'x-y', fieldWithEmDash: 'x\u2014y'}");
Test test = collection.findOne("{fieldWithEmDash: 'x\u2014y'}").as(Test.class);
System.out.println(test);
Passing the value as a separate parameter resolved the problem:
MongoCollection.findOne(query, parameter)
With the query and parameter written out explicitly:
MongoCollection.findOne("{ empName: # }", empName)
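The reason a bound parameter helps can be sketched outside of Jongo. The following is only an analogy in plain Python using the standard json module, not Jongo's actual implementation; the function names naive_query and bound_query are hypothetical.

```python
import json

def naive_query(emp_name: str) -> str:
    # String concatenation: special characters in emp_name leak into the query text.
    return '{ "empName": "' + emp_name + '" }'

def bound_query(emp_name: str) -> str:
    # The serializer escapes quotes, backslashes and non-ASCII characters for us.
    return json.dumps({"empName": emp_name})

name = 'x\u2014y "quoted" \\ value'

try:
    json.loads(naive_query(name))
    naive_ok = True
except ValueError:
    # The embedded quote ends the string early, so the document no longer parses.
    naive_ok = False

round_tripped = json.loads(bound_query(name))["empName"]
```

When the value is handed over separately, the layer that builds the query decides how to escape it, so the caller never has to.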
I need to update a record, which contains literal percent signs, using PostgreSQL in Railo. The query looks like
<cfquery>
update foo set bar = 'string with % in it %'
</cfQuery>
It throws an error because ColdFusion normally interprets it as a wildcard character. I can escape it using the following query:
<cfquery>
update foo set bar = 'string with escaped \% in it \%'
</cfQuery>
However, the record now contains "\%" in the database and will be displayed on the page as "\%".
I found documentation with an example of escaping the percent sign in a SELECT, but it does not work for me; it fails with: syntax error at or near "ESCAPE".
SELECT emp_discount
FROM Benefits
WHERE emp_discount LIKE '10\%'
ESCAPE '\';
Is there a better way to achieve the same goal? The underlying database is PostgreSQL. Thanks!
Query parameters (cfqueryparam in ColdFusion/Railo) take care of special characters for you. Yet another reason to use them.
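A minimal sketch of the idea, assuming nothing about Railo itself: it uses Python's built-in sqlite3 module and an in-memory database instead of PostgreSQL, purely to show that a bound value is stored unchanged and that a LIKE pattern with an escaped percent sign can be bound the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (bar TEXT)")
conn.execute("INSERT INTO foo (bar) VALUES ('old value')")

value = "string with % in it %"
# The value travels separately from the SQL text, so no escaping is needed.
conn.execute("UPDATE foo SET bar = ?", (value,))
stored = conn.execute("SELECT bar FROM foo").fetchone()[0]

# For LIKE, the pattern can be bound too; ESCAPE names the escape character.
pattern = "%\\%%"   # any text, a literal percent sign, any text
matches = conn.execute(
    "SELECT count(*) FROM foo WHERE bar LIKE ? ESCAPE '\\'", (pattern,)
).fetchone()[0]

print(stored, matches)
```

The stored value contains the percent signs exactly as given, with no backslashes added.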
I'm working with PostgreSQL 9.0,
and I have a table in which I need to replace a character with '' (blank).
For that I'm using:
update species set engname = replace(engname, '', '');
In this case, species is the table and engname is the field (character varying).
The contents of one of the rows are:
" -tellifer fÂÂrthii"
Even after running the query, the character is not replaced.
I have also tried
update species set sciname = regexp_replace(sciname, '', '')
but the character still does not get replaced.
My database is defined as:
CREATE DATABASE myDB
WITH OWNER = Myadmin
ENCODING = 'SQL_ASCII'
TABLESPACE = pg_default
LC_COLLATE = 'C'
LC_CTYPE = 'C'
CONNECTION LIMIT = -1;
We are planning to move to UTF-8 encoding, but the conversion with iconv fails because of this character,
so I wanted to replace it first.
Can anyone tell me how to remove that character?
This symbol can stand in for many different characters, so you cannot use replace() to remove it. Your client application probably uses a different encoding than the database, and the symbol signals text that could not be decoded correctly.
The solution is to use the correct encoding:
postgres=# select * from ff;
a
───────────────
žluťoučký kůň
(1 row)
postgres=# set client_encoding to 'latin2'; --setting wrong encoding
SET
postgres=# select * from ff; -- and you can see strange symbols
a
───────────────
�lu�ou�k� k�
(1 row)
postgres=# set client_encoding to 'utf8'; -- setting good encoding
SET
postgres=# select * from ff;
a
───────────────
žluťoučký kůň
(1 row)
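The same effect can be reproduced outside the database. This is a small Python sketch, assuming the client receives LATIN2 (ISO-8859-2) bytes while the terminal expects UTF-8, as in the session above.

```python
text = "žluťoučký kůň"

# Bytes as a LATIN2 client connection would receive them.
latin2_bytes = text.encode("iso-8859-2")

# A UTF-8 terminal cannot decode those bytes; each bad byte becomes U+FFFD.
garbled = latin2_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding that was actually used recovers the text.
recovered = latin2_bytes.decode("iso-8859-2")

print(garbled)
print(recovered)
```

The replacement symbol carries no information about the original byte, which is why it cannot be targeted with a plain replace on the stored data.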
Another solution is to replace national or special characters with their closest ASCII equivalents.
PostgreSQL 9.x has the unaccent contrib module for UTF-8, and for some 8-bit encodings there is the to_ascii() function.
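For a feel of what such a replacement does, here is a rough Python analogue built on Unicode decomposition. It is not the rule set used by unaccent or to_ascii(); the helper name strip_accents is hypothetical.

```python
import unicodedata

def strip_accents(s: str) -> str:
    # Split each accented letter into base letter + combining marks,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

ascii_like = strip_accents("žluťoučký kůň")
print(ascii_like)
```

This only works once the text is decoded correctly in the first place, so it complements fixing the encoding rather than replacing it.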
I'm using a bytea type in PostgreSQL, which, to my understanding, contains just a series of bytes. However, I can't get it to play well with nulls. For example:
=# select length(E'aa\x00aa'::bytea);
length
--------
2
(1 row)
I was expecting 5. Also:
=# select md5(E'aa\x00aa'::bytea);
md5
----------------------------------
4124bc0a9335c27f086f24ba207a4912
(1 row)
That's the MD5 of "aa", not "aa\x00aa". Clearly, I'm Doing It Wrong, but I don't know what I'm doing wrong. I'm also on an older version of Postgres (8.1.11) for reasons outside of my control. (I'll see if this behaves the same on the latest Postgres as soon as I get home...)
Try this:
# select length(E'aa\\000aa'::bytea);
length
--------
5
Updated: Why didn't the original work? First, understand the difference between one backslash and two:
pg=# select E'aa\055aa', length(E'aa\055aa') ;
?column? | length
----------+--------
aa-aa | 5
(1 row)
pg=# select E'aa\\055aa', length(E'aa\\055aa') ;
?column? | length
----------+--------
aa\055aa | 8
In the first case, I'm writing a string literal with four unescaped characters ('a') and one escaped character. The backslash is consumed by the parser in a first pass, which converts the full \055 to a single character ('-' in this case).
In the second case, the first backslash just escapes the second: the pair \\ is translated by the parser to a single \, and the 055 is seen as three ordinary characters.
Now, when converting text to bytea, escape sequences (in already parsed or produced text) are parsed and interpreted again! (Yes, this is confusing.)
So, when I write
select E'aa\000aa'::bytea;
in the first parsing, the literal E'aa\000aa' is converted to an internal text value with a null character in the third position. Depending on your PostgreSQL version, the null character is either interpreted as an end-of-string marker, so the text is assumed to have length two, or an illegal string error is thrown.
Instead, when I write
select E'aa\\000aa'::bytea;
in the first parsing, the literal is seen as the string "aa\000aa" (eight characters) and is assigned to a text value; then, in the cast to bytea, it is parsed again, and the character sequence '\000' is interpreted as a null byte.
IMO postgresql kind of sucks here.
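The two passes described above can be sketched as a toy model in Python. This is not PostgreSQL's actual parser: the hypothetical helpers parse_escape_literal and text_to_bytea handle only \\ and \ooo, and the end-of-string behavior of old servers is modeled by cutting the text at the first NUL.

```python
import re

def parse_escape_literal(body: str) -> str:
    """Pass 1: model E'...' handling of \\\\ and \\ooo (other escapes omitted)."""
    def repl(m):
        tok = m.group(1)
        return "\\" if tok == "\\" else chr(int(tok, 8))
    return re.sub(r"\\(\\|[0-7]{1,3})", repl, body)

def text_to_bytea(text: str) -> bytes:
    """Pass 2: model bytea escape-format input (\\\\ and \\ooo only)."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out += text[i].encode("utf-8")
            i += 1
        elif text[i + 1:i + 2] == "\\":
            out.append(0x5C)          # escaped backslash
            i += 2
        elif re.fullmatch(r"[0-3][0-7]{2}", text[i + 1:i + 4]):
            out.append(int(text[i + 1:i + 4], 8))   # \ooo -> one byte
            i += 4
        else:
            raise ValueError("invalid bytea escape")
    return bytes(out)

# Typed as E'aa\\000aa': pass 1 leaves the 8 characters aa\000aa,
# pass 2 turns \000 into a single zero byte.
double = text_to_bytea(parse_escape_literal(r"aa\\000aa"))

# Typed as E'aa\000aa': pass 1 already produces a NUL character; old servers
# then treat it as end-of-string (modeled here by cutting at the NUL).
single_text = parse_escape_literal(r"aa\000aa").split("\x00")[0]
single = text_to_bytea(single_text)

print(len(double), len(single))
```

In this model the doubled backslash yields five bytes and the single backslash yields two, matching the lengths reported in the question and the answer.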
You can use regular strings or dollar-quoted strings instead of escaped strings:
# select length('aa\000aa'::bytea);
length
════════
5
(1 row)
# select length($$aa\000aa$$::bytea);
length
════════
5
(1 row)
I think that dollar-quoted strings are a better option because, if the configuration parameter standard_conforming_strings is off, then PostgreSQL recognizes backslash escapes in both regular and escape string constants. However, as of PostgreSQL 9.1, the default is on, meaning that backslash escapes are recognized only in escape string constants.