How to convert PostgreSQL escape bytea to hex bytea? - postgresql

I got an answer on how to check for one specific BOM in a PostgreSQL text column. What I would really like is something more general, i.e. something like
select decode(replace(textColumn, '\\', '\\\\'), 'escape') from tableXY;
The result of a UTF8 BOM is:
\357\273\277
This is the octal (escape) bytea format, which can be converted by switching the bytea output format in pgAdmin:
update pg_settings set setting = 'hex' WHERE name = 'bytea_output';
select '\357\273\277'::bytea
The result is:
\xefbbbf
What I would like to have is this result as one query, e.g.
update pg_settings set setting = 'hex' WHERE name = 'bytea_output';
select decode(replace(textColumn, '\\', '\\\\'), 'escape') from tableXY;
But that doesn't work. The result is empty, probably because the decode cannot handle hex output.

If the final purpose is to get the hexadecimal representation of all the bytes that constitute the strings in textColumn, this can be done with:
SELECT encode(convert_to(textColumn, 'UTF-8'), 'hex') from tableXY;
It does not depend on bytea_output. BTW, this setting plays a role only at the final stage of a query, when a result column is of type bytea and has to be returned in text format to the client (which is the most common case, and what pgAdmin does). It's a matter of representation, the actual values represented (the series of bytes) are identical.
In the query above, the result is of type text, so this is irrelevant anyway.
I think that your query with decode(..., 'escape') can't work because the argument is supposed to be encoded in escape format and it's not; per the comments, it contains normal XML strings.
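To illustrate the point that the hex and escape formats are only two renderings of the same bytes, here is a minimal Python sketch (not PostgreSQL itself). The variable names are made up for this example; it simply prints the UTF-8 BOM both ways.

```python
# The UTF-8 byte order mark is U+FEFF encoded as three bytes.
bom_bytes = "\ufeff".encode("utf-8")

# Hex rendering, comparable to bytea_output = 'hex' without the leading \x.
hex_form = bom_bytes.hex()

# Octal escape rendering, comparable to how bytea_output = 'escape'
# shows non-printable bytes.
escape_form = "".join(f"\\{byte:03o}" for byte in bom_bytes)

print(hex_form)     # efbbbf
print(escape_form)  # \357\273\277
```

Both strings describe the same three bytes, which is why the encode(..., 'hex') query gives a stable result regardless of the bytea_output setting.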

With the great help of Daniel-Vérité, I now use this general query to check for all kinds of BOM or Unicode character problems:
select encode(textColumn::bytea, 'hex'), * from tableXY;
I had a problem in pgAdmin with very long columns: they showed no result. For pgAdmin I used this query instead:
select encode(substr(textColumn,1,100)::bytea, 'hex'), * from tableXY;
Thanks Daniel!

Related

Converting bytea back to varchar

In Postgres when I want to save a varchar to a bytea column, this is made easy by an implicit conversion. So I can simply execute
UPDATE my_table SET my_bytea_col = 'This varchar will be converted' WHERE id = 1;
I use this all the time. However, I would like to occasionally see the contents of this column as a varchar. IDEs will handle this for you, but I would prefer in my use case to return the results with the bytea converted back to a varchar.
Of course I've tried something like this, among more complex options:
select my_bytea_col::VARCHAR from my_table WHERE id = 1
This, however, doesn't return my original readable text. How else can I convert my bytea back to the original varchar after postgres's implicit conversion in updates and inserts like the one above?
If the string encoding is UTF-8, you could use
SELECT convert_from(my_bytea_col, 'UTF8')
FROM my_table
WHERE id = 1;
If the encoding is different, you need to supply the appropriate second argument (e.g. LATIN1) to convert_from.
May I remark that I consider it not a good idea to store text strings as bytea?
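Why the second argument of convert_from matters can be sketched in Python: the same stored bytes only turn back into the original text when they are decoded with the encoding that produced them. The sample string and variable names are assumptions for illustration only.

```python
original = "Grüße"

# Pretend this is what ended up in the bytea column (UTF-8 encoded text).
stored_bytes = original.encode("utf-8")

# Decoding with the matching encoding restores the text.
decoded_ok = stored_bytes.decode("utf-8")

# Decoding with a different encoding succeeds but yields mojibake.
decoded_wrong = stored_bytes.decode("latin-1")

print(decoded_ok == original)     # True
print(decoded_wrong == original)  # False
```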

DB2 Concatenation not working in Informatica

I am using the query below for a DB2 database in the SQ of a mapping and sending the records to a CSV target:
SELECT FIELD1||':'||FIELD2 FROM LIBRARY.FILE
But it returns some hexadecimal value, although it returns the correct number of records.
The same query works fine in SQuirreL.
When I don't use the separator, it works fine (query below):
SELECT FIELD1||FIELD2 FROM LIBRARY.FILE
Any help would be appreciated.
Check the CCSID of the job's user. The CCSID of FIELD1 and FIELD2 is different from that of the string ':'. The FIELD1/FIELD2 CCSID comes from the database, and I think the CCSID of ':' comes from the job. If they are different, DB2 returns the result of the concatenation in EBCDIC. Cast your ':' to the same CCSID as FIELD1:
cast(':' as char(1) CCSID XXXX)
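As a rough illustration of why a code page mismatch makes the output look like hex garbage, this Python sketch encodes the separator in ASCII and in an EBCDIC code page. Using cp037 as the EBCDIC code page is an assumption for the example; the actual CCSID of the job may differ, and this is not what DB2 does internally.

```python
separator = ":"

# The separator as an ASCII client would expect it.
ascii_bytes = separator.encode("ascii")

# The same character in an EBCDIC code page (cp037 assumed here).
ebcdic_bytes = separator.encode("cp037")

# A client that treats the EBCDIC byte as Latin-1 sees a different character.
misread = ebcdic_bytes.decode("latin-1")

print(ascii_bytes.hex(), ebcdic_bytes.hex(), repr(misread))
```

The same character has a different byte value in each code page, so data that is not converted to the client's code page shows up as unreadable or hexadecimal output.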
See also: DB2 query results in Hex format
What happens if you use
select field1 concat ':' concat field2 from yourlib.yourtable
OR
select concat(field1, concat(':', field2)) from yourlib.yourtable
NB: on AS400 there is a tool named CPYTOIMPF for exporting your table to the IFS.
Example:
CPYTOIMPF FROMFILE(yourlib/yourtable) TOSTMF('/yourIFSdir/outputfile.txt') STMFCODPAG(*PCASCII) RCDDLM(':')

How to remove spacing in SQL

I have data in DB2 that I want to insert into SQL.
The DB2 data looks like this:
select char('AAA ') as test from Table_1
But when I select in SQL after doing the insert, the data looks like this:
select test from Table_1
Result:
test
------
AAA
Why is the space character read as a box character? How do I fix this so that the space character is read correctly?
Or is there a setting I need to change? Or do I have to use a parameter?
I use AS400 and DataStage.
Thank you.
DataStage appends pad characters so you know that there are spaces there. The pad character is 0x00 (NUL) by default and that's what you're seeing.
Research the APT_STRING_PADCHAR environment variable; you can set it to something else if you want.
The 0x00 characters are not actually in your database. The short answer is, you can safely ignore it.
When you said:
select char('AAA ') as test from Table_1
You were not actually showing any data from the table. Instead you were showing an expression casting a constant AAA as a character value, and giving that result column the name test which coincidentally seems to be the name of a column in the table, although that coincidence doesn't matter here.
Then your 2nd statement does show the contents of the database column.
select test from Table_1
Find out what the hexadecimal value actually is.
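What NUL padding looks like at the byte level, and why a viewer may draw it as boxes, can be sketched in Python. The column width of 6 and the variable names are hypothetical; the sketch only shows how to inspect the hex value and how the pad bytes differ from real spaces.

```python
value = "AAA"
column_width = 6  # hypothetical fixed width of the target column

# Pad with NUL (0x00), the default pad character described above.
padded = value.ljust(column_width, "\x00")

# Inspect the actual bytes: 0x41 is 'A', 0x00 is the pad, 0x20 would be a space.
print(padded.encode("ascii").hex())  # 414141000000

# Stripping the pad characters recovers the original value.
cleaned = padded.rstrip("\x00")
print(cleaned)  # AAA
```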

HEX where clause in Postgre

I'm new to PostgreSQL.
How do I do this?
select * from table_abc where table_abc.a>=7a and table_abc.b<=7a
All values in columns a and b, and the input value, are hex.
Thanks
EDIT :
table_abc
a bytea
b bytea
c text
Careful, here. In Postgres, bytea is a byte array. You look like you want to store a single byte in those columns.
I don't see a single-byte type in the list of datatypes at http://www.postgresql.org/docs/9.0/static/datatype.html.
You can go with an integer type. For example, when I say this:
select x'7A'::integer
I get 122.
If you intend to store a single byte in these columns and write your queries with hex values, then I suggest you make the columns integers and query like this:
select * from table_abc where table_abc.a>=x'7a'::integer and table_abc.b<=x'7a'::integer
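The same idea, parsing the hex input once and then comparing plain integers, can be sketched in Python. The rows below are made-up sample data, not taken from the question.

```python
# Parse the hex input once; 0x7a is 122 in decimal.
threshold = int("7a", 16)

# Hypothetical rows standing in for table_abc with integer columns a and b.
rows = [
    {"a": 0x7A, "b": 0x10, "c": "kept"},
    {"a": 0x10, "b": 0x7A, "c": "dropped: a too small"},
    {"a": 0xFF, "b": 0xFF, "c": "dropped: b too large"},
]

# Equivalent of: where a >= x'7a'::integer and b <= x'7a'::integer
matches = [row["c"] for row in rows if row["a"] >= threshold and row["b"] <= threshold]

print(threshold)  # 122
print(matches)    # ['kept']
```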

TSQL Prefixing String Literal on Insert - Any Value to This, or Redundant?

I just inherited a project that has code similar to the following (rather simple) example:
DECLARE @Demo TABLE
(
Quantity INT,
Symbol NVARCHAR(10)
)
INSERT INTO @Demo (Quantity, Symbol)
SELECT 127, N'IBM'
My interest is with the N before the string literal.
I understand that the prefix N is to specify encoding (in this case, Unicode). But since the select is just for inserting into a field that is clearly already Unicode, wouldn't this value be automatically upcast?
I've run the code without the N and it appears to work, but am I missing something that the previous programmer intended? Or was the N an oversight on his/her part?
I expect behavior similar to when I pass an int to a decimal field (auto-upcast). Can I get rid of those Ns?
Your test is not really valid. Try something like a Chinese character instead; as I remember it, if you don't prefix the literal, the correct character will not be inserted.
For example, the first query shows a question mark while the second one shows a square:
select '作'
select N'作'
A better example, even here the output is not the same
declare @v nvarchar(50), @v2 nvarchar(50)
select @v = '作', @v2 = N'作'
select @v, @v2
Since this looks like a stock table, why are you using Unicode? Are there even symbols that need Unicode? I have never seen any, and this includes ISINs, CUSIPs and SEDOLs.
Yes, SQL Server will automatically convert (widen) varchar to nvarchar, so you can remove the N in this case. Of course, if you're specifying a string literal where the characters aren't actually present in the database's default collation, then you need it.
It's like you can suffix a number with "L" in C et al to indicate it's a long literal instead of an int. Writing N'IBM' is either being precise or a slave to habit, depending on your point of view.
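Why an unprefixed literal survives for 'IBM' but not for a Chinese character can be approximated in Python. Treating cp1252 as the code page behind the database's default collation is an assumption for this sketch; the real code page depends on the collation in use.

```python
symbol = "IBM"
chinese = "作"

# Characters that exist in the code page pass through unchanged.
ascii_ok = symbol.encode("cp1252", errors="replace")

# Characters missing from the code page are replaced with a question mark.
lossy = chinese.encode("cp1252", errors="replace")

# A Unicode encoding keeps the character intact (two bytes in UTF-16).
unicode_bytes = chinese.encode("utf-16-le")

print(ascii_ok)  # b'IBM'
print(lossy)     # b'?'
```

The question mark corresponds to what the unprefixed select returns: the character is lost before it ever reaches the nvarchar column.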
One trap for the unwary: nvarchar doesn't get automatically converted to varchar, and this can be an issue if your application is all Unicode and your database isn't. For example, we had this with the jTDS JDBC driver, which bound all parameter values as nvarchar, resulting in statements effectively like this:
select * from purchase where purchase_reference = N'AB1234'
(where purchase_reference was a varchar column)
Since the automatic conversions are only one way, that became:
select * from purchase where CONVERT(NVARCHAR, purchase_reference) = N'AB1234'
and therefore the index of purchase_reference wasn't used.
By contrast, the reverse is fine: if purchase_reference was an nvarchar, and an application passed in a varchar parameter, then the rewritten query:
select * from purchase where purchase_reference = CONVERT(NVARCHAR, 'AB1234')
would be fine. In the end we had to disable binding parameters as Unicode, hence causing a raft of i18n problems that were considered less serious.