PostgreSQL hash a column using SHA256 and a pre-defined salt

PostgreSQL hash a column using SHA256 and a pre-defined salt - postgresql

I'm using: PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit running on AWS as Aurora (RDS).
I have managed to get the below working
SELECT encode(digest('email#example.com', 'sha256'),'hex');
But I need to use a salt provided by a 3rd party. Let's say for arguments sake the salt is 'this_is_my_salt'. How can I use this? i could only find examples that generate the salt using algorithms.
The vendor wants us to use their hash to compare email addresses in their database with ours. They haven't specified their Database system, but showed me their query which is:
SELECT 'email#example.com' as unhashed_email, sha2('shared_salt_value' || lower('email#example.com')) as hashed_email
And this produces a different hash to me trying the example in postgres using one of the answers below:
SELECT encode(digest('email#example.com' || 'shared_salt_value', 'sha256'),'hex');
My hash starts with db17e....
Their hash starts with b6c84....
Could it be encoding or something causing the difference?

That is trivial, just concatenate the string with the salt:
SELECT encode(digest('this_is_my_salt' || 'email#example.com', 'sha256'),'hex');

Related

Is there such a thing as a "PostgreSQL-SQL-to-OtherDB-SQL" converter in PHP, or at all?

I use PostgreSQL exclusively. I have no plans to ever change this. However, I recognize that other people are not me, and they instead use MySQL, MS SQL, IBM SQL, SQLite SQL, Oracle SQL and ManyOthers SQL. I'm aware that they have different names in reality.
My queries look like:
SELECT * FROM table WHERE id = $1;
UPDATE table SET col = $1 WHERE col = $2;
INSERT INTO table (a, b, c) VALUES ($1, $2, $3);
My database wrapper functions currently support only PostgreSQL, by internally calling the pg_* functions.
I wish to support "the other databases" too. This would involve (the trivial part) to make my wrapper functions able to interact with the other databases by using the PHP functions for those.
The difficult part is to reconstruct the PostgreSQL-flavor SQL queries from the application into something that works identically yet will be understood by the other SQL database in use, such as MySQL. This obviously involves highly advanced parsing, analysis and final creation of the final query string. For example, this PostgreSQL SQL query:
SELECT * FROM table WHERE col ILIKE $1 ORDER BY random() LIMIT 1;
... will be turned into WeirdSQL like this:
SELECT * FROM table WHERE col ISEQUALTOKINDA %1 ORDER BY rnd() LIMIT 1;
I don't require support from any other input SQL flavor than PostgreSQL, but the output must be "all the big SQL database vendors".
Has anyone even attempted this? Or is it something that is never gonna happen as free software but might exist as a commercial offering? It sounds like it would be a thing. It would be insanely useful, and "crazier" projects have been attempted.

jOOQ is a Java library that aims to hide differences between databases. It has its own SQL grammar which tries to be compatible with everything (but parameter markers must be the JDBC ?), and generates DB-specific SQL from that.
There is an online translator, which generates the following from your query for Oracle:
select *
from table
where lower(cast(col as varchar2(4000))) like lower(cast(:1 as varchar2(4000)))
order by DBMS_RANDOM.RANDOM
fetch next 1 rows only

ODBC uses its own syntax on top on the database's syntax. ODBC drivers are required to convert ODBC parameter markers (?) to whatever the database uses, and to translate escape sequences for certain elements that are likely to have a non-standard syntax in the DB (time/GUID/interval literals, LIKE escape character, outer joins, procedure calls, function calls).
However, most escape sequences are optional, and this does not help with other syntax differences, such as the LIMIT 1.
ODBC drivers provide a long list of information about SQL syntax details, but it is the application's job to construct queries that conform to those restrictions, and not all differences can be described by this list. In practice, most ODBC applications restrict themselves to a commonly supported subset of SQL.

Postgres 11.4 crypt extension issue in postgres

i am using pgcrypto extension for password encryption in my PostgreSQL db. i am using same key to encrypt all passwords. When i use same key in different passwords(different strings) it gives same output.
Samples:
db=# select crypt('Sharon_1','alpha');
crypt
---------------
aljp4LCkDT1k.
(1 row)
Time: 2.025 ms
db=# select crypt('Sharon_1trgstysa','alpha');
crypt
---------------
aljp4LCkDT1k.
(1 row)
why is it like that?. When i pass two different strings it should give different encrypted strings as output.Is this a bug?. How can i solve this ? i can't change the key. The key should be always same.
Postgres version:
db=# select version();
PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28),
64-bit
Extension version:
db=# \dx pgcrypto
List of installed extensions
Name | Version | Schema | Description
----------+---------+--------+-------------------------
pgcrypto | 1.3 | public | cryptographic functions

pgcrypt is meant for something else:
Calculates a crypt(3)-style hash of password. When storing a new
password, you need to use gen_salt() to generate a new salt value. To
check a password, pass the stored hash value as salt, and test whether
the result matches the stored value.
The following CTE encrypts the passwords using a md5 salt algorithm and the select compares a given password with the ones in the CTE:
WITH j (val) AS (
VALUES
(crypt('Sharon_1',gen_salt('md5'))),
(crypt('Sharon_1trgstysa',gen_salt('md5')))
)
SELECT
val = crypt('Sharon_1',val), -- entered password to compare!
val -- stored password
FROM j;
?column? | val
----------+------------------------------------
t | $1$XpqL58HA$k2G55BjtVFQxHVe/jpu.2.
f | $1$0OIuDMkZ$PH2cDjG.aRzUAvtUtvf3E1
(2 Zeilen)
To encrypt and decrypt with symmetric PGP keys try pgp_sym_encrypt and pgp_sym_decrypt, e.g.
WITH j (val) AS (
VALUES
(pgp_sym_encrypt('Sharon_1','alpha')),
(pgp_sym_encrypt('Sharon_1trgstysa','alpha'))
)
SELECT pgp_sym_decrypt(val,'alpha') FROM j;
pgp_sym_decrypt
------------------
Sharon_1
Sharon_1trgstysa
(2 Zeilen)

The algorithm to be used by crypt is embedded in the format of the salt. Your salt "alpha" doesn't specify an algorithm, so crypt uses des. des only looks at the first 8 characters of the password (and the first 2 characters of the salt), and your two passwords don't differ in the first 8 characters.
i can't change the key.
Then your system is broken by design.

PostgreSQL hstore concatenation

According to TFM (Postgres docs), you use the normal concatenation operator to join two HSTOREs:
SELECT 'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore
Result:
"a"=>"b", "c"=>"x", "d"=>"q"
However, I'm getting an error when running that exact same command:
[42883] ERROR: operator does not exist: hstore || hstore Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
The only way I can get the desired result is to do something so hackish it makes me want to cry: converting any HSTOREs I have into text first, then concatenating the text and converting back to an HSTORE. This, of course, isn't good as the docs also state that if there are duplicate keys, there's no guarantee as to which one will survive the concatenation.
Before I file a bug with the Postgres folks, can anybody else duplicate this? Version info:
select version();
PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit
select extname,extversion from pg_catalog.pg_extension;
hstore 1.3

Figured out the issue. The HSTORE extension was installed into a separate schema; I was able to call that schema by identifying it (sys.hstore), but it was still not liking the operator ||. The fix was actually pretty simple: I added sys to the search_path.

How to access a HSTORE column using PostgreSQL C library (libpq)?

I cannot find any documentation regarding HSTORE data access using the C library. Currently I'm considering to just convert the HSTORE columns into arrays in my queries but is there a way to avoid such conversions?

libpqtypes appears to have some support for hstore.
Another option is to avoid directly interacting with hstore in your code. You can still benefit from it in the database without dealing with its text representation on the client side. Say you want to fetch a hstore field; you just use:
SELECT t.id, k, v FROM thetable t, LATERAL each(t.hstorefield);
or on old PostgreSQL versions you can use the quirky and nonstandard set-returning-function-in-SELECT form:
SELECT t.id, each(t.hstorefield) FROM thetable t;
(but watch out if selecting multiple records from t this way, you'll get weird results wheras LATERAL will be fine).
Another option is to use hstore_to_array or hstore_to_matrix when querying, if you're comfortable dealing with PostgreSQL array representation.
To create hstore values you can use the hstore constructors that take arrays. Those arrays can in turn be created with array_agg over a VALUES clause if you don't want to deal with PostgreSQL's array representation in your code.
All this mess should go away in future, as PostgreSQL 9.4 is likely to have much better interoperation between hstore and json types, allowing you to just use the json representation when interacting with hstore.

The binary protocol for hstore is not complicated.
See the _send and _recv functions from its IO code.
Of course, that means requesting (or binding) it in binary format in libpq.
(see the paramFormats[] and resultFormat arguments to PQexecParams)

SQL Server OpenQuery() behaving differently then a direct query from TOAD

The following query works efficiently when run directly against Oracle 11 using TOAD (with native Oracle drivers)
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date('12/31/9999','mm/dd/yyyy')
and rgn_nm = 'Boston'
) ...
;
The exact same query "never" returns if passed from SQL Server 2008 to the same Oracle database via openquery(). SQL Server has a link to the Oracle database using an Oracle Provider OLE DB driver.
select * from openquery( servername, '
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
The query doesn't return in a reasonable amount of time, and the user kills the query. I don't know if it would eventually return with the correct result.
This result where the direct TOAD query works efficiently and the openquery() version "never" returns is reproducible.
A small modification to the openquery() gives the correct efficient result: Change eff_endt to trunc(eff_endt).
That is well and good, but it doesn't seem like the change should be necessary.
openquery() is supposed to be pass through, so how can there be a difference between the TOAD and openquery() behavior?
The reason we care is because we frequently develop complex queries with TOAD directly accessing Oracle. Once we have the query functioning and optimized, we convert it to an openquery() string for use in a SQL Server application. It is extremely aggravating to have a query suddenly fail with openquery() when we know it worked as a direct query. Then we have to search for a work-around through trial and error.
I would like to see the Oracle trace files for the two scenarios, but the Oracle server is within another organization, and we are not getting cooperation from the Oracle DBAs.
Does anyone know of any driver, or TOAD, or ??? issues that could account for the discrepancy? Is there any way to eliminate the problem such that both methods always give the same result?

I know you asked this a while ago but I just came across your question.
I agree, they should be the same. Obviously there is a difference. We need to find out where the difference is.
I am thinking out loud as I type...
What happens if you specify just a few column instead of select * from openquery?
How many rows are supposed to be returned?
What if, in the oracle select, you limit the returned rows?
How quickly does the openquery timeout?
Are TOAD and SS on the same machine? Are you RDPing into the SS and running toad from there?
Are they using the same drivers? including bit? (32/64) version?
Are they using the same account on oracle?
It is interesting that using the trunc() makes a difference. I assume [eff_endt] is one of the returned fields?
I am wondering if SS is getting all the rows back but it is choking on doing the date conversions. The date type in oracle may need to be converted to a ss date type before ss shows it to you.
What if you insert the rows from the openquery into a table where the date field is just a (n)varchar. I am thinking ss might just dump the date it is getting back from oracle into that text field without trying to convert it.
something like:
insert into mytable(f1,f2,f3,datetimeX)
select f1,f2,f3,datetimeX from openquery( servername, '
select f1,f2,f3,datetimeX from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
What if toad or ss is modifying the query statement before sending it to oracle. You could fire up wireshark and see what toad and ss are actually sending.
I would be very curious if you get this resolved. I link ss to oracle often and have not run into this issue.

Here are basic things you can check for to see what the database is doing after it receives the query. First, check that the execution plans are the same in TOAD as when the query runs using openquery. You could plan the query yourself in TOAD using:
explain plan set statement_id = 'openquery_test' for <your query here>;
select *
from table(dbms_xplan.display(statement_id => 'openquery_test';
then have someone initiate the query using openquery() and have someone with permissions to view v$ tables to run:
select sql_id from v$session where username = '<user running the query>';
(If there's more than one connection with the same user, you'll have to find an additional attribute to isolate the row representing the session running the query.)
select *
from table(dbms_xplan.display_cursor('<value from query above'));
If those look the same then I'd move on to checking database waits and see what it's stuck on.
select se.username
, sw.event
, sw.p1text
, sw.p2text
, sw.p3text
, sw.wait_time_micro/1000000 as seconds_in_wait
, sw.state
, sw.time_since_last_wait_micro/1000000 as seconds_since_last_wait
from v$session se
inner join
v$session_wait sw
on se.sid = sw.sid
where se.username = '<user running the query>'
;
(again, if there's more than one session with the same username, you'll need another attribute to whittle it down to the one you're interested in.)
If the plans are different, then you need to find out why, or if they're the same, look into what it's waiting on (e.g. SQL*Net message to client ?) and why.

I noticed a difference using OLEDB through MS Access (2013) connecting to Oracle 10g & 11g tables, in that it did not always recognize indexes or primary keys on the Oracle tables properly. The same query through an MS Access 2000 database (using odbc) worked fine / had no problem with indexes & keys. The only way I found to fix the OLEDB version was to include all of the key fields in the SELECT -- which was not a satisfying answer, but it's all I could find. This might be an option to try through SSMS / OpenQuery(...) as well.
Besides that... you can try some alternatives to OPENQUERY, such as:
4-part names: SELECT ... FROM Server..Schema.Table
Execute AT: EXEC ('select...') at linked server
But as for why the OLEDB provider works differently than the native Oracle Provider -- the providers are not identical, and the native provider would be more likely to pave-over Oracle quirks than the more generic OLEDB provider would.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse