Why are these resulting symmetric encryption values different? - sql-server-2008-r2

I'm using something like this:
OPEN SYMMETRIC KEY SSNKey
    DECRYPTION BY CERTIFICATE SSNCert;

UPDATE Customers
SET SSNEncrypted = EncryptByKey(Key_GUID('SSNKey'), 'DecryptedSSN')
Where SSNEncrypted is a varbinary column. I noticed the values come out different each time. Why is this? And what can I do to get consistent encrypted values, so I can compare them in different tables?

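This is "by design".
The function EncryptByKey is nondeterministic: each call uses a fresh random initialization vector, so encrypting the same plaintext twice gives two different ciphertexts.
If you decrypt any of those different values, you always get the original plaintext back.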
Have a look at this blog on MSDN.
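To see why a random initialization vector (IV) gives different ciphertexts that still decrypt to the same value, here is a minimal Python sketch. It is a toy stream cipher built from SHA-256 purely for illustration; it is not what SQL Server does internally and should not be used to protect real data. The names toy_encrypt and toy_decrypt are made up for this example.

```python
import hashlib
import os


def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and IV (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    iv = os.urandom(16)  # fresh random IV on every call
    stream = _keystream(key, iv, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return iv + body  # the IV travels in front of the ciphertext


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    iv, body = blob[:16], blob[16:]
    stream = _keystream(key, iv, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = b"demo key, not a real secret"
first = toy_encrypt(key, b"123-45-6789")
second = toy_encrypt(key, b"123-45-6789")

print(first.hex())
print(second.hex())
print(toy_decrypt(key, first), toy_decrypt(key, second))
```

The two hex strings differ because each call picked its own IV, yet both decrypt to the same bytes. That is the same behaviour you are seeing with the varbinary column.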

Related

Any way to get original data from hashed values in snowflake?

I have a table which uses the snowflake hash function to store values in some columns.
Is there any way to reverse the encryption from the hash function and get the original values from the table?
As per the documentation, the function is "not a cryptographic hash function", and will always return the same result for the same input expression.
Example :
select hash(1) always returns -4730168494964875235
select hash('a') always returns -947125324004678632
select hash('1234') always returns -4035663806895772878
I was wondering if there is any way to reverse the hashing and get the original input expression from the hashed values.
I think these disclaimers are for preventing potential legal disputes:
"Cryptographic hash functions have a few properties which this function does not, for example:
The cryptographic hashing of a value cannot be inverted to find the original value."
It's not possible to reverse a hash value in general. Any input, even a very long text, is reduced to a 64-bit value, so many different inputs must map to the same hash and the original data is not preserved. On the other hand, with a brute-force technique you may find a value that produces the hash, and that can be counted as reversing the hash value.
For example, if you store the hash values of all numbers between 0 and 5000 in a table, then when I come to you with the hash value '-7875472545445966613', you can look it up in your table and say it belongs to the number 1000.
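The lookup-table idea can be sketched in Python as follows. Snowflake's HASH is not available outside Snowflake, so hash64 below is a stand-in I made up (the first 8 bytes of SHA-256, read as a signed 64-bit integer); its outputs will not match the Snowflake values quoted above. The point is only the technique of precomputing hashes for every candidate input.

```python
import hashlib


def hash64(value) -> int:
    """Stand-in 64-bit hash (NOT Snowflake's HASH): 8 bytes of SHA-256, signed."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


# Precompute the hash of every candidate input, here the integers 0..5000.
lookup = {hash64(n): n for n in range(5001)}


def reverse_hash(hashed: int):
    """Return the candidate that produced `hashed`, or None if it is unknown."""
    return lookup.get(hashed)


print(reverse_hash(hash64(1000)))    # a value inside the candidate set
print(reverse_hash(hash64(999999)))  # a value outside the candidate set
```

This only works when the set of possible inputs is small enough to enumerate. For free-form text the candidate set is effectively unbounded, and the table cannot be built.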

What's a best practice for saving a unique, random, short string to db?

I have a table with a varchar column named key, which is supposed to hold a unique, 8-char random string that users are going to use as a unique identifier. This field should be generated and saved when objects are created, and I have a question about how to create it:
Most recommendations point to a UUID field, but that's not applicable for me because it's too long, and if I just take a subset of it then there's no guarantee of uniqueness.
Currently I've just implemented a loop in my backend (not the DB), which generates a random string and tries to insert it into the DB, and retries if the string turns out not to be unique. But I feel that this is just a really bad practice.
What's the best way to do this?
I'm using PostgreSQL 9.6.
UPDATE:
My main concern is to remove the loop that retries until it finds a random, short string (or number, it doesn't matter) that is unique in that table. AFAIK the solution should be a way to generate the string in the DB itself. The only thing that I can find for PostgreSQL is uuid and uuid-ossp, which do something like this, but a UUID is way too long for my application, and I don't know of any way to get a shorter representation of a UUID without compromising its uniqueness (and I don't think it's theoretically possible).
So, how can I remove the loop and its back-and-forth to the DB?
Encryption is guaranteed to be unique; it has to be, otherwise decryption would not work. Provided you encrypt unique inputs, such as 0, 1, 2, 3, ..., you are guaranteed unique outputs.
You want 8 characters. You have 62 characters to play with: A-Z, a-z, 0-9, so convert the binary output of the encryption to a base-62 number. Eight base-62 characters give 62^8 (about 2.18 × 10^14) possible values, slightly fewer than 2^48, so the cipher has to work on a correspondingly small block.
You may need to use the cycle-walking technique from format-preserving encryption to handle the few outputs that fall outside that range.
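A rough Python sketch of this idea: encrypt a counter with a small keyed permutation, cycle-walk until the result fits into 62^8, then write it in base 62. The 48-bit Feistel network with an HMAC-SHA256 round function is my own choice for illustration, not a vetted format-preserving encryption scheme, and the names make_key and encrypt_id are hypothetical. The counter could come from a database sequence.

```python
import hashlib
import hmac

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
DOMAIN = 62 ** 8                 # number of distinct 8-character keys
HALF_BITS = 24                   # 48-bit block, just large enough to cover DOMAIN
HALF_MASK = (1 << HALF_BITS) - 1
ROUNDS = 8


def _round(secret: bytes, rnd: int, half: int) -> int:
    """Keyed round function: 24 bits of HMAC-SHA256 over (round, half)."""
    msg = bytes([rnd]) + half.to_bytes(3, "big")
    return int.from_bytes(hmac.new(secret, msg, hashlib.sha256).digest()[:3], "big")


def _permute(secret: bytes, value: int) -> int:
    """Balanced Feistel network: a keyed permutation of the 48-bit integers."""
    left, right = value >> HALF_BITS, value & HALF_MASK
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round(secret, rnd, right)
    return (left << HALF_BITS) | right


def encrypt_id(secret: bytes, counter: int) -> int:
    if not 0 <= counter < DOMAIN:
        raise ValueError("counter outside the 62**8 domain")
    value = _permute(secret, counter)
    while value >= DOMAIN:       # cycle walking: re-encrypt until back in range
        value = _permute(secret, value)
    return value


def to_base62(value: int) -> str:
    chars = []
    for _ in range(8):           # always emit exactly 8 characters
        value, digit = divmod(value, 62)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def make_key(secret: bytes, counter: int) -> str:
    return to_base62(encrypt_id(secret, counter))


for n in range(3):
    print(n, make_key(b"demo secret", n))
```

Because the permutation is one-to-one and the counter never repeats, the keys cannot collide, so no retry loop is needed. The secret has to stay fixed for the lifetime of the table.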

Data hashing in Pentaho

Can anyone suggest the best options in Pentaho for my requirement? We need to convert the first_name and last_name attributes into hashes and load the hash values for these columns into the user table to support the business reports. The reports do not need the actual values of these columns; the reporting code only checks for NULL values in first_name and last_name and validates the length of these fields.
I tried converting the fields to hashes using the Add checksum transformation, but wasn't sure which type of checksum to use (CRC 32, ADLER 32, MD5, SHA-1). Any suggestions?
The source and target DB is PostgreSQL, in case that matters.
Thanks in advance.
Hashing and encryption are not the same thing.
It seems you want a one-way hash. What hash you choose depends mainly on how much you care about collisions. If you don't care that multiple names could generate the same hash, a short fast hash like CRC32 is fine. If you do care about collisions then I'd use at least MD5.
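To put rough numbers on "how much you care about collisions", here is a small Python sketch comparing a 32-bit CRC with a 128-bit MD5. It uses the standard birthday-bound approximation; the figure of 100,000 names is an arbitrary example, not something taken from the question.

```python
import hashlib
import math
import zlib


def crc32_hex(text: str) -> str:
    """32-bit CRC as 8 hex characters."""
    return format(zlib.crc32(text.encode("utf-8")), "08x")


def md5_hex(text: str) -> str:
    """128-bit MD5 as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def collision_probability(n_values: int, hash_bits: int) -> float:
    """Birthday-bound estimate that at least two of n distinct inputs collide."""
    exponent = n_values * (n_values - 1) / 2 / 2 ** hash_bits
    return -math.expm1(-exponent)  # 1 - exp(-x), accurate even for tiny x


print(crc32_hex("Alice"), md5_hex("Alice"))
for bits in (32, 128):
    print(bits, collision_probability(100_000, bits))
```

With 32 bits a collision among that many distinct names is more likely than not, while with 128 bits it is negligible. Note also that either hash has a fixed length, so a report that validates the length of the original name would need that length stored separately.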

Hashing of timestamp

I need a hash function (maybe I should not call it a "hash" function) that:
1. is used for hashing timestamps only;
2. has a reverse function through which I can restore the timestamp;
3. does not generate duplicate hash values;
4. whether or not it is a hash function, is nearly as fast as a hash function.
PS: About the data type of the timestamp: imagine it as a 4-byte "long" type in C.
Is that possible?
(I need the timestamp to be a secret. In fact, I need the hash value as a session id and the original timestamp as an index in my database. Whenever a user requests something with the session id, I can recover the timestamp and use it as an index to get the request info.)
If you can skip #2, MurmurHash might be a good option:
https://sites.google.com/site/murmurhash/
If you must keep #2 and encrypt/decrypt, there are standard implementations of the various algorithms (AES, for instance) for most languages. This will be much slower than hashing.
If you don't actually need this to secure the data (which raises the question: why bother with any conversion at all?) and just want to make some non-timestamp-looking string that is easily reversible (by you and by anyone else), then check this question:
Rot13 for numbers
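For the last case (reversible, no duplicates, fast, but NOT secret), a minimal Python sketch could look like this. It multiplies by an odd constant modulo 2^32 and XORs with a pad, which is a one-to-one mapping on 32-bit values. The constants and the names scramble and unscramble are arbitrary choices of mine, and anyone who knows or guesses them can undo the mapping, so this does not satisfy the "keep the timestamp secret" requirement. It needs Python 3.8+ for the modular inverse.

```python
MASK = 0xFFFFFFFF                # keep everything within 32 bits
MULTIPLIER = 0x9E3779B1          # arbitrary odd constant, so it is invertible mod 2**32
XOR_PAD = 0x5BD1E995             # arbitrary pad
INVERSE = pow(MULTIPLIER, -1, 1 << 32)  # modular inverse of MULTIPLIER


def scramble(timestamp: int) -> int:
    """Map a 32-bit timestamp to a different-looking 32-bit value."""
    return ((timestamp * MULTIPLIER) & MASK) ^ XOR_PAD


def unscramble(value: int) -> int:
    """Exact inverse of scramble()."""
    return ((value ^ XOR_PAD) * INVERSE) & MASK


ts = 1234567890
token = scramble(ts)
print(token, unscramble(token))
```

If the timestamp really has to stay secret, this is not enough and a real cipher, as suggested above, is the way to go.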

How to encrypt a string with standard PostgreSQL?

I'm working with PostgreSQL.
I need to transform "http://www.xyz.com/some_uri/index1.html" into something like "scdfdsffd" (some unique key based on the URL, which is itself a unique key in the table).
In other words, the URL is a unique key in the table, but I need to generate a small unique key based on the URL.
What can I do with standard PostgreSQL 8.4?
Best Regards,
Several methods:
a) Why not use an auto-incrementing column or sequence generator to generate unique integers per insert? If you have fewer than 100 million URLs, your identifiers are short and easy to remember. However, if that's not an option (e.g. because you don't want people guessing IDs and attacking the database that way):
b) The built-in MD5() function may help:
INSERT INTO table (pkey, url) VALUES (MD5('http://...'), 'http://...');
MD5() is a hash function and will most likely give you a unique identifier per URL. I say "most likely" because MD5 produces a 128-bit hash, so the likelihood that two given URLs collide is on the order of 2^-128 (about 3 × 10^-39).
If you need smaller identifiers you can chop the MD5 result down to fewer characters, but that can significantly increase the chance of a hash collision, depending on how many characters you keep.
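To get a feel for how quickly truncation hurts, here is a small Python sketch that truncates MD5 hex digests of made-up URLs and counts duplicate keys. The URL pattern and the 20,000 sample size are assumptions for the experiment only. Python's hashlib.md5 should give the same hex digest as PostgreSQL's MD5() for the same bytes.

```python
import hashlib


def short_key(url: str, n_chars: int) -> str:
    """First n_chars hex characters of the MD5 digest of the URL."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()[:n_chars]


def count_collisions(urls, n_chars: int) -> int:
    """Number of URLs whose truncated key is already taken by another URL."""
    keys = {short_key(u, n_chars) for u in urls}
    return len(urls) - len(keys)


# Hypothetical sample data, loosely modelled on the URL in the question.
urls = [f"http://www.xyz.com/some_uri/index{i}.html" for i in range(20000)]

for n_chars in (4, 8, 16, 32):
    print(n_chars, count_collisions(urls, n_chars))
```

Four hex characters allow only 65,536 distinct keys, so collisions are unavoidable at this sample size, while the full 32 characters show none. Whatever length you pick, a UNIQUE constraint on the key column is still a sensible safety net.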
[Note: timestamp answer redacted since it in no way solves the original problem. -BobG]