Obtaining data keys encrypted under KMS master keys in some regions, then adding two more regions so the same data key is encrypted under their master keys too - amazon-kms

I am generating a data encryption key implicitly as follows (key IDs used are just representational):
from aws_encryption_sdk import encrypt, KMSMasterKeyProvider
from aws_encryption_sdk.identifiers import Algorithm

# Key provider with only 2 region master keys to begin with
kms_key_provider = KMSMasterKeyProvider(key_ids=["west-1", "west-2"])
# Encrypt something random only to get the encrypted data keys in the header from those 2 regions
my_ciphertext, encryptor_header = encrypt(
    source="somerandomplaintextofnorelevance",
    key_provider=kms_key_provider,
    algorithm=Algorithm.AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384,
    encryption_context={"somekey": "some value"},
)
my_data_keys = []
for dek in encryptor_header.encrypted_data_keys:
    my_data_keys.append(dek.encrypted_data_key)
I get two encrypted Data Encryption Keys (DEK) strings in my_data_keys (say, DEK_enc_west_1 and DEK_enc_west_2) both of which would decrypt to a single plain data encryption key, say, DEK_Plain. Now I can encrypt/decrypt for DEK_Plain in either of the regions for redundancy.
Then, I go on and activate two more master keys in regions east-1 and east-2. Now I want that same DEK_Plain to be also encrypted under those two new region (east-1 & east-2) master keys to get two new encrypted data keys (say, DEK_enc_east_1 and DEK_enc_east_2).
So, with the new fully formed Key Provider like:
kms_key_provider = KMSMasterKeyProvider(key_ids=["west-1", "west-2", "east-1", "east-2"])
I can get my DEK_Plain from any of these 4 regions using:
my_plain_data_key = kms_key_provider.decrypt_data_key_from_list(…..)
Basically, how can I add additional regions' master keys so that they can be used with the same data encryption key that was generated and encrypted earlier under the master keys of the regions that existed before?

Looking around in the AWS crypto docs, I found something like the following sample that helps my case (though it would have been ideal if the KMS key provider implementation had such a region-extension capability for data keys built in). In the following, assume plain_dek holds the DEK_Plain from the question above.
from aws_encryption_sdk.identifiers import Algorithm
from aws_encryption_sdk.structures import MasterKeyInfo, RawDataKey

new_region_key_id_1 = "arn:aws:kms:us-east-1:XXXXXXXXXX:alias/xyz/master"
new_region_master_key_1 = kms_key_provider.master_key_for_encrypt(new_region_key_id_1)
key_provider_info = {"provider_id": "aws-kms", "key_info": new_region_key_id_1}
key_provider_info_obj = MasterKeyInfo(**key_provider_info)
plain_dk_raw = RawDataKey(key_provider_info_obj, plain_dek)
encryption_context = {"somekey": "some value"}
new_encrypted_dek_from_region_new_master_key_1 = new_region_master_key_1.encrypt_data_key(
    plain_dk_raw, Algorithm.AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384, encryption_context)
This can be repeated in a loop for any number of new KMS regions, thereby extending an existing data key so it is decryptable in additional AWS regions where new master keys were, say, recently added.
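For example, a rough sketch of that loop, reusing the objects from the sample above (the east-region ARNs here are placeholders, and master_key_for_encrypt / encrypt_data_key are used exactly as in the sample, so verify them against the SDK version you run):

# Hypothetical ARNs for the newly activated regions
new_region_key_ids = [
    "arn:aws:kms:us-east-1:XXXXXXXXXX:alias/xyz/master",
    "arn:aws:kms:us-east-2:XXXXXXXXXX:alias/xyz/master",
]
new_encrypted_deks = []
for key_id in new_region_key_ids:
    master_key = kms_key_provider.master_key_for_encrypt(key_id)
    provider_info = MasterKeyInfo(provider_id="aws-kms", key_info=key_id)
    plain_dk_raw = RawDataKey(provider_info, plain_dek)
    # Encrypt the existing plaintext DEK under this region's master key
    new_encrypted_deks.append(
        master_key.encrypt_data_key(
            plain_dk_raw,
            Algorithm.AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384,
            {"somekey": "some value"},
        )
    )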

Related

Why can't `blake2_256` prevent the "first key pair" in a StorageDoubleMap from being compromised when using decl_storage?

decl_storage! is a "procedural macro" used for storing data to make it available in subsequent blocks.
It says if the user is able to set the first key pair in the double_map, then we cannot trust that key pair, and so we must use a cryptographic hasher such as blake2_256 to prevent "other values of all storage items being compromised".
Then it goes on to say that if the user is able to set the second key pair in the double_map, then we cannot trust that key pair, and so we must use a cryptographic hasher such as blake2_256 to prevent "other items in storage with the same first key being compromised".
With regard to the first key pair, why does it say that it's just to prevent "other values of all storage items being compromised"? Isn't blake2_256 also used to prevent the first key pair itself from being compromised (rather than just "other values")?
Let's say the hash of module1.someValue is 0x12345678
hash of module2.doubleMapValue.firstKey(value1) is 0x1234
hash of module2.doubleMapValue.secondKey(value2) is 0x5678
This means module2.doubleMapValue.fullKey(value1, value2) and module1.someValue have the same hash, i.e. the values are stored in the same place.
If a user is able to control both keys of module2.doubleMapValue and figure out the values of value1 and value2, then they will be able to overwrite the value of module1.someValue and cause security issues.
That's why the hash function for key1 of a double map needs to be a cryptographic hasher if the value is controlled by a user. Otherwise a user may be able to craft a value1 such that it collides with the storage of other modules, and hence compromise them.
Even if a user does not control key2, the double map provides a "clear all keys with hash(key1) prefix" feature that could be hijacked to cause trouble as well.
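To make the collision risk concrete, here is a toy Python model (this is not Substrate's actual key derivation; the truncated hash stands in for a weak or non-cryptographic hasher). An attacker who controls both keys can brute-force values whose concatenated hashes land exactly on another item's storage location; with a full-width cryptographic hasher such as blake2_256 that search is infeasible.

import hashlib
import itertools

def weak_hash(data: bytes, nbytes: int) -> bytes:
    # Stand-in for a weak hasher: only nbytes bytes of output, so collisions are cheap
    return hashlib.sha256(data).digest()[:nbytes]

# Toy storage location of an unrelated item (4 bytes in this model)
target = weak_hash(b"module1/someValue", 4)

# double_map location = weak_hash(key1, 2) + weak_hash(key2, 2); the attacker picks both keys
key1 = next(f"k1-{i}".encode() for i in itertools.count()
            if weak_hash(f"k1-{i}".encode(), 2) == target[:2])
key2 = next(f"k2-{i}".encode() for i in itertools.count()
            if weak_hash(f"k2-{i}".encode(), 2) == target[2:])

# The crafted double-map entry now writes to module1/someValue's slot
assert weak_hash(key1, 2) + weak_hash(key2, 2) == target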

Why are these resulting symmetric encryption values different?

I'm using something like this:
OPEN SYMMETRIC KEY SSNKey
DECRYPTION BY CERTIFICATE SSNCert;
UPDATE
Customers
SET
SSNEncrypted = EncryptByKey(Key_GUID('SSNKey'), 'DecryptedSSN')
Where SSNEncrypted is a varbinary column. I noticed the values come out different each time. Why is this? And what can I do to get consistent encrypted values, so I can compare them in different tables?
This is "by design".
The function EncryptByKey is nondeterministic.
But if you decrypt the different values you always get the original decrypted value.
Have a look at this blog on MSDN.
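The same behaviour shows up with any randomized symmetric encryption. Here is a small illustration in Python using the cryptography package (not SQL Server; the key handling and sample SSN are made up for the demo): the two ciphertexts differ because each encryption uses a fresh random nonce, yet both decrypt back to the original value. If you need values you can compare across tables, one common approach is to store a deterministic keyed hash (for example an HMAC) of the plaintext alongside the ciphertext.

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aead = AESGCM(key)
ssn = b"123-45-6789"  # sample plaintext

# Each encryption uses a fresh random nonce, so the ciphertexts differ
nonce1, nonce2 = os.urandom(12), os.urandom(12)
ct1 = nonce1 + aead.encrypt(nonce1, ssn, None)
ct2 = nonce2 + aead.encrypt(nonce2, ssn, None)
assert ct1 != ct2

# ...but both decrypt back to the same original value
assert aead.decrypt(ct1[:12], ct1[12:], None) == ssn
assert aead.decrypt(ct2[:12], ct2[12:], None) == ssn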

Generate C* bucket hash from multipart primary key

I will have C* tables that will be very wide. To prevent them from becoming too wide I have found a strategy that could suit me well. It was presented in this video.
Bucket Your Partitions Wisely
The good thing with this strategy is that there is no need for a "look-up table" (it is fast); the bad part is that one needs to know the max number of buckets and may eventually end up with no more buckets to use (not scalable). I know my max bucket size so I will try this.
By calculating a hash from the table's primary keys, this hash can be used as a bucket part together with the rest of the primary key.
I have come up with the following method to be sure (I think?) that the hash will always be the same for a specific primary key.
Using Guava Hashing:
public static String bucket(List<String> primKeyParts, int maxBuckets) {
    StringBuilder combinedHashString = new StringBuilder();
    primKeyParts.forEach(part -> {
        combinedHashString.append(
            String.valueOf(
                Hashing.consistentHash(
                    Hashing.sha512().hashBytes(part.getBytes()), maxBuckets)
            )
        );
    });
    return combinedHashString.toString();
}
The reason I use sha512 is to be able to support strings with a max of 256 characters (512 bits); otherwise the result will never be the same (as it seems according to my tests).
I am far from being a hashing guru, hence I'm asking the following questions.
Requirement: between different JVM executions on different nodes/machines, the result should always be the same for a given Cassandra primary key.
Can I rely on the mentioned method to do the job?
Is there a better solution for hashing large strings so they always produce the same result for a given string?
Do I always need to hash from a string, or could there be a better way of doing this for a C* primary key that always produces the same result?
Please, I don't want to discuss data modeling for a specific table, I just want to have a bucket strategy.
EDIT:
I elaborated further and came up with this, so the length of the string can be arbitrary. What do you say about this one?
public static int murmur3_128_bucket(int maxBuckets, String... primKeyParts) {
    List<HashCode> hashCodes = new ArrayList<>();
    for (String part : primKeyParts) {
        hashCodes.add(Hashing.murmur3_128().hashString(part, StandardCharsets.UTF_8));
    }
    return Hashing.consistentHash(Hashing.combineOrdered(hashCodes), maxBuckets);
}
I currently use a similar solution in production. So for your method I would change to:
public static int bucket(List<String> primKeyParts, int maxBuckets) {
    String keyParts = String.join("", primKeyParts);
    return Hashing.consistentHash(
        Hashing.murmur3_32().hashString(keyParts, Charsets.UTF_8),
        maxBuckets);
}
So the differences:
- Send all the PK parts into the hash function at once.
- We actually set the max buckets as a code constant, since the hash is only consistent if the number of buckets stays the same.
- We use the Murmur3 hash since we want it to be fast, not cryptographically strong.
For your direct questions: 1) Yes, the method should do the job. 2) I think with the tweaks above you should be set. 3) The assumption is that you need the whole PK?
I'm not sure you need to use the whole primary key, since the expectation is that the partition part of your primary key is going to be the same for many things, which is why you are bucketing. You could just hash the bits that will give you good buckets to use in your partition key. In our case we just hash some of the clustering-key parts of the PK to generate the bucket id we use as part of the partition key.
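For illustration, here is the same idea sketched in Python rather than Guava (a hypothetical helper, using a plain modulo instead of Guava's consistentHash): hash only the clustering-key parts with a digest that is stable across processes, then reduce it to a bucket id. Python's built-in hash() is randomized per process, so hashlib is used instead.

import hashlib
from typing import Iterable

def bucket_id(clustering_parts: Iterable[str], max_buckets: int) -> int:
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently
    joined = "\x1f".join(clustering_parts).encode("utf-8")
    digest = hashlib.md5(joined).digest()
    # Stable across processes and machines, unlike per-process hash functions
    return int.from_bytes(digest[:8], "big") % max_buckets

# Usage: store the result as an extra partition-key column
print(bucket_id(["2016-03-07", "sensor-42"], 16))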

How to "EXPIRE" the "HSET" child key in redis?

I need to expire all keys in a Redis hash that are older than 1 month.
This is not possible, for the sake of keeping Redis simple.
Quoth Antirez, creator of Redis:
Hi, it is not possible, either use a different top-level key for that specific field, or store along with the field another field with an expire time, fetch both, and let the application understand if it is still valid or not based on current time.
Redis does not support having TTL on hashes other than the top key, which would expire the whole hash. If you are using a sharded cluster, there is another approach you can use. This approach may not be useful in all scenarios and its performance characteristics might differ from the expected ones. Still worth mentioning:
When having a hash, the structure basically looks like:
hash_top_key
- child_key_1 -> some_value
- child_key_2 -> some_value
...
- child_key_n -> some_value
Since we want to add TTL to the child keys, we can move them to top keys. The main point is that the key now should be a combination of hash_top_key and child key:
{hash_top_key}child_key_1 -> some_value
{hash_top_key}child_key_2 -> some_value
...
{hash_top_key}child_key_n -> some_value
We are using the {} notation on purpose. This allows all those keys to fall in the same hash slot. You can read more about it here: https://redis.io/topics/cluster-tutorial
Now if we want to do the same operations as with hashes, we could do:
HDEL hash_top_key child_key_1 => DEL {hash_top_key}child_key_1
HGET hash_top_key child_key_1 => GET {hash_top_key}child_key_1
HSET hash_top_key child_key_1 some_value => SET {hash_top_key}child_key_1 some_value [some_TTL]
HGETALL hash_top_key =>
keyslot = CLUSTER KEYSLOT {hash_top_key}
keys = CLUSTER GETKEYSINSLOT keyslot n
MGET keys
The interesting one here is HGETALL. First we get the hash slot for all our child keys. Then we get the keys for that particular hash slot, and finally we retrieve the values. We need to be careful here, since there could be more than n keys for that hash slot, and there could also be keys we are not interested in that happen to share the same hash slot. We could actually write a Lua script to do those steps on the server by executing an EVAL or EVALSHA command. Again, you need to take into consideration the performance of this approach for your particular scenario.
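A minimal redis-py sketch of that emulation (the connection and hash name are illustrative, MOVED redirections and error handling are ignored, and the HGETALL replacement assumes you are talking to the cluster node that owns the slot, since it uses CLUSTER KEYSLOT / CLUSTER GETKEYSINSLOT as described above):

import redis

r = redis.Redis()        # hypothetical connection to a cluster node
HASH = "{user:123}"      # braces keep every derived key in the same hash slot

def h_set(field, value, ttl_seconds=None):
    # HSET hash field value  ->  SET {hash}field value [EX ttl]
    r.set(HASH + field, value, ex=ttl_seconds)

def h_get(field):
    # HGET hash field  ->  GET {hash}field
    return r.get(HASH + field)

def h_del(field):
    # HDEL hash field  ->  DEL {hash}field
    r.delete(HASH + field)

def h_getall(max_keys=1000):
    # HGETALL hash  ->  find the slot, list its keys, then MGET them.
    # The slot may also contain keys that do not belong to this "hash".
    slot = r.execute_command("CLUSTER KEYSLOT", HASH)
    keys = r.execute_command("CLUSTER GETKEYSINSLOT", slot, max_keys)
    return dict(zip(keys, r.mget(keys))) if keys else {}

h_set("session_token", "abc", ttl_seconds=60)  # this "field" expires on its own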
Some more references:
https://redis.io/commands/cluster-keyslot
https://redis.io/commands/cluster-getkeysinslot
https://redis.io/commands/eval
This is possible in KeyDB, which is a fork of Redis. Because it's a fork, it's fully compatible with Redis and works as a drop-in replacement.
Just use the EXPIREMEMBER command. It works with sets, hashes, and sorted sets.
EXPIREMEMBER keyname subkey [time]
You can also use TTL and PTTL to see the expiration
TTL keyname subkey
More documentation is available here: https://docs.keydb.dev/docs/commands/#expiremember
You can use a Sorted Set in Redis to get a TTL container with the timestamp as score.
For example, whenever you insert an event string into the set, you can set its score to the event time.
Thus you can get data for any time window by calling
zrangebyscore "your set name" min-time max-time
Moreover, you can expire entries by using zremrangebyscore "your set name" min-time max-time to remove old events.
The only drawback here is that you have to do housekeeping from an outside process to maintain the size of the set.
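A small redis-py sketch of that pattern (connection, set name, and event payloads are made up): events are added with their timestamp as the score, reads take a time window, and a periodic housekeeping call removes anything older than a cutoff.

import time
import redis

r = redis.Redis()  # hypothetical connection
ZSET = "events"    # hypothetical sorted-set name

def add_event(payload: str):
    # Score is the insertion time, so the set doubles as a TTL container
    r.zadd(ZSET, {payload: time.time()})

def events_between(start_ts: float, end_ts: float):
    return r.zrangebyscore(ZSET, start_ts, end_ts)

def expire_older_than(max_age_seconds: int):
    # Housekeeping step: run this periodically from an outside process
    cutoff = time.time() - max_age_seconds
    r.zremrangebyscore(ZSET, "-inf", cutoff)

add_event("login:alice")
expire_older_than(30 * 24 * 3600)  # drop events older than ~1 month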
Elon Musk will soon send people to the moon and we still cannot expire fields on redis :(
Anyway, the solution I've come up with is:
Let's say I want to expire every 3 minutes:
So I'm holding the data in 3 fields: 0, 1, 2.
Then I take the current time in minutes modulo 3.
If the modulo is, for example, 0, then I use only fields 1 and 2, and delete field 0;
when it changes to 1, I use fields 2 and 0, and delete field 1.
I'm not using this and I haven't tested it, but I'm just letting you know it's possible.
There is the Redisson Java framework, which implements a hash Map object with entry TTL support. It uses hash and zset Redis objects under the hood. Usage example:
RMapCache<Integer, String> map = redisson.getMapCache("map");
map.put(1, "value", 30, TimeUnit.DAYS); // this entry expires in 30 days
This approach is quite useful.
We had the same problem discussed here.
We have a Redis hash (a key whose value is a set of name/value entries), and we needed to hold individual expiration times on each hash entry.
We implemented this by adding n bytes of prefix data containing encoded expiration information when we write the hash entry values; we also set the key to expire at the time contained in the value being written.
Then, on read, we decode the prefix and check for expiration. This is additional overhead; however, the reads are still O(n) and the entire key will expire when the last hash entry has expired.
Regarding a NodeJS implementation, I have added a custom expiryTime field to the object I save in the HASH. Then, after a specific period of time, I clear the expired HASH entries using the following code:
client.hgetall(HASH_NAME, function(err, reply) {
    if (reply) {
        Object.keys(reply).forEach(key => {
            if (reply[key] && JSON.parse(reply[key]).expiryTime < (new Date).getTime()) {
                client.hdel(HASH_NAME, key);
            }
        })
    }
});
If your use-case is that you're caching values in Redis and are tolerant of stale values but would like to refresh them occasionally so that they don't get too stale, a hacky workaround is to just include a timestamp in the field value and handle expirations in whatever place you're accessing the value.
This allows you to keep using Redis hashes normally without needing to worry about any complications that might arise from the other approaches. The only cost is a bit of extra logic and parsing on the client end. Not a perfect solution, but it's what I typically do as I haven't needed TTL for any other reason and I'm usually needing to do extra parsing on the cached value anyways.
So basically it'll be something like this:
In Redis:
hash_name
- field_1: "2021-01-15;123"
- field_2: "2021-01-20;125"
- field_2: "2021-02-01;127"
Your (pseudo)code:
val = redis.hget(hash_name, field_1)
timestamp = val.substring(0, val.index_of(";"))
if now() > timestamp:
    new_val = get_updated_value()
    new_timestamp = now() + EXPIRY_LENGTH
    redis.hset(hash_name, field_1, new_timestamp + ";" + new_val)
    val = new_val
else:
    val = val.substring(val.index_of(";") + 1)
// proceed to use val
The biggest caveat imo is that you don't ever remove fields so the hash can grow quite large. Not sure there's an elegant solution for that - I usually just delete the hash every once in a while if it feels too big. Maybe you could keep track of everything you've stored somewhere and remove them periodically (though at that point, you might as well just be using that mechanism to expire the fields manually...).
You could store key/values in Redis differently to achieve this, by just adding a prefix or namespace to your keys when you store them e.g. "hset_"
Get a key/value: GET hset_key, equivalent to HGET hset key
Add a key/value: SET hset_key value, equivalent to HSET hset key value
Get all keys: KEYS hset_*, equivalent to HKEYS hset
Get all values: should be done in 2 ops, first get all keys with KEYS hset_*, then get the value for each key
Add a key/value with a TTL or expire, which is the topic of the question:
SET hset_key value
EXPIRE hset_key <seconds>
Note: KEYS looks for matching keys across the whole database, which may affect performance, especially if you have a big database. SCAN 0 MATCH hset_* might be better, as it doesn't block the server, but performance is still a concern in the case of a big database.
You may create a separate database for storing these keys that you want to expire, especially if they are a small set of keys.
Thanks to @DanFarrell, who highlighted the performance issue related to KEYS.
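A short redis-py sketch of this namespace approach (the prefix and names are illustrative), using SCAN instead of KEYS so the server isn't blocked:

import redis

r = redis.Redis()   # hypothetical connection
PREFIX = "hset_"    # namespace that stands in for the hash's top-level key

def set_field(field, value, ttl_seconds):
    # HSET hset field value  ->  SET hset_field value, with its own TTL
    r.set(PREFIX + field, value, ex=ttl_seconds)

def get_all():
    # HGETALL hset  ->  SCAN for the prefix, then fetch each value
    return {key: r.get(key) for key in r.scan_iter(match=PREFIX + "*")}

set_field("session", "abc123", ttl_seconds=3600)  # this entry expires on its own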
You can. Here is an example.
redis 127.0.0.1:6379> hset key f1 1
(integer) 1
redis 127.0.0.1:6379> hset key f2 2
(integer) 1
redis 127.0.0.1:6379> hvals key
1) "1"
2) "1"
3) "2"
redis 127.0.0.1:6379> expire key 10
(integer) 1
redis 127.0.0.1:6379> hvals key
1) "1"
2) "1"
3) "2"
redis 127.0.0.1:6379> hvals key
1) "1"
2) "1"
3) "2"
redis 127.0.0.1:6379> hvals key
Use EXPIRE or EXPIREAT command.
If you want to expire specific fields in the hash that are older than 1 month, this is not possible.
The Redis EXPIRE command applies to the whole hash key, not to individual fields.
If you use a daily hash key, you can set that key's time to live:
hset key-20140325 f1 1
expire key-20140325 100
hset key-20140325 f1 2
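A tiny redis-py sketch of that daily-key idea (the names and the 31-day TTL are illustrative): each day's writes go into a hash whose name contains the date, and the whole daily hash carries its own TTL, so old days age out as a unit.

import datetime
import redis

r = redis.Redis()  # hypothetical connection

def daily_hash_name(base="key"):
    # e.g. "key-20140325" for writes made on 2014-03-25
    return f"{base}-{datetime.date.today():%Y%m%d}"

def hset_with_daily_expiry(field, value, ttl_seconds=31 * 24 * 3600):
    name = daily_hash_name()
    r.hset(name, field, value)
    # The whole day's bucket expires together; refreshing the TTL on each write is fine
    r.expire(name, ttl_seconds)

hset_with_daily_expiry("f1", 1)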
You could use the Redis Keyspace Notifications by using psubscribe and "__keyevent@<DB-INDEX>__:expired".
With that, each time that a key will expire, you will get a message published on your redis connection.
Regarding your question: basically, you create a temporary "normal" key using SET with an expiration time in s/ms. Its name should match the name of the key that you wish to delete in your set.
As your temporary key will be published to your Redis connection on "__keyevent@0__:expired" when it expires, you can easily delete the key from your original set, since the message will carry the name of the key.
A simple example in practice on that page: https://medium.com/@micah1powell/using-redis-keyspace-notifications-for-a-reminder-service-with-node-c05047befec3
doc: https://redis.io/topics/notifications (look for the flag xE)
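A small redis-py sketch of that pattern (key names and the set are made up, and the channel pattern assumes DB 0). Expired-event notifications must be enabled on the server, hence the config_set call:

import redis

r = redis.Redis()  # hypothetical connection
r.config_set("notify-keyspace-events", "Ex")  # enable keyevent notifications for expirations

# Shadow key whose only job is to expire and signal that member "42" is stale
r.set("expiring:42", 1, ex=60)

p = r.pubsub()
p.psubscribe("__keyevent@0__:expired")
for message in p.listen():
    if message["type"] != "pmessage":
        continue
    expired_key = message["data"].decode()
    if expired_key.startswith("expiring:"):
        # Remove the matching member from the original set
        r.srem("my-set", expired_key[len("expiring:"):])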
static async setCount(ip: string, count: number) {
    const val = await redisClient.hSet(ip, 'ipHashField', count)
    await redisClient.expire(ip, this.expireTime)
}
Try expiring your key.

c# Looking for a data structure which can take multiple values to create a key

I'm looking for a data structure which can take multiple input values to generate a key, a Guid key for instance, store the key and the values which are returned from xpath:regexp node lookups (call it a domain registry), and then be able to take the key and store another chunk of data, say Arbitory..., into, for instance, an IDictionary.
Then be able to take that selfsame returned xpath:regexp XML node lookup data, look the key up in the data structure, and use it to look into the IDictionary and return Arbitory.
It seems fairly simple on the surface, but the key could be made up of 2 Guids plus 1..N xpath:regexp lookups. An example of an xpath:regexp lookup would be:
/idmef:IDMEF-Message/idmef:Alert/idmef:Classification/#text: [Ll]ogin|[Aa]uthentication
Placement variables are used to mark the returned XML, so the whole XML message is $0, while $1 would be Login Authentication and $2 would be the next xpath:regexp lookup. Potentially there could be 1..N xpath:regexp lookups into the XML message.
So say I used string concatenation to generate the key: the key could potentially be hundreds of characters long, because it's made up of 2 Guids + 1..N of $0, $1, etc. That was the original way I was planning to do it. But appending the returned strings would be massively inefficient.
So the question is: is there a C# data structure where a key generator can take 1..N values and return a unique key, which can be obtained again using the same 1..N values?
I hope it's fairly clear what i'm looking for. Any help would be appreciated.
scope_creep
Well,
I changed the data structure to remove the above limitation.
Thanks for the help MusiGenesis.
Bob.