Does `Signature.init()` rewrite persistent memory? - applet

A few quotes from Java Card API 2.2.1:
Signature class description:
A tear or card reset event resets an initialized Signature object to
the state it was in when previously initialized via a call to init().
For algorithms which support keys with transient key data sets, such
as DES, triple DES, AES, and Korean SEED the Signature object key
becomes uninitialized on clear events associated with the Key object
used to initialize the Signature object.
Signature.init(...) description:
For optimal performance, when the theKey parameter is a transient key,
the implementation should, whenever possible, use transient space for
internal storage.
Does it mean that there are algorithms which require rewriting persistent memory each time Signature.init(...) is called? If yes, is there any reason for this unpleasant behaviour?
I am asking this question, because I am facing a strange behaviour in my applet. It computes ECDSA signature. After approximately 100 000 signatures the card (J2E145 by NXP) seems broken (I cannot select the applet anymore). Persistent memory damage might be the reason, because I call Signature.init(...) each time I receive the input data. Could the Signature.init(...) be the reason of this behaviour?

Yes, Signature.init() stores the reference to the key object in persistent memory. But I believe the API implementation must include wear leveling, which should maintain the endurance of the EEPROM in such scenarios.
I suggest checking whether Cipher.init() shows similar behavior on the card before coming to a conclusion.


decode SHA1 knowing stored value [duplicate]

Is it possible to reverse a SHA-1?
I'm thinking about using a SHA-1 to create a simple lightweight system to authenticate a small embedded system that communicates over an unencrypted connection.
Let's say that I create a sha1 like this with input from a "secret key" and spice it with a timestamp so that the SHA-1 will change all the time.
sha1("My Secret Key"+"a timestamp")
Then I include this SHA-1 in the communication and the server, which can do the same calculation. And hopefully, nobody would be able to figure out the "secret key".
But is this really true?
If you know that this is how I did it, you would know that I did put a timestamp in there and you would see the SHA-1.
Can you then use those two and figure out the "secret key"?
secret_key = bruteforce_sha1(sha1, timestamp)
Note1:
I guess you could brute force in some way, but how much work would that actually be?
Note2:
I don't plan to encrypt any data, I just would like to know who sent it.
No, you cannot reverse SHA-1, that is exactly why it is called a Secure Hash Algorithm.
What you should definitely be doing though, is include the message that is being transmitted into the hash calculation. Otherwise a man-in-the-middle could intercept the message, and use the signature (which only contains the sender's key and the timestamp) to attach it to a fake message (where it would still be valid).
And you should probably be using SHA-256 for new systems now.
sha("My Secret Key"+"a timestamp" + the whole message to be signed)
You also need to transmit the timestamp in the clear, because otherwise you have no way to verify the digest (other than trying a lot of plausible timestamps).
Whether a brute-force attack is feasible depends on the length of your secret key.
The security of your whole system would rely on this shared secret (because both sender and receiver need to know it, but no one else). An attacker would try to go after the key (either by brute-force guessing or by trying to get it from your device) rather than trying to break SHA-1.
SHA-1 is a hash function that was designed to make it impractically difficult to reverse the operation. Such hash functions are often called one-way functions or cryptographic hash functions for this reason.
However, SHA-1's collision resistance was theoretically broken in 2005. The attack finds two different inputs that have the same hash value faster than the generic birthday attack, which costs 2^80 hash evaluations for a 50% success probability. In 2017, the collision attack became practical with the attack known as SHAttered.
As of 2015, NIST dropped SHA-1 for signatures. You should consider using something stronger like SHA-256 for new applications.
Jon Callas on SHA-1:
It's time to walk, but not run, to the fire exits. You don't see smoke, but the fire alarms have gone off.
The question is actually how to authenticate over an insecure session.
The standard way to do this is to use a message authentication code, e.g. HMAC.
You send the message plaintext as well as an accompanying hash of that message where your secret has been mixed in.
So instead of your:
sha1("My Secret Key"+"a timestamp")
You have:
msg,hmac("My Secret Key",sha(msg+msg_sequence_id))
The message sequence id is a simple counter to keep track by both parties to the number of messages they have exchanged in this 'session' - this prevents an attacker from simply replaying previous-seen messages.
This is the industry-standard and secure way of authenticating messages, whether they are encrypted or not.
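A minimal Python sketch of the idea above, using the standard hmac and hashlib modules. It is an illustration under assumptions, not a definitive protocol: the Receiver class, the 8-byte counter encoding and the sample message are made up for the example, and the HMAC is computed directly over the counter and message rather than over a separate sha(...) of them.

```python
import hashlib
import hmac

SECRET = b"My Secret Key"  # shared secret from the question; use a long random key in practice


def sign(secret: bytes, seq: int, msg: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over the sequence number and the message."""
    # A fixed-width counter encoding keeps the (seq, msg) pairing unambiguous.
    data = seq.to_bytes(8, "big") + msg
    return hmac.new(secret, data, hashlib.sha256).digest()


class Receiver:
    """Accepts each sequence number once, in order (hypothetical helper)."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.expected_seq = 0

    def accept(self, seq: int, msg: bytes, tag: bytes) -> bool:
        if seq != self.expected_seq:
            return False  # replayed or out-of-order message
        if not hmac.compare_digest(sign(self.secret, seq, msg), tag):
            return False  # wrong key or tampered message
        self.expected_seq += 1
        return True


receiver = Receiver(SECRET)
tag0 = sign(SECRET, 0, b"open valve 3")
first = receiver.accept(0, b"open valve 3", tag0)   # fresh message
replay = receiver.accept(0, b"open valve 3", tag0)  # same bytes sent again
```

Here first is accepted and replay is rejected, because the receiver has already moved on to the next sequence number. hmac.compare_digest is used for the comparison so that the check does not leak timing information.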
(this is why you can't brute the hash:)
A hash is a one-way function, meaning that many inputs all produce the same output.
As you know the secret, and you can make a sensible guess as to the range of the timestamp, then you could iterate over all those timestamps, compute the hash and compare it.
Of course two or more timestamps within the range you examine might 'collide' i.e. although the timestamps are different, they generate the same hash.
So there is, fundamentally, no way to reverse the hash with any certainty.
In mathematical terms, only bijective functions have an inverse function. But hash functions are not injective as there are multiple input values that result in the same output value (collision).
So, no, hash functions can not be reversed. But you can look for such collisions.
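To make the collision argument concrete, here is a small Python sketch. It assumes a deliberately weakened hash, tiny_hash (a made-up name), that keeps only the first 16 bits of SHA-1, so that a collision turns up within at most 65,537 inputs. Full SHA-1 collisions exist for the same pigeonhole reason, but they are far too expensive to find this way.

```python
import hashlib


def tiny_hash(data: bytes) -> bytes:
    """First 2 bytes of SHA-1: a deliberately weakened 16-bit hash."""
    return hashlib.sha1(data).digest()[:2]


def find_collision() -> tuple[bytes, bytes]:
    """Hash the inputs b"0", b"1", ... until two of them share a tiny_hash value."""
    seen: dict[bytes, bytes] = {}
    i = 0
    while True:
        msg = str(i).encode()
        digest = tiny_hash(msg)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg
        i += 1


a, b = find_collision()
```

After this runs, a and b are different inputs with the same tiny_hash, so the output alone cannot tell you which of them was hashed.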
Edit
As you want to authenticate the communication between your systems, I would suggest using HMAC. This construct for calculating message authentication codes can use different hash functions. You can use SHA-1, SHA-256 or whatever hash function you want.
And to authenticate the response to a specific request, I would send a nonce along with the request that needs to be used as salt to authenticate the response.
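A rough Python sketch of the nonce idea, under the assumption that the requester picks a fresh 16-byte random nonce and the responder mixes it into an HMAC-SHA256 over its reply. The function names and the sample payload are hypothetical.

```python
import hashlib
import hmac
import secrets

SECRET = b"My Secret Key"  # shared secret; both sides are assumed to hold it
NONCE_SIZE = 16            # fixed size, so nonce + body splits unambiguously


def new_nonce() -> bytes:
    """Requester side: a fresh random nonce to send along with the request."""
    return secrets.token_bytes(NONCE_SIZE)


def make_response(secret: bytes, nonce: bytes, body: bytes) -> bytes:
    """Responder side: tag the reply body together with the request's nonce."""
    return hmac.new(secret, nonce + body, hashlib.sha256).digest()


def check_response(secret: bytes, nonce: bytes, body: bytes, tag: bytes) -> bool:
    """Requester side: accept the reply only if it is bound to our nonce."""
    expected = hmac.new(secret, nonce + body, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


nonce = new_nonce()
reply = b"temperature=21.5"
tag = make_response(SECRET, nonce, reply)
fresh_ok = check_response(SECRET, nonce, reply, tag)
stale_ok = check_response(SECRET, new_nonce(), reply, tag)  # reply replayed against a later request
```

A reply recorded from an earlier exchange fails the check for a later request, because the later request carries a different nonce.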
It is not entirely true that you cannot reverse a SHA-1 hashed string.
You cannot directly reverse one, but it can be done with rainbow tables.
Wikipedia:
A rainbow table is a precomputed table for reversing cryptographic hash functions, usually for cracking password hashes. Tables are usually used in recovering a plaintext password up to a certain length consisting of a limited set of characters.
Essentially, SHA-1 is only as safe as the strength of the password used. If users have long passwords with obscure combinations of characters, it is very unlikely that existing rainbow tables will have an entry for the hashed string.
You can test your SHA-1 hashes here:
http://sha1.gromweb.com/
There are other rainbow tables on the internet that you can use so Google reverse SHA1.
Note that the best attacks against MD5 and SHA-1 have been about finding any two arbitrary messages m1 and m2 where h(m1) = h(m2) or finding m2 such that h(m1) = h(m2) and m1 != m2. Finding m1, given h(m1) is still computationally infeasible.
Also, you are using a MAC (message authentication code), so an attacker can't forge a message without knowing the secret, with one caveat: the general MAC construction that you used is susceptible to a length extension attack. An attacker can in some circumstances forge a message m2|m3, h(secret, m2|m3) given m2, h(secret, m2). This is not an issue with just a timestamp, but it is an issue when you compute a MAC over messages of arbitrary length. You could append the secret to the timestamp instead of prepending it, but in general you are better off using HMAC with a SHA1 digest (HMAC is just a construction and can use MD5 or SHA as digest algorithms).
Finally, you are signing just the timestamp and not the full request. An active attacker can easily attack the system, especially if you have no replay protection (although even with replay protection, this flaw exists). For example, I can capture timestamp, HMAC(timestamp with secret) from one message and then use it in my own message and the server will accept it.
Best to send message, HMAC(message) with sufficiently long secret. The server can be assured of the integrity of the message and authenticity of the client.
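The difference can be sketched in Python as follows. The timestamp string, the two messages and the helper names are invented for the example; the point is only that a tag over the timestamp alone stays valid when the message is swapped, while a tag that also covers the message does not.

```python
import hashlib
import hmac

SECRET = b"My Secret Key"  # shared secret from the question


def tag_timestamp_only(secret: bytes, timestamp: bytes) -> bytes:
    """Flawed scheme: the tag does not depend on the message at all."""
    return hmac.new(secret, timestamp, hashlib.sha256).digest()


def tag_full_message(secret: bytes, timestamp: bytes, message: bytes) -> bytes:
    """Tag covers timestamp and message; the length prefix keeps the split unambiguous."""
    data = len(timestamp).to_bytes(2, "big") + timestamp + message
    return hmac.new(secret, data, hashlib.sha256).digest()


timestamp = b"2024-01-01T00:00:00Z"  # hypothetical value sent in the clear
genuine = b"unlock door 1"
forged = b"unlock all doors"

# The attacker captures (timestamp, tag) from the genuine message and attaches it to `forged`.
captured_weak = tag_timestamp_only(SECRET, timestamp)
weak_accepts_forgery = hmac.compare_digest(
    captured_weak, tag_timestamp_only(SECRET, timestamp)
)

captured_strong = tag_full_message(SECRET, timestamp, genuine)
strong_accepts_forgery = hmac.compare_digest(
    captured_strong, tag_full_message(SECRET, timestamp, forged)
)
```

With the timestamp-only tag the server's check passes for the forged message; with the full-message tag it fails.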
Depending on your threat scenario, you can either add replay protection or note that it is unnecessary because a message replayed in its entirety does not cause any problems.
Hashes are dependent on the input, and for the same input will give the same output.
So, in addition to the other answers, please keep the following in mind:
If you start the hash with the password, it is possible to pre-compute rainbow tables, and quickly add plausible timestamp values, which is much harder if you start with the timestamp.
So, rather than use
sha1("My Secret Key"+"a timestamp")
go for
sha1("a timestamp"+"My Secret Key")
I believe the accepted answer is technically right but wrong as it applies to the use case: to create & transmit tamper evident data over public/non-trusted mediums.
Because although it is technically highly-difficult to brute-force or reverse a SHA hash, when you are sending plain text "data & a hash of the data + secret" over the internet, as noted above, it is possible to intelligently get the secret after capturing enough samples of your data. Think about it - your data may be changing, but the secret key remains the same. So every time you send a new data blob out, it's a new sample to run basic cracking algorithms on. With 2 or more samples that contain different data & a hash of the data+secret, you can verify that the secret you determine is correct and not a false positive.
This scenario is similar to how Wifi crackers can crack wifi passwords after they capture enough data packets. After you gather enough data it's trivial to generate the secret key, even though you aren't technically reversing SHA1 or even SHA256. The ONLY way to ensure that your data has not been tampered with, or to verify who you are talking to on the other end, is to encrypt the entire data blob using GPG or the like (public & private keys). Hashing is, by nature, ALWAYS insecure when the data you are hashing is visible.
Practically speaking it really depends on the application and purpose of why you are hashing in the first place. If the level of security required is trivial or say you are inside of a 100% completely trusted network, then perhaps hashing would be a viable option. Hope no one on the network, or any intruder, is interested in your data. Otherwise, as far as I can determine at this time, the only other reliably viable option is key-based encryption. You can either encrypt the entire data blob or just sign it.
Note: This was one of the ways the British were able to crack the Enigma code during WW2, which helped the Allies.
Any thoughts on this?
SHA1 was designed to prevent recovery of the original text from the hash. However, SHA1 databases exist that allow looking up common passwords by their SHA hash.
Is it possible to reverse a SHA-1?
SHA-1 was meant to be a collision-resistant hash, whose purpose is to make it hard to find distinct messages that have the same hash. It is also designed to be preimage-resistant, that is, it should be hard to find a message having a prescribed hash, and second-preimage-resistant, so that it is hard to find a second message having the same hash as a prescribed message.
SHA-1's collision resistance was broken in practice in 2017 by Google's team, and NIST had already removed SHA-1 for signature purposes in 2015.
SHA-1 pre-image resistance, on the other hand, still exists. One should be careful about the pre-image resistance, if the input space is short, then finding the pre-image is easy. So, your secret should be at least 128-bit.
SHA-1("My Secret Key"+"a timestamp")
This prefix-secret construction is vulnerable to the length extension attack, which applies to Merkle-Damgard based hash functions like SHA-1. The attack was applied to Flickr. One should not use this construction with SHA-1 or SHA-2. One can use
HMAC-SHA-256 (HMAC doesn't require collision resistance of the hash function, so SHA-1 and MD5 are still fine for HMAC; however, forget about them) to achieve a better security system. HMAC costs two calls of the hash function, which is a weakness for time-critical systems. A note: HMAC is a beast in cryptography.
KMAC is the pre-fix secret construction from SHA-3, since SHA-3 has resistance to length extension attack, this is secure.
Use BLAKE2 with pre-fix construction and this is also secure since it has also resistance to length extension attacks. BLAKE is a really fast hash function, and now it has a parallel version BLAKE3, too (need some time for security analysis). Wireguard uses BLAKE2 as MAC.
Then I include this SHA-1 in the communication and the server, which can do the same calculation. And hopefully, nobody would be able to figure out the "secret key".
But is this really true?
If you know that this is how I did it, you would know that I did put a timestamp in there and you would see the SHA-1. Can you then use those two and figure out the "secret key"?
secret_key = bruteforce_sha1(sha1, timestamp)
You did not define the size of your secret. If your attacker knows the timestamp, they can look for the secret by exhaustive search. Consider the collective power of the Bitcoin miners: as of 2022, they compute around 2^93 double SHA-256 hashes in a year. Therefore, you must adjust your security according to your risk. As of 2022, NIST's minimum security level is 112 bits. One should consider a secret of 128 bits or more.
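The arithmetic behind that recommendation can be sketched in a few lines of Python. It assumes, as a rough simplification, that one guess at the secret costs about as much as one of the roughly 2^93 double SHA-256 evaluations per year quoted above; the function name and the list of key sizes are chosen for the example.

```python
RATE_LOG2 = 93  # log2 of guesses per year, taken from the Bitcoin estimate above


def years_to_exhaust(secret_bits: int, rate_log2: int = RATE_LOG2) -> float:
    """Years needed to try every secret of the given size at 2**rate_log2 guesses per year."""
    return 2.0 ** (secret_bits - rate_log2)


for bits in (64, 80, 112, 128):
    print(f"{bits:>3}-bit secret: {years_to_exhaust(bits):.3g} years")
```

On average an attacker succeeds after searching half the space, so the expected time is half of each figure. At this rate a 128-bit secret takes about 2^35 years to exhaust, while an 80-bit secret falls in a small fraction of a year.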
Note1: I guess you could brute force in some way, but how much work would that actually be?
Given the answer above. As a special case, against the possible implementation of Grover's algorithm ( a Quantum algorithm for finding pre-images), one should use hash functions larger than 256 output size.
Note2: I don't plan to encrypt any data, I just would like to know who sent it.
This is not the way. Your construction can only work if the secret is mutually shared, as with DHKE. That is, the secret is known only to the sender and you. Instead of managing this, a better way is to use digital signatures. Besides, one gets non-repudiation, too.
Any hashing algorithm is reversible, if applied to strings of max length L. The only matter is the value of L. To assess it exactly, you could run the state of art dehashing utility, hashcat. It is optimized to get best performance of your hardware.
That's why you need long passwords, like 12 characters. Here they say for length 8 the password is dehashed (using brute force) in 24 hours (1 GPU involved). For each extra character multiply it by alphabet length (say 50). So for 9 characters you have 50 days, for 10 you have 6 years, and so on. It's definitely inaccurate, but can give us an idea, what the numbers could be.
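The extrapolation above is easy to reproduce in Python. The sketch takes the figures from this answer as given (8 characters in roughly one day on one GPU, an alphabet of about 50 symbols); the constant and function names are made up.

```python
BASE_LENGTH = 8   # password length with a known cracking time
BASE_DAYS = 1.0   # about 24 hours on one GPU, per the figure quoted above
ALPHABET = 50     # assumed number of possible characters per position


def crack_days(length: int) -> float:
    """Estimated brute-force time in days for a password of the given length."""
    return BASE_DAYS * ALPHABET ** (length - BASE_LENGTH)


for length in range(8, 13):
    days = crack_days(length)
    print(f"{length} chars: {days:,.0f} days (~{days / 365:,.1f} years)")
```

Each extra character multiplies the time by 50, so 9 characters take 50 days and 10 characters take 2,500 days, which is where the figure of several years comes from.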

Why passing by value is sometimes better than passing by reference

It is commonly accepted that in certain circumstances it is better to pass a parameter by reference to avoid costly copying. But recently I watched a Handmade Hero series where Casey said that if the object is not too complex, sometimes it's better to pass it by value. I'm not too familiar with low-level details, but I assume it's connected with the cache. Could someone give a more solid explanation of what's going on?
If you pass by value you are likely passing via registers (assuming not too many arguments and each one is not too large). That means the callee doesn't need to do anything to use the values, they are already in registers. If passing by reference, the address of each value may be in a register, but that requires a dereference which needs to look in the CPU cache (if not main memory), which is slower.
On many popular systems you can pass-by-value roughly 5-10 values which are each as wide as an address.

How does the "Implementing FP languages with fast equality, sets and maps..." technique deal with garbage collection?

This paper presents a technique for the implementation of functional languages with fast equality, sets and maps, using hash-consing under the hood. As far as my understanding goes, it uses the address of a hash-consed value as its key when inserting it into a map. This has the advantage that figuring out the hashed key of essentially any value is O(1), as opposed to the standard O(N). What I don't understand, though, is: what happens with a map after a garbage collection? Since the GC process will cause the address of every value to change, the configuration of the map will be incorrect. In other words, there is no guarantee that addr(value) will be the same for the lifetime of the program.
Since the GC process will cause the address of every value to change
Only moving garbage collectors do that. When using non-moving algorithms like mark-and-sweep, all that happens is that unused objects are freed during the GC cycle - used objects stay exactly where they are.
Moving garbage collectors are generally seen as preferable to mark-and-sweep, but according to the abstract of the paper "mark-and-sweep becomes fast in a maximal sharing environment", which is further expanded on in section 2.4.4.
The paper also describes a way to make moving garbage collectors work (by assigning each object a unique id and using that instead of its address), but deems that impractical (section 2.4.2).
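To illustrate the unique-id alternative, here is a toy hash-consing table in Python. It is not the paper's implementation: the Node and HashConsTable names are invented, elements are assumed to be hashable, and the table keeps strong references, so nothing in it is ever collected (a real implementation would need weak references or cooperation from the GC).

```python
import itertools


class Node:
    """An immutable list cell; create instances only through HashConsTable.cons."""

    __slots__ = ("uid", "head", "tail")

    def __init__(self, uid, head, tail):
        self.uid = uid
        self.head = head
        self.tail = tail


class HashConsTable:
    """Returns one shared Node per distinct (head, tail) structure."""

    def __init__(self):
        self._nodes = {}
        self._next_uid = itertools.count()

    def cons(self, head, tail):
        # The tail is already hash-consed, so its uid identifies it completely.
        key = (head, tail.uid if tail is not None else None)
        node = self._nodes.get(key)
        if node is None:
            node = Node(next(self._next_uid), head, tail)
            self._nodes[key] = node
        return node


table = HashConsTable()
xs = table.cons(1, table.cons(2, None))
ys = table.cons(1, table.cons(2, None))  # structurally equal to xs
zs = table.cons(1, table.cons(3, None))  # differs in the second element

# The uid does not depend on where the object lives in memory, so it is a stable map key.
index = {xs.uid: "payload stored under a stable key"}
```

Because xs and ys are the same object, structural equality reduces to an identity check, and the uid stays usable as a map key even if a moving collector relocates the node.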

Does CGDataProviderCopyData() actually copy the bytes? Or just the pointer?

I'm running that method in quick succession as fast as I can, and the faster the better, so obviously if CGDataProviderCopyData() is actually copying the data byte-for-byte, then I think there must be a faster way to directly access that data...it's just bytes in memory. Anyone know for sure if CGDataProviderCopyData() actually copies the data? Or does it just create a new pointer to the existing data?
The bytes are copied.
Internally a CFData is created (CFDataCreate) with the content of the data provider. The CFDataCreate function always makes a copy.
Anyone know for sure if CGDataProviderCopyData() actually copies the data? Or does it just create a new pointer to the existing data?
Those are the same thing. Pointers are memory addresses; by definition, if you have the same data at two addresses, it is the same data in two places, so you must have copied it (either from one to the other or to both from a common origin).
So, let's restate the question accordingly:
Or does it just copy the existing pointer?
Quartz can't necessarily do this, because data providers do not necessarily provide an existing pointer, as they can be implemented as essentially stream-based (sequential) providers instead.
What about direct-access providers? Even those need not cough up a byte pointer; the provider may simply offer range-on-demand access instead.
But what if it does offer a byte pointer? Well, the documentation for that says:
You must not move or modify the provider data until Quartz calls your CGDataProviderReleaseBytePointerCallback function.
So, conceivably, Quartz could reuse the pointer. But what if you release the data provider (causing your ReleaseBytePointer callback to be called) before you release the data?
This could still be safe if Quartz implements a private custom subclass of CFData or NSData that either implements faulting or takes over the job of calling ReleaseBytePointer, so that if you create a direct-access provider and create a CFData from it and release the provider, you can still use the CFData object.
But that's a lot of ifs. They probably just create a plain old (bytes-copying-at-creation-time) CFData, which makes it a valid performance concern.
Profile it and see how much pain it's causing you. If it's enough to worry about, then you need some solutions:
You could just implement ReleaseBytePointer as a no-op (empty function body) and release the bytes separately, making sure to do so after releasing both the provider and the data. In theory, this prevents the bytes from going away out from under the CFData if it is using the original bytes pointer and Quartz doesn't implement a custom CFData subclass. A little hairy. Unfortunately, Apple can't really rely on you doing this, so I doubt it will actually help.
Handle an NS/CFData directly instead. Create the data provider only to pass it to Quartz, and release it and forget about it immediately thereafter (not own it yourself).
Depending on your needs, you may prefer to keep your callbacks structure in an instance variable and call them directly to copy parts of the data. Of course, if this solution works for you, then you don't have the problem described above anyway, since you aren't creating a here-you-can-have-my-bytes-pointer direct-access data provider.
The documentation for CGDataProviderCreateWithCFData doesn't say whether it returns a direct-access data provider or not, so you'll have to err on the side of caution if that's how you're creating your data provider.

What is the most practical Solution to Data Management using SQLite on the iPhone?

I'm developing an iPhone application and am new to Objective-C as well as SQLite. That being said, I have been struggling w/ designing a practical data management solution that is worthy of existing. Any help would be greatly appreciated.
Here's the deal:
The majority of the data my application interacts with is stored in five tables in the local SQLite database. Each table has a corresponding Class which handles initialization, hydration, dehydration, deletion, etc. for each object/row in the corresponding table. Whenever the application loads, it populates five NSMutableArrays (one for each type of object). In addition to a Primary Key, each object instance always has an ID attribute available, regardless of hydration state. In most cases it is a UUID which I can then easily reference.
Until a few days ago, I would simply access the objects via these arrays by tracking down their UUID. I would then proceed to hydrate/dehydrate them as I needed. However, some of the objects I have also maintain their own arrays which reference other objects' UUIDs. In the event that I must track down one of these "child" objects via its UUID, it becomes a bit more difficult.
In order to avoid having to enumerate through one of the previously mentioned arrays to find a "parent" object's UUID, and then proceed to find the "child's" UUID, I added a DataController w/ a singleton instance to simplify the process.
I had hoped that the DataController could provide a single access point to the local database and make things easier, but I'm not so certain that is the case. Basically, what I did is create multiple NSMutableDictionaries. Whenever the DataController is initialized, it enumerates through each of the previously mentioned NSMutableArrays maintained in the Application Delegate and creates a key/value pair in the corresponding dictionary, using the given object as the value and its UUID as the key.
The DataController then exposes procedures that allow a client to call in w/ a desired object's UUID to retrieve a reference to the actual object. Whenever there is a request for an object, the DataController automatically hydrates the object in question and then returns it. I did this because I wanted to take control of hydration out of the client's hands to prevent dehydrating an object being referenced multiple times.
I realize that in most cases I could just make a mutable copy of the object and then if necessary replace the original object down the road, but I wanted to avoid that scenario if at all possible. I therefore added an additional dictionary to monitor what objects are hydrated at any given time using the object's UUID as the key and a fluctuating count representing the number of hydrations w/out an offset dehydration. My goal w/ this approach was to have the DataController automatically dehydrate any object once its "hydration retainment count" hit zero, but this could easily lead to significant memory leaks as it currently relies on the caller to later call a procedure that decreases the hydration retainment count of the object. There are obviously many cases when this is just not obvious or maybe not even easily accomplished, and if only one calling object fails to do so properly I encounter the exact opposite scenario I was trying to prevent in the first place. Ironic, huh?
Anyway, I'm thinking that if I proceed w/ this approach that it will just end badly. I'm tempted to go back to the original plan but doing so makes me want to cringe and I'm sure there is a more elegant solution floating around out there. As I said before, any advice would be greatly appreciated. Thanks in advance.
I'd also be aware (as I'm sure you are) that CoreData is just around the corner, and make sure you make the right choice for the future.
Have you considered implementing this via the NSCoder interface? Not sure that it wouldn't be more trouble than it's worth, but if what you want is to extract all the data out into an in-memory object graph, and save it back later, that might be appropriate. If you're actually using SQL queries to limit the amount of in-memory data, then obviously, this wouldn't be the way to do it.
I decided to go w/ Core Data after all.