I am trying to get the MD5 hash of my data (an image downloaded from the internet). Unfortunately, I have upgraded the framework to Swift 3 and the method I have been using no longer works.
I have converted most of it but I am unable to get bytes out of the data:
import Foundation
import CommonCrypto
struct MD5 {
static func get(data: Data) -> String {
var digest = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
CC_MD5(data.bytes, CC_LONG(data.count), &digest)
var digestHex = ""
for index in 0..<Int(CC_MD5_DIGEST_LENGTH) {
digestHex += String(format: "%02x", digest[index])
}
return digestHex
}
}
CommonCrypto is already imported as a custom module. The problem is that I am getting 'bytes' is unavailable: use withUnsafeBytes instead on CC_MD5(data.bytes,...
So the question really is, how do I get the bytes out of the data and will this solution work?
CC_MD5(data.bytes, CC_LONG(data.count), &digest)
As noted, bytes is unavailable because it's dangerous. It's a raw pointer into memory that can vanish. The recommended solution is to use withUnsafeBytes, which promises that the target cannot vanish during the scope of the pointer. From memory, it would look something like this:
data.withUnsafeBytes { bytes in
CC_MD5(bytes, CC_LONG(data.count), &digest)
}
The point is that the bytes pointer can't escape into scopes where data is no longer valid.
For an example of this with CCHmac, which is pretty similar to MD5, see RNCryptor.
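As a minimal sketch of the approach above, assuming Swift 5 or later (where the closure receives an UnsafeRawBufferPointer rather than a typed pointer) and that CommonCrypto can be imported as a module. The function name md5Hex is hypothetical, not part of any library:

```swift
import Foundation
import CommonCrypto

/// Hypothetical helper: returns the MD5 digest of `data` as lowercase hex.
func md5Hex(of data: Data) -> String {
    // Zero-filled buffer that CC_MD5 writes the 16-byte digest into.
    var digest = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
    // The buffer pointer is only valid inside this closure, so it cannot escape.
    data.withUnsafeBytes { buffer in
        _ = CC_MD5(buffer.baseAddress, CC_LONG(buffer.count), &digest)
    }
    // Two lowercase hex digits per digest byte.
    return digest.map { String(format: "%02x", $0) }.joined()
}
```

Calling md5Hex(of: Data("abc".utf8)) should give 900150983cd24fb0d6963f7d28e17f72, the standard MD5 test vector for "abc".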
Here's a one liner:
import CryptoKit
let md5String = Insecure.MD5.hash(data: data).map { String(format: "%02hhx", $0) }.joined()
And for anyone that's interested, here's an example that you could build upon to support different Algorithms:
Usage:
Checksum.hash(data: data, using: .md5) == "MyMD5Hash"
Code Snippet:
import Foundation
import CommonCrypto
struct Checksum {
private init() {}
static func hash(data: Data, using algorithm: HashAlgorithm) -> String {
/// Creates an array of unsigned 8 bit integers that contains zeros equal in amount to the digest length
var digest = [UInt8](repeating: 0, count: algorithm.digestLength())
/// Call corresponding digest calculation
data.withUnsafeBytes {
algorithm.digestCalculation(data: $0.baseAddress, len: UInt32(data.count), digestArray: &digest)
}
var hashString = ""
/// Unpack each byte in the digest array and add them to the hashString
for byte in digest {
hashString += String(format:"%02x", UInt8(byte))
}
return hashString
}
/**
* Hash using CommonCrypto
* API exposed from CommonCrypto-60118.50.1:
* https://opensource.apple.com/source/CommonCrypto/CommonCrypto-60118.50.1/include/CommonDigest.h.auto.html
**/
enum HashAlgorithm {
case md5
case sha256
func digestLength() -> Int {
switch self {
case .md5:
return Int(CC_MD5_DIGEST_LENGTH)
case .sha256:
return Int(CC_SHA256_DIGEST_LENGTH)
}
}
/// CC_[HashAlgorithm] performs a digest calculation and places the result in the caller-supplied buffer for digest
/// Calls the given closure with a pointer to the underlying unsafe bytes of the data's contiguous storage.
func digestCalculation(data: UnsafeRawPointer!, len: UInt32, digestArray: UnsafeMutablePointer<UInt8>!) {
switch self {
case .md5:
CC_MD5(data, len, digestArray)
case .sha256:
CC_SHA256(data, len, digestArray)
}
}
}
}
Those two functions seem to have very close signatures and very close descriptions:
https://developer.apple.com/documentation/swift/string/init(utf8string:)-3mcco
String.init(utf8String:)
Creates a string by copying the data from a given null-terminated C array of UTF8-encoded bytes.
https://developer.apple.com/documentation/swift/string/init(validatingutf8:)-208fn
String.init(validatingUTF8:)
Creates a new string by copying and validating the null-terminated UTF-8 data referenced by the given pointer.
Since both are nullable initializers, what are their actual differences? Is there a possible input that would give a different output for each? If they are identical in behavior, then which one is recommended to use?
String.init(validatingUTF8:) is a method from the Swift standard library, the implementation is in CString.swift:
public init?(validatingUTF8 cString: UnsafePointer<CChar>) {
let len = UTF8._nullCodeUnitOffset(in: cString)
guard let str = cString.withMemoryRebound(to: UInt8.self, capacity: len, {
String._tryFromUTF8(UnsafeBufferPointer(start: $0, count: len))
})
else { return nil }
self = str
}
String.init(utf8String:) is implemented in NSStringAPI.swift:
/// Creates a string by copying the data from a given
/// C array of UTF8-encoded bytes.
public init?(utf8String bytes: UnsafePointer<CChar>) {
if let str = String(validatingUTF8: bytes) {
self = str
return
}
if let ns = NSString(utf8String: bytes) {
self = String._unconditionallyBridgeFromObjectiveC(ns)
} else {
return nil
}
}
and is the Swift overlay for the Foundation NSString initializer
- (nullable instancetype)initWithUTF8String:(const char *)nullTerminatedCString
which in turn is implemented on non-Apple platforms in swift-corelibs-foundation/Sources/Foundation/NSString.swift as
public convenience init?(utf8String nullTerminatedCString: UnsafePointer<Int8>) {
guard let str = String(validatingUTF8: nullTerminatedCString) else { return nil }
self.init(str)
}
So there is no difference in how the two methods convert C strings, but String.init(utf8String:) needs import Foundation whereas String.init(validatingUTF8:) does not need additional imports.
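A small sketch to try the two initializers side by side. It assumes a platform where CChar is Int8 (as on Apple platforms); the constant names are made up for illustration:

```swift
import Foundation

// "Hi" followed by the NUL terminator.
let valid: [CChar] = [72, 105, 0]
// 0xFF can never appear in well-formed UTF-8.
let invalid: [CChar] = [CChar(bitPattern: 0xFF), 0]

// Standard library initializer: no Foundation needed for this call.
let fromStdlib = valid.withUnsafeBufferPointer { String(validatingUTF8: $0.baseAddress!) }
// Foundation overlay initializer: needs `import Foundation`.
let fromFoundation = valid.withUnsafeBufferPointer { String(utf8String: $0.baseAddress!) }
// Validation fails on the malformed input, so the result is nil.
let rejected = invalid.withUnsafeBufferPointer { String(validatingUTF8: $0.baseAddress!) }

print(fromStdlib as Any, fromFoundation as Any, rejected as Any)
```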
I want to implement digest authentication with Swift. Unfortunately after hours of testing I saw that using this method of creating the md5 hash gives me the wrong result.
extension String {
var md5: String {
let data = Data(self.utf8)
let hash = data.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) -> [UInt8] in
var hash = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
CC_MD5(bytes.baseAddress, CC_LONG(data.count), &hash)
return hash
}
return hash.map { String(format: "%02x", $0) }.joined()
}
}
using this string
let test = "test:testrealm@host.com:pwd123".md5
test has the value 4ec2086d6f09366e4683dbdc5809444a, but it should be 939e7578ed9e3c518a452acee763bce9 (according to the digest authentication documentation). So my digest was always calculated incorrectly.
Thanks
Arnold
My error, it gives me the right result. I had an error computing the hash. The string extension is ok.
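One way to sanity-check such an extension is against a published value. The sketch below assumes CryptoKit is available (iOS 13 / macOS 10.15 or later) and uses a hypothetical helper md5HexString. It hashes the sample credentials from the RFC 2617 example, whose HA1 is the 939e7578... value quoted above. That value belongs to the RFC's sample username and password rather than to test/pwd123, which may be the kind of mix-up described here:

```swift
import Foundation
import CryptoKit

/// Hypothetical helper: lowercase hex MD5 of a string's UTF-8 bytes.
func md5HexString(_ string: String) -> String {
    Insecure.MD5.hash(data: Data(string.utf8))
        .map { String(format: "%02x", $0) }
        .joined()
}

// HA1 = MD5(username:realm:password), using the sample credentials from RFC 2617.
let ha1 = md5HexString("Mufasa:testrealm@host.com:Circle Of Life")
print(ha1)
```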
The only answer I found was in this, and I'm not satisfied with it.
I am adding a standard MD5 converter as a String extension:
/* ###################################################################################################################################### */
/**
From here: https://stackoverflow.com/q/24123518/879365
I am not making this public, because it requires the common crypto in the bridging header.
*/
fileprivate extension String {
/* ################################################################## */
/**
- returns: the String, as an MD5 hash.
*/
var md5: String {
let str = self.cString(using: String.Encoding.utf8)
let strLen = CUnsignedInt(self.lengthOfBytes(using: String.Encoding.utf8))
let digestLen = Int(CC_MD5_DIGEST_LENGTH)
let result = UnsafeMutablePointer<CUnsignedChar>.allocate(capacity: digestLen)
CC_MD5(str!, strLen, result)
let hash = NSMutableString()
for i in 0..<digestLen {
hash.appendFormat("%02x", result[i])
}
result.deallocate()
return hash as String
}
}
It requires that I add the following to my bridging header:
#import <CommonCrypto/CommonCrypto.h>
Since I'd like to add this to a suite of reusable tools, I'd like to see if there was a way to detect, at compile time, whether or not the common crypto library was being used.
Is there a way for me to set this up as a conditional compile?
It's not a big deal if not; just means that I'll need to set this up as a separate source file.
It might be worth noting that you can call CC_MD5 without a bridging header, if you use dlsym to access it.
import Foundation
typealias CC_MD5_Type = @convention(c) (UnsafeRawPointer, UInt32, UnsafeMutableRawPointer) -> UnsafeMutableRawPointer
let RTLD_DEFAULT = UnsafeMutableRawPointer(bitPattern: -2)
let CC_MD5 = unsafeBitCast(dlsym(RTLD_DEFAULT, "CC_MD5")!, to: CC_MD5_Type.self)
var md5 = Data(count: 16)
md5.withUnsafeMutableBytes {
// In Swift 5 the closure receives a raw buffer pointer, so pass its base address.
_ = CC_MD5("abc", 3, $0.baseAddress!)
}
assert(md5 == Data(bytes: [0x90, 0x01, 0x50, 0x98, 0x3C, 0xD2, 0x4F, 0xB0, 0xD6, 0x96, 0x3F, 0x7D, 0x28, 0xE1, 0x7F, 0x72]))
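On the conditional-compilation part of the question: a sketch, assuming Swift 4.1 or later (for #if canImport) and an SDK that ships CommonCrypto as a module (Xcode 10 and later, as far as I know). The property name md5Hex is hypothetical:

```swift
import Foundation
#if canImport(CommonCrypto)
import CommonCrypto
#endif

extension String {
    /// Hypothetical property: hex MD5 of the UTF-8 bytes,
    /// or nil when CommonCrypto is not available at compile time.
    var md5Hex: String? {
        #if canImport(CommonCrypto)
        let bytes = Array(self.utf8)
        var digest = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
        // Arrays are passed to the C function as pointers to their storage.
        _ = CC_MD5(bytes, CC_LONG(bytes.count), &digest)
        return digest.map { String(format: "%02x", $0) }.joined()
        #else
        return nil
        #endif
    }
}
```

The optional return type is one possible design choice; the extension could equally be wrapped whole in the #if block so that it simply does not exist without CommonCrypto.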
Here was the solution I hit. This was a mash-up of my original variant, and Rob's excellent answer. It works a charm (I use it for building responses to RFC2617 Digest authentication). Because of the use of the built-in hooks, I don't need the bridging header anymore, and can add this to my set of String extensions.
A pretty classic case of the correct answer coming from a completely different place from where I was looking. I love it when that happens.
Here ya go:
public extension String {
/* ################################################################## */
/**
From here: https://stackoverflow.com/q/24123518/879365, but modified from here: https://stackoverflow.com/a/55639723/879365
- returns: an MD5 hash of the String
*/
var md5: String {
var hash = ""
// Start by getting a C-style string of our string as UTF-8.
if let str = self.cString(using: .utf8) {
// This is a cast for the MD5 function. The convention attribute just says that it's a "raw" C function.
typealias CC_MD5_Type = @convention(c) (UnsafeRawPointer, UInt32, UnsafeMutableRawPointer) -> UnsafeMutableRawPointer
// This is a flag, telling the name lookup to happen in the global scope. No dlopen required.
let RTLD_DEFAULT = UnsafeMutableRawPointer(bitPattern: -2)
// This loads a function pointer with the CommonCrypto MD5 function.
let CC_MD5 = unsafeBitCast(dlsym(RTLD_DEFAULT, "CC_MD5")!, to: CC_MD5_Type.self)
// This is the length of the hash
let CC_MD5_DIGEST_LENGTH = 16
// This is where our MD5 hash goes. It's a simple 16-byte buffer.
let result = UnsafeMutablePointer<CUnsignedChar>.allocate(capacity: CC_MD5_DIGEST_LENGTH)
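// Execute the MD5 hash. Save the result in our buffer.
// The C string array ends with a NUL terminator, so pass the UTF-8 byte length rather than str.count.
_ = CC_MD5(str, CUnsignedInt(self.lengthOfBytes(using: .utf8)), result)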
// Turn it into a normal Swift String of hex digits.
for i in 0..<CC_MD5_DIGEST_LENGTH {
hash.append(String(format: "%02x", result[i]))
}
// Don't need this anymore.
result.deallocate()
}
return hash
}
}
I've tried to reproduce the PHP function hash_hmac('sha256', $key, $secret_key) in Swift 4 without success, using libraries such as CommonCrypto and CryptoSwift. I need this function for API authentication with the Alamofire library, which is a great library. Since I use Swift 4, compatibility with other Swift libraries is not so good. Even with CryptoSwift, whose latest version (0.7.1) targets Swift 4, I still get a lot of compatibility errors.
Swift 3/4:
HMAC with MD5, SHA1, SHA224, SHA256, SHA384, SHA512 (Swift 3)
These functions will hash either String or Data input with one of six cryptographic hash algorithms.
The name parameter specifies the hash function name as a String
Supported functions are MD5, SHA1, SHA224, SHA256, SHA384 and SHA512
This example requires Common Crypto
It is necessary to have a bridging header to the project:
#import <CommonCrypto/CommonCrypto.h>
Add the Security.framework to the project.
These functions take a hash name, a message to be hashed, and a key, and return a digest:
hashName: name of a hash function as String
message: message as Data
key: key as Data
returns: digest as Data
func hmac(hashName:String, message:Data, key:Data) -> Data? {
let algos = ["SHA1": (kCCHmacAlgSHA1, CC_SHA1_DIGEST_LENGTH),
"MD5": (kCCHmacAlgMD5, CC_MD5_DIGEST_LENGTH),
"SHA224": (kCCHmacAlgSHA224, CC_SHA224_DIGEST_LENGTH),
"SHA256": (kCCHmacAlgSHA256, CC_SHA256_DIGEST_LENGTH),
"SHA384": (kCCHmacAlgSHA384, CC_SHA384_DIGEST_LENGTH),
"SHA512": (kCCHmacAlgSHA512, CC_SHA512_DIGEST_LENGTH)]
guard let (hashAlgorithm, length) = algos[hashName] else { return nil }
var macData = Data(count: Int(length))
macData.withUnsafeMutableBytes {macBytes in
message.withUnsafeBytes {messageBytes in
key.withUnsafeBytes {keyBytes in
CCHmac(CCHmacAlgorithm(hashAlgorithm),
keyBytes, key.count,
messageBytes, message.count,
macBytes)
}
}
}
return macData
}
hashName: name of a hash function as String
message: message as String
key: key as String
returns: digest as Data
func hmac(hashName:String, message:String, key:String) -> Data? {
let messageData = message.data(using:.utf8)!
let keyData = key.data(using:.utf8)!
return hmac(hashName:hashName, message:messageData, key:keyData)
}
hashName: name of a hash function as String
message: message as String
key: key as Data
returns: digest as Data
func hmac(hashName:String, message:String, key:Data) -> Data? {
let messageData = message.data(using:.utf8)!
return hmac(hashName:hashName, message:messageData, key:key)
}
// Examples
let clearString = "clearData0123456"
let keyString = "keyData8901234562"
let clearData = clearString.data(using:.utf8)!
let keyData = keyString.data(using:.utf8)!
print("clearString: \(clearString)")
print("keyString: \(keyString)")
print("clearData: \(clearData as NSData)")
print("keyData: \(keyData as NSData)")
let hmacData1 = hmac(hashName:"SHA1", message:clearData, key:keyData)
print("hmacData1: \(hmacData1! as NSData)")
let hmacData2 = hmac(hashName:"SHA1", message:clearString, key:keyString)
print("hmacData2: \(hmacData2! as NSData)")
let hmacData3 = hmac(hashName:"SHA1", message:clearString, key:keyData)
print("hmacData3: \(hmacData3! as NSData)")
Output:
clearString: clearData0123456
keyString: keyData8901234562
clearData: <636c6561 72446174 61303132 33343536>
keyData: <6b657944 61746138 39303132 33343536 32>
hmacData1: <bb358f41 79b68c08 8e93191a da7dabbc 138f2ae6>
hmacData2: <bb358f41 79b68c08 8e93191a da7dabbc 138f2ae6>
hmacData3: <bb358f41 79b68c08 8e93191a da7dabbc 138f2ae6>
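For the original question (a hex string like the one PHP's hash_hmac('sha256', ...) returns by default), a minimal Swift 5 sketch, assuming CommonCrypto is importable as a module. The function name hmacSHA256Hex is hypothetical; arrays are passed directly, so no nested withUnsafeBytes closures are needed:

```swift
import Foundation
import CommonCrypto

/// Hypothetical helper: HMAC-SHA256 of `message` under `key`, as lowercase hex.
func hmacSHA256Hex(message: String, key: String) -> String {
    let keyBytes = Array(key.utf8)
    let messageBytes = Array(message.utf8)
    // SHA-256 produces a 32-byte MAC.
    var mac = [UInt8](repeating: 0, count: Int(CC_SHA256_DIGEST_LENGTH))
    CCHmac(CCHmacAlgorithm(kCCHmacAlgSHA256),
           keyBytes, keyBytes.count,
           messageBytes, messageBytes.count,
           &mac)
    return mac.map { String(format: "%02x", $0) }.joined()
}

print(hmacSHA256Hex(message: "what do ya want for nothing?", key: "Jefe"))
```

The message and key in the print call are test case 2 from RFC 4231, whose expected HMAC-SHA-256 is 5bdcc146bf60754e6a042426089575c75a003f089d2739839dec58b964ec3843.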
First of all, it might be better to go straight for SHA512: SHA is notoriously easy to crack with GPUs, so upping the memory scale a bit is not a bad idea.
Second, using CommonCrypto it is actually extremely easy to generate HMACs. This is the implementation that I use:
static func hmac(_ secretKey: inout [UInt8], cipherText: inout [UInt8], algorithm: CommonCrypto.HMACAlgorithm = .sha512) -> [UInt8] {
var mac = [UInt8](repeating: 0, count: 64)
CCHmac(algorithm.value, &secretKey, secretKey.count, &cipherText, cipherText.count, &mac)
return mac
}
Where the algorithm is defined as such:
enum HMACAlgorithm {
case sha512
var value: UInt32 {
switch(self) {
case .sha512:
return UInt32(kCCHmacAlgSHA512)
}
}
}
My cipher text is cipherText+IV in this instance. When you are not using AES-GCM it seems suggested / recommended to HMAC IV+Cipher, but I cannot give you the technical details as to why.
Converting Data or NSData to a byte array:
var byteArray = data.withUnsafeBytes { [UInt8](UnsafeBufferPointer(start: $0, count: data.count)) }
The reason for using an array is a substantial performance increase over Data. I don't know what the core team is doing, but Data performs even worse than NSMutableData.
My application uses a somewhat complex immutable data structure that is encoded in a binary file. I need to have access to it at the byte level, avoiding any copying. Normally, I would use C or C++ pointer arithmetic and typecasts to access and interpret the raw byte values. I would like to do the same with Swift.
I have found that the following works:
class RawData {
var data: NSData!
init(rawData: NSData) {
data = rawData
}
func read<T>(byteLocation: Int) -> T {
let bytes = data.subdataWithRange(NSMakeRange(byteLocation, sizeof(T))).bytes
return UnsafePointer<T>(bytes).memory
}
func example_ReadAnIntAtByteLocation5() -> Int {
return read(5) as Int
}
}
However, I am not sure how efficient it is. Do data.subdataWithRange and NSMakeRange allocate objects every time I call them, or are they just syntactic sugar for dealing with pointers?
Is there a better way to do this in Swift?
EDIT:
I have created a small Objective-C class that just encapsulates a function to offset a pointer by a given number of bytes:
@implementation RawDataOffsetPointer
inline void* offsetPointer(void* ptr, int bytes){
return (char*)ptr + bytes;
}
@end
If I include this class in the bridging header, then I can change my read method to
func read<T>(byteLocation: Int) -> T {
let ptr = offsetPointer(data.bytes, CInt(byteLocation))
return UnsafePointer<T>(ptr).memory
}
which will not copy data from my buffer, or allocate other objects.
However, it would still be nice to do some pointer arithmetic from Swift, if it were possible.
If you just want to do it directly, UnsafePointer<T> can be manipulated arithmetically:
let oldPointer = UnsafePointer<()>
let newPointer = oldPointer + 10
You can also cast a pointer like so (UnsafePointer<()> is equivalent to void *)
let castPointer = UnsafePointer<MyStruct>(oldPointer)
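In current Swift the same idea can be written with raw pointers. A sketch, assuming Swift 5 and Foundation's Data; readInteger is a hypothetical helper, and it copies only the few bytes of the value (not the buffer), so that unaligned offsets such as 5 are safe:

```swift
import Foundation

/// Hypothetical helper: reads a fixed-width integer stored at `offset`,
/// interpreting the bytes in the host's byte order.
func readInteger<T: FixedWidthInteger>(_ type: T.Type, from data: Data, atByteOffset offset: Int) -> T {
    let size = MemoryLayout<T>.size
    precondition(offset >= 0 && offset + size <= data.count, "read out of bounds")
    var value: T = 0
    data.withUnsafeBytes { buffer in
        // Raw pointers step in bytes, so `+ offset` is plain byte arithmetic.
        let source = buffer.baseAddress! + offset
        // Copy the bytes into `value` instead of rebinding the pointer,
        // which avoids alignment requirements on `source`.
        withUnsafeMutableBytes(of: &value) { destination in
            destination.copyMemory(from: UnsafeRawBufferPointer(start: source, count: size))
        }
    }
    return value
}

let sample = Data([0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])
let oneByte = readInteger(UInt8.self, from: sample, atByteOffset: 5)
// Interpret the two bytes at offset 5 as little-endian, whatever the host order is.
let twoBytes = UInt16(littleEndian: readInteger(UInt16.self, from: sample, atByteOffset: 5))
print(oneByte, twoBytes)
```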
I would recommend looking into NSInputStream, which allows you to read NSData as a series of bytes (UInt8 in Swift).
Here is a little sample I put together in the playground:
func generateRandomData(count:Int) -> NSData
{
var array = Array<UInt8>(count: count, repeatedValue: 0)
arc4random_buf(&array, UInt(count))
return NSData(bytes: array, length: count)
}
let randomData = generateRandomData(256 * 1024)
let stream = NSInputStream(data: randomData)
stream.open() // IMPORTANT
var readBuffer = Array<UInt8>(count: 16 * 1024, repeatedValue: 0)
var totalBytesRead = 0
while (totalBytesRead < randomData.length)
{
let numberOfBytesRead = stream.read(&readBuffer, maxLength: readBuffer.count)
// Do something with the data
totalBytesRead += numberOfBytesRead
}
You can create an extension to read primitive types like so:
extension NSInputStream
{
func readInt32() -> Int
{
var readBuffer = Array<UInt8>(count:sizeof(Int32), repeatedValue: 0)
var numberOfBytesRead = self.read(&readBuffer, maxLength: readBuffer.count)
return Int(readBuffer[0]) << 24 |
Int(readBuffer[1]) << 16 |
Int(readBuffer[2]) << 8 |
Int(readBuffer[3])
}
}
I would recommend the simple approach of using UnsafeBufferPointer (called UnsafeArray in early Swift betas).
let data = NSData(contentsOfFile: filename)
let ptr = UnsafePointer<UInt8>(data.bytes)
let bytes = UnsafeBufferPointer<UInt8>(start:ptr, count:data.length)