Does UTF-16 encoding handle data compression by default? - Swift

I have the Unicode character த.
When I convert it to data,
UTF 8 -> Size: 3 bytes Array: [224, 174, 164]
UTF 16 -> Size: 4 bytes Array: [2980]
Seems pretty simple: UTF-8 takes 1 byte per code unit (3 in total) and UTF-16 takes 4 bytes per character. But if I use "தததத" in Swift on macOS,
let tamil = "தததத"
let utf8Data = tamil.data(using: .utf8)!
let utf16Data = tamil.data(using: .utf16)!
print("UTF 8 -> Size: \(utf8Data.count) bytes Array: \(tamil.utf8.map({$0}))")
print("UTF 16 -> Size: \(utf16Data.count) bytes Array: \(tamil.utf16.map({$0}))")
Then the output is
UTF 8 -> Size: 12 bytes Array: [224, 174, 164, 224, 174, 164, 224, 174, 164, 224, 174, 164]
UTF 16 -> Size: 10 bytes Array: [2980, 2980, 2980, 2980]
The UTF-16 data for "தததத" should then be 4 × 4 = 16 bytes, but it is only 10 bytes, and the array still has 4 code units. Why is that? Where did the 6 bytes go?

The actual byte representation of those strings is this:
UTF-8:
e0ae a4e0 aea4 e0ae a4e0 aea4
UTF-16:
feff 0ba4 0ba4 0ba4 0ba4
The UTF-8 representation is e0aea4, four times.
The UTF-16 representation is 0ba4, four times, plus a single leading BOM feff.
UTF-16 text should start with a BOM, but the BOM is only required once at the start of the string, not once per character.
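You can see this from Swift directly: Foundation's plain .utf16 encoding prepends the BOM, while the byte-order-specific encodings omit it (a minimal sketch):
let tamil = "தததத"
print(tamil.data(using: .utf16)!.count)             // 10: 2-byte BOM + 4 code units × 2 bytes
print(tamil.data(using: .utf16BigEndian)!.count)    // 8: no BOM
print(tamil.data(using: .utf16LittleEndian)!.count) // 8: no BOM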

Related

How to get correct RGB color values from PNG with PLTE palette using NSBitmapImageRep in Swift?

Given a PNG image, I would like to get the RGB color at a given pixel using Swift on macOS.
This is simply done by:
let d = Data(base64Encoded: "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVQImWP4z8AAAAMBAQCc479ZAAAAAElFTkSuQmCC")!
let nsImage = NSImage(data: d)!
let bitmapImageRep = NSBitmapImageRep(cgImage: nsImage.cgImage(forProposedRect: nil, context: nil, hints: [:])!)
let color = bitmapImageRep.colorAt(x: 0, y: 0)
print(color!.redComponent, color!.greenComponent, color!.blueComponent, separator: " ")
// 1.0 0.0 0.0
This minimal example contains a one-pixel PNG in RGB encoding with a single red pixel, base64-encoded to make it easily reproducible.
printf iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVQImWP4z8AAAAMBAQCc479ZAAAAAElFTkSuQmCC | base64 --decode | pngcheck -v
File: stdin (140703128616967 bytes)
chunk IHDR at offset 0xfffffffffffffffb, length 13
1 x 1 image, 24-bit RGB, non-interlaced
chunk IDAT at offset 0xfffffffffffffffb, length 12
zlib: deflated, 256-byte window, default compression
chunk IEND at offset 0xfffffffffffffffb, length 0
No errors detected in stdin (3 chunks, -133.3% compression).
When changing this PNG to an indexed PNG with a PLTE palette
printf iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAMAAAAoyzS7AAAAA1BMVEX/AAAZ4gk3AAAACklEQVQImWNgAAAAAgAB9HFkpgAAAABJRU5ErkJggg== | base64 --decode | pngcheck -v
File: stdin (140703128616967 bytes)
chunk IHDR at offset 0xfffffffffffffffb, length 13
1 x 1 image, 8-bit palette, non-interlaced
chunk PLTE at offset 0xfffffffffffffffb, length 3: 1 palette entry
chunk IDAT at offset 0xfffffffffffffffb, length 10
zlib: deflated, 256-byte window, default compression
chunk IEND at offset 0xfffffffffffffffb, length 0
No errors detected in stdin (4 chunks, -600.0% compression).
the snippet above changes to
let d = Data(base64Encoded: "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAMAAAAoyzS7AAAAA1BMVEX/AAAZ4gk3AAAACklEQVQImWNgAAAAAgAB9HFkpgAAAABJRU5ErkJggg==")!
let nsImage = NSImage(data: d)!
let bitmapImageRep = NSBitmapImageRep(cgImage: nsImage.cgImage(forProposedRect: nil, context: nil, hints: [:])!)
let color = bitmapImageRep.colorAt(x: 0, y: 0)
print(color!.redComponent, color!.greenComponent, color!.blueComponent, separator: " ")
// 0.984313725490196 0.0 0.027450980392156862
This produces unexpected output: a red component close to one and a non-zero blue component. Where does this change in color come from?
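One way to narrow this down (a diagnostic sketch, not an answer): print the raw sample bytes and the color space the bitmap rep reports, since a mismatch there would point to a color-space conversion rather than wrong pixel data.
if let bytes = bitmapImageRep.bitmapData {
    // Samples of the first pixel, exactly as stored in the rep.
    print(Array(UnsafeBufferPointer(start: bytes, count: bitmapImageRep.samplesPerPixel)))
}
print(bitmapImageRep.colorSpace)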

Hash is not working on Array<UInt8> datatype in Swift

I'm trying to encode a payload using the CBOR.encode method from SwiftCBOR, which gives me a result in Array<UInt8> format. When that result is hashed, the hash also comes back as Array<UInt8>, but I need it as a hex string.
payload = {productName:"Philips Hue White A19 4-Pack 60W Equivalent Dimmable LED Smart Bulb"}
let payloadBytes = CBOR.encode(payload)
print("payloadbytes:",payloadBytes); // payloadbytes: [120, 83, 123,..., 98, 34, 125]
print("hashed payload",payloadBytes.sha512()); //[176, 26, 154,..., 85, 75, 77]
The expected output of hashing the payload bytes is:
6097B04A89468E7AAA8A1784B9CBAF0D2E968AFFC7F5AFE3B0B30CDF44A79EB41464EC773D0D729FA7E6AD6F7462EDEA09B34177CDB0D0DF4A6DDF3C6117C01E
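The digest itself is just bytes; to compare it with that value it has to be rendered as hex. A minimal sketch, reusing payloadBytes from above (assuming CryptoSwift's sha512() on Array<UInt8>, as in the question; it will only match the expected string if the hashed bytes themselves match):
import Foundation
let digest: [UInt8] = payloadBytes.sha512()
// Format each byte as two uppercase hex digits; CryptoSwift's
// toHexString() does the same in lowercase.
let hex = digest.map { String(format: "%02X", $0) }.joined()
print("hashed payload:", hex)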

WebCrypto AES-CBC outputting 256 bits instead of 128 bits

I'm playing with WebCrypto and I'm getting a confusing output.
The following test case encrypts a random 16-byte (128-bit) plain text with a newly generated 128-bit key and a random 128-bit IV, but it outputs a 32-byte (256-bit) result.
If I remember the details of AES-CBC correctly, it should output 128-bit blocks.
function test() {
    var data = new Uint8Array(16);
    window.crypto.getRandomValues(data);
    console.log(data);
    window.crypto.subtle.generateKey(
        {
            name: "AES-CBC",
            length: 128,
        },
        false,
        ["encrypt", "decrypt"]
    )
    .then(function(key) {
        // returns a key object
        console.log(key);
        window.crypto.subtle.encrypt(
            {
                name: "AES-CBC",
                iv: window.crypto.getRandomValues(new Uint8Array(16)),
            },
            key,
            data
        )
        .then(function(encrypted) {
            console.log(new Uint8Array(encrypted));
        })
        .catch(function(err) {
            console.error(err);
        });
    })
    .catch(function(err) {
        console.error(err);
    });
}
Example output:
Uint8Array(16) [146, 207, 22, 56, 56, 151, 125, 174, 137, 69, 133, 36, 218, 114, 143, 174]
CryptoKey {
    algorithm: {name: "AES-CBC", length: 128}
    extractable: false
    type: "secret"
    usages: (2) ["encrypt", "decrypt"]
    __proto__: CryptoKey
}
Uint8Array(32) [81, 218, 52, 158, 115, 105, 57, 230, 45, 253, 153, 54, 183, 19, 137, 240, 183, 229, 241, 75, 182, 19, 237, 8, 238, 5, 108, 107, 123, 84, 230, 209]
Any idea what I've got wrong?
(Open to moving to crypto.stackexchange.com if more suitable.)
I'm testing on Chrome 71 on macOS at the moment.
Yes. The extra 16 bytes are the padding. Even when the message text is a multiple of the block size, padding is added; otherwise the decryption logic wouldn't know where to look for padding.
The Web Cryptography API Specification says:
When operating in CBC mode, messages that are not exact multiples of the AES block size (16 bytes) can be padded under a variety of padding schemes. In the Web Crypto API, the only padding mode that is supported is that of PKCS#7, as described by Section 10.3, step 2, of [RFC2315].
This means unlike other language implementations (like Java) where you can specify NoPadding when you know that your input message text is always going to be a multiple of block size (128 bits for AES), Web Cryptography API forces you to have PKCS#7 padding.
If we look into RFC 2315:
Some content-encryption algorithms assume the input length is a multiple of k octets, where k > 1, and let the application define a method for handling inputs whose lengths are not a multiple of k octets. For such algorithms, the method shall be to pad the input at the trailing end with k - (l mod k) octets all having value k - (l mod k), where l is the length of the input. In other words, the input is padded at the trailing end with one of the following strings:
01 -- if l mod k = k-1
02 02 -- if l mod k = k-2
.
.
.
k k ... k k -- if l mod k = 0
The padding can be removed unambiguously since all input is padded and no padding string is a suffix of another. This padding method is well-defined if and only if k < 256; methods for larger k are an open issue for further study.
Note: k k ... k k -- if l mod k = 0
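Translated into a quick Swift sketch (a hypothetical pkcs7Pad helper, just to show why an already-aligned 16-byte message grows by a full extra block):
func pkcs7Pad(_ input: [UInt8], blockSize k: Int) -> [UInt8] {
    // RFC 2315: append k - (l mod k) bytes, each holding the value k - (l mod k);
    // for an input that is already a multiple of k, that is a full block of k.
    let pad = k - (input.count % k)   // always in 1...k, never 0
    return input + [UInt8](repeating: UInt8(pad), count: pad)
}
print(pkcs7Pad([UInt8](repeating: 0, count: 16), blockSize: 16).count) // 32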
If you refer to the subtle.encrypt signature, you have no way to specify the padding mode. This means the decryption logic always expects padding.
However, in your case, if you use the Web Cryptography API only for encryption and your Python app (with NoPadding) only for decryption, I think you can simply strip off the last 16 bytes from the cipher text before feeding it to the Python app. Here is a code sample just for demonstration purposes:
function test() {
    let plaintext = 'GoodWorkGoodWork';
    let encoder = new TextEncoder('utf8');
    let dataBytes = encoder.encode(plaintext);
    window.crypto.subtle.generateKey(
        {
            name: "AES-CBC",
            length: 128,
        },
        true,
        ["encrypt", "decrypt"]
    )
    .then(function(key) {
        crypto.subtle.exportKey('raw', key)
            .then(function(expKey) {
                console.log('Key = ' + btoa(String.fromCharCode(...new Uint8Array(expKey))));
            });
        let iv = new Uint8Array(16);
        window.crypto.getRandomValues(iv);
        let ivb64 = btoa(String.fromCharCode(...new Uint8Array(iv)));
        console.log('IV = ' + ivb64);
        window.crypto.subtle.encrypt(
            {
                name: "AES-CBC",
                iv: iv,
            },
            key,
            dataBytes
        )
        .then(function(encrypted) {
            console.log('Cipher text = ' + btoa(String.fromCharCode(...new Uint8Array(encrypted))));
        })
        .catch(function(err) {
            console.error(err);
        });
    })
    .catch(function(err) {
        console.error(err);
    });
}
The output of the above is:
IV = qW2lanfRo2H/3aSLzxIecA==
Key = 0LDBq5iz243HBTUE/lrM+A==
Cipher text = Wa4nIF0tt4PEBUChiH1KCkSOg6L2daoYdboEEf+Oh6U=
Now I take these as input, strip off the last 16 bytes of the cipher text, and still get the same message text after decryption using the following Java code:
package com.sapbasu.javastudy;

import java.nio.charset.StandardCharsets;
import java.util.Base64;

import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EncryptCBC {

    public static void main(String[] arg) throws Exception {
        SecretKey key = new SecretKeySpec(Base64.getDecoder().decode(
            "0LDBq5iz243HBTUE/lrM+A=="), "AES");
        IvParameterSpec ivSpec = new IvParameterSpec(Base64.getDecoder().decode(
            "qW2lanfRo2H/3aSLzxIecA=="));
        Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, ivSpec);
        // Keep only the first 16 bytes, dropping the 16-byte padding block.
        byte[] cipherTextWoPadding = new byte[16];
        System.arraycopy(Base64.getDecoder().decode(
            "Wa4nIF0tt4PEBUChiH1KCkSOg6L2daoYdboEEf+Oh6U="),
            0, cipherTextWoPadding, 0, 16);
        byte[] decryptedMessage = cipher.doFinal(cipherTextWoPadding);
        System.out.println(new String(decryptedMessage, StandardCharsets.UTF_8));
    }
}

CryptoSwift with AES128 CTR Mode - Buggy counter increment?

I've encountered a problem with the CryptoSwift API (krzyzanowskim) while using AES-128 in CTR mode: my test function (nullArrayBugTest()) produces, for specific counter values between 0 and 25 (namely 13 and 24), a wrong array count that should always be 16!
The same happens even if I use the manually incremented "iv_13" with the buggy value 13 instead of the default "iv_0" with counter 13...
Run it to get an idea of what I mean.
func nullArrayBugTest() {
    var ctr: CTR
    let nilArrayToEncrypt = Data(hex: "00000000000000000000000000000000")
    let key_ = Data(hex: "000a0b0c0d0e0f010203040506070809")
    let iv_0: Array<UInt8> = [0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f]
    //let iv_13: Array<UInt8> = [0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x1c]
    var decryptedNilArray = [UInt8]()
    for i in 0...25 {
        ctr = CTR(iv: iv_0, counter: i)
        do {
            let aes = try AES(key: key_.bytes, blockMode: ctr)
            decryptedNilArray = try aes.decrypt([UInt8](nilArrayToEncrypt))
            print("AES_testcase_\(i) for ctr: \(ctr) withArrayCount: \(decryptedNilArray.count)")
        } catch {
            print("De-/En-CryptData failed with: \(error)")
        }
    }
}
Output with buggy values
Why I always need the encrypted array to have 16 values is not important :D.
Does anybody know why the aes.decrypt() function behaves like this?
Thanks for your time.
Michael S.
CryptoSwift defaults to PKCS#7 padding. Your resulting plaintexts have invalid padding. CryptoSwift ignores padding errors, which IMO is a bug, but that's how it's implemented. (All the counters that you're considering "correct" should really have failed to decrypt at all.) (I talked this over with Marcin, and he reminded me that even at this low level it's normal to ignore padding errors, to avoid padding-oracle attacks. I'd forgotten that I do it this way too....)
That said, sometimes the padding will be "close enough" that CryptoSwift will try to remove padding bytes. It usually won't be valid padding, but it'll be close enough for CryptoSwift's test.
As an example, your first counter creates the following padded plaintext:
[233, 222, 112, 79, 186, 18, 139, 53, 208, 61, 91, 0, 120, 247, 187, 254]
254 > 16, so CryptoSwift doesn't try to remove padding.
For a counter of 13, the following padded plaintext is returned:
[160, 140, 187, 255, 90, 209, 124, 158, 19, 169, 164, 110, 157, 245, 108, 12]
12 < 16, so CryptoSwift removes 12 bytes, leaving 4. (This is not how PKCS#7 padding works, but it's how CryptoSwift works.)
The underlying problem is you're not decrypting something you encrypted. You're just running a static block through the decryption scheme.
If you don't want padding, you can request that:
let aes = try AES(key: key_.bytes, blockMode: ctr, padding: .noPadding)
This will return you what you're expecting.
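Applied to the loop in the question (a sketch reusing its variables; every counter value, including 13 and 24, should now yield 16 bytes):
let aes = try AES(key: key_.bytes, blockMode: ctr, padding: .noPadding)
decryptedNilArray = try aes.decrypt([UInt8](nilArrayToEncrypt))
// decryptedNilArray.count == 16 for every counter value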
Just in case there's any confusion by other readers: this use of CTR is wildly insecure and no part of it should be copied. I'm assuming that the actual encryption code doesn't work anything like this.
I guess the encryption happens without padding applied, but then you use padding to decrypt. To fix that, use the same technique on both sides. That said, this is a solution (rob-napier's answer is more detailed):
try AES(key: key_.bytes, blockMode: ctr, padding: .noPadding)

ZXing truncating negative bytes

In ZXing I'm creating a string from binary data using the "ISO-8859-1" encoding,
but somehow negative bytes in the data get truncated to byte 63 when reading back the produced QR code.
Example: String before QR code (as bytes)
-78, 99, -86, 15, -123, 31, -11, -64, 77, -91, 26, -126, -68, 33
String read from QR code:
63, 99, 63, 15, 63, 31, 63, 63, 77, 63, 26, 63, 63, 33
How do I prevent that without using base64?
For some reason ZXing assembles the QR matrix with the correct data; it's the reading that truncates the bytes (byte 63 is '?', which suggests the reader substitutes bytes it can't decode in the chosen charset). I ended up sidestepping the problem by encoding my binary data to base64 and dealing with the increased message size.
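For reference, the same workaround as a Swift sketch (hypothetical values, the unsigned equivalents of the Java bytes above), since base64 survives any text-only channel such as a QR payload:
import Foundation

let raw: [UInt8] = [178, 99, 170, 15, 133, 31, 245, 192]  // -78, 99, -86, 15, -123, 31, -11, -64 as unsigned
let text = Data(raw).base64EncodedString()                // ASCII-safe payload for the QR code
let back = [UInt8](Data(base64Encoded: text)!)            // decodes back to the original bytes
assert(back == raw)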