Why is encoding giving different results? - encoding

This is my string
<Auth uid="" tid="" ver="" txn="" lk=""><Meta udc="" fdc="" idc="" pip="" lot="P" lov=""/><Skey ci="">03ec063e6a932b12130090c7112a15ba30bb5c0db05b9a0fd22dbd8d33fd61ea1fd1dca544905d5c77c31783376a0b94bf6136ecf9e10dbc5cb3cd09d335c4b4c114b49c5124306a3ea9c0ce124d8da4135de27a99d60085d77a11c2769207bd700fd61d80867e043557713d9dc030fec6949e1049811285e55a561644e12ead734dc61d3caa89b313879da8339efe8163bbcc750cc7901a9fb7353920e544bc851a76e07206a357d34123fb7703440d793063f8bfc54293522266f8e12a059dadfd8822d3585f230265094501d8014ab45b896928ef03062e4745a31d5e2be9b0c1dd61c95fa89dc8e1d66ab34b438a49aaa055055a2ce680719d3e62e3cefa</Skey><Data type="">55c8623b6b26ce63b0cc4fd7333d7ef47cc02e4bb27e4fef879092f487698bcc575b571c5ca834d3b6dd23ab8c73877ecfc243427d34239c9276a0187ab4e1a13559172c57de0a9aa66f8497ecc757449889d9e00cc958963838eedbccf5f6c5d4d426cba12869b2ee530e5b32d08e1123839b348efc51999a384fc46629d2beb75638650ec6a041698ac1411a306362174c8d38687832095b27f36c3f59586fbf73370e79488b03157bd5e12305140c29f1fc4d8500157e93ec71c5b829f7d23f7fcb5079fa9d21c90bc8ac2c043b7c221e45eb2ea1ba14091cdc614a5a1591153072bcbdcfaab8b517f9bfdcf66bfdef1ba1b7455edd37c2b5a08ce299f16219a6398368c1b08aa236c93f1a6949f0e8aada425c7fe3a318ae507e3ec1e6108b66c5dff1872071c15fca0e0ffd3986b2fecd0105b1667526b1760c742a2812b46f0a6775447bddf4387574d16ff85418bce76328496042d6b4d8fde3fac902580a3eb0f4ebd12b0dc653fc29eee45fcb183a1a352a58b464385e5596203c9e1273ab5b1e48a98059614e8924f25aa506aaf3328fba4cf218ecd242fa68976b73ff75c92f1bc96e434463804ad34772184f7e1e1d6d8a7a61f33e354b60c35735488a3ca6dd57ac544d5ed77338bb73178d62bbcde6e6e1997c87b1801823fa904e318f2e33cafa99b9b7d8bbe25d039296d4a22228196a00a89ce27e690ed29c81db23dc683ef6bcd8a1bb77dc8b95725b1a874fc9059aa91d55787357ffea77c2bb71863370e7cdf96efc4f3f975168c1a74861d6f776632e1cc2e0c40a247f734db07847d2eb4d6bcbbe57bb8b7c61f44c1ff1571cb70c9acf650412205325a4a005ad286b705943b4f9cc160e268cf3174cc5031f9e4b47f6df8f52bc6fc42ba1bcef4fb56a12591e7d3eedec0124c2ed300215c263010925a1c56161376019d8de44a062f81f667
818f90b269b61482453ff469529c91c0bb5771ffae848a6dbbe57816a6bde84cdaf576f1279a65b1dc3dc92ed3bdf80ccbde1373f6b098f28c33e905a39138b83713bdccc45b1bf05a76be308e3f16bef17655df0ef45dbf4b661e8ecae5cb819e751d1f9bd80adcda398dd7044c7e283aa88f7c174</Data><Hmac>4c4a58555a484c6a4651486d344a6b586f586131536555573373594a30757768593046345435644c336677576a2b65682b686e416e6d52634878383751767632</Hmac></Auth>
And I am using this code:
public String encode(String orig)
{
    byte[] encoded = Base64.encodeBase64(orig.getBytes());
    System.out.println("Original String: " + orig );
    String s=new String(encoded);
    System.out.println("Base64 Encoded String : " + new String(encoded));
    return s;
}
and output is
PEF1dGggdWlkPSIiIHRpZD0iIiB2ZXI9IiIgdHhuPSIiIGxrPSIiPjxNZXRhIHVkYz0iIiBmZGM9IiIgaWRjPSIiIHBpcD0iIiBsb3Q9IlAiIGxvdj0iIi8+PFNrZXkgY2k9IiI+MDNlYzA2M2U2YTkzMmIxMjEzMDA5MGM3MTEyYTE1YmEzMGJiNWMwZGIwNWI5YTBmZDIyZGJkOGQzM2ZkNjFlYTFmZDFkY2E1NDQ5MDVkNWM3N2MzMTc4MzM3NmEwYjk0YmY2MTM2ZWNmOWUxMGRiYzVjYjNjZDA5ZDMzNWM0YjRjMTE0YjQ5YzUxMjQzMDZhM2VhOWMwY2UxMjRkOGRhNDEzNWRlMjdhOTlkNjAwODVkNzdhMTFjMjc2OTIwN2JkNzAwZmQ2MWQ4MDg2N2UwNDM1NTc3MTNkOWRjMDMwZmVjNjk0OWUxMDQ5ODExMjg1ZTU1YTU2MTY0NGUxMmVhZDczNGRjNjFkM2NhYTg5YjMxMzg3OWRhODMzOWVmZTgxNjNiYmNjNzUwY2M3OTAxYTlmYjczNTM5MjBlNTQ0YmM4NTFhNzZlMDcyMDZhMzU3ZDM0MTIzZmI3NzAzNDQwZDc5MzA2M2Y4YmZjNTQyOTM1MjIyNjZmOGUxMmEwNTlkYWRmZDg4MjJkMzU4NWYyMzAyNjUwOTQ1MDFkODAxNGFiNDViODk2OTI4ZWYwMzA2MmU0NzQ1YTMxZDVlMmJlOWIwYzFkZDYxYzk1ZmE4OWRjOGUxZDY2YWIzNGI0MzhhNDlhYWEwNTUwNTVhMmNlNjgwNzE5ZDNlNjJlM2NlZmE8L1NrZXk+PERhdGEgdHlwZT0iIj41NWM4NjIzYjZiMjZjZTYzYjBjYzRmZDczMzNkN2VmNDdjYzAyZTRiYjI3ZTRmZWY4NzkwOTJmNDg3Njk4YmNjNTc1YjU3MWM1Y2E4MzRkM2I2ZGQyM2FiOGM3Mzg3N2VjZmMyNDM0MjdkMzQyMzljOTI3NmEwMTg3YWI0ZTFhMTM1NTkxNzJjNTdkZTBhOWFhNjZmODQ5N2VjYzc1NzQ0OTg4OWQ5ZTAwY2M5NTg5NjM4MzhlZWRiY2NmNWY2YzVkNGQ0MjZjYmExMjg2OWIyZWU1MzBlNWIzMmQwOGUxMTIzODM5YjM0OGVmYzUxOTk5YTM4NGZjNDY2MjlkMmJlYjc1NjM4NjUwZWM2YTA0MTY5OGFjMTQxMWEzMDYzNjIxNzRjOGQzODY4NzgzMjA5NWIyN2YzNmMzZjU5NTg2ZmJmNzMzNzBlNzk0ODhiMDMxNTdiZDVlMTIzMDUxNDBjMjlmMWZjNGQ4NTAwMTU3ZTkzZWM3MWM1YjgyOWY3ZDIzZjdmY2I1MDc5ZmE5ZDIxYzkwYmM4YWMyYzA0M2I3YzIyMWU0NWViMmVhMWJhMTQwOTFjZGM2MTRhNWExNTkxMTUzMDcyYmNiZGNmYWFiOGI1MTdmOWJmZGNmNjZiZmRlZjFiYTFiNzQ1NWVkZDM3YzJiNWEwOGNlMjk5ZjE2MjE5YTYzOTgzNjhjMWIwOGFhMjM2YzkzZjFhNjk0OWYwZThhYWRhNDI1YzdmZTNhMzE4YWU1MDdlM2VjMWU2MTA4YjY2YzVkZmYxODcyMDcxYzE1ZmNhMGUwZmZkMzk4NmIyZmVjZDAxMDViMTY2NzUyNmIxNzYwYzc0MmEyODEyYjQ2ZjBhNjc3NTQ0N2JkZGY0Mzg3NTc0ZDE2ZmY4NTQxOGJjZTc2MzI4NDk2MDQyZDZiNGQ4ZmRlM2ZhYzkwMjU4MGEzZWIwZjRlYmQxMmIwZGM2NTNmYzI5ZWVlNDVmY2IxODNhMWEzNTJhNThiNDY0Mzg1ZTU1OTYyMDNjOWUxMjczYWI1YjFlNDhhOTgwNTk2MTRlODkyNGYyNWFhNTA2YWFmMzMyOGZiYTRjZjIxOGVjZDI0MmZhNjg5NzZiNzNmZjc1YzkyZjFiYzk2ZTQzNDQ2MzgwNGFkMzQ3NzIx
ODRmN2UxZTFkNmQ4YTdhNjFmMzNlMzU0YjYwYzM1NzM1NDg4YTNjYTZkZDU3YWM1NDRkNWVkNzczMzhiYjczMTc4ZDYyYmJjZGU2ZTZlMTk5N2M4N2IxODAxODIzZmE5MDRlMzE4ZjJlMzNjYWZhOTliOWI3ZDhiYmUyNWQwMzkyOTZkNGEyMjIyODE5NmEwMGE4OWNlMjdlNjkwZWQyOWM4MWRiMjNkYzY4M2VmNmJjZDhhMWJiNzdkYzhiOTU3MjViMWE4NzRmYzkwNTlhYTkxZDU1Nzg3MzU3ZmZlYTc3YzJiYjcxODYzMzcwZTdjZGY5NmVmYzRmM2Y5NzUxNjhjMWE3NDg2MWQ2Zjc3NjYzMmUxY2MyZTBjNDBhMjQ3ZjczNGRiMDc4NDdkMmViNGQ2YmNiYmU1N2JiOGI3YzYxZjQ0YzFmZjE1NzFjYjcwYzlhY2Y2NTA0MTIyMDUzMjVhNGEwMDVhZDI4NmI3MDU5NDNiNGY5Y2MxNjBlMjY4Y2YzMTc0Y2M1MDMxZjllNGI0N2Y2ZGY4ZjUyYmM2ZmM0MmJhMWJjZWY0ZmI1NmExMjU5MWU3ZDNlZWRlYzAxMjRjMmVkMzAwMjE1YzI2MzAxMDkyNWExYzU2MTYxMzc2MDE5ZDhkZTQ0YTA2MmY4MWY2Njc4MThmOTBiMjY5YjYxNDgyNDUzZmY0Njk1MjljOTFjMGJiNTc3MWZmYWU4NDhhNmRiYmU1NzgxNmE2YmRlODRjZGFmNTc2ZjEyNzlhNjViMWRjM2RjOTJlZDNiZGY4MGNjYmRlMTM3M2Y2YjA5OGYyOGMzM2U5MDVhMzkxMzhiODM3MTNiZGNjYzQ1YjFiZjA1YTc2YmUzMDhlM2YxNmJlZjE3NjU1ZGYwZWY0NWRiZjRiNjYxZThlY2FlNWNiODE5ZTc1MWQxZjliZDgwYWRjZGEzOThkZDcwNDRjN2UyODNhYTg4ZjdjMTc0PC9EYXRhPjxIbWFjPjRjNGE1ODU1NWE0ODRjNmE0NjUxNDg2ZDM0NGE2YjU4NmY1ODYxMzE1MzY1NTU1NzMzNzM1OTRhMzA3NTc3Njg1OTMwNDYzNDU0MzU2NDRjMzM2Njc3NTc2YTJiNjU2ODJiNjg2ZTQxNmU2ZDUyNjM0ODc4MzgzNzUxNzY3NjMyPC9IbWFjPjwvQXV0aD4=
but when I use an online Base64 encoder it gives this output:
PEF1dGggdWlkPSIiIHRpZD0iIiB2ZXI9IiIgdHhuPSIiIGxrPSIiPjxNZXRhIHVkYz0iIiBmZGM9IiIgaWRjPSIiIHBpcD0iIiBsb3Q9IlAiIGxvdj0iIi8+PFNrZXkgY2k9IiI+MDNlYzA2M2U2YTkzMmIxMjEzMDA5MGM3MTEyYTE1YmEzMGJiNWMwZGIwNWI5YTBmZDIyZGJkOGQzM2ZkNjFlYTFmZDFkY2E1NDQ5MDVkNWM3N2MzMTc4MzM3NmEwYjk0YmY2MTM2ZWNmOWUxMGRiYzVjYjNjZDA5ZDMzNWM0YjRjMTE0YjQ5YzUxMjQzMDZhM2VhOWMwY2UxMjRkOGRhNDEzNWRlMjdhOTlkNjAwODVkNzdhMTFjMjc2OTIwN2JkNzAwZmQ2MWQ4MDg2N2UwNDM1NTc3MTNkOWRjMDMwZmVjNjk0OWUxMDQ5ODExMjg1ZTU1YTU2MTY0NGUxMmVhZDczNGRjNjFkM2NhYTg5YjMxMzg3OWRhODMzOWVmZTgxNjNiYmNjNzUwY2M3OTAxYTlmYjczNTM5MjBlNTQ0YmM4NTFhNzZlMDcyMDZhMzU3ZDM0MTIzZmI3NzAzNDQwZDc5MzA2M2Y4YmZjNTQyOTM1MjIyNjZmOGUxMmEwNTlkYWRmZDg4MjJkMzU4NWYyMzAyNjUwOTQ1MDFkODAxNGFiNDViODk2OTI4ZWYwMzA2MmU0NzQ1YTMxZDVlMmJlOWIwYzFkZDYxYzk1ZmE4OWRjOGUxZDY2YWIzNGI0MzhhNDlhYWEwNTUwNTVhMmNlNjgwNzE5ZDNlNjJlM2NlZmE8L1NrZXk+PERhdGEgdHlwZT0iIj41NWM4NjIzYjZiMjZjZTYzYjBjYzRmZDczMzNkN2VmNDdjYzAyZTRiYjI3ZTRmZWY4NzkwOTJmNDg3Njk4YmNjNTc1YjU3MWM1Y2E4MzRkM2I2ZGQyM2FiOGM3Mzg3N2VjZmMyNDM0MjdkMzQyMzljOTI3NmEwMTg3YWI0ZTFhMTM1NTkxNzJjNTdkZTBhOWFhNjZmODQ5N2VjYzc1NzQ0OTg4OWQ5ZTAwY2M5NTg5NjM4MzhlZWRiY2NmNWY2YzVkNGQ0MjZjYmExMjg2OWIyZWU1MzBlNWIzMmQwOGUxMTIzODM5YjM0OGVmYzUxOTk5YTM4NGZjNDY2MjlkMmJlYjc1NjM4NjUwZWM2YTA0MTY5OGFjMTQxMWEzMDYzNjIxNzRjOGQzODY4NzgzMjA5NWIyN2YzNmMzZjU5NTg2ZmJmNzMzNzBlNzk0ODhiMDMxNTdiZDVlMTIzMDUxNDBjMjlmMWZjNGQ4NTAwMTU3ZTkzZWM3MWM1YjgyOWY3ZDIzZjdmY2I1MDc5ZmE5ZDIxYzkwYmM4YWMyYzA0M2I3YzIyMWU0NWViMmVhMWJhMTQwOTFjZGM2MTRhNWExNTkxMTUzMDcyYmNiZGNmYWFiOGI1MTdmOWJmZGNmNjZiZmRlZjFiYTFiNzQ1NWVkZDM3YzJiNWEwOGNlMjk5ZjE2MjE5YTYzOTgzNjhjMWIwOGFhMjM2YzkzZjFhNjk0OWYwZThhYWRhNDI1YzdmZTNhMzE4YWU1MDdlM2VjMWU2MTA4YjY2YzVkZmYxODcyMDcxYzE1ZmNhMGUwZmZkMzk4NmIyZmVjZDAxMDViMTY2NzUyNmIxNzYwYzc0MmEyODEyYjQ2ZjBhNjc3NTQ0N2JkZGY0Mzg3NTc0ZDE2ZmY4NTQxOGJjZTc2MzI4NDk2MDQyZDZiNGQ4ZmRlM2ZhYzkwMjU4MGEzZWIwZjRlYmQxMmIwZGM2NTNmYzI5ZWVlNDVmY2IxODNhMWEzNTJhNThiNDY0Mzg1ZTU1OTYyMDNjOWUxMjczYWI1YjFlNDhhOTgwNTk2MTRlODkyNGYyNWFhNTA2YWFmMzMyOGZiYTRjZjIxOGVjZDI0MmZhNjg5NzZiNzNmZjc1YzkyZjFiYzk2ZTQzNDQ2MzgwNGFkMzQ3NzIx
ODRmN2UxZTFkNmQ4YTdhNjFmMzNlMzU0YjYwYzM1NzM1NDg4YTNjYTZkZDU3YWM1NDRkNWVkNzczMzhiYjczMTc4ZDYyYmJjZGU2ZTZlMTk5N2M4N2IxODAxODIzZmE5MDRlMzE4ZjJlMzNjYWZhOTliOWI3ZDhiYmUyNWQwMzkyOTZkNGEyMjIyODE5NmEwMGE4OWNlMjdlNjkwZWQyOWM4MWRiMjNkYzY4M2VmNmJjZDhhMWJiNzdkYzhiOTU3MjViMWE4NzRmYzkwNTlhYTkxZDU1Nzg3MzU3ZmZlYTc3YzJiYjcxODYzMzcwZTdjZGY5NmVmYzRmM2Y5NzUxNjhjMWE3NDg2MWQ2Zjc3NjYzMmUxY2MyZTBjNDBhMjQ3ZjczNGRiMDc4NDdkMmViNGQ2YmNiYmU1N2JiOGI3YzYxZjQ0YzFmZjE1NzFjYjcwYzlhY2Y2NTA0MTIyMDUzMjVhNGEwMDVhZDI4NmI3MDU5NDNiNGY5Y2MxNjBlMjY4Y2YzMTc0Y2M1MDMxZjllNGI0N2Y2ZGY4ZjUyYmM2ZmM0MmJhMWJjZWY0ZmI1NmExMjU5MWU3ZDNlZWRlYzAxMjRjMmVkMzAwMjE1YzI2MzAxMDkyNWExYzU2MTYxMzc2MDE5ZDhkZTQ0YTA2MmY4MWY2Njc4MThmOTBiMjY5YjYxNDgyNDUzZmY0Njk1MjljOTFjMGJiNTc3MWZmYWU4NDhhNmRiYmU1NzgxNmE2YmRlODRjZGFmNTc2ZjEyNzlhNjViMWRjM2RjOTJlZDNiZGY4MGNjYmRlMTM3M2Y2YjA5OGYyOGMzM2U5MDVhMzkxMzhiODM3MTNiZGNjYzQ1YjFiZjA1YTc2YmUzMDhlM2YxNmJlZjE3NjU1ZGYwZWY0NWRiZjRiNjYxZThlY2FlNWNiODE5ZTc1MWQxZjliZDgwYWRjZGEzOThkZDcwNDRjN2UyODNhYTg4ZjdjMTc0PC9EYXRhPjxIbWFjPjRjNGE1ODU1NWE0ODRjNmE0NjUxNDg2ZDM0NGE2YjU4NmY1ODYxMzE1MzY1NTU1NzMzNzM1OTRhMzA3NTc3Njg1OTMwNDYzNDU0MzU2NDRjMzM2Njc3NTc2YTJiNjU2ODJiNjg2ZTQxNmU2ZDUyNjM0ODc4MzgzNzUxNzY3NjMyPC9IbWFjPjwvQXV0aD4NCg==
The last characters are different.
Why is that?

It's because the input for the second string has an extra line break ("\r\n", bytes 0D 0A) at the end. You can see this easily by running this command:
echo "your base 64 encoded string"|base64 -d|hexdump -C
in a Linux shell.
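The same effect can be reproduced in Java. The following is only a minimal sketch: it uses the JDK's java.util.Base64 instead of the commons-codec class from the question, and the class and method names are made up for illustration. It encodes a short string with and without a trailing "\r\n" and dumps the decoded bytes as hex, much like the hexdump command above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; not part of the original question's code.
final class TrailingNewlineDemo {

    // Base64-encode the UTF-8 bytes of the given text.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decode Base64 and render every byte as two hex digits, separated by spaces.
    static String decodeToHex(String base64) {
        StringBuilder hex = new StringBuilder();
        for (byte b : Base64.getDecoder().decode(base64)) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(encode("</Auth>"));           // PC9BdXRoPg==
        System.out.println(encode("</Auth>\r\n"));       // PC9BdXRoPg0K
        System.out.println(decodeToHex("PC9BdXRoPg0K")); // ends with 0d 0a
    }
}
```

Only the tail of the encoded text changes, which matches what the question shows: everything before the last Base64 group is identical.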

Related

How to achieve Base64 URL safe encoding for binary hash string in Powershell?

I am getting the "input" from the server in Base64-encoded form, as shown in the picture.
I need to decode it into its original binary form, i.e. original = base64_decode(input).
Concatenate it with the password, i.e. to_be_hash = password (known value) + input.
Hash the concatenated string using SHA-256, i.e. binary_hash_str = sha256(to_be_hash).
Base64 URL-safe encode the binary hash above to make it suitable for HTTP requests:
final_hash = base64_url_safe_encode(binary_hash_str)
I am using PowerShell for this. Can someone guide me on how to proceed, please?
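If I understand correctly, you would like to send a Base64 string as an argument in a URL? You can escape the characters that are not acceptable in a URL using [uri]::EscapeDataString(). Note that this percent-encodes the standard Base64 output; it does not switch to the base64url alphabet (- and _ in place of + and /).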
$text = "some text!!"
$bytes = [System.Text.Encoding]::Unicode.GetBytes($text)
$base64 = [System.Convert]::ToBase64String($bytes)
$urlSafeString = [uri]::EscapeDataString($base64)
"Base64 : " + $base64
"urlSafeString : " + $urlSafeString
Base64 : cwBvAG0AZQAgAHQAZQB4AHQAIQAhAA==
urlSafeString : cwBvAG0AZQAgAHQAZQB4AHQAIQAhAA%3D%3D
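For comparison, here is how the four steps from the question could look in Java. This is a sketch under assumptions the question does not settle: the password is converted to bytes as UTF-8, the concatenation is done on raw bytes (password bytes followed by the decoded input), and the base64url alphabet with padding is used. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper; names are illustrative only.
final class UrlSafeHash {

    static String finalHash(String password, String base64Input) {
        // Step 1: decode the server's Base64 input back to raw bytes.
        byte[] original = Base64.getDecoder().decode(base64Input);

        // Step 2: concatenate password bytes (UTF-8 is an assumption) and the decoded input.
        byte[] passwordBytes = password.getBytes(StandardCharsets.UTF_8);
        byte[] toBeHashed = new byte[passwordBytes.length + original.length];
        System.arraycopy(passwordBytes, 0, toBeHashed, 0, passwordBytes.length);
        System.arraycopy(original, 0, toBeHashed, passwordBytes.length, original.length);

        // Step 3: SHA-256 over the concatenated bytes.
        byte[] binaryHash;
        try {
            binaryHash = MessageDigest.getInstance("SHA-256").digest(toBeHashed);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }

        // Step 4: URL-safe Base64 ('-' and '_' instead of '+' and '/').
        return Base64.getUrlEncoder().encodeToString(binaryHash);
    }

    public static void main(String[] args) {
        System.out.println(finalHash("secret", "aGVsbG8="));
    }
}
```

If the receiving server expects the value without trailing '=' padding, Base64.getUrlEncoder().withoutPadding() can be used instead; which variant is required depends on the server.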

Types of VS FoxPro encodings

I'm trying to decode some strings in a DBF (created by a FoxPro app), and I'm interested in the encoding / encryption methods of FoxPro.
Here's a sample encoded string: "òÙÛÚÓ ½kê3ù[ƒ˜øžÃ+™Þoa-Kh— Gó¯ý""|øHñyäEük#‰fç9æ×ϯyi±:"
Can somebody tell me the encoding method of this string, or give me any suggestions about FoxPro encoding methods?
Thanks!
It depends on the FoxPro version. The most recent DBF structure (VFP 9) is documented here:
https://msdn.microsoft.com/en-us/library/aa975386%28v=vs.71%29.aspx
It looks like your text could be the result of "_Crypt.vcx", which takes a given string, applies a passphrase, and generates an encrypted output string.
VFP ships this class library in the "FFC" folder under the default VFP installation directory (the path returned by HOME()), such as
C:\PROGRAM FILES (X86)\MICROSOFT VISUAL FOXPRO 9\
Here is a SAMPLE set of code to hook up the _Crypt class, encrypt a string, and then decrypt the encrypted string. Your string appears to be encrypted, but unless you know more about the encryption (such as the passphrase / key), you might be a bit stuck and need to do more research.
lcCryptLib = HOME() + "FFC\_Crypt.vcx"
IF NOT FILE( lcCryptLib )
   MESSAGEBOX( "No crypt class library." )
   RETURN
ENDIF
SET CLASSLIB TO ( lcCryptLib ) ADDITIVE
oCrypt = CREATEOBJECT( "_CryptAPI" )
oCrypt.AddProperty( "myPassKey" )
oCrypt.myPassKey = "Hold property to represent some special 'Key/pass phrase' "
*/ Place-holder to get encrypted value (passed by reference with @)
lcEncryptedValue = ""
? oCrypt.EncryptSessionStreamString( "Original String", oCrypt.myPassKey, @lcEncryptedValue )
*/ Show results of encrypted value
? "Encrypted Value: " + lcEncryptedValue
*/ Now, to get the decrypted from the encrypted...
lcDecryptedValue = ""
? oCrypt.DecryptSessionStreamString( lcEncryptedValue, oCrypt.myPassKey, @lcDecryptedValue )
? "Decrypted Value: " + lcDecryptedValue
*/ Now, try with your string to decrypt
lcYourString = [òÙÛÚÓ ½kê3ù[ƒ˜øžÃ+™Þoa-Kh— Gó¯ý""|øHñyäEük#‰fç9æ×ϯyi±:]
lcDecryptedValue = ""
? oCrypt.DecryptSessionStreamString( lcYourString, oCrypt.myPassKey, @lcDecryptedValue )
? "Decrypted Value: " + lcDecryptedValue

ANTLR4: Using non-ASCII characters in token rules

On page 74 of the ANTLR4 book it says that any Unicode character can be used in a grammar simply by specifying its codepoint in this manner:
'\uxxxx'
where xxxx is the hexadecimal value for the Unicode codepoint.
So I used that technique in a token rule for an ID token:
grammar ID;
id : ID EOF ;
ID : ('a' .. 'z' | 'A' .. 'Z' | '\u0100' .. '\u017E')+ ;
WS : [ \t\r\n]+ -> skip ;
When I tried to parse this input:
Gŭnter
ANTLR throws an error, saying that it does not recognize ŭ. (The ŭ character is hex 016D, so it is within the range specified)
What am I doing wrong please?
ANTLR is ready to accept 16-bit characters but, by default, many locales will read in characters as bytes (8 bits). You need to specify the appropriate encoding when you read from the file using the Java libraries. If you are using the TestRig, perhaps through the alias/script grun, then use the argument -encoding utf-8 (or whichever encoding the file actually uses). If you look at the source code of that class, you will see the following mechanism:
InputStream is = new FileInputStream(inputFile);
Reader r = new InputStreamReader(is, encoding); // e.g., euc-jp or utf-8
ANTLRInputStream input = new ANTLRInputStream(r);
XLexer lexer = new XLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
...
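The effect of the encoding can be checked without ANTLR at all. The sketch below uses only the JDK; the class name and the matchesIdRule helper are hypothetical and merely mimic the character ranges of the ID rule from the question. It decodes the UTF-8 bytes of the sample input once as UTF-8 and once as ISO-8859-1 (standing in for a single-byte default encoding) and checks whether every resulting character falls inside the rule's ranges.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo; it imitates the ranges of the ID rule, it is not ANTLR code.
final class DecodingDemo {

    // True if the text is non-empty and every char is in a-z, A-Z or U+0100..U+017E.
    static boolean matchesIdRule(String text) {
        if (text.isEmpty()) {
            return false;
        }
        for (char c : text.toCharArray()) {
            boolean inRange = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= 0x0100 && c <= 0x017E);
            if (!inRange) {
                return false;
            }
        }
        return true;
    }

    // Decode raw bytes with an explicit charset, as InputStreamReader does.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The sample input from the question; U+016D is the u-breve character.
        byte[] fileBytes = "G\u016Dnter".getBytes(StandardCharsets.UTF_8);

        // Decoded correctly: 6 chars, all inside the rule's ranges.
        System.out.println(matchesIdRule(decode(fileBytes, StandardCharsets.UTF_8)));

        // Decoded as a single-byte charset: the two UTF-8 bytes become two unrelated chars.
        System.out.println(matchesIdRule(decode(fileBytes, StandardCharsets.ISO_8859_1)));
    }
}
```

When the bytes are decoded with the wrong charset, the lexer never sees U+016D, so the range in the grammar cannot match it.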
Grammar:
NAME:
   [A-Za-z][0-9A-Za-z\u0080-\uFFFF_]+
;
Java:
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.TokenStream;
import com.thalesgroup.dms.stimulus.StimulusParser.SystemContext;
final class RequirementParser {
   static SystemContext parse( String requirement ) {
      requirement = requirement.replaceAll( "\t", " " );
      final CharStream charStream = CharStreams.fromString( requirement );
      final StimulusLexer lexer = new StimulusLexer( charStream );
      final TokenStream tokens = new CommonTokenStream( lexer );
      final StimulusParser parser = new StimulusParser( tokens );
      final SystemContext system = parser.system();
      if( parser.getNumberOfSyntaxErrors() > 0 ) {
         Debug.format( requirement );
      }
      return system;
   }
   private RequirementParser() {/**/}
}
Source:
Lexers and Unicode text
For those having the same problem using ANTLR4 in Java code: since ANTLRInputStream is deprecated, here is a working way to pass multi-char Unicode data from a String to the MyLexer lexer:
String myString = "\u2013";
CharBuffer charBuffer = CharBuffer.wrap(myString.toCharArray());
CodePointBuffer codePointBuffer = CodePointBuffer.withChars(charBuffer);
CodePointCharStream cpcs = CodePointCharStream.fromBuffer(codePointBuffer);
MyLexer lexer = new MyLexer(cpcs);
CommonTokenStream tokens = new CommonTokenStream(lexer);
You can specify the encoding of the file when actually reading the file.
For Kotlin/Java that could look like this, no need to specify the encoding in the grammar!
val inputStream: CharStream = CharStreams.fromFileName(fileName, Charset.forName("UTF-16LE"))
val lexer = BlastFeatureGrammarLexer(inputStream)
Supported Charsets by Java/Kotlin

Convert a string to a byte array in PowerShell version 2

What I'm trying to do is use SHA1 UTF-8 encryption and then Base64 encoding on a password string value. However, I needed to do the encryption first, then the encoding, but I did it the other way around.
Here is the code:
# Create Input Data
$enc = [system.Text.Encoding]::UTF8
$string1 = "This is a string to hash"
$data1 = $enc.GetBytes($string1)
# Create a New SHA1 Crypto Provider
$sha = New-Object System.Security.Cryptography.SHA1CryptoServiceProvider
# Now hash and display results
$result1 = $sha.ComputeHash($data1)
So, when I went to do the hashing I realized I had to have a byte[] from the string and I'm not sure how to do that. I'm thinking there is a simple way from the .Net libraries, but couldn't find an example.
So if I have a string, like:
$string = "password"
How do I convert that into a byte array that I can pass to ComputeHash()?
So what I have to end up with is an encrypted SHA-1 and base 64 encoded UTF-8 password, which the code above does, but it's coming back different than when I coded this same thing in java, where I encrypted it first, then converted that result to base 64 encoding.
I'm making the assumption that while encrypting a string directly isn't supported in the api, there may be a work-around that will allow you to do this. That is what I'm attempting to do.
So I'm assuming my issue with the code is that I had to encrypt it first and then encode it to get the correct value. Correct or am I missing something here?
Here is the pertinent java code that does work:
//First method call uses a swing component to get the user entered password.
String password = getPassword();
//This line is where the work starts with the second and third methods below.
String hashed = byteToBase64(getHash(password));
//The second method call here gets the encryption.
public static byte[] getHash(String password) {
    MessageDigest digest = null;
    byte[] input = null;
    try {
        digest = MessageDigest.getInstance("SHA-1");
    } catch (NoSuchAlgorithmException e1) {
        e1.printStackTrace();
    }
    digest.reset();
    try {
        input = digest.digest(password.getBytes("UTF-8"));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    return input;
}
//Then the third method call here gets the encoding, FROM THE ENCRYPTED STRING.
public static String byteToBase64(byte[] data){
    return new String(Base64.encodeBase64(data));
}
When I run the java code with the password string of "password" I get
[91, -86, 97, -28, -55, -71, 63, 63, 6, -126, 37, 11, 108, -8, 51, 27, 126, -26, -113, -40]
which is the encryption.
Then when I do the encoding in Java I get this:
W6ph5Mm5Pz8GgiULbPgzG37mj9g=
but when I run it in PowerShell I get this because it's encoded first for UTF8:
91 170 97 228 201 185 63 63 6 130 37 11 108 248 51 27 126 230 143 216
Then when I run this line of code to convert it I get an error:
$base64 = [System.Convert]::FromBase64String($result)
Exception calling "FromBase64String" with "1" argument(s): "Invalid length for a Base-64 char array."
At line:1 char:45
However, if I run the new line of code to make it hex from below I get:
$hexResult = [String]::Join("", ($result | % { "{0:X2}" -f $_}))
PS C:\Program Files (x86)\PowerGUI> Write-Host $hexResult
5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8
but I need to end up with this value:
W6ph5Mm5Pz8GgiULbPgzG37mj9g=
Again, this may not even be possible to do, but I'm trying to find a work-around to see.
You most likely just need to convert your hash to base64 after the last line.
$enc = [system.Text.Encoding]::UTF8
$string1 = "This is a string to hash"
$data1 = $enc.GetBytes($string1)
# Create a New SHA1 Crypto Provider
$sha = New-Object System.Security.Cryptography.SHA1CryptoServiceProvider
# Now hash and display results
$result1 = $sha.ComputeHash($data1)
[System.Convert]::ToBase64String($result1)
Text->Bytes->Encrypt/Hash->Base64
That's a very common pattern for sending cryptographic data in a text format.
It looks like you're on the right track. You have to pick a character encoding to convert between a string and a byte array. You picked UTF-8 above, but there are other options (e.g. ASCII, UTF-16, etc.).
Encrypting a string directly is not supported.
The problem seems to be that in the first byte array you have signed bytes (-86 = 10101010) and in the second one unsigned bytes (170 = 10101010). The underlying bits are the same; only the way they are printed differs.
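Both points (the Text->Bytes->Hash->Base64 order and the signed/unsigned display) can be checked with a few lines of Java. This is a minimal sketch using only the JDK (java.util.Base64 rather than the commons-codec class from the question); the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical demo class; names are illustrative only.
final class Sha1Base64Demo {

    // Text -> UTF-8 bytes -> SHA-1 digest (20 bytes).
    static byte[] sha1(String text) {
        try {
            return MessageDigest.getInstance("SHA-1")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Digest bytes -> Base64 text.
    static String hashToBase64(String text) {
        return Base64.getEncoder().encodeToString(sha1(text));
    }

    // Interpret a signed Java byte (-128..127) as an unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // Prints the value quoted in the question: W6ph5Mm5Pz8GgiULbPgzG37mj9g=
        System.out.println(hashToBase64("password"));

        byte second = sha1("password")[1];
        System.out.println(second);             // -86, as Java prints it
        System.out.println(toUnsigned(second)); // 170, as PowerShell prints it
    }
}
```

The PowerShell equivalent of the last step is the [System.Convert]::ToBase64String call shown above, applied to the hash bytes rather than to their hex or decimal text form.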

WP7's WebBrowser.NavigateToString() and text encoding

Does anyone know how to load a UTF8-encoded string using the WebBrowser.NavigateToString() method? For now I end up with a bunch of mis-displayed characters.
Here's the simple string that won't display correctly:
webBrowser.NavigateToString("ąęłóńżźćś");
The code file is saved with UTF-8 encoding (with signature).
Thanks.
Using ConvertExtendedASCII as suggested works, but is very slow. Using a StringBuilder instead was (in my case) about 800 times faster:
public string FixHtml(string HTML)
{
    StringBuilder sb = new StringBuilder();
    char[] s = HTML.ToCharArray();
    foreach (char c in s)
    {
        if (Convert.ToInt32(c) > 127)
            sb.Append("&#" + Convert.ToInt32(c) + ";");
        else
            sb.Append(c);
    }
    return sb.ToString();
}
First up, NavigateToString() is expecting a full html document.
Secondly, as you're passing HTML, it's best to pass HTML entities, rather than relying on encodings. Unfortunately, not that many entity codes are actually supported by the browser so you should look at using the numeric unicode values where necessary.
Much like this:
webBrowser1.NavigateToString("<html><body><p>&#243; &#213;</p></body></html>");
Try this article. It should help. In short, it proposes using the following snippet to convert your string into the appropriate format:
private static string ConvertExtendedASCII(string HTML)
{
    string retVal = "";
    char[] s = HTML.ToCharArray();
    foreach (char c in s)
    {
        if (Convert.ToInt32(c) > 127)
            retVal += "&#" + Convert.ToInt32(c) + ";";
        else
            retVal += c;
    }
    return retVal;
}
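The same idea, replacing every character above 127 with a numeric character reference, can be sketched in Java as follows. The class and method names are hypothetical. One deliberate difference from the C# snippets: this version iterates over Unicode code points rather than UTF-16 chars, so a character outside the Basic Multilingual Plane becomes one entity instead of two surrogate entities.

```java
// Hypothetical helper; a Java sketch of the numeric-entity conversion above.
final class NumericEntities {

    // Replace every code point above 127 with a decimal numeric character reference.
    static String toNumericEntities(String html) {
        StringBuilder sb = new StringBuilder(html.length());
        html.codePoints().forEach(cp -> {
            if (cp > 127) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.append((char) cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00F3 and U+00D5 become &#243; and &#213;
        System.out.println(toNumericEntities("<p>\u00F3 \u00D5</p>"));
    }
}
```

Like the StringBuilder variant in C#, this appends to a single buffer, so the cost grows linearly with the input length.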
If you have the UTF8 in memory in a byte array then you could try NavigateToStream with a MemoryStream rather than using NavigateToString. You should try to ensure there is a BOM on the UTF8 buffer if you can.
Note that the string in the question is not a UTF8 string. It is a UTF16 string with some garbage in it. By placing zeros between the bytes and storing it in a System.String you corrupted it.