Convert string to utf-8 unicode in ColdFusion

I need to convert a string to a UTF8 encoded format and I'm not sure how to proceed.
Is there any function within ColdFusion to convert a string into UTF-8, such as on this website?
For example, typing in "stackoverflow.com/questions/ask" into the above website gives the result:
\x73\x74\x61\x63\x6B\x6F\x76\x65\x72\x66\x6C\x6F\x77\x2E\x63\x6F\x6D\x2F\x71\x75\x65\x73\x74\x69\x6F\x6E\x73\x2F\x61\x73\x6B
I am not very familiar with encoding; however, my instructions were to encode a string to UTF-8. The example I was given showed an encoded result like the one below.
/re/r/434/t//4r3/t434/4t/t3/3/4t/43tt/53/
I am not sure if this is a real representation of an encoded string or if it was just typed out to give a visual example. Is there a format that looks like that? And is it different than the format from the first example?
Thank you!

I think you can use a combination of CharsetDecode() and CharsetEncode() to accomplish this.
<cfset my_string = "test">
<cfset binary_my_string = CharsetDecode(my_string, "ASCII")>
<cfset utf8_my_string = CharsetEncode(binary_my_string, "utf-8")>
You'd just need to substitute the correct initial encoding for "ASCII" in my example.

<cfset str = "stackoverflow.com/questions/ask">
<cfset hexStr = "">
<cfloop index="i" from="0" to="#len(str)-1#">
<!--- Pick out each character in the string. Remember that charAt() starts at index 0. --->
<cfset ch = str.charAt(i)>
<!--- The decimal value of the Unicode character. ColdFusion uses the Java UCS-2 representation of Unicode characters, so values run from 0 up to 65535. --->
<cfset charDecVal = asc(ch)>
<!--- The hexadecimal value of the character, with the letters upper-cased. --->
<cfset charHexVal = uCase(formatBaseN(charDecVal,"16"))>
<!--- Append the characters together into a Hex string, using delimiter '\x' --->
<cfset hexStr = hexStr & "\x" & charHexVal>
</cfloop>
<cfscript>
writeoutput(hexStr);
</cfscript>
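Note that the loop above formats each character's code value, which matches the UTF-8 bytes only for plain ASCII input. As a rough cross-check outside ColdFusion, here is a minimal Python sketch that encodes to UTF-8 first and then formats every byte as two hex digits. The function name to_hex_escapes is made up for this illustration.

```python
def to_hex_escapes(text):
    """Return the UTF-8 bytes of text as a string of \\xNN escapes."""
    # Encode first, so non-ASCII characters become their multi-byte UTF-8 form.
    utf8_bytes = text.encode("utf-8")
    # {:02X} pads to two upper-case hex digits, so byte 0x0A prints as 0A.
    return "".join("\\x{:02X}".format(b) for b in utf8_bytes)


print(to_hex_escapes("stackoverflow.com/questions/ask"))
# U+2014 takes three bytes in UTF-8.
print(to_hex_escapes("\u2014"))  # \xE2\x80\x94
```

For the ASCII example string this should reproduce the \x73\x74... output quoted in the question.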

ColdFusion Hash

I'm trying to create a password digest with this formula to get the following variables and my code is just not matching. Not sure what I'm doing wrong, but I'll admit when I need help. Hopefully someone is out there who can help.
Formula from documentation: Base64(SHA1(NONCE + TIMESTAMP + SHA1(PASSWORD)))
Correct Password Digest Answer: +LzcaRc+ndGAcZIXmq/N7xGes+k=
ColdFusion Code:
<cfSet PW = "AMADEUS">
<cfSet TS = "2015-09-30T14:12:15Z">
<cfSet NONCE = "secretnonce10111">
<cfDump var="#ToBase64(Hash(NONCE & TS & Hash(PW,'SHA-1'),'SHA-1'))#">
My code outputs:
Njk0MEY3MDc0NUYyOEE1MDMwRURGRkNGNTVGOTcyMUI4OUMxM0U0Qg==
I'm clearly doing something wrong, but for the life of me cannot figure out what. Anyone? Bueller?
The fun thing about hashing is that even if you start with the right string(s), the result can still be completely wrong, if those strings are combined/encoded/decoded incorrectly.
The biggest gotcha is that most of these functions actually work with the binary representation of the input strings. So how those strings are decoded makes a big difference. Notice the same string produces totally different binary when decoded as UTF-8 versus Hex? That means the results of Hash, ToBase64, etcetera will be totally different as well.
// Result: UTF-8: 65-65-68-69
writeOutput("<br>UTF-8: "& arrayToList(charsetDecode("AADE", "UTF-8"), "-"));
// Result: HEX: -86--34
writeOutput("<br>HEX: "& arrayToList(binaryDecode("AADE", "HEX"), "-"));
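The same comparison can be reproduced outside ColdFusion. This is a small Python sketch of the idea, not the CF implementation; the only difference is that Python shows bytes as unsigned values, so the last lines convert them to the signed form Java (and therefore ColdFusion) displays.

```python
text = "AADE"

# Treat the text as characters: one byte per ASCII letter.
as_utf8 = list(text.encode("utf-8"))
print(as_utf8)  # [65, 65, 68, 69]

# Treat the text as hex digits: "AA" and "DE" are two bytes.
as_hex = list(bytes.fromhex(text))
print(as_hex)  # [170, 222]

# Java bytes are signed, so values above 127 wrap around to negatives.
signed = [b - 256 if b > 127 else b for b in as_hex]
print(signed)  # [-86, -34]
```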
Possible Solution:
The problem with the current code is that ToBase64() treats its input as ordinary text and encodes the bytes of that text, whereas Hash() returns a hexadecimal string. So ToBase64() ends up encoding the hex characters themselves rather than the hash bytes they represent. Instead, use binaryDecode and binaryEncode to convert the hash from hex to base64:
resultAsHex = Hash( NONCE & TS & Hash(PW,"SHA-1"), "SHA-1");
resultAsBase64 = binaryEncode(binaryDecode(resultAsHex, "HEX"), "base64");
writeDump(resultAsBase64);
More Robust Solution:
Having said that, be very careful with string concatenation and hashing, as it does not always yield the expected results. Without knowing more about this specific API, I cannot be completely certain what it expects. However, it is usually safer to work only with the binary values. Unfortunately, CF's ArrayAppend() function lacks support for binary arrays, but you can easily use Apache's ArrayUtils class, which is bundled with CF.
ArrayUtils = createObject("java", "org.apache.commons.lang.ArrayUtils");
// Combine binary of NONCE + TS
nonceBytes = charsetDecode(NONCE, "UTF-8");
timeBytes = charsetDecode(TS, "UTF-8");
combinedBytes = ArrayUtils.addAll(nonceBytes, timeBytes);
// Combine with binary of SECRET
secretBytes = binaryDecode( Hash(PW,"SHA-1"), "HEX");
combinedBytes = ArrayUtils.addAll(combinedBytes, secretBytes);
// Finally, HASH the binary and convert to base64
resultAsHex = hash(combinedBytes, "SHA-1");
resultAsBase64 = binaryEncode(binaryDecode(resultAsHex, "hex"), "base64");
writeDump(resultAsBase64);
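To cross-check either approach outside ColdFusion, the two readings of the formula can be sketched in Python. This is only an illustration under stated assumptions: the function names are invented here, the strings are assumed to be UTF-8, and the first variant assumes the inner hash is joined as upper-case hex text. Which reading the API expects has to be confirmed against its documentation.

```python
import base64
import hashlib


def digest_hex_concat(nonce, timestamp, password):
    """Inner SHA-1 joined as upper-case hex text (string concatenation)."""
    inner_hex = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    outer = hashlib.sha1((nonce + timestamp + inner_hex).encode("utf-8")).digest()
    return base64.b64encode(outer).decode("ascii")


def digest_raw_bytes(nonce, timestamp, password):
    """Inner SHA-1 joined as its 20 raw bytes (binary concatenation)."""
    inner = hashlib.sha1(password.encode("utf-8")).digest()
    combined = nonce.encode("utf-8") + timestamp.encode("utf-8") + inner
    return base64.b64encode(hashlib.sha1(combined).digest()).decode("ascii")


# Values taken from the question.
args = ("secretnonce10111", "2015-09-30T14:12:15Z", "AMADEUS")
print("hex concat:", digest_hex_concat(*args))
print("raw bytes: ", digest_raw_bytes(*args))
```

Both results are 28-character base64 strings, so comparing each against the documented answer shows which interpretation applies.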

Decode a string with both Unicode and Utf-8 codes in Python 2.x

Say we have a string:
s = '\xe5\xaf\x92\xe5\x81\x87\\u2014\\u2014\xe5\x8e\xa6\xe9\x97\xa8'
Somehow the two '—' symbols, whose Unicode code point is \u2014, were not correctly encoded as '\xe2\x80\x94' in UTF-8. Is there an easy way to decode this string? It should be decoded as 寒假——厦门.
Manually using the replace function is OK:
t = u'\u2014'
s.replace('\u2014', t.encode('utf-8'))
print s
However, it is not automatic. If we extract the Unicode,
index = s.find('\u')
t = s[index : index+6]
then t = '\\u2014'. How to convert it to UTF-8 code?
You're missing the extra backslash in your replace() call, and the result needs to be assigned back to s.
It should be:
s = s.replace("\\u2014", u'\u2014'.encode("utf-8"))
Check my warning in the comments of the question. You should not end up in this situation.
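For the "automatic" part of the question, one option is a regular expression that finds every literal \uXXXX sequence and swaps in the UTF-8 bytes of that code point. The sketch below is written for Python 3 (bytes literals, chr), whereas the question targets Python 2.x, so treat it as an outline of the idea. The helper name fix_escapes is invented here.

```python
import re

# The mixed string from the question, as bytes: real UTF-8 plus literal \u2014.
raw = b'\xe5\xaf\x92\xe5\x81\x87\\u2014\\u2014\xe5\x8e\xa6\xe9\x97\xa8'

# A literal backslash, the letter u, then exactly four hex digits.
ESCAPE = re.compile(rb'\\u([0-9a-fA-F]{4})')


def fix_escapes(data):
    """Replace each literal \\uXXXX sequence with that code point's UTF-8 bytes."""
    def to_utf8(match):
        code_point = int(match.group(1), 16)
        return chr(code_point).encode('utf-8')
    return ESCAPE.sub(to_utf8, data)


fixed = fix_escapes(raw)
print(fixed.decode('utf-8'))
```

This does not handle surrogate escapes (values D800 to DFFF), which cannot be encoded to UTF-8 on their own.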

Coldfusion 9 Hash Binary SHA-256 [duplicate]

I am using ColdFusion 9
Referencing Ben Nadel's good works on his blog, I tried
ucase(digestUtils.sha512(imageBinary))
For SHA-512 hashing I get that dreaded:
The sha512 method was not found. Either there are no methods with the
specified method name and argument types or the sha512 method is
overloaded with argument types that ColdFusion cannot decipher
reliably. ColdFusion found 0 methods that match the provided
arguments. If this is a Java object and you verified that the method
exists, use the javacast function to reduce ambiguity.
Now I know that sha512 does indeed exist as a method, because I saw it here, but when I perform a
<cfdump var="#digestUtils#">
I only get:
md5(byte[]) byte[]
md5(java.lang.String) byte[]
md5Hex(byte[]) java.lang.String
md5Hex(java.lang.String) java.lang.String
sha(java.lang.String) byte[]
sha(byte[]) byte[]
shaHex(java.lang.String) java.lang.String
shaHex(byte[]) java.lang.String
What happened to the other methods? I guess I have to try something else.
Please advise with a ColdFusion solution. A ColdFusion/Java solution would be ok too.
I'm trying to write an SSO application where a third party feeds me URL parameters. I have successfully decoded the first parameter to get my XML post. I now need to take the second parameter, which is the hash payload, and run it through the algorithm to ensure my first parameter hasn't been tampered with.
=========
Editing begins here: Okay, I tried writing the code again, to no avail.
The algorithm sounds simple enough. But trying to implement it is killing me.
1. compute the hash string value of the XMLPost string above:
a. convert the base64 salt string to a UTF-8 byte array.
b. convert the base64 XML payload string to a UTF-8 byte array.
c. create a new byte array consisting of the XML payload bytes from step b, appended with the salt bytes from step a.
d. perform a SHA512 hash on the concatenated byte array from step c, which results in a hashed byte array.
e. create a new byte array consisting of the hashed bytes from step d, appended with the salt bytes from step a.
f. convert the result of step e to a base64-encoded string; this should equal the value of the query string parameter "h" (the payload hash).
xmlPost was created by my third party guys as such:
This XML payload string was converted to a UTF-8 byte array, which was then converted to a base-64 string. The resulting base-64 string is the value of my xmlPost below.
So I do this:
<cfset xmlPost = urlDecode("PD94bWwgdmVyc2lvbj0iMS4wIj8%2bPEVzdG9yZVNzb0N1c3RvbWVyIHhtbG5zOnhzaT0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxTY2hlbWEtaW5zdGFuY2UiIHhtbG5zOnhzZD0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxTY2hlbWEiPjxDdXN0b21lcklkPjExMjk0MDwvQ3VzdG9tZXJJZD48RGVhbGVyQ29kZT5OODg4ODg8L0RlYWxlckNvZGU%2bPFBvaW50QmFsYW5jZT4yODA8L1BvaW50QmFsYW5jZT48Rmlyc3ROYW1lPkZhaXRoPC9GaXJzdE5hbWU%2bPExhc3ROYW1lPkh1dHVsYTwvTGFzdE5hbWU%2bPC9Fc3RvcmVTc29DdXN0b21lcj4%3d") />
<cfset salt = "3dfjh674!MujErf98344#090" />
<cfset payload_hash = urlDecode("EtLDRJfcRESFKpY4OGZZnRSN2THqT%2bEelzOuXVU06jotd2kE4yKnlYay7BqyAdcUSATRgSMaHxZa6uBqKKd9rjNkZmpoNjc0IU11akVyZjk4MzQ0QDA5MA%3d%3d") />
<cfset strXML = ToString( ToBinary( xmlpost ) ) /> <!--- to get actual XML --->
<!--- base64 encoding returns a byte array --->
<cfset saltByteArray = toBase64( salt, "utf-8" ) />
<cfset xmlpostByteArray = toBase64( xmlPost, "utf-8" ) />
<!--- append salt to xmlpost --->
<cfset xmlpostsaltByteArray = xmlpostByteArray & saltByteArray />
<!--- now let us perform a sha512 hash on this concatenated byte array --->
<cfscript>
// Create an instance of our DigestUtils class
digestUtils = createObject("java","org.apache.commons.codec.digest.DigestUtils");
// I hash a byte array using the given algorithm and return an upper-case
// hexadecimal string (128 characters for SHA-512). Home-made hash function for CF9 and earlier
function hashBytes( bytes, algorithm = "SHA-512" ){
// Get our instance of the digest algorithm that we'll use
// to hash the byte array.
var messageDigest = createObject( "java", "java.security.MessageDigest" ).getInstance( javaCast( "string", algorithm ) );
// Get the digest for the given byte array. This returns the
// digest (i.e., hash) in byte-array format.
var digest = messageDigest.digest( bytes );
// Now that we have our digested byte array (i.e., our hash as another byte
// array), we have to convert that into a HEX string. So, we'll need a HEX buffer.
var hexBuffer = [];
// Each integer in the byte digest needs to be converted into
// a HEX character (with possible leading zero).
for (var i = 1; i LTE arrayLen(digest); i = i + 1) {
// Get the hex value for the current byte of the digest (not the loop
// index). When converting the byte, only use the right-most 8 bits
// (last 8 bits of the integer), otherwise the sign of the byte can
// create oddities
var tail = bitAnd( 255, digest[i] );
// Get the hex-encoding of the byte.
var hex = ucase( formatBaseN( tail, 16 ) );
// In order to make sure that all of the HEX characters
// are two-digits, we have to prepend a zero for any
// value that is less than 16 (16 is the smallest value
// that results in two HEX characters).
arrayAppend( hexBuffer, (tail < 16 ? ("0" & hex) : hex) );
}
// Return the flattened character buffer.
return( arrayToList( hexBuffer, "" ) );
}
// Get the hash of the byte array using our hashBytes() function
hashByteArray = hashBytes( xmlpostsaltByteArray );
</cfscript>
<!--- The hashByteArray is in HEX format now. Convert to binary --->
<!--- You must binary decode the hashed string before converting it to binary --->
<cfset hashByteArray = toBase64( BinaryDecode( hashByteArray, 'HEX' ) ) />
<!--- The final step is to append this new hashbytearray with the salt byte array --->
<cfset hashByteArray = hashByteArray & saltByteArray />
<!--- now convert this value to a base64 encoded string --->
<cfset hashByteArray2 = toBase64( hashByteArray )/>
Here is what I get for my strXML variable:
Actual xml structure converted from base 64 to string:
<?xml version="1.0"?><EstoreSsoCustomer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><CustomerId>112940</CustomerId><DealerCode>N88888</DealerCode><PointBalance>280</PointBalance><FirstName>Faith</FirstName><LastName>Hutula</LastName></EstoreSsoCustomer>
The final value, hashByteArray2, is not even remotely similar to payload_hash.
This is my first time doing this and my understanding of hashing, byte arrays and character conversions flew out of the window decades ago.
What am I doing wrong?
Thank you
Faith Sloan
DigestUtils.sha512 was added in version 1.4. ColdFusion 9 uses an older version, 1.3. That is why the method is not found.
Use the other function based on MessageDigest. Just be sure to pass in the correct algorithm, i.e.:
imageHash = hashBytes( imageBinary, "SHA-512" );
UPDATE: Based on the updated code, some of the instructions may be a bit misleading. I believe they just mean decode the xml and salt strings from their given encoding (base64 and utf-8) into byte arrays, not strings:
// note: salt value has invalid characters for base64
// assuming it is a plain utf-8 string
saltArray = charsetDecode(salt, "utf-8");
xmlByteArray = binaryDecode(xmlPost, "base64");
Then merge the two binary arrays (see the custom function at the end):
mergedBytes = mergeArrays( xmlByteArray, saltArray );
Calculate the hash of the new byte array:
messageDigest = createObject( "java", "java.security.MessageDigest" );
messageDigest = messageDigest.getInstance( javaCast( "string", "SHA-512") );
hashedByteArray = messageDigest.digest( javacast("byte[]", mergedBytes) );
Merge the arrays again:
mergedBytes = mergeArrays( hashedByteArray, saltArray);
Finally convert the binary to base64 and compare:
calculatedPayload = binaryEncode( javacast("byte[]", mergedBytes), "base64");
// check results
arePayloadsEqual = compare(calculatedPayload, payload_hash) eq 0;
WriteDump("arePayloadsEqual="& arePayloadsEqual);
WriteDump("calculatedPayload="& calculatedPayload);
WriteDump("payload_hash="& payload_hash);
Note: BinaryDecode/CharsetDecode return Java arrays. Unlike CF arrays, they are immutable (i.e. cannot be changed), so the handy addAll(..) trick will not work here.
// merge immutable arrays the long way
function mergeArrays( array1, array2 ){
var i = 0;
var newArray = [];
for (i = 1; i <= arrayLen(arguments.array1); i++) {
arrayAppend(newArray, arguments.array1[i]);
}
for (i = 1; i <= arrayLen(arguments.array2); i++) {
arrayAppend(newArray, arguments.array2[i]);
}
return newArray;
}
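For comparison, the whole algorithm (steps a through f of the question) fits in a few lines of Python, which can be handy for checking the ColdFusion result independently. This is a sketch under the same assumptions as the answer: the salt is a plain UTF-8 string and the XML payload is base64. The function name compute_payload_hash and the sample values are made up for illustration.

```python
import base64
import hashlib


def compute_payload_hash(xml_b64, salt):
    """Return base64( SHA-512(xml_bytes + salt_bytes) + salt_bytes )."""
    salt_bytes = salt.encode("utf-8")              # step a (salt taken as plain text)
    xml_bytes = base64.b64decode(xml_b64)          # step b
    merged = xml_bytes + salt_bytes                # step c
    hashed = hashlib.sha512(merged).digest()       # step d: 64 raw bytes
    final_bytes = hashed + salt_bytes              # step e
    return base64.b64encode(final_bytes).decode("ascii")  # step f


# Hypothetical sample input, not the values from the question.
sample_xml_b64 = base64.b64encode(b"<Customer>1</Customer>").decode("ascii")
result = compute_payload_hash(sample_xml_b64, "example-salt")
print(result)

# Sanity check: the decoded result is 64 hash bytes followed by the salt.
decoded = base64.b64decode(result)
print(len(decoded), decoded[64:])
```

Because the salt is appended in the clear, decoding the tail of the received payload hash is a quick way to confirm which salt the sender actually used.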

Does Len function only evaluate numerical results?

Why does the following code not output "Error" if the form is submitted with a blank field? Does Len only evaluate numerical values?
<cfif NOT Len(Trim("Form.myField"))>
<cfoutput>Error</cfoutput>
</cfif>
The following also does not evaluate as expected:
<cfif Len(Trim("Form.myField")) IS 0>
<cfoutput>Error</cfoutput>
</cfif>
HTML:
<input type="text" name="myField" value="">
Because it's evaluating the literal string "Form.myField", which is not length 0.
Try: <cfif len(trim(form.myField)) EQ 0>
Are you sure you're supposed to pass the parameter in quotes within the trim function? It may be literally trimming the string "Form.myField".
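The difference between a quoted literal and a variable reference is the same in most languages. A tiny Python illustration, where the form dictionary is a hypothetical stand-in for the submitted form scope:

```python
# Hypothetical stand-in for a submitted form with a blank field.
form = {"myField": "   "}

# Quoted: measures the literal text "Form.myField", never empty.
literal_length = len("Form.myField".strip())
print(literal_length)  # 12

# Unquoted reference: measures the submitted value.
value_length = len(form["myField"].strip())
print(value_length)  # 0
```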

openbd cfloop over a date

I'm trying to convert my sites from CF8 to openBD. I have a cfloop in a site that loops over a date range.
In essence, I want to insert a new record into the db for every 2 weeks (step) of a date range (from and to).
My loop looks like this:
<cfloop
from = "#form.startDate#"
to = "#form.endDate#"
index = "i"
step = "#theStep#"
>
This works perfectly in CF8, but in openBD I get this error:
Data not supported: value [11/05/09] is not a number
Any ideas for a workaround?
Thx
Your problem lies in not checking for ambiguous locale dependent date strings from your FORM.
A more robust version would be this:
<cfset SetLocale("English (US)")> <!--- set expected input locale here --->
<cfif LSIsDate(form.startDate) and LSIsDate(form.endDate)>
<cfset theStep = 14>
<cfloop
from = "#LSParseDate(form.startDate)#"
to = "#LSParseDate(form.endDate)#"
index = "i"
step = "#theStep#"
>
<!--- do stuff --->
</cfloop>
<cfelse>
<!--- output some error message --->
</cfif>
It would be helpful to restrict people to entering unambiguous date formats into the FORM, like "yyyy-mm-dd".
The "value is not a number" error comes from the fact that the loop still goes over numbers, even if you feed it dates. It uses a numerical representation of these dates then, but they must be valid and intelligible for that to work.
I can't see your code, but here's my first suggestion:
<cfset current = [your begin date]>
<cfloop condition = "dateCompare(current, enddate) LTE 0">
[do stuff]
<cfset current = dateadd('d', 14, current)>
</cfloop>
HTH.
As Ben says, your code isn't there - you need to use the 101 010 icon to create a code block for it.
Here's another solution which should work:
<cfloop index="Current" from="#parseDateTime(StartDate)#" to="#parseDateTime(EndDate)#" step="14">
[do stuff]
</cfloop>
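All three answers come down to the same pattern: parse the two dates once, then advance a date value by a fixed number of days until it passes the end date. A minimal Python sketch of that pattern follows; the function name every_n_days and the sample dates are made up for illustration.

```python
from datetime import date, timedelta


def every_n_days(start, end, step_days=14):
    """Yield each date from start to end (inclusive), step_days apart."""
    current = start
    # Compare real date values, so the loop stops once current passes end.
    while current <= end:
        yield current
        current += timedelta(days=step_days)


dates = list(every_n_days(date(2009, 11, 5), date(2009, 12, 10)))
for d in dates:
    print(d.isoformat())  # 2009-11-05, 2009-11-19, 2009-12-03
```

Working with parsed date values rather than the raw form strings avoids the "is not a number" problem, since the step arithmetic never touches the original text.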