Small differences in SHA1 hashes - gwt

A project I am working on uses Apache Shiro as a security framework. Passwords are SHA1 hashed (no salt, no iterations). Login is SSL secured. However, the remaining part of the application is not SSL secured. In this context (no SSL) there should be a form where a user can change the password.
Since it wouldn't be a good idea to transmit it in plain text, it should be hashed on the client and then transmitted to the server. As the client is GWT (2.3) based, I am trying the library http://code.google.com/p/gwt-crypto, which uses code from Bouncy Castle.
However, in many cases (not all) the hashes generated by both frameworks differ in 1-4(?) characters.
For instance "happa3" is hashed to
"fe7f3cffd8a5f0512a5f1120f1369f48cd6f47c2"
by both implementations, whereas just "happa" is hashed to
"fb3c3a741b4e07a87d9cb68f3db020d6fbfed00a"
by the Shiro implementation and to
"fb3c3a741b4e07a87d9cb63f3db020d6fbfed00a"
by the gwt-crypto implementation (23rd character differs).
I wonder whether there is a "correct"/standard SHA1 hashing and whether there is a bug in one of the libraries or maybe my usage of them is flawed.
One of my first thoughts was related to different encodings or strange conversions due to different transport mechanisms (RPC vs. POST). To my knowledge, though (and this is what puzzles me most), SHA1 hashes should differ completely, with high probability, if there is even a single-bit difference in the input. So different encodings shouldn't be the issue here.
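For what it's worth, that avalanche behaviour is easy to check on the server side with the JDK alone, independently of both libraries; a minimal sketch (class and helper names mine):

import java.security.MessageDigest;

public class AvalancheCheck {
    public static void main(String[] args) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        // two inputs that differ only in one appended character
        String a = hex(md.digest("happa".getBytes("UTF-8")));
        String b = hex(md.digest("happa3".getBytes("UTF-8")));
        int diff = 0;
        for (int i = 0; i < a.length(); i++)
            if (a.charAt(i) != b.charAt(i)) diff++;
        // on average roughly 15/16 of the 40 hex chars differ, never just a few
        System.out.println(a + "\n" + b + "\ndiffering hex chars: " + diff);
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) sb.append(String.format("%02x", x));
        return sb.toString();
    }
}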
I am using this code on the client (GWT) for hashing:
String hashed = toHex(createSHA1Hash("password"));
...

private String createSHA1Hash(String passwordString) {
    SHA1Digest sha1 = new SHA1Digest();
    byte[] bytes;
    byte[] result = new byte[sha1.getDigestSize()];
    try {
        bytes = passwordString.getBytes();
        sha1.update(bytes, 0, bytes.length);
        int val = sha1.doFinal(result, 0);
    } catch (UnsupportedEncodingException e) {}
    return new String(result);
}

public String toHex(String arg) {
    return new BigInteger(1, arg.getBytes()).toString(16);
}
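(Aside: this toHex is also fragile. new String(result) followed by getBytes() round-trips the raw digest through the platform charset, and BigInteger.toString(16) drops leading zeros, so a digest starting with a zero nibble comes out shorter than 40 characters. A helper that works directly on the byte[] avoids both; an untested sketch:)

private String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
        sb.append(Character.forDigit((b >> 4) & 0xF, 16));
        sb.append(Character.forDigit(b & 0xF, 16));
    }
    return sb.toString();
}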
And this on the server (Shiro):
String hashed = new Sha1Hash("password").toHex()
which, as far as I can see, does something very similar behind the scenes (I had a quick look at the source code).
Did I miss something obvious here?
EDIT: It seems the GWT code does not run natively for some reason (i.e. it runs only in development mode) and silently fails (it does compile, though). I have to find out why...
Edit (2): "int val = sha1.doFinal(result, 0);" is the line that causes trouble, i.e. if it is present, the whole code does not run natively (as JS) but only in dev mode (with wrong results).

You could test this version:
public class SHA1 {
public static native String calcSHA1(String s) /*-{
//
// A JavaScript implementation of the Secure Hash Algorithm, SHA-1, as defined
// in FIPS 180-1
// Version 2.2 Copyright Paul Johnston 2000 - 2009.
// Other contributors: Greg Holt, Andrew Kepert, Ydnar, Lostinet
// Distributed under the BSD License
// See http://pajhome.org.uk/crypt/md5 for details.
//
//
// Configurable variables. You may need to tweak these to be compatible with
// the server-side, but the defaults work in most cases.
//
var hexcase = 0; // hex output format. 0 - lowercase; 1 - uppercase
var b64pad = ""; // base-64 pad character. "=" for strict RFC compliance
//
// These are the functions you'll usually want to call
// They take string arguments and return either hex or base-64 encoded strings
//
function b64_sha1(s) { return rstr2b64(rstr_sha1(str2rstr_utf8(s))); }
function any_sha1(s, e) { return rstr2any(rstr_sha1(str2rstr_utf8(s)), e); }
function hex_hmac_sha1(k, d)
{ return rstr2hex(rstr_hmac_sha1(str2rstr_utf8(k), str2rstr_utf8(d))); }
function b64_hmac_sha1(k, d)
{ return rstr2b64(rstr_hmac_sha1(str2rstr_utf8(k), str2rstr_utf8(d))); }
function any_hmac_sha1(k, d, e)
{ return rstr2any(rstr_hmac_sha1(str2rstr_utf8(k), str2rstr_utf8(d)), e); }
//
// Perform a simple self-test to see if the VM is working
//
function sha1_vm_test()
{
return hex_sha1("abc").toLowerCase() == "a9993e364706816aba3e25717850c26c9cd0d89d";
}
//
// Calculate the SHA1 of a raw string
//
function rstr_sha1(s)
{
return binb2rstr(binb_sha1(rstr2binb(s), s.length * 8));
}
//
// Calculate the HMAC-SHA1 of a key and some data (raw strings)
//
function rstr_hmac_sha1(key, data)
{
var bkey = rstr2binb(key);
if(bkey.length > 16) bkey = binb_sha1(bkey, key.length * 8);
var ipad = Array(16), opad = Array(16);
for(var i = 0; i < 16; i++)
{
ipad[i] = bkey[i] ^ 0x36363636;
opad[i] = bkey[i] ^ 0x5C5C5C5C;
}
var hash = binb_sha1(ipad.concat(rstr2binb(data)), 512 + data.length * 8);
return binb2rstr(binb_sha1(opad.concat(hash), 512 + 160));
}
//
// Convert a raw string to a hex string
//
function rstr2hex(input)
{
try { hexcase } catch(e) { hexcase=0; }
var hex_tab = hexcase ? "0123456789ABCDEF" : "0123456789abcdef";
var output = "";
var x;
for(var i = 0; i < input.length; i++)
{
x = input.charCodeAt(i);
output += hex_tab.charAt((x >>> 4) & 0x0F)
+ hex_tab.charAt( x & 0x0F);
}
return output;
}
//
// Convert a raw string to a base-64 string
//
function rstr2b64(input)
{
try { b64pad } catch(e) { b64pad=''; }
var tab = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
var output = "";
var len = input.length;
for(var i = 0; i < len; i += 3)
{
var triplet = (input.charCodeAt(i) << 16)
| (i + 1 < len ? input.charCodeAt(i+1) << 8 : 0)
| (i + 2 < len ? input.charCodeAt(i+2) : 0);
for(var j = 0; j < 4; j++)
{
if(i * 8 + j * 6 > input.length * 8) output += b64pad;
else output += tab.charAt((triplet >>> 6*(3-j)) & 0x3F);
}
}
return output;
}
//
// Convert a raw string to an arbitrary string encoding
//
function rstr2any(input, encoding)
{
var divisor = encoding.length;
var remainders = Array();
var i, q, x, quotient;
// Convert to an array of 16-bit big-endian values, forming the dividend
var dividend = Array(Math.ceil(input.length / 2));
for(i = 0; i < dividend.length; i++)
{
dividend[i] = (input.charCodeAt(i * 2) << 8) | input.charCodeAt(i * 2 + 1);
}
//
// Repeatedly perform a long division. The binary array forms the dividend,
// the length of the encoding is the divisor. Once computed, the quotient
// forms the dividend for the next step. We stop when the dividend is zero.
// All remainders are stored for later use.
//
while(dividend.length > 0)
{
quotient = Array();
x = 0;
for(i = 0; i < dividend.length; i++)
{
x = (x << 16) + dividend[i];
q = Math.floor(x / divisor);
x -= q * divisor;
if(quotient.length > 0 || q > 0)
quotient[quotient.length] = q;
}
remainders[remainders.length] = x;
dividend = quotient;
}
// Convert the remainders to the output string
var output = "";
for(i = remainders.length - 1; i >= 0; i--)
output += encoding.charAt(remainders[i]);
// Append leading zero equivalents
var full_length = Math.ceil(input.length * 8 /
(Math.log(encoding.length) / Math.log(2)))
for(i = output.length; i < full_length; i++)
output = encoding[0] + output;
return output;
}
//
// Encode a string as utf-8.
// For efficiency, this assumes the input is valid utf-16.
//
function str2rstr_utf8(input)
{
var output = "";
var i = -1;
var x, y;
while(++i < input.length)
{
// Decode utf-16 surrogate pairs
x = input.charCodeAt(i);
y = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
if(0xD800 <= x && x <= 0xDBFF && 0xDC00 <= y && y <= 0xDFFF)
{
x = 0x10000 + ((x & 0x03FF) << 10) + (y & 0x03FF);
i++;
}
// Encode output as utf-8
if(x <= 0x7F)
output += String.fromCharCode(x);
else if(x <= 0x7FF)
output += String.fromCharCode(0xC0 | ((x >>> 6 ) & 0x1F),
0x80 | ( x & 0x3F));
else if(x <= 0xFFFF)
output += String.fromCharCode(0xE0 | ((x >>> 12) & 0x0F),
0x80 | ((x >>> 6 ) & 0x3F),
0x80 | ( x & 0x3F));
else if(x <= 0x1FFFFF)
output += String.fromCharCode(0xF0 | ((x >>> 18) & 0x07),
0x80 | ((x >>> 12) & 0x3F),
0x80 | ((x >>> 6 ) & 0x3F),
0x80 | ( x & 0x3F));
}
return output;
}
//
// Encode a string as utf-16
//
function str2rstr_utf16le(input)
{
var output = "";
for(var i = 0; i < input.length; i++)
output += String.fromCharCode( input.charCodeAt(i) & 0xFF,
(input.charCodeAt(i) >>> 8) & 0xFF);
return output;
}
function str2rstr_utf16be(input)
{
var output = "";
for(var i = 0; i < input.length; i++)
output += String.fromCharCode((input.charCodeAt(i) >>> 8) & 0xFF,
input.charCodeAt(i) & 0xFF);
return output;
}
//
// Convert a raw string to an array of big-endian words
// Characters >255 have their high-byte silently ignored.
//
function rstr2binb(input)
{
var output = Array(input.length >> 2);
for(var i = 0; i < output.length; i++)
output[i] = 0;
for(var i = 0; i < input.length * 8; i += 8)
output[i>>5] |= (input.charCodeAt(i / 8) & 0xFF) << (24 - i % 32);
return output;
}
//
// Convert an array of big-endian words to a string
//
function binb2rstr(input)
{
var output = "";
for(var i = 0; i < input.length * 32; i += 8)
output += String.fromCharCode((input[i>>5] >>> (24 - i % 32)) & 0xFF);
return output;
}
//
// Calculate the SHA-1 of an array of big-endian words, and a bit length
//
function binb_sha1(x, len)
{
// append padding
x[len >> 5] |= 0x80 << (24 - len % 32);
x[((len + 64 >> 9) << 4) + 15] = len;
var w = Array(80);
var a = 1732584193;
var b = -271733879;
var c = -1732584194;
var d = 271733878;
var e = -1009589776;
for(var i = 0; i < x.length; i += 16)
{
var olda = a;
var oldb = b;
var oldc = c;
var oldd = d;
var olde = e;
for(var j = 0; j < 80; j++)
{
if(j < 16) w[j] = x[i + j];
else w[j] = bit_rol(w[j-3] ^ w[j-8] ^ w[j-14] ^ w[j-16], 1);
var t = safe_add(safe_add(bit_rol(a, 5), sha1_ft(j, b, c, d)),
safe_add(safe_add(e, w[j]), sha1_kt(j)));
e = d;
d = c;
c = bit_rol(b, 30);
b = a;
a = t;
}
a = safe_add(a, olda);
b = safe_add(b, oldb);
c = safe_add(c, oldc);
d = safe_add(d, oldd);
e = safe_add(e, olde);
}
return Array(a, b, c, d, e);
}
//
// Perform the appropriate triplet combination function for the current
// iteration
//
function sha1_ft(t, b, c, d)
{
if(t < 20) return (b & c) | ((~b) & d);
if(t < 40) return b ^ c ^ d;
if(t < 60) return (b & c) | (b & d) | (c & d);
return b ^ c ^ d;
}
//
// Determine the appropriate additive constant for the current iteration
//
function sha1_kt(t)
{
return (t < 20) ? 1518500249 : (t < 40) ? 1859775393 :
(t < 60) ? -1894007588 : -899497514;
}
//
// Add integers, wrapping at 2^32. This uses 16-bit operations internally
// to work around bugs in some JS interpreters.
//
function safe_add(x, y)
{
var lsw = (x & 0xFFFF) + (y & 0xFFFF);
var msw = (x >> 16) + (y >> 16) + (lsw >> 16);
return (msw << 16) | (lsw & 0xFFFF);
}
//
// Bitwise rotate a 32-bit number to the left.
//
function bit_rol(num, cnt)
{
return (num << cnt) | (num >>> (32 - cnt));
}
return rstr2hex(rstr_sha1(str2rstr_utf8(s)));
}-*/;
}
I'm using it for my client-side SHA generation and it has worked well.
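Calling it from GWT client code is then a one-liner, e.g.:

String hashed = SHA1.calcSHA1("password"); // lowercase hex; should match Shiro's Sha1Hash("password").toHex() for the same input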

Related

How can I pack a float3 into one float

I am doing some animation work. I need to pack some pivots into UVs so that my shader can read them.
I need to pack 4 float3s into a float4. Therefore, I need to pack each float3 into a single float.
These 4 float3s are (model space position1, direction1, model space position2, direction2). I know how to handle the directions because they are normalized. I can use something like:
#define f3_f(c) (dot(round((c) * 255), float3(65536, 256, 1)))
#define f_f3(f) (frac((f) / float3(16777216, 65536, 256)))
But how can I handle the positions? I am using SM3.0 and I can't use bitwise operations.
Do you really need to pack it into a float, or can you pack it into a 32-bit unsigned integer (also 4 bytes)?
If so, take a look at the code in DirectXMath for converting to/from the various formats used by DirectXTex, such as DXGI_FORMAT_R11G11B10_FLOAT. Since this format is positive only, you'll have to do a scale and bias to/from the format to handle the [-1,+1] range, but that's easy to do (0.5*value + 0.5 <-> 2*value - 1).
// 3D vector: 11/11/10 floating-point components
// The 3D vector is packed into 32 bits as follows: a 5-bit biased exponent
// and 6-bit mantissa for x component, a 5-bit biased exponent and
// 6-bit mantissa for y component, a 5-bit biased exponent and a 5-bit
// mantissa for z. The z component is stored in the most significant bits
// and the x component in the least significant bits. No sign bits so
// all partial-precision numbers are positive.
// (Z10Y11X11): [32] ZZZZZzzz zzzYYYYY yyyyyyXX XXXxxxxx [0]
struct XMFLOAT3PK
{
union
{
struct
{
uint32_t xm : 6; // x-mantissa
uint32_t xe : 5; // x-exponent
uint32_t ym : 6; // y-mantissa
uint32_t ye : 5; // y-exponent
uint32_t zm : 5; // z-mantissa
uint32_t ze : 5; // z-exponent
};
uint32_t v;
};
XMFLOAT3PK() = default;
XMFLOAT3PK(const XMFLOAT3PK&) = default;
XMFLOAT3PK& operator=(const XMFLOAT3PK&) = default;
XMFLOAT3PK(XMFLOAT3PK&&) = default;
XMFLOAT3PK& operator=(XMFLOAT3PK&&) = default;
explicit XM_CONSTEXPR XMFLOAT3PK(uint32_t Packed) : v(Packed) {}
XMFLOAT3PK(float _x, float _y, float _z);
explicit XMFLOAT3PK(_In_reads_(3) const float *pArray);
operator uint32_t () const { return v; }
XMFLOAT3PK& operator= (uint32_t Packed) { v = Packed; return *this; }
};
// Converts float3 to the 11/11/10 format
inline void XM_CALLCONV XMStoreFloat3PK
(
XMFLOAT3PK* pDestination,
FXMVECTOR V
)
{
assert(pDestination);
__declspec(align(16)) uint32_t IValue[4];
XMStoreFloat3A( reinterpret_cast<XMFLOAT3A*>(&IValue), V );
uint32_t Result[3];
// X & Y Channels (5-bit exponent, 6-bit mantissa)
for(uint32_t j=0; j < 2; ++j)
{
uint32_t Sign = IValue[j] & 0x80000000;
uint32_t I = IValue[j] & 0x7FFFFFFF;
if ((I & 0x7F800000) == 0x7F800000)
{
// INF or NAN
Result[j] = 0x7c0;
if (( I & 0x7FFFFF ) != 0)
{
Result[j] = 0x7c0 | (((I>>17)|(I>>11)|(I>>6)|(I))&0x3f);
}
else if ( Sign )
{
// -INF is clamped to 0 since 3PK is positive only
Result[j] = 0;
}
}
else if ( Sign )
{
// 3PK is positive only, so clamp to zero
Result[j] = 0;
}
else if (I > 0x477E0000U)
{
// The number is too large to be represented as a float11, set to max
Result[j] = 0x7BF;
}
else
{
if (I < 0x38800000U)
{
// The number is too small to be represented as a normalized float11
// Convert it to a denormalized value.
uint32_t Shift = 113U - (I >> 23U);
I = (0x800000U | (I & 0x7FFFFFU)) >> Shift;
}
else
{
// Rebias the exponent to represent the value as a normalized float11
I += 0xC8000000U;
}
Result[j] = ((I + 0xFFFFU + ((I >> 17U) & 1U)) >> 17U)&0x7ffU;
}
}
// Z Channel (5-bit exponent, 5-bit mantissa)
uint32_t Sign = IValue[2] & 0x80000000;
uint32_t I = IValue[2] & 0x7FFFFFFF;
if ((I & 0x7F800000) == 0x7F800000)
{
// INF or NAN
Result[2] = 0x3e0;
if ( I & 0x7FFFFF )
{
Result[2] = 0x3e0 | (((I>>18)|(I>>13)|(I>>3)|(I))&0x1f);
}
else if ( Sign )
{
// -INF is clamped to 0 since 3PK is positive only
Result[2] = 0;
}
}
else if ( Sign )
{
// 3PK is positive only, so clamp to zero
Result[2] = 0;
}
else if (I > 0x477C0000U)
{
// The number is too large to be represented as a float10, set to max
Result[2] = 0x3df;
}
else
{
if (I < 0x38800000U)
{
// The number is too small to be represented as a normalized float10
// Convert it to a denormalized value.
uint32_t Shift = 113U - (I >> 23U);
I = (0x800000U | (I & 0x7FFFFFU)) >> Shift;
}
else
{
// Rebias the exponent to represent the value as a normalized float10
I += 0xC8000000U;
}
Result[2] = ((I + 0x1FFFFU + ((I >> 18U) & 1U)) >> 18U)&0x3ffU;
}
// Pack Result into memory
pDestination->v = (Result[0] & 0x7ff)
| ( (Result[1] & 0x7ff) << 11 )
| ( (Result[2] & 0x3ff) << 22 );
}
// Converts the 11/11/10 format to float3
inline XMVECTOR XM_CALLCONV XMLoadFloat3PK
(
const XMFLOAT3PK* pSource
)
{
assert(pSource);
__declspec(align(16)) uint32_t Result[4];
uint32_t Mantissa;
uint32_t Exponent;
// X Channel (6-bit mantissa)
Mantissa = pSource->xm;
if ( pSource->xe == 0x1f ) // INF or NAN
{
Result[0] = static_cast<uint32_t>(0x7f800000 | (static_cast<int>(pSource->xm) << 17));
}
else
{
if ( pSource->xe != 0 ) // The value is normalized
{
Exponent = pSource->xe;
}
else if (Mantissa != 0) // The value is denormalized
{
// Normalize the value in the resulting float
Exponent = 1;
do
{
Exponent--;
Mantissa <<= 1;
} while ((Mantissa & 0x40) == 0);
Mantissa &= 0x3F;
}
else // The value is zero
{
Exponent = static_cast<uint32_t>(-112);
}
Result[0] = ((Exponent + 112) << 23) | (Mantissa << 17);
}
// Y Channel (6-bit mantissa)
Mantissa = pSource->ym;
if ( pSource->ye == 0x1f ) // INF or NAN
{
Result[1] = static_cast<uint32_t>(0x7f800000 | (static_cast<int>(pSource->ym) << 17));
}
else
{
if ( pSource->ye != 0 ) // The value is normalized
{
Exponent = pSource->ye;
}
else if (Mantissa != 0) // The value is denormalized
{
// Normalize the value in the resulting float
Exponent = 1;
do
{
Exponent--;
Mantissa <<= 1;
} while ((Mantissa & 0x40) == 0);
Mantissa &= 0x3F;
}
else // The value is zero
{
Exponent = static_cast<uint32_t>(-112);
}
Result[1] = ((Exponent + 112) << 23) | (Mantissa << 17);
}
// Z Channel (5-bit mantissa)
Mantissa = pSource->zm;
if ( pSource->ze == 0x1f ) // INF or NAN
{
Result[2] = static_cast<uint32_t>(0x7f800000 | (static_cast<int>(pSource->zm) << 17));
}
else
{
if ( pSource->ze != 0 ) // The value is normalized
{
Exponent = pSource->ze;
}
else if (Mantissa != 0) // The value is denormalized
{
// Normalize the value in the resulting float
Exponent = 1;
do
{
Exponent--;
Mantissa <<= 1;
} while ((Mantissa & 0x20) == 0);
Mantissa &= 0x1F;
}
else // The value is zero
{
Exponent = static_cast<uint32_t>(-112);
}
Result[2] = ((Exponent + 112) << 23) | (Mantissa << 18);
}
return XMLoadFloat3A( reinterpret_cast<const XMFLOAT3A*>(&Result) );
}
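For the [-1,+1] inputs mentioned above, the scale and bias can be wrapped around these two functions; a small untested sketch (wrapper names are mine):

#include <cstdint>
#include <DirectXMath.h>
#include <DirectXPackedVector.h>

using namespace DirectX;
using namespace DirectX::PackedVector;

// Remap [-1,+1] to [0,1] via 0.5*value + 0.5, then pack to 11/11/10 float.
inline uint32_t PackSignedToFloat3PK(FXMVECTOR v)
{
    XMVECTOR biased = XMVectorMultiplyAdd(v, XMVectorReplicate(0.5f), XMVectorReplicate(0.5f));
    XMFLOAT3PK packed;
    XMStoreFloat3PK(&packed, biased);
    return packed.v;
}

// Unpack 11/11/10 float, then remap [0,1] back to [-1,+1] via 2*value - 1.
inline XMVECTOR UnpackSignedFromFloat3PK(uint32_t packedValue)
{
    XMFLOAT3PK packed(packedValue);
    XMVECTOR biased = XMLoadFloat3PK(&packed);
    return XMVectorMultiplyAdd(biased, XMVectorReplicate(2.0f), XMVectorReplicate(-1.0f));
}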

Google Translate TTS API blocked

Google implemented a captcha to block people from accessing the TTS translate API https://translate.google.com/translate_tts?ie=UTF-8&q=test&tl=zh-TW. I was using it in my mobile application. Now, it is not returning anything. How do I get around the captcha?
Add the qualifier '&client=tw-ob' to the end of your query.
https://translate.google.com/translate_tts?ie=UTF-8&q=test&tl=zh-TW&client=tw-ob
This answer no longer works consistently. Your IP address will be blocked by Google temporarily if you abuse this too much.
There are 3 main issues:
You must include "client" in your query string (client=t seems to work).
(In case you are trying to retrieve it using AJAX) the Referer of the HTTP request must be https://translate.google.com/.
The "tk" field changes for every query, and it must be populated with a matching hash:
tk = hash(q, TKK), where q is the text to be TTSed, and TKK is a variable in the global scope when you load translate.google.com (type 'window.TKK' in the console). See the hash function at the bottom of this answer (calcHash).
To summarize:
function generateGoogleTTSLink(q, tl, tkk) {
  var tk = calcHash(q, tkk);
  return `https://translate.google.com/translate_tts?ie=UTF-8&total=1&idx=0&client=t&ttsspeed=1&tl=${tl}&tk=${tk}&q=${q}&textlen=${q.length}`;
}
generateGoogleTTSLink('ciao', 'it', '410353.1336369826');
// see the definition of "calcHash" at the bottom of this answer.
=> To get your hands on a TKK, you can open the Google Translate website, then type "TKK" in the developer tools' console (e.g.: "410353.1336369826").
NOTE that the TKK value changes every hour, and so old TKKs might get blocked at some point, and refreshing may be necessary (although so far it seems like old keys can work for a LONG time).
If you DO wish to periodically refresh the TKK, it can be automated pretty easily, but not if you're running your code from the browser.
You can find a full NodeJS implementation here:
https://github.com/guyrotem/google-translate-server.
It exposes a minimal TTS API (query, language) and is deployed to a free Heroku server, so you can test it online if you like.
function shiftLeftOrRightThenSumOrXor(num, opArray) {
return opArray.reduce((acc, opString) => {
var op1 = opString[1]; // '+' | '-' ~ SUM | XOR
var op2 = opString[0]; // '+' | '^' ~ SLL | SRL
var xd = opString[2]; // [0-9a-f]
var shiftAmount = hexCharAsNumber(xd);
var mask = (op1 == '+') ? acc >>> shiftAmount : acc << shiftAmount;
return (op2 == '+') ? (acc + mask & 0xffffffff) : (acc ^ mask);
}, num);
}
function hexCharAsNumber(xd) {
return (xd >= 'a') ? xd.charCodeAt(0) - 87 : Number(xd);
}
function transformQuery(query) {
for (var e = [], f = 0, g = 0; g < query.length; g++) {
var l = query.charCodeAt(g);
if (l < 128) {
e[f++] = l; // 0{l[6-0]}
} else if (l < 2048) {
e[f++] = l >> 6 | 0xC0; // 110{l[10-6]}
e[f++] = l & 0x3F | 0x80; // 10{l[5-0]}
} else if (0xD800 == (l & 0xFC00) && g + 1 < query.length && 0xDC00 == (query.charCodeAt(g + 1) & 0xFC00)) {
// that's pretty rare... (avoid ovf?)
l = (1 << 16) + ((l & 0x03FF) << 10) + (query.charCodeAt(++g) & 0x03FF);
e[f++] = l >> 18 | 0xF0; // 111100{l[9-8*]}
e[f++] = l >> 12 & 0x3F | 0x80; // 10{l[7*-2]}
e[f++] = l & 0x3F | 0x80; // 10{(l+1)[5-0]}
} else {
e[f++] = l >> 12 | 0xE0; // 1110{l[15-12]}
e[f++] = l >> 6 & 0x3F | 0x80; // 10{l[11-6]}
e[f++] = l & 0x3F | 0x80; // 10{l[5-0]}
}
}
return e;
}
function normalizeHash(encondindRound2) {
if (encondindRound2 < 0) {
encondindRound2 = (encondindRound2 & 0x7fffffff) + 0x80000000;
}
return encondindRound2 % 1E6;
}
function calcHash(query, windowTkk) {
// STEP 1: spread the query char codes on a byte-array, 1-3 bytes per char
var bytesArray = transformQuery(query);
// STEP 2: starting with TKK index, add the array from last step one-by-one, and do 2 rounds of shift+add/xor
var d = windowTkk.split('.');
var tkkIndex = Number(d[0]) || 0;
var tkkKey = Number(d[1]) || 0;
var encondingRound1 = bytesArray.reduce((acc, current) => {
acc += current;
return shiftLeftOrRightThenSumOrXor(acc, ['+-a', '^+6'])
}, tkkIndex);
// STEP 3: apply 3 rounds of shift+add/xor and XOR with the TKK key
var encondingRound2 = shiftLeftOrRightThenSumOrXor(encondingRound1, ['+-3', '^+b', '+-f']) ^ tkkKey;
// STEP 4: Normalize to 2s complement & format
var normalizedResult = normalizeHash(encondingRound2);
return normalizedResult.toString() + "." + (normalizedResult ^ tkkIndex)
}
// usage example:
var tk = calcHash('hola', '409837.2120040981');
console.log('tk=' + tk);
// OUTPUT: 'tk=70528.480109'
You can also try this format:
Pass q = the URL-encoded form of your text
(in JavaScript you can use the encodeURIComponent() function, and PHP has the rawurlencode() function).
Pass tl = the language's short name (e.g. Bangla = bn).
Now try this:
https://translate.google.com.vn/translate_tts?ie=UTF-8&q=%E0%A6%A2%E0%A6%BE%E0%A6%95%E0%A6%BE+&tl=bn&client=tw-ob
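For example, in JavaScript (a minimal sketch of building that URL):

// URL-encode the text, then assemble the query string
var q = encodeURIComponent("ঢাকা"); // the Bangla text from the example above
var url = "https://translate.google.com.vn/translate_tts?ie=UTF-8&q=" + q + "&tl=bn&client=tw-ob";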
First, to avoid the captcha, you have to set a proper user agent like: "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0"
Then, to avoid being blocked, you must provide a proper token ("tk" GET parameter) for each single request.
On the web you can find many different kinds of scripts that try to calculate the token after a lot of reverse engineering... but every time the big G changes the algorithm you're stuck again, so it's much easier to retrieve your token just by closely observing similar requests to the translate page (with your text in the URL).
You can read the token from time to time by grepping "tk=" from the output of this simple phantomjs script:
"use strict";
var page = require('webpage').create();
var system = require('system');
var args = system.args;
if (args.length != 2) { console.log("usage: "+args[0]+" text"); phantom.exit(1); }
page.onConsoleMessage = function(msg) { console.log(msg); };
page.onResourceRequested = function(request) { console.log('Request ' + JSON.stringify(request, undefined, 4)); };
page.open("https://translate.google.it/?hl=it&tab=wT#fr/it/"+args[1], function(status) {
if (status === "success") { phantom.exit(0); }
else { phantom.exit(1); }
});
So in the end you can get your speech with something like:
wget -U "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0"
"http://translate.google.com/translate_tts?ie=UTF-8&tl=it&tk=52269.458629&q=ciao&client=t" -O ciao.mp3
(tokens are probably time-based, so this link may not work tomorrow)
I rewrote Guy Rotem's answer in Java, so if you prefer Java over JavaScript, feel free to use:
public class Hasher {
public long shiftLeftOrRightThenSumOrXor(long num, String[] opArray) {
long result = num;
int current = 0;
while (current < opArray.length) {
char op1 = opArray[current].charAt(1); // '+' | '-' ~ SUM | XOR
char op2 = opArray[current].charAt(0); // '+' | '^' ~ SLL | SRL
char xd = opArray[current].charAt(2); // [0-9a-f]
assertError(op1 == '+'
|| op1 == '-', "Invalid OP: " + op1);
assertError(op2 == '+'
|| op2 == '^', "Invalid OP: " + op2);
assertError(('0' <= xd && xd <= '9')
|| ('a' <= xd && xd <='f'), "Not an 0x? value: " + xd);
int shiftAmount = hexCharAsNumber(xd);
int mask = (op1 == '+') ? ((int) result) >>> shiftAmount : ((int) result) << shiftAmount;
long subresult = (op2 == '+') ? (((int) result) + ((int) mask) & 0xffffffff)
: (((int) result) ^ mask);
result = subresult;
current++;
}
return result;
}
public void assertError(boolean cond, String e) {
if (!cond) {
System.err.println(e);
}
}
public int hexCharAsNumber(char xd) {
return (xd >= 'a') ? xd - 87 : Character.getNumericValue(xd);
}
public int[] transformQuery(String query) {
int[] e = new int[1000];
int resultSize = 1000;
for (int f = 0, g = 0; g < query.length(); g++) {
int l = query.charAt(g);
if (l < 128) {
e[f++] = l; // 0{l[6-0]}
} else if (l < 2048) {
e[f++] = l >> 6 | 0xC0; // 110{l[10-6]}
e[f++] = l & 0x3F | 0x80; // 10{l[5-0]}
} else if (0xD800 == (l & 0xFC00) &&
g + 1 < query.length() && 0xDC00 == (query.charAt(g + 1) & 0xFC00)) {
// that's pretty rare... (avoid ovf?)
l = (1 << 16) + ((l & 0x03FF) << 10) + (query.charAt(++g) & 0x03FF);
e[f++] = l >> 18 | 0xF0; // 111100{l[9-8*]}
e[f++] = l >> 12 & 0x3F | 0x80; // 10{l[7*-2]}
e[f++] = l & 0x3F | 0x80; // 10{(l+1)[5-0]}
} else {
e[f++] = l >> 12 | 0xE0; // 1110{l[15-12]}
e[f++] = l >> 6 & 0x3F | 0x80; // 10{l[11-6]}
e[f++] = l & 0x3F | 0x80; // 10{l[5-0]}
}
resultSize = f;
}
return Arrays.copyOf(e, resultSize);
}
public long normalizeHash(long encondindRound2) {
if (encondindRound2 < 0) {
encondindRound2 = (encondindRound2 & 0x7fffffff) + 0x80000000L;
}
return (encondindRound2) % 1_000_000;
}
/*
/ EXAMPLE:
/
/ INPUT: query: 'hola', windowTkk: '409837.2120040981'
/ OUTPUT: '70528.480109'
/
*/
public String calcHash(String query, String windowTkk) {
// STEP 1: spread the query char codes on a byte-array, 1-3 bytes per char
int[] bytesArray = transformQuery(query);
// STEP 2: starting with TKK index,
// add the array from last step one-by-one, and do 2 rounds of shift+add/xor
String[] d = windowTkk.split("\\.");
int tkkIndex = 0;
try {
tkkIndex = Integer.valueOf(d[0]);
}
catch (Exception e) {
e.printStackTrace();
}
long tkkKey = 0;
try {
tkkKey = Long.valueOf(d[1]);
}
catch (Exception e) {
e.printStackTrace();
}
int current = 0;
long result = tkkIndex;
while (current < bytesArray.length) {
result += bytesArray[current];
long subresult = shiftLeftOrRightThenSumOrXor(result,
new String[] {"+-a", "^+6"});
result = subresult;
current++;
}
long encondingRound1 = result;
//System.out.println("encodingRound1: " + encondingRound1);
// STEP 3: apply 3 rounds of shift+add/xor and XOR with the TKK key
long encondingRound2 = ((int) shiftLeftOrRightThenSumOrXor(encondingRound1,
new String[] {"+-3", "^+b", "+-f"})) ^ ((int) tkkKey);
//System.out.println("encodingRound2: " + encondingRound2);
// STEP 4: Normalize to 2s complement & format
long normalizedResult = normalizeHash(encondingRound2);
//System.out.println("normalizedResult: " + normalizedResult);
return String.valueOf(normalizedResult) + "."
+ (((int) normalizedResult) ^ (tkkIndex));
}
}

Type conversion - string of characters to integer

Hello, I am writing my program in C, using PSoC tools to program my Cypress development kit. I am facing an issue regarding type conversion of a string of characters collected in my circular buffer (buffer) to a local variable "input_R", and ultimately to a global variable st_input_R. The event in my FSM that calls this action function is given below:
void st_state_5_event_0(void) //S6 OR S4
{
    char buffer[ST_NODE_LIMIT] = {0};
    st_copy_buffer(buffer);
    uint32 input_R = {0};
    mi_utoa(input_R, buffer);
    if ((input_R >= 19000) && (input_R <= 26000))
    {
        st_input_R = input_R;
        _st_data.state = ST_STATE_6;
    }
    else
    {
        _st_data.status = ST_STATE_4;
    }
    UART_1_Stop();
    st_stop();
    st_empty_buffer();
}
ST_NODE_LIMIT = 64
st_copy_buffer copies the numbers I type in using HyperTerminal into the circular buffer named "buffer".
input_R is the 32-bit integer I want the buffer content to be converted to.
mi_utoa is the function I am using to convert the contents of the buffer to input_R; it is detailed below:
uint8 mi_utoa(uint32 number, char *string)
{
    uint8 result = MI_BAD_ARGUMENT;
    if (string != NULL)
    {
        uint8 c = 0;
        uint8 i = 0;
        uint8 j = 0;
        do
        {
            string[i++] = number % 10 + '0';
        } while ((number /= 10) > 0);
        string[i] = '\0';
        for (i = 0, j = strlen(string) - 1; i < j; i++, j--)
        {
            c = string[i];
            string[i] = string[j];
            string[j] = c;
        }
        result = MI_SUCCESS;
    }
    return result;
}
The problem is: suppose I enter 21500(+\r). The mi_utoa function converts the first digit to 0 and the second digit to \000, while the other digits, including the carriage return "\r", remain unaltered. As a result, input_R is NOT equal to 21500. This happens for any string of digits I input, so the condition "if ((input_R >= 19000) && (input_R <= 26000))" is never satisfied. Hence the FSM returns to state 4 every time and I am going in circles.
Can you please advise where the bug is in the mi_utoa function? Let me know if you want any other details.
Your function st_state_5_event_0() sets the value of input_R to zero. Then you call mi_utoa(), which converts the value of input_R to an ASCII string, "0".
void st_state_5_event_0(void) //S6 OR S4
{
    char buffer[ST_NODE_LIMIT] = {0};
    //what is the value of buffer after this statement?
    st_copy_buffer(buffer);
    //the value of input_R after the next statement is =0
    uint32 input_R = {0};
    //conversion of input_R to string will give ="0"
    mi_utoa(input_R, buffer);
    if ((input_R >= 19000) && (input_R <= 26000))
    {
        st_input_R = input_R;
        _st_data.state = ST_STATE_6;
    }
    //...
}
You probably want a function which converts your ASCII buffer to a number.
uint8
mi_atou(uint32* number, char *string)
{
    uint8 result = MI_BAD_ARGUMENT;
    if (!string) return result;
    if (!number) return result;
    uint8 ndx = 0;
    uint32 accum = 0;
    for (ndx = 0; string[ndx]; ++ndx)
    {
        if ((string[ndx] >= '0') && (string[ndx] <= '9'))
        {
            accum = accum*10 + (string[ndx] - '0');
            //printf("[%d] %s -> %d\n", ndx, string, accum);
        }
        else break;
    }
    //printf("[%d] %s -> %d\n", ndx, string, accum);
    *number = accum;
    result = MI_SUCCESS;
    return result;
}
Which you would call by providing the address of the number to store the result,
mi_atou(&input_R, buffer);
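A quick sanity check against the input reported in the question (test values mine):

uint32 input_R = 0;
char buffer[] = "21500\r";
if (mi_atou(&input_R, buffer) == MI_SUCCESS)
{
    /* input_R is now 21500: the digit scan stops at '\r',
       so the carriage return no longer corrupts the value */
}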

Windows C API for UTF8 to 1252

I'm familiar with WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like:
UTF8 -> UTF16 -> 1252
I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single call?
I should probably just pull in the iconv library, but am feeling lazy.
Thanks
Windows 1252 is mostly equivalent to latin-1, aka ISO-8859-1: Windows-1252 just has some additional characters allocated in the latin-1 reserved range 128-159. If you are ready to ignore those extra characters, and stick to latin-1, then conversion is rather easy. Try this:
#include <stddef.h>
/*
* Convert from UTF-8 to latin-1. Invalid encodings, and encodings of
* code points beyond 255, are replaced by question marks. No more than
* dst_max_len bytes are stored in the destination array. Returned value
* is the length that the latin-1 string would have had, assuming a big
* enough destination buffer.
*/
size_t
utf8_to_latin1(char *src, size_t src_len,
char *dst, size_t dst_max_len)
{
unsigned char *sb;
size_t u, v;
u = v = 0;
sb = (unsigned char *)src;
while (u < src_len) {
int c = sb[u ++];
if (c >= 0x80) {
if (c >= 0xC0 && c < 0xE0) {
if (u == src_len) {
c = '?';
} else {
int w = sb[u];
if (w >= 0x80 && w < 0xC0) {
u ++;
c = ((c & 0x1F) << 6)
+ (w & 0x3F);
} else {
c = '?';
}
}
} else {
int i;
for (i = 6; i >= 0; i --)
if (!(c & (1 << i)))
break;
c = '?';
/* the lead byte has 7 - i leading one bits, so the
sequence has 6 - i continuation bytes to skip */
u += 6 - i;
}
}
if (v < dst_max_len)
dst[v] = (char)c;
v ++;
}
return v;
}
/*
* Convert from latin-1 to UTF-8. No more than dst_max_len bytes are
* stored in the destination array. Returned value is the length that
* the UTF-8 string would have had, assuming a big enough destination
* buffer.
*/
size_t
latin1_to_utf8(char *src, size_t src_len,
char *dst, size_t dst_max_len)
{
unsigned char *sb;
size_t u, v;
u = v = 0;
sb = (unsigned char *)src;
while (u < src_len) {
int c = sb[u ++];
if (c < 0x80) {
if (v < dst_max_len)
dst[v] = (char)c;
v ++;
} else {
int h = 0xC0 + (c >> 6);
int l = 0x80 + (c & 0x3F);
if (v < dst_max_len) {
dst[v] = (char)h;
if ((v + 1) < dst_max_len)
dst[v + 1] = (char)l;
}
v += 2;
}
}
return v;
}
Note that I make no guarantee about this code. This is completely untested.
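If it is useful, the length-returning interface above is meant to be used in two passes: one call to measure, one call to convert. An untested sketch:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char utf8[] = "caf\xC3\xA9"; /* "cafe" with U+00E9 e-acute, encoded as UTF-8 */
    size_t src_len = strlen(utf8);

    /* pass 1: a zero-length destination just measures the output */
    size_t needed = utf8_to_latin1(utf8, src_len, NULL, 0);

    /* pass 2: allocate and convert for real */
    char *latin1 = malloc(needed + 1);
    if (latin1 == NULL) return 1;
    utf8_to_latin1(utf8, src_len, latin1, needed);
    latin1[needed] = '\0';

    printf("latin-1 length: %lu\n", (unsigned long)needed);
    free(latin1);
    return 0;
}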

form a number using consecutive numbers

I was puzzled by one of the questions in a Microsoft interview, which is given below:
A function should accept a range (3 - 21) and print, for each number in the range, the combinations of consecutive numbers that sum to it, as given below:
3 = 1+2
5 = 2+3
6 = 1+2+3
7 = 3+4
9 = 4+5
10 = 1+2+3+4
11 = 5+6
12 = 3+4+5
13 = 6+7
14 = 2+3+4+5
15 = 1+2+3+4+5
17 = 8+9
18 = 5+6+7
19 = 9+10
20 = 2+3+4+5+6
21 = 10+11
21 = 1+2+3+4+5+6
Could you please help me in forming this sequence in C#?
Thanks,
Mahesh
So here is a straightforward/naive answer (in C++, and not tested, but you should be able to translate). It uses the fact that
1 + 2 + ... + n = n(n+1)/2,
which you have probably seen before. There are lots of easy optimisations that can be made here, which I have omitted for clarity.
void WriteAsSums (int n)
{
    for (int i = 0; i < n; i++)
    {
        for (int j = i; j < n; j++)
        {
            if (n == (j * (j+1) - i * (i+1))/2) // then n = (i+1) + (i+2) + ... + (j-1) + j
            {
                std::cout << n << " = ";
                for (int k = i + 1; k <= j; k++)
                {
                    std::cout << k;
                    if (k != j) // this is not the interesting bit
                        std::cout << " + ";
                    else
                        std::cout << std::endl;
                }
            }
        }
    }
}
This is some pseudocode to find all the combinations, if any exist:
function consecutive_numbers(n, m)
  list = [] // empty list
  list.push_back(m)
  while m != n
    if m > n
      first = list.remove_first
      m -= first
    else
      last = list.last_element
      if last <= 1
        return []
      end
      list.push_back(last - 1)
      m += last - 1
    end
  end
  return list
end

function all_consecutive_numbers(n)
  m = n / 2 + 1
  a = consecutive_numbers(n, m)
  while a != []
    print_combination(n, a)
    m = a.first - 1
    a = consecutive_numbers(n, m)
  end
end

function print_combination(n, a)
  print(n + " = ")
  print(a.remove_first)
  foreach element in a
    print(" + " + element)
  end
  print("\n")
end
A call to all_consecutive_numbers(21) would print:
21 = 11 + 10
21 = 8 + 7 + 6
21 = 6 + 5 + 4 + 3 + 2 + 1
I tested it in ruby (code here) and it seems to work. I'm sure the basic idea could easily be implemented in C# as well.
I like this problem. Here is a slick and slightly mysterious O(n) solution:
void DisplaySum (int n, int a, int b)
{
    std::cout << n << " = ";
    for (int i = a; i < b; i++) std::cout << i << " + ";
    std::cout << b << std::endl;
}

void WriteAsSums (int n)
{
    int N = 2*n; // if n = a + (a+1) + ... + b, then 2*n = (a+b)(b-a+1)
    for (int i = 1; i < N; i++)
    {
        if (!(N % i)) // i divides N; think of i as b-a+1 and N/i as a+b
        {
            int j = N / i;
            if ((j + i) % 2) // a+b and b-a+1 always have opposite parity
            {
                int a = (j - i + 1) / 2;
                int b = (j + i - 1) / 2;
                if (a > 0 && a < b) // exclude trivial & negative solutions
                    DisplaySum(n, a, b);
            }
        }
    }
}
Here's something in Groovy; you should be able to understand what's going on. It's not the most efficient code and doesn't produce the answers in the order you cite in your question (you seem to be missing some, though), but it might give you a start.
def f(a,b) {
    for (i in a..b) {
        for (j in 1..i/2) {
            def (sum, str, k) = [ 0, "", j ]
            while (sum < i) {
                sum += k
                str += "+$k"
                k++
            }
            if (sum == i) println "$i=${str[1..-1]}"
        }
    }
}
Output for f(3,21) is:
3=1+2
5=2+3
6=1+2+3
7=3+4
9=2+3+4
9=4+5
10=1+2+3+4
11=5+6
12=3+4+5
13=6+7
14=2+3+4+5
15=1+2+3+4+5
15=4+5+6
15=7+8
17=8+9
18=3+4+5+6
18=5+6+7
19=9+10
20=2+3+4+5+6
21=1+2+3+4+5+6
21=6+7+8
21=10+11
Hope this helps. It kind of conforms to the tenet of doing the simplest thing that could possibly work.
If we slice a into 2 consecutive terms, then a = b + (b+1) = 2*b + (0+1).
If we slice a into 3 consecutive terms, then a = b + (b+1) + (b+2) = 3*b + (0+1+2).
...
If we slice a into n consecutive terms, then a = b + (b+1) + ... + (b+n-1) = n*b + (0+1+...+(n-1)).
The last result is a = n*b + n*(n-1)/2, where a, b and n are all integers.
So the O(N) algorithm is:
void seq_sum(int a)
{
    // start from 2 terms
    int n = 2;
    while (1)
    {
        int value = a - n*(n-1)/2;
        // stop once b would have to be zero or negative
        if (value <= 0)
            break;
        // matches the equation we derived: a = n*b + n*(n-1)/2
        if (value % n == 0)
        {
            int b = value/n;
            // omit the print stage
            print("......");
        }
        n++;
    }
}
}