Find a string for which hash() starts with 0000 - hash

I've got a task from my professor and unfortunately I'm really confused.
The task:
Find a string D1 for which the first 4 bytes of hash(D1) are equal to 0.
So it should look like "0000....."
As far as I know, we cannot just decrypt a hash, and checking candidates one by one seems like pointless work.

In this case it seems like the work is not really "pointless." Rather, you are doing this work because your professor asked you to do it.
Some commenters have mentioned that you could look at the Bitcoin blockchain as a source of hashes, but this will only work if your hash of interest is the same one used by Bitcoin (double SHA-256).
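To make the double SHA-256 point concrete, here is a minimal sketch comparing a single pass with a double pass. The function names are hypothetical, chosen only for this illustration, and the sketch assumes your assigned hash is plain SHA-256.

```python
import hashlib

def single_sha256(data: bytes) -> bytes:
    """One pass of SHA-256 (assumed here to be the hash in the assignment)."""
    return hashlib.sha256(data).digest()

def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied to the SHA-256 digest of the data."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

if __name__ == "__main__":
    sample = b"hello"
    # The two digests differ, so an input with a zero prefix under one
    # construction says nothing about the other.
    print(single_sha256(sample).hex())
    print(double_sha256(sample).hex())
```

An input whose double SHA-256 starts with zeros will almost certainly not have a single SHA-256 that does, so a blockchain lookup only helps if the hash functions match.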
The easiest way to figure this out in general is just to brute force it:
Pseudo-code a la Python:
for x in range(10*2**32): # Any number bigger than about 4 billion should work
    x_str = str(x) # Any old method to generate some bytes to hash should work
    x_bytes = x_str.encode('utf-8')
    hash_bytes = hash(x_bytes) # assuming hash() returns bytes
    if hash_bytes[0:4] == b'\x00\x00\x00\x00':
        print("Found string: {}".format(x_str))
        break
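As a rough check on the "about 4 billion" figure, the sketch below estimates the work involved. It assumes the hash behaves like a uniformly random function, so each attempt succeeds independently with probability 2^-32; the function names are hypothetical.

```python
import math

def expected_attempts(zero_bits: int) -> int:
    """Average number of tries before a digest has `zero_bits` leading zero bits."""
    return 2 ** zero_bits

def failure_probability(zero_bits: int, attempts: int) -> float:
    """Chance that none of `attempts` independent tries succeeds.

    Computed as exp(attempts * log(1 - p)) to stay accurate for tiny p.
    """
    p_success = 2.0 ** -zero_bits
    return math.exp(attempts * math.log1p(-p_success))

if __name__ == "__main__":
    bits = 32  # four zero bytes
    print("expected tries:", expected_attempts(bits))
    print("P(no hit in 10*2**32 tries):", failure_probability(bits, 10 * 2 ** 32))
```

Under that assumption, the loop bound of 10*2**32 leaves only a very small chance of finishing without a match.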

I wrote a short python3 script, which repeatedly tries hashing random values until it finds a value whose SHA256 hash has four leading zero bytes:
import secrets
import hashlib
while True:
    p = secrets.token_bytes(64)
    h = hashlib.sha256(p).hexdigest()
    if h[0:8] == '00000000':
        break
print('SHA256(' + p.hex() + ')=' + h)
After running for a few minutes (on my ancient Dell laptop), it found a value whose SHA256 hash has four leading zero bytes:
SHA256(21368dc16afcb779fdd9afd57168b660b4ed786872ad55cb8355bdeb4ae3b8c9891606dc35d9f17c44219d8ea778d1ee3590b3eb3938a774b2cadc558bdfc8d4)=000000007b3038e968377f887a043c7dc216961c22f8776bbf66599acd78abf6
The following command-line command verifies this result:
echo -n '21368dc16afcb779fdd9afd57168b660b4ed786872ad55cb8355bdeb4ae3b8c9891606dc35d9f17c44219d8ea778d1ee3590b3eb3938a774b2cadc558bdfc8d4' | xxd -r -p | sha256sum
As expected, this produces:
000000007b3038e968377f887a043c7dc216961c22f8776bbf66599acd78abf6
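The same check can be sketched in Python instead of the shell. The helper names below are hypothetical; the candidate value is the hex input quoted above.

```python
import hashlib

def sha256_of_hex(hex_input: str) -> bytes:
    """Decode a hex string to raw bytes (like `xxd -r -p`) and hash it."""
    return hashlib.sha256(bytes.fromhex(hex_input)).digest()

def leading_zero_bytes(digest: bytes) -> int:
    """Count how many bytes at the start of the digest are zero."""
    count = 0
    for b in digest:
        if b != 0:
            break
        count += 1
    return count

if __name__ == "__main__":
    candidate = (
        "21368dc16afcb779fdd9afd57168b660b4ed786872ad55cb8355bdeb4ae3b8c9"
        "891606dc35d9f17c44219d8ea778d1ee3590b3eb3938a774b2cadc558bdfc8d4"
    )
    digest = sha256_of_hex(candidate)
    print(digest.hex(), "leading zero bytes:", leading_zero_bytes(digest))
```

Counting zero bytes on the raw digest avoids mixing up hex digits and bytes: four zero bytes correspond to eight leading '0' characters in the hex form.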
Edit 5/8/21
Optimized version of the script, based on my conversation with kelalaka in the comments below.
import secrets
import hashlib
N=0
p=secrets.token_bytes(32)
while True:
    h = hashlib.sha256(p).digest()
    N += 1
    if h.hex()[0:8] == '0'*8:
        break
    p = h
print('SHA256(' + p.hex() + ')=' + h.hex())
print('N=' + str(N))
Instead of generating a new random number in each iteration of the loop to use as the input to the hash function, this version of the script uses the output of the hash function from the previous iteration as the input to the hash function in the current iteration. On my system, this quadruples the number of iterations per second. It found a match in 1483279719 iterations in a little over 20 minutes:
$ time python3 findhash2.py
SHA256(69def040a417caa422dff20e544e0664cb501d48d50b32e189fba5c8fc2998e1)=00000000d0d49aaaf9f1e5865c8afc40aab36354bc51764ee2f3ba656bd7c187
N=1483279719
real 20m47.445s
user 20m46.126s
sys 0m0.088s

The sha256 hash of the string $Eo is 0000958bc4dc132ad12abd158073204d838c02b3d580a9947679a6
This was found using the code below, which restricts the string to printable keyboard characters (ASCII 0x21 to 0x7e, one byte each in UTF-8). It cycles through the hashes of each 1-character string (technically it hashes bytes, not strings), then each 2-character string, then each 3-character string, then each 4-character string (it never had to go to 4 characters, so I'm not 100% sure the math for that part of the function is correct).
The 'limit' value is included to prevent the code from running forever in case a match is not found. This ended up not being necessary, as a match was found in 29970 iterations and the execution time was nearly instantaneous.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from hashlib import sha256
utf8_chars = list(range(0x21,0x7f))
def make_str(attempt):
    if attempt < 94:
        c0 = [attempt%94]
    elif attempt >= 94 and attempt < 8836:
        c2 = attempt//94
        c1 = attempt%94
        c0 = [c2,c1]
    elif attempt >= 8836 and attempt < 830584:
        c3 = attempt//8836
        c2 = (attempt-8836*c3)//94
        c1 = attempt%94
        c0 = [c3,c2,c1]
    elif attempt >= 830584 and attempt < 78074896:
        c4 = attempt//830584
        c3 = (attempt-830584*c4)//8836
        c2 = ((attempt-830584*c4)-8836*c3)//94
        c1 = attempt%94
        c0 = [c4,c3,c2,c1]
    return bytes([utf8_chars[i] for i in c0])
target = '0000'
limit = 1200000
attempt = 0
hash_value = sha256()
hash_value.update(make_str(attempt))
while hash_value.hexdigest()[0:4] != target and attempt <= limit:
    hash_value = sha256()
    attempt += 1
    hash_value.update(make_str(attempt))
t = ''.join([chr(i) for i in make_str(attempt)])
print([t, attempt])
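The hand-written base-94 arithmetic can also be left to itertools.product. The following is only a sketch of the same idea, not the script above: find_prefix_match is a hypothetical name, and the alphabet is ordered differently (letters, digits, then punctuation), so the first match it reports may not be the same string.

```python
import hashlib
import itertools
import string

# The same 94 printable, non-whitespace ASCII characters as range(0x21, 0x7f),
# but in a different order.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def find_prefix_match(prefix: str, max_len: int = 4):
    """Return (candidate, tries) for the first string whose SHA-256 hex digest
    starts with `prefix`, trying all strings of length 1..max_len in order.
    Returns None if nothing matches."""
    tries = 0
    for length in range(1, max_len + 1):
        for chars in itertools.product(ALPHABET, repeat=length):
            tries += 1
            candidate = "".join(chars)
            digest = hashlib.sha256(candidate.encode("ascii")).hexdigest()
            if digest.startswith(prefix):
                return candidate, tries
    return None

if __name__ == "__main__":
    print(find_prefix_match("0000"))
```

Bounding the search by max_len plays the role of the 'limit' value: the loop ends on its own once every string up to that length has been tried.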


Perl - Forming a value from bits lying between two (16 bit) fields (data across word boundaries)

I am reading some data into 16 bit data words, and extracting VALUES from parts of the 16 bit words. Some of the values I need straddle the word boundaries.
I need to take the bits from the first word and some from the second word and join them to form a value.
I am thinking of the best way to do this. I could bit shift stuff all over the place and compose the data that way, but I am thinking there must be perhaps an easier/better way because I have many cases like this and the values are in some case different sizes (which I know since I have a data map).
For instance:
[TTTTTDDDDPPPPYYY] - 16 bit field
[YYYYYWWWWWQQQQQQ] - 16 bit field
TTTTT = 5 bit value, easily extracted
DDDD = 4 bit value, easily extracted
WWWWW = 5 bit value, easily extracted
QQQQQQ = 6 bit value, easily extracted
YYYYYYYY = 8 bit value, which straddles the word boundaries. What is the best way to extract this? In my case I have a LOT of data like this, so elegance/simplicity in a solution is what I seek.
Aside - In Perl what are the limits of left shifting? I am on a 32 bit computer, am I right to guess that my (duck) types are 32 bit variables and that I can shift that far, even though I unpacked the data as 16 bits (unpack with type n) into a variable? This situation came up in the case of trying to extract a 31 bit variable that lies between two 16 bit fields.
Lastly (someone may ask), reading/unpacking the data into 32 bit words does not help me as I still face the same issue - Data is not aligned on word boundaries but crosses it.
The size of your integers is given (in bytes) by perl -V:ivsize, or programmatically using use Config qw( %Config ); $Config{ivsize}. They'll have at least 32 bits in a 32-bit build (since they are guaranteed to be large enough to hold a pointer). That means you can use
my $i = ($hi << 16 | $lo); # TTTTTDDDDPPPPYYYYYYYYWWWWWQQQQQQ
my $q = ($i >> 0) & (2**6-1);
my $w = ($i >> 6) & (2**5-1);
my $y = ($i >> 11) & (2**8-1);
my $p = ($i >> 19) & (2**4-1);
my $d = ($i >> 23) & (2**4-1);
my $t = ($i >> 27) & (2**5-1);
If you wanted to stick to 16 bits, you could use the following:
my $y = ($hi & 0x7) << 5 | ($lo >> 11);
00000[00000000YYY     ]
     [           YYYYY]WWWWWQQQQQQ
     ------------------
     [00000000YYYYYYYY]
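Since the question mentions many fields of different sizes described by a data map, a generic helper may be tidier than one shift expression per field. The sketch below shows the idea in Python, where integers have no fixed width; extract_bits and its parameters are hypothetical names, and bit offset 0 is taken to be the most significant bit of the first word.

```python
def extract_bits(words, start, length, word_bits=16):
    """Return the `length`-bit value beginning at bit offset `start`,
    counted from the most significant bit of the first word."""
    total_bits = len(words) * word_bits
    if start < 0 or length < 0 or start + length > total_bits:
        raise ValueError("field does not fit in the supplied words")
    # Join all words into one big integer, first word most significant.
    combined = 0
    for w in words:
        combined = (combined << word_bits) | (w & ((1 << word_bits) - 1))
    # Drop the bits to the right of the field, then mask off the bits to the left.
    shift = total_bits - (start + length)
    return (combined >> shift) & ((1 << length) - 1)

if __name__ == "__main__":
    # Layout from the question: T(5) D(4) P(4) Y(8) W(5) Q(6)
    hi, lo = 0b1010110010110110, 0b0101010011101101
    field_map = {"T": (0, 5), "D": (5, 4), "P": (9, 4),
                 "Y": (13, 8), "W": (21, 5), "Q": (26, 6)}
    for name, (start, length) in field_map.items():
        print(name, extract_bits([hi, lo], start, length))
```

With a table of (start, length) pairs, a field that straddles a word boundary needs no special handling.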

Different value when using fprintf or sprintf

I've written a function (my first, so don't be too quick to judge) in MATLAB, which is supposed to write a batch file based on 3 input parameters:
write_BatchFile(setup,engine,np)
Here setup consists of one or more strings, engine consists of one string only and np is a number, e.g.:
setup = {'test1.run';'test2.run';'test3.run'};
engine = 'Engine.exe';
np = 4; % number of processors/cores
I'll leave out the first part of my script, which is a bit more extensive, but in case necessary I can provide the entire script afterwards. Anyhow, once all 3 parameters have been determined, which it does successfully, I wrote the following, which is the last part of my script:
%==========================================================================
% Start writing the batch file
%==========================================================================
tmpstr = sprintf('\nWriting batch file batchRunMPI.bat...');
disp(tmpstr); clear tmpstr;
filename = 'batchRunMPI.bat';
fid = fopen(filename,'w');
fprintf(fid,'set OMP_NUM_THREADS=1\n');
for i = 1:length(setup);
    fprintf(fid,'mpiexec -n %d -localonly "%s" "%s"\n',np,engine,setup{i});
    fprintf(fid,'move %s.log %s.MPI_%d.log\n',setupname{i},setupname{i},np);
end
fclose all;
disp('Done!');
NOTE setupname follows using fileparts:
[~,setupname,setupext] = fileparts(setup);
However, when looking at the resulting batch file I end up getting the value 52 where I indicate my number of cores (= 4), e.g.:
mpiexec -n 52 -localonly "Engine.exe" "test1.run"
mpiexec -n 52 -localonly "Engine.exe" "test2.run"
mpiexec -n 52 -localonly "Engine.exe" "test3.run"
Instead, I'd want the result to be:
mpiexec -n 4 -localonly "Engine.exe" "test3.run", etc
When I check the value of np it returns 4, so I'm confused where this 52 comes from.
My feeling is that it's a very simple solution which I'm just unaware of, but I haven't been able to find anything on this so far, which is why I'm posting here. All help is appreciated!
-Daniel
It seems that at some stage np is being converted to a string. The character '4' has the integer value 52, which explains what you're getting. You've got a few options:
a) Figure out where np is being converted to a string and change it.
b) Change the %d to a %s, so you get '4' instead of 52.
c) Change the np part of the fprintf statement to str2double(np).
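The character-code point can be illustrated in Python. This is an analogy only, since the conversion in the question is performed by MATLAB's fprintf; the helper names are hypothetical.

```python
def char_code(ch: str) -> int:
    """Integer code of a single character, e.g. the digit character '4'."""
    return ord(ch)

def as_number(value):
    """Accept a number or its text form and return a number,
    roughly the role str2double plays in option (c)."""
    return float(value) if isinstance(value, str) else value

if __name__ == "__main__":
    np_text = "4"                 # np accidentally stored as text
    print(char_code(np_text))     # the code of the character, not the number 4
    print(int(as_number(np_text)))
```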

Algorithm for finding all possible key combinations in given range

Recently I got curious about how long it would take to break my password using a brute-force attack. I'd like to check it.
So, how should I implement an algorithm to find all possible key combinations in a given range (e.g. 15 letters)? I found algorithms for permutations, but they all swap letters of a given word, which is not what I'm looking for.
Assuming that passwords can consist of combinations of 89 possible characters (a-z, A-Z, 0-9, space, and all the different symbol keys on a Windows keyboard), there are 89 to the 15th power different combinations of 15 characters (89 * 89 * 89 ...). In other words, a lot.
If you want to use just letters, and you differentiate between upper and lower case, there would be 52 ** 15 possible 15-letter combinations. If you want to take in the possibility of shorter strings as well you could write something like (pseudocode):
long combos = 0
for i = 6 TO 20 -- legal password lengths
    combos = combos + POW(52, i)
print "there are " + combos.ToString()
    + " possible passwords between 6 and 20 characters"
To actually enumerate and print the permutations in C# you could do:
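static void AddNextCharAndPrintIfDone(string pwd, int maxLen)
{
    // Letters only, upper and lower case (52 symbols).
    // Note: a loop from 'a' to 'Z' never runs, because 'a' (97) > 'Z' (90).
    const string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    foreach (char c in alphabet)
    {
        // Extend a copy so every sibling starts from the same prefix
        string next = pwd + c;
        if (next.Length >= maxLen)
            System.Console.WriteLine(next);
        else
            AddNextCharAndPrintIfDone(next, maxLen);
    }
}
static void Main()
{
    for (int i = 6; i <= 20; i++) // legal password lengths
        AddNextCharAndPrintIfDone("", i);
}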
Not really written for efficiency, but if you have enough memory and time, you'll get every possible permutation.
You can also download the PHP PEAR package Math_Combinatorics to generate those passwords.
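For comparison, the counting and the enumeration can both be sketched in a few lines of Python with itertools.product. The function names are hypothetical, and the 52-letter alphabet and the 6 to 20 length range are the same assumptions as in the pseudocode above.

```python
import itertools
import string

def count_passwords(alphabet_size: int, min_len: int, max_len: int) -> int:
    """Number of strings with length in [min_len, max_len] over the alphabet."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))

def enumerate_passwords(alphabet: str, length: int):
    """Lazily yield every string of exactly `length` characters."""
    for chars in itertools.product(alphabet, repeat=length):
        yield "".join(chars)

if __name__ == "__main__":
    print(count_passwords(52, 6, 20), "possible passwords between 6 and 20 letters")
    # Print only the first few candidates; the full list is far too long.
    for pwd in itertools.islice(enumerate_passwords(string.ascii_letters, 6), 5):
        print(pwd)
```

Because the generator is lazy, memory is not the limit here; running time is. Dividing the count by the number of guesses per second an attacker can make gives a rough upper bound on the time needed.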

XOR, MD5 and Base64 encoding issue

I need to get a value whose first 16 characters are TZxy2o2h2I2NMVR+, for which I have a formula. The formula goes like this: Base64(XOR("KonstantaZaLDAP", MD5(521009)) + XOR(521009, "KonstantaZaLDAP")), or in words:
I have two values:
int radID = 521009
String konst = "KonstantaZaLDAP"
The first step is to apply the XOR operation to konst and the MD5 hash value of radID >> XOR(konst, MD5(radID))
Second, I need to apply the XOR operation to radID and konst >> XOR(radID, konst).
After that I should concatenate the values from the first and second steps >> XOR(konst, MD5(radID)) + XOR(radID, konst), and finally Base64-encode the concatenated value.
That is Base64(XOR(konst, MD5(radID)) + XOR(radID, konst)).
I have tried to produce the wanted value, but whatever I do, I get the first 13 characters right, and after that it's all wrong. The value I get is TZxy2o2h2l2NMfUfpPmJNA==
Can anyone help?

Code Golf - Word Scrambler

Please answer with the shortest possible source code for a program that converts an arbitrary plaintext to its corresponding ciphertext, following the sample input and output I have given below. Bonus points* for the least CPU time or the least amount of memory used.
Example 1:
Plaintext: The quick brown fox jumps over the lazy dog. Supercalifragilisticexpialidocious!
Ciphertext: eTh kiquc nobrw xfo smjup rvoe eth yalz .odg !uioiapeislgriarpSueclfaiitcxildcos
Example 2:
Plaintext: 123 1234 12345 123456 1234567 12345678 123456789
Ciphertext: 312 4213 53124 642135 7531246 86421357 975312468
Rules:
Punctuation is defined to be included with the word it is closest to.
The center of a word is defined to be ceiling((strlen(word)+1)/2).
Whitespace is ignored (or collapsed).
Odd words move to the right first. Even words move to the left first.
You can think of it as reading every other character backwards (starting from the end of the word), followed by the remaining characters forwards. Corporation => XoXpXrXtXoX => niaorCoprto.
Thank you to those who pointed out the inconsistency in my description. This has led many of you down the wrong path, for which I apologize. Rule #4 should clear things up.
*Bonus points will only be awarded if Jeff Atwood decides to do so. Since I haven't checked with him, the chances are slim. Sorry.
Python, 50 characters
For input in i:
' '.join(x[::-2]+x[len(x)%2::2]for x in i.split())
Alternate version that handles its own IO:
print ' '.join(x[::-2]+x[len(x)%2::2]for x in raw_input().split())
A total of 66 characters if including whitespace. (Technically, the print could be omitted if running from a command line, since the evaluated value of the code is displayed as output by default.)
Alternate version using reduce:
' '.join(reduce(lambda x,y:y+x[::-1],x) for x in i.split())
59 characters.
Original version (both even and odd go right first) for an input in i:
' '.join(x[::2][::-1]+x[1::2]for x in i.split())
48 characters including whitespace.
Another alternate version which (while slightly longer) is slightly more efficient:
' '.join(x[len(x)%2-2::-2]+x[1::2]for x in i.split())
(53 characters)
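For readers who want to see what the slicing does, here is an ungolfed Python 3 sketch of the 50-character expression above (the golfed versions use Python 2's print statement and raw_input). The function names are hypothetical.

```python
def scramble_word(word: str) -> str:
    """Every other character read backwards from the end of the word,
    followed by the remaining characters read forwards."""
    backwards = word[::-2]
    # The skipped characters start at index 1 for odd-length words
    # and at index 0 for even-length words.
    forwards = word[len(word) % 2::2]
    return backwards + forwards

def scramble(text: str) -> str:
    """Scramble each whitespace-separated word; runs of whitespace collapse."""
    return " ".join(scramble_word(w) for w in text.split())

if __name__ == "__main__":
    print(scramble("123 1234 12345 123456 1234567 12345678 123456789"))
```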
J, 58 characters
>,&.>/({~(,~(>:#+:#i.#-#<.,+:#i.#>.)#-:)#<:##)&.><;.2,&' '
Haskell, 64 characters
unwords.map(map snd.sort.zip(zipWith(*)[0..]$cycle[-1,1])).words
Well, okay, 76 if you add in the requisite "import List".
Python - 69 chars
(including whitespace and linebreaks)
This handles all I/O.
for w in raw_input().split():
    o=""
    for c in w:o=c+o[::-1]
    print o,
Perl, 78 characters
For input in $_. If that's not acceptable, add six characters for either $_=<>; or $_=$s; at the beginning. The newline is for readability only.
for(split){$i=length;print substr$_,$i--,1,''while$i-->0;
print"$_ ";}print $/
C, 140 characters
Nicely formatted:
main(c, v)
    char **v;
{
    for( ; *++v; )
    {
        char *e = *v + strlen(*v), *x;
        for(x = e-1; x >= *v; x -= 2)
            putchar(*x);
        for(x = *v + (x < *v-1); x < e; x += 2)
            putchar(*x);
        putchar(' ');
    }
}
Compressed:
main(c,v)char**v;{for(;*++v;){char*e=*v+strlen(*v),*x;for(x=e-1;x>=*v;x-=2)putchar(*x);for(x=*v+(x<*v-1);x<e;x+=2)putchar(*x);putchar(32);}}
Lua
130 char function, 147 char functioning program
Lua doesn't get enough love in code golf -- maybe because it's hard to write a short program when you have long keywords like function/end, if/then/end, etc.
First I write the function in a verbose manner with explanations, then I rewrite it as a compressed, standalone function, then I call that function on the single argument specified at the command line.
I had to format the code with <pre></pre> tags because Markdown does a horrible job of formatting Lua.
Technically you could get a smaller running program by inlining the function, but it's more modular this way :)
t = "The quick brown fox jumps over the lazy dog. Supercalifragilisticexpialidocious!"
T = t:gsub("%S+", -- for each word in t...
  function(w) -- argument: current word in t
    W = "" -- initialize new Word
    for i = 1,#w do -- iterate over each character in word
      c = w:sub(i,i) -- extract current character
      -- determine whether letter goes on right or left end
      W = (#w % 2 ~= i % 2) and W .. c or c .. W
    end
    return W -- swap word in t with inverted Word
  end)
-- code-golf unit test
assert(T == "eTh kiquc nobrw xfo smjup rvoe eth yalz .odg !uioiapeislgriarpSueclfaiitcxildcos")
-- need to assign to a variable and return it,
-- because gsub returns a pair and we only want the first element
f=function(s)c=s:gsub("%S+",function(w)W=""for i=1,#w do c=w:sub(i,i)W=(#w%2~=i%2)and W ..c or c ..W end return W end)return c end
--       1         2         3         4         5         6         7         8         9        10        11        12        13
--34567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
-- 130 chars, compressed and written as a proper function
print(f(arg[1]))
--34567890123456
-- 16 (+1 whitespace needed) chars to make it a functioning Lua program,
-- operating on command line argument
Output:
$ lua insideout.lua 'The quick brown fox jumps over the lazy dog. Supercalifragilisticexpialidocious!'
eTh kiquc nobrw xfo smjup rvoe eth yalz .odg !uioiapeislgriarpSueclfaiitcxildcos
I'm still pretty new at Lua so I'd like to see a shorter solution if there is one.
For a minimal cipher on all args to stdin, we can do 111 chars:
for _,w in ipairs(arg)do W=""for i=1,#w do c=w:sub(i,i)W=(#w%2~=i%2)and W ..c or c ..W end io.write(W ..' ')end
But this approach does output a trailing space like some of the other solutions.
For an input in s:
f=lambda t,r="":t and f(t[1:],len(t)&1and t[0]+r or r+t[0])or r
" ".join(map(f,s.split()))
Python, 90 characters including whitespace.
TCL
125 characters
set s set f foreach l {}
$f w [gets stdin] {$s r {}
$f c [split $w {}] {$s r $c[string reverse $r]}
$s l "$l $r"}
puts $l
Bash - 133, assuming input is in $w variable
Pretty
for x in $w; do
    z="";
    for l in `echo $x|sed 's/\(.\)/ \1/g'`; do
        if ((${#z}%2)); then
            z=$z$l;
        else
            z=$l$z;
        fi;
    done;
    echo -n "$z ";
done;
echo
Compressed
for x in $w;do z="";for l in `echo $x|sed 's/\(.\)/ \1/g'`;do if ((${#z}%2));then z=$z$l;else z=$l$z;fi;done;echo -n "$z ";done;echo
Ok, so it outputs a trailing space.