Decrypt a password hash

How do I decrypt a password given this:
Password:
Code:
01001101 01111010 01100111 00110001 01011010 01101010 01000001 00110101 01001101 01000111 01001001 00110000 01001111 01010111 01010001 01110111 01011001 01111010 01000101 01111010 01001101 01010100 01010101 00110000 01001101 01000100 01000110 01101100 01001110 00110010 01010101 01111000 01011010 01010111 01001110 01101101 01011010 01010100 01000001 01111001 01001101 00110010 01001101 00111101
password: ascii, base //////// md5 Happy (+1 x2)
I've been stumped for two days trying to figure this out :(

Wow, binary. Hexadecimal is generally used to display binary data.
The data decodes to ASCII characters, and Base64 at that, based on the trailing "=" byte (00111101 is hex 3D, which is "=").
In short, MD5 is a non-reversible one-way hash function. The only way to recover the original password is brute force: trying candidate passwords until you find a match or give up.
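The decodable outer layers can be peeled off with a few lines of Python (a sketch; the function name is mine, and only the first four binary groups are shown — paste the full posted line to decode everything):

```python
import base64

def binary_to_ascii(bits: str) -> str:
    """Convert space-separated 8-bit groups into the ASCII string they spell."""
    return ''.join(chr(int(group, 2)) for group in bits.split())

# First four groups of the posted binary; paste the whole line to decode it all.
bits = "01001101 01111010 01100111 00110001"
print(binary_to_ascii(bits))              # -> Mzg1  (start of a Base64 string)

# Once the full ASCII string is recovered, strip the Base64 layer the same way:
print(base64.b64decode("aGk=").decode())  # -> hi  (same technique, toy input)
```

What you end up with is a 32-character hex MD5 digest, and that innermost layer can only be attacked by brute force.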

Related

What does the \u{...} notation mean in UNICODE and why are only some characters displayed like this in the CLDR project?

In this link you will find the most used characters for each language. Why are some characters in some languages displayed under the \u{...} notation?
I think that what is in the brackets is the hexadecimal code of the character, but I can't understand why they would only do it with some characters.
The character sequences enclosed in curly brackets {} are digraphs (trigraphs, …) counted as distinct letters in the given language (supposedly with their own place in the alphabet), for instance:
the digraph {ch} in cs (Czech);
the trigraph {dzs} in hu (Hungarian);
more complex digraph examples in kkj (Kako), as the following Python snippet shows:
>>> kkj='[a á à â {a\u0327} b ɓ c d ɗ {ɗy} e é è ê ɛ {ɛ\u0301} {ɛ\u0300} {ɛ\u0302} {ɛ\u0327} f g {gb} {gw} h i í ì î {i\u0327} j k {kp} {kw} l m {mb} n {nd} nj {ny} ŋ {ŋg} {ŋgb} {ŋgw} o ó ò ô ɔ {ɔ\u0301} {ɔ\u0300} {ɔ\u0302} {ɔ\u0327} p r s t u ú ù û {u\u0327} v w y]'
>>> print( kkj)
[a á à â {a̧} b ɓ c d ɗ {ɗy} e é è ê ɛ {ɛ́} {ɛ̀} {ɛ̂} {ɛ̧} f g {gb} {gw} h i í ì î {i̧} j k {kp} {kw} l m {mb} n {nd} nj {ny} ŋ {ŋg} {ŋgb} {ŋgw} o ó ò ô ɔ {ɔ́} {ɔ̀} {ɔ̂} {ɔ̧} p r s t u ú ù û {u̧} v w y]
>>>
For instance, {a\u0327} renders as {a̧}, i.e. something like Latin Small Letter A with Combining Cedilla, which has no precomposed Unicode equivalent. A counterexample:
ņ (U+0146) Latin Small Letter N With Cedilla with decomposition 004E 0327:
>>> import unicodedata
>>> print( 'ņ', unicodedata.normalize('NFC','{n\u0327}'))
ņ {ņ}
Edit:
Characters presented as Unicode literals (\uxxxx = a character with the 16-bit hex value xxxx) are the unrenderable ones (or at least hard to render). The following Python script shows some of them (Bidi_Class values: L = Left_To_Right, R = Right_To_Left, NSM = Nonspacing_Mark, BN = Boundary_Neutral):
# -*- coding: utf-8 -*-
import unicodedata

# successive example strings; only the last assignment is used below
pa = 'ੱੰ਼੍ੁੂੇੈੋੌ'
pa = '\u0327 \u0A71 \u0A70 \u0A3C ੦ ੧ ੨ ੩ ੪ ੫ ੬ ੭ ੮ ੯ ੴ ੳ ਉ ਊ ਓ ਅ ਆ ਐ ਔ ੲ ਇ ਈ ਏ ਸ {ਸ\u0A3C} ਹ ਕ ਖ {ਖ\u0A3C} ਗ {ਗ\u0A3C} ਘ ਙ ਚ ਛ ਜ {ਜ\u0A3C} ਝ ਞ ਟ ਠ ਡ ਢ ਣ ਤ ਥ ਦ ਧ ਨ ਪ ਫ {ਫ\u0A3C} ਬ ਭ ਮ ਯ ਰ ਲ ਵ ੜ \u0A4D ਾ ਿ ੀ \u0A41 \u0A42 \u0A47 \u0A48 \u0A4B \u0A4C'
pa = '\u0300 \u0301 \u0302 \u1DC6 \u1DC7 \u0A71 \u0A70 \u0A3C \u0A4D \u0A41 \u0A42 \u0A47 \u0A48 \u0A4B \u0A4C \u05B7 \u05B8 \u05BF \u200C \u200D \u200E \u200F \u064B \u064C \u064E \u064F \u0650'
# above examples from ·kkj· ·bas· ·pa· ·yi· ·kn· ·ur· ·mzn·
print(pa)
for ch in pa:          # 'ch' rather than shadowing the built-in 'chr'
    if ch != ' ':
        if ch == '{' or ch == '}':
            print(ch)
        else:
            print('\\u%04x' % ord(ch), ch,
                  unicodedata.category(ch),
                  unicodedata.bidirectional(ch) + '\t',
                  str(unicodedata.combining(ch)) + '\t',
                  unicodedata.name(ch, '?'))
Result: .\SO\63659122.py
̀ ́ ̂ ᷆ ᷇ ੱ ੰ ਼ ੍ ੁ ੂ ੇ ੈ ੋ ੌ ַ ָ ֿ ‌ ‍ ‎ ‏ ً ٌ َ ُ ِ
\u0300 ̀ Mn NSM 230 COMBINING GRAVE ACCENT
\u0301 ́ Mn NSM 230 COMBINING ACUTE ACCENT
\u0302 ̂ Mn NSM 230 COMBINING CIRCUMFLEX ACCENT
\u1dc6 ᷆ Mn NSM 230 COMBINING MACRON-GRAVE
\u1dc7 ᷇ Mn NSM 230 COMBINING ACUTE-MACRON
\u0a71 ੱ Mn NSM 0 GURMUKHI ADDAK
\u0a70 ੰ Mn NSM 0 GURMUKHI TIPPI
\u0a3c ਼ Mn NSM 7 GURMUKHI SIGN NUKTA
\u0a4d ੍ Mn NSM 9 GURMUKHI SIGN VIRAMA
\u0a41 ੁ Mn NSM 0 GURMUKHI VOWEL SIGN U
\u0a42 ੂ Mn NSM 0 GURMUKHI VOWEL SIGN UU
\u0a47 ੇ Mn NSM 0 GURMUKHI VOWEL SIGN EE
\u0a48 ੈ Mn NSM 0 GURMUKHI VOWEL SIGN AI
\u0a4b ੋ Mn NSM 0 GURMUKHI VOWEL SIGN OO
\u0a4c ੌ Mn NSM 0 GURMUKHI VOWEL SIGN AU
\u05b7 ַ Mn NSM 17 HEBREW POINT PATAH
\u05b8 ָ Mn NSM 18 HEBREW POINT QAMATS
\u05bf ֿ Mn NSM 23 HEBREW POINT RAFE
\u200c ‌ Cf BN 0 ZERO WIDTH NON-JOINER
\u200d ‍ Cf BN 0 ZERO WIDTH JOINER
\u200e ‎ Cf L 0 LEFT-TO-RIGHT MARK
\u200f ‏ Cf R 0 RIGHT-TO-LEFT MARK
\u064b ً Mn NSM 27 ARABIC FATHATAN
\u064c ٌ Mn NSM 28 ARABIC DAMMATAN
\u064e َ Mn NSM 30 ARABIC FATHA
\u064f ُ Mn NSM 31 ARABIC DAMMA
\u0650 ِ Mn NSM 32 ARABIC KASRA
It seems that all codepoints that don't have a well-defined stand-alone appearance (or are not meant to be used as stand-alone characters) are represented with this notation.
For example, U+0A3C appears in the "character" {ਫ\u0A3C}: U+0A3C is a combining codepoint that modifies the codepoint before it.
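You can verify the precomposed-vs-not distinction with Python's unicodedata: NFC normalization composes a base letter with a combining mark only when a precomposed codepoint actually exists.

```python
import unicodedata

# 'n' + COMBINING CEDILLA composes: U+0146 LATIN SMALL LETTER N WITH CEDILLA exists.
composed = unicodedata.normalize('NFC', 'n\u0327')
print(len(composed), hex(ord(composed)))   # -> 1 0x146

# 'a' + COMBINING CEDILLA stays two codepoints: no precomposed form exists,
# which is why CLDR has to write it as {a\u0327}.
uncomposed = unicodedata.normalize('NFC', 'a\u0327')
print(len(uncomposed))                     # -> 2
```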

Convert 16-bit signed values (×2) to a 32-bit unsigned integer

I've got a problem with a Modbus device:
The device sends data using the Modbus protocol.
I read 4 bytes from the Modbus communication that represent a pressure value.
I have to convert these 4 bytes into an unsigned 32-bit integer.
Here is the Modbus documentation:
COMBINING 16bit REGISTERS TO 32bit VALUE
Pressure registers 2 & 3 in SENSOR INPUT REGISTER MAP of this guide are stored as u32 (UNSIGNED 32bit INTEGER)
You can calculate pressure manually :
1) Determine what display you have - if register values are positive skip to step 3.
2) Convert negative register 2 & 3 values from Signed to Unsigned (note: 65536 = 2^16):
(reg 2 value) + 65536 = 35464 ; (reg 3 value) + 65536 = 1
3) Shift register #3 as this is the upper 16 bits: 65536 * (converted reg 3 value) = 65536
4) Put two 16bit numbers together: (converted reg 2 value) + (converted reg 3 value) = 35464 + 65536 = 101000 Pa
Pressure information is then 101000 Pascal.
I don't find it very clear... For example, it doesn't show the 4 raw bytes behind this calculation.
So, if anybody has a formula to convert my bytes into a 32-bit unsigned int, it would be very helpful.
You should be able to read your bytes in some kind of numeric representation (hex, dec, bin, oct...).
Let's assume you're receiving the following byte frame:
in hex:
0x00, 0x06, 0x68, 0xA0
in bin:
0000 0000, 0000 0110, 0110 1000, 1010 0000
All of these are different representations of the same 4-byte value.
Another thing that you should know is the byte order (endianness):
If your frame is transmitted in big endian, you read the bytes in the order you received them (so 0x00, 0x06, 0x68, 0xA0 is correct).
If the frame is transmitted in little endian, you need to perform the following operation:
Switch the first 2 bytes with the last 2:
0x68, 0xA0, 0x00, 0x06
and then swap the first byte with the second, and the third with the fourth:
0xA0, 0x68, 0x06, 0x00
so if your frame is in little endian, the correct frame will be 0xA0, 0x68, 0x06, 0x00.
If you don't know the endianess, assume it's in big endian.
Now you simply have to 'put' your values together:
0x00, 0x06, 0x68, 0xA0 will become 0x000668A0
or
0000 0000, 0000 0110, 0110 1000, 1010 0000 will become 00000000000001100110100010100000
Once you have your hex or bin, you can convert it to an integer.
Here you can find an interesting tool for converting HEX to float, uint32, int32, and int16 in all endiannesses.
TL;DR
if you can use Python, you should use struct:
import struct

frame = [0x00, 0x06, 0x68, 0xA0]  # or [0, 6, 104, 160] in dec, or [0b00000000, 0b00000110, 0b01101000, 0b10100000] in bin
print(struct.unpack('>L', bytes(frame))[0])  # '>L' = big-endian unsigned 32-bit -> 420000
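The documentation's manual steps boil down to treating register 2 as the low 16 bits and register 3 as the high 16 bits, with any negative (signed) readings mapped to unsigned via `& 0xFFFF`. A minimal sketch (function name mine; register values taken from the documentation's worked example):

```python
def regs_to_u32(reg_low: int, reg_high: int) -> int:
    """Combine two 16-bit Modbus registers (possibly read back as signed
    values) into one unsigned 32-bit integer."""
    return ((reg_high & 0xFFFF) << 16) | (reg_low & 0xFFFF)

# The documentation's example: register 2 read as -30072 (signed),
# register 3 read as 1. -30072 & 0xFFFF == 35464, plus 1 << 16 == 65536.
pressure = regs_to_u32(-30072, 1)
print(pressure)  # -> 101000 (pascals)
```

The `& 0xFFFF` mask is exactly the "+65536 if negative" step from the documentation, and the `<< 16` shift is step 3.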

The number of characters in each unicode block

Does anyone know of any reference showing the number of characters in each Unicode block? (in newer versions such as 5.x or 6.0.0)
Thanks a lot.
http://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt contains the data you are interested in.
http://www.unicode.org/Public/6.0.0/ucd/ReadMe.txt contains some instructions and refers to http://unicode.org/reports/tr44/ for interpreting the data. In that document you should read http://unicode.org/reports/tr44/#UnicodeData.txt.
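If you just want a quick count for one block without parsing those files, you can also probe whichever Unicode version your Python build ships by checking which codepoints in a block's range have a name. A sketch for the Cyrillic block (U+0400–U+04FF, fully assigned in recent Unicode versions):

```python
import unicodedata

# Count assigned codepoints in the Cyrillic block (U+0400..U+04FF) by
# checking which ones have a character name in this Python's Unicode data.
assigned = sum(
    1 for cp in range(0x0400, 0x0500)
    if unicodedata.name(chr(cp), None) is not None
)
print(assigned)  # -> 256 with a modern Unicode database
```

Note this reflects your interpreter's bundled Unicode version, not necessarily 5.x or 6.0.0.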
unichars
Does this answer your question:
% unichars '\p{InCyrillic}' | wc -l
256
% unichars '\p{InEthiopic}' | wc -l
356
% unichars '\p{InLatin1}' | wc -l
128
% unichars '\p{InCombiningDiacriticalMarks}' | wc -l
112
To include the 16 astral planes, add -a:
% unichars -a '\p{InAncientGreekNumbers}' | wc -l
75
If you want unassigned or Han or Hangul, you need -u:
% unichars -u '\p{InEthiopic}' | wc -l
384
% unichars -u '\p{InCJKUnifiedIdeographsExtensionA}' | wc -l
6592
You can get other information, too:
% unichars '\P{IsGreek}' '\p{InGreek}'
ʹ 884 0374 GREEK NUMERAL SIGN
; 894 037E GREEK QUESTION MARK
΅ 901 0385 GREEK DIALYTIKA TONOS
· 903 0387 GREEK ANO TELEIA
Ϣ 994 03E2 COPTIC CAPITAL LETTER SHEI
ϣ 995 03E3 COPTIC SMALL LETTER SHEI
Ϥ 996 03E4 COPTIC CAPITAL LETTER FEI
ϥ 997 03E5 COPTIC SMALL LETTER FEI
Ϧ 998 03E6 COPTIC CAPITAL LETTER KHEI
ϧ 999 03E7 COPTIC SMALL LETTER KHEI
Ϩ 1000 03E8 COPTIC CAPITAL LETTER HORI
ϩ 1001 03E9 COPTIC SMALL LETTER HORI
Ϫ 1002 03EA COPTIC CAPITAL LETTER GANGIA
ϫ 1003 03EB COPTIC SMALL LETTER GANGIA
Ϭ 1004 03EC COPTIC CAPITAL LETTER SHIMA
ϭ 1005 03ED COPTIC SMALL LETTER SHIMA
Ϯ 1006 03EE COPTIC CAPITAL LETTER DEI
ϯ 1007 03EF COPTIC SMALL LETTER DEI
% unichars '\p{IsGreek}' '\P{InGreek}' | wc -l
250
% unichars '\P{IsGreek}' '\p{InGreek}' | wc -l
18
% unichars '\p{In=1.1}' | wc -l
6362
% unichars '\p{In=6.0}' | wc -l
15087
uniprops
Here’s uniprops:
% uniprops -l | grep -c 'Block='
84
% uniprops digamma 450 %
U+03DC ‹Ϝ› \N{ GREEK LETTER DIGAMMA }:
\w \pL \p{LC} \p{L_} \p{L&} \p{Lu}
All Any Alnum Alpha Alphabetic Assigned Greek Is_Greek InGreek Cased Cased_Letter LC Changes_When_Casefolded CWCF
Changes_When_Casemapped CWCM Changes_When_Lowercased CWL Changes_When_NFKC_Casefolded CWKCF Lu L Gr_Base
Grapheme_Base Graph GrBase Grek Greek_And_Coptic ID_Continue IDC ID_Start IDS Letter L_ Uppercase_Letter Print
Upper Uppercase Word XID_Continue XIDC XID_Start XIDS XPosixAlnum XPosixAlpha XPosixGraph XPosixPrint XPosixUpper
XPosixWord
U+0450 ‹ѐ› \N{ CYRILLIC SMALL LETTER IE WITH GRAVE }:
\w \pL \p{LC} \p{L_} \p{L&} \p{Ll}
All Any Alnum Alpha Alphabetic Assigned InCyrillic Cyrillic Is_Cyrillic Cased Cased_Letter LC Changes_When_Casemapped
CWCM Changes_When_Titlecased CWT Changes_When_Uppercased CWU Cyrl Ll L Gr_Base Grapheme_Base Graph GrBase
ID_Continue IDC ID_Start IDS Letter L_ Lowercase_Letter Lower Lowercase Print Word XID_Continue XIDC XID_Start XIDS
XPosixAlnum XPosixAlpha XPosixGraph XPosixLower XPosixPrint XPosixWord
U+0025 ‹%› \N{ PERCENT SIGN }:
\pP \p{Po}
All Any ASCII Assigned Common Zyyy Po P Gr_Base Grapheme_Base Graph GrBase Other_Punctuation Punct Pat_Syn
Pattern_Syntax PatSyn PosixGraph PosixPrint PosixPunct Print Punctuation XPosixGraph XPosixPrint XPosixPunct
Or even all these:
% uniprops -vag 777
U+0777 ‹ݷ› \N{ ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW }:
\w \pL \p{L_} \p{Lo}
\p{All} \p{Any} \p{Alnum} \p{Alpha} \p{Alphabetic} \p{Arab} \p{Arabic} \p{Assigned} \p{Is_Arabic} \p{InArabicSupplement} \p{L} \p{Lo} \p{Gr_Base} \p{Grapheme_Base} \p{Graph}
\p{GrBase} \p{ID_Continue} \p{IDC} \p{ID_Start} \p{IDS} \p{Letter} \p{L_} \p{Other_Letter} \p{Print} \p{Word} \p{XID_Continue} \p{XIDC} \p{XID_Start} \p{XIDS} \p{XPosixAlnum}
\p{XPosixAlpha} \p{XPosixGraph} \p{XPosixPrint} \p{XPosixWord}
\p{Age:5.1} \p{Script=Arabic} \p{Bidi_Class:AL} \p{Bidi_Class=Arabic_Letter} \p{Bidi_Class:Arabic_Letter} \p{Bc=AL} \p{Block:Arabic_Supplement} \p{Canonical_Combining_Class:0}
\p{Canonical_Combining_Class=Not_Reordered} \p{Canonical_Combining_Class:Not_Reordered} \p{Ccc=NR} \p{Canonical_Combining_Class:NR} \p{Decomposition_Type:None} \p{Dt=None}
\p{East_Asian_Width=Neutral} \p{East_Asian_Width:Neutral} \p{General_Category:L} \p{General_Category=Letter} \p{General_Category:Letter} \p{Gc=L} \p{General_Category:Lo}
\p{General_Category=Other_Letter} \p{General_Category:Other_Letter} \p{Gc=Lo} \p{Grapheme_Cluster_Break:Other} \p{GCB=XX} \p{Grapheme_Cluster_Break:XX}
\p{Grapheme_Cluster_Break=Other} \p{Hangul_Syllable_Type:NA} \p{Hangul_Syllable_Type=Not_Applicable} \p{Hangul_Syllable_Type:Not_Applicable} \p{Hst=NA} \p{Joining_Group:Yeh}
\p{Jg=Yeh} \p{Joining_Type:D} \p{Joining_Type=Dual_Joining} \p{Joining_Type:Dual_Joining} \p{Jt=D} \p{Line_Break:AL} \p{Line_Break=Alphabetic} \p{Line_Break:Alphabetic}
\p{Lb=AL} \p{Numeric_Type:None} \p{Nt=None} \p{Numeric_Value:NaN} \p{Nv=NaN} \p{Present_In:5.1} \p{In=5.1} \p{Present_In:5.2} \p{In=5.2} \p{Present_In:6.0} \p{In=6.0}
\p{Script:Arab} \p{Script:Arabic} \p{Sc=Arab} \p{Sentence_Break:LE} \p{Sentence_Break=OLetter} \p{Sentence_Break:OLetter} \p{SB=LE} \p{Word_Break:ALetter} \p{WB=LE}
\p{Word_Break:LE} \p{Word_Break=ALetter}
My uniprops and unichars should run anywhere running Perl version 5.10 or better. There’s also a uninames script that goes with them.
There's a list available here, although it does not specify which version of the standard it applies to.

UTF-8 & Unicode, what's with 0xC0 and 0x80?

I've been reading about Unicode and UTF-8 in the last couple of days and I often come across a bitwise comparison similar to this :
int strlen_utf8(char *s)
{
    int i = 0, j = 0;
    while (s[i])
    {
        if ((s[i] & 0xc0) != 0x80) j++;
        i++;
    }
    return j;
}
Can someone clarify the comparison with 0xc0 and checking if it's the most significant bit ?
Thank you!
EDIT: ANDed, not comparison, used the wrong word ;)
It's not a comparison with 0xc0, it's a bitwise AND operation with 0xc0.
The bit mask 0xc0 is 11 00 00 00, so what the AND is doing is extracting only the top two bits:
ab cd ef gh
AND 11 00 00 00
-- -- -- --
= ab 00 00 00
This is then compared to 0x80 (binary 10 00 00 00). In other words, the if statement is checking to see if the top two bits of the value are not equal to 10.
"Why?", I hear you ask. Well, that's a good question. The answer is that, in UTF-8, all bytes that begin with the bit pattern 10 are subsequent bytes of a multi-byte sequence:
UTF-8
Range Encoding Binary value
----------------- -------- --------------------------
U+000000-U+00007f 0xxxxxxx 0xxxxxxx
U+000080-U+0007ff 110yyyxx 00000yyy xxxxxxxx
10xxxxxx
U+000800-U+00ffff 1110yyyy yyyyyyyy xxxxxxxx
10yyyyxx
10xxxxxx
U+010000-U+10ffff 11110zzz 000zzzzz yyyyyyyy xxxxxxxx
10zzyyyy
10yyyyxx
10xxxxxx
So, what this little snippet is doing is going through every byte of your UTF-8 string and counting all the bytes that aren't continuation bytes (i.e., it's getting the length of the string, as advertised). See this Wikipedia link for more detail, and Joel Spolsky's excellent article for a primer.
An interesting aside, by the way: you can classify bytes in a UTF-8 stream as follows:
With the high bit set to 0, it's a single byte value.
With the two high bits set to 10, it's a continuation byte.
Otherwise, it's the first byte of a multi-byte sequence and the number of leading 1 bits indicates how many bytes there are in total for this sequence (110... means two bytes, 1110... means three bytes, etc).
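That classification is all the C function above relies on, and it translates almost directly into Python (a sketch; the function name is mine):

```python
def utf8_codepoint_len(data: bytes) -> int:
    """Count codepoints in UTF-8 data by skipping continuation bytes,
    i.e. bytes whose top two bits are 10."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)

s = "aé€"                    # a 1-byte, a 2-byte, and a 3-byte character
encoded = s.encode('utf-8')  # 6 bytes in total
print(len(encoded), utf8_codepoint_len(encoded))  # -> 6 3
```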

Need help identifying and computing a number representation

I need help identifying the following number format.
For example, the following number format in MIB:
0x94 0x78 = 2680
0x94 0x78 in binary: [1001 0100] [0111 1000]
It seems that if the MSB is 1, it means another character follows it. And if it is 0, it is the end of the number.
So the value 2680 is [001 0100] [111 1000], formatted properly is [0000 1010] [0111 1000]
What is this number format called, and what's a good way of computing it besides bit manipulation and shifting into a larger unsigned integer?
I have seen this called either 7bhm (7-bit has-more) or VLQ (variable length quantity); see http://en.wikipedia.org/wiki/Variable-length_quantity
This is stored big-endian (most significant byte first), as opposed to the C# BinaryReader.Read7BitEncodedInt method described at Encoding an integer in 7-bit format of C# BinaryReader.ReadString
I am not aware of any method of decoding other than bit manipulation.
Sample PHP code can be found at
http://php.net/manual/en/function.intval.php#62613
or in Python I would do something like
def encode_7bhm(i):
    # low 7 bits go last; use floor division so this works on Python 3
    o = [chr(i & 0x7f)]
    i //= 128
    while i > 0:
        o.insert(0, chr(0x80 | (i & 0x7f)))
        i //= 128
    return ''.join(o)

def decode_7bhm(s):
    o = 0
    for i in range(len(s)):
        v = ord(s[i])
        o = 128 * o + (v & 0x7f)
        if v & 0x80 == 0:
            # found end of encoded value
            break
    else:
        # ran out of string and end not found - error!
        raise TypeError
    return o
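As a cross-check of the format, here is a standalone decoder working directly on `bytes` (function name mine), applied to the question's example:

```python
def vlq_decode(data: bytes) -> int:
    """Decode a big-endian variable-length quantity: 7 payload bits per
    byte, with the high bit flagging that another byte follows."""
    n = 0
    for b in data:
        n = (n << 7) | (b & 0x7F)
        if not (b & 0x80):      # high bit clear: this was the last byte
            return n
    raise ValueError("unterminated VLQ")

print(vlq_decode(b'\x94\x78'))  # -> 2680, matching the question
```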