Error in hexadecimal conversion to base58-btc - perl

I am using the conversion algorithm from http://lenschulwitz.com/base58 (its Perl code).
MANY GOOD CONVERSIONS, e.g.: 18e559fc6cb0e8de2ce8b50007d474a0d886208e698a07948671e0df520c1525 converts to 2gBdDRXoLPEhgf9Zd7zw5ujK1qcoPZoendBQJ22VjgqS, all 44 digits.
BAD CONVERSION: 0ab3de5e16675aeb0c4831f5218901fec56f39cc8ad16e5559be4a0ee211f5d0 converts to in9v3fi1cntD6ERD6QryMJq4r5BncjYZ32xZA6Uj4ST, only 43 digits!
Another BAD one: 00000000000000000000000000000000000000000000000000000000000000d0 converts to 11111111111111111111111111111114b.
What is wrong with the Perl code?
Can I use some kind of padding in base58-btc?
PS: I could install something like libbase58-0 (sudo apt-get install libbase58-0), which is reliable on Ubuntu, but I would need a Perl interface for it.

Rather than debugging that poorly laid-out code, I suggest that you install Encode::Base58
or
Encode::Base58::GMP
Both of those modules have maintainers who will answer questions if you find you are getting incorrect results.

I tested with other converters, such as the bs58 JS library; all of them produce the same, consistent results.
It seems to be an XY problem... perhaps the real question is "does Bitcoin base58 use a fixed-size representation?" Can I use something like padding with 1s?
But since that is also part of the answer, I did not edit the question (reverted it) and wrote an answer instead.
... It is a conversion between "incompatible" bases (!), so I thought it was impossible.
... 58 = 2*29 is not a power of 2, unlike 16 = 2^4, so the digit boundaries of the two bases never line up... But:
base58("E") = hex("0D"); base58("1E") = hex("000D"); ... the padding digits in one base map to padding digits in the other...
So what is the problem with cutting/adding padding? base58("E") = hex("D"); base58("1E") = hex("D"); base58("E") = hex("000000D"); base58("1111111111E") = hex("D"); ... It does not seem to be a problem, so a padding algorithm (fill with 0s or 1s) can be used.
SOLUTION: OK, let's pad: fill with leading 1s whenever the converted value has fewer than 44 digits.
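For reference, here is a minimal pure-Perl sketch of that idea using the core Math::BigInt module (the helper name, the optional width parameter, and the sample call are my own, not from the linked code): it converts the hex string by repeated division by 58, emits one leading '1' per leading zero byte as Bitcoin-style base58 does, and then left-pads with '1' up to a fixed width if one is given.

use strict;
use warnings;
use Math::BigInt;

my @B58 = split //, '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

sub hex_to_base58btc {
    my ($hex, $width) = @_;
    (my $h = lc $hex) =~ s/^0x//;
    my ($zeros) = $h =~ /^((?:00)*)/;          # leading zero bytes
    my $n   = Math::BigInt->from_hex("0x$h");
    my $out = '';
    while ($n > 0) {
        my ($q, $r) = $n->bdiv(58);            # quotient and remainder
        $out = $B58[$r->numify] . $out;
        $n   = $q;
    }
    $out = ('1' x (length($zeros) / 2)) . $out;    # one '1' per leading zero byte
    $out = ('1' x ($width - length $out)) . $out   # fixed-width padding, as in the SOLUTION above
        if defined $width && length($out) < $width;
    return $out;
}

# pads the 43-digit result of the first "bad" input up to 44 digits
print hex_to_base58btc('0ab3de5e16675aeb0c4831f5218901fec56f39cc8ad16e5559be4a0ee211f5d0', 44), "\n";

Keep in mind that a strict Bitcoin-style decoder treats every leading '1' as a zero byte, so fixed-width padding only round-trips cleanly if the decoder also knows the expected byte length (32 bytes here).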

Related

Two digits after the decimal point in a real number in Pascal

So, for example, I have a real number, let's say 17.4578, but I want to display it in Pascal with only two digits after the decimal point, so it shows as 17.45.
What do I write in my program?
Write(17.4578:0:2)
will display the number 17.46.
Generally, arguments look like this → Value : field_width : decimal_field_width
This would work. However, if this is the last line of the program, always remember to add a readln at the end:
Writeln(17.4578:0:2)
This outputs 17.46 because the value rounds up (the digit after the second decimal place is a 7).
Hope this helps
Use :0:2 after the real number:
writeln(17.4578:0:2)

Barcode4j UPCA scancode with leading zero

I'm trying to add a barcode to a report using Jasper Reports in UPCA format. My scancodes are strings, but Barcode4j expects a 12-digit Integer. I can convert the string to an Integer, but if there is a leading zero it is lost, and my scancode is then one digit too short.
So, how can I use the UPCA format with scancodes that have a leading 0 and keep the leading 0?
Barbecue seems to have the same issues, so I don't imagine using it as opposed to Barcode4j will solve this issue.
Well, it all came down to my test data. Using random numbers was what caused it to fail. When I looked up UPCA examples and used those, it worked fine. I didn't have to parse the value as an Integer, either.
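For what it's worth, my guess as to why the random data failed is that the 12th digit of a UPCA code is a check digit, so a random 12-digit value is almost never valid. Here is a small sketch of computing it (in Perl, simply because the rest of this page uses it; the prefix value is just an example), which also keeps the scancode a string so leading zeros survive:

use strict;
use warnings;

sub upca_check_digit {
    my ($first11) = @_;
    die "need exactly 11 digits\n" unless $first11 =~ /^\d{11}$/;
    my @d   = split //, $first11;
    my $sum = 0;
    for my $i (0 .. 10) {
        # odd positions (1st, 3rd, ...) are weighted 3, even positions 1
        $sum += $d[$i] * ($i % 2 == 0 ? 3 : 1);
    }
    return (10 - $sum % 10) % 10;
}

my $prefix = '03600029145';                       # example 11-digit prefix
print $prefix . upca_check_digit($prefix), "\n";  # prints 036000291452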

Doing a hash by hand/mathematically

I want to learn how to do a hash by hand (like with paper and pencil). Is this feasible? Any pointers on where to learn about this would be appreciated.
That depends on the hash you want to do. You can do a really simple hash by hand pretty easily -- for example, one trivial one is to take the ASCII values of the characters and add them together, typically doing something like a left-rotate between additions. So, to hash the string "Hash", we start with the ASCII values of the letters (in hex): 48 61 73 68. We load the first value and add the others to it, rotating our result left 4 bits (in a 16-bit word) between additions:
0048 + 0061 = 00A9
00A9 <<< 4 = 0A90
0A90 + 0073 = 0B03
0B03 <<< 4 = B030
B030 + 0068 = B098
Result: B098
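If you want to check that arithmetic mechanically, here is a quick Perl transcription of the same toy hash (the names are mine, and this is not a standard algorithm): the accumulator starts with the first character, each later character is added, and the 16-bit value is rotated left 4 bits between additions.

use strict;
use warnings;

# rotate a 16-bit value left by 4 bits
sub rol4 { my $v = shift; return (($v << 4) | ($v >> 12)) & 0xFFFF }

sub toy_hash {
    my @codes = map { ord } split //, shift;
    my $h = shift @codes;                  # start with the first character
    while (@codes) {
        $h = ($h + shift @codes) & 0xFFFF;
        $h = rol4($h) if @codes;           # rotate between additions, not after the last one
    }
    return sprintf '%04X', $h;
}

print toy_hash('Hash'), "\n";   # B098, matching the steps above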
Doing a cryptographic hash by hand would be a rather different story. It's certainly still possible, but would be extremely tedious, to put it mildly. A cryptographic hash is typically quite a bit more complex, and (more importantly) almost always has a lot of "rounds", meaning that you basically repeat a set of steps a number of times to get from the input to the output. Speaking from experience, just stepping through SHA-1 in a debugger to be sure you've implemented it correctly is a pain -- doing it all by hand would be pretty awful (but as I said, certainly possible anyway).
You can start by looking at: Hash function
I would suggest trying a CRC, since it seems to me to be the easiest to do by hand: https://en.wikipedia.org/wiki/CRC32#Computation .
You can use a smaller width than the standard one (it's usually 32 bits) to make things easier.
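As a rough illustration of that suggestion (my own sketch, not taken from the linked article), here is a bit-by-bit CRC-8 in Perl; 8 bits and the common 0x07 polynomial keep it short enough to follow on paper.

use strict;
use warnings;

sub crc8 {
    my ($data) = @_;
    my $crc = 0;
    for my $byte (unpack 'C*', $data) {
        $crc ^= $byte;
        for (1 .. 8) {
            # shift left; when a 1 falls off the top, XOR in the polynomial 0x07
            $crc = ($crc & 0x80) ? (($crc << 1) ^ 0x07) & 0xFF : ($crc << 1) & 0xFF;
        }
    }
    return $crc;
}

printf "CRC-8 of 'Hash' = 0x%02X\n", crc8('Hash');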

How should I handle digits from different sets of UNICODE digits in the same string?

I am writing a function that transliterates Unicode digits into ASCII digits, and I am a bit stumped about what to do if the string contains digits from different sets of Unicode digits. For example, if I have the string "\x{2463}\x{24F6}" ("④⓶"), should my function
return 42?
croak that the string contains mixed sets?
carp that the string contains mixed sets and return 42?
give the user an additional argument to specify one of the three above behaviours?
do something else?
Your current function appears to do #1.
I suggest that you also write another function to do #4, but only when the requirement appears, and not before.
I'm sure Joel wrote about "premature implementation" in a blog article sometime recently, but I can't find it.
I'm not sure I see a problem.
You support numeric conversion from a range of scripts, which is to say, you are aware of the Unicode codepoints for their numeric characters.
If you find an unknown codepoint in your input data, it is an error.
It is up to you what you do in the event of an error; you may insert a space or underscore, or you may abort conversion. What you would do will depend on the environment in which your function executes; it is not something we can tell you.
My initial thought was #4, strictly based on the fact that I like options. However, I changed my mind when I viewed your function.
The purpose of the function seems to be, simply, to get the resulting digits 0..9. Users may find it useful to send in mixed sets (a feature :) ). I'll use it.
If you ever have to handle input in bases greater than 10, you may end up having to treat the many variants of the first six letters of the Latin alphabet ('ABCDEF') as digits in all their forms.
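In case it helps, here is a minimal sketch of option 1 in Perl, leaning on Unicode::UCD's num(), which knows the numeric value of characters like U+2463 even though they are not \d digits (the function name is mine, and the mixed-set policy of options 2-4 would have to be layered on top):

use strict;
use warnings;
use Unicode::UCD qw(num);

sub unidigits_to_ascii {
    my ($str) = @_;
    # transliterate character by character; leave anything without a numeric value alone
    return join '', map { my $n = num($_); defined $n ? $n : $_ } split //, $str;
}

print unidigits_to_ascii("\x{2463}\x{24F6}"), "\n";   # prints 42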

How can I figure out what code page I am looking at?

I have a device with some documentation on how to send it text. It uses 0x00-0x7F to send 'special' characters like accented characters, euro signs, ...
I am guessing they copied an existing code page and made some changes, but I have no idea how to figure out what code page is closest to the one in my documentation.
In theory, this should be easy to do. For example, they map Á to 0x41, so if I could find some way to go through all code pages and find the ones that have this character on that position, it would be a piece of cake.
However, all I can find on the internet are links to code page dumps just like the one I'm looking at, or software that uses heuristics to read text and guess the most likely code page. Surely someone out there has made it possible to look up which code page one is looking at?
If it uses 0x00 to 0x7F for the "special" characters, how does it encode the regular ASCII characters?
In most of the charsets that support the character Á, its codepoint is 193 (0xC1). If you subtract 128 from that, you get 65 (0x41). Maybe your "codepage" is just the upper half of one of the standard charsets like ISO-8859-1 or windows-1252, with the high-order bit set to zero instead of one (that is, subtracting 128 from each one).
If that's the case, I would expect to find a flag you can set to tell it whether the next bunch of codepoints should be converted using the "upper" or "lower" encoding. I don't know of any system that uses that scheme, but it's the most sensible explanation I can come up with for the situation you describe.
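If you want to test that "shifted by 128" theory mechanically, here is a rough Perl sketch using the core Encode module; the candidate list and the single byte-to-character pair are only placeholders for whatever your documentation actually gives you.

use strict;
use warnings;
use Encode qw(decode);

my %sample = (0x41 => "\x{C1}");   # the one data point from the question: 0x41 displays as 'Á'

my @candidates = qw(ISO-8859-1 ISO-8859-2 ISO-8859-15 cp1250 cp1252 cp437 cp850);

for my $enc (@candidates) {
    my $ok = 1;
    for my $byte (keys %sample) {
        my $char = decode($enc, chr($byte + 0x80));   # look at the upper half of the charset
        $ok = 0 unless $char eq $sample{$byte};
    }
    print "$enc matches for all sample bytes\n" if $ok;
}

Several Latin code pages will match on a single data point, so you would want to feed in the whole table from your docs to narrow it down.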
There is no way to auto-detect the codepage without additional information. Below the display layer it’s just bytes and all bytes are created equal. There’s no way to say “I’m a 0x41 from this and that codepage”, there’s only “I’m 0x41. Display me!”
Which endianness does the system use? Perhaps you're flipping the bit order?
In most code pages, 0x41 is just the normal "A"; I don't think any standard code page has "Á" in that position. There could be a control character somewhere before the A that adds the accent, or the device could use a non-standard code page.
I don't see much use in knowing the "closest code page"; you just need to use the docs you got with the device.
Your last sentence is puzzling, what do you mean by "possible to look up what code page one is looking at"?
If you include your whole code page, people here on SO could be more helpful and give you more insight into this issue; a single data point (0x41 = Á) doesn't help much.
A somewhat random idea, but if you can replicate a significant amount of the text off the device, you could try running it through something like the detect function in http://chardet.feedparser.org/.