Reverse hashing function - encoding

I'm trying to understand how a system I'm working on creates a hash code starting from a numeric code. I'm collecting some of these pairs “small_number, big_number”, but I cannot figure out how the system encodes small_number to obtain big_number. Decoding is possible, because the system can obtain small_number from big_number.
Numbers look like these:
197 >> 29337857947107767585
1078 >> 84080031635590040762
1083 >> 32373003898332638476
1409 >> 79402294967209014727
1498 >> 25254493910542918727
2945 >> 85687067382221703925
2946 >> 88767616208189692328
I have no clue at all. Can you point to some reading too?
Thank you

If the system can reverse the function, the function isn't a hashing function at all, but a cipher.
It looks like the output is always a 20-digit number. Did you try testing it on non-numeric input, or on inputs longer than 20 digits?
In either case, it's likely that the system uses a well-known encryption algorithm (my guess is AES or DES), so without the key it's infeasible for you to work out the function.
Worse, if the system is not directly taking the input but adding some other information, you might have the right algorithm and key but still not realize it.

Related

Curious behavior with csvwrite and writing dates

When writing numbers into .csv in Matlab, it seems to modify the data.
This is alarming to me, something I have never seen before.
>> csvwrite('FirstCol.csv',[201210;201211])
>> twodates =csvread('FirstCol.csv')
twodates =
201210
201210
Now compare with xlswrite
>> xlswrite('FirstCol.xls',[201210;201211])
>> aa=xlsread('FirstCol.xls')
aa =
201210
201211
Could the reason be some automatic formatting of date-like numbers? (My explanation is just mysticism.)
From the csvwrite documentation:
csvwrite writes a maximum of five significant digits. If you need greater precision, use dlmwrite with a precision argument.
So doing:
csvwrite('FirstCol.csv',[201210;201211])
csvread('FirstCol.csv')
you do indeed lose the final digit.
But by using dlmwrite, you can do
dlmwrite('FirstCol.csv',[201210;201211],'precision',6)
dlmread('FirstCol.csv')
which does indeed result in the correct output.
I am using a Mac and I can't use xlswrite, but obviously that is a reasonable method as well.
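The five-significant-digit limit can be reproduced without touching any file. This is only a sketch: it assumes that csvwrite formats values in a way equivalent to the '%.5g' conversion, which is consistent with the documented limit but is not stated in the quoted documentation.

```matlab
% Assumption: csvwrite behaves like '%.5g' (five significant digits).
a = str2double(sprintf('%.5g', 201210));   % 201210
b = str2double(sprintf('%.5g', 201211));   % 201210, the final digit is rounded away
c = str2double(sprintf('%.6g', 201211));   % 201211, six significant digits are enough here

disp([a b c])
```

Both inputs collapse to the same value at five digits, which matches the csvread output in the question.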

Logical indexing and double precision numbers

I am trying to solve a non-linear system of equations using the Newton-Raphson iterative method, and in order to explore the parameter space of my variables, it is useful to store the previous solutions and use them as my first initial guess so that I stay in the basin of attraction.
I currently save my solutions in a structure array that I store in a .mat file, in about this way:
load('solutions.mat','sol');
str = struct('a',Param1,'b',Param2,'solution',SolutionVector);
sol=[sol;str];
save('solutions.mat','sol');
Now, I do another run, in which I need the above solution for different parameters NewParam1 and NewParam2. If Param1 = NewParam1-deltaParam1, and Param2 = NewParam2 - deltaParam2, then
load('solutions.mat','sol');
index = [sol.a]== NewParam1 - deltaParam1 & [sol.b]== NewParam2 - deltaParam2;
% logical index to find solution from first block
SolutionVector = sol(index).solution;
I sometimes get an error message saying that no such solution exists. The problem lies in the double precision of my parameters, since something like 0.3-0.1 ~= 0.2 can be true in Matlab, but I can't seem to find an alternative way to achieve the same result. I have tried changing the numerical parameters to strings in the saving process, but then I ran into problems with logical indexing with strings.
Ideally, I would like to avoid multiplying my parameters by a power of 10 to make them integers as this will make the code quite messy to understand due to the number of parameters. Other than that, any help will be greatly appreciated. Thanks!
You should never use == when comparing double precision numbers in MATLAB. The reason is, as you state in the question, that some numbers can't be represented exactly in binary, in the same way that 1/3 can't be written exactly in decimal.
What you should do is something like this:
index = abs([sol.a] - (NewParam1 - deltaParam1)) < 1e-10 & ...
abs([sol.b] - (NewParam2 - deltaParam2)) < 1e-10;
I actually recommend not using eps as the tolerance, as it's so small that the comparison might still fail in some situations. You can, however, use a smaller tolerance than 1e-10 if you need a very high level of accuracy (but how often do we work with numbers smaller than 1e-10?).
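As a minimal sketch of the tolerance-based lookup, here is a self-contained version. The stored values, the tolerance tol and the variable names target_a and target_b are made up for illustration; only the field names a, b and solution come from the question.

```matlab
% Hypothetical stored solutions (field names follow the question).
sol = struct('a', {0.1, 0.2, 0.3}, ...
             'b', {1, 1, 1}, ...
             'solution', {[1 2], [3 4], [5 6]});

target_a = 0.3 - 0.1;   % 0.19999999999999998, not bit-identical to 0.2
target_b = 1;
tol = 1e-10;            % assumed absolute tolerance, adjust to your parameter scale

% Exact comparison finds nothing because of the rounding in target_a.
exact = [sol.a] == target_a & [sol.b] == target_b;

% Tolerance comparison finds the second entry.
index = abs([sol.a] - target_a) < tol & abs([sol.b] - target_b) < tol;
SolutionVector = sol(index).solution;   % [3 4]
```

If the parameters span very different magnitudes, a relative test such as abs(x - y) <= tol * max(abs(x), abs(y)) may be a better fit than a fixed absolute tolerance.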

Save 4D matrix to a file with high precision (%1.40f) in Matlab

I need to write a 4D matrix M (16x101x101x6) to a file with high precision ('precision', '%1.40f') in MATLAB.
I've found save('filename.mat', 'M' ); for multidimensional matrices, but the precision cannot be set (only -double). On the other hand, I've found dlmwrite('filename.txt', M, 'delimiter', '\t', 'precision', '%1.40f'); to set the precision, but it is limited to 2-D arrays.
Can somebody suggest a way to tackle my problem?
What is the point of storing 40 fractional digits if a double-precision number in MATLAB holds only about 16 significant digits?
Try this code:
t=pi
whos
fprintf('%1.40f\n',t)
The output is
  Name      Size            Bytes  Class     Attributes
  t         1x1                 8  double
3.1415926535897931000000000000000000000000
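A short sketch of why the extra digits carry no information: eps gives the gap between a double and the next larger one, and nothing finer than that gap can be stored. The variable names s17 and back are illustrative only. By the IEEE 754 rules, 17 significant digits are enough in principle to identify a double uniquely, so the last line checks whether that round trip holds on your system.

```matlab
t = pi;
gap = eps(t);                  % 2^-51, about 4.4409e-16: spacing of doubles near pi
s17 = sprintf('%.17g', t);     % 17 significant digits as text
back = str2double(s17);        % parse the text again
roundTripOK = (back == t)      % checks whether 17 digits recovered the same double
```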
The command save('filename.mat', 'M' ); will store numbers in their binary representation (8 bytes per double-precision number). This is unbeatable in terms of space saving compared with a plain-text representation.
As for the 4D shape, the way j_kubik suggested seems simple enough.
I always thought that save will store exactly the same numbers you already have, with the precision that is already used to store them in matlab - you are not losing anything. The only problems might be disk space consumption (too precise numbers?) and closed format of .mat files (cannot be read by outside programs). If I wanted to just store the data and read them with matlab later on, I would definitely go with save.
save can also print ascii data, but it is (as dlmwrite) limited to 2D arrays, so using dlmwrite will be better in your case.
Another solution:
tmpM = [size(M), 0, reshape(M, 1, [])]; % one row: dimensions, a 0 separator, then the data
dlmwrite('filename.txt', tmpM, 'delimiter', '\t', 'precision', '%1.40f');
reading will be a bit more difficult, but only a bit ;)
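Reading such a file back could look like the sketch below. It assumes the file holds a single row laid out as dimensions, a 0 separator, then the data in column-major order, and that no dimension is zero. The small array, the file name tmpM.txt and the '%.17g' precision are stand-ins chosen for the example.

```matlab
M = rand(2, 3, 4, 5);                          % small stand-in for the real array
tmpM = [size(M), 0, reshape(M, 1, [])];        % dimensions, separator, data in one row
dlmwrite('tmpM.txt', tmpM, 'delimiter', '\t', 'precision', '%.17g');

raw  = dlmread('tmpM.txt', '\t');              % the whole row as a numeric vector
nd   = find(raw == 0, 1) - 1;                  % the first 0 is the separator
dims = raw(1:nd);                              % recovered array size
M2   = reshape(raw(nd+2:end), dims);           % rebuild the N-D array
maxErr = max(abs(M(:) - M2(:)));               % 0 if every value survived the round trip
```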
Then you can just write your own function to write stuff to a file using fopen & fprintf (just as dlmwrite does) - there you can control every aspect of your file format (including precision).
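A hand-rolled writer along those lines might look like this. It is a sketch, not the layout dlmwrite uses: the file name M.txt, the one-value-per-line format and the '%.17g' precision are all assumptions made for the example.

```matlab
M = rand(2, 3, 4, 5);                 % small stand-in for the 16x101x101x6 array

% Write: first line holds the dimensions, then one value per line.
fid = fopen('M.txt', 'w');
fprintf(fid, '%d ', size(M));
fprintf(fid, '\n');
fprintf(fid, '%.17g\n', M);           % values are emitted in column-major order
fclose(fid);

% Read: parse the dimension line, then the remaining values.
fid  = fopen('M.txt', 'r');
dims = sscanf(fgetl(fid), '%d').';    % row vector of dimensions
data = fscanf(fid, '%f');             % all remaining values as a column
fclose(fid);
M2 = reshape(data, dims);
maxErr = max(abs(M(:) - M2(:)));      % 0 if every value survived the round trip
```

Because the dimension line is parsed as a whole, the same code works for arrays with any number of dimensions.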
Something I would have done if I really cared about precision, file-size and execution time (this is probably not the way for you) would be to write a mex function that takes a matrix parameter and stores it in a binary file by just copying raw data buffer from matlab. It would also need some indication of array dimensions, and would probably be the quickest (not sure if save doesn't already do something similar).

'exact' numerical value after numerical optimization MATLAB

I'm having the following issue with my code. I've been trying to use some other posts that I found online, like this one, but they didn't do what I'm looking for.
My code uses a MATLAB File Exchange function which optimizes a numerical value that needs to keep 32 digits after the decimal point, such as
0.59329669191989231613604260928696
The optimization function can be found here and it is called fminsearchbnd
The optimization function calculates this value and stores it in a variable that I use all over my code. In order not to perform the optimization every time, I want to store the variable (I tried both a *.mat file and a label in string form).
But when I retrieve it, MATLAB transforms it in a double precision variable 'cutting' all the numbers after the 14th. However I need all of them because they are important!
Is it possible to read a number like that w/o using vpa() because with a symbolic value I can't do anything.
Any help is really appreciated.
Thanks
EDIT:
fminsearchbnd gives me this class(bb) -> double and when I want to see it on the workspace it is 0.586675392365899. But when I set formatSpec = '%.32f\n'; because I want to see all the numbers that the optimization gives me, typing set(editLabel,'String',num2str(bb,formatSpec))
You're trying to store/use a number that cannot be represented exactly in an IEEE754 64-bit double-precision floating point number.
I'm not sure how you got that number without using vpa() in the first place, since 64-bit double is Matlab's maximum of precision...
You could use the multiple precision toolbox by Ben Barrowes, or HPF by John d'Errico, also from the FEX. You'll have to convert/construct to/from string if you want to store/load it to/from file.
But I have to agree with John's comment there:
The fact is, most of the time, if you can't do it with a double, you are doing something wrong
so...why exactly are those 32-or-more digits important?
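To make the limitation concrete, the following sketch uses the number from the question. The variable names x, gap, s and nFrac are illustrative. It shows that the numeric literal becomes an ordinary double, whose neighbours are about 1.1e-16 apart, while a character array keeps every digit but cannot be used in arithmetic directly.

```matlab
x = 0.59329669191989231613604260928696;    % the literal is rounded to the nearest double
cls = class(x);                            % 'double'
gap = eps(x);                              % 2^-53, about 1.1102e-16: spacing of doubles near x

s = '0.59329669191989231613604260928696';  % as text, every digit is kept
nFrac = numel(s) - 2;                      % 32 digits after the decimal point
```

Digits beyond roughly the 16th therefore cannot be distinguished once the value is held in a double, which is why the toolboxes mentioned above construct their numbers from strings.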

Bug in Matlab Symbolic ToolBox

I stumbled over the following strange thing while using matlab's symbolic toolbox
>> syms e
>> y=11111111111111111^e
y =
11111111111111112^e
Seems like there is a limitation when working with large numbers. Can this be solved without changing to a completely different system, like sage?
I think the problem is that Matlab parses the number into a double before it converts it to a symbolic expression. As a double has a 53-bit significand (52 bits stored explicitly), you have approximately 16 significant digits, but your number has 17.
As an alternative, you could try to create the number directly from a string:
y=sym('11111111111111111')^e
Unfortunately, I do not have Matlab available right now, so this answer is untested.
It's not a bug, it's a feature. And it's called "round-off error"
MATLAB uses a double format to store ordinary variables, just like the double type in C, C++ and many other languages.
Actually, the "bug" has nothing to do with the "^e", as we can see:
>> clear
>> syms y
>> format bank
>> y=11111111111111111
y =
11111111111111112.00
Even a simple assignment triggers the "bug".
And we can see how a double variable is really stored in memory in VS, using debug mode:
As you can see in the screenshot, both a and b are stored as "2ea37c58cccccccc" in memory, which means the computer can't tell one from the other.
And that's the reason for the "bug" you found.
To avoid this, you can use a symbolic constant instead:
>> y=sym('11111111111111111')
y =
11111111111111111
This way, the computer stores y in a different format in memory, which avoids the round-off error at the cost of more memory.
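The rounding can be checked with plain doubles, without the Symbolic Toolbox. This is a small sketch; flintmax is the largest value up to which every integer is exactly representable as a double, and the variable names are illustrative.

```matlab
limit = flintmax;                    % 2^53 = 9007199254740992
x = 11111111111111111;               % the literal is parsed as a double first
step = eps(x);                       % 2: only every second integer is representable here
same = (x == 11111111111111112);     % true: both literals map to the same double
fprintf('%.0f\n', x)                 % prints 11111111111111112
```

Since the literal exceeds flintmax, the rounding happens at parse time, before sym or the ^ operator ever sees the value.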