Periodicity of hash functions

Consider the naive hash function: HASH = INPUT % 4. This function is periodic in the sense that if we call it with sequential numbers 0, 1, 2, 3, 4, 5, ... the produced hashed sequence will have periodicity of four: 0, 1, 2, 3, 0, 1, 2, 3, 0, ....
My question is whether modern cryptographic hash functions, such as SHA256, are periodic in this sense. In other words, are there integers n >= 0 and k > 0 such that HASH(n + b) = HASH(n + b + ak) for all integers b in [0, k - 1] and all positive integers a? For example, will the sequence SHA256(0), SHA256(1), SHA256(2), SHA256(3), ... be periodic after some point?

Absolutely not. If that were the case, it would be trivial to find a collision: any two inputs one period apart, such as n and n + k, would hash to the same value. The strength of a cryptographic hash function is defined by how hard it is to find a != b with Hash(a) == Hash(b). Ideally the only way to find a collision is a brute-force search over a huge number of inputs, which is infeasible when the hash output has a lot of bits.
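To make the contrast concrete, here is a small Python sketch. It assumes the integers are hashed as their decimal ASCII strings (the question does not say how the numbers are encoded), and the helper names are made up for illustration. Checking a finite range proves nothing about SHA256 in general; it only shows that no digest repeats in that range, whereas the modulo hash repeats after four inputs.

```python
import hashlib


def naive_hash(n: int) -> int:
    # The toy hash from the question: periodic with period 4.
    return n % 4


def sha256_of_int(n: int) -> str:
    # Assumption: each integer is hashed as its decimal ASCII string.
    return hashlib.sha256(str(n).encode("ascii")).hexdigest()


# First twelve outputs of the toy hash: the pattern 0, 1, 2, 3 repeats.
naive = [naive_hash(n) for n in range(12)]

# Digests for 0..9999; a periodic sequence would contain repeated digests.
digests = [sha256_of_int(n) for n in range(10_000)]
distinct = len(set(digests))

print(naive)
print(distinct, "distinct digests out of", len(digests))
```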

Strategy to get the 'best' number

Given a vector of numbers (size = n), we want to find the number that is the 'best'.
The criterion for the best number is that its frequency must be > 50% of the total size of the vector.
Assume there will always be exactly one best number.
How would you solve this with O(1) space complexity and O(n) time complexity?
e.g. input: [1, 1, 1, 3, 3, 3, 3]
ans: 3 (because its frequency, 4, is greater than 50% of 7 = 3.5)
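One well-known technique that fits these bounds is the Boyer-Moore majority vote algorithm. The Python sketch below is only an illustration (the function name is made up here), and it relies on the guarantee stated in the question that a majority element always exists; without that guarantee a second pass would be needed to verify the candidate.

```python
def majority_element(nums):
    """Boyer-Moore majority vote: O(n) time, O(1) extra space.

    Assumes some value occurs in more than half of the positions.
    """
    candidate = None
    count = 0
    for x in nums:
        if count == 0:
            # No surviving candidate, so adopt the current value.
            candidate = x
        # A matching value supports the candidate; any other cancels one vote.
        count += 1 if x == candidate else -1
    return candidate


print(majority_element([1, 1, 1, 3, 3, 3, 3]))  # prints 3
```

The idea is that each non-majority value can cancel at most one occurrence of the majority value, so the majority value is the one left standing at the end.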

Incorrect results with vDSP_conv()

I am getting inconsistent results when attempting to do convolution using vDSP_conv() from Accelerate when compared to the MATLAB implementation. There have been a couple of Stack Overflow posts about weird results when using this function to calculate convolution; however, as far as I can tell, I am using the framework correctly and have incorporated the suggestions from those posts. Here is my code:
public func conv(x: [Float], k: [Float]) -> [Float] {
    let resultSize = x.count + k.count - 1
    var result = [Float](count: resultSize, repeatedValue: 0)
    // Point at the last kernel element; a stride of -1 walks the kernel backwards
    let kEnd = UnsafePointer<Float>(k).advancedBy(k.count - 1)
    let xPad: [Float] = [Float](count: (2*k.count)+1, repeatedValue: 0.0)
    let xPadded = x + xPad
    vDSP_conv(xPadded, 1, kEnd, -1, &result, 1, vDSP_Length(resultSize), vDSP_Length(k.count))
    return result
}
As far as I can tell, I am doing the correct zero padding as specified in the Accelerate framework documentation.
I defined two test arrays A: [Float] = [0, 0, 1, 0, 0] and B: [Float] = [1, 0, 0].
In MATLAB, when I run conv(A, B), I get [0, 0, 1, 0, 0, 0, 0].
However, when I run the above conv() function built on vDSP_conv(), I get [1, 0, 0, 0, 0, 0, 0].
What is wrong with my implementation? I have gone over this a number of times and looked through all the SO posts that I could find, and still haven't been able to account for this inconsistency.
Beyond that, is there a more efficient method to zero-pad the array than what I have here? In order to keep x immutable, I created the new xPadded array, but there is undoubtedly a more efficient way to perform this padding.
** EDIT **
As suggested by Martin R, I padded k.count - 1 zeros at both the beginning and the end of the array, as shown below.
public func conv(x: [Float], k: [Float]) -> [Float] {
    let resultSize = x.count + k.count - 1
    var result = [Float](count: resultSize, repeatedValue: 0)
    let kEnd = UnsafePointer<Float>(k).advancedBy(k.count - 1)
    let xPad: [Float] = [Float](count: k.count-1, repeatedValue: 0.0)
    let xPadded = xPad + x + xPad
    vDSP_conv(xPadded, 1, kEnd, -1, &result, 1, vDSP_Length(resultSize), vDSP_Length(k.count))
    return result
}
Using this code, conv(A, B) still returns [1, 0, 0, 0, 0, 0, 0].
I am calling the function as shown below:
let A: [Float] = [0, 0, 1, 0, 0]
let B: [Float] = [1, 0, 0]
let C: [Float] = conv(A, k: B)

For two arrays A and B of length m and n,
the vDSP_conv() function from the Accelerate framework computes a new array of length m - n + 1.
This corresponds to the result of the MATLAB function conv() with the shape
parameter set to "valid":
Only those parts of the convolution that are computed without the zero-padded edges. ...
To get the same result as the "full" convolution from MATLAB,
you have to zero-pad the A array with n - 1 elements at the beginning and at the end. The padded array has length m + 2(n - 1), so the result array has length m + n - 1.
Applied to your function:
let xPad = Repeat(count: k.count - 1, repeatedValue: Float(0.0))
let xPadded = xPad + x + xPad
Using Repeat() might be slightly more performant because it
creates a sequence and not an array. But ultimately, a new array
has to be created as an argument to the vDSP_conv() function,
so there is not much room for improvement.
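For anyone who wants to check the padding logic without Accelerate, here is a plain-Python sketch of the same idea. It does not call vDSP. The hypothetical conv_valid() only mimics a "valid" convolution in which the kernel is walked backwards (like the -1 stride in the question), and conv_full() obtains the "full" result by padding n - 1 zeros on both sides.

```python
def conv_valid(x, k):
    """'Valid' convolution: only positions where the kernel fits entirely."""
    m, n = len(x), len(k)
    out = []
    for i in range(m - n + 1):
        # The kernel is traversed backwards, which is what makes this a
        # convolution rather than a correlation.
        out.append(sum(x[i + j] * k[n - 1 - j] for j in range(n)))
    return out


def conv_full(x, k):
    """'Full' convolution via zero-padding n - 1 elements on both sides."""
    pad = [0.0] * (len(k) - 1)
    return conv_valid(pad + list(x) + pad, k)


A = [0.0, 0.0, 1.0, 0.0, 0.0]
B = [1.0, 0.0, 0.0]
print(conv_full(A, B))  # [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

With the test arrays from the question this gives the same values as MATLAB's conv(A, B), and the result has length m + n - 1 = 7.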

Some clarifications for the next poor soul who stumbles into this:
Apple provides some sample code on how to use vDSP_conv but it's pretty useless. In fact, it was confusing me because a comment in that code says that the input buffer needs to be padded without specifying where the actual input samples should be placed:
The SignalLength defined below is used to allocate space, and it is the filter length rounded up to a multiple of four elements and added to the result length.
SignalLength = (FilterLength+3 & -4u) + ResultLength;
So, the above formula gives you a different (bigger) length than xPad + x + xPad, where xPad has k.count - 1 elements.
The important thing is where in that padded buffer you copy your input (signal) samples: they need to start at index k.count - 1.
So, the above accepted solution works. But if you trust that comment in Apple's example (which, by the way, doesn't show up in the official docs), then you can compromise: use their formula (the SignalLength above) to calculate and allocate the padded buffer (it will be a bit larger), and use k.count - 1 (i.e. filter length - 1) as the starting offset for your signal (x in this case). I did this and the results now match ippsConvolve_32f and MATLAB.
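As a rough illustration of that buffer layout, here is a Python sketch of the arithmetic. The function names are invented for this example; only the SignalLength formula comes from the comment in Apple's sample, where (FilterLength+3 & -4u) rounds the filter length up to a multiple of four.

```python
def signal_length(filter_length, result_length):
    # Python equivalent of the C expression (FilterLength+3 & -4u) + ResultLength:
    # round the filter length up to a multiple of four, then add the result length.
    return ((filter_length + 3) & ~3) + result_length


def padded_signal(x, filter_length):
    """Allocate the larger buffer and copy x in at offset filter_length - 1."""
    result_length = len(x) + filter_length - 1
    buf = [0.0] * signal_length(filter_length, result_length)
    offset = filter_length - 1
    buf[offset:offset + len(x)] = x
    return buf


x = [0.0, 0.0, 1.0, 0.0, 0.0]
print(signal_length(3, 7))   # 11, versus 9 for xPad + x + xPad
print(padded_signal(x, 3))
```

For the arrays in the question (signal length 5, filter length 3) the buffer has 11 elements instead of 9, with the signal still starting at index 2.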
(Sorry, this should have been a comment but I don't have enough reputation for that).

@MartinR I figured out why my code doesn't work with arrays. I was writing this code in a project that was using Surge as a linked framework. Surge overloads the + operator for [Float] and [Double] arrays so that it becomes element-wise addition of array elements. So when I was doing x + xPad, it wasn't extending the array as expected; it was simply returning x, since xPad only contained zeros. However, Surge had not overloaded the + operator for sequences, so using Repeat() successfully extended the array. Thanks for your help - I never would have thought to try sequences!

Sequence in MATLAB

Write a single MATLAB expression to generate a vector that contains the first 100 terms of the following sequence: 2, -4, 8, -16, 32, …
My attempt:
n = -1
for i = 1:100
n = n * 2
disp(n)
end
The problem is that the values of n are not collected in a single (1 x 100) vector, and the alternating positive and negative terms are not produced either. How can I do that?

You have a geometric sequence with common ratio r = -2.
To produce 2, -4, 8, -16, 32, type this:
>> -(-2).^[1:5]
2, -4, 8, -16, 32
Replace 5 with 100 to get the first 100 terms.
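For readers without MATLAB at hand, the same closed form can be checked with a short Python sketch (not MATLAB code; the variable name is arbitrary). The i-th term is -(-2)^i for i = 1, ..., 100.

```python
# i-th term of the sequence 2, -4, 8, -16, 32, ... is -(-2)**i for i = 1..100
terms = [-((-2) ** i) for i in range(1, 101)]

print(terms[:5])   # [2, -4, 8, -16, 32]
print(len(terms))  # 100
```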

Though there are better methods, as mentioned in the answer by @lakesh, I will point out the mistakes in your code.
The statement n = n * 2 overwrites a scalar on every iteration, so n never becomes a vector.
It also never changes sign: starting from n = -1, it generates -2, -4, -8, -16, ...
Therefore, the correct code should be:
n = -1;
for i = 2:101 % 1 extra term since the first term has to be discarded later
n(i) = -n(i-1) * 2;
end
You can then discard the first element of n (the seed value -1) to get the exact sequence you want:
n(1) = [];
disp(n)

How to find all index pairs of unequal elements in vector (Matlab)

Let's say I have the following vector in Matlab:
V = [4, 5, 5, 7];
How can I list (in an n-by-2 matrix, for example) all the index pairs corresponding to unequal elements in the vector? For example, for this particular vector the index pairs would be:
index pairs (1, 2) and (1, 3) corresponding to element pair (4,5)
index pair (1, 4) corresponding to element pair (4,7)
index pairs (2, 4) and (3, 4) corresponding to element pair (5,7)
The reason I need this is that I have a cost function which takes a vector such as V as input and produces a cost value.
I want to see how randomly swapping two differing elements in the vector affects the cost value (I am using this for steepest descent hill climbing).
The order within an index pair also doesn't matter. For my purposes (1,2) is the same as (2,1).
For example if my cost-function was evalCost(), then I could have V = [4, 5, 5, 7] and
evalCost(V) = 14
whereas for W = [4, 7, 5, 5] the cost could be:
evalCost(W) = 10
How can I get the list of "swapping" index pairs in Matlab? I hope my question is clear =)

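I don't understand the cost function part, but the first part is simple:
[a,b]=unique(V)
C = combnk(b,2)
C contains the indices, and V(C) the values. Note that this gives one index pair per pair of distinct values, so for this V it does not list (1, 3) or (3, 4):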
C = combnk(b,2)
C =
1 2
1 4
2 4
V(C)
ans =
4 5
4 7
5 7

Use bsxfun and then the two-output version of find to get the pairs. triu is applied to the output of bsxfun to consider only one of the two possible orders.
[ii jj] = find(triu(bsxfun(@ne, V, V.')));
pairs = [ii jj];
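As a cross-check of what the full list of pairs should look like, here is a small Python sketch (not MATLAB; the function name is made up for illustration). It enumerates every index pair i < j whose elements differ and reports 1-based indices to match the MATLAB convention used in the question.

```python
from itertools import combinations


def unequal_index_pairs(v):
    """Return all 1-based index pairs (i, j), i < j, with v[i] != v[j]."""
    return [
        (i + 1, j + 1)
        for i, j in combinations(range(len(v)), 2)
        if v[i] != v[j]
    ]


V = [4, 5, 5, 7]
print(unequal_index_pairs(V))  # [(1, 2), (1, 3), (1, 4), (2, 4), (3, 4)]
```

The pair (2, 3) is missing because both elements are 5, which matches the list given in the question.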

How to get a regular sampled matrix in Scilab

I'm trying to program a function in Scilab (or even better, use one if it already exists) that produces regularly timed samples of a set of values.
I.e., I have a vector 'values' which contains the value of a signal at different times. These times are in the vector 'times'. So at time times(N), the signal has value values(N).
At the moment the times are not regular, so the variables 'times' and 'values' can look like:
times = [0, 2, 6, 8, 14]
values= [5, 9, 10, 1, 6]
This means that the signal had value 5 from second 0 to second 2, value 9 from second 2 to second 6, etc.
Therefore, if I want to calculate the average value of the signal, I cannot just calculate the average of the vector 'values'. This is because, for example, the signal can hold the same value for a long time, but there will be only one entry for it in the vector.
One option is to use the time differences (deltaT) to calculate the mean, but I will also need to perform other calculations (average, etc.).
Another option is to create a function that, given a deltaT, samples the time and values vectors to produce an equally spaced time vector and the corresponding values. For example, with deltaT=2 and the previous vectors,
[sampledTime, sampledValues] = regularSample(times, values, 2)
sampledTime = [0, 2, 4, 6, 8, 10, 12, 14]
sampledValues = [5, 9, 9, 10, 1, 1, 1, 6]
This is easy if deltaT is small enough to fit exactly with all the times. If the deltaT is bigger, then the average of values or some approximation must be done...
Is there anything already done in Scilab?
How can this function be programmed?
Thanks a lot!
PS: I don't know if this is the correct forum to post scilab questions, so any pointer would also be useful.

If you like to implement it yourself, you can use a weighted sum.
times = [0, 2, 6, 8, 14]
values = [5, 9, 10, 1, 6]
weightedSum = 0
highestIndex = length(times)
for i=1:(highestIndex-1)
// Get the amount of time a certain value contributed
deltaTime = times(i+1) - times(i);
// Add the weighted amount to the total weighted sum
weightedSum = weightedSum + deltaTime * values(i);
end
totalTimeDelta = times($) - times(1);
average = weightedSum / totalTimeDelta
printf( "Result is %f", average )
Or, if you want functionally the same but less readable code:
timeDeltas = diff(times)
sum(timeDeltas.*values(1:$-1))/sum(timeDeltas)
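The resampling function asked about in the question can be sketched the same way. The following is a Python illustration rather than Scilab, and it assumes the signal holds each value until the next timestamp (a zero-order hold), which is how the question describes the data. The names regular_sample and time_weighted_average are hypothetical. When deltaT is larger than the gaps between timestamps, this sketch simply picks the value in effect at each sample time and does no averaging.

```python
from bisect import bisect_right


def regular_sample(times, values, delta_t):
    """Resample a piecewise-constant signal on a regular grid.

    Assumes `times` is sorted and each value holds until the next timestamp.
    """
    n_samples = int((times[-1] - times[0]) // delta_t) + 1
    sampled_times = [times[0] + i * delta_t for i in range(n_samples)]
    # For each sample time, take the value of the last timestamp <= that time.
    sampled_values = [values[bisect_right(times, t) - 1] for t in sampled_times]
    return sampled_times, sampled_values


def time_weighted_average(times, values):
    """Average weighted by how long each value was held (last value unused)."""
    deltas = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    return sum(d * v for d, v in zip(deltas, values)) / sum(deltas)


times = [0, 2, 6, 8, 14]
values = [5, 9, 10, 1, 6]
print(regular_sample(times, values, 2))
print(time_weighted_average(times, values))  # 72 / 14, about 5.142857
```

With deltaT = 2 this reproduces the sampledTime and sampledValues vectors from the question.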