How many nodes are in a perfect binary tree that is 32 nodes deep? - discrete-mathematics

According to a formula I found on Google it is 2^(32-1) = 2147483648, but that seems a little high. Is this correct?

Actually, it isn't 2^(32-1). It's (2^32)-1, or 4294967295.
So... even bigger than you had calculated.

We can prove the actual result, (2^r) - 1 where r is the number of rows, using induction. Each row r in the tree contains 2^(r-1) nodes. This is clear because each row contains double the number of nodes of the previous row: the first row contains 2^(1 - 1) = 1 node, the second row contains twice that, 2^(2 - 1) = 2 nodes, and so on.
Suppose that for r rows the total number of nodes is 2^r - 1. Check this for the trivial case: 1 row has 2^1 - 1 = 1 node.
Now show that the step from r rows to r + 1 rows works. With r rows we have (2^r) - 1 nodes; to get the total number of nodes for r + 1 rows we need to add the nodes of the (r + 1)th row. The (r + 1)th row has 2^((r + 1) - 1) nodes, as discussed in the first paragraph. Adding this to our current node total gives:
(2^r) - 1 + 2^(r + 1 - 1) = 2^r - 1 + 2^r = 2 * 2^r - 1 = 2^(r + 1) - 1
As this is the formula with r replaced by r + 1, we've proved the result.
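If you want to sanity-check the formula numerically, here is a minimal Python sketch. The function names are made up for illustration; it sums the per-row counts 2^(r-1) and compares the total with the closed form 2^r - 1.

```python
def nodes_by_rows(rows: int) -> int:
    """Total node count obtained by summing 2^(r-1) over rows r = 1..rows."""
    return sum(2 ** (r - 1) for r in range(1, rows + 1))


def nodes_closed_form(rows: int) -> int:
    """Closed-form total for a perfect binary tree with the given number of rows."""
    return 2 ** rows - 1


# The row-by-row sum and the closed form agree for every depth up to 32.
for rows in range(1, 33):
    assert nodes_by_rows(rows) == nodes_closed_form(rows)

print(nodes_closed_form(32))  # 4294967295
```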

Related

What is the expected length of the linked list in the bucket h(x) where x is inserted?

Imagine you have a weird hash function h: with probability 1/2 it maps a key uniformly at
random to one of the array slots in the range [0,m/4−1]; with probability 1/2, it maps a key
uniformly at random to one of the array slots in the range [m/4,m−1]. (Note: this assumption
is a replacement for the simple uniform hashing assumption.)
Assume chaining is used to resolve collisions, and that n items have been previously inserted
into the hash table. If a new item x is inserted, what is the expected length of the linked list in
the bucket h(x) where x is inserted?
(n/m)
(m/n)
(1/2)(n/m)
(1/2)(m/n)
(4/3)(n/m)
(3/4)(n/m)
(3/4)(m/n)
None of the above.
The answer is: (4/3)(n/m)
Can someone explain to me how do you calculate the expected value? My probability background is pretty weak so any explanation will help. Thanks!
No. of buckets in 1st half = (m/4 - 1) - (0) + 1 = m/4.
No. of buckets in 2nd half = (m - 1) - (m/4) + 1 = 3m/4.
Half of the n items would have gone into each side.
So n/2 items in 1st half to fit into m/4 buckets; n/2 items in 2nd half to fit into 3m/4 buckets.
E(length of linked list, i.e. “repeats” in 1st half) = (n/2) / (m/4) = 2(n/m).
E(length of linked list, i.e. “repeats” in 2nd half) = (n/2) / (3m/4) = (2/3)(n/m).
x has an equal chance of being inserted into 1st half and 2nd half,
i.e. P(x in 1st half) = 1/2, P(x in 2nd half) = 1/2.
E(length of linked list in x’s bucket) = (1/2)(2(n/m)) + (1/2)((2/3)(n/m)) = n/m + (1/3)(n/m) = (4/3)(n/m).
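The same arithmetic can be reproduced with exact fractions. This is only a sketch of the calculation above, with a hypothetical function name, and it assumes (as the answer does) that half of the n items land in each of the two slot ranges.

```python
from fractions import Fraction


def expected_chain_length(n: int, m: int) -> Fraction:
    """Expected list length in x's bucket under the two-range hash function."""
    # n/2 items spread over the m/4 slots of the range [0, m/4 - 1]
    first = Fraction(n, 2) / Fraction(m, 4)
    # n/2 items spread over the 3m/4 slots of the range [m/4, m - 1]
    second = Fraction(n, 2) / Fraction(3 * m, 4)
    # x lands in either range with probability 1/2
    return Fraction(1, 2) * first + Fraction(1, 2) * second


n, m = 100, 40
print(expected_chain_length(n, m) == Fraction(4, 3) * Fraction(n, m))  # True
```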

How to prove that the hash function h(x) = x² mod 4 yields only 0 and 1

How can I prove that the hash function h(x) = x² mod 4 yields only values in {0, 1}, with x as an element of the natural numbers?
Let's first cover the even numbers, 2n (where n is a natural number):
(2n)²
= (2n)(2n)
= (2)(n)(2)(n)
= (2)(2)(n)(n)
= 4n²
= 4(n²)
That's an exact multiple of four so the remainder when dividing by four will always be zero.
Now let's cover the odd numbers, 2n + 1:
(2n + 1)²
= (2n + 1)(2n + 1)
= (2n)(2n) + (2n)(1) + (1)(2n) + (1)(1)
= 4n² + 2n + 2n + 1
= 4n² + 4n + 1
= 4(n² + n) + 1
That's exactly one more than a multiple of four hence the remainder when dividing by four will always be one.
Now, let's look at any natural numbers that are neither even nor odd.
Wait a minute, there aren't any. I guess that means we're done :-)
And, before anyone points out that some languages may give a negative remainder when the arguments are negative, that doesn't actually apply here since the square of a natural number can never be negative.
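A finite check is not a proof, but it is a quick way to see the even/odd split from the argument above in action. A minimal Python sketch (the upper bound of 10,000 is arbitrary):

```python
LIMIT = 10_000  # arbitrary upper bound for the spot check

# Collect every remainder that actually occurs.
remainders = {(x * x) % 4 for x in range(LIMIT)}

# Even x should always give 0, odd x should always give 1.
even_ok = all((x * x) % 4 == 0 for x in range(0, LIMIT, 2))
odd_ok = all((x * x) % 4 == 1 for x in range(1, LIMIT, 2))

print(remainders, even_ok, odd_ok)
```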

How do I count the number of occurrences of values in 2 arrays (one is part of the other) in Matlab?

So say I have an array A = [1,2,2,3,4,5,5,5,7]
This is part of another larger array B = [1,2,2,2,2,3,3,3,4,5,5,5,6,7]
What I'd like to do is to count how many times each element appears in A, and divide it by the times it appears in B, and hopefully tabulate the result using the tabulate function.
I'd like my final tabulated result to look as follows
Element - Occurrence - %age of occurrence
1 - 1 - 100%
2 - 2 - 50%
3 - 1 - 33.3%
4 - 1 - 100%
5 - 3 - 100%
6 - 0 - 0%
7 - 1 - 100%
I believe this would involve a for loop where I create a new array C such that it identifies which elements in A appear in B and each time it does add 1 to its respective value, and if it doesn't exist in B, return 0. I don't know how to proceed though and some direction would be greatly appreciated!
This is a good use case for hist, which is usually quite fast. You can bin the data in A to histogram bins ranging from min(A) to max(A), and apply the same bins to allocate the data in B. Then you can get your percentage values by simply dividing the numbers of occurrences in both arrays.
For example:
[nA, uA] = hist(A, min(A):max(A));
nB = hist(B, uA);
result = 100*(nA./nB)'
Edit: the numbers of occurrences of elements of A is given by nA.
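For comparison, the same counting logic can be sketched outside MATLAB. The following Python version is just an illustration (the function name is made up); it counts each value in A and B and reports the percentage for every value that occurs in B.

```python
from collections import Counter

A = [1, 2, 2, 3, 4, 5, 5, 5, 7]
B = [1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 5, 6, 7]


def occurrence_table(a, b):
    """Rows of (value, count in a, percentage of the count in b) for each value in b."""
    count_a, count_b = Counter(a), Counter(b)
    return [
        (value, count_a[value], 100.0 * count_a[value] / count_b[value])
        for value in sorted(count_b)
    ]


for value, occurrences, percent in occurrence_table(A, B):
    print(f"{value} - {occurrences} - {percent:.1f}%")
```

Values that never occur in A (such as 6 here) get a count of 0 because a Counter returns 0 for missing keys.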

Create matrix of row-wise increasing differences based on vector

I have a column vector in MATLAB and am trying to construct a matrix of differences with row-wise varying size of difference.
It is hard to explain in words, so I will illustrate with an example:
Let's say my data is:
data = [ 1 2 3 4 5 6 ];
What I am trying to do is make a matrix of differences like this, where the size of the difference increases by one from each column to the next:
diff =
[(2 - 1) ...
(3 - 2) (3 - 1) ...
(4 - 3) (4 - 2) (4 - 1) ...
(5 - 4) (5 - 3) (5 - 2) (5 - 1) ...
(6 - 5) (6 - 4) (6 - 3) (6- 2) (6 - 1)]
My best guess of doing this was to make a triangle matrix with nested loops. My MATLAB code looks like this:
differences = zeros(length(data) - 1, length(data) - 1);
step = 0;
for j = 1:1:size(data) - 1;
for i = 1:size(logquarterly) - 1 - step;
if j <= i;
differences(i,j) = data(i + 1 + step , 1) - data(i,1);
step = step + 1;
end
end
end
What I am trying to do is calculate the first column of differences with distance 1, then calculate the second column of differences with distance 2 and so on... To accommodate the necessary row values I need, I am using the "step" variable which is set to zero for calculating column one, I then want it to increase by 1 when calculating column 2 to have the correct dimensions. But I can not make it work. If I take the "step" out and use this:
differences = zeros(length(data) - 1, length(data) - 1);
for j = 1:1:size(data) - 1;
for i = 1:size(logquarterly) - 1;
if j <= i;
differences(i,j) = data(i + 1 , 1) - data(i,1);
end
end
end
everything works, but each column has the same distance of differences and it does not increase by one. Any ideas guys?
If I understand correctly, you want to do this:
data = [ 1 2 3 4 5 6 ];
n = numel(data);
%// calculate differences
diffs = bsxfun(@minus, data(end:-1:1), data(end:-1:1).')
%'
%// get linear indices from circulant sub-indices for rows and
%// linear indices for columns
idx = sub2ind([n n], gallery('circul',n:-1:1), ndgrid(1:n,1:n))
%// mask output and get lower triangular matrix
output = tril(diffs(idx(n-1:-1:1,n-1:-1:1)))
so the output is:
output =
1 0 0 0 0
1 2 0 0 0
1 2 3 0 0
1 2 3 4 0
1 2 3 4 5
The problem with your solution is that it will only work with column vectors, because of the loop j = 1:1:size(data)-1. For a row vector, the call of size returns [1,6]; then the -1 is applied, yielding [0,5]. The colon operator then uses only the first value of this vector, so the for loop would run from 1 to 1-1==0, i.e. not at all.
Use numel or size(.,1)/size(.,2) instead. (Also, don't put a semicolon ; after the loop initialization.) (Try out the MATLAB debugger!)
Here is my take on how to repair your approach:
differences = zeros(length(data)-1, length(data)-1);
for j = 1:size(differences,2)
for i = j:size(differences,1)
differences(i,j) = data(i+1) - data(i-j+1);
end
end
I like the use of gallery('circul',n:-1:1) in thewaywewalk's answer; however, I find the rest a bit too complicated.
Here is my take reusing his idea:
n = numel(data);
L = ndgrid(2:n,2:n); % // Generate indices for Left side of operator
R = gallery('circul',1:n-1).'; %'// Generate indices for Right side of operator
out = tril(data(L) - data(R)) % // Do subtraction of corresponding indices
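To make the indexing explicit, here is the same lower-triangular construction as a plain Python sketch (not MATLAB, and the function name is made up). Column j holds the differences at lag j, mirroring the repaired loop differences(i,j) = data(i+1) - data(i-j+1).

```python
def lagged_differences(data):
    """Lower-triangular matrix whose column j (1-based) holds differences at lag j."""
    n = len(data)
    out = [[0] * (n - 1) for _ in range(n - 1)]
    for lag in range(1, n):        # column: size of the difference
        for i in range(lag, n):    # 0-based index of the later element
            out[i - 1][lag - 1] = data[i] - data[i - lag]
    return out


for row in lagged_differences([1, 2, 3, 4, 5, 6]):
    print(row)
```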

Find the number of pairs whose sum is divisible by k?

We are given a value of k such that k <= 100000.
We have to print the number of pairs such that the sum of the elements of each pair is divisible by k,
under the following conditions: the first element should be smaller than the second, and both elements should be less than 10^9.
I've found a solution. Let a and b be numbers such that (a+b)%k=0; we have to find the pairs (a,b) where a<b. First count how many pairs (a,b) satisfy a+b=k. For example, if k=3: 0+3=3, 1+2=3, 2+1=3, 3+0=3. That is 4 ordered pairs, but only 2 of them have a<b, which is (k+1)/2 (integer division). Similarly, count the pairs (a,b) whose sum is 2k, 3k, ..., nk. The solution is then (k+1)/2 + (2k+1)/2 + (3k+1)/2 + ... + (nk+1)/2, which is equal to (k*n*(n+1)/2 + n)/2 and can be evaluated in O(1) time. Take care in the case n*k=2*10^9, because a can't be more than 10^9 under the given constraint.
Solution in O(N) time and O(N) space using hash map.
The concept is as follows:
If (a+b)%k=0 where
a=k*SOME_CONSTANT_1+REMAINDER_1
b=k*SOME_CONSTANT_2+REMAINDER_2
then (REMAINDER_1 +REMAINDER_2 )%k will surely be 0
so for an array (4,2,3,31,14,16,8) and k =5 if you have some information like below , you can figure out which all pairs sum %k =0
Note that, Bottom most row consist of all the remainders from 0 to k-1 and all the numbers corresponding to it.
Now all you need to do is move both pointers toward each other until they meet. If both pointer locations have numbers associated with them, the sum of any number from one location and any number from the other is 0 mod k.
To solve it, you can keep track of all the remainders you have seen so far by using a hash table:
create a hash map Map<Integer, List>;
pre-populate its keys with 0 to k-1;
iterate over the array and, for each number, add the number to the list stored under the key equal to its remainder;
iterate over the key set using two pointers moving toward each other, and do sum += listSizeAsPointedByPointer1 * listSizeAsPointedByPointer2.
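A minimal Python sketch of the remainder-bucket idea follows. It is an illustration under one assumption that the steps above do not spell out: when both pointers sit on the same key (remainder 0, or k/2 for even k), the elements pair with each other, so the count there is c*(c-1)/2 rather than c*c. It also stores only the bucket sizes, not the lists.

```python
from collections import Counter


def count_divisible_pairs(numbers, k):
    """Count index pairs i < j with (numbers[i] + numbers[j]) % k == 0."""
    counts = Counter(x % k for x in numbers)
    # Remainder 0 pairs only with itself.
    pairs = counts[0] * (counts[0] - 1) // 2
    # Remainder r pairs with remainder k - r.
    for r in range(1, (k + 1) // 2):
        pairs += counts[r] * counts[k - r]
    # For even k, remainder k/2 also pairs with itself.
    if k % 2 == 0:
        half = counts[k // 2]
        pairs += half * (half - 1) // 2
    return pairs


print(count_divisible_pairs([4, 2, 3, 31, 14, 16, 8], 5))  # 6
```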
One way is brute force:
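long numPairs = 0; // the count can exceed the int range
for (int i = 0; i < 1000000000; i++) // 10^9 (note that 10e9 would be 10^10)
{
    for (int j = i + 1; j < 1000000000; j++)
    {
        int sum = i + j; // at most 1999999997, so it still fits in an int
        if (sum % k == 0) numPairs++;
    }
}
return numPairs;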
I'll leave it up to you to optimize this for performance. I can think of at least one way to significantly speed this up.
Some pseudo-code to get you started. It uses the brute-force technique you say you tried, but maybe something was wrong in your code?
max = 1000000000
numberPairs = 0
for i = 1 to max - 2 do
for j = i + 1 to max - 1 do
if (i + j) mod k = 0 then
numberPairs = numberPairs + 1
end if
end do
end do
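The brute-force loops are far too slow for a bound of 10^9, so one possible speed-up is to count by remainder instead. The Python sketch below is only an illustration under an assumed reading of the problem: it counts pairs with 0 <= a < b < limit (the pseudo-code above starts at 1 instead), and the function names are made up. It runs in O(k) time.

```python
def count_pairs_below(limit, k):
    """Count pairs 0 <= a < b < limit with (a + b) % k == 0 in O(k) time."""
    full, extra = divmod(limit, k)

    def cnt(r):
        # How many integers in [0, limit) leave remainder r when divided by k.
        return full + (1 if r < extra else 0)

    # Remainder 0 pairs with itself.
    pairs = cnt(0) * (cnt(0) - 1) // 2
    # Remainder r pairs with remainder k - r.
    for r in range(1, (k + 1) // 2):
        pairs += cnt(r) * cnt(k - r)
    # For even k, remainder k/2 pairs with itself.
    if k % 2 == 0:
        half = cnt(k // 2)
        pairs += half * (half - 1) // 2
    return pairs


def count_pairs_brute(limit, k):
    """Reference implementation of the nested loops, usable only for small limits."""
    return sum(1 for a in range(limit) for b in range(a + 1, limit) if (a + b) % k == 0)


print(count_pairs_below(4, 3))  # 2, namely (0, 3) and (1, 2)
```

Python integers do not overflow, so calling count_pairs_below(10**9, k) is safe even though the result is very large.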